See our blog or arXiv preprint for more info.
- Python: Version 3.12 or higher.
- API Keys:
  - EDISON_API_KEY: For accessing Edison platform agents (Crow, Falcon - now called 'Literature'). Obtain from https://platform.edisonscientific.com/profile. You must first create an Edison profile, purchase credits, and then create an API key (Account -> Profile -> API Tokens).
  - An API key for your chosen LLM provider (e.g., OPENAI_API_KEY if using OpenAI models). Robin uses LiteLLM, so it can support various providers.
- The data analysis portion of this repo requires access to the Edison platform. Without access, all the hypothesis and experiment generation code can still be run.
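The prerequisites above can be checked before starting a run. The following is a minimal sketch, not part of Robin: the check_prerequisites helper is hypothetical, and it assumes you use OpenAI models (swap OPENAI_API_KEY for your provider's variable otherwise).

```python
import os
import sys

MIN_PYTHON = (3, 12)
# EDISON_API_KEY is named in this guide; OPENAI_API_KEY applies only if you use OpenAI models.
REQUIRED_KEYS = ("EDISON_API_KEY", "OPENAI_API_KEY")


def check_prerequisites(env=None, version=None):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version[:2])
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, found {version[0]}.{version[1]}"
        )
    for key in REQUIRED_KEYS:
        if not env.get(key, "").strip():
            problems.append(f"{key} is not set")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

Run as a script, it prints one warning per unmet prerequisite and nothing when everything is in place.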
Docker is a tool that packages software into a self-contained "container" that runs the same way on any computer, regardless of your operating system or what else is installed. It's the recommended approach for Robin as it avoids the most common installation issues.
Install Docker first: Download and install Docker Desktop for your operating system (Mac or Windows). Once installed, open Docker Desktop and make sure it is running (you should see the Docker icon in your menu bar/system tray) before proceeding.
With Docker Desktop running, set up Robin in three steps:
- Build the image:

      docker build -t robin .

- Set up API keys:

      cp .env.example .env
      # Edit .env and fill in your EDISON_API_KEY and OPENAI_API_KEY

  Important: do not wrap values in quotes (e.g. OPENAI_API_KEY=sk-abc123, not OPENAI_API_KEY="sk-abc123"). Docker reads the file differently from Python and will include the quotes as part of the key.

- Run Jupyter:

      docker run -p 8888:8888 --env-file .env robin

  Jupyter will print three URLs. Use only the one that starts with http://127.0.0.1:8888/ (the other two are internal container addresses and will not work). Your URL will look like: http://127.0.0.1:8888/lab/tree/robin_demo.ipynb?token=...
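Because quoted values in .env are easy to miss, a quick scan of the file before running the container can help. This is an illustrative sketch only: find_quoted_values is a hypothetical helper, not part of Robin or Docker, and it only handles simple NAME=value lines.

```python
def find_quoted_values(env_text):
    """Return names of variables whose value is wrapped in matching single or double quotes."""
    quoted = []
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not NAME=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            quoted.append(name.strip())
    return quoted


sample = 'EDISON_API_KEY=abc123\nOPENAI_API_KEY="sk-abc123"\n'
print(find_quoted_values(sample))  # ['OPENAI_API_KEY']
```

To check a real file, pass its contents instead of the sample string, for example the text read from .env with pathlib.Path(".env").read_text().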
- Clone the Repository:

      git clone https://github.com/Future-House/robin.git
      cd robin

- Create and Activate a Virtual Environment (Recommended):

      uv venv .venv
      source .venv/bin/activate

  OR

      python3 -m venv .robin_env
      source .robin_env/bin/activate

- Install Dependencies: The project uses pyproject.toml for dependency management. Install the base package and development dependencies (which include Jupyter):

      uv pip install -e '.[dev]'

  OR

      pip install -e '.[dev]'

- Set API Keys: Copy the provided template and fill in your keys:

      cp .env.example .env
      # Then edit .env with your actual keys

  Robin will automatically load this .env file at startup. Alternatively, you can export the variables in your shell, or pass them directly when creating the RobinConfiguration object.
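If you prefer not to store keys in a file, one option is to set them in the process environment from Python before creating the configuration. The sketch below is an assumption-laden illustration: ensure_key is a hypothetical helper, and it assumes Robin reads the variables from the environment when the configuration is created, in the same way as variables exported in your shell.

```python
import os
from getpass import getpass


def ensure_key(name, env=None, prompt=getpass):
    """Return env[name], asking for it interactively (without echo) if it is missing or empty."""
    env = os.environ if env is None else env
    if not env.get(name, "").strip():
        env[name] = prompt(f"Enter {name}: ").strip()
    return env[name]


# Example (run in a notebook cell or terminal session; it prompts for any missing key):
# ensure_key("EDISON_API_KEY")
# ensure_key("OPENAI_API_KEY")
```

Keys entered this way live only in the current process, so they are not written to disk or saved in the notebook.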
To run Robin as it was used in the manuscript, input only the name of a disease, with no other text. If you wish to optimize how Robin searches for experimental models and therapeutic candidates, we suggest changing Robin's internal prompts (via prompts.py), not the initial input to the pipeline.
- Launch Jupyter Notebook or JupyterLab: Navigate to the robin directory in your terminal (ensure your virtual environment is activated) and run:

      jupyter notebook
      # OR
      jupyter lab

- Open the Notebook: In the Jupyter interface, open robin_demo.ipynb.

- Configure Robin: Locate the cell where the RobinConfiguration object is created:

      config = RobinConfiguration(
          disease_name="DISEASE_NAME",  # <-- Customize the disease name here
          # You can also explicitly set API keys here if not using environment variables:
          # edison_api_key="your_edison_api_key_here"
      )

  - Modify disease_name: Change "DISEASE_NAME" to your target disease.
  - API Keys: If you didn't set environment variables, you can provide the keys directly in the RobinConfiguration instantiation.
  - LLM Choice: The default is o4-mini. You can change llm_name and llm_config in RobinConfiguration if you wish to use a different model supported by LiteLLM (ensure you have the corresponding API key set).
  - Other parameters like num_queries, num_assays, num_candidates can also be adjusted here if needed.
- Run the Notebook Cells: Execute the cells in the notebook sequentially. The notebook is structured to guide you through:
  - Experimental Assay Generation: Generates and ranks potential experimental assays.
  - Therapeutic Candidate Generation: Based on the top assay, generates and ranks therapeutic candidates.
  - (Optional) Experimental Data Analysis: If you have experimental data, this section can analyze it and feed insights back into candidate generation. This requires access to the Edison platform data analysis features.
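The parameters described in the Configure Robin step can be collected in one place before they are passed to RobinConfiguration. The sketch below uses only the parameter names mentioned in this guide; the build_robin_kwargs helper and the example values are illustrative assumptions, not Robin defaults, and the parameter types are assumed.

```python
def build_robin_kwargs(disease_name, llm_name=None, num_queries=None,
                       num_assays=None, num_candidates=None):
    """Collect RobinConfiguration keyword arguments.

    Unset options are omitted so that Robin's own defaults apply.
    """
    name = disease_name.strip()
    if not name:
        raise ValueError("disease_name must not be empty")
    kwargs = {"disease_name": name}
    optional = {
        "llm_name": llm_name,
        "num_queries": num_queries,
        "num_assays": num_assays,
        "num_candidates": num_candidates,
    }
    kwargs.update({key: value for key, value in optional.items() if value is not None})
    return kwargs


kwargs = build_robin_kwargs("Glaucoma", num_candidates=10)
print(kwargs)  # {'disease_name': 'Glaucoma', 'num_candidates': 10}

# In robin_demo.ipynb, where RobinConfiguration is already available:
# config = RobinConfiguration(**kwargs)
```

Passing only the options you actually changed keeps the notebook cell short and leaves everything else at whatever Robin ships with.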
- Logs: Detailed logs will be printed in the notebook output and/or your console, showing the progress of each step (e.g., query generation, literature search, candidate proposal, ranking).
- Files: Results are saved in a new subdirectory within robin_output/, named after the disease_name and a timestamp (e.g., robin_output/DISEASE_NAME_YYYY-MM-DD_HH-MM/). This directory contains a structured set of outputs, including:
  - Folders for detailed hypotheses and literature reviews for both experimental assays and therapeutic candidates (e.g., experimental_assay_detailed_hypotheses/, therapeutic_candidate_literature_reviews/).
  - CSV files for ranking results and final ranked lists (e.g., experimental_assay_ranking_results.csv, ranked_therapeutic_candidates.csv).
  - Text summaries for proposed assays and candidates (e.g., experimental_assay_summary.txt, therapeutic_candidates_summary.txt).
  - If the optional data analysis step is run (using the data_analysis function), there will be an additional data_analysis/ subfolder containing outputs from the Finch agent (e.g., consensus_results.csv). Correspondingly, some therapeutic candidate-related files generated after this step may have an _experimental suffix (e.g., ranked_therapeutic_candidates_experimental.csv, therapeutic_candidate_detailed_hypotheses_experimental/).
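Once a run has finished, the saved files can be inspected from Python. The following sketch locates the most recently modified run directory under robin_output/ and loads ranked_therapeutic_candidates.csv. Both helper functions are hypothetical, and no column names are assumed: rows are keyed by whatever header the CSV contains.

```python
import csv
from pathlib import Path


def latest_run_dir(output_root="robin_output"):
    """Return the most recently modified run directory under output_root, or None if there is none."""
    root = Path(output_root)
    if not root.is_dir():
        return None
    runs = [path for path in root.iterdir() if path.is_dir()]
    return max(runs, key=lambda path: path.stat().st_mtime, default=None)


def read_ranked_candidates(run_dir, filename="ranked_therapeutic_candidates.csv"):
    """Load the ranked-candidates CSV as a list of dicts keyed by the file's own header row."""
    with open(Path(run_dir) / filename, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    run_dir = latest_run_dir()
    if run_dir is None:
        print("No runs found under robin_output/")
    else:
        # Show the first few rows of the final ranking.
        for row in read_ranked_candidates(run_dir)[:5]:
            print(row)
```

If the data analysis step was run, passing filename="ranked_therapeutic_candidates_experimental.csv" reads the post-analysis ranking instead.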
The examples folder contains pre-generated output directories from complete Robin runs for 10 diseases:
- Age-Related Hearing Loss
- Celiac Disease
- Charcot-Marie-Tooth Disease
- Chronic Kidney Disease
- Friedreich's Ataxia
- Glaucoma
- Idiopathic Pulmonary Fibrosis
- Non-alcoholic Steatohepatitis
- Polycystic Ovary Syndrome
- Sarcopenia
Each disease-specific subfolder mirrors the exact file and directory structure a user would obtain in their own robin_output/ directory after a run:
- experimental_assay_detailed_hypotheses/: Text files containing detailed reports for each proposed experimental assay.
- experimental_assay_literature_reviews/: Text files of literature reviews generated from queries related to assay development.
- experimental_assay_ranking_results.csv: CSV file showing pairwise comparison results for assay ranking.
- experimental_assay_summary.txt: A textual summary of the proposed experimental assays.
- ranked_therapeutic_candidates.csv: CSV file listing the final ranked therapeutic candidates and their strength scores.
- therapeutic_candidate_detailed_hypotheses/: Text files with detailed reports for each proposed therapeutic candidate.
- therapeutic_candidate_literature_reviews/: Text files of literature reviews for therapeutic candidate queries.
- therapeutic_candidate_ranking_results.csv: CSV file of pairwise comparison results for candidate ranking.
- therapeutic_candidates_summary.txt: A textual summary of the proposed therapeutic candidates.
These example outputs are provided to help users understand the depth, format, and typical errors seen in Robin runs across various diseases.
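The directory structure listed above can also be used as a quick completeness check on your own runs. This is a minimal sketch: missing_outputs is a hypothetical helper, and the expected names are taken directly from the list in this guide (the optional data_analysis/ outputs are not checked).

```python
from pathlib import Path

EXPECTED_DIRS = (
    "experimental_assay_detailed_hypotheses",
    "experimental_assay_literature_reviews",
    "therapeutic_candidate_detailed_hypotheses",
    "therapeutic_candidate_literature_reviews",
)
EXPECTED_FILES = (
    "experimental_assay_ranking_results.csv",
    "experimental_assay_summary.txt",
    "ranked_therapeutic_candidates.csv",
    "therapeutic_candidate_ranking_results.csv",
    "therapeutic_candidates_summary.txt",
)


def missing_outputs(run_dir):
    """Return the expected folders (with a trailing slash) and files absent from run_dir."""
    run_dir = Path(run_dir)
    missing = [name + "/" for name in EXPECTED_DIRS if not (run_dir / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (run_dir / name).is_file()]
    return missing
```

Calling missing_outputs on a run directory returns an empty list when every expected folder and file is present, and otherwise names what is absent, which can help spot a run that stopped partway through.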
A full example trajectory of both the initial therapeutic candidate generation and experimental data analysis can be found in the robin_full.ipynb notebook. This notebook includes the parameters and agents used in the paper.
While this guide focuses on the robin_demo.ipynb notebook, the robin Python module (in the robin/ directory) can be imported and its functions (experimental_assay, therapeutic_candidates, data_analysis) can be used programmatically in your own Python scripts for more customized workflows.