Fully open-source autonomous scientific research capabilities for Claude Code.
NOTE: This project has NO official ties to Anthropic or Claude! It is a completely independent, open-source project.
Claude Code Scientist transforms Claude Code into a semi-autonomous, self-improving research system. It provides:
- Research Director logic via CLAUDE.md
- Specialized subagents for literature review, synthesis, peer review, experiments
- Skills for orchestrating multi-step research workflows
- Provenance tracking ensuring every claim has a source
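As a rough sketch of what a provenance-complete claim looks like (the field names and check below are illustrative, not the project's actual schema):

```python
# Illustrative only: field names and the DOI are hypothetical examples,
# not this project's actual data schema.
claim = {
    "text": "Method X outperforms baseline Y on dataset Z.",
    "doi": "10.1234/example.2024.001",
    "quote": "X achieved 12% higher accuracy than Y on Z.",
    "page": 7,
}

def has_full_provenance(claim: dict) -> bool:
    """A claim is citable only if DOI, quote, and page are all present."""
    return all(claim.get(k) for k in ("doi", "quote", "page"))

print(has_full_provenance(claim))  # True
```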
- Python 3.9+ with pip
- Claude Code CLI installed and authenticated
- ~2GB disk space for models and caches
- Claude Code subscription: Required - Pro or Max (run /login inside Claude Code)
git clone https://github.com/rhowardstone/Claude-Code-Scientist.git
cd Claude-Code-Scientist
# Install Python dependencies
pip install -r requirements.txt
# Download spaCy model
python -m spacy download en_core_web_sm

# Install to another project's .claude/ directory
./install.sh /path/to/your/project
# Or install globally to ~/.claude/
./install.sh --global

./session.sh new "Your research goal here"

or even more simply:

./session.sh new

That's it! This creates a session and launches Claude Code, which automatically:
- Decomposes the goal into Research Questions
- Searches the literature across multiple databases (OpenAlex, PubMed, Semantic Scholar)
- Extracts evidence with full provenance (DOI + quote + page)
- Synthesizes findings into a LaTeX paper
- Runs peer review with three specialized reviewers
- Iterates until unanimous acceptance
Each research project gets its own isolated session:
./session.sh new "goal" # Create new session
./session.sh list # List all sessions
./session.sh resume <id> # Resume a session
./session.sh current # Check active session

Sessions store all artifacts in workspace/sessions/session_<id>/.
A completed session produces:
workspace/sessions/session_abc123/
├── synthesis/
│ ├── paper.tex # LaTeX paper with citations
│ ├── paper.pdf # Compiled PDF
│ └── references.bib # Bibliography with DOIs
├── literature/
│ └── preread_papers.json # Discovered papers with abstracts
├── peer_review/
│ ├── methodology_review.json
│ ├── statistics_review.json
│ └── impact_review.json
├── experiments/ # If experiments were run
│ ├── results.json
│ └── figures/
└── world_model.json # Research state
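For scripting over finished sessions, here is a minimal sketch (using only the directory layout shown above; this is not a utility the project ships) that locates each session's compiled paper:

```python
from pathlib import Path

def compiled_papers(root: str = "workspace/sessions"):
    """Yield (session_id, pdf_path) for every session that produced a paper."""
    for session in sorted(Path(root).glob("session_*")):
        pdf = session / "synthesis" / "paper.pdf"
        if pdf.exists():
            yield session.name, pdf

for sid, pdf in compiled_papers():
    print(sid, pdf)
```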
After completing research sessions, you can run CORTEX to analyze what went well and what could be improved:
./session.sh cortex # Launch CORTEX session

Then run /cortex to start the self-improvement cycle. CORTEX traces the narrative flow of prior sessions, diagnoses issues, and generates fixes. It's how this system improves itself.
Claude Code Scientist
│
├── CLAUDE.md (Research Director prompt)
│
├── .claude/
│ ├── agents/ # Specialized subagent configs (7)
│ │ ├── lit-scout.md
│ │ ├── synthesizer.md
│ │ ├── reviewer-*.md # 3 reviewers
│ │ ├── experimentalist.md
│ │ └── tool-acquirer.md
│ │
│ ├── skills/ # Orchestration workflows (24)
│ │ ├── literature-search/
│ │ ├── peer-review/
│ │ ├── goal-decomposition/
│ │ ├── synthesizer/
│ │ └── ...
│ │
│ ├── hooks/ # Validation automation (8)
│ │ ├── validate-claims.py
│ │ ├── validate-doi.py
│ │ └── verify-provenance.py
│ │
│ └── rules/ # Conventions (3)
│ ├── provenance-tracking.md
│ ├── world-model.md
│ └── workflow.md
│
├── craig/ # Python utilities (137 files)
│ ├── world_model.py # Research state management
│ ├── doi_fetcher.py # DOI validation
│ ├── latex_compiler.py # Paper compilation
│ ├── literature/ # Database clients
│ │ ├── openalex_client.py
│ │ ├── pubmed_client.py
│ │ └── semantic_scholar_client.py
│ ├── pipeline/ # Phase implementations
│ └── experiment_harness_templates/ # Experiment scaffolding
│
└── mcp-servers/literature/ # Literature search MCP
- Triple-search strategy (keywords, semantic, "googling the question")
- Multi-database support (OpenAlex, PubMed, Semantic Scholar)
- Citation graph expansion (forward + reverse)
- Relevance filtering with provenance
- Rigorous claim extraction with DOI + quote + page
- Confidence scoring with justification
- Conflict detection across papers
- Gap identification for experimental follow-up
- Academic paper generation (LaTeX)
- Proper citations with DOIs
- Narrative flow (not database dump)
- Separation of direct vs analogical evidence
- Three reviewers: methodology, statistics, impact
- Actionable feedback with specific locations
- Revision cycles until unanimous acceptance
- External validation (Codex) when available
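The revision loop behind "unanimous acceptance" can be sketched roughly like this (the reviewer names come from the list above; the "accept" verdict string and the run_reviews/revise callables are hypothetical stand-ins, not the project's actual interfaces):

```python
# Sketch of the revise-until-unanimous loop; verdict values and the
# run_reviews/revise callables are assumptions for illustration.
REVIEWERS = ("methodology", "statistics", "impact")

def unanimous(reviews: dict) -> bool:
    """Acceptance requires every reviewer to accept -- not a majority vote."""
    return all(reviews[r] == "accept" for r in REVIEWERS)

def review_cycle(draft, run_reviews, revise, max_rounds=5):
    """Revise the draft until all three reviewers accept (or rounds run out)."""
    for _ in range(max_rounds):
        reviews = run_reviews(draft)
        if unanimous(reviews):
            return draft
        draft = revise(draft, reviews)
    return draft
```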
- Phased design → implementation → validation → execution
- Real data only (no mock/simulated)
- Timing validation before full runs
- Incremental saves and checkpointing
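Incremental saving typically follows a write-then-rename pattern so an interrupted run never leaves a half-written checkpoint; a generic sketch (not the project's actual implementation):

```python
import json
import os
import tempfile

def save_checkpoint(state: dict, path: str) -> None:
    """Write to a temp file, then atomically rename over the target,
    so a crash mid-write cannot corrupt the existing checkpoint."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)  # atomic within a single filesystem
```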
- Provenance is everything - Every claim needs DOI + quote + page
- No simulation trap - Run actual tools, not simulations
- Writing code ≠ running code - Always execute to verify
- Honesty over completion - Missing evidence > false evidence
- Unanimous peer review - Not majority vote
- RAM: 8GB (32GB+ recommended)
- Storage: 10GB for caches and session data
- CPU: Any modern multi-core
- GPU: Not required (CPU embeddings work fine)
- RAM: 32GB+ (genomics/scRNA-seq data can be large)
- Storage: 50GB+ for paper PDFs and knowledge graphs
- GPU: NVIDIA with CUDA for faster embeddings (optional)
- Python packages: ~500MB
- Embedding models: ~400MB (downloaded on first use)
- spaCy models: ~50MB
- FAISS: ~10MB
- Claude Code subscription: Required - Pro or Max (run /login inside Claude Code)
- Literature search: Free (OpenAlex, PubMed, Semantic Scholar are free APIs)
- PDF access: Free (uses open access sources only)
Create .mcp.json to add literature databases:
{
"mcpServers": {
"openalex": {
"type": "stdio",
"command": "python",
"args": ["mcp-servers/literature/server.py"]
}
}
}

Create .claude/skills/my-skill/SKILL.md:
---
name: my-skill
description: What it does and when to use it
user-invocable: true
---
# Skill instructions here

Create .claude/agents/my-agent.md:
---
name: my-agent
description: Specialized agent description
model: sonnet
---
# Agent instructions here

Research artifacts are stored in workspace/:
workspace/
├── world_model.json # Research state
├── literature/ # Search results, papers
├── synthesis/ # Paper drafts
├── peer_review/ # Review feedback
└── experiments/ # Experimental artifacts
The craig/ directory contains 137 Python files providing:
- world_model.py - Research state management (papers, claims, RQs)
- doi_fetcher.py - DOI validation and metadata retrieval
- latex_compiler.py - LaTeX paper compilation
- conflict_detector.py - Detect contradictions across sources
- data_provenance.py - Track evidence chains
- openalex_client.py - 200M+ open access works
- pubmed_client.py - Biomedical literature
- semantic_scholar_client.py - CS/AI papers with embeddings
- citation_expander.py - Forward/reverse citation traversal
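To give a flavor of what these utilities do, here is a hedged sketch (the simplified DOI pattern is adapted from Crossref's published guidance; the OpenAlex /works?search= endpoint and the mailto "polite pool" parameter are from the public OpenAlex API; function names are illustrative, not the module's real API):

```python
import re
from urllib.parse import urlencode

# Simplified from Crossref's recommended DOI pattern; a syntax check only.
# Real validation (as in doi_fetcher.py) would also resolve the DOI via doi.org.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$", re.IGNORECASE)

def looks_like_doi(doi: str) -> bool:
    return bool(DOI_RE.match(doi.strip()))

def openalex_search_url(query: str, per_page: int = 25, mailto: str = "") -> str:
    """Build a works-search URL for the OpenAlex REST API (no key required).
    Supplying a mailto= address opts you into the 'polite pool' rate limits."""
    params = {"search": query, "per-page": per_page}
    if mailto:
        params["mailto"] = mailto
    return "https://api.openalex.org/works?" + urlencode(params)

print(looks_like_doi("10.1038/nbt.3820"))  # True
print(openalex_search_url("batch effect correction", mailto="you@example.org"))
```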
The craig/experiment_harness_templates/ provides scaffolding for experiments:
- run.sh - Master experiment runner
- steps/ - Modular experiment phases
- lib/ - Utilities for checkpointing, scaling, and validation
For compiling the generated papers to PDF:
apt install texlive-latex-base texlive-latex-extra # Debian/Ubuntu
# or: brew install --cask mactex # macOS

For faster embeddings during knowledge graph ingestion:
# Replace faiss-cpu with faiss-gpu
pip uninstall faiss-cpu
pip install faiss-gpu
# Use CUDA for embeddings
python -m craig.literature.knowledge_graph.ingest --device cuda --batch-size 128

If you see hooks running twice (e.g., PostToolUse:Write hook succeeded appearing 6 times instead of 3), you have hooks configured in both:
- Global: ~/.claude/settings.json
- Local: .claude/settings.json
Claude Code merges both, so they stack. Solutions:
- Remove global hooks if you only use this project
- Remove local hooks if you prefer global configuration
- Accept duplicates - they're harmless, just verbose
Messages like NCBI_API_KEY not set are informational. The pipeline works without API keys but may hit rate limits. To add keys:
cp .env.example .env
# Edit .env with your keys

Get keys at:
- NCBI: https://www.ncbi.nlm.nih.gov/account/settings/
- OpenAlex: https://openalex.org/users/me
- Semantic Scholar: https://www.semanticscholar.org/product/api
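The .env pattern above is easy to reproduce; a minimal loader sketch (the project may well use python-dotenv instead, and only NCBI_API_KEY is a variable name confirmed by the message above):

```python
import os

def load_dotenv(path: str = ".env") -> None:
    """Minimal .env loader: KEY=value lines; already-set env vars win.
    A sketch only -- the project may use python-dotenv instead."""
    try:
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                os.environ.setdefault(key.strip(), value.strip())
    except FileNotFoundError:
        pass  # keys are optional; without them you may just hit rate limits

load_dotenv()
print("NCBI key set:", bool(os.environ.get("NCBI_API_KEY")))
```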
This means the paper couldn't be downloaded from any open access source, which is normal for paywalled papers. The pipeline continues with abstract-only data.
We welcome contributions! The most valuable way to contribute is to:
- Run CORTEX between your research sessions
- Submit the resulting improvements as pull requests
# After a research session, run:
./session.sh cortex
# Then in Claude:
/cortex

CORTEX analyzes past sessions, diagnoses issues, and generates fixes. Submit the improvements back!
See CONTRIBUTING.md for full guidelines.
MIT