rhowardstone/Claude-Code-Scientist

Claude Code Scientist

Fully open-source autonomous scientific research capabilities for Claude Code.

NOTE: This project has no official ties to Anthropic or Claude. It is a completely independent, open-source project.

What This Is

Claude Code Scientist transforms Claude Code into a semi-autonomous, self-improving research system. It provides:

  • Research Director logic via CLAUDE.md
  • Specialized subagents for literature review, synthesis, peer review, experiments
  • Skills for orchestrating multi-step research workflows
  • Provenance tracking ensuring every claim has a source

Prerequisites

  • Python 3.9+ with pip
  • Claude Code CLI installed and authenticated
  • ~2GB disk space for models and caches
  • Claude Code subscription: Required - Pro or Max (run /login inside Claude Code)

Installation

git clone https://github.com/rhowardstone/Claude-Code-Scientist.git
cd Claude-Code-Scientist

# Install Python dependencies
pip install -r requirements.txt

# Download spaCy model
python -m spacy download en_core_web_sm

Optional: Install to Other Projects

# Install to another project's .claude/ directory
./install.sh /path/to/your/project

# Or install globally to ~/.claude/
./install.sh --global

Quick Start

./session.sh new "Your research goal here"

or even more simply:

./session.sh new

That's it! This creates a session and launches Claude Code, which automatically:

  1. Decomposes the goal into Research Questions
  2. Searches literature across multiple databases (OpenAlex, PubMed, Semantic Scholar)
  3. Extracts evidence with full provenance (DOI + quote + page)
  4. Synthesizes findings into a LaTeX paper
  5. Runs peer review with three specialized reviewers
  6. Iterates until unanimous acceptance

Session Management

Each research project gets its own isolated session:

./session.sh new "goal"      # Create new session
./session.sh list            # List all sessions
./session.sh resume <id>     # Resume a session
./session.sh current         # Check active session

Sessions store all artifacts in workspace/sessions/session_<id>/.

What You Get

A completed session produces:

workspace/sessions/session_abc123/
├── synthesis/
│   ├── paper.tex          # LaTeX paper with citations
│   ├── paper.pdf          # Compiled PDF
│   └── references.bib     # Bibliography with DOIs
├── literature/
│   └── preread_papers.json # Discovered papers with abstracts
├── peer_review/
│   ├── methodology_review.json
│   ├── statistics_review.json
│   └── impact_review.json
├── experiments/           # If experiments were run
│   ├── results.json
│   └── figures/
└── world_model.json       # Research state

Self-Improvement (Optional)

After completing research sessions, you can run CORTEX to analyze what went well and what could be improved:

./session.sh cortex   # Launch cortex session

Then run /cortex to start the self-improvement cycle. CORTEX traces the narrative flow of prior sessions, diagnoses issues, and generates fixes. It's how this system improves itself.

Architecture

Claude Code Scientist
    │
    ├── CLAUDE.md (Research Director prompt)
    │
    ├── .claude/
    │   ├── agents/              # Specialized subagent configs (7)
    │   │   ├── lit-scout.md
    │   │   ├── synthesizer.md
    │   │   ├── reviewer-*.md    # 3 reviewers
    │   │   ├── experimentalist.md
    │   │   └── tool-acquirer.md
    │   │
    │   ├── skills/              # Orchestration workflows (24)
    │   │   ├── literature-search/
    │   │   ├── peer-review/
    │   │   ├── goal-decomposition/
    │   │   ├── synthesizer/
    │   │   └── ...
    │   │
    │   ├── hooks/               # Validation automation (8)
    │   │   ├── validate-claims.py
    │   │   ├── validate-doi.py
    │   │   └── verify-provenance.py
    │   │
    │   └── rules/               # Conventions (3)
    │       ├── provenance-tracking.md
    │       ├── world-model.md
    │       └── workflow.md
    │
    ├── craig/                   # Python utilities (137 files)
    │   ├── world_model.py       # Research state management
    │   ├── doi_fetcher.py       # DOI validation
    │   ├── latex_compiler.py    # Paper compilation
    │   ├── literature/          # Database clients
    │   │   ├── openalex_client.py
    │   │   ├── pubmed_client.py
    │   │   └── semantic_scholar_client.py
    │   ├── pipeline/            # Phase implementations
    │   └── experiment_harness_templates/  # Experiment scaffolding
    │
    └── mcp-servers/literature/  # Literature search MCP
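
The validation hooks are Python scripts invoked after tool calls. As an illustration of the kind of check a hook like validate-doi.py might perform (this sketch is hypothetical, not the actual hook code), a DOI can be syntax-checked cheaply before any network lookup:

```python
import re

# Modern DOIs start with "10.", a 4-9 digit registrant code, a slash,
# then a non-empty suffix. This only checks shape, not existence.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$", re.IGNORECASE)

def looks_like_doi(candidate: str) -> bool:
    """Syntactic check; resolving against doi.org is still needed
    to confirm the DOI actually exists."""
    return bool(DOI_PATTERN.match(candidate.strip()))

print(looks_like_doi("10.1038/nature12373"))  # True
print(looks_like_doi("not-a-doi"))            # False
```

A syntactic pre-check like this lets a hook reject malformed citations instantly and reserve network resolution for plausible identifiers.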

Capabilities

Literature Acquisition

  • Triple-search strategy (keywords, semantic, "googling the question")
  • Multi-database support (OpenAlex, PubMed, Semantic Scholar)
  • Citation graph expansion (forward + reverse)
  • Relevance filtering with provenance

Evidence Extraction

  • Rigorous claim extraction with DOI + quote + page
  • Confidence scoring with justification
  • Conflict detection across papers
  • Gap identification for experimental follow-up
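
A minimal claim record carrying the DOI + quote + page provenance described above might look like the following sketch (the field names are illustrative, not the project's actual schema):

```python
from dataclasses import dataclass, asdict

@dataclass
class Claim:
    text: str           # the extracted claim, paraphrased
    doi: str            # identifier of the source paper
    quote: str          # verbatim supporting quote
    page: int           # page on which the quote appears
    confidence: float   # score in [0, 1]
    justification: str  # why the score is what it is

claim = Claim(
    text="Method X outperforms baseline Y on task Z.",
    doi="10.0000/example",  # placeholder DOI
    quote="X achieved 92.1% accuracy versus 87.4% for Y.",
    page=7,
    confidence=0.8,
    justification="Single study, no replication reported.",
)
print(asdict(claim)["doi"])  # 10.0000/example
```

Keeping the quote and page alongside the claim means any downstream synthesis step can be audited back to its source.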

Synthesis

  • Academic paper generation (LaTeX)
  • Proper citations with DOIs
  • Narrative flow (not database dump)
  • Separation of direct vs analogical evidence

Peer Review

  • Three reviewers: methodology, statistics, impact
  • Actionable feedback with specific locations
  • Revision cycles until unanimous acceptance
  • External validation (Codex) when available
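
Unanimous acceptance means the revision loop exits only when every reviewer approves, never on a majority vote. The control flow can be sketched like this (the reviewer functions are stand-ins, not the project's actual reviewer agents):

```python
def review_until_unanimous(draft, reviewers, max_rounds=5):
    """Revise until every reviewer accepts or the round budget runs out."""
    for round_no in range(1, max_rounds + 1):
        reviews = [review(draft) for review in reviewers]
        if all(r["accept"] for r in reviews):
            return draft, round_no
        # Fold each dissenting reviewer's feedback into the next draft.
        feedback = "; ".join(r["feedback"] for r in reviews if not r["accept"])
        draft = f"{draft} [revised: {feedback}]"
    raise RuntimeError("no unanimous acceptance within the round budget")

# Stub reviewers: one always accepts, one demands a revision first.
lenient = lambda d: {"accept": True, "feedback": ""}
strict = lambda d: {"accept": "revised" in d, "feedback": "add effect sizes"}
final, rounds = review_until_unanimous("draft v1", [lenient, strict])
print(rounds)  # 2
```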

Experimentation

  • Phased design → implementation → validation → execution
  • Real data only (no mock/simulated)
  • Timing validation before full runs
  • Incremental saves and checkpointing
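
Incremental saves mean partial results survive an interruption. One simple pattern (a sketch of the idea, not the harness's actual lib/ code) is to append each completed unit of work to a JSON-lines checkpoint and skip already-done items on restart:

```python
import json
from pathlib import Path

def run_with_checkpoints(items, process, ckpt_path):
    """Process items one by one, persisting each result as a JSON line.
    On restart, items already present in the checkpoint are skipped."""
    ckpt = Path(ckpt_path)
    done = {}
    if ckpt.exists():
        for line in ckpt.read_text().splitlines():
            record = json.loads(line)
            done[record["item"]] = record["result"]
    with ckpt.open("a") as fh:
        for item in items:
            if item in done:
                continue  # finished in a previous run
            result = process(item)
            fh.write(json.dumps({"item": item, "result": result}) + "\n")
            fh.flush()  # make the record durable before the next item
            done[item] = result
    return done
```

Re-running the same command after a crash then costs only the unfinished items, which matters when each item is an hours-long experiment step.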

Key Principles

  1. Provenance is everything - Every claim needs DOI + quote + page
  2. No simulation trap - Run actual tools, not simulations
  3. Writing code ≠ running code - Always execute to verify
  4. Honesty over completion - Missing evidence > false evidence
  5. Unanimous peer review - Not majority vote

Hardware Requirements

Minimum (Literature-Only Research)

  • RAM: 8GB (32GB+ recommended)
  • Storage: 10GB for caches and session data
  • CPU: Any modern multi-core
  • GPU: Not required (CPU embeddings work fine)

Recommended (Full Pipeline with Experiments)

  • RAM: 32GB+ (genomics/scRNA-seq data can be large)
  • Storage: 50GB+ for paper PDFs and knowledge graphs
  • GPU: NVIDIA with CUDA for faster embeddings (optional)

What Gets Installed

  • Python packages: ~500MB
  • Embedding models: ~400MB (downloaded on first use)
  • spaCy models: ~50MB
  • FAISS: ~10MB

API Costs

  • Claude Code subscription: Required - Pro or Max (run /login inside Claude Code)
  • Literature search: Free (OpenAlex, PubMed, Semantic Scholar are free APIs)
  • PDF access: Free (uses open access sources only)

Customization

Adding MCP Servers

Create .mcp.json to add literature databases:

{
  "mcpServers": {
    "openalex": {
      "type": "stdio",
      "command": "python",
      "args": ["mcp-servers/literature/server.py"]
    }
  }
}

Adding Skills

Create .claude/skills/my-skill/SKILL.md:

---
name: my-skill
description: What it does and when to use it
user-invocable: true
---

# Skill instructions here

Adding Agents

Create .claude/agents/my-agent.md:

---
name: my-agent
description: Specialized agent description
model: sonnet
---

# Agent instructions here

Workspace

Research artifacts are stored in workspace/:

workspace/
├── world_model.json      # Research state
├── literature/           # Search results, papers
├── synthesis/            # Paper drafts
├── peer_review/          # Review feedback
└── experiments/          # Experimental artifacts

Python Utilities

The craig/ directory contains 137 Python files providing:

Core Utilities

  • world_model.py - Research state management (papers, claims, RQs)
  • doi_fetcher.py - DOI validation and metadata retrieval
  • latex_compiler.py - LaTeX paper compilation
  • conflict_detector.py - Detect contradictions across sources
  • data_provenance.py - Track evidence chains
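
The core idea behind conflict detection (what conflict_detector.py provides; this sketch is illustrative, not its actual implementation) is to group claims by subject and flag any subject with both supporting and refuting evidence:

```python
def find_conflicts(claims):
    """Flag subjects on which sources take opposite stances.
    Each claim is a dict with 'subject', 'supports' (bool), and 'doi'."""
    by_subject = {}
    for claim in claims:
        by_subject.setdefault(claim["subject"], []).append(claim)
    conflicts = []
    for subject, group in by_subject.items():
        stances = {c["supports"] for c in group}
        if len(stances) > 1:  # both True and False stances present
            conflicts.append((subject, [c["doi"] for c in group]))
    return conflicts
```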

Literature Clients

  • openalex_client.py - 200M+ open access works
  • pubmed_client.py - Biomedical literature
  • semantic_scholar_client.py - CS/AI papers with embeddings
  • citation_expander.py - Forward/reverse citation traversal
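
These clients wrap public APIs. OpenAlex, for instance, is queried over plain HTTPS; a minimal request can be built with nothing but the standard library (this is a generic sketch of the public OpenAlex API, not craig's client code):

```python
from urllib.parse import urlencode
# from urllib.request import urlopen  # uncomment to actually fetch

def openalex_works_url(query: str, per_page: int = 25) -> str:
    """Build a search URL for the OpenAlex /works endpoint."""
    params = {"search": query, "per-page": per_page}
    return "https://api.openalex.org/works?" + urlencode(params)

url = openalex_works_url("CRISPR off-target effects", per_page=5)
# data = json.load(urlopen(url))  # each hit carries a DOI and metadata
print(url)
```

Because OpenAlex needs no API key, a client like this can run out of the box, which is why literature search stays free.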

Experiment Harness

The craig/experiment_harness_templates/ directory provides scaffolding for experiments:

  • run.sh - Master experiment runner
  • steps/ - Modular experiment phases
  • lib/ - Utilities for checkpointing, scaling, validation

LaTeX (Optional)

For compiling the generated papers to PDF:

apt install texlive-latex-base texlive-latex-extra  # Debian/Ubuntu
# or: brew install --cask mactex                    # macOS

GPU Acceleration (Optional)

For faster embeddings during knowledge graph ingestion:

# Replace faiss-cpu with faiss-gpu
pip uninstall faiss-cpu
pip install faiss-gpu

# Use CUDA for embeddings
python -m craig.literature.knowledge_graph.ingest --device cuda --batch-size 128

Troubleshooting

Duplicate Hooks Running

If you see hooks running twice (e.g., PostToolUse:Write hook succeeded appearing 6 times instead of 3), you have hooks configured in both:

  • Global: ~/.claude/settings.json
  • Local: .claude/settings.json

Claude Code merges both, so they stack. Solutions:

  1. Remove global hooks if you only use this project
  2. Remove local hooks if you prefer global configuration
  3. Accept duplicates - they're harmless, just verbose

API Key Warnings

Messages like NCBI_API_KEY not set are informational. The pipeline works without API keys but may hit rate limits. To add keys:

cp .env.example .env
# Edit .env with your keys

Keys are available from each provider's developer portal.

"All source phases exhausted" Errors

This means the paper couldn't be downloaded from any open access source. It's normal for paywalled papers. The pipeline continues with abstract-only data.

Contributing

We welcome contributions! The most valuable way to contribute is to run CORTEX between your research sessions and submit the resulting improvements as pull requests:

# After a research session, run:
./session.sh cortex
# Then in Claude:
/cortex

CORTEX analyzes past sessions, diagnoses issues, and generates fixes. Submit those fixes back!

See CONTRIBUTING.md for full guidelines.

License

MIT
