HPC-AutoResearch targets HPC environments with a Singularity-based, split-phase execution path. It orchestrates: idea loading/generation → BFTS tree search experiments → plot aggregation → LaTeX writeup → optional PDF review.
- Split-phase execution with explicit install/coding/compile/run steps inside Singularity.
- Per-run isolation: every experiment gets its own config, logs, workspace, and artifacts.
- Tree-search agent manager with parallel workers (GPU-aware, CPU fallback).
- Optional MemGPT-style memory with LLM-based compression across branches for longer context.
- Resource files to mount datasets and inject templates/docs into prompts.
- Configurable persona system for role-specific prompt customization.
- Token tracking for monitoring LLM usage and costs.
- Multi-seed evaluation for robust result validation.
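
The "GPU-aware, CPU fallback" behavior of the parallel workers can be illustrated with a small sketch. This is not the project's actual scheduler: the function name `assign_devices` and the policy (one GPU per worker while GPUs remain, then CPU) are assumptions made only for illustration.

```python
from typing import List


def assign_devices(num_workers: int, gpu_ids: List[int]) -> List[str]:
    """Hypothetical policy: give each worker its own GPU, then fall back to CPU."""
    devices = []
    for worker in range(num_workers):
        if worker < len(gpu_ids):
            # A free GPU is still available for this worker.
            devices.append(f"cuda:{gpu_ids[worker]}")
        else:
            # No GPU left: this worker runs on the CPU instead.
            devices.append("cpu")
    return devices


# Three workers on a host that exposes two GPUs (ids 0 and 1).
print(assign_devices(3, [0, 1]))
```

Under this assumed policy, requesting more workers than GPUs never fails; the extra workers simply run on the CPU.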
- launch_scientist_bfts.py: main launcher (ideas -> experiments -> plots/writeup/review).
- generate_paper.py: plots/writeup/review for an existing run directory.
- ai_scientist/: core agent logic, tree search, prompts, memory, and utilities.
  - memory/: MemGPT-style hierarchical memory implementation.
  - treesearch/: BFTS agent manager and parallel workers.
  - utils/: token tracking and model parameter utilities.
  - persona.py: configurable persona system for prompt customization.
- prompt/: split-phase and stage prompts, response schemas, and writeup templates.
  - common/: base system prompts and domain-neutral instructions.
  - phases/: Phase 0 planning and Phase 1 installer prompts.
  - memory/: memory compression prompt templates.
  - schemas/: structured response schemas for split execution.
- template/: base Singularity image instructions (template/README.md).
- docs/: expanded guides for requirements, configuration, resources, outputs, and troubleshooting.
- tests/: unit tests for memory, resources, compression, and parallelism.
- experiments/: run outputs (generated at runtime).
- Run end-to-end: generate ideas (optional) -> run launch_scientist_bfts.py -> review experiments/<run>/.
- Reuse a run: skip the experiment and run generate_paper.py for plots/writeups.
- Iterate locally: use --phase_mode single for quick iteration without Singularity.
- Custom persona: set agent.role_description in config to customize the agent's role (e.g., "HPC Researcher").
- Enable memory: use --enable_memgpt or set memory.enabled=true for hierarchical context management.
  - Note: Without MemGPT, there is no context budget management. Idea/task descriptions are injected as full text, which may exceed LLM context limits for complex experiments. See docs/memory/memory.md for details.
- Parallel experiments: adjust --num_workers to scale across available GPUs.
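
As a minimal sketch, the launcher flags named in these workflows can be assembled into a command line from Python. Only the script name and the flags `--phase_mode`, `--enable_memgpt`, and `--num_workers` come from this README; the helper `build_launch_command` is hypothetical, and any other arguments the launcher may require (for example, an idea file) are not covered here and are omitted.

```python
import shlex
from typing import List, Optional


def build_launch_command(
    phase_mode: Optional[str] = None,
    enable_memgpt: bool = False,
    num_workers: Optional[int] = None,
) -> List[str]:
    """Hypothetical helper: build an argv list for launch_scientist_bfts.py."""
    cmd = ["python", "launch_scientist_bfts.py"]
    if phase_mode is not None:
        # For example, "single" for quick iteration without Singularity.
        cmd += ["--phase_mode", phase_mode]
    if enable_memgpt:
        # Turns on the MemGPT-style hierarchical memory.
        cmd.append("--enable_memgpt")
    if num_workers is not None:
        # Number of parallel tree-search workers.
        cmd += ["--num_workers", str(num_workers)]
    return cmd


# Quick local iteration with two workers; print the command instead of running it.
print(shlex.join(build_launch_command(phase_mode="single", num_workers=2)))
```

The resulting list can be passed to `subprocess.run` once the remaining required arguments are added; see docs/configuration/cli-entry-points.md for the full set of options.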
- If you are new: start with docs/getting-started/requirements.md, docs/getting-started/installation.md, and docs/getting-started/quickstart.md.
- If you are operating on HPC: read docs/configuration/execution-modes.md, docs/configuration/configuration.md, and docs/configuration/outputs.md.
- If you are extending prompts/resources: read docs/architecture/llm-context.md and docs/architecture/resource-files.md.
Detailed guides live in docs/. Start with docs/README.md for the full index.
- Requirements: docs/getting-started/requirements.md (host + container dependencies, optional tools); related: requirements.txt, bfts_config.yaml, template/README.md.
- Installation: docs/getting-started/installation.md (conda/pip/torch setup, image prep); related: requirements.txt, template/README.md.
- Credentials: docs/getting-started/credentials.md (model provider API keys and scope); related: bfts_config.yaml, ai_scientist/llm/.
- CLI entry points: docs/configuration/cli-entry-points.md (what each script does); related: launch_scientist_bfts.py, generate_paper.py, ai_scientist/perform_ideation_temp_free.py.
- Quickstart: docs/getting-started/quickstart.md (minimal end-to-end run); related: template/README.md, data_resources.json.
- Configuration: docs/configuration/configuration.md (how bfts_config.yaml is applied); related: bfts_config.yaml, launch_scientist_bfts.py.
- Execution modes: docs/configuration/execution-modes.md (split vs single, worker behavior); related: prompt/execution_split_schema.txt, ai_scientist/treesearch/parallel_agent.py.
- LLM context: docs/architecture/llm-context.md (prompt assembly and stage inputs); related: prompt/, prompt/base_system.txt.
- MemGPT-style memory: docs/memory/memory.md (hierarchical memory + persistence); related: ai_scientist/memory/memgpt_store.py, ai_scientist/memory/resource_memory.py.
- Resource files: docs/architecture/resource-files.md (JSON/YAML schema and staging rules); related: data_resources.json, tests/test_resource.py.
- Outputs: docs/configuration/outputs.md (run directories, logs, artifacts); related: ai_scientist/treesearch/utils/viz_templates/template.html.
- Testing: docs/development/testing.md (unit tests and scope); related: tests/.
- Troubleshooting: docs/development/troubleshooting.md (common failures and fixes); related: bfts_config.yaml, template/README.md.
- Citation: docs/citation.md (bibtex and paper link); related: README.md.
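
Configuration keys in this README are written in dotted form (agent.role_description, memory.enabled). The sketch below shows one way such a dotted assignment could map onto a nested configuration dictionary. It is an illustration only: the function `apply_override` and the sample dictionary are hypothetical, and docs/configuration/configuration.md remains the authority on how bfts_config.yaml is actually applied.

```python
from typing import Any, Dict


def apply_override(config: Dict[str, Any], assignment: str) -> None:
    """Hypothetical helper: apply "a.b=value" to a nested dict in place."""
    key, _, raw = assignment.partition("=")
    parts = key.split(".")
    node = config
    for part in parts[:-1]:
        # Walk down the nesting, creating intermediate sections as needed.
        node = node.setdefault(part, {})
    lowered = raw.lower()
    # Interpret true/false as booleans; keep everything else as a string.
    value: Any = (lowered == "true") if lowered in ("true", "false") else raw
    node[parts[-1]] = value


# Stand-in for a loaded config; the real file contains many more keys.
config: Dict[str, Any] = {
    "agent": {"role_description": "Researcher"},
    "memory": {"enabled": False},
}
apply_override(config, "memory.enabled=true")
apply_override(config, "agent.role_description=HPC Researcher")
print(config)
```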
The tree search component is built on top of the AIDE project. This project extends the original AI-Scientist-v2 with split-phase execution, MemGPT-style context engineering, and Singularity container support.