46 changes: 46 additions & 0 deletions AGENTS.md
@@ -0,0 +1,46 @@
# Curio AI Context Entry Point

This repository uses modular AI context files under [`scripts/ai-context/`](scripts/ai-context/).

## Start Here
- Read [`scripts/ai-context/README.md`](scripts/ai-context/README.md) for purpose and workflow.
- Use [`scripts/ai-context/index.md`](scripts/ai-context/index.md) to choose which files to load for a task.

## Runtime Context (Coding/Design Chats)
- Load context files from [`scripts/ai-context/context/`](scripts/ai-context/context/).
- Default context loading (for every new coding/design chat):
  1. `04-coding-preferences.md`
  2. `00-project-overview.md`
  3. `01-architecture.md`
  4. `05-common-patterns.md`
- Conditional context (load only when relevant):
  - DB/schema/pipeline-state work:
    - `03-database-semantics.md`
  - Active or moving subsystem work:
    - `06-current-subsystems.md`
    - `07-recent-decisions.md`
    - `08-open-questions.md`
- Runtime rules:
  - Do not summarize context files unless asked.
  - If task-specific input conflicts with older context, call out the conflict and follow task-specific input.
  - Treat prompt files under `scripts/ai-context/prompt/` as maintenance-only, not runtime context.
  - Never add machine-specific local absolute paths or local-editor URIs to repository files; use repo-relative paths only.
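The loading rules above can be sketched as a small lookup. This is only an illustration: the function name `context_files_for` and the task kinds `db` and `subsystem` are hypothetical labels, not something the repository defines.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the context files to load for a task kind.
# Task kinds (default, db, subsystem) are assumed names for illustration.
context_files_for() {
  local kind="${1:-default}"
  local -a extra=()

  case "$kind" in
    default) ;;
    db) extra=(03-database-semantics.md) ;;
    subsystem) extra=(06-current-subsystems.md 07-recent-decisions.md 08-open-questions.md) ;;
    *)
      echo "unknown task kind: $kind" >&2
      return 1
      ;;
  esac

  # Default set, in the order listed in AGENTS.md.
  printf '%s\n' \
    04-coding-preferences.md \
    00-project-overview.md \
    01-architecture.md \
    05-common-patterns.md

  # Conditional files are appended only when the task kind calls for them.
  if ((${#extra[@]})); then
    printf '%s\n' "${extra[@]}"
  fi
}
```

For example, `context_files_for db` prints the four default files followed by `03-database-semantics.md`.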

## Required Verification Before Finalizing Code Edits
- Trigger: run this only when the user indicates the coding pass is done (for example: "finalize").
- For Go/backend/codegen-affecting edits, run these from repo root:
  1. `LANG=en-US FFI_USE_OPENCL=1 make gen`
Suggested change (review comment):
- 1. `LANG=en-US FFI_USE_OPENCL=1 make gen`
+ 1. `LANG=en-US FFI_USE_OPENCL=1 make gen` # OpenCL flag avoids CUDA build dependency during codegen

  2. `golangci-lint run -v --timeout 15m --concurrency 4`
- `make gen` is the canonical generation/import-format path; do not run separate formatting passes by default.
- Do not rerun expensive checks unless code changed after the last verify run or the user asks for rerun.
- Do not claim checks passed unless they were actually run.
- If any check cannot be run (missing tool/dependency/environment/time), state that explicitly with the failing command.
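One way to follow the reporting rules above is to wrap each check so that its outcome is always stated explicitly. The sketch below is an assumption, not part of the repository: `run_check` is a hypothetical helper name, and the three status labels are illustrative.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: run one verification command and report what actually happened.
# Usage: run_check <label> <command> [args...]
run_check() {
  local label="$1"
  shift

  # A missing tool is reported as "not run" rather than as a failure or a pass.
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "NOT RUN: $label (missing tool: $1)"
    return 2
  fi

  if "$@"; then
    echo "PASSED: $label"
  else
    echo "FAILED: $label (command: $*)"
    return 1
  fi
}
```

From the repo root, the two checks listed above could then be invoked as `(export LANG=en-US FFI_USE_OPENCL=1; run_check codegen make gen)` and `run_check lint golangci-lint run -v --timeout 15m --concurrency 4`.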

## Prompt Templates (Context Maintenance)
- Prompt templates are in [`scripts/ai-context/prompt/`](scripts/ai-context/prompt/).
- Prompt files are for creating/updating context docs, not for normal runtime coding chats.

## Update Model
- Context is maintained on `main` via PRs.
- Rolling files should be updated from commit deltas on `main` (commit-based, not time-based).
- Use [`scripts/ai-context/context/99-context-validation-loop.md`](scripts/ai-context/context/99-context-validation-loop.md) after major context updates.
80 changes: 80 additions & 0 deletions scripts/ai-context/README.md
@@ -0,0 +1,80 @@
# Curio AI Context

## What this is
`scripts/ai-context/` is Curio's repository-native AI briefing system.

It has two parts:
- `context/`: reusable project briefings for runtime coding/design chats.
- `prompt/`: templates for generating or updating context files.

This directory is intended for both humans and AIs.

## Directory layout
- [`context/`](./context/): project context documents (`00`-style numbered files).
- [`prompt/`](./prompt/): context maintenance prompts (`00`-style and utility prompts).
- [`index.md`](./index.md): quick lookup for which file to load/use.

## How to start a new chat
Use a small relevant subset of `context/` files, not the full pack.

Suggested baseline:
1. `04-coding-preferences.md`
2. `00-project-overview.md`
3. `01-architecture.md`
4. `03-database-semantics.md`
5. `05-common-patterns.md`

Add when needed:
- `06-current-subsystems.md` for active moving areas.
- `07-recent-decisions.md` for decision-sensitive work.
- `08-open-questions.md` when uncertainty may affect design choices.

## Context maintenance model
- Source of truth branch: `main`.
- Updates happen through PRs.
- Rolling files should use commit-based deltas from `main`.

Commit-based update flow (example):
1. Pick target context file (for example `context/07-recent-decisions.md`).
2. Find baseline commit = last commit that touched that file.
3. Review commits/diffs from baseline to current `main`.
4. Update only durable or materially relevant content.

Useful commands:
```bash
# from repo root
TARGET=scripts/ai-context/context/07-recent-decisions.md
BASE=$(git log -1 --format=%H -- "$TARGET")

# inspect what changed on main since that context file was last updated
git log --oneline "${BASE}..main"
git diff --name-only "${BASE}..main"
git diff "${BASE}..main"
```

## Minimal edit checklist
Use this checklist for edits to `context/` and `prompt/`:
- Preserve the existing top-level structure unless there is a strong reason to change it.
- Do not invent details; if uncertain, label uncertainty explicitly.
- Keep context compact and reusable; remove stale/contradicted claims.
- For rolling files (`06`, `07`, `08`), include commit-based input from `main` (commit range used).

## Which prompt to use
- Create/regenerate a specific context file: corresponding prompt in `prompt/` (same number/name when present).
- Incremental update of an existing context file: `prompt/util-incremental-update.md`.
- Recompress/clean noisy context file: `prompt/util-recompress.md`.
- Regenerate validation loop: `prompt/99-context-validation-loop.md`.

## Validation
After major context updates, run the process in:
- [`context/99-context-validation-loop.md`](./context/99-context-validation-loop.md)

This catches contradictions, stale claims, and missing invariants.

For a quick local sanity check (headings + broken links + local path leak scan):
```bash
bash scripts/ai-context/check-context.sh
```

## Safety invariant
Context and prompt docs must use repo-relative paths/links only. Never commit machine-specific absolute paths or local URIs.
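As a rough illustration of this invariant, a single string can be tested against the same alternation that `check-context.sh` scans for. The sketch uses `grep -E` instead of `rg` so it runs without extra tools, and `has_local_path` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Same alternation as the leak pattern in check-context.sh, applied here with grep -E.
local_path_pattern='/Users/|/home/|[A-Za-z]:\\Users\\|file://|vscode://|/var/folders/'

# Hypothetical helper: succeed (exit 0) when the argument looks like a local path or local URI.
has_local_path() {
  printf '%s\n' "$1" | grep -Eq -e "$local_path_pattern"
}
```

`has_local_path '/Users/alice/curio/AGENTS.md'` succeeds, while `has_local_path 'scripts/ai-context/index.md'` does not. The pattern is a substring match, so a repo-relative path that happens to contain `/home/` would also be flagged.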
110 changes: 110 additions & 0 deletions scripts/ai-context/check-context.sh
@@ -0,0 +1,110 @@
#!/usr/bin/env bash
set -euo pipefail

repo_root="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
context_dir="$repo_root/scripts/ai-context/context"
prompt_dir="$repo_root/scripts/ai-context/prompt"
all_md_dir="$repo_root/scripts/ai-context"

fail=0

err() {
  echo "ERROR: $*" >&2
  fail=1
}

info() {
  echo "INFO: $*"
}

check_context_headings() {
  info "Checking context file headings..."
  while IFS= read -r -d '' file; do
    first_nonempty="$(awk 'NF {print; exit}' "$file")"
    if [[ -z "$first_nonempty" || ! "$first_nonempty" =~ ^#\  ]]; then
      err "Context file must start with a level-1 heading: $file"
    fi
  done < <(find "$context_dir" -maxdepth 1 -type f -name '*.md' -print0 | sort -z)
}

normalize_target() {
  local raw="$1"

  # Drop surrounding whitespace.
  raw="$(echo "$raw" | sed -E 's/^[[:space:]]+|[[:space:]]+$//g')"

  # Skip title part in markdown links if present: path "title"
  raw="$(echo "$raw" | sed -E 's/[[:space:]]+".*"$//')"

  # Remove anchor.
  raw="${raw%%#*}"

  # Remove line suffix of form :123 or :123:45 for local file references.
  raw="$(echo "$raw" | sed -E 's/:[0-9]+(:[0-9]+)?$//')"

  echo "$raw"
}

check_links() {
  info "Checking markdown links in scripts/ai-context..."
  local file link target resolved
  while IFS= read -r -d '' file; do
    while IFS= read -r link; do
      target="$(normalize_target "$link")"

      # Skip empty targets, external URLs, mail links, and in-page anchors.
      [[ -z "$target" ]] && continue
      [[ "$target" =~ ^https?:// ]] && continue
      [[ "$target" =~ ^mailto: ]] && continue
      [[ "$target" =~ ^# ]] && continue

      if [[ "$target" == /* ]]; then
        resolved="$target"
      else
        # realpath -m (GNU coreutils) resolves the path without requiring it to exist.
        resolved="$(cd "$(dirname "$file")" && realpath -m "$target")"
      fi

      if [[ ! -e "$resolved" ]]; then
        err "Broken link in $file -> $link (resolved: $resolved)"
      fi
    done < <(grep -oE '\[[^]]+\]\(([^)]+)\)' "$file" | sed -E 's/.*\(([^)]+)\)/\1/' || true)
  done < <(find "$all_md_dir" -maxdepth 2 -type f -name '*.md' -print0 | sort -z)
}

check_prompt_nonempty() {
  info "Checking prompt files are non-empty..."
  while IFS= read -r -d '' file; do
    if [[ ! -s "$file" ]]; then
      err "Prompt file is empty: $file"
    fi
  done < <(find "$prompt_dir" -maxdepth 1 -type f -name '*.md' -print0 | sort -z)
}

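check_no_local_paths() {
  info "Checking for local absolute path/URI leaks..."
  local pattern='/Users/|/home/|[A-Za-z]:\\Users\\|file://|vscode://|/var/folders/'
  local matches

  # Without this guard a missing rg would be hidden by "2>/dev/null || true"
  # below and the scan would silently pass.
  if ! command -v rg >/dev/null 2>&1; then
    err "ripgrep (rg) not found; cannot run local path leak scan"
    return 0
  fi

  matches="$(
    rg -n -S -e "$pattern" \
      "$repo_root/AGENTS.md" \
      "$all_md_dir"/*.md \
      "$context_dir"/*.md \
      "$prompt_dir"/*.md 2>/dev/null || true
  )"

  if [[ -n "$matches" ]]; then
    err "Found local absolute path/URI patterns in AI context docs:"
    echo "$matches" >&2
  fi
}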

check_context_headings
check_prompt_nonempty
check_links
check_no_local_paths

if [[ "$fail" -ne 0 ]]; then
  echo "FAILED: scripts/ai-context sanity checks failed." >&2
  exit 1
fi

echo "OK: scripts/ai-context sanity checks passed."
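To make the link normalization step in `check_links` easier to follow, here is `normalize_target` copied on its own with a few sample inputs. The sample link targets are made up for illustration; only the function body comes from the script.

```bash
#!/usr/bin/env bash
# Standalone copy of normalize_target from check-context.sh, for experimentation.
normalize_target() {
  local raw="$1"
  raw="$(echo "$raw" | sed -E 's/^[[:space:]]+|[[:space:]]+$//g')"  # trim whitespace
  raw="$(echo "$raw" | sed -E 's/[[:space:]]+".*"$//')"              # drop link title
  raw="${raw%%#*}"                                                    # drop #anchor
  raw="$(echo "$raw" | sed -E 's/:[0-9]+(:[0-9]+)?$//')"             # drop :line[:col]
  echo "$raw"
}

# Illustrative inputs (hypothetical link targets):
normalize_target './context/01-architecture.md#scheduler'  # prints ./context/01-architecture.md
normalize_target 'context/00-project-overview.md:12:3'     # prints context/00-project-overview.md
normalize_target '  index.md "Index"  '                    # prints index.md
```

After normalization, an in-page link such as `#section` becomes an empty string, which is why `check_links` skips empty targets before resolving paths.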
61 changes: 61 additions & 0 deletions scripts/ai-context/context/00-project-overview.md
@@ -0,0 +1,61 @@
# Curio Project Overview

## What Curio is
Curio is a clustered Filecoin Storage Provider runtime focused on replacing the legacy `lotus-miner`/`lotus-worker` operational model with a single `curio` binary plus shared control-plane state in YugabyteDB. It is designed for high availability, multi-node scheduling, and multi-miner operation with centralized configuration layering.

## Main problem areas
- Distributed sealing and proving orchestration (SDR, trees, PoRep, PoSt, commit/precommit messaging).
- Reliable on-chain interaction under strict timing and nonce constraints.
- Deal intake, indexing, IPNI announcements, and retrieval serving across a cluster.
- Multi-node storage management (sealing vs long-term storage, path metadata, GC, recovery).
- Operator ergonomics: layered config, GUI/RPC control, alerting, and maintenance workflows.

## Major subsystems
- `harmony/harmonydb`: DB abstraction, schema migrations, and cluster coordination state.
- `harmony/harmonytask` + `harmony/resources`: distributed task engine (polling, claiming, retries, resource gating).
- `tasks/*`: concrete task implementations (sealing, snap, window/winning post, message queue, indexing, GC, balance manager, proofshare, PDP).
- `lib/chainsched` + `tasks/message`: chain head callbacks and message send/watch pipelines.
- `market/*`: market implementations (`mk12`, `mk20`), libp2p endpoints, retrieval, and indexstore integration.
- `cuhttp` + `market/http` + `pdp`: public HTTP server for market/retrieval/IPNI/PDP routes.
- `web/*`: Curio Web UI + JSON-RPC-backed dashboards and admin views.
- `deps/*` + `deps/config/*`: dependency wiring and typed config system (including dynamic/layered behavior).

## External systems and dependencies
- Filecoin chain nodes (Lotus API; ETH-RPC compatibility is required for some market/PDP paths).
- YugabyteDB is core infrastructure (YSQL for HarmonyDB state; YCQL/Cassandra protocol for indexstore).
- Filecoin proving stack via `filecoin-ffi`; optional CUDA/OpenCL/SupraSeal acceleration paths.
- Libp2p + IPNI ecosystem for deal/discovery/index announcements.
- Prometheus metrics + optional Alertmanager/PagerDuty/Slack integrations.
- Let’s Encrypt/autocert support in Curio HTTP mode when TLS is not delegated.

## Primary languages and where they are used
- Go: primary implementation across runtime, tasks, APIs, market, HTTP, and tooling.
- SQL: HarmonyDB schema migrations in `harmony/harmonydb/sql`.
- CQL: indexstore schema in `market/indexstore/cql`.
- JavaScript/HTML/CSS: Web UI under `web/static` (ES modules + Lit-based components).
- Shell/Make: build/test/devnet automation in `scripts/makefiles`, `docker/`, CI workflows.

## What makes this codebase non-trivial
- The scheduler is distributed and DB-coordinated, not a single-process queue.
- Task correctness depends on chain timing, retries, and reorg-aware callbacks.
- Hardware-sensitive execution paths (GPU/CPU/OpenCL/CUDA/SupraSeal) materially change behavior and build requirements.
- State is long-lived in DB and frequently evolved via migrations; runtime behavior is tightly coupled to schema.
- Multiple product surfaces coexist: sealing/proving, market MK1.2/MK2.0, retrieval, IPNI, PDP, proofshare.

## Things an AI should understand before suggesting changes
- Treat Yugabyte-backed schema/state as part of the protocol of the system; schema and task logic must evolve together.
- Respect task idempotency and ownership/claim semantics in HarmonyTask; duplicate work prevention is task-defined.
- On-chain messaging logic is safety-critical (nonce ordering, batching, retries, failure handling).
- Config is layered (`base` always included) and can affect many machines/miners at once; avoid node-local assumptions.
- Curio is cluster software: scheduling, storage access, and maintenance actions (cordon/uncordon) have cross-node effects.
- PoSt and sealing concurrency has explicit operational safety warnings in code/docs.
- Test strategy is integration-heavy (local PG + Scylla for dev; CI matrix shards); unit-only validation is insufficient.
- Network boundaries matter: one Curio cluster should not mix miners from different Filecoin networks.

## Known areas of ongoing evolution
- Market 2.0 (`mk20`) is active and contract/product-driven; docs and code both indicate ongoing expansion.
- PDP is explicitly labeled alpha/under development.
- Proofshare/Snark market features are marked experimental.
- GUI is under active iteration; pages and behaviors may change.
- Dynamic config hot-reload coverage is growing (not all settings are restart-free yet).
- Forest integration status is **uncertain** across docs (references exist, but some docs still call it in-progress).