filecoin-project · LexLuthr · Mar 23, 2026 · Mar 23, 2026 · Reiers · Mar 23, 2026
@@ -0,0 +1,46 @@
+# Curio AI Context Entry Point
+
+This repository uses modular AI context files under [`scripts/ai-context/`](scripts/ai-context/).
+
+## Start Here
+- Read [`scripts/ai-context/README.md`](scripts/ai-context/README.md) for purpose and workflow.
+- Use [`scripts/ai-context/index.md`](scripts/ai-context/index.md) to choose which files to load for a task.
+
+## Runtime Context (Coding/Design Chats)
+- Load context files from [`scripts/ai-context/context/`](scripts/ai-context/context/).
+- Default context loading (for every new coding/design chat):
+  1. `04-coding-preferences.md`
+  2. `00-project-overview.md`
+  3. `01-architecture.md`
+  4. `05-common-patterns.md`
+- Conditional context (load only when relevant):
+  - DB/schema/pipeline-state work:
+    - `03-database-semantics.md`
+  - Active or moving subsystem work:
+    - `06-current-subsystems.md`
+    - `07-recent-decisions.md`
+    - `08-open-questions.md`
+- Runtime rules:
+  - Do not summarize context files unless asked.
+  - If task-specific input conflicts with older context, call out the conflict and follow task-specific input.
+  - Treat prompt files under `scripts/ai-context/prompt/` as maintenance-only, not runtime context.
+  - Never add machine-specific local absolute paths or local-editor URIs to repository files; use repo-relative paths only.
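
The loading rules in this section can be sketched as a small lookup. This is a hypothetical helper, not part of this PR: the function name `context_files_for` and the task keywords `db` and `subsystem` are assumptions made for illustration; only the file names and their order come from the list above.

```bash
# Hypothetical helper (not part of the repository): print the context files
# to load for a given kind of task, following the rules listed above.
context_files_for() {
  local task="${1:-default}"            # assumed keywords: default, db, subsystem
  local dir="scripts/ai-context/context"

  # Default files, in the documented order.
  local files=(
    "04-coding-preferences.md"
    "00-project-overview.md"
    "01-architecture.md"
    "05-common-patterns.md"
  )

  # Conditional files, added only when relevant to the task.
  case "$task" in
    db)        files+=("03-database-semantics.md") ;;
    subsystem) files+=("06-current-subsystems.md" "07-recent-decisions.md" "08-open-questions.md") ;;
  esac

  local f
  for f in "${files[@]}"; do
    printf '%s/%s\n' "$dir" "$f"
  done
}
```

For example, `context_files_for db` prints the four default files followed by `scripts/ai-context/context/03-database-semantics.md`, all as repo-relative paths.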
+
+## Required Verification Before Finalizing Code Edits
+- Trigger: run this only when the user indicates the coding pass is done (for example: "finalize").
+- For Go/backend/codegen-affecting edits, run these from repo root:
+1. `LANG=en-US FFI_USE_OPENCL=1 make gen`
-1. `LANG=en-US FFI_USE_OPENCL=1 make gen`
+1. `LANG=en-US FFI_USE_OPENCL=1 make gen`  # OpenCL flag avoids CUDA build dependency during codegen
+2. `golangci-lint run -v --timeout 15m --concurrency 4`
+- `make gen` is the canonical generation/import-format path; do not run separate formatting passes by default.
+- Do not rerun expensive checks unless code changed after the last verify run or the user asks for rerun.
+- Do not claim checks passed unless they were actually run.
+- If any check cannot be run (missing tool/dependency/environment/time), state that explicitly with the failing command.
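
One way to honor the last two rules (never claim a pass that was not run, and name the command that could not run) is a thin wrapper around each check. The sketch below is an assumption for illustration, not part of this PR: `run_check` and its status strings are hypothetical, while the two commands in the trailing comment are the ones listed above.

```bash
# Hypothetical wrapper (not part of the repository): run one verification
# command and report whether it passed, failed, or could not be run.
run_check() {
  local tool="$1"; shift   # executable that must exist, e.g. "make"
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "NOT RUN (missing tool: $tool): $*"
    return 2
  fi
  if "$@"; then
    echo "PASSED: $*"
  else
    echo "FAILED: $*"
    return 1
  fi
}

# From repo root, once the user signals the coding pass is done:
#   run_check make env LANG=en-US FFI_USE_OPENCL=1 make gen
#   run_check golangci-lint golangci-lint run -v --timeout 15m --concurrency 4
```

The distinct return codes (1 for a failed check, 2 for a check that could not start) keep "failed" and "not run" from being reported as the same outcome.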
+
+## Prompt Templates (Context Maintenance)
+- Prompt templates are in [`scripts/ai-context/prompt/`](scripts/ai-context/prompt/).
+- Prompt files are for creating/updating context docs, not for normal runtime coding chats.
+
+## Update Model
+- Context is maintained on `main` via PRs.
+- Rolling files should be updated from commit deltas on `main` (commit-based, not time-based).
+- Use [`scripts/ai-context/context/99-context-validation-loop.md`](scripts/ai-context/context/99-context-validation-loop.md) after major context updates.
@@ -0,0 +1,80 @@
+# Curio AI Context
+
+## What this is
+`scripts/ai-context/` is Curio's repository-native AI briefing system.
+
+It has two parts:
+- `context/`: reusable project briefings for runtime coding/design chats.
+- `prompt/`: templates for generating or updating context files.
+
+This directory is intended for both humans and AIs.
+
+## Directory layout
+- [`context/`](./context/): project context documents (`00`-style numbered files).
+- [`prompt/`](./prompt/): context maintenance prompts (`00`-style and utility prompts).
+- [`index.md`](./index.md): quick lookup for which file to load/use.
+
+## How to start a new chat
+Use a small relevant subset of `context/` files, not the full pack.
+
+Suggested baseline:
+1. `04-coding-preferences.md`
+2. `00-project-overview.md`
+3. `01-architecture.md`
+4. `03-database-semantics.md`
+5. `05-common-patterns.md`
+
+Add when needed:
+- `06-current-subsystems.md` for active moving areas.
+- `07-recent-decisions.md` for decision-sensitive work.
+- `08-open-questions.md` when uncertainty may affect design choices.
+
+## Context maintenance model
+- Source of truth branch: `main`.
+- Updates happen through PRs.
+- Rolling files should use commit-based deltas from `main`.
+
+Commit-based update flow (example):
+1. Pick target context file (for example `context/07-recent-decisions.md`).
+2. Find baseline commit = last commit that touched that file.
+3. Review commits/diffs from baseline to current `main`.
+4. Update only durable or materially relevant content.
+
+Useful commands:
+```bash
+# from repo root
+TARGET=scripts/ai-context/context/07-recent-decisions.md
+BASE=$(git log -1 --format=%H -- "$TARGET")
+
+# inspect what changed on main since that context file was last updated
+git log --oneline "${BASE}..main"
+git diff --name-only "${BASE}..main"
+git diff "${BASE}..main"
+```
+
+## Minimal edit checklist
+Use this checklist for edits to `context/` and `prompt/`:
+- Preserve the existing top-level structure unless there is a strong reason to change it.
+- Do not invent details; if uncertain, label uncertainty explicitly.
+- Keep context compact and reusable; remove stale/contradicted claims.
+- For rolling files (`06`, `07`, `08`), base the update on commit deltas from `main` and record the commit range used.
+
+## Which prompt to use
+- Create/regenerate a specific context file: corresponding prompt in `prompt/` (same number/name when present).
+- Incremental update of an existing context file: `prompt/util-incremental-update.md`.
+- Recompress/clean noisy context file: `prompt/util-recompress.md`.
+- Regenerate validation loop: `prompt/99-context-validation-loop.md`.
+
+## Validation
+After major context updates, run the process in:
+- [`context/99-context-validation-loop.md`](./context/99-context-validation-loop.md)
+
+This catches contradictions, stale claims, and missing invariants.
+
+For a quick local sanity check (context headings, non-empty prompt files, broken links, and local path leaks):
+```bash
+bash scripts/ai-context/check-context.sh
+```
+
+## Safety invariant
+Context and prompt docs must use repo-relative paths/links only. Never commit machine-specific absolute paths or local URIs.
@@ -0,0 +1,110 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+repo_root="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
+context_dir="$repo_root/scripts/ai-context/context"
+prompt_dir="$repo_root/scripts/ai-context/prompt"
+all_md_dir="$repo_root/scripts/ai-context"
+
+fail=0
+
+err() {
+  echo "ERROR: $*" >&2
+  fail=1
+}
+
+info() {
+  echo "INFO: $*"
+}
+
+check_context_headings() {
+  info "Checking context file headings..."
+  while IFS= read -r -d '' file; do
+    first_nonempty="$(awk 'NF {print; exit}' "$file")"
+    if [[ -z "$first_nonempty" || ! "$first_nonempty" =~ ^#\  ]]; then
+      err "Context file must start with a level-1 heading: $file"
+    fi
+  done < <(find "$context_dir" -maxdepth 1 -type f -name '*.md' -print0 | sort -z)
+}
+
+normalize_target() {
+  local raw="$1"
+
+  # Drop surrounding whitespace.
+  raw="$(echo "$raw" | sed -E 's/^[[:space:]]+|[[:space:]]+$//g')"
+
+  # Skip title part in markdown links if present: path "title"
+  raw="$(echo "$raw" | sed -E 's/[[:space:]]+".*"$//')"
+
+  # Remove anchor.
+  raw="${raw%%#*}"
+
+  # Remove line suffix of form :123 or :123:45 for local file references.
+  raw="$(echo "$raw" | sed -E 's/:[0-9]+(:[0-9]+)?$//')"
+
+  echo "$raw"
+}
+
+check_links() {
+  info "Checking markdown links in scripts/ai-context..."
+  while IFS= read -r -d '' file; do
+    while IFS= read -r link; do
+      target="$(normalize_target "$link")"
+
+      [[ -z "$target" ]] && continue
+      [[ "$target" =~ ^https?:// ]] && continue
+      [[ "$target" =~ ^mailto: ]] && continue
+      [[ "$target" =~ ^# ]] && continue
+
+      if [[ "$target" == /* ]]; then
+        resolved="$target"
+      else
+        resolved="$(cd "$(dirname "$file")" && realpath -m "$target")"
+      fi
+
+      if [[ ! -e "$resolved" ]]; then
+        err "Broken link in $file -> $link (resolved: $resolved)"
+      fi
+    done < <(grep -oE '\[[^]]+\]\(([^)]+)\)' "$file" | sed -E 's/.*\(([^)]+)\)/\1/' || true)
+  done < <(find "$all_md_dir" -maxdepth 2 -type f -name '*.md' -print0 | sort -z)
+}
+
+check_prompt_nonempty() {
+  info "Checking prompt files are non-empty..."
+  while IFS= read -r -d '' file; do
+    if [[ ! -s "$file" ]]; then
+      err "Prompt file is empty: $file"
+    fi
+  done < <(find "$prompt_dir" -maxdepth 1 -type f -name '*.md' -print0 | sort -z)
+}
+
+check_no_local_paths() {
+  info "Checking for local absolute path/URI leaks..."
+  local pattern='/Users/|/home/|[A-Za-z]:\\Users\\|file://|vscode://|/var/folders/'
+  local matches  # grep (not ripgrep) so a missing optional tool cannot silently skip the scan
+
+  matches="$(
+    grep -nE -e "$pattern" \
+      "$repo_root/AGENTS.md" \
+      "$all_md_dir"/*.md \
+      "$context_dir"/*.md \
+      "$prompt_dir"/*.md 2>/dev/null || true
+  )"
+
+  if [[ -n "$matches" ]]; then
+    err "Found local absolute path/URI patterns in AI context docs:"
+    echo "$matches" >&2
+  fi
+}
+
+check_context_headings
+check_prompt_nonempty
+check_links
+check_no_local_paths
+
+if [[ "$fail" -ne 0 ]]; then
+  echo "FAILED: scripts/ai-context sanity checks failed." >&2
+  exit 1
+fi
+
+echo "OK: scripts/ai-context sanity checks passed."
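
To make the link normalization in `check_links` easier to review, here is a copy of `normalize_target` from the script above, with short comments, followed by a few sample calls. The sample inputs (including the `#scheduler` anchor and the `:42:7` suffix) are made up for illustration and do not refer to real links in the repository.

```bash
# Copy of normalize_target from check-context.sh, followed by sample inputs.
normalize_target() {
  local raw="$1"
  raw="$(echo "$raw" | sed -E 's/^[[:space:]]+|[[:space:]]+$//g')"  # trim whitespace
  raw="$(echo "$raw" | sed -E 's/[[:space:]]+".*"$//')"            # drop link title
  raw="${raw%%#*}"                                                  # drop anchor
  raw="$(echo "$raw" | sed -E 's/:[0-9]+(:[0-9]+)?$//')"           # drop :line[:col]
  echo "$raw"
}

normalize_target './context/01-architecture.md#scheduler "Architecture"'
# prints: ./context/01-architecture.md
normalize_target 'check-context.sh:42:7'
# prints: check-context.sh
normalize_target '#validation'
# prints an empty line; check_links skips empty targets
```

Because the anchor is stripped before the existence check, a pure in-page link such as `#validation` normalizes to an empty string and is skipped rather than reported as broken.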
@@ -0,0 +1,61 @@
+# Curio Project Overview
+
+## What Curio is
+Curio is a clustered Filecoin Storage Provider runtime focused on replacing the legacy `lotus-miner`/`lotus-worker` operational model with a single `curio` binary plus shared control-plane state in YugabyteDB. It is designed for high availability, multi-node scheduling, and multi-miner operation with centralized configuration layering.
+
+## Main problem areas
+- Distributed sealing and proving orchestration (SDR, trees, PoRep, PoSt, commit/precommit messaging).
+- Reliable on-chain interaction under strict timing and nonce constraints.
+- Deal intake, indexing, IPNI announcements, and retrieval serving across a cluster.
+- Multi-node storage management (sealing vs long-term storage, path metadata, GC, recovery).
+- Operator ergonomics: layered config, GUI/RPC control, alerting, and maintenance workflows.
+
+## Major subsystems
+- `harmony/harmonydb`: DB abstraction, schema migrations, and cluster coordination state.
+- `harmony/harmonytask` + `harmony/resources`: distributed task engine (polling, claiming, retries, resource gating).
+- `tasks/*`: concrete task implementations (sealing, snap, window/winning post, message queue, indexing, GC, balance manager, proofshare, PDP).
+- `lib/chainsched` + `tasks/message`: chain head callbacks and message send/watch pipelines.
+- `market/*`: market implementations (`mk12`, `mk20`), libp2p endpoints, retrieval, and indexstore integration.
+- `cuhttp` + `market/http` + `pdp`: public HTTP server for market/retrieval/IPNI/PDP routes.
+- `web/*`: Curio Web UI + JSON-RPC-backed dashboards and admin views.
+- `deps/*` + `deps/config/*`: dependency wiring and typed config system (including dynamic/layered behavior).
+
+## External systems and dependencies
+- Filecoin chain nodes (Lotus API; ETH-RPC compatibility is required for some market/PDP paths).
+- YugabyteDB is core infrastructure (YSQL for HarmonyDB state; YCQL/Cassandra protocol for indexstore).
+- Filecoin proving stack via `filecoin-ffi`; optional CUDA/OpenCL/SupraSeal acceleration paths.
+- Libp2p + IPNI ecosystem for deal/discovery/index announcements.
+- Prometheus metrics + optional Alertmanager/PagerDuty/Slack integrations.
+- Let’s Encrypt/autocert support in Curio HTTP mode when TLS is not delegated.
+
+## Primary languages and where they are used
+- Go: primary implementation across runtime, tasks, APIs, market, HTTP, and tooling.
+- SQL: HarmonyDB schema migrations in `harmony/harmonydb/sql`.
+- CQL: indexstore schema in `market/indexstore/cql`.
+- JavaScript/HTML/CSS: Web UI under `web/static` (ES modules + Lit-based components).
+- Shell/Make: build/test/devnet automation in `scripts/makefiles`, `docker/`, CI workflows.
+
+## What makes this codebase non-trivial
+- The scheduler is distributed and DB-coordinated, not a single-process queue.
+- Task correctness depends on chain timing, retries, and reorg-aware callbacks.
+- Hardware-sensitive execution paths (GPU/CPU/OpenCL/CUDA/SupraSeal) materially change behavior and build requirements.
+- State is long-lived in DB and frequently evolved via migrations; runtime behavior is tightly coupled to schema.
+- Multiple product surfaces coexist: sealing/proving, market MK1.2/MK2.0, retrieval, IPNI, PDP, proofshare.
+
+## Things an AI should understand before suggesting changes
+- Treat Yugabyte-backed schema/state as part of the protocol of the system; schema and task logic must evolve together.
+- Respect task idempotency and ownership/claim semantics in HarmonyTask; duplicate work prevention is task-defined.
+- On-chain messaging logic is safety-critical (nonce ordering, batching, retries, failure handling).
+- Config is layered (`base` always included) and can affect many machines/miners at once; avoid node-local assumptions.
+- Curio is cluster software: scheduling, storage access, and maintenance actions (cordon/uncordon) have cross-node effects.
+- PoSt and sealing concurrency has explicit operational safety warnings in code/docs.
+- Test strategy is integration-heavy (local PG + Scylla for dev; CI matrix shards); unit-only validation is insufficient.
+- Network boundaries matter: one Curio cluster should not mix miners from different Filecoin networks.
+
+## Known areas of ongoing evolution
+- Market 2.0 (`mk20`) is active and contract/product-driven; docs and code both indicate ongoing expansion.
+- PDP is explicitly labeled alpha/under development.
+- Proofshare/Snark market features are marked experimental.
+- GUI is under active iteration; pages and behaviors may change.
+- Dynamic config hot-reload coverage is growing (not all settings are restart-free yet).
+- Forest integration status is **uncertain** across docs (references exist, but some docs still call it in-progress).