Releases: docxology/template
Research Project Template v3.4.0
ddd0b9d release: v3.4.0 — comprehensive review + thermo-nuclear v2 + backlog closeout
5aacc3b fix(ci): stabilize active_inference timeout and link-check xdist worker
38ec01d docs(backlog): close REPRO-VERIFY-1 + EVIDENCE-CLAIM-1; reframe RELEASE-TAG-1; refresh snapshot
00ff182 fix(ci): sync generated docs and repair maintenance link
ed01f5b style(infra): ruff-format deferred follow-up leaves
ccc2d91 refactor(infra): complete deferred thermo-nuclear audit follow-ups
370718b fix(ci): refresh COUNTS and reporting doc paths after close-out
ef90407 docs(infra): point multi-project reporting docs at reporting leaf
8db3bb9 refactor(infra): close out thermo-nuclear audit waves F–O
4ac5fcf test(llm): use the discovered Ollama model so requires_ollama tests pass with any pulled model
621d357 chore(skills): regenerate skill manifest after the merge's SKILL.md edits
2b11c5f fix(docs): retarget archived-audit links from canonical_facts.md to COUNTS.md
15c3602 fix(template_code_project): keep src/ standalone (no infrastructure import) + standalone-loadable
352dd95 chore: capture concurrent thermo-nuclear-remediation pass (idle ~5h, uncommitted WIP)
680b102 chore(counts): regenerate COUNTS.md after Q2-A scientific module deletions
deb4e59 feat+refactor: Daniel-approved decisions Q1-B/Q2-A/Q3-A/Q4-B/Q5-A
30a800d chore(counts): regenerate COUNTS.md after committing the generator (553->554 tracked infra .py)
48d2b4c feat+refactor: COUNTS generator closes the doc-drift loop; canonical_facts->COUNTS rename (Phase D / SystemsThinking root-cause)
4667ae4 fix+docs: remaining RedTeam findings — error-handling, gates, determinism, env hygiene (Phases F/G)
6916207 test(active_inference): reach 90%+ standalone coverage; stop dirtying tracked manuscript (Phase E / RedTeam EX-2/EX-6)
4166e49 feat(determinism): SOURCE_DATE_EPOCH byte-stability for PDFs/manuscripts/data — Phase C
dc2a108 feat(pipeline): reproducible project x stage run matrix (run.config) — Phase B
0c20b9d docs: complete dead-module doc cleanup for removed modules (Phase A finish)
0f77780 refactor: close thermo-nuclear remediation (poster, CLI, test splits, AGENTS stubs)
e79193d refactor: remove 3 confirmed-dead modules (RedTeam INFRA-CORE-03/VAL-2/REN-1)
244aa3f fix(test): align repro-bundle absent-output tests with project-relative paths (REPRO-1 follow-up)
c02a563 style: ruff format test_backends_http.py (arxiv test from ARX-1)
240fd6c docs: accuracy pass across docs/, infrastructure AGENTS, scripts, exemplars (RedTeam 30+ findings)
e3d8345 fix(deep_research): fail-fast provider validation in submit_many (RedTeam DR-1)
6adae5f fix(scripts,search): drop dead MENU_SCRIPT_MAPPING; fix old-style arXiv IDs (RedTeam SCRIPTS-03/ARX-1)
3193ad7 fix(orchestration,pipeline,publishing): 4 correctness fixes (RedTeam ARCH-01/INFRA-CORE-02/PUB-1/VAL-1)
72fef86 refactor(scripts): group operator tooling into scripts/maintenance/ (RedTeam SCRIPTS-08)
5aafdf6 fix(exemplars): gate active_inference figure test on artifacts; correct claim ledger (RedTeam EX-1)
20715f3 fix(publishing): rebase repro-bundle output paths so verify actually works (RedTeam REPRO-1)
c0f20cc fix(scripts,docs): repair broken bootstraps + refresh stale generated docs
3c23094 fix(gates): harden confidentiality guard + no-mocks enforcer (RedTeam CONF-1/GATE-1/GATE-3)
bb82e8c Update 08_methods_sheaf.md
b2ac6d2 up
2c15511 feat: deep research, output validation, Kmyth steganography, and infra refresh
138d63a feat(exemplars): active_inference use-when landed — remove self-expiring roster pin
ee87e2b feat(active_inference): land in-flight multi-track evolution + newcomer-grade sheaf docs
808c819 feat(exemplars): uniform differentiation map + roster-derived git guards
b5e9265 fix(search): retry arXiv fetch_by_id on 429/503 with exponential backoff
48cb644 fix(ci): land prior-session work the regenerated factsheet pins depend on
208e532 style: ruff-format working_render.py (CI format-check blocker at HEAD)
5838221 fix(ci): land in-tree fixes for both pre-existing main CI failures
43dcd3a test+docs+ci: purge vacuous tests, real-HTTP Zenodo test, TeX Live on CI, mermaid provisioning docs
96661c2 docs+test+script RedTeam: fix doc drift, thin-orchestrator violations, dedupe tests
7c18a33 feat(EVIDENCE-CLAIM-1): ingest autoresearch claim ledger into evidence graph
ae38ec9 fix(publishing): REPRO-VERIFY-1 — fail closed on declared-but-absent outputs
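Commit 4166e49 above refers to SOURCE_DATE_EPOCH byte-stability. As a rough illustration of that general convention (not the template's actual implementation), a build step can read the variable and fall back to the wall clock only when it is unset. The helper name build_timestamp and the sample epoch value are assumptions made for this sketch.

```python
import os
from datetime import datetime, timezone


def build_timestamp(environ=None):
    """Return the timestamp to embed in generated artifacts.

    Hypothetical helper: if SOURCE_DATE_EPOCH is set (seconds since the
    Unix epoch), use it so repeated builds embed the same value; otherwise
    fall back to the current UTC time.
    """
    env = os.environ if environ is None else environ
    raw = env.get("SOURCE_DATE_EPOCH")
    if raw is None:
        return datetime.now(timezone.utc)
    return datetime.fromtimestamp(int(raw), tz=timezone.utc)


# Two "builds" with the same pinned epoch produce identical metadata.
pinned = {"SOURCE_DATE_EPOCH": "1700000000"}  # sample value, not from the release
first = build_timestamp(pinned).isoformat()
second = build_timestamp(pinned).isoformat()
```

Passing the environment as a parameter keeps the sketch testable without mutating os.environ; a real pipeline would more likely export the variable once for the whole run.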
v3.3.0
[3.3.0] — 2026-06-07
Added
- 🔎 Reference-existence verification (infrastructure/reference/verification): deterministic anti-hallucination gate that resolves each cited reference against Crossref → OpenAlex / arXiv, classifying it as ok/mismatch/fabricated/unverifiable/unchecked/anachronism. Offline-first with a persistent SQLite cache; live resolution is opt-in. CLI: python -m infrastructure.reference.verification verify <bib>.
- ✍️ AI-writing fingerprint detector (infrastructure/validation/content/ai_writing.py, validation.cli prose-quality): flags AI-typical phrasing, em-dash density, and low sentence-length burstiness. Both distilled clean-room from Academic Research Skills ideas (CC-BY-NC-4.0); no code vendored.
- 🕸️ Evidence graph (infrastructure/reporting/evidence_graph.py): typed producer/consumer/validator/claim/artifact graph assembled from the real stage DAG, with a query API and byte-stable JSON (EVIDENCE-GRAPH-1).
- 📦 Reproduction bundle (infrastructure/publishing/repro_bundle.py, scripts/10_repro_bundle.py): deterministic repro manifest (lockfile, artifact hashes, canonical-facts pointer, repro command) plus a fail-closed verifier (REPRO-BUNDLE-1).
- 📊 Release-readiness dashboard (infrastructure/reporting/release_readiness.py): local, no-network report aggregating docs-lint, coverage/test facts, pipeline snapshots, evidence-graph status, and release metadata (DASHBOARD-1).
- 🧩 Pipeline plugin stages (infrastructure/core/pipeline/plugins.py): schema-validated projects/{name}/pipeline_plugins.yaml adds DAG stages without core edits. Opt-in; default plan unchanged (PLUGIN-STAGES-1).
- ⏭️ Incremental pipeline skipping (infrastructure/core/pipeline/incremental.py, IncrementalConfig): content-hash stage skipping with downstream invalidation and fail-safe (never skip when outputs absent). Opt-in, default-off (INCREMENTAL-PIPELINE-1).
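The incremental-skipping entry names three rules: skip on unchanged content hash, invalidate downstream stages when an upstream stage reruns, and never skip when outputs are absent. A minimal sketch of how those rules could combine is shown here. The function names, the stage dictionary layout, and the hashing scheme are all hypothetical and are not the IncrementalConfig API.

```python
import hashlib
import tempfile
from pathlib import Path


def content_hash(paths):
    """Hash file contents (not mtimes) in a stable, sorted order."""
    digest = hashlib.sha256()
    for path in sorted(Path(p) for p in paths):
        digest.update(str(path).encode("utf-8"))
        digest.update(path.read_bytes())
    return digest.hexdigest()


def plan_skips(stages, previous_hashes):
    """Return (names of skippable stages, freshly computed hashes).

    stages is an ordered list of dicts with "name", "inputs", "outputs",
    and "deps" keys (a hypothetical layout for this sketch).
    """
    skipped, new_hashes, rerun = [], {}, set()
    for stage in stages:
        name = stage["name"]
        new_hashes[name] = content_hash(stage["inputs"])
        unchanged = previous_hashes.get(name) == new_hashes[name]
        # Fail-safe: a missing output always forces a rerun.
        outputs_present = all(Path(o).exists() for o in stage["outputs"])
        # Downstream invalidation: rerun if any dependency reruns.
        upstream_rerun = any(dep in rerun for dep in stage["deps"])
        if unchanged and outputs_present and not upstream_rerun:
            skipped.append(name)
        else:
            rerun.add(name)
    return skipped, new_hashes


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "data.csv"
    out = Path(tmp) / "figure.png"
    src.write_text("x,y\n1,2\n")
    stages = [
        {"name": "analyze", "inputs": [src], "outputs": [out], "deps": []},
        {"name": "render", "inputs": [src], "outputs": [], "deps": ["analyze"]},
    ]
    first_skips, hashes = plan_skips(stages, {})   # nothing cached yet
    second_skips, _ = plan_skips(stages, hashes)   # output file still absent
    out.write_bytes(b"placeholder")
    third_skips, _ = plan_skips(stages, hashes)    # inputs unchanged, outputs present
```

In the second pass the render stage is not skipped even though its own inputs are unchanged, because the stage it depends on has to rerun.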
Changed
- ⚡ Parallel infrastructure tests: CI test-infra runs with pytest-xdist -n auto (~892s → ~585s per leg); suite verified parallel-safe.
- 🧬 Dynamic CI project matrix: test-project derives its matrix from infrastructure.project.public_scope via fromJSON (detect-projects job), so adding/retiring a templates/ exemplar no longer edits the matrix literal (CI-MATRIX-DYNAMIC-1).
- 🔇 Quieter terminal logging: console handler floors at INFO (no DEBUG/spinner chrome on stdout) while the file handler retains timestamped DEBUG; per-file render internals demoted to DEBUG; default -v dropped from pytest addopts (LOG-CLEAN-1).
- 🧱 Consolidated safe markdown reader: infrastructure/validation/docs/_io.py hosts read_markdown; doc linters route their read-and-skip sites through it (READFILE-SAFE-1).
- 📚 Documentation accuracy passes: deep audit + fixes across docs/ and every infrastructure/*/{SKILL,README,AGENTS}.md, correcting examples that cited methods/params/CLI flags/test paths that no longer exist; new deterministic infra is wired into the docs/prompts workflows.
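The quieter-logging change separates what the console shows from what the log file keeps. A minimal sketch of that split with the standard logging module follows. The function name, logger name, and format strings are assumptions for illustration, not the template's actual logging setup.

```python
import logging
import sys
import tempfile
from pathlib import Path


def configure_logging(log_path, name="template_demo"):
    """Attach an INFO-level console handler and a DEBUG-level file handler."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # let each handler decide what to drop
    logger.propagate = False

    console = logging.StreamHandler(sys.stdout)
    console.setLevel(logging.INFO)  # console floor: no DEBUG noise
    console.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

    file_handler = logging.FileHandler(log_path, encoding="utf-8")
    file_handler.setLevel(logging.DEBUG)  # file keeps timestamped detail
    file_handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )

    logger.addHandler(console)
    logger.addHandler(file_handler)
    return logger


log_file = Path(tempfile.mkdtemp()) / "pipeline.log"
logger = configure_logging(log_file)
logger.debug("rendering chapter 3")    # written to the file only
logger.info("render stage complete")   # written to console and file
for handler in logger.handlers:
    handler.flush()
file_text = log_file.read_text(encoding="utf-8")
```

Setting the logger itself to DEBUG and filtering per handler is what lets one call site feed both destinations at different verbosity.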
Research Project Template v3.2.0
Minor public release focused on agentic workflow routing, public-scope hardening, documentation auditability, and release-gate reliability.
🧭 Agent Routing
- Adds the first-party template-agentic-use workflow under docs/prompts/agentic-use/.
- Routes requests such as “make template more agentic,” “find relevant skills,” and “improve agent routing” to a narrower child workflow instead of overloading template-workflows.
- Documents agent onboarding, local skill inventory, workflow selection, contract checks, eval use, and external skill review.
- Keeps repo-local skills as the primary routing layer and treats external skills as optional developer-level companions.
- Refreshes generated routing surfaces:
  - docs/prompts/SKILL.md
  - docs/prompts/README.md
  - docs/prompts/MODE_REGISTRY.md
  - docs/_generated/skills_index.md
  - .cursor/skill_manifest.json
- Extends prompt eval coverage with agentic-use inventory and routing-hardening scenarios.
🧪 Validation And Skill Evals
- Skill contracts remain strict through:
  - uv run python -m infrastructure.skills check
  - uv run python -m infrastructure.skills check-contracts
  - uv run python -m infrastructure.skills check-all-exports
- Skill eval harness result:
  - 29 evals
  - 58 grading runs
  - with_skill all: 100.0%
  - with_skill positive-only: 100.0%
  - threshold: 0.96
- Adds document-consistency validation for memory and decision-record rules.
- Adds public documentation audit coverage for private-path, output-artifact, and generated-surface hazards.
- Expands validation tests around link extraction, documentation accuracy, public audits, and memory-decision documentation.
- Keeps the no-mocks policy enforced through both direct verification and pre-push smoke gates.
🧱 CI And Public Scope
- Bumps public release metadata to 3.2.0 in pyproject.toml, uv.lock, and CITATION.cff.
- Updates CHANGELOG.md with a dense v3.2.0 section.
- Strengthens project-test coverage handling by invoking coverage through the active Python interpreter with python -m coverage.
- Preserves public scope discipline through infrastructure.project.public_scope source-paths.
- Keeps Ruff, format-check, and mypy scoped to the public CI source path set.
- Confirms broad infra coverage remains above the 60% floor.
- Confirms public project split coverage remains above the combined project gate.
- Keeps generated output/ artifacts out of the release commit.
🔐 Public Repo Safety
- Reconfirms the confidentiality guard:
  - only public canonical template project paths are tracked
  - rotating projects/active, projects/working, projects/archive, projects/published, and projects/other trees remain excluded
- Reconfirms the generated-artifact guard:
  - no tracked generated artifacts detected
  - no staged output/ release diff
- Documents the no-vendoring default for external skills.
- Adds public audit tooling for documentation surfaces that can accidentally leak private or generated paths.
- Keeps local eval artifacts, caches, coverage outputs, and generated project outputs outside the public release.
- Preserves the dirty original checkout by preparing and validating the release in a separate worktree.
📚 Documentation
- Adds docs/rules/memory_and_decision_records.md.
- Adds ADR 005 for decision memory and adversarial validation.
- Updates guide material in docs/guides/extending-and-automation.md with an “Agentic use” section.
- Updates agent-facing documentation across AGENTS.md surfaces to reflect new routing and public-safety expectations.
- Updates development, audit, maintenance, architecture, and CI documentation with current public-scope and validation language.
- Expands active-inference exemplar scholarship and sheaf-method documentation.
- Adds scholarship source-map structure for active-inference roadmap tracks.
- Keeps measured or volatile claims routed through generated or validated surfaces rather than unmanaged prose.
🚀 Pipeline And Publishing
- Hardens the active-inference manuscript-variable generator with a faster --allow-draft path while preserving strict generation semantics.
- Adds compatibility for combined-source rendering expectations through a retained source-file alias.
- Improves release workflow reliability after worktree relocation by avoiding stale executable wrappers in coverage reporting.
- Keeps publication metadata in sync across Python package metadata, citation metadata, and changelog narrative.
- Supports the release publication flow through an annotated v3.2.0 tag.
🧰 Developer Workflow
- Release prepared from fresh origin/main in a scoped worktree: /Users/4d/.config/superpowers/worktrees/template/codex-release-v3.2.0
- Release branch: codex/release-v3.2.0
- Release commit: cff9fa05 chore(release): prepare v3.2.0
- Local main fast-forwarded to the release commit after direct push succeeded.
- gh-axi used for release publication and verification.
- External companion skills remain reference material or personal installs only; they are not vendored into the public repository.
✅ Verification
Fresh release gates run before commit, push, tag, and release publication:
- uv run python scripts/check_tracked_projects.py
- uv run python scripts/check_tracked_generated_artifacts.py
- git diff --check
- uv run python -m infrastructure.skills check
- uv run python -m infrastructure.skills check-contracts
- uv run python -m infrastructure.skills check-all-exports
- uv run pytest tests/infra_tests/skills -q
- uv run python docs/prompts/_skill-eval/scripts/run_eval_harness.py --write-review --fail-under 0.96
- uv run python scripts/lint_docs.py
- uv run python scripts/verify_no_mocks.py
- uv run python -m infrastructure.project.public_scope source-paths | xargs uv run ruff check
- uv run python -m infrastructure.project.public_scope source-paths | xargs uv run ruff format --check
- uv run python -m infrastructure.project.public_scope source-paths | xargs uv run mypy
- uv run bandit -c bandit.yaml -r -ll infrastructure/ scripts/ projects/
- uv run pip-audit
- COVERAGE_FILE=.coverage.infra uv run pytest tests/infra_tests/ --cov=infrastructure --cov-fail-under=60 -m "not requires_ollama and not slow and not bench" -q
- COVERAGE_FILE=.coverage.project uv run python scripts/01_run_tests.py --project-only --all-projects --public-projects --non-strict --include-slow
- Pre-push quick hooks for generated-artifact safety, no-mocks policy, hook smoke pytest, docs contract guard, Bandit quick, skill manifest freshness, and export audit.
Result highlights:
- Infra tests: 7184 passed, 1 skipped, 70 deselected.
- Infra coverage: 82.39%, above the 60% floor.
- Public project split gate: combined coverage 92.89%, above the configured gate.
- Skill tests: 99 passed.
- Skill eval harness: 100.0% with-skill score against the 0.96 threshold.
- Documentation lint: 236 Mermaid blocks checked, 0 broken links, 0 consistency issues, 0 doc-pair issues.
- Security checks: Bandit medium/high clean, pip-audit clean.
- Release absence verified before tagging: no local tag, no remote tag, and no existing GitHub release for v3.2.0.
Research Project Template v3.1.0
What changed
v3.1.0 turns the template's public surface into a tighter checked contract: six public exemplars, typed projects/templates/... documentation, and a project-test harness that no longer depends on incidental coverage versions inside each project virtual environment.
Highlights
Six public exemplars
templates/template_sia is now part of the public scope alongside the code, prose, autoresearch, active-inference, and meta-template exemplars. The generated docs now show that roster from infrastructure.project.public_scope, so release and CI docs cite measured facts instead of handwritten project lists.
Active Inference validation spine
The Active Inference exemplar now emits first-class provenance, reproducibility replay, semantic sheaf, evidence crosswalk, dependency graph, policy-comparison, graph-world, animation, and counterexample artifacts. The manuscript binds those artifacts into provenance/replay/counterexample tracks and the refreshed PDF/web output demonstrates the path end to end.
Documentation as a checked interface
Root docs, folder AGENTS.md files, folder README.md files, generated indexes, GitHub docs, and exemplar docs were re-audited and aligned. Stale projects/template_* public-exemplar paths are now lint failures, and the doc-pair guard covers the new validation-spine and SIA directories.
Test harness reliability
run_per_project_pytest now pins coverage to the workspace version inside project subprocesses before using append mode. That fixes the shared .coverage.project corruption mode seen when public exemplars have different transitive coverage versions.
Release metadata
Package, citation, badge, changelog, generated project roster, generated canonical facts, publication records, skills index, and architecture overview were refreshed for 3.1.0.
Verification
- uv run python scripts/01_run_tests.py --project-only --all-projects --public-projects --include-slow: all six public projects passed; combined coverage gate passed at 93.91 percent.
- Active Inference focused gate: 224 passed, 1 skipped, 91.14 percent coverage.
- Documentation suite: 103 passed across docs validation, documentation index, and discovery consistency tests.
- Focused infra/SIA contract suite: 49 passed.
- Ruff check, Ruff format check, mypy, template drift, skills check, check-all-exports, module line count, SIA task validation, tracked-project guard, generated-artifact guard, and git diff --check passed.
- Pre-push hooks passed, including no-mocks, hook smoke tests, docs contract guard, Bandit medium+ gate, skills freshness, and export audit.
Note: the full scripts/lint_docs.py --mermaid-only renderer sweep hit a local headless Chrome timeout. Mermaid parser/unit coverage passed in the docs test suite, and generated architecture Mermaid/SVG artifacts were refreshed.
Research Project Template v3.0.0
f0edf91 feat: add Zenodo publication DOI and citation info
ff38ead A template/ approach to Reproducible Generative Research: Architecture and Ergonomics from Configuration through Publication
1cc241d update
2a1292e remove
a2a025e update
fb97834 update
7963c55 Update README.md
8feb9e5 update
2265924 desloppify: remove deprecated aliases, narrow exception types
ea9bb14 fix: remove unused Path imports in test files; change _DANGEROUS_PATTERNS to tuple
53c0078 fix: masked-exception-fixes cluster (narrow exception types, promote debug→warning)
15ca548 fix: quick-convention-fixes cluster (emoji logging, stdlib logging, security/config/retry/prompt cleanup)
e9669d0 test: add 16 tests for pipeline_types (PipelineConfig, PipelineStageResult, StageSpec)
eaf35df desloppify: add direct tests for 3 transitive-only modules (session 8)
ff992f9 desloppify: fix error-pattern-consistency cluster (session 8)
538e020 desloppify: add tests for template helpers and review analysis modules
9529617 desloppify: add tests for install_commands and _pdf_latex_helpers modules
4ccd4eb desloppify: fix review queue — explicit fallback warning, resolved false positives
5d08b20 desloppify: remove unused imports of ManuscriptQualityReview, ManuscriptMethodologyReview
fbc46a5 desloppify: extract create_parser(), add ETA tests, use real parser in CLI tests
4bb27c0 desloppify: avoid mid-function config re-read in extract_manuscript_text
60e0adc desloppify: logic clarity, timeout constant, plotly dead code, double env check
bab41a5 desloppify: fix naming, type safety, error context, os.path migration
166b6bc desloppify: fix api surface documentation and config reload issues
56411c5 desloppify: consolidate prompt-system sentinels; remove noisy logging; improve docstrings
22a3120 desloppify: flatten stream_short/stream_long 3-layer chain
1665c1d desloppify: strip llm/__init__.py facade; remove forwarding wrappers
6cf5bb1 refactor: strip unused re-export facade from infrastructure/project/__init__.py
4a22a22 test: add offline tests for retry loop and warmup_model branches
77190ad refactor: inline trivial wrapper methods and improve docstrings
eacdf7c fix: narrow bare except Exception to OSError in tmp-file write patterns
7a96feb fix: rename misleading function names and fix broken test import
50b2788 fix: remove impossible OverflowError, param order bug, and stale migration shims
ba4aac4 desloppify: fix logic bugs, quick-wins, test-quality, incomplete-migrations
0e96f2a desloppify: narrow broad except Exception to specific types in tests and src
1da4321 desloppify: fix validate_structure return type and chain streaming exceptions
0d53148 desloppify: strip bloated docstrings, rename format_benchmark_report
4b508de desloppify: fix type safety and init coupling issues
1228f20 desloppify: fix elegance issues from deferred-elegance cluster
6c7d922 desloppify: fix logic/naming/security issues from review batch
7571641 desloppify: fix contracts, conventions, deprecation
9525c16 desloppify: fix unused imports in test files
a2fcf08 fix test_security.py: update SecurityHeaders references to module-level functions
193acc6 fix: remove unused imports in test files; add coverage_cleanup tests
77a64ce fix: rename ZenodoClient.upload_file bucket→deposition_id, fix URL path; update security.py Union→str|Path
6d04e9f desloppify: add tests for 5 untested modules (fix-test-coverage cluster)
aa9918e desloppify: fix 16 findings across 6 clusters (sessions 5-6)
86827a5 desloppify: fix 5 T1 review issues
f964526 desloppify: fix unused import and annotation quality
8c21c22 desloppify: consolidate functools imports in security.py
Full Changelog: 0.6...v3.0.0
v0.6 — Desloppify: Code Health Campaign
🧹 Code Health: Desloppify Campaign
The largest code-quality improvement cycle since the template's inception. 162 commits across 948 files, systematically eliminating technical debt from all 8 infrastructure packages through 26 rounds of blind review.
Fixes
- Import hygiene: Removed unused imports across 20+ files; eliminated sys.path mutations from CLI modules; proper TYPE_CHECKING guards throughout
- Exception handling: Narrowed broad except Exception / bare except clauses in integrity.py, logging_utils, config_loader, and llm modules; fixed silent JSONDecodeError swallowing; restored exception context with raise ... from exc
- Dead code removal: Deleted orphaned coverage_reporter.py (zero importers); removed stub/passthrough wrapper methods across 10+ modules; eliminated dead HTML-entities dict from InputSanitizer
- Type annotations: Modernised legacy typing imports (List[x] → list[x], Optional[x] → x | None) across 30+ modules; added TypedDict returns for integrity results; annotated CLI re-exports
- API surface: Consolidated LLMConfig env-read wrappers (ABS-001); merged duplicate PerformanceMetrics naming conflict; removed ProjectLogger pure-forwarder abstraction; eliminated calculate_file_hash re-export from publishing boundary
- Bug fixes: Fixed inverted scan_errors bool in doc scanner; fixed stall-detection dead branch in pipeline reporter; fixed config_files path bug in config_cli; fixed clean_output_directory return type; fixed broken accessor imports after core.py hub elimination
- Structural: Eliminated infrastructure/core/core.py hub; extracted _build_stage_list to remove stage-list duplication; moved MultiProjectResult to TYPE_CHECKING to break reporting → core circular dependency
- Logging: Removed noisy debug logs from LLM and environment modules; downgraded verbose entry logs; added get_logger to logic modules lacking structured logging
- Docstrings: Stripped AI-generated boilerplate from 40+ functions; removed restating comments and banner comments
- Tests: Fixed test name collisions; added deterministic tests for validate_review_quality and exception types; added integration tests to testpaths; removed orphan test files
- Dependencies: Removed scipy from infrastructure env check; resolved stale psutil guards; moved matplotlib to optional dep group
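The exception-handling fixes listed above follow a common pattern: catch only the specific error, and re-raise with raise ... from exc so the original cause stays attached. A small sketch of that pattern is shown here. ConfigError and load_settings are hypothetical names invented for this illustration and do not come from the template's code.

```python
import json


class ConfigError(ValueError):
    """Hypothetical domain error used only in this sketch."""


def load_settings(text: str) -> dict:
    """Parse JSON settings, surfacing parse failures instead of hiding them."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:  # narrow: not a bare or broad except
        # Chain the original error so the traceback keeps its context.
        raise ConfigError(f"invalid settings JSON at line {exc.lineno}") from exc


good = load_settings('{"retries": 3}')

try:
    load_settings("{not json")
except ConfigError as err:
    message = str(err)
    cause = err.__cause__
```

Because the handler names json.JSONDecodeError explicitly, unrelated failures such as a TypeError from a non-string argument still propagate unchanged rather than being silently swallowed.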
Quality Gates
| Gate | Status |
|---|---|
| Desloppify blind reviews | 26 rounds |
| Commits | 162 |
| Files changed | 948 |
| ruff check | ✅ 0 errors |
| mypy --strict | ✅ 0 errors (all 8 packages) |
| bandit -ll | ✅ 0 MEDIUM+ findings |
| pip-audit | ✅ Blocking gate |
| pytest | ✅ All pass |
Documentation
- Updated CHANGELOG.md with full v0.6.0 entry
- Updated docs/development/roadmap.md with desloppify results and refreshed quality metrics
- Updated docs/audit/documentation-review-report.md for v0.6 scope
See CHANGELOG.md for full details.
Multiproject
The template now supports multiple projects, along with all associated functionality.