Examples

Committed examples show evidence-pack shape and gate behavior. They are engineering records, not clinical or regulatory artifacts.

Agent navigation: start with INDEX.md for anchor packs before opening study trees.

Layout

examples/
  evidence_packs/     # canonical single-run pass/fail packs
  studies/            # compact narrative studies only; generated records stay ignored
  drift/              # drift comparison examples
  fixtures/           # tiny shared example inputs
  workflows/          # workflow YAML examples

Reading examples

Start with workflow_run_record.md, then inspect:

validation_summary.json for gate status
output.json for parsed skill output
manifest.json for command, fixture, and file inventory
runtime_profile.json, cost_profile.json, and environment.lock for reproducibility context

Committed example environment.lock files are compacted to the packages used by env-pin checks. Regenerate a pack under runs/ when you need a full local pip freeze.

The canonical file list is in docs/replay.md.
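As an orientation aid, the sketch below reports which of the files named in this section are absent from a pack directory. It is a minimal sketch, not part of the repository: the helper name `missing_pack_files` is hypothetical, and the tuple covers only the files mentioned here, so docs/replay.md remains the authority on the full list.

```python
from pathlib import Path

# Files named in "Reading examples"; docs/replay.md holds the canonical list.
READING_ORDER = (
    "workflow_run_record.md",
    "validation_summary.json",
    "output.json",
    "manifest.json",
    "runtime_profile.json",
    "cost_profile.json",
    "environment.lock",
)


def missing_pack_files(pack_dir):
    """Return the names from READING_ORDER that are not regular files in pack_dir."""
    root = Path(pack_dir)
    return [name for name in READING_ORDER if not (root / name).is_file()]
```

An empty list means every file named above is present in the directory you pointed it at.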

Baselines

*_pass/ and *_clean/ packs are drift anchors: each one is a baseline for later runs of the same spec. They are not cross-skill comparisons and should not be read as a performance ranking.

This branch keeps DICOM utilities and NVIDIA-Medtech nv-* skills only. Important anchors:

dicom_metadata_pass/
dicom_metadata_trusted_warn/
dicom_series_preflight_trusted_pass/
dicom_series_to_volume_pass/
dicom_series_to_volume_trusted_pass/
nv_segment_ct_pass/
nv_segment_ct_trusted_pass/
nv_segment_ctmr_trusted_pass/
nv_segment_ct_finetune_trusted_smoke_pass/
nv_generate_ct_rflow_pass/
nv_generate_ct_rflow_trusted_inventory_pass/
nv_generate_mr_trusted_inventory_pass/
nv_generate_mr_brain_trusted_inventory_pass/
nv_reason_cxr_trusted_mock_pass/
benchmark_decathlon_spleen_clean/
benchmark_decathlon_with_corruption/
benchmark_ct_segmentation_spleen_msd09_pass/
ct_segmentation_finetune_quality_v1_pass/
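The `*_pass/` and `*_clean/` naming rule from the start of this section can be expressed as a small filter. This is a sketch only: `is_drift_anchor` is a hypothetical helper, and it tests the directory name alone, not the contents of the pack.

```python
from fnmatch import fnmatch

# Suffix patterns named in the Baselines text.
ANCHOR_PATTERNS = ("*_pass", "*_clean")


def is_drift_anchor(pack_name):
    """True when a pack directory name matches *_pass or *_clean."""
    name = pack_name.rstrip("/")
    return any(fnmatch(name, pattern) for pattern in ANCHOR_PATTERNS)
```

By this rule `dicom_metadata_pass/` and `benchmark_decathlon_spleen_clean/` qualify, while `dicom_metadata_trusted_warn/` and `benchmark_decathlon_with_corruption/` do not.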

Negative packs intentionally fail specific gates, such as invalid DICOM input, silent segmentation failure, integrity failure, benchmark corruption, or spec-completeness failures.

Study packs under examples/studies/ are optional narrative records, not the canonical list. Generated with-vs-without study records now stay under runs/with_vs_without_nv/studies/, with the checked-in summary at docs/with-vs-without-skill-experiment.md.

Regenerate or compare examples with make run-skill, make run-benchmark, and make diff.

evidence_packs/nv_reason_cxr_trusted_mock_pass/ is the deterministic mock CXR reasoning trust anchor. It pairs nv_reason_cxr with nv_reason_cxr_quality_v1 to confirm generated synthetic image handling, image hash binding, runtime identity, response non-emptiness, scope disclosure, and forbidden-phrase guardrails without downloading model weights.

evidence_packs/nv_generate_ct_rflow_trusted_inventory_pass/ is the CT synthesis trusted inventory anchor. It pairs nv_generate_ct_rflow with ct_synthesis_quality_v1 to confirm a real CUDA rflow-ct run, generated image/label artifact bytes and hashes, geometry consistency, CT-HU range floors, label-set sanity, model inventory, and GPU provenance without committing the generated NIfTI volumes.

evidence_packs/nv_generate_mr_trusted_inventory_pass/ and evidence_packs/nv_generate_mr_brain_trusted_inventory_pass/ are the MR synthesis trusted inventory anchors. They pair the image-only MR wrappers with mr_synthesis_quality_v1 to confirm real CUDA rflow-mr/rflow-mr-brain runs, generated image bytes and hashes, requested geometry, finite nonconstant nonnegative voxel values, model inventory, and GPU provenance without committing generated NIfTI volumes.

evidence_packs/nv_segment_ct_finetune_trusted_smoke_pass/ is the NV-Segment-CT continual-finetune smoke trust anchor. It pairs nv_segment_ct_finetune with ct_segmentation_finetune_quality_v1 on the four-case spleen_micro fixture to confirm the MONAI bundle launches, writes a checkpoint, has finite training loss, avoids OOM, records a validation trajectory, and passes checkpoint-load inspection. It is plumbing evidence only and does not replace the Task06 Lung Tumor sanity run or convergence-quality evidence. The generated 872 MB checkpoint is referenced by path, size, and verifier facts but is not committed.
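Several anchors above bind large generated artifacts by bytes and hashes, or by path and size, instead of committing them. The sketch below shows one way such a record could be computed. It assumes SHA-256, which this README does not confirm, and both `artifact_fact` and its output keys are hypothetical.

```python
import hashlib
from pathlib import Path


def artifact_fact(path, chunk_size=1 << 20):
    """Return path, byte size, and SHA-256 hex digest for one file.

    The file is streamed in chunks so a large checkpoint or volume
    is never held in memory at once.
    """
    digest = hashlib.sha256()
    size = 0
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return {"path": str(path), "bytes": size, "sha256": digest.hexdigest()}
```

Recomputing the record on a regenerated artifact and comparing it with the stored one would show whether the bytes changed.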

evidence_packs/nv_segment_ctmr_trusted_pass/ is the NV-Segment-CTMR CT-body trust anchor. It pairs nv_segment_ctmr with ct_segmentation_quality_v1 on the shared spleen CT fixture to confirm the upstream MONAI bundle entrypoint ran on CUDA, loaded the pinned model inventory, preserved input geometry, emitted task-valid CT-body labels, and passed organ-volume, fragmentation, liver>spleen, and bilateral-kidney plausibility checks. The generated label-map NIfTI is referenced by path and verifier facts but is not committed. The CT verifier support is intentionally limited to CT_BODY; MRI_BODY and MRI_BRAIN need modality-specific verifier anchors.
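To make the relational checks concrete, here is a hedged sketch of what liver>spleen and bilateral-kidney tests might look like. It is not the verifier's implementation: the organ keys, the use of millilitres, and the reading of "bilateral" as "both kidneys have nonzero volume" are all assumptions made for illustration.

```python
def relational_plausibility(volumes_ml):
    """Evaluate two relational checks on a mapping of organ name -> volume in mL.

    Organ keys are hypothetical; a missing organ counts as volume 0.
    """
    liver = volumes_ml.get("liver", 0.0)
    spleen = volumes_ml.get("spleen", 0.0)
    left = volumes_ml.get("left_kidney", 0.0)
    right = volumes_ml.get("right_kidney", 0.0)
    return {
        # The liver is expected to be larger than the spleen.
        "liver_gt_spleen": liver > spleen,
        # Both kidneys must be segmented, not just one side.
        "bilateral_kidneys": left > 0.0 and right > 0.0,
    }
```

Checks like these look only at relationships between labels, which is why they can flag an implausible segmentation without a ground-truth mask.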

Verifier-only anchors close trust-layer lifecycle gaps without claiming that the corresponding heavy upstream skill has a trusted run. The ct_segmentation_finetune_quality_v1_pass/ pack audits a committed synthetic finetune fixture.

Flagship workflow A1: DICOM preflight gate (start here)

GPU-free trusted preflight for a DICOM folder:

make run-workflow \
  WORKFLOW=examples/workflows/dicom_preflight_gate.yaml \
  WORKFLOW_INPUT=skills/dicom-series-preflight/fixtures/clean_no_phi \
  WORKFLOW_OUT=runs/dicom_preflight_gate

| Fixture | Expected workflow `overall` |
| --- | --- |
| `fixtures/clean_no_phi` | `passed` |
| `fixtures/clean_axial` | `warn` (PHI tags populated) |
| `fixtures/flipped_lr` | `failed` (orientation gate) |
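The fixture table above can double as a regression check. The sketch below assumes that the preflight workflow writes `workflow_summary.json` under `WORKFLOW_OUT` and that the file has a top-level `overall` key; neither detail is confirmed in this section, and both helper names are hypothetical.

```python
import json
from pathlib import Path

# Expected values copied from the fixture table above.
EXPECTED_OVERALL = {
    "clean_no_phi": "passed",
    "clean_axial": "warn",
    "flipped_lr": "failed",
}


def load_summary(workflow_out):
    """Read WORKFLOW_OUT/workflow_summary.json into a dict (assumed location)."""
    return json.loads((Path(workflow_out) / "workflow_summary.json").read_text())


def overall_matches(summary, fixture_name):
    """Compare summary["overall"] (assumed key) with the table's expectation."""
    return summary.get("overall") == EXPECTED_OVERALL[fixture_name]
```

For the command shown earlier, `overall_matches(load_summary("runs/dicom_preflight_gate"), "clean_no_phi")` should be true if those assumptions hold.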

Canonical trusted-run anchor: evidence_packs/dicom_series_preflight_trusted_pass/ contains the same GPU-free clean-no-PHI path as a committed trusted run: skill_run/, verifiers/dicom_preflight_quality_v1/, and trust_summary.json.

The smaller single-file DICOM metadata anchor evidence_packs/dicom_metadata_trusted_warn/ demonstrates trusted-run warning semantics: the skill gates pass, dicom_metadata_quality_v1 accepts the pack, and the trust summary records standard PHI tag presence as a warning rather than treating it as de-identification proof.

The DICOM-to-volume trusted anchor evidence_packs/dicom_series_to_volume_trusted_pass/ contains the conversion pack, the emitted volume.nii.gz, verifiers/dicom_volume_quality_v1/, and a trust summary that checks the NIfTI artifact against reported geometry and voxel range evidence.

Flagship workflow 1: CT DICOM to segmentation evidence

Path:

DICOM series
  -> dicom_series_to_volume (metadata + geometry preflight, DICOM-to-NIfTI)
  -> nv_segment_ct (trusted)
  -> ct_segmentation_quality_v1
  -> workflow / trust summary

Workflow 1 chains dicom_series_to_volume with a trusted nv_segment_ct step that runs ct_segmentation_quality_v1 on the segmentation pack. Full spec: workflows/README.md.

Canonical direct CT segmentation trusted anchor: evidence_packs/nv_segment_ct_trusted_pass/ contains the VISTA3D wrapper pack, the ct_segmentation_quality_v1 verifier pack, and a trust summary with anatomy plausibility, label-set containment, and artifact-hash evidence for the local spleen fixture.

# Positive path (clean axial synthetic CT series)
make run-workflow \
  WORKFLOW=examples/workflows/ct_dicom_to_segmentation_evidence.yaml \
  WORKFLOW_INPUT=skills/dicom-series-to-volume/fixtures/clean_axial \
  WORKFLOW_OUT=runs/ct_dicom_seg_evidence

# Negative path (LR-flipped ImageOrientationPatient; halts at convert, so segment never runs)
make run-workflow \
  WORKFLOW=examples/workflows/ct_dicom_to_segmentation_evidence.yaml \
  WORKFLOW_INPUT=skills/dicom-series-to-volume/fixtures/flipped_lr \
  WORKFLOW_OUT=runs/ct_dicom_seg_flipped_fail

Inspect WORKFLOW_OUT/workflow_summary.json for per-step status and trust linkage; the segment step writes segment/trust_summary.json and segment/skill_run/ when the convert step passes.
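A quick way to tell the positive and negative paths apart is to check whether the segment step left its outputs behind. The sketch below relies only on the two paths named above; the helper name `segment_evidence` is hypothetical.

```python
from pathlib import Path


def segment_evidence(workflow_out):
    """Report whether the segment step left its trust summary and skill run."""
    segment = Path(workflow_out) / "segment"
    return {
        "trust_summary": (segment / "trust_summary.json").is_file(),
        "skill_run": (segment / "skill_run").is_dir(),
    }
```

On the flipped_lr run both values should be false, since the workflow halts at convert before the segment step runs.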

On the committed clean_axial synthetic series, segmentation often passes the skill gates while ct_segmentation_quality_v1 fails its anatomy-plausibility bounds, because the phantom volumes are tiny. That is expected engineering behavior, not a workflow bug: the workflow still produces conversion, segmentation, and verifier evidence packs in one run.

Contribution policy

External contributions are welcome when the example is curated evidence, not a dumped local run. Acceptable additions:

a small pass pack for a new or changed spec
a negative pack that proves a gate fails correctly
a drift example with a clear lesson
a compact study that connects several evidence packs

Generated work starts in runs/. Promote only the small, sanitized subset that should become a shared reading or regression artifact. Never commit patient data, large medical volumes, model weights, raw recordings, secrets, bulky provider logs, raw provider responses, per-repeat study JSON, detailed generated reports, or runtime environments.
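Before promoting a pack out of runs/, a size and suffix scan can catch the most obvious violations. The sketch below is illustrative only: the policy above names categories rather than numbers, so `MAX_BYTES` and `WEIGHT_SUFFIXES` are hypothetical choices, and the scan cannot detect patient data or secrets.

```python
from pathlib import Path

# Hypothetical limits; the policy names categories, not thresholds.
MAX_BYTES = 1_000_000
WEIGHT_SUFFIXES = (".pt", ".pth", ".ckpt", ".safetensors")


def promotion_blockers(pack_dir):
    """List (relative_path, reason) pairs for files that look unsafe to commit."""
    root = Path(pack_dir)
    blockers = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        if path.name.endswith(WEIGHT_SUFFIXES):
            blockers.append((rel, "model-weight suffix"))
        elif path.stat().st_size > MAX_BYTES:
            blockers.append((rel, "larger than MAX_BYTES"))
    return blockers
```

An empty result is a necessary check, not a sufficient one; the pack still needs a manual review against the list above.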

Verifier anti-patterns and negative fixtures live with the owning verifier under verifiers/<name>/fixtures/.
