docs: add proposed ADR-073 on scoring/reward language scope #681

Open
Brad-Edwards wants to merge 1 commit into dev from 671-scoring-reward-scope

@Brad-Edwards

Owner

Summary

Adds proposed ADR-073, which examines whether the OCR-inherited SDL scoring pipeline (metrics/evaluations/tlos/goals) and the CybORG agents.reward_calculator label belong in ACES. Also adds scoring-scope research notes and a SEM-206 assessment-semantics compatibility guardrail. Doc-only; the decision is deferred to review (status: proposed). The ADR recommends treating these surfaces as vestigial against the experiment-vs-data-use boundary already drawn by ADR-055/064/069, keeping objectives and conditions, narrowing objective success to observable state, and routing graded scoring to the experiment/evaluator plane. Closes #671.
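
For reviewers, here is a minimal Python sketch of the boundary the ADR recommends. It is illustrative only: `Objective`, `Evaluator`, `ObservableState`, the state keys, and the weights are all hypothetical names and values, and none of them come from ACES, the SDL schema, or CybORG. The scenario side reports only whether each objective's condition holds over observable state. Weights and graded scoring sit with the experiment-side evaluator.

```python
from dataclasses import dataclass
from typing import Callable, Mapping

# Hypothetical type for illustration; not an ACES or SDL name.
ObservableState = Mapping[str, object]


@dataclass(frozen=True)
class Objective:
    """Scenario side: success is a boolean predicate over observable state."""
    name: str
    condition: Callable[[ObservableState], bool]

    def succeeded(self, state: ObservableState) -> bool:
        return self.condition(state)


@dataclass(frozen=True)
class Evaluator:
    """Experiment side: owns the weights and the graded score."""
    weights: Mapping[str, float]

    def score(self, objectives: list[Objective], state: ObservableState) -> float:
        # Sum the weights of the objectives whose condition currently holds.
        return sum(
            self.weights.get(obj.name, 0.0)
            for obj in objectives
            if obj.succeeded(state)
        )


# Assumed example objectives and state (made up for this sketch).
objectives = [
    Objective("service_up", lambda s: s.get("web_service") == "running"),
    Objective("flag_intact", lambda s: s.get("flag_modified") is False),
]
state = {"web_service": "running", "flag_modified": True}
evaluator = Evaluator(weights={"service_up": 0.6, "flag_intact": 0.4})

results = {obj.name: obj.succeeded(state) for obj in objectives}
total = evaluator.score(objectives, state)
```

In this sketch the scenario definition never mentions weights or rewards, so two experiments could grade the same objectives differently without touching the scenario.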

Requirement UIDs

  • SEM-206

Related Issues

Closes #671

ADR Impact

  • ADR-073

Changes

  • docs/decisions/adrs/adr-073-scoring-reward-language-scope.md: proposed ADR with full-removal recommendation, migration path, and answers to the issue's four questions
  • docs/research/scoring-scope/: prior-art-and-design-criteria and scoring-surface-inventory research notes (registered in docs/index.md)
  • docs/explain/reference/assessment-semantics.md: SEM-206 compatibility guardrail (do not expand SDL scoring/reward language while the boundary is open)
  • docs/decisions/adrs/README.md: ADR-073 toctree + table row
  • changelog.d/671.added.md: changelog fragment

Test Plan

  • Documentation-only change. nox -s verify passes locally (Sphinx docs build, ADR acceptance-content pin, schema publication manifest + generated-schema drift, ruff, full pytest + integration suite).
  • Pre-push codex review and test-quality review: clean.

Ground Control Checks

  • GRC screening: not_security_relevant (docs-only; 0 gaps/impacts/stale).
  • nox -s verify green.
  • Pre-push codex + test-quality reviews clean; findings recorded on the issue thread.

Traceability

  • IMPLEMENTS: (none — proposed design ADR; no implementation surface)
  • TESTS: (none — documentation-only)
  • DOCUMENTS: ADR-073 → SEM-206 (Assessment Semantics)

Checklist

  • Proposed ADR is not pinned in adr-index.yaml (correct for proposed status).
  • Research notes registered in docs/index.md toctree.
  • Decision deferred to review (status: proposed); no schema/model/scenario edits in this PR.

Examine whether the OCR-inherited SDL scoring pipeline
(metrics/evaluations/tlos/goals) and the CybORG agents.reward_calculator
label belong in ACES, against the experiment-vs-data-use boundary already
drawn by ADR-055/064/069. Recommend (proposed) treating these surfaces as
vestigial in the SDL, keeping objectives+conditions, narrowing objective
success to observable state, and routing graded scoring to the
experiment/evaluator plane. Add scoring-scope research notes
(docs/research/scoring-scope/) and a SEM-206 assessment-semantics
compatibility guardrail. Decision deferred to review.

Refs #671. ADR-073. SEM-206.
@sonarqubecloud

sonarqubecloud Bot commented Jul 5, 2026

