Skip to content

feat: add model capabilities catalog#7

Merged
matdev83 merged 3 commits into
mainfrom
feature/model-capabilities-catalog
Apr 27, 2026
Merged

feat: add model capabilities catalog#7
matdev83 merged 3 commits into
mainfrom
feature/model-capabilities-catalog

Conversation

@matdev83

@matdev83 matdev83 commented Apr 27, 2026

Copy link
Copy Markdown
Owner

Summary

  • Add the model capabilities catalog pipeline across config, runtimebundle, core modelcatalog, and stdhttp diagnostics wiring.
  • Wire catalog facts into routing/eligibility and expose protected operator diagnostics for catalog status.
  • Add tests covering runtime wiring, diagnostics, and architecture boundaries.

Validation

  • go test ./...
  • go run ./cmd/lipstd --help

Summary by CodeRabbit

Release Notes

  • New Features

    • Added model catalog support with local snapshots, capability overrides, and backend-specific model configuration
    • Added SQL query caching for secure session database operations to reduce query overhead
    • Added context-limit eligibility checks during request routing to enforce model-specific constraints
    • Added catalog diagnostics endpoint for real-time monitoring of snapshot refresh status
  • Documentation

    • Expanded capability catalogs documentation to cover models.dev snapshot configuration and behavior

@coderabbitai

coderabbitai Bot commented Apr 27, 2026

Copy link
Copy Markdown

Warning

Rate limit exceeded

@matdev83 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 31 minutes and 30 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6d8300be-6515-40a9-98f7-712f36d5a5e0

📥 Commits

Reviewing files that changed from the base of the PR and between 47b6965 and 638de88.

📒 Files selected for processing (12)
  • internal/core/config/effective_secure_session.go
  • internal/core/config/model_catalog_test.go
  • internal/core/config/validate.go
  • internal/core/modelcatalog/diagnostics.go
  • internal/core/modelcatalog/diagnostics_test.go
  • internal/core/modelcatalog/eligibility.go
  • internal/core/modelcatalog/matcher.go
  • internal/core/modelcatalog/matcher_test.go
  • internal/core/securesession/adapters/bunstore/store_test.go
  • internal/infra/modelcatalog/modelsdev/normalize.go
  • internal/infra/runtimebundle/build.go
  • internal/plugins/stores/.gitkeep
📝 Walkthrough

Walkthrough

This pull request introduces a comprehensive model catalog feature enabling local snapshot-based model capability facts and context-limit enforcement. It adds configuration structures, core catalog logic for fact resolution and eligibility checking, models.dev catalog ingestion (parsing, HTTP fetch, file caching), integration with the executor for routing decisions, SQL query TTL caching for secure sessions, diagnostics endpoints, and extensive test coverage.

Changes

Cohort / File(s) Summary
Configuration & Design
.kiro/specs/model-capabilities-catalog/design.md, .kiro/specs/model-capabilities-catalog/spec.json, .kiro/specs/model-capabilities-catalog/tasks.md, config/config.yaml
Design documentation, feature readiness flag, implementation task checklist updates, and commented example configuration options for catalog snapshot and SQL caching features.
Core Model Catalog
internal/core/modelcatalog/doc.go, internal/core/modelcatalog/facts.go, internal/core/modelcatalog/matcher.go, internal/core/modelcatalog/overrides.go, internal/core/modelcatalog/eligibility.go, internal/core/modelcatalog/estimate.go, internal/core/modelcatalog/resolver.go, internal/core/modelcatalog/ports.go, internal/core/modelcatalog/runtime.go, internal/core/modelcatalog/*_test.go
New package defining capability facts, tri-state enums, matching logic, override resolution, eligibility decisions with context-limit checks, byte-size estimation, unified resolver combining catalog/overrides/backend caps, snapshot ports, runtime lifecycle with background refresh, and comprehensive unit tests.
Models.dev Integration
internal/infra/modelcatalog/modelsdev/doc.go, internal/infra/modelcatalog/modelsdev/schema.go, internal/infra/modelcatalog/modelsdev/normalize.go, internal/infra/modelcatalog/modelsdev/source.go, internal/infra/modelcatalog/modelsdev/store.go, internal/infra/modelcatalog/modelsdev/*_test.go, Makefile
JSON parsing schema, snapshot normalization with capability/limit derivation, HTTP-backed snapshot fetching with size limits and timeout defense-in-depth, disk-backed file cache with atomic writes and hash validation, and fuzz test harness.
Executor Integration
internal/core/runtime/catalog_ports.go, internal/core/runtime/executor.go, internal/core/runtime/executor_catalog.go, internal/core/runtime/executor_open_attempt.go, internal/core/runtime/attempt_stream.go, internal/core/runtime/executor_route_obs.go, internal/core/runtime/executor_route_trace_catalog.go, internal/core/runtime/executor_modelcatalog_*.go
New resolver ports, executor fields for catalog and eligibility resolvers, per-attempt effective facts computation, context-limit exhaustion tracking, route-trace catalog metadata population, and comprehensive integration tests across negotiation/downgrade/e2e scenarios.
Diagnostics & Observability
internal/core/diag/route_trace.go, internal/core/diag/route_trace_test.go, internal/core/modelcatalog/diagnostics.go, internal/core/modelcatalog/diagnostics_test.go, internal/stdhttp/catalog_status_handler.go, internal/stdhttp/catalog_status_handler_test.go, internal/stdhttp/modelcatalog_diag_test.go
Route trace catalog metadata structure, diagnostics status enums and JSON payload builders with staleness evaluation, HTTP handler for /catalog-status with method-only GET validation, and endpoint tests with shared-secret protection.
Secure Session SQL Caching
internal/core/config/effective_secure_session.go, internal/core/securesession/adapters/bunstore/options.go, internal/core/securesession/adapters/bunstore/meta_cache.go, internal/core/securesession/adapters/bunstore/store.go, internal/core/securesession/adapters/sqlite/options.go, internal/core/securesession/adapters/sqlite/meta_cache.go, internal/core/securesession/adapters/sqlite/store.go, internal/core/securesession/adapters/sqlite/store_cache_test.go, internal/core/securesession/adapters/bunstore/store_test.go, internal/core/securesession/storecontract/*_test.go
TTL-cache options and metadata caching for both postgres (bunstore) and sqlite adapters, invalidation on row-misses and FK constraint errors, and extensive cache-behavior tests validating staleness windows.
Configuration Loading & Validation
internal/core/config/model.go, internal/core/config/model_catalog.go, internal/core/config/model_catalog_duration.go, internal/core/config/loader.go, internal/core/config/validate.go, internal/core/config/model_catalog_duration_test.go, internal/core/config/model_catalog_test.go, internal/core/config/sql_query_cache_test.go, internal/core/config/loader_test.go
Config struct fields for catalog and SQL caching, override entry structs with optional capability/token-limit pointers, duration parsing helpers with positive-value enforcement, loader defaulting of empty slices, and comprehensive validation tests covering cache-path requirements, URL/duration/token-limit checks, and diagnostics path conflicts.
Architecture & Boundaries
internal/archtest/modelcatalog_boundaries_test.go, testdata/architecture/hexagonal_migration_baseline.json
New boundary tests asserting core/infra separation for model catalog and SDK imports, updated migration baseline allowing internal/core/modelcatalog imports in adapter packages.
Runtime Bundle & Server
internal/infra/runtimebundle/build.go, internal/infra/runtimebundle/built.go, internal/infra/runtimebundle/modelcatalog_attach.go, internal/infra/runtimebundle/modelcatalog_override_set.go, internal/infra/runtimebundle/modelcatalog_refresh_loop.go, internal/infra/runtimebundle/secure_session.go, internal/infra/runtimebundle/build_modelcatalog_test.go, internal/infra/runtimebundle/modelcatalog_override_set_test.go, internal/stdhttp/server.go
Catalog runtime attachment with background refresh loop, override set builder from config, SQL cache option wiring to secure-session adapters, executor resolver wiring, catalog status diagnostics mounting, and integration tests.
Error Handling & Frontend Response
pkg/lipapi/errors.go, pkg/lipapi/errors_test.go, internal/plugins/frontends/execerr/execerr.go, internal/plugins/frontends/execerr/execerr_test.go, internal/plugins/frontends/openailegacy/handler.go, internal/plugins/frontends/openairesponses/handler.go
New ErrAllCandidatesContextLimitExceeded sentinel for context-limit-excluded candidates, HTTP 413 classification in error handlers, and dynamic OpenAI error code omission for 413 responses.
Testing & Tooling
internal/archtest/contract_lipapi_auth_imports_test.go, internal/core/continuity/bunstore/store_test.go, internal/core/continuity/sqlitestore/store_test.go, internal/core/continuity/store_test.go, internal/core/accessmode/validate_test.go, internal/infra/authevent/redaction_test.go, internal/infra/db/db_test.go, internal/infra/runtimebundle/auth_compose_test.go, internal/stdhttp/auth/frontend_bundle_auth_test.go, internal/stdhttp/outer_recovery_test.go, internal/stdhttp/security_guard_test.go, scripts/check-adhoc-goroutines.sh, scripts/check-adhoc-goroutines.ps1, internal/testkit/executor_stub.go, internal/testkit/modelcatalog_snapshot.go, pkg/lipsdk/auth/events_test.go, pkg/lipsdk/auth/inbound_test.go
Linter suppressions, loop variable capture fixes, loop simplifications with for range N, t.Parallel() additions for concurrency, type-safe assertions replacing unchecked casts, test helper markers, goroutine allowlist updates, and reflect API modernizations.
Documentation
docs/capability-catalogs.md, go.mod
Expanded catalog documentation covering snapshot mechanism, refresh semantics, override precedence, diagnostics exposure, and trust boundaries; added github.com/jellydator/ttlcache/v3 dependency.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 Hop-hop, a catalog of hops!
Models bop through facts that never stop,
Contexts counted, limits tight,
Overrides dance left and right,
Snapshots refreshed when sources align—
The greatest catalog feature, oh so fine! 🥕

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/model-capabilities-catalog

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/stdhttp/security_guard_test.go (1)

17-44: ⚠️ Potential issue | 🔴 Critical

Race condition: parallel tests share the runningAsAdmin global.

All three tests now run with t.Parallel() while each one swaps the package-level runningAsAdmin via withRunningAsAdmin. Concurrent execution will interleave the writes/reads of this shared global, producing flaky outcomes (e.g. rejectsAdminUser may observe the false stub installed by another test) and tripping -race. Either drop t.Parallel() for these three, or refactor validateStartupSecurity to take the admin-check as a parameter/dependency so each test has its own isolated stub.

🔧 Minimal fix — keep them serial
 func TestValidateStartupSecurity_rejectsAdminUser(t *testing.T) {
-	t.Parallel()
 	withRunningAsAdmin(t, func() (bool, error) { return true, nil })
 ...
 func TestValidateStartupSecurity_rejectsNoAuthNonLoopback(t *testing.T) {
-	t.Parallel()
 	withRunningAsAdmin(t, func() (bool, error) { return false, nil })
 ...
 func TestValidateStartupSecurity_allowsLoopbackNonAdmin(t *testing.T) {
-	t.Parallel()
 	withRunningAsAdmin(t, func() (bool, error) { return false, nil })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/stdhttp/security_guard_test.go` around lines 17 - 44, These tests
race because they share the package-level runningAsAdmin stub toggled by
withRunningAsAdmin while each test calls t.Parallel(); remove t.Parallel() from
TestValidateStartupSecurity_rejectsAdminUser,
TestValidateStartupSecurity_rejectsNoAuthNonLoopback, and
TestValidateStartupSecurity_allowsLoopbackNonAdmin so they run serially, or
alternatively refactor validateStartupSecurity to accept an injected admin-check
function (instead of using the runningAsAdmin global) and update the tests to
pass isolated stubs via that new parameter; ensure references to runningAsAdmin
and withRunningAsAdmin are updated/removed if you choose the injection approach.
🧹 Nitpick comments (37)
internal/core/config/model_catalog_duration.go (1)

10-33: Optional: extract shared positive-duration parser.

The two methods are byte-for-byte identical apart from the field accessed. A tiny private helper would remove the duplication and make future additions (e.g., a RetryBackoff duration field) trivial:

♻️ Proposed refactor
 func (mc ModelCatalogConfig) UpdateIntervalDuration() (d time.Duration, ok bool) {
-	s := strings.TrimSpace(mc.UpdateInterval)
-	if s == "" {
-		return 0, false
-	}
-	d, err := time.ParseDuration(s)
-	if err != nil || d <= 0 {
-		return 0, false
-	}
-	return d, true
+	return parsePositiveDuration(mc.UpdateInterval)
 }

 // FetchTimeoutDuration returns a positive parsed model_catalog.fetch_timeout duration.
 func (mc ModelCatalogConfig) FetchTimeoutDuration() (d time.Duration, ok bool) {
-	s := strings.TrimSpace(mc.FetchTimeout)
-	if s == "" {
-		return 0, false
-	}
-	d, err := time.ParseDuration(s)
-	if err != nil || d <= 0 {
-		return 0, false
-	}
-	return d, true
+	return parsePositiveDuration(mc.FetchTimeout)
+}
+
+func parsePositiveDuration(raw string) (time.Duration, bool) {
+	s := strings.TrimSpace(raw)
+	if s == "" {
+		return 0, false
+	}
+	d, err := time.ParseDuration(s)
+	if err != nil || d <= 0 {
+		return 0, false
+	}
+	return d, true
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/config/model_catalog_duration.go` around lines 10 - 33, Extract
a small private helper like parsePositiveDuration(s string) (time.Duration,
bool) and have both ModelCatalogConfig.UpdateIntervalDuration and
ModelCatalogConfig.FetchTimeoutDuration call it: trim the input string, return
false on empty, parse with time.ParseDuration, and ensure d > 0; this removes
duplicated logic and makes adding future fields (e.g., RetryBackoff)
straightforward—update UpdateIntervalDuration and FetchTimeoutDuration to simply
pass mc.UpdateInterval and mc.FetchTimeout (after trimming if you prefer) into
the new helper and return its results.
internal/core/modelcatalog/matcher.go (1)

54-60: Optional: use slices.Clone for the ambiguous candidates copy.

Since slices is already imported and used for slices.Sort, the manual append([]string(nil), ids...) could be slices.Clone(ids) for slightly clearer intent. Behaviorally identical.

♻️ Proposed nit
 	default:
-		out := append([]string(nil), ids...)
+		out := slices.Clone(ids)
 		return MatchResult{
 			Kind:       MatchAmbiguous,
 			InputModel: input,
 			Candidates: out,
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/modelcatalog/matcher.go` around lines 54 - 60, Replace the
manual copy append([]string(nil), ids...) with slices.Clone(ids) to make the
intent clearer; update the default branch in the matcher (the block that
constructs MatchResult with Kind MatchAmbiguous, InputModel input, and
Candidates out) to set Candidates to slices.Clone(ids) instead of creating out
via append — slices is already imported and used elsewhere (e.g., slices.Sort)
so this is a behavior-preserving clarity improvement.
internal/core/modelcatalog/facts.go (1)

66-72: Nit: missing doc comments on exported EligibilityReason constants.

Other exported constants in this file (CapabilityUnknown, LimitUnknown, FactSource*, Match*) carry godoc, but the three Eligibility* constants do not. For consistency with the rest of the file (and to satisfy revive/staticcheck exported-comment rules), add brief doc comments.

📝 Proposed comments
 const (
+	// EligibilityUnknown is the zero value used before an eligibility decision is computed.
 	EligibilityUnknown EligibilityReason = iota
+	// EligibilityEligible means the candidate may proceed to backend open.
 	EligibilityEligible
+	// EligibilityContextLimitExceeded means the request size estimate exceeds a known catalog/override context limit.
 	EligibilityContextLimitExceeded
 )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/modelcatalog/facts.go` around lines 66 - 72, Add godoc comments
for the exported EligibilityReason type and its exported constants to match the
rest of the file; specifically add brief one-line comments above
EligibilityReason and each constant: EligibilityUnknown, EligibilityEligible,
and EligibilityContextLimitExceeded, describing what the type represents and
what each constant means (e.g., "EligibilityUnknown is the zero value for
EligibilityReason", "EligibilityEligible indicates the subject is eligible",
"EligibilityContextLimitExceeded indicates eligibility failed due to context
limits"). Ensure comments use complete sentences and appear immediately above
the type and each constant declaration.
internal/archtest/modelcatalog_boundaries_test.go (1)

36-46: Consider extending forbidden SDK list for infra modelcatalog closure for consistency.

TestCoreModelCatalogImportClosure forbids OpenAI, Anthropic, and Google GenAI SDKs, but TestInfraModelCatalogImportClosure only forbids the OpenAI SDK. Since internal/infra/modelcatalog/modelsdev is a JSON/HTTP adapter for models.dev metadata (no provider invocation), it has no legitimate need for any provider SDK. Adding Anthropic and Google GenAI here would prevent accidental coupling and keep the boundary symmetric.

♻️ Proposed addition
 		{
 			Substr: "github.com/openai/openai-go",
 			ErrMsg: "internal/infra/modelcatalog must not depend on OpenAI SDK",
 		},
+		{
+			Substr: "github.com/anthropics/anthropic-sdk-go",
+			ErrMsg: "internal/infra/modelcatalog must not depend on Anthropic SDK",
+		},
+		{
+			Substr: "google.golang.org/genai",
+			ErrMsg: "internal/infra/modelcatalog must not depend on Google GenAI SDK",
+		},
 	})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/archtest/modelcatalog_boundaries_test.go` around lines 36 - 46,
Update the forbidden dependency list in TestInfraModelCatalogImportClosure (the
call to assertDepsExcludeForbidden that currently blocks
"github.com/openai/openai-go") to also include Anthropic and Google GenAI SDKs;
add forbiddenDep entries with Substr values matching the Anthropic SDK package
(e.g., "github.com/anthropic-ai/..." or the repo used in the codebase) and the
Google GenAI SDK package (e.g., "cloud.google.com/go/genai" or the repo used)
and appropriate ErrMsg strings so the test symmetrically blocks Anthropic and
Google GenAI imports just like TestCoreModelCatalogImportClosure does.
internal/plugins/frontends/openailegacy/handler.go (1)

113-118: Consider discriminating by error reason rather than HTTP status.

Using out.Status == 413 to decide whether to blank the OpenAI code works for the current "all candidates context limit exceeded" sentinel, but it couples wire-shape selection to the transport status. If another KindClientReject cause is later mapped to 413 (e.g., a different size-limit sentinel), it will silently inherit code: "", which may not be intended. Threading a semantic field on the execerr.Out (e.g., Reason / WireCode) and switching on it here would keep this branch resilient.

Not blocking for this PR.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/plugins/frontends/openailegacy/handler.go` around lines 113 - 118,
The current branch that handles execerr.KindClientReject in handler.go
determines the OpenAI `code` by checking out.Status ==
http.StatusRequestEntityTooLarge, which couples behavior to transport status;
instead add/consume a semantic field (e.g., Reason or WireCode) on the
execerr.Out type and switch on that value here (in the KindClientReject case
within the handler where h.logWriteJSONErr and WriteErrorJSON are invoked),
making the decision to set code = "" only when Reason ==
"context_length_exceeded" (or whatever sentinel you define); update the
execerr.Out model to include Reason/WireCode and ensure callers that construct
execerr.Out populate it so this branch uses out.Reason rather than out.Status.
internal/core/modelcatalog/facts_test.go (1)

9-108: Tests primarily validate field round-trips, not behavior.

These tests assign constants to ModelFacts/LimitFact fields and assert they read back to the same constants. They will pass for any non-trivial enum/struct definition and don't exercise comparators, predicates, or downstream consumers (e.g., eligibility checks, source-priority resolution). Consider replacing with tests over the actual behavior these constants drive (e.g., EffectiveFacts, override precedence, eligibility decisions) — override_resolver_test.go is a good template. The single useful invariant here is the LimitUnknown != LimitUnsupported distinction asserted in TestLimitFact_unknownVsUnsupported.

As per coding guidelines: "Tests are behavior contracts, not implementation snapshots; write failing tests first (TDD)".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/modelcatalog/facts_test.go` around lines 9 - 108, The current
TestModelFacts_* functions only assert field round-trips; replace them with
behavior-driven tests that exercise the logic these constants affect (e.g., call
the functions that compute effective facts, override precedence and eligibility)
rather than re-asserting setters/getters; keep the meaningful
TestLimitFact_unknownVsUnsupported check. Specifically, delete or convert
TestModelFacts_catalogExactMatch, TestModelFacts_modelOverride,
TestModelFacts_backendModelPairOverride, TestModelFacts_backendDeclaration,
TestModelFacts_noCatalogMatch, and TestModelFacts_ambiguousCatalogMatch into
tests that invoke the real resolution paths (e.g., EffectiveFacts, override
resolution/precedence, and eligibility checks) and assert expected outcomes
using override_resolver_test.go as a template, and ensure
TestLimitFact_unknownVsUnsupported remains to validate LimitUnknown !=
LimitUnsupported.
internal/core/modelcatalog/override_resolver_test.go (1)

10-99: Coverage gaps: ContextLimit overrides and key normalization edge cases.

The summary mentions "token limit overrides" but no test asserts a LimitFact (e.g., LimitPresent/LimitUnsupported) survives via the resolver. Also missing: case sensitivity of the (backend, model) and model keys (e.g., OpenAI-Responses vs openai-responses, GPT-4o vs gpt-4o), and models whose names contain : (which could collide with the pair key delimiter if the resolver uses "backend:model" concatenation). Adding these would harden the resolver against silent miss/collision regressions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/modelcatalog/override_resolver_test.go` around lines 10 - 99,
Add tests to cover ContextLimit (token limit) overrides and
key-normalization/collision edge cases: create OverrideSet entries (Model and
Pair) that include ModelFacts.Limit (e.g., LimitPresent or LimitUnsupported) and
assert the resolver (NewOverrideResolver/Resolve) returns those Limit values;
add cases where candidate backend/model casing differs (e.g., "OpenAI-Responses"
vs "openai-responses", "GPT-4o" vs "gpt-4o") to verify case-insensitive
matching, and a case where a model name contains ':' (e.g., "custom:withcolon")
to ensure Pair key concatenation like "backend:model" does not collide — for
each, construct AttemptCandidate with routing.Primary and assert Resolve hits
the expected override and preserves Limit and Source fields from ModelFacts.
internal/core/modelcatalog/backend_caps_clone_test.go (1)

10-23: Optional: tighten the clone independence assertion.

Current test proves the maps are distinct (delete on clone leaves orig intact). Two minor additions would harden the contract:

  • Assert len(cp) == 1 before delete and len(cp) == 0 after, to make the intent unambiguous.
  • Add the reverse direction (mutate orig, verify cp unchanged).

Not required, but cheap insurance against future regressions if the helper ever switches away from maps.Clone.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/modelcatalog/backend_caps_clone_test.go` around lines 10 - 23,
Tighten the clone independence assertions in TestCloneBackendCaps_nilAndCopy:
after calling modelcatalog.CloneBackendCaps(orig) assert len(cp) == 1 before
deleting the key and assert len(cp) == 0 after the delete to make intent
explicit, then perform the reverse mutation by deleting lipapi.CapabilityTools
from orig (or otherwise mutating orig) and assert cp still contains
lipapi.CapabilityTools (and its length remains as expected) to verify changes to
orig do not affect the clone; reference the test function
TestCloneBackendCaps_nilAndCopy and variables orig, cp, and
modelcatalog.CloneBackendCaps.
internal/core/modelcatalog/estimate_addsat_test.go (1)

8-19: LGTM — covers the documented saturating semantics.

Optional: an additional case for a negative b operand (e.g., addSaturatingInt64(10, -5)) would lock in symmetry and prevent silent regressions if the helper ever changes to clamp b differently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/modelcatalog/estimate_addsat_test.go` around lines 8 - 19, Add
a test case to TestAddSaturatingInt64_clampsAtMaxInt64 to cover a negative
second operand: call addSaturatingInt64(10, -5) and assert the result equals 5
to ensure the function treats negative b symmetrically to negative a (i.e.,
clamps only on overflow/underflow and not by treating b as zero); update the
test in internal/core/modelcatalog/estimate_addsat_test.go inside the existing
TestAddSaturatingInt64_clampsAtMaxInt64 function to include this assertion using
the same t.Fatalf style for failures.
internal/stdhttp/server.go (1)

330-330: Place context.Context first in the parameter list.

Per Go convention (and go vet's contextcheck-style linters), context.Context is conventionally the first non-receiver parameter. Here logCtx is last.

Proposed signature change
-func mountModelCatalogDiagnostics(mux *http.ServeMux, cfg *config.Config, log *slog.Logger, built *runtimebundle.Built, logCtx context.Context) {
+func mountModelCatalogDiagnostics(ctx context.Context, mux *http.ServeMux, cfg *config.Config, log *slog.Logger, built *runtimebundle.Built) {

And update the call site:

-	mountModelCatalogDiagnostics(mux, cfg, log, built, logCtx)
+	mountModelCatalogDiagnostics(logCtx, mux, cfg, log, built)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/stdhttp/server.go` at line 330, Change the function signature of
mountModelCatalogDiagnostics to accept context.Context as the first non-receiver
parameter (move logCtx to be the first parameter) and update all call sites
accordingly; specifically, modify mountModelCatalogDiagnostics(...) so the
context parameter appears before cfg, log, and built, and adjust any callers to
pass the existing logCtx as the first argument to match the new signature.
internal/infra/modelcatalog/modelsdev/fuzz_test.go (1)

8-12: Consider adding seed inputs to guide fuzzing.

f.Fuzz without any f.Add seeds (or a testdata/fuzz/FuzzParseSnapshot/ corpus) starts from the empty input only, which is slow to discover the JSON structure that exercises ParseSnapshot's normalization paths. Adding a couple of representative models.dev fixtures via f.Add(...) significantly improves coverage in short fuzz runs.

Example seed
 func FuzzParseSnapshot(f *testing.F) {
+	f.Add([]byte(`{"providers":{"openai":{"models":{"gpt-4o":{"context":128000}}}}}`))
+	f.Add([]byte(`{}`))
 	f.Fuzz(func(t *testing.T, b []byte) {
 		_, _ = ParseSnapshot(b, time.Unix(0, 0).UTC())
 	})
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/infra/modelcatalog/modelsdev/fuzz_test.go` around lines 8 - 12, The
fuzz test FuzzParseSnapshot currently provides no seed inputs so it only starts
with an empty corpus; add representative seed byte slices via f.Add(...) inside
FuzzParseSnapshot (targeting the ParseSnapshot function) to guide the fuzzer
toward valid JSON/fixture shapes (e.g., a minimal valid models.dev snapshot and
one or two normalized variants used in other tests or fixtures), or populate
testdata/fuzz/FuzzParseSnapshot/ with real snapshot files and call f.Add with
their contents so fuzzing exercises ParseSnapshot's normalization paths more
quickly.
internal/core/runtime/executor_catalog.go (1)

13-24: Reorder nil check on e for clarity.

base := e.backendCapsForAttempt(...) runs before the e == nil guard at line 20. It happens to be safe because backendCapsForAttempt does its own e != nil check, but the apparent ordering invites a future "this could NPE" bug if either method's invariants change. Either drop the now-redundant e == nil here or hoist the guard above the call.

♻️ Proposed reorder
 func (e *Executor) effectiveFactsForAttempt(
 	ctx context.Context,
 	be execbackend.Backend,
 	attempt lipapi.Call,
 	c routing.AttemptCandidate,
 ) modelcatalog.EffectiveFacts {
-	base := e.backendCapsForAttempt(ctx, be, attempt, c)
-	if e == nil || e.CatalogResolver == nil {
+	if e == nil {
+		return syntheticBackendOnlyFacts(execbackend.EffectiveCaps(ctx, be, attempt, c), c)
+	}
+	base := e.backendCapsForAttempt(ctx, be, attempt, c)
+	if e.CatalogResolver == nil {
 		return syntheticBackendOnlyFacts(base, c)
 	}
 	return e.CatalogResolver.Resolve(ctx, c, attempt, base)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/runtime/executor_catalog.go` around lines 13 - 24, The function
effectiveFactsForAttempt calls e.backendCapsForAttempt before checking e == nil
which is fragile; move the nil guard (e == nil || e.CatalogResolver == nil) to
the top of effectiveFactsForAttempt so you return syntheticBackendOnlyFacts(...)
immediately when e or e.CatalogResolver is nil, and only then call
e.backendCapsForAttempt to compute base and finally call
e.CatalogResolver.Resolve(ctx, c, attempt, base); reference symbols:
effectiveFactsForAttempt, backendCapsForAttempt, CatalogResolver,
syntheticBackendOnlyFacts, Resolve.
internal/stdhttp/catalog_status_handler.go (2)

26-33: Minor: encode errors after status 200 produce a truncated body that the client can't distinguish from success.

Once enc.Encode starts writing, the 200 status is already flushed, so a mid-stream encode error leaves the client with a partial JSON document and only a server-side log entry. For a small diagnostics payload, encoding to a buffer first (then w.Write) lets you fall back to a 5xx on encode failure. Optional — current behavior matches the common stdlib pattern and BuildCatalogDiagnosticsJSON is unlikely to fail to marshal.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/stdhttp/catalog_status_handler.go` around lines 26 - 33, The handler
currently streams JSON via enc.Encode(v) which can flush a 200 before an encode
error; instead, marshal/encode the diagnostics into a temporary buffer first
(e.g., use json.Marshal(v) or encode into a bytes.Buffer), check for errors, and
only if successful set the Content-Type and write the buffer to w; on
marshal/encode error log via log.ErrorContext(r.Context(), "modelcatalog:
catalog status encode", "err", err) and return a 5xx (e.g., http.Error(w,
"internal server error", http.StatusInternalServerError)) so the client does not
receive a truncated 200 response — locate code around
BuildCatalogDiagnosticsJSON(cfg) and enc.Encode to implement this change.

18-25: Redundant HEAD branch and missing Allow header on 405.

The inner if r.Method == http.MethodHead { ... } produces a response identical to the fallthrough, so it's dead control flow. Separately, RFC 7231 §6.5.5 requires a 405 response to include an Allow header listing supported methods; http.Error does not set one for you.

♻️ Proposed simplification
 	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
 		if r.Method != http.MethodGet {
-			if r.Method == http.MethodHead {
-				http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
-				return
-			}
-			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
+			w.Header().Set("Allow", "GET")
+			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
 			return
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/stdhttp/catalog_status_handler.go` around lines 18 - 25, The branch
checking r.Method == http.MethodHead is redundant because it returns the same
http.Error as the fallthrough; remove the inner HEAD branch and simplify the
method check to reject non-GET methods once. Also, when returning
http.StatusMethodNotAllowed, set the "Allow" response header to "GET" before
calling http.Error so the 405 response complies with RFC 7231; update the
method-checking code that uses r.Method, http.MethodGet, http.MethodHead,
http.Error and http.StatusMethodNotAllowed accordingly.
internal/infra/modelcatalog/modelsdev/source_test.go (1)

96-109: Optional: assert via a typed sentinel rather than substring.

strings.Contains(err.Error(), "exceeds") will silently regress if the size-cap error message ever gets reworded. If modelsdev exposes (or can expose) a typed sentinel like ErrResponseTooLarge, prefer errors.Is(err, modelsdev.ErrResponseTooLarge) here for the same reason the disabled path on Line 141 already does. Minor.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/infra/modelcatalog/modelsdev/source_test.go` around lines 96 - 109,
The test TestHTTPSource_bodyTooLarge currently checks the oversized-response
error by substring on err.Error(); instead, change the assertion to use a typed
sentinel via errors.Is(err, modelsdev.ErrResponseTooLarge) so the test remains
robust to text changes—locate the failing assertion in
TestHTTPSource_bodyTooLarge and replace the strings.Contains check with
errors.Is against modelsdev.ErrResponseTooLarge (import "errors" in the test if
needed) and adjust the t.Fatalf message accordingly.
internal/infra/runtimebundle/secure_session.go (1)

167-172: Optional: factor the cache-options setup into a tiny helper.

Both branches duplicate the same 4-line pattern for translating EffectiveSecureSessionSQLQueryCache outputs into adapter Options. The two Options types are distinct, so the cleanest dedup is a small generic helper or a typed struct of (ttl, max) consumed at the call site. Not blocking.

♻️ Sketch
type sqlCacheOpts struct {
    TTL time.Duration
    Max int
}

func effectiveSecureSessionSQLCache(ss config.SecureSession) (sqlCacheOpts, bool) {
    ttl, maxE, ok := config.EffectiveSecureSessionSQLQueryCache(ss)
    if !ok {
        return sqlCacheOpts{}, false
    }
    return sqlCacheOpts{TTL: ttl, Max: int(maxE)}, true
}

Then each branch becomes a 3-line apply.

Also applies to: 229-233

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/infra/runtimebundle/secure_session.go` around lines 167 - 172, The
code duplicates the 4-line translation of
config.EffectiveSecureSessionSQLQueryCache into adapter options in two places;
add a tiny helper type (e.g., sqlCacheOpts with fields TTL time.Duration and Max
int) and a helper function (e.g., effectiveSecureSessionSQLCache(ss
config.SecureSession) (sqlCacheOpts, bool)) that calls
config.EffectiveSecureSessionSQLQueryCache and converts maxE to int, then use
that helper to apply TTL and Max to
sqlite.Options.SQLQueryCacheTTL/SQLQueryCacheMaxEntries (and the other adapter's
option fields) before calling sqlite.OpenContextWithOptions, removing the
duplicated lines in both branches.
internal/core/modelcatalog/diagnostics_test.go (1)

53-54: Bypassing the runtime constructor couples this test to internal layout.

var rt CatalogRuntime; rt.active.Store(&s) skips NewCatalogRuntime / Start, unlike the sibling TestBuildCatalogDiagnosticsJSON_unavailable. If CatalogRuntime later gains required initialization in the constructor (channels, schedulers, additional atomics), this test will silently misrepresent reality or panic. A small package-private helper that publishes a snapshot via the normal lifecycle would keep the test resilient to internal changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/modelcatalog/diagnostics_test.go` around lines 53 - 54, The
test currently bypasses the CatalogRuntime constructor by doing "var rt
CatalogRuntime; rt.active.Store(&s)" which couples the test to internal layout;
instead create and use a small package-private helper that constructs a
CatalogRuntime via NewCatalogRuntime (and calls Start if required) and publishes
the snapshot through the runtime's normal lifecycle, then replace the direct
store call in this test with that helper so the test uses
CatalogRuntime/NewCatalogRuntime/Start and remains resilient to future internal
initialization changes.
internal/infra/modelcatalog/modelsdev/normalize_test.go (1)

149-157: Subtest named inf is not actually infinity.

JSON has no infinity literal, and 1e308 is a finite float64. The two cases here both exercise "finite double that overflows int64". Either rename inf to something like near_max_double for clarity, or — if you genuinely want non-finite handling — assert it via a +Inf value injected through the parser's lower layers (or document that JSON cannot encode it). Test name aside, both branches assert the same LimitUnknown outcome, so consider folding to one case unless you’re specifically targeting different float magnitudes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/infra/modelcatalog/modelsdev/normalize_test.go` around lines 149 -
157, The subtest named "inf" in
TestParseSnapshot_limitRejectsOverflowAndNonFinite is misleading because JSON
cannot represent +Inf and the literal 1e308 is finite; update the test by either
renaming the "inf" case to something like "near_max_double" (or "large_finite")
to reflect it exercises large finite doubles, or replace the case with a true
non-finite value injected at a lower layer if you intend to test +Inf handling;
also consider collapsing the two cases into one since both currently assert the
same LimitUnknown outcome, and add a short comment in
TestParseSnapshot_limitRejectsOverflowAndNonFinite explaining why JSON cannot
encode infinity if you keep the finite-large test.
internal/core/runtime/executor_route_trace_catalog.go (1)

29-34: Function name routeTraceCatalogJSON is misleading.

The helper returns a *diag.RouteTraceCatalog struct, not JSON. Consider renaming to something like buildRouteTraceCatalog or newRouteTraceCatalog to match its actual contract; the JSON encoding happens at the route-trace handler boundary.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/runtime/executor_route_trace_catalog.go` around lines 29 - 34,
The function name routeTraceCatalogJSON is misleading because it returns a
*diag.RouteTraceCatalog, so rename routeTraceCatalogJSON to a clearer name such
as buildRouteTraceCatalog or newRouteTraceCatalog and update all call sites to
use the new name; ensure you also update any comments/docs referencing
routeTraceCatalogJSON and run/adjust tests that reference the symbol (look for
usages in route-trace handler code and helper callers to replace
routeTraceCatalogJSON with the new function name).
internal/core/config/sql_query_cache_test.go (1)

11-60: Consider consolidating overlapping empty-TTL tests.

TestValidate_secureSession_sqlQueryCacheTTL_emptyOK and TestValidate_secureSession_sqlQueryCacheTTL_emptyAndMaxOK both assert that an empty TTL is accepted — only SQLQueryCacheMaxEntries differs (100 vs 0). A single table-driven test with {maxEntries int} cases would make the contract clearer (empty TTL is valid regardless of MaxEntries) and reduce drift between near-duplicate fixtures.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/config/sql_query_cache_test.go` around lines 11 - 60, Replace
the two near-duplicate tests TestValidate_secureSession_sqlQueryCacheTTL_emptyOK
and TestValidate_secureSession_sqlQueryCacheTTL_emptyAndMaxOK with a single
table-driven test (e.g.,
TestValidate_secureSession_sqlQueryCacheTTL_empty_variousMaxEntries) that
iterates over maxEntries cases (e.g., 100 and 0), sets SQLQueryCacheTTL to ""
and SQLQueryCacheMaxEntries to the table value, calls config.Validate(cfg) for
each subtest, and asserts no error; keep t.Parallel() and reuse the same
secureSessionBaselinePlugins(), TokenFingerprintKey, and other fixture values so
the behavior (empty TTL accepted regardless of MaxEntries) is verified in one
consolidated test while leaving
TestValidate_secureSession_sqlQueryCacheMaxEntries_negative unchanged.
internal/core/config/validate.go (2)

625-632: Clarify error when update_interval is empty.

When external_updates_enabled=true and update_interval is omitted, time.ParseDuration("") produces time: invalid duration "", which is functional but not as actionable as the surrounding "required when …" messages. Consider an explicit empty-string branch.

♻️ Proposed refactor
 		ui := strings.TrimSpace(mc.UpdateInterval)
+		if ui == "" {
+			return fmt.Errorf("model_catalog.update_interval: required when model_catalog.external_updates_enabled is true")
+		}
 		d, err := time.ParseDuration(ui)
 		if err != nil {
 			return fmt.Errorf("model_catalog.update_interval: %w", err)
 		}
 		if d <= 0 {
 			return fmt.Errorf("model_catalog.update_interval: must be positive when model_catalog.external_updates_enabled is true")
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/config/validate.go` around lines 625 - 632, When validating
mc.UpdateInterval before calling time.ParseDuration in the validation logic (the
ui variable and the time.ParseDuration call), add an explicit check for an empty
string and return a clear, actionable error like "model_catalog.update_interval:
required when model_catalog.external_updates_enabled is true" instead of letting
time.ParseDuration produce "time: invalid duration \"\""; keep the existing
non-positive-duration check (d <= 0) after parsing.

634-639: Source URL re-checked when external_updates_enabled=true.

When mc.ExternalUpdatesEnabled is true, lines 614-624 already validate SourceURL (including the https requirement). This block then revalidates the same URL with a weaker check. Not incorrect, but consider gating it with else so the validation paths stay disjoint.

♻️ Proposed refactor
-	if su := strings.TrimSpace(mc.SourceURL); su != "" {
+	} else if su := strings.TrimSpace(mc.SourceURL); su != "" {
 		u, err := url.Parse(su)
 		if err != nil || u.Scheme == "" || u.Host == "" {
 			return fmt.Errorf("model_catalog.source_url: invalid URL")
 		}
 	}

(Move this block into the else arm of the if mc.ExternalUpdatesEnabled above.)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/config/validate.go` around lines 634 - 639, The SourceURL
validation block currently runs unconditionally and duplicates a weaker check
when mc.ExternalUpdatesEnabled already validated SourceURL; modify the logic so
the snippet that parses and validates mc.SourceURL (url.Parse and scheme/host
checks) is executed only in the else branch of the existing if
mc.ExternalUpdatesEnabled block (i.e., move this validation into the else arm),
ensuring the two validation paths are disjoint and SourceURL is not re-checked
with a weaker rule when ExternalUpdatesEnabled=true.
internal/infra/runtimebundle/modelcatalog_attach.go (2)

32-34: Errors from duration parsers are intentionally dropped — depends on Validate having run.

UpdateIntervalDuration() and FetchTimeoutDuration() errors are discarded with _. This is safe only because config.Validate (which checks both durations) is required to have run before Build. If Build is ever invoked on an un-validated config (e.g., in a future test or fast-path), zero-duration values silently disable refresh / timeout. A short comment near these lines anchoring the assumption — or a defensive log — would harden against future regressions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/infra/runtimebundle/modelcatalog_attach.go` around lines 32 - 34,
The calls to mc.UpdateIntervalDuration() and mc.FetchTimeoutDuration() currently
discard parse errors (interval, _ := ...; fetchTimeout, _ := ...), relying on
config.Validate having run; add a short explanatory comment next to these lines
that documents this precondition (i.e., Validate must be called before Build),
and optionally add a defensive debug/warn log via the same component (e.g.,
processLogger or mc logger) that emits a message if either parsed duration is
zero to detect un-validated configs during tests — reference the
UpdateIntervalDuration, FetchTimeoutDuration calls and the variables interval
and fetchTimeout in modelcatalog_attach.go when making the change.

56-66: Clarify ownership when closers == nil.

The branching here implies an unwritten contract: when closers == nil, the caller takes ownership of the returned *CatalogRuntime (must call Close), but the function still eagerly cancels any running refresh loop. This is workable, but:

  • If closers == nil and external updates are disabled, neither cat.Close is registered nor invoked anywhere in this function, relying entirely on the caller noticing the return value. A passing caller that ignores the return on the mc.Enabled && !mc.ExternalUpdatesEnabled path would leak the runtime.
  • The doc comment on attachModelCatalog doesn't state this ownership contract.

Consider documenting the contract on the function (e.g., "callers must Close the returned runtime when closers is nil") or always registering cat.Close and only short-circuiting refreshCleanup when there's no slice to receive it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/infra/runtimebundle/modelcatalog_attach.go` around lines 56 - 66,
The function attachModelCatalog currently leaves ownership ambiguous when
closers == nil: callers may leak *CatalogRuntime (cat) on the mc.Enabled &&
!mc.ExternalUpdatesEnabled path because cat.Close is neither registered nor
called; update attachModelCatalog to either document that callers must Close the
returned runtime when closers is nil or (preferably) always register/handle
cat.Close internally — e.g., if closers == nil then call cat.Close before
returning or append a local closer that will be invoked, and only suppress
invoking refreshCleanup when you intentionally transfer ownership; adjust logic
around cat.Close and refreshCleanup to ensure no leak and add a short doc
comment on attachModelCatalog mentioning the ownership contract referencing
cat.Close, refreshCleanup, and the mc.Enabled && !mc.ExternalUpdatesEnabled
branch.
internal/core/runtime/executor_modelcatalog_e2e_test.go (1)

105-143: Test asserts Resolve was called more times, not that new generations were observed.

The test name implies sequential executes observe newer snapshot generations, but the only assertion is gAfterSecond > gAfterFirst, i.e. that Resolve was called more times. The Snapshot.Generation value the resolver emits per call is never inspected on the executor side. Either tighten the assertion (e.g., capture stream events / route trace and verify the second execute saw gen2 rather than gen1) or rename to something like TestExecutor_sequentialExecutes_resolvePerCall.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/runtime/executor_modelcatalog_e2e_test.go` around lines 105 -
143, The test TestExecutor_sequentialExecutes_seeNewCatalogGenerations currently
asserts only that the resolver (snapshotGenResolver.Resolve) was invoked more
times; update the test to assert the executor actually observed newer
Snapshot.Generation values instead of just count. Specifically, when calling
ex.Execute(...) capture the execution's route/trace or inspect the stream events
returned by mustStream/Collect to extract the Snapshot.Generation or
resolver-emitted generation token for the first and second executions, then
assert secondGeneration > firstGeneration; alternatively, if you prefer the
simpler fix, rename the test to TestExecutor_sequentialExecutes_resolvePerCall
to match the current behavior. Ensure you reference
TestExecutor_sequentialExecutes_seeNewCatalogGenerations, snapshotGenResolver
(n/Resolve), and Snapshot.Generation in your changes so the intent is clear.
internal/core/runtime/executor_open_attempt.go (1)

88-94: Facts captured by closure may be stale relative to downgrades.

facts is computed inside the safety.CallValue closure with the original attempt. If negotiation returns NegotiationDowngrade and lipapi.ApplyNegotiatedDowngrades mutates attempt at line 137, the originally-captured facts no longer reflects effective state. This is correctly handled later (line 142 recomputes when eligibility runs, and lines 159-161 recompute when eligibility is off but downgrade occurred), so the route-trace at line 124 (negotiation reject path) is the only consumer of pre-downgrade facts — which is fine since reject means no downgrade applied. Just noting the data flow is intentional but subtle; a brief comment near line 88 explaining "facts captures pre-downgrade state; recomputed below if negotiation alters attempt" would help future maintainers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/runtime/executor_open_attempt.go` around lines 88 - 94, Add a
short comment near the facts capture inside the safety.CallValue closure (around
the call to e.effectiveFactsForAttempt and the local variable facts) explaining
that facts intentionally reflect the pre-downgrade state of attempt because
negotiation is performed first; note that if negotiation returns
NegotiationDowngrade and lipapi.ApplyNegotiatedDowngrades mutates attempt later,
facts will be recomputed for eligibility checks (via e.effectiveFactsForAttempt)
so the route-trace built on the rejection path can safely use the pre-downgrade
facts while downstream logic refreshes them as needed.
internal/infra/modelcatalog/modelsdev/store.go (1)

117-132: Optional: fsync the directory after rename for full crash atomicity.

tmp.Sync() flushes the file's contents, but on POSIX the directory entry created by os.Rename isn't durable until the parent directory is fsynced. For a cache that's regenerated on next fetch this is acceptable, but if you want the on-disk envelope to survive an immediate kernel/host crash after Save returns, open the directory and Sync it before returning.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/infra/modelcatalog/modelsdev/store.go` around lines 117 - 132, The
Save routine currently fsyncs the temp file (tmp.Sync()) and renames it to
f.path but doesn't fsync the parent directory, so the rename may not be durable;
after os.Rename(tmpPath, f.path) open the parent directory (use the directory
portion of f.path), call Sync on that directory file descriptor, then close it,
and return any Sync error wrapped similarly (e.g., "modelsdev cache dir sync:
%w") to ensure full on-disk durability.
internal/infra/modelcatalog/modelsdev/source.go (2)

58-58: Optional: set Accept: application/json on the request.

Some CDNs/proxies content-negotiate or return HTML error pages by default; an explicit Accept header makes the contract clearer and improves debuggability if the upstream ever serves multiple representations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/infra/modelcatalog/modelsdev/source.go` at line 58, The HTTP request
created with http.NewRequestWithContext(opCtx, http.MethodGet, h.url, nil)
should explicitly set an Accept header to "application/json" to avoid
content-negotiation returning non-JSON (e.g., HTML error pages); after creating
req (the variable returned from http.NewRequestWithContext) add
req.Header.Set("Accept", "application/json") before the request is sent (in the
same function handling the request/response).

70-80: Status check should precede (or at least accompany) the size error.

When the upstream returns a non-2xx with a body larger than maxCatalogHTTPResponseBytes, the user sees response body exceeds 67108864 bytes and the actual HTTP status is silently dropped. Reordering also avoids reading up to 64 MiB on every error response.

♻️ Proposed reorder
 	defer func() { _ = resp.Body.Close() }()
+	if resp.StatusCode < 200 || resp.StatusCode >= 300 {
+		return zero, fmt.Errorf("modelsdev source: http status %d", resp.StatusCode)
+	}
 	limited := io.LimitedReader{R: resp.Body, N: maxCatalogHTTPResponseBytes + 1}
 	body, err := io.ReadAll(&limited)
 	if err != nil {
 		return zero, fmt.Errorf("modelsdev source: read body: %w", err)
 	}
 	if int64(len(body)) > maxCatalogHTTPResponseBytes {
 		return zero, fmt.Errorf("modelsdev source: response body exceeds %d bytes", maxCatalogHTTPResponseBytes)
 	}
-	if resp.StatusCode < 200 || resp.StatusCode >= 300 {
-		return zero, fmt.Errorf("modelsdev source: http status %d", resp.StatusCode)
-	}
 	return ParseSnapshot(body, time.Now().UTC())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/infra/modelcatalog/modelsdev/source.go` around lines 70 - 80, The
current logic reads the entire (possibly huge) response into limited and only
afterwards checks resp.StatusCode, which masks non-2xx statuses with a size
error and forces reading up to maxCatalogHTTPResponseBytes; change the order in
the function that uses io.LimitedReader/limited and resp to check
resp.StatusCode first (resp.StatusCode < 200 || resp.StatusCode >= 300) and
return an error that includes the status code (optionally read a small bounded
snippet for context) before attempting the full body read, so the status error
from resp.StatusCode is returned immediately and large error responses are not
always fully read.
internal/core/securesession/adapters/bunstore/store.go (2)

92-111: Document that transcript_enabled is treated as immutable for cache correctness.

transcriptEnabledCached only writes the cache on hit and only invalidates on ErrSessionNotFound paths. If any code path later mutates transcript_enabled on an existing session row (admin tooling, policy migration, etc.), readers within SQLQueryCacheTTL will observe the stale prior value with no invalidation hook to flip it. Today that field appears set only at Create time, so the assumption holds — adding a one-line doc comment on the helper (or on the transcript_enabled column) protects the next person who's tempted to add an UPDATE.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/securesession/adapters/bunstore/store.go` around lines 92 -
111, Add a one-line doc comment to document that the transcript_enabled column
is treated as immutable for cache correctness: annotate the
transcriptEnabledCached function (or the transcript_enabled column) to state
that the value is only set at creation and must not be updated later because the
function caches the value (uses s.meta.transcript.Set and only invalidates on
session-not-found), so any UPDATE to transcript_enabled will lead to stale reads
until TTL expiry; reference the function name transcriptEnabledCached and the
transcript_enabled column in the comment so future maintainers understand the
invariant.

64-70: Avoid shadowing the built-in cap.

cap is a Go predeclared identifier (used for slice/channel capacity). Local shadowing is legal but commonly flagged by govet/predeclared and reduces readability. Rename to e.g. maxEntries.

♻️ Proposed rename
-	if opts.SQLQueryCacheTTL > 0 {
-		cap := uint64(opts.SQLQueryCacheMaxEntries)
-		if cap == 0 {
-			cap = 4096
-		}
-		s.meta = newSessionMetaCache(opts.SQLQueryCacheTTL, cap)
-	}
+	if opts.SQLQueryCacheTTL > 0 {
+		maxEntries := uint64(opts.SQLQueryCacheMaxEntries)
+		if maxEntries == 0 {
+			maxEntries = 4096
+		}
+		s.meta = newSessionMetaCache(opts.SQLQueryCacheTTL, maxEntries)
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/securesession/adapters/bunstore/store.go` around lines 64 - 70,
The local variable named `cap` in the conditional that initializes `s.meta`
shadows Go's predeclared `cap` identifier; rename it (e.g., to `maxEntries`) to
avoid shadowing and improve readability. Update the block that checks
`opts.SQLQueryCacheTTL` and sets the capacity—replace `cap` with `maxEntries`
(or similar) and use it when calling `newSessionMetaCache(opts.SQLQueryCacheTTL,
maxEntries)`; leave the `opts.SQLQueryCacheMaxEntries` fallback logic unchanged.
internal/core/modelcatalog/diagnostics.go (1)

134-148: Minor: clock-skew safety in staleness check.

now.Sub(snap.FetchedAt) > 2*cfg.UpdateInterval correctly returns false (not stale) when FetchedAt is in the future, so behavior is benign — but if you'd rather treat future timestamps as a signal worth surfacing to operators, you'd compare |now - FetchedAt| instead. No change required for chill review; flagging for awareness.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/modelcatalog/diagnostics.go` around lines 134 - 148, The
staleness check in catalogSnapshotIsStale currently ignores clock skew when
snap.FetchedAt is in the future; update the logic in catalogSnapshotIsStale to
treat future timestamps as noteworthy by computing the duration delta :=
now.Sub(snap.FetchedAt) and marking stale if delta < 0 || delta >
2*cfg.UpdateInterval (referencing catalogSnapshotIsStale, snap.FetchedAt, now,
and cfg.UpdateInterval); make sure to keep the existing external-updates and
LastRefreshFailure checks and only change the time-comparison branch to use the
absolute/negativity-aware condition.
internal/core/securesession/adapters/sqlite/store.go (2)

137-156: Negative-cache existence when transcript read sees NoRows.

When transcriptEnabledCached falls through to SQL and gets sql.ErrNoRows, only domain.ErrSessionNotFound is returned — the existence cache is not updated. sessionExists does positively cache false for the same condition, so the two caches can disagree until natural eviction. Aligning them keeps the diagnostic shape coherent and avoids a duplicate SQL probe on the next call path that hits sessionExists.

♻️ Mirror the negative existence cache
 	err := q.QueryRowContext(ctx, `SELECT transcript_enabled FROM lip_secure_sessions WHERE session_id = ?`, string(id)).Scan(&te)
 	if err != nil {
 		if errors.Is(err, sql.ErrNoRows) {
+			if s.meta != nil {
+				s.meta.exists.Set(id, false, ttlcache.DefaultTTL)
+			}
 			return false, domain.ErrSessionNotFound
 		}
 		return false, err
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/securesession/adapters/sqlite/store.go` around lines 137 - 156,
In transcriptEnabledCached, when the SQL query returns sql.ErrNoRows you should
mirror the negative existence cache just like sessionExists does: set
s.meta.transcript.Set(id, false, ttlcache.DefaultTTL) (guarded by s.meta != nil)
before returning domain.ErrSessionNotFound so the transcript existence cache and
sessionExists remain consistent and avoid a duplicate SQL probe.

108-117: Avoid shadowing the cap builtin.

cap := uint64(opts.SQLQueryCacheMaxEntries) shadows the predeclared cap identifier within this function. Harmless today but trips up future refactors that want cap(slice) here.

♻️ Rename the local
-	if opts.SQLQueryCacheTTL > 0 {
-		cap := uint64(opts.SQLQueryCacheMaxEntries)
-		if cap == 0 {
-			cap = 4096
-		}
-		st.meta = newSessionMetaCache(opts.SQLQueryCacheTTL, cap)
-	}
+	if opts.SQLQueryCacheTTL > 0 {
+		maxEntries := uint64(opts.SQLQueryCacheMaxEntries)
+		if maxEntries == 0 {
+			maxEntries = 4096
+		}
+		st.meta = newSessionMetaCache(opts.SQLQueryCacheTTL, maxEntries)
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/securesession/adapters/sqlite/store.go` around lines 108 - 117,
The local variable currently named "cap" shadows the predeclared cap builtin;
rename it (for example to "cacheCap" or "maxEntries") where it's declared in the
block that constructs Store (around "st := &Store{db: db}") and update its use
in the call to newSessionMetaCache(opts.SQLQueryCacheTTL, cap) so the code
becomes newSessionMetaCache(opts.SQLQueryCacheTTL, cacheCap). Ensure the renamed
identifier replaces all occurrences in that snippet (the assignment from
opts.SQLQueryCacheMaxEntries and the fallback to 4096) and leave the Store.meta
= newSessionMetaCache(...) logic unchanged.
internal/core/modelcatalog/estimate.go (1)

11-17: Optional: declare a named constant for the tools+session basis.

estimateBasisForSessionPath(true) synthesizes "canonical_utf8_bytes+tools_json_bytes+session_bytes" via string concatenation, while every other basis value has an exported constant (Req 7.7). Diagnostics or tests asserting on basis strings won't have a stable symbol for this combination.

♻️ Add a constant for the combined tools+session basis
 const (
 	EstimateBasisCanonicalUTF8                  = "canonical_utf8_bytes"
 	EstimateBasisCanonicalUTF8AndTools          = "canonical_utf8_bytes+tools_json_bytes"
 	EstimateBasisCanonicalUTF8AndSession        = "canonical_utf8_bytes+session_bytes"
+	EstimateBasisCanonicalUTF8AndToolsAndSession = "canonical_utf8_bytes+tools_json_bytes+session_bytes"
 	EstimateBasisSessionContributionUnavailable = "session_contribution_unavailable"
 )
@@
 func estimateBasisForSessionPath(hasTools bool) string {
 	if hasTools {
-		return EstimateBasisCanonicalUTF8AndTools + "+session_bytes"
+		return EstimateBasisCanonicalUTF8AndToolsAndSession
 	}
 	return EstimateBasisCanonicalUTF8AndSession
 }

Also applies to: 93-98

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/modelcatalog/estimate.go` around lines 11 - 17, Declare a new
exported constant (e.g., EstimateBasisCanonicalUTF8AndToolsAndSession =
"canonical_utf8_bytes+tools_json_bytes+session_bytes") alongside the other
EstimateBasis* constants, and replace the string concatenation in
estimateBasisForSessionPath (the branch that currently builds
"canonical_utf8_bytes+tools_json_bytes+session_bytes") with that new constant;
also update any other occurrences (the other spot around the existing
EstimateBasis constants) to reference this new named constant instead of
hardcoded or concatenated strings.
internal/core/modelcatalog/runtime.go (1)

53-69: Consider surfacing cache-load failure during Start.

Start discards both the error and the case where snap.Index == nil. The diagnostics endpoint reports LastRefreshFailure to operators; if the on-disk cache exists but is corrupt/unparseable on boot, you currently report RefreshFailureNone until the first refresh succeeds or fails — which may mask a real persistence issue (especially when refresh is disabled and only the cache is configured).

♻️ Tag cache-load failures as `RefreshFailureCache`
 	if r.cfg.Cache != nil {
-		snap, err := r.cfg.Cache.Load(parent)
-		if err == nil && snap.Index != nil {
-			cp := snap
-			r.active.Store(&cp)
-		}
+		snap, err := r.cfg.Cache.Load(parent)
+		switch {
+		case err == nil && snap.Index != nil:
+			cp := snap
+			r.active.Store(&cp)
+		case err != nil:
+			r.setFailure(RefreshFailureCache)
+		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/modelcatalog/runtime.go` around lines 53 - 69, In
CatalogRuntime.Start, don't silently ignore cache Load errors: when r.cfg.Cache
!= nil and r.cfg.Cache.Load(parent) returns a non-nil err, record that failure
in the runtime diagnostics as a cache-specific refresh failure (e.g., set the
diagnostics LastRefreshFailure to RefreshFailureCache and include the err),
while keeping the existing behavior of loading and storing a valid snap when err
== nil && snap.Index != nil; update the Start implementation (around the
r.cfg.Cache.Load call in CatalogRuntime.Start) to call the diagnostics/logging
helper (e.g., r.diagnostics.SetLastRefreshFailure or similar) with
RefreshFailureCache and the underlying error so the boot-time cache corruption
is surfaced.
internal/core/config/model_catalog_test.go (1)

24-26: Optional: inline yamlPath or drop it.

yamlPath is a one-line wrapper around filepath.ToSlash used identically at every call site. Inlining filepath.ToSlash(...) would remove a small layer of indirection without losing readability. Not a blocker.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/config/model_catalog_test.go` around lines 24 - 26, The
yamlPath helper is a trivial one-line wrapper around filepath.ToSlash and is
used identically everywhere; remove the yamlPath function and replace its call
sites with direct calls to filepath.ToSlash(...) (or, alternatively, inline
filepath.ToSlash(...) at each usage) so there’s no indirection—search for
yamlPath(...) and update those locations and delete the yamlPath function
definition.

Comment thread internal/core/config/effective_secure_session.go
Comment thread internal/core/config/validate.go
Comment thread internal/core/modelcatalog/diagnostics.go
Comment on lines +11 to +16
type EligibilityDecision struct {
IsEligible bool `json:"eligible,omitempty"`
Reason EligibilityReason
Facts EffectiveFacts
Estimate SizeEstimate
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

omitempty on a bool will silently drop the false (ineligible) case.

If EligibilityDecision is ever marshaled (the partial JSON tag suggests it may be), an ineligible decision serializes to {} and is indistinguishable from a missing field. Either drop omitempty or remove the tag if no JSON encoding is intended; also consider tagging the other fields consistently.

As per coding guidelines: "Slices in JSON and returned values must be explicitly initialized as empty ... to prevent null in JSON" — analogous reasoning applies to booleans here.

🛡️ Proposed fix
 type EligibilityDecision struct {
-	IsEligible bool `json:"eligible,omitempty"`
-	Reason     EligibilityReason
-	Facts      EffectiveFacts
-	Estimate   SizeEstimate
+	IsEligible bool              `json:"eligible"`
+	Reason     EligibilityReason `json:"reason,omitempty"`
+	Facts      EffectiveFacts    `json:"facts,omitempty"`
+	Estimate   SizeEstimate      `json:"estimate,omitempty"`
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
type EligibilityDecision struct {
IsEligible bool `json:"eligible,omitempty"`
Reason EligibilityReason
Facts EffectiveFacts
Estimate SizeEstimate
}
type EligibilityDecision struct {
IsEligible bool `json:"eligible"`
Reason EligibilityReason `json:"reason,omitempty"`
Facts EffectiveFacts `json:"facts,omitempty"`
Estimate SizeEstimate `json:"estimate,omitempty"`
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/modelcatalog/eligibility.go` around lines 11 - 16, The JSON tag
on EligibilityDecision.IsEligible uses `omitempty` which will drop false values;
remove the `omitempty` (i.e., change `json:"eligible,omitempty"` to
`json:"eligible"`) or remove the tag entirely so a false/ineligible decision is
serialized, and make the JSON tagging consistent for the other fields on the
EligibilityDecision type (add explicit `json:"reason"`, `json:"facts"`,
`json:"estimate"` tags as appropriate) and ensure any slice/collection fields in
EffectiveFacts or SizeEstimate are initialized to empty slices when constructed
to avoid null in JSON.

Comment thread internal/core/modelcatalog/matcher.go
Comment thread internal/core/securesession/adapters/bunstore/store_test.go
Comment thread internal/core/securesession/adapters/bunstore/store_test.go Outdated
Comment thread internal/infra/modelcatalog/modelsdev/normalize.go
Comment thread internal/infra/runtimebundle/build.go
@matdev83 matdev83 merged commit c307d5f into main Apr 27, 2026
1 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant