feat: add model capabilities catalog#7
Conversation
|
Warning Rate limit exceeded
To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (12)
📝 WalkthroughWalkthroughThis pull request introduces a comprehensive model catalog feature enabling local snapshot-based model capability facts and context-limit enforcement. It adds configuration structures, core catalog logic for fact resolution and eligibility checking, models.dev catalog ingestion (parsing, HTTP fetch, file caching), integration with the executor for routing decisions, SQL query TTL caching for secure sessions, diagnostics endpoints, and extensive test coverage. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
✨ Finishing Touches🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/stdhttp/security_guard_test.go (1)
17-44:⚠️ Potential issue | 🔴 CriticalRace condition: parallel tests share the
runningAsAdminglobal.All three tests now run with
t.Parallel()while each one swaps the package-levelrunningAsAdminviawithRunningAsAdmin. Concurrent execution will interleave the writes/reads of this shared global, producing flaky outcomes (e.g.rejectsAdminUsermay observe thefalsestub installed by another test) and tripping-race. Either dropt.Parallel()for these three, or refactorvalidateStartupSecurityto take the admin-check as a parameter/dependency so each test has its own isolated stub.🔧 Minimal fix — keep them serial
func TestValidateStartupSecurity_rejectsAdminUser(t *testing.T) { - t.Parallel() withRunningAsAdmin(t, func() (bool, error) { return true, nil }) ... func TestValidateStartupSecurity_rejectsNoAuthNonLoopback(t *testing.T) { - t.Parallel() withRunningAsAdmin(t, func() (bool, error) { return false, nil }) ... func TestValidateStartupSecurity_allowsLoopbackNonAdmin(t *testing.T) { - t.Parallel() withRunningAsAdmin(t, func() (bool, error) { return false, nil })🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/stdhttp/security_guard_test.go` around lines 17 - 44, These tests race because they share the package-level runningAsAdmin stub toggled by withRunningAsAdmin while each test calls t.Parallel(); remove t.Parallel() from TestValidateStartupSecurity_rejectsAdminUser, TestValidateStartupSecurity_rejectsNoAuthNonLoopback, and TestValidateStartupSecurity_allowsLoopbackNonAdmin so they run serially, or alternatively refactor validateStartupSecurity to accept an injected admin-check function (instead of using the runningAsAdmin global) and update the tests to pass isolated stubs via that new parameter; ensure references to runningAsAdmin and withRunningAsAdmin are updated/removed if you choose the injection approach.
🧹 Nitpick comments (37)
internal/core/config/model_catalog_duration.go (1)
10-33: Optional: extract shared positive-duration parser.The two methods are byte-for-byte identical apart from the field accessed. A tiny private helper would remove the duplication and make future additions (e.g., a
RetryBackoffduration field) trivial:♻️ Proposed refactor
func (mc ModelCatalogConfig) UpdateIntervalDuration() (d time.Duration, ok bool) { - s := strings.TrimSpace(mc.UpdateInterval) - if s == "" { - return 0, false - } - d, err := time.ParseDuration(s) - if err != nil || d <= 0 { - return 0, false - } - return d, true + return parsePositiveDuration(mc.UpdateInterval) } // FetchTimeoutDuration returns a positive parsed model_catalog.fetch_timeout duration. func (mc ModelCatalogConfig) FetchTimeoutDuration() (d time.Duration, ok bool) { - s := strings.TrimSpace(mc.FetchTimeout) - if s == "" { - return 0, false - } - d, err := time.ParseDuration(s) - if err != nil || d <= 0 { - return 0, false - } - return d, true + return parsePositiveDuration(mc.FetchTimeout) +} + +func parsePositiveDuration(raw string) (time.Duration, bool) { + s := strings.TrimSpace(raw) + if s == "" { + return 0, false + } + d, err := time.ParseDuration(s) + if err != nil || d <= 0 { + return 0, false + } + return d, true }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/config/model_catalog_duration.go` around lines 10 - 33, Extract a small private helper like parsePositiveDuration(s string) (time.Duration, bool) and have both ModelCatalogConfig.UpdateIntervalDuration and ModelCatalogConfig.FetchTimeoutDuration call it: trim the input string, return false on empty, parse with time.ParseDuration, and ensure d > 0; this removes duplicated logic and makes adding future fields (e.g., RetryBackoff) straightforward—update UpdateIntervalDuration and FetchTimeoutDuration to simply pass mc.UpdateInterval and mc.FetchTimeout (after trimming if you prefer) into the new helper and return its results.internal/core/modelcatalog/matcher.go (1)
54-60: Optional: useslices.Clonefor the ambiguous candidates copy.Since
slicesis already imported and used forslices.Sort, the manualappend([]string(nil), ids...)could beslices.Clone(ids)for slightly clearer intent. Behaviorally identical.♻️ Proposed nit
default: - out := append([]string(nil), ids...) + out := slices.Clone(ids) return MatchResult{ Kind: MatchAmbiguous, InputModel: input, Candidates: out, }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/modelcatalog/matcher.go` around lines 54 - 60, Replace the manual copy append([]string(nil), ids...) with slices.Clone(ids) to make the intent clearer; update the default branch in the matcher (the block that constructs MatchResult with Kind MatchAmbiguous, InputModel input, and Candidates out) to set Candidates to slices.Clone(ids) instead of creating out via append — slices is already imported and used elsewhere (e.g., slices.Sort) so this is a behavior-preserving clarity improvement.internal/core/modelcatalog/facts.go (1)
66-72: Nit: missing doc comments on exportedEligibilityReasonconstants.Other exported constants in this file (
CapabilityUnknown,LimitUnknown,FactSource*,Match*) carry godoc, but the threeEligibility*constants do not. For consistency with the rest of the file (and to satisfyrevive/staticcheckexported-comment rules), add brief doc comments.📝 Proposed comments
const ( + // EligibilityUnknown is the zero value used before an eligibility decision is computed. EligibilityUnknown EligibilityReason = iota + // EligibilityEligible means the candidate may proceed to backend open. EligibilityEligible + // EligibilityContextLimitExceeded means the request size estimate exceeds a known catalog/override context limit. EligibilityContextLimitExceeded )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/modelcatalog/facts.go` around lines 66 - 72, Add godoc comments for the exported EligibilityReason type and its exported constants to match the rest of the file; specifically add brief one-line comments above EligibilityReason and each constant: EligibilityUnknown, EligibilityEligible, and EligibilityContextLimitExceeded, describing what the type represents and what each constant means (e.g., "EligibilityUnknown is the zero value for EligibilityReason", "EligibilityEligible indicates the subject is eligible", "EligibilityContextLimitExceeded indicates eligibility failed due to context limits"). Ensure comments use complete sentences and appear immediately above the type and each constant declaration.internal/archtest/modelcatalog_boundaries_test.go (1)
36-46: Consider extending forbidden SDK list for infra modelcatalog closure for consistency.
TestCoreModelCatalogImportClosureforbids OpenAI, Anthropic, and Google GenAI SDKs, butTestInfraModelCatalogImportClosureonly forbids the OpenAI SDK. Sinceinternal/infra/modelcatalog/modelsdevis a JSON/HTTP adapter for models.dev metadata (no provider invocation), it has no legitimate need for any provider SDK. Adding Anthropic and Google GenAI here would prevent accidental coupling and keep the boundary symmetric.♻️ Proposed addition
{ Substr: "github.com/openai/openai-go", ErrMsg: "internal/infra/modelcatalog must not depend on OpenAI SDK", }, + { + Substr: "github.com/anthropics/anthropic-sdk-go", + ErrMsg: "internal/infra/modelcatalog must not depend on Anthropic SDK", + }, + { + Substr: "google.golang.org/genai", + ErrMsg: "internal/infra/modelcatalog must not depend on Google GenAI SDK", + }, })🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/archtest/modelcatalog_boundaries_test.go` around lines 36 - 46, Update the forbidden dependency list in TestInfraModelCatalogImportClosure (the call to assertDepsExcludeForbidden that currently blocks "github.com/openai/openai-go") to also include Anthropic and Google GenAI SDKs; add forbiddenDep entries with Substr values matching the Anthropic SDK package (e.g., "github.com/anthropic-ai/..." or the repo used in the codebase) and the Google GenAI SDK package (e.g., "cloud.google.com/go/genai" or the repo used) and appropriate ErrMsg strings so the test symmetrically blocks Anthropic and Google GenAI imports just like TestCoreModelCatalogImportClosure does.internal/plugins/frontends/openailegacy/handler.go (1)
113-118: Consider discriminating by error reason rather than HTTP status.Using
out.Status == 413to decide whether to blank the OpenAIcodeworks for the current "all candidates context limit exceeded" sentinel, but it couples wire-shape selection to the transport status. If anotherKindClientRejectcause is later mapped to 413 (e.g., a different size-limit sentinel), it will silently inheritcode: "", which may not be intended. Threading a semantic field on theexecerr.Out(e.g.,Reason/WireCode) and switching on it here would keep this branch resilient.Not blocking for this PR.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/plugins/frontends/openailegacy/handler.go` around lines 113 - 118, The current branch that handles execerr.KindClientReject in handler.go determines the OpenAI `code` by checking out.Status == http.StatusRequestEntityTooLarge, which couples behavior to transport status; instead add/consume a semantic field (e.g., Reason or WireCode) on the execerr.Out type and switch on that value here (in the KindClientReject case within the handler where h.logWriteJSONErr and WriteErrorJSON are invoked), making the decision to set code = "" only when Reason == "context_length_exceeded" (or whatever sentinel you define); update the execerr.Out model to include Reason/WireCode and ensure callers that construct execerr.Out populate it so this branch uses out.Reason rather than out.Status.internal/core/modelcatalog/facts_test.go (1)
9-108: Tests primarily validate field round-trips, not behavior.These tests assign constants to
ModelFacts/LimitFactfields and assert they read back to the same constants. They will pass for any non-trivial enum/struct definition and don't exercise comparators, predicates, or downstream consumers (e.g., eligibility checks, source-priority resolution). Consider replacing with tests over the actual behavior these constants drive (e.g.,EffectiveFacts, override precedence, eligibility decisions) —override_resolver_test.gois a good template. The single useful invariant here is theLimitUnknown != LimitUnsupporteddistinction asserted inTestLimitFact_unknownVsUnsupported.As per coding guidelines: "Tests are behavior contracts, not implementation snapshots; write failing tests first (TDD)".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/modelcatalog/facts_test.go` around lines 9 - 108, The current TestModelFacts_* functions only assert field round-trips; replace them with behavior-driven tests that exercise the logic these constants affect (e.g., call the functions that compute effective facts, override precedence and eligibility) rather than re-asserting setters/getters; keep the meaningful TestLimitFact_unknownVsUnsupported check. Specifically, delete or convert TestModelFacts_catalogExactMatch, TestModelFacts_modelOverride, TestModelFacts_backendModelPairOverride, TestModelFacts_backendDeclaration, TestModelFacts_noCatalogMatch, and TestModelFacts_ambiguousCatalogMatch into tests that invoke the real resolution paths (e.g., EffectiveFacts, override resolution/precedence, and eligibility checks) and assert expected outcomes using override_resolver_test.go as a template, and ensure TestLimitFact_unknownVsUnsupported remains to validate LimitUnknown != LimitUnsupported.internal/core/modelcatalog/override_resolver_test.go (1)
10-99: Coverage gaps: ContextLimit overrides and key normalization edge cases.The summary mentions "token limit overrides" but no test asserts a
LimitFact(e.g.,LimitPresent/LimitUnsupported) survives via the resolver. Also missing: case sensitivity of the(backend, model)andmodelkeys (e.g.,OpenAI-Responsesvsopenai-responses,GPT-4ovsgpt-4o), and models whose names contain:(which could collide with the pair key delimiter if the resolver uses"backend:model"concatenation). Adding these would harden the resolver against silent miss/collision regressions.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/modelcatalog/override_resolver_test.go` around lines 10 - 99, Add tests to cover ContextLimit (token limit) overrides and key-normalization/collision edge cases: create OverrideSet entries (Model and Pair) that include ModelFacts.Limit (e.g., LimitPresent or LimitUnsupported) and assert the resolver (NewOverrideResolver/Resolve) returns those Limit values; add cases where candidate backend/model casing differs (e.g., "OpenAI-Responses" vs "openai-responses", "GPT-4o" vs "gpt-4o") to verify case-insensitive matching, and a case where a model name contains ':' (e.g., "custom:withcolon") to ensure Pair key concatenation like "backend:model" does not collide — for each, construct AttemptCandidate with routing.Primary and assert Resolve hits the expected override and preserves Limit and Source fields from ModelFacts.internal/core/modelcatalog/backend_caps_clone_test.go (1)
10-23: Optional: tighten the clone independence assertion.Current test proves the maps are distinct (delete on clone leaves orig intact). Two minor additions would harden the contract:
- Assert
len(cp) == 1before delete andlen(cp) == 0after, to make the intent unambiguous.- Add the reverse direction (mutate
orig, verifycpunchanged).Not required, but cheap insurance against future regressions if the helper ever switches away from
maps.Clone.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/modelcatalog/backend_caps_clone_test.go` around lines 10 - 23, Tighten the clone independence assertions in TestCloneBackendCaps_nilAndCopy: after calling modelcatalog.CloneBackendCaps(orig) assert len(cp) == 1 before deleting the key and assert len(cp) == 0 after the delete to make intent explicit, then perform the reverse mutation by deleting lipapi.CapabilityTools from orig (or otherwise mutating orig) and assert cp still contains lipapi.CapabilityTools (and its length remains as expected) to verify changes to orig do not affect the clone; reference the test function TestCloneBackendCaps_nilAndCopy and variables orig, cp, and modelcatalog.CloneBackendCaps.internal/core/modelcatalog/estimate_addsat_test.go (1)
8-19: LGTM — covers the documented saturating semantics.Optional: an additional case for a negative
boperand (e.g.,addSaturatingInt64(10, -5)) would lock in symmetry and prevent silent regressions if the helper ever changes to clampbdifferently.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/modelcatalog/estimate_addsat_test.go` around lines 8 - 19, Add a test case to TestAddSaturatingInt64_clampsAtMaxInt64 to cover a negative second operand: call addSaturatingInt64(10, -5) and assert the result equals 5 to ensure the function treats negative b symmetrically to negative a (i.e., clamps only on overflow/underflow and not by treating b as zero); update the test in internal/core/modelcatalog/estimate_addsat_test.go inside the existing TestAddSaturatingInt64_clampsAtMaxInt64 function to include this assertion using the same t.Fatalf style for failures.internal/stdhttp/server.go (1)
330-330: Placecontext.Contextfirst in the parameter list.Per Go convention (and
go vet's contextcheck-style linters),context.Contextis conventionally the first non-receiver parameter. HerelogCtxis last.Proposed signature change
-func mountModelCatalogDiagnostics(mux *http.ServeMux, cfg *config.Config, log *slog.Logger, built *runtimebundle.Built, logCtx context.Context) { +func mountModelCatalogDiagnostics(ctx context.Context, mux *http.ServeMux, cfg *config.Config, log *slog.Logger, built *runtimebundle.Built) {And update the call site:
- mountModelCatalogDiagnostics(mux, cfg, log, built, logCtx) + mountModelCatalogDiagnostics(logCtx, mux, cfg, log, built)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/stdhttp/server.go` at line 330, Change the function signature of mountModelCatalogDiagnostics to accept context.Context as the first non-receiver parameter (move logCtx to be the first parameter) and update all call sites accordingly; specifically, modify mountModelCatalogDiagnostics(...) so the context parameter appears before cfg, log, and built, and adjust any callers to pass the existing logCtx as the first argument to match the new signature.internal/infra/modelcatalog/modelsdev/fuzz_test.go (1)
8-12: Consider adding seed inputs to guide fuzzing.
f.Fuzzwithout anyf.Addseeds (or atestdata/fuzz/FuzzParseSnapshot/corpus) starts from the empty input only, which is slow to discover the JSON structure that exercisesParseSnapshot's normalization paths. Adding a couple of representative models.dev fixtures viaf.Add(...)significantly improves coverage in short fuzz runs.Example seed
func FuzzParseSnapshot(f *testing.F) { + f.Add([]byte(`{"providers":{"openai":{"models":{"gpt-4o":{"context":128000}}}}}`)) + f.Add([]byte(`{}`)) f.Fuzz(func(t *testing.T, b []byte) { _, _ = ParseSnapshot(b, time.Unix(0, 0).UTC()) }) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infra/modelcatalog/modelsdev/fuzz_test.go` around lines 8 - 12, The fuzz test FuzzParseSnapshot currently provides no seed inputs so it only starts with an empty corpus; add representative seed byte slices via f.Add(...) inside FuzzParseSnapshot (targeting the ParseSnapshot function) to guide the fuzzer toward valid JSON/fixture shapes (e.g., a minimal valid models.dev snapshot and one or two normalized variants used in other tests or fixtures), or populate testdata/fuzz/FuzzParseSnapshot/ with real snapshot files and call f.Add with their contents so fuzzing exercises ParseSnapshot's normalization paths more quickly.internal/core/runtime/executor_catalog.go (1)
13-24: Reorder nil check onefor clarity.
base := e.backendCapsForAttempt(...)runs before thee == nilguard at line 20. It happens to be safe becausebackendCapsForAttemptdoes its owne != nilcheck, but the apparent ordering invites a future "this could NPE" bug if either method's invariants change. Either drop the now-redundante == nilhere or hoist the guard above the call.♻️ Proposed reorder
func (e *Executor) effectiveFactsForAttempt( ctx context.Context, be execbackend.Backend, attempt lipapi.Call, c routing.AttemptCandidate, ) modelcatalog.EffectiveFacts { - base := e.backendCapsForAttempt(ctx, be, attempt, c) - if e == nil || e.CatalogResolver == nil { + if e == nil { + return syntheticBackendOnlyFacts(execbackend.EffectiveCaps(ctx, be, attempt, c), c) + } + base := e.backendCapsForAttempt(ctx, be, attempt, c) + if e.CatalogResolver == nil { return syntheticBackendOnlyFacts(base, c) } return e.CatalogResolver.Resolve(ctx, c, attempt, base) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/runtime/executor_catalog.go` around lines 13 - 24, The function effectiveFactsForAttempt calls e.backendCapsForAttempt before checking e == nil which is fragile; move the nil guard (e == nil || e.CatalogResolver == nil) to the top of effectiveFactsForAttempt so you return syntheticBackendOnlyFacts(...) immediately when e or e.CatalogResolver is nil, and only then call e.backendCapsForAttempt to compute base and finally call e.CatalogResolver.Resolve(ctx, c, attempt, base); reference symbols: effectiveFactsForAttempt, backendCapsForAttempt, CatalogResolver, syntheticBackendOnlyFacts, Resolve.internal/stdhttp/catalog_status_handler.go (2)
26-33: Minor: encode errors after status 200 produce a truncated body that the client can't distinguish from success.Once
enc.Encodestarts writing, the 200 status is already flushed, so a mid-stream encode error leaves the client with a partial JSON document and only a server-side log entry. For a small diagnostics payload, encoding to a buffer first (thenw.Write) lets you fall back to a 5xx on encode failure. Optional — current behavior matches the common stdlib pattern andBuildCatalogDiagnosticsJSONis unlikely to fail to marshal.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/stdhttp/catalog_status_handler.go` around lines 26 - 33, The handler currently streams JSON via enc.Encode(v) which can flush a 200 before an encode error; instead, marshal/encode the diagnostics into a temporary buffer first (e.g., use json.Marshal(v) or encode into a bytes.Buffer), check for errors, and only if successful set the Content-Type and write the buffer to w; on marshal/encode error log via log.ErrorContext(r.Context(), "modelcatalog: catalog status encode", "err", err) and return a 5xx (e.g., http.Error(w, "internal server error", http.StatusInternalServerError)) so the client does not receive a truncated 200 response — locate code around BuildCatalogDiagnosticsJSON(cfg) and enc.Encode to implement this change.
18-25: Redundant HEAD branch and missingAllowheader on 405.The inner
if r.Method == http.MethodHead { ... }produces a response identical to the fallthrough, so it's dead control flow. Separately, RFC 7231 §6.5.5 requires a 405 response to include anAllowheader listing supported methods;http.Errordoes not set one for you.♻️ Proposed simplification
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { if r.Method != http.MethodGet { - if r.Method == http.MethodHead { - http.Error(w, "method not allowed", http.StatusMethodNotAllowed) - return - } - http.Error(w, "method not allowed", http.StatusMethodNotAllowed) + w.Header().Set("Allow", "GET") + http.Error(w, "method not allowed", http.StatusMethodNotAllowed) return }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/stdhttp/catalog_status_handler.go` around lines 18 - 25, The branch checking r.Method == http.MethodHead is redundant because it returns the same http.Error as the fallthrough; remove the inner HEAD branch and simplify the method check to reject non-GET methods once. Also, when returning http.StatusMethodNotAllowed, set the "Allow" response header to "GET" before calling http.Error so the 405 response complies with RFC 7231; update the method-checking code that uses r.Method, http.MethodGet, http.MethodHead, http.Error and http.StatusMethodNotAllowed accordingly.internal/infra/modelcatalog/modelsdev/source_test.go (1)
96-109: Optional: assert via a typed sentinel rather than substring.
strings.Contains(err.Error(), "exceeds")will silently regress if the size-cap error message ever gets reworded. Ifmodelsdevexposes (or can expose) a typed sentinel likeErrResponseTooLarge, prefererrors.Is(err, modelsdev.ErrResponseTooLarge)here for the same reason the disabled path on Line 141 already does. Minor.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infra/modelcatalog/modelsdev/source_test.go` around lines 96 - 109, The test TestHTTPSource_bodyTooLarge currently checks the oversized-response error by substring on err.Error(); instead, change the assertion to use a typed sentinel via errors.Is(err, modelsdev.ErrResponseTooLarge) so the test remains robust to text changes—locate the failing assertion in TestHTTPSource_bodyTooLarge and replace the strings.Contains check with errors.Is against modelsdev.ErrResponseTooLarge (import "errors" in the test if needed) and adjust the t.Fatalf message accordingly.internal/infra/runtimebundle/secure_session.go (1)
167-172: Optional: factor the cache-options setup into a tiny helper.Both branches duplicate the same 4-line pattern for translating
EffectiveSecureSessionSQLQueryCacheoutputs into adapterOptions. The twoOptionstypes are distinct, so the cleanest dedup is a small generic helper or a typed struct of (ttl, max) consumed at the call site. Not blocking.♻️ Sketch
type sqlCacheOpts struct { TTL time.Duration Max int } func effectiveSecureSessionSQLCache(ss config.SecureSession) (sqlCacheOpts, bool) { ttl, maxE, ok := config.EffectiveSecureSessionSQLQueryCache(ss) if !ok { return sqlCacheOpts{}, false } return sqlCacheOpts{TTL: ttl, Max: int(maxE)}, true }Then each branch becomes a 3-line apply.
Also applies to: 229-233
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infra/runtimebundle/secure_session.go` around lines 167 - 172, The code duplicates the 4-line translation of config.EffectiveSecureSessionSQLQueryCache into adapter options in two places; add a tiny helper type (e.g., sqlCacheOpts with fields TTL time.Duration and Max int) and a helper function (e.g., effectiveSecureSessionSQLCache(ss config.SecureSession) (sqlCacheOpts, bool)) that calls config.EffectiveSecureSessionSQLQueryCache and converts maxE to int, then use that helper to apply TTL and Max to sqlite.Options.SQLQueryCacheTTL/SQLQueryCacheMaxEntries (and the other adapter's option fields) before calling sqlite.OpenContextWithOptions, removing the duplicated lines in both branches.internal/core/modelcatalog/diagnostics_test.go (1)
53-54: Bypassing the runtime constructor couples this test to internal layout.
var rt CatalogRuntime; rt.active.Store(&s)skipsNewCatalogRuntime/Start, unlike the siblingTestBuildCatalogDiagnosticsJSON_unavailable. IfCatalogRuntimelater gains required initialization in the constructor (channels, schedulers, additional atomics), this test will silently misrepresent reality or panic. A small package-private helper that publishes a snapshot via the normal lifecycle would keep the test resilient to internal changes.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/modelcatalog/diagnostics_test.go` around lines 53 - 54, The test currently bypasses the CatalogRuntime constructor by doing "var rt CatalogRuntime; rt.active.Store(&s)" which couples the test to internal layout; instead create and use a small package-private helper that constructs a CatalogRuntime via NewCatalogRuntime (and calls Start if required) and publishes the snapshot through the runtime's normal lifecycle, then replace the direct store call in this test with that helper so the test uses CatalogRuntime/NewCatalogRuntime/Start and remains resilient to future internal initialization changes.internal/infra/modelcatalog/modelsdev/normalize_test.go (1)
149-157: Subtest namedinfis not actually infinity.JSON has no infinity literal, and
1e308is a finitefloat64. The two cases here both exercise "finite double that overflowsint64". Either renameinfto something likenear_max_doublefor clarity, or — if you genuinely want non-finite handling — assert it via a+Infvalue injected through the parser's lower layers (or document that JSON cannot encode it). Test name aside, both branches assert the sameLimitUnknownoutcome, so consider folding to one case unless you’re specifically targeting different float magnitudes.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infra/modelcatalog/modelsdev/normalize_test.go` around lines 149 - 157, The subtest named "inf" in TestParseSnapshot_limitRejectsOverflowAndNonFinite is misleading because JSON cannot represent +Inf and the literal 1e308 is finite; update the test by either renaming the "inf" case to something like "near_max_double" (or "large_finite") to reflect it exercises large finite doubles, or replace the case with a true non-finite value injected at a lower layer if you intend to test +Inf handling; also consider collapsing the two cases into one since both currently assert the same LimitUnknown outcome, and add a short comment in TestParseSnapshot_limitRejectsOverflowAndNonFinite explaining why JSON cannot encode infinity if you keep the finite-large test.internal/core/runtime/executor_route_trace_catalog.go (1)
29-34: Function namerouteTraceCatalogJSONis misleading.The helper returns a
*diag.RouteTraceCatalogstruct, not JSON. Consider renaming to something likebuildRouteTraceCatalogornewRouteTraceCatalogto match its actual contract; the JSON encoding happens at the route-trace handler boundary.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/runtime/executor_route_trace_catalog.go` around lines 29 - 34, The function name routeTraceCatalogJSON is misleading because it returns a *diag.RouteTraceCatalog, so rename routeTraceCatalogJSON to a clearer name such as buildRouteTraceCatalog or newRouteTraceCatalog and update all call sites to use the new name; ensure you also update any comments/docs referencing routeTraceCatalogJSON and run/adjust tests that reference the symbol (look for usages in route-trace handler code and helper callers to replace routeTraceCatalogJSON with the new function name).internal/core/config/sql_query_cache_test.go (1)
11-60: Consider consolidating overlapping empty-TTL tests.
TestValidate_secureSession_sqlQueryCacheTTL_emptyOKandTestValidate_secureSession_sqlQueryCacheTTL_emptyAndMaxOKboth assert that an empty TTL is accepted — onlySQLQueryCacheMaxEntriesdiffers (100 vs 0). A single table-driven test with{maxEntries int}cases would make the contract clearer (empty TTL is valid regardless ofMaxEntries) and reduce drift between near-duplicate fixtures.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/config/sql_query_cache_test.go` around lines 11 - 60, Replace the two near-duplicate tests TestValidate_secureSession_sqlQueryCacheTTL_emptyOK and TestValidate_secureSession_sqlQueryCacheTTL_emptyAndMaxOK with a single table-driven test (e.g., TestValidate_secureSession_sqlQueryCacheTTL_empty_variousMaxEntries) that iterates over maxEntries cases (e.g., 100 and 0), sets SQLQueryCacheTTL to "" and SQLQueryCacheMaxEntries to the table value, calls config.Validate(cfg) for each subtest, and asserts no error; keep t.Parallel() and reuse the same secureSessionBaselinePlugins(), TokenFingerprintKey, and other fixture values so the behavior (empty TTL accepted regardless of MaxEntries) is verified in one consolidated test while leaving TestValidate_secureSession_sqlQueryCacheMaxEntries_negative unchanged.internal/core/config/validate.go (2)
625-632: Clarify error whenupdate_intervalis empty.When
external_updates_enabled=trueandupdate_intervalis omitted,time.ParseDuration("")producestime: invalid duration "", which is functional but not as actionable as the surrounding "required when …" messages. Consider an explicit empty-string branch.♻️ Proposed refactor
ui := strings.TrimSpace(mc.UpdateInterval) + if ui == "" { + return fmt.Errorf("model_catalog.update_interval: required when model_catalog.external_updates_enabled is true") + } d, err := time.ParseDuration(ui) if err != nil { return fmt.Errorf("model_catalog.update_interval: %w", err) } if d <= 0 { return fmt.Errorf("model_catalog.update_interval: must be positive when model_catalog.external_updates_enabled is true") }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/config/validate.go` around lines 625 - 632, When validating mc.UpdateInterval before calling time.ParseDuration in the validation logic (the ui variable and the time.ParseDuration call), add an explicit check for an empty string and return a clear, actionable error like "model_catalog.update_interval: required when model_catalog.external_updates_enabled is true" instead of letting time.ParseDuration produce "time: invalid duration \"\""; keep the existing non-positive-duration check (d <= 0) after parsing.
634-639: Source URL re-checked whenexternal_updates_enabled=true.When
mc.ExternalUpdatesEnabledis true, lines 614-624 already validateSourceURL(including the https requirement). This block then revalidates the same URL with a weaker check. Not incorrect, but consider gating it withelseso the validation paths stay disjoint.♻️ Proposed refactor
- if su := strings.TrimSpace(mc.SourceURL); su != "" { + } else if su := strings.TrimSpace(mc.SourceURL); su != "" { u, err := url.Parse(su) if err != nil || u.Scheme == "" || u.Host == "" { return fmt.Errorf("model_catalog.source_url: invalid URL") } }(Move this block into the
elsearm of theif mc.ExternalUpdatesEnabledabove.)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/config/validate.go` around lines 634 - 639, The SourceURL validation block currently runs unconditionally and duplicates a weaker check when mc.ExternalUpdatesEnabled already validated SourceURL; modify the logic so the snippet that parses and validates mc.SourceURL (url.Parse and scheme/host checks) is executed only in the else branch of the existing if mc.ExternalUpdatesEnabled block (i.e., move this validation into the else arm), ensuring the two validation paths are disjoint and SourceURL is not re-checked with a weaker rule when ExternalUpdatesEnabled=true.internal/infra/runtimebundle/modelcatalog_attach.go (2)
32-34: Errors from duration parsers are intentionally dropped — depends onValidatehaving run.
UpdateIntervalDuration()andFetchTimeoutDuration()errors are discarded with_. This is safe only becauseconfig.Validate(which checks both durations) is required to have run beforeBuild. IfBuildis ever invoked on an un-validated config (e.g., in a future test or fast-path), zero-duration values silently disable refresh / timeout. A short comment near these lines anchoring the assumption — or a defensive log — would harden against future regressions.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infra/runtimebundle/modelcatalog_attach.go` around lines 32 - 34, The calls to mc.UpdateIntervalDuration() and mc.FetchTimeoutDuration() currently discard parse errors (interval, _ := ...; fetchTimeout, _ := ...), relying on config.Validate having run; add a short explanatory comment next to these lines that documents this precondition (i.e., Validate must be called before Build), and optionally add a defensive debug/warn log via the same component (e.g., processLogger or mc logger) that emits a message if either parsed duration is zero to detect un-validated configs during tests — reference the UpdateIntervalDuration, FetchTimeoutDuration calls and the variables interval and fetchTimeout in modelcatalog_attach.go when making the change.
56-66: Clarify ownership whenclosers == nil.The branching here implies an unwritten contract: when
closers == nil, the caller takes ownership of the returned*CatalogRuntime(must callClose), but the function still eagerly cancels any running refresh loop. This is workable, but:
- If
closers == niland external updates are disabled, neithercat.Closeis registered nor invoked anywhere in this function, relying entirely on the caller noticing the return value. A passing caller that ignores the return on themc.Enabled && !mc.ExternalUpdatesEnabledpath would leak the runtime.- The doc comment on
attachModelCatalogdoesn't state this ownership contract.Consider documenting the contract on the function (e.g., "callers must
Closethe returned runtime whenclosersis nil") or always registeringcat.Closeand only short-circuitingrefreshCleanupwhen there's no slice to receive it.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infra/runtimebundle/modelcatalog_attach.go` around lines 56 - 66, The function attachModelCatalog currently leaves ownership ambiguous when closers == nil: callers may leak *CatalogRuntime (cat) on the mc.Enabled && !mc.ExternalUpdatesEnabled path because cat.Close is neither registered nor called; update attachModelCatalog to either document that callers must Close the returned runtime when closers is nil or (preferably) always register/handle cat.Close internally — e.g., if closers == nil then call cat.Close before returning or append a local closer that will be invoked, and only suppress invoking refreshCleanup when you intentionally transfer ownership; adjust logic around cat.Close and refreshCleanup to ensure no leak and add a short doc comment on attachModelCatalog mentioning the ownership contract referencing cat.Close, refreshCleanup, and the mc.Enabled && !mc.ExternalUpdatesEnabled branch.internal/core/runtime/executor_modelcatalog_e2e_test.go (1)
105-143: Test assertsResolvewas called more times, not that new generations were observed.The test name implies sequential executes observe newer snapshot generations, but the only assertion is
gAfterSecond > gAfterFirst, i.e. thatResolvewas called more times. TheSnapshot.Generationvalue the resolver emits per call is never inspected on the executor side. Either tighten the assertion (e.g., capture stream events / route trace and verify the second execute sawgen2rather thangen1) or rename to something likeTestExecutor_sequentialExecutes_resolvePerCall.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/runtime/executor_modelcatalog_e2e_test.go` around lines 105 - 143, The test TestExecutor_sequentialExecutes_seeNewCatalogGenerations currently asserts only that the resolver (snapshotGenResolver.Resolve) was invoked more times; update the test to assert the executor actually observed newer Snapshot.Generation values instead of just count. Specifically, when calling ex.Execute(...) capture the execution's route/trace or inspect the stream events returned by mustStream/Collect to extract the Snapshot.Generation or resolver-emitted generation token for the first and second executions, then assert secondGeneration > firstGeneration; alternatively, if you prefer the simpler fix, rename the test to TestExecutor_sequentialExecutes_resolvePerCall to match the current behavior. Ensure you reference TestExecutor_sequentialExecutes_seeNewCatalogGenerations, snapshotGenResolver (n/Resolve), and Snapshot.Generation in your changes so the intent is clear.internal/core/runtime/executor_open_attempt.go (1)
88-94: Facts captured by closure may be stale relative to downgrades.
factsis computed inside thesafety.CallValueclosure with the originalattempt. If negotiation returnsNegotiationDowngradeandlipapi.ApplyNegotiatedDowngradesmutatesattemptat line 137, the originally-capturedfactsno longer reflects effective state. This is correctly handled later (line 142 recomputes when eligibility runs, and lines 159-161 recompute when eligibility is off but downgrade occurred), so the route-trace at line 124 (negotiation reject path) is the only consumer of pre-downgrade facts — which is fine since reject means no downgrade applied. Just noting the data flow is intentional but subtle; a brief comment near line 88 explaining "facts captures pre-downgrade state; recomputed below if negotiation altersattempt" would help future maintainers.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/runtime/executor_open_attempt.go` around lines 88 - 94, Add a short comment near the facts capture inside the safety.CallValue closure (around the call to e.effectiveFactsForAttempt and the local variable facts) explaining that facts intentionally reflect the pre-downgrade state of attempt because negotiation is performed first; note that if negotiation returns NegotiationDowngrade and lipapi.ApplyNegotiatedDowngrades mutates attempt later, facts will be recomputed for eligibility checks (via e.effectiveFactsForAttempt) so the route-trace built on the rejection path can safely use the pre-downgrade facts while downstream logic refreshes them as needed.internal/infra/modelcatalog/modelsdev/store.go (1)
117-132: Optional: fsync the directory after rename for full crash atomicity.
tmp.Sync()flushes the file's contents, but on POSIX the directory entry created byos.Renameisn't durable until the parent directory is fsynced. For a cache that's regenerated on next fetch this is acceptable, but if you want the on-disk envelope to survive an immediate kernel/host crash afterSavereturns, open the directory andSyncit before returning.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infra/modelcatalog/modelsdev/store.go` around lines 117 - 132, The Save routine currently fsyncs the temp file (tmp.Sync()) and renames it to f.path but doesn't fsync the parent directory, so the rename may not be durable; after os.Rename(tmpPath, f.path) open the parent directory (use the directory portion of f.path), call Sync on that directory file descriptor, then close it, and return any Sync error wrapped similarly (e.g., "modelsdev cache dir sync: %w") to ensure full on-disk durability.internal/infra/modelcatalog/modelsdev/source.go (2)
58-58: Optional: setAccept: application/jsonon the request.Some CDNs/proxies content-negotiate or return HTML error pages by default; an explicit
Acceptheader makes the contract clearer and improves debuggability if the upstream ever serves multiple representations.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infra/modelcatalog/modelsdev/source.go` at line 58, The HTTP request created with http.NewRequestWithContext(opCtx, http.MethodGet, h.url, nil) should explicitly set an Accept header to "application/json" to avoid content-negotiation returning non-JSON (e.g., HTML error pages); after creating req (the variable returned from http.NewRequestWithContext) add req.Header.Set("Accept", "application/json") before the request is sent (in the same function handling the request/response).
70-80: Status check should precede (or at least accompany) the size error.When the upstream returns a non-2xx with a body larger than
maxCatalogHTTPResponseBytes, the user seesresponse body exceeds 67108864 bytesand the actual HTTP status is silently dropped. Reordering also avoids reading up to 64 MiB on every error response.♻️ Proposed reorder
defer func() { _ = resp.Body.Close() }() + if resp.StatusCode < 200 || resp.StatusCode >= 300 { + return zero, fmt.Errorf("modelsdev source: http status %d", resp.StatusCode) + } limited := io.LimitedReader{R: resp.Body, N: maxCatalogHTTPResponseBytes + 1} body, err := io.ReadAll(&limited) if err != nil { return zero, fmt.Errorf("modelsdev source: read body: %w", err) } if int64(len(body)) > maxCatalogHTTPResponseBytes { return zero, fmt.Errorf("modelsdev source: response body exceeds %d bytes", maxCatalogHTTPResponseBytes) } - if resp.StatusCode < 200 || resp.StatusCode >= 300 { - return zero, fmt.Errorf("modelsdev source: http status %d", resp.StatusCode) - } return ParseSnapshot(body, time.Now().UTC())🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infra/modelcatalog/modelsdev/source.go` around lines 70 - 80, The current logic reads the entire (possibly huge) response into limited and only afterwards checks resp.StatusCode, which masks non-2xx statuses with a size error and forces reading up to maxCatalogHTTPResponseBytes; change the order in the function that uses io.LimitedReader/limited and resp to check resp.StatusCode first (resp.StatusCode < 200 || resp.StatusCode >= 300) and return an error that includes the status code (optionally read a small bounded snippet for context) before attempting the full body read, so the status error from resp.StatusCode is returned immediately and large error responses are not always fully read.internal/core/securesession/adapters/bunstore/store.go (2)
92-111: Document thattranscript_enabledis treated as immutable for cache correctness.
transcriptEnabledCachedonly writes the cache on hit and only invalidates onErrSessionNotFoundpaths. If any code path later mutatestranscript_enabledon an existing session row (admin tooling, policy migration, etc.), readers withinSQLQueryCacheTTLwill observe the stale prior value with no invalidation hook to flip it. Today that field appears set only atCreatetime, so the assumption holds — adding a one-line doc comment on the helper (or on thetranscript_enabledcolumn) protects the next person who's tempted to add anUPDATE.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/securesession/adapters/bunstore/store.go` around lines 92 - 111, Add a one-line doc comment to document that the transcript_enabled column is treated as immutable for cache correctness: annotate the transcriptEnabledCached function (or the transcript_enabled column) to state that the value is only set at creation and must not be updated later because the function caches the value (uses s.meta.transcript.Set and only invalidates on session-not-found), so any UPDATE to transcript_enabled will lead to stale reads until TTL expiry; reference the function name transcriptEnabledCached and the transcript_enabled column in the comment so future maintainers understand the invariant.
64-70: Avoid shadowing the built-incap.
capis a Go predeclared identifier (used for slice/channel capacity). Local shadowing is legal but commonly flagged bygovet/predeclaredand reduces readability. Rename to e.g.maxEntries.♻️ Proposed rename
- if opts.SQLQueryCacheTTL > 0 { - cap := uint64(opts.SQLQueryCacheMaxEntries) - if cap == 0 { - cap = 4096 - } - s.meta = newSessionMetaCache(opts.SQLQueryCacheTTL, cap) - } + if opts.SQLQueryCacheTTL > 0 { + maxEntries := uint64(opts.SQLQueryCacheMaxEntries) + if maxEntries == 0 { + maxEntries = 4096 + } + s.meta = newSessionMetaCache(opts.SQLQueryCacheTTL, maxEntries) + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/securesession/adapters/bunstore/store.go` around lines 64 - 70, The local variable named `cap` in the conditional that initializes `s.meta` shadows Go's predeclared `cap` identifier; rename it (e.g., to `maxEntries`) to avoid shadowing and improve readability. Update the block that checks `opts.SQLQueryCacheTTL` and sets the capacity—replace `cap` with `maxEntries` (or similar) and use it when calling `newSessionMetaCache(opts.SQLQueryCacheTTL, maxEntries)`; leave the `opts.SQLQueryCacheMaxEntries` fallback logic unchanged.internal/core/modelcatalog/diagnostics.go (1)
134-148: Minor: clock-skew safety in staleness check.
now.Sub(snap.FetchedAt) > 2*cfg.UpdateIntervalcorrectly returnsfalse(not stale) whenFetchedAtis in the future, so behavior is benign — but if you'd rather treat future timestamps as a signal worth surfacing to operators, you'd compare|now - FetchedAt|instead. No change required for chill review; flagging for awareness.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/modelcatalog/diagnostics.go` around lines 134 - 148, The staleness check in catalogSnapshotIsStale currently ignores clock skew when snap.FetchedAt is in the future; update the logic in catalogSnapshotIsStale to treat future timestamps as noteworthy by computing the duration delta := now.Sub(snap.FetchedAt) and marking stale if delta < 0 || delta > 2*cfg.UpdateInterval (referencing catalogSnapshotIsStale, snap.FetchedAt, now, and cfg.UpdateInterval); make sure to keep the existing external-updates and LastRefreshFailure checks and only change the time-comparison branch to use the absolute/negativity-aware condition.internal/core/securesession/adapters/sqlite/store.go (2)
137-156: Negative-cache existence when transcript read sees NoRows.When
transcriptEnabledCachedfalls through to SQL and getssql.ErrNoRows, onlydomain.ErrSessionNotFoundis returned — the existence cache is not updated.sessionExistsdoes positively cachefalsefor the same condition, so the two caches can disagree until natural eviction. Aligning them keeps the diagnostic shape coherent and avoids a duplicate SQL probe on the next call path that hitssessionExists.♻️ Mirror the negative existence cache
err := q.QueryRowContext(ctx, `SELECT transcript_enabled FROM lip_secure_sessions WHERE session_id = ?`, string(id)).Scan(&te) if err != nil { if errors.Is(err, sql.ErrNoRows) { + if s.meta != nil { + s.meta.exists.Set(id, false, ttlcache.DefaultTTL) + } return false, domain.ErrSessionNotFound } return false, err }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/securesession/adapters/sqlite/store.go` around lines 137 - 156, In transcriptEnabledCached, when the SQL query returns sql.ErrNoRows you should mirror the negative existence cache just like sessionExists does: set s.meta.transcript.Set(id, false, ttlcache.DefaultTTL) (guarded by s.meta != nil) before returning domain.ErrSessionNotFound so the transcript existence cache and sessionExists remain consistent and avoid a duplicate SQL probe.
108-117: Avoid shadowing thecapbuiltin.
cap := uint64(opts.SQLQueryCacheMaxEntries)shadows the predeclaredcapidentifier within this function. Harmless today but trips up future refactors that wantcap(slice)here.♻️ Rename the local
- if opts.SQLQueryCacheTTL > 0 { - cap := uint64(opts.SQLQueryCacheMaxEntries) - if cap == 0 { - cap = 4096 - } - st.meta = newSessionMetaCache(opts.SQLQueryCacheTTL, cap) - } + if opts.SQLQueryCacheTTL > 0 { + maxEntries := uint64(opts.SQLQueryCacheMaxEntries) + if maxEntries == 0 { + maxEntries = 4096 + } + st.meta = newSessionMetaCache(opts.SQLQueryCacheTTL, maxEntries) + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/securesession/adapters/sqlite/store.go` around lines 108 - 117, The local variable currently named "cap" shadows the predeclared cap builtin; rename it (for example to "cacheCap" or "maxEntries") where it's declared in the block that constructs Store (around "st := &Store{db: db}") and update its use in the call to newSessionMetaCache(opts.SQLQueryCacheTTL, cap) so the code becomes newSessionMetaCache(opts.SQLQueryCacheTTL, cacheCap). Ensure the renamed identifier replaces all occurrences in that snippet (the assignment from opts.SQLQueryCacheMaxEntries and the fallback to 4096) and leave the Store.meta = newSessionMetaCache(...) logic unchanged.internal/core/modelcatalog/estimate.go (1)
11-17: Optional: declare a named constant for the tools+session basis.
estimateBasisForSessionPath(true)synthesizes"canonical_utf8_bytes+tools_json_bytes+session_bytes"via string concatenation, while every other basis value has an exported constant (Req 7.7). Diagnostics or tests asserting on basis strings won't have a stable symbol for this combination.♻️ Add a constant for the combined tools+session basis
const ( EstimateBasisCanonicalUTF8 = "canonical_utf8_bytes" EstimateBasisCanonicalUTF8AndTools = "canonical_utf8_bytes+tools_json_bytes" EstimateBasisCanonicalUTF8AndSession = "canonical_utf8_bytes+session_bytes" + EstimateBasisCanonicalUTF8AndToolsAndSession = "canonical_utf8_bytes+tools_json_bytes+session_bytes" EstimateBasisSessionContributionUnavailable = "session_contribution_unavailable" ) @@ func estimateBasisForSessionPath(hasTools bool) string { if hasTools { - return EstimateBasisCanonicalUTF8AndTools + "+session_bytes" + return EstimateBasisCanonicalUTF8AndToolsAndSession } return EstimateBasisCanonicalUTF8AndSession }Also applies to: 93-98
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/modelcatalog/estimate.go` around lines 11 - 17, Declare a new exported constant (e.g., EstimateBasisCanonicalUTF8AndToolsAndSession = "canonical_utf8_bytes+tools_json_bytes+session_bytes") alongside the other EstimateBasis* constants, and replace the string concatenation in estimateBasisForSessionPath (the branch that currently builds "canonical_utf8_bytes+tools_json_bytes+session_bytes") with that new constant; also update any other occurrences (the other spot around the existing EstimateBasis constants) to reference this new named constant instead of hardcoded or concatenated strings.internal/core/modelcatalog/runtime.go (1)
53-69: Consider surfacing cache-load failure duringStart.
Startdiscards both the error and the case wheresnap.Index == nil. The diagnostics endpoint reportsLastRefreshFailureto operators; if the on-disk cache exists but is corrupt/unparseable on boot, you currently reportRefreshFailureNoneuntil the first refresh succeeds or fails — which may mask a real persistence issue (especially when refresh is disabled and only the cache is configured).♻️ Tag cache-load failures as `RefreshFailureCache`
if r.cfg.Cache != nil { - snap, err := r.cfg.Cache.Load(parent) - if err == nil && snap.Index != nil { - cp := snap - r.active.Store(&cp) - } + snap, err := r.cfg.Cache.Load(parent) + switch { + case err == nil && snap.Index != nil: + cp := snap + r.active.Store(&cp) + case err != nil: + r.setFailure(RefreshFailureCache) + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/modelcatalog/runtime.go` around lines 53 - 69, In CatalogRuntime.Start, don't silently ignore cache Load errors: when r.cfg.Cache != nil and r.cfg.Cache.Load(parent) returns a non-nil err, record that failure in the runtime diagnostics as a cache-specific refresh failure (e.g., set the diagnostics LastRefreshFailure to RefreshFailureCache and include the err), while keeping the existing behavior of loading and storing a valid snap when err == nil && snap.Index != nil; update the Start implementation (around the r.cfg.Cache.Load call in CatalogRuntime.Start) to call the diagnostics/logging helper (e.g., r.diagnostics.SetLastRefreshFailure or similar) with RefreshFailureCache and the underlying error so the boot-time cache corruption is surfaced.internal/core/config/model_catalog_test.go (1)
24-26: Optional: inlineyamlPathor drop it.
yamlPathis a one-line wrapper aroundfilepath.ToSlashused identically at every call site. Inliningfilepath.ToSlash(...)would remove a small layer of indirection without losing readability. Not a blocker.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/config/model_catalog_test.go` around lines 24 - 26, The yamlPath helper is a trivial one-line wrapper around filepath.ToSlash and is used identically everywhere; remove the yamlPath function and replace its call sites with direct calls to filepath.ToSlash(...) (or, alternatively, inline filepath.ToSlash(...) at each usage) so there’s no indirection—search for yamlPath(...) and update those locations and delete the yamlPath function definition.
| type EligibilityDecision struct { | ||
| IsEligible bool `json:"eligible,omitempty"` | ||
| Reason EligibilityReason | ||
| Facts EffectiveFacts | ||
| Estimate SizeEstimate | ||
| } |
There was a problem hiding this comment.
omitempty on a bool will silently drop the false (ineligible) case.
If EligibilityDecision is ever marshaled (the partial JSON tag suggests it may be), an ineligible decision serializes to {} and is indistinguishable from a missing field. Either drop omitempty or remove the tag if no JSON encoding is intended; also consider tagging the other fields consistently.
As per coding guidelines: "Slices in JSON and returned values must be explicitly initialized as empty ... to prevent null in JSON" — analogous reasoning applies to booleans here.
🛡️ Proposed fix
type EligibilityDecision struct {
- IsEligible bool `json:"eligible,omitempty"`
- Reason EligibilityReason
- Facts EffectiveFacts
- Estimate SizeEstimate
+ IsEligible bool `json:"eligible"`
+ Reason EligibilityReason `json:"reason,omitempty"`
+ Facts EffectiveFacts `json:"facts,omitempty"`
+ Estimate SizeEstimate `json:"estimate,omitempty"`
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| type EligibilityDecision struct { | |
| IsEligible bool `json:"eligible,omitempty"` | |
| Reason EligibilityReason | |
| Facts EffectiveFacts | |
| Estimate SizeEstimate | |
| } | |
| type EligibilityDecision struct { | |
| IsEligible bool `json:"eligible"` | |
| Reason EligibilityReason `json:"reason,omitempty"` | |
| Facts EffectiveFacts `json:"facts,omitempty"` | |
| Estimate SizeEstimate `json:"estimate,omitempty"` | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/core/modelcatalog/eligibility.go` around lines 11 - 16, The JSON tag
on EligibilityDecision.IsEligible uses `omitempty` which will drop false values;
remove the `omitempty` (i.e., change `json:"eligible,omitempty"` to
`json:"eligible"`) or remove the tag entirely so a false/ineligible decision is
serialized, and make the JSON tagging consistent for the other fields on the
EligibilityDecision type (add explicit `json:"reason"`, `json:"facts"`,
`json:"estimate"` tags as appropriate) and ensure any slice/collection fields in
EffectiveFacts or SizeEstimate are initialized to empty slices when constructed
to avoid null in JSON.
Summary
Validation
go test ./...go run ./cmd/lipstd --helpSummary by CodeRabbit
Release Notes
New Features
Documentation