[codex] preserve SRS across core dictionary updates#51
Conversation
|
Caution Review failedPull request was closed or merged during review No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughWhen bundled core dictionary version changes, core words are synchronized non-destructively: matched words keep their IDs and SRS progress, new words are inserted, removed core words are marked inactive (kept for history but excluded from planning), and any active session at sync time is marked abandoned. Changes
Sequence DiagramsequenceDiagram
participant App
participant Store
participant SessionRepo as Session Repo
participant WordSync as Word Sync
participant AppMeta as App Meta
App->>Store: SeedWords(dict.CoreWordsVersion)
Store->>Store: Check currentVersion vs version
alt Version changed and core words present
Store->>SessionRepo: abandonActiveSessionsTx()
activate SessionRepo
SessionRepo->>SessionRepo: UPDATE sessions SET status='abandoned', finished_at=now()
deactivate SessionRepo
Store->>WordSync: syncCoreWordsTx(entries)
activate WordSync
WordSync->>WordSync: Load existing core rows keyed by normalized(lemma,pos)
WordSync->>WordSync: UPDATE matched rows (preserve id, update metadata, is_active=1)
WordSync->>WordSync: INSERT new core rows (is_active=1)
WordSync->>WordSync: UPDATE unmatched prev core -> is_active=0
WordSync->>Store: return counts (inserted/updated/retired)
deactivate WordSync
end
Store->>AppMeta: Set dict_version
Store->>Store: Commit transaction
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 104 |
| Duplication | 19 |
TIP This summary will be updated as you push new changes. Give us feedback
There was a problem hiding this comment.
Pull request overview
Preserves users’ SRS progress across bundled core dictionary (dict_version) updates by diff-syncing core words on normalized lemma/pos, retiring removed core rows via words.is_active, and preventing resumed sessions from drifting after a core update.
Changes:
- Replace destructive core reseed-on-version-bump with transactional diff sync that preserves
word_id/progress, inserts new core words, and retires removed ones (is_active=0). - Update session planning / candidate-selection queries and doctor diagnostics to treat only active words as eligible, while keeping historical access to retired words.
- Add migrations + regression tests (including abandoning active sessions on version bump) and update ADR/changelog/spec notes.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| tasks/todo.md | Tracks completion of issue #49 work items. |
| tasks/lessons.md | Documents the session-drift risk and the “abandon on version bump” policy. |
| tasks/feature_spec.md | Defines required behavior/acceptance for non-destructive core sync + retirement. |
| internal/store/word_write.go | Implements core diff sync, key validation, and supports is_active in inserts/updates. |
| internal/store/word_repo.go | Filters planning queries to is_active=1; adds shared tx helper for abandoning active sessions. |
| internal/store/migrate.go | Updates SeedWords() to no-op on same version and to diff-sync + abandon sessions on version bump. |
| internal/store/store_test.go | Adds/updates regression tests for preserved progress, session abandonment, and retirement behavior. |
| internal/store/doctor.go | Makes doctor resilient to pre-006 schema and reports using active/retired core counts; ignores retired words in certain checks. |
| internal/store/doctor_test.go | Expands test coverage for retired-word behavior and legacy-schema scenarios. |
| internal/app/cmds_test.go | Updates test schema to include words.is_active. |
| docs/adr/0002-bundled-core-dictionary-lifecycle.md | Updates lifecycle ADR to reflect diff-sync + retirement + reset semantics. |
| CHANGELOG.md | Adds 0.7.0 entry describing the user-visible behavior changes. |
| assets/migrations/006_words_is_active.sql | Adds words.is_active column and supporting indexes. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tasks/lessons.md (1)
42-42: Move the version-bump behavior contract todocs/specs, keeplessonsas a recurrence rule.This line currently mixes a permanent session-resume contract with the lesson text. Keep the anti-pattern/rule here, and place the normative behavior (abandon or snapshot requirement) in
docs/specs/*.As per coding guidelines
tasks/lessons.mdは再発防止ルールを記録する場所であり、設計判断や仕様そのものを置く場所として使わないこと.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tasks/lessons.md` at line 42, The line in tasks/lessons.md mixes a normative session-resume contract with a recurrence rule; remove the specification text describing how sessions must behave on a version bump (the "abandoned" transition and the "question snapshot" requirement) from tasks/lessons.md and instead add a new document under docs/specs (e.g., a version-bump or session-resume spec) that normatively defines the behavior for version bumps, including the conditions to mark a session "abandoned" or the requirement to store a "question snapshot" / fix prompt/choices; leave tasks/lessons.md as a short recurrence/anti-pattern note only (keep the rule that session resume should not rely on live words rows or distractor reloads) and update the wording to remove implementation/contract language.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@tasks/lessons.md`:
- Line 42: The line in tasks/lessons.md mixes a normative session-resume
contract with a recurrence rule; remove the specification text describing how
sessions must behave on a version bump (the "abandoned" transition and the
"question snapshot" requirement) from tasks/lessons.md and instead add a new
document under docs/specs (e.g., a version-bump or session-resume spec) that
normatively defines the behavior for version bumps, including the conditions to
mark a session "abandoned" or the requirement to store a "question snapshot" /
fix prompt/choices; leave tasks/lessons.md as a short recurrence/anti-pattern
note only (keep the rule that session resume should not rely on live words rows
or distractor reloads) and update the wording to remove implementation/contract
language.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fb1f039e-c206-471a-93ab-b585bb6f542b
📒 Files selected for processing (13)
CHANGELOG.mdassets/migrations/006_words_is_active.sqldocs/adr/0002-bundled-core-dictionary-lifecycle.mdinternal/app/cmds_test.gointernal/store/doctor.gointernal/store/doctor_test.gointernal/store/migrate.gointernal/store/store_test.gointernal/store/word_repo.gointernal/store/word_write.gotasks/feature_spec.mdtasks/lessons.mdtasks/todo.md
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
internal/store/doctor.go (2)
524-570:⚠️ Potential issue | 🟡 MinorExclude invalid
frequency_rankvalues from duplicate-rank checks.Lines 442-465 already treat
frequency_rank <= 0as missing, but these new queries only excludeNULL. That means multiple rows with rank0now get reported twice: once as missing metadata and again as duplicate ranks. Please filter to valid positive ranks here as well.Suggested fix
- WHERE frequency_rank IS NOT NULL + WHERE frequency_rank > 0- WHERE is_active = 1 AND frequency_rank IS NOT NULL + WHERE is_active = 1 AND frequency_rank > 0🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/store/doctor.go` around lines 524 - 570, The duplicate-rank SQL (duplicateRankCountQuery and duplicateRankSampleQuery) currently only excludes NULL ranks and thus still counts non-positive ranks like 0; update the WHERE clauses used both in the base queries and the hasIsActive branch to require frequency_rank > 0 (e.g., replace/augment "frequency_rank IS NOT NULL" with "frequency_rank IS NOT NULL AND frequency_rank > 0" or simply "frequency_rank > 0") so only valid positive ranks are considered when checking duplicates; ensure both duplicateRankCountQuery and duplicateRankSampleQuery strings (including the versions inside the hasIsActive conditional) are changed consistently.
959-968:⚠️ Potential issue | 🟡 MinorUse an active-specific empty-state message here.
With
is_activepresent,wordCountis counting active rows only. If the DB contains only retired words, this now reportswords table is empty, which is misleading. Please change this to something likeno active words are available for quizability.Suggested fix
if wordCount == 0 { - return diagnosticCheckError("quizability", "words table is empty") + summary := "words table is empty" + if hasIsActive { + summary = "no active words are available for quizability" + } + return diagnosticCheckError("quizability", summary) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/store/doctor.go` around lines 959 - 968, The empty-state error message is misleading when hasIsActive is true because wordCount then reflects only active rows; update the branch that handles wordCount == 0 in the doctor check to return a different diagnosticCheckError message when hasIsActive is true (e.g., "no active words are available for quizability") and keep the existing "words table is empty" message when hasIsActive is false; locate the logic around wordCountQuery, hasIsActive, s.countRows and the wordCount == 0 return and swap the returned message based on hasIsActive.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@internal/store/doctor.go`:
- Around line 524-570: The duplicate-rank SQL (duplicateRankCountQuery and
duplicateRankSampleQuery) currently only excludes NULL ranks and thus still
counts non-positive ranks like 0; update the WHERE clauses used both in the base
queries and the hasIsActive branch to require frequency_rank > 0 (e.g.,
replace/augment "frequency_rank IS NOT NULL" with "frequency_rank IS NOT NULL
AND frequency_rank > 0" or simply "frequency_rank > 0") so only valid positive
ranks are considered when checking duplicates; ensure both
duplicateRankCountQuery and duplicateRankSampleQuery strings (including the
versions inside the hasIsActive conditional) are changed consistently.
- Around line 959-968: The empty-state error message is misleading when
hasIsActive is true because wordCount then reflects only active rows; update the
branch that handles wordCount == 0 in the doctor check to return a different
diagnosticCheckError message when hasIsActive is true (e.g., "no active words
are available for quizability") and keep the existing "words table is empty"
message when hasIsActive is false; locate the logic around wordCountQuery,
hasIsActive, s.countRows and the wordCount == 0 return and swap the returned
message based on hasIsActive.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a152d00e-cdc5-4746-899c-e0684fc7c3e3
📒 Files selected for processing (2)
internal/store/doctor.gotasks/lessons.md
🚧 Files skipped from review as they are similar to previous changes (1)
- tasks/lessons.md
Summary
lemma/poswords.is_activeand exclude inactive rows from future session planning0.7.0changelog entryWhy
The previous bundled core reseed path wiped learning tables on every
dict_versionbump, so users lost SRS progress when bundled words were added. The follow-up review also found that keeping active sessions across a live dictionary sync was unsafe because questions are rebuilt from currentwordsrows and current distractor candidates on resume.User impact
Users keep their existing SRS history for matched bundled core words when the embedded core dictionary updates, see only newly added bundled words as new material, and do not get dropped into broken or drifted in-progress sessions after a dictionary update.
Validation
go test ./internal/store ./internal/app ./internal/quiz ./cmd/eitangogo test ./...goimportslintCloses #49
Summary by CodeRabbit
New Features
Bug Fixes
Documentation