feat: add Concrete Table Inheritance as alternative inheritance strategy (#26)
Conversation
Introduce a new concrete_table inheritance strategy that coexists with the existing class_table (joined table) approach. When an entity hierarchy opts in via `inheritance = "concrete_table"`, each concrete class produces a single table containing all attributes (own + inherited), eliminating joins on read.

Changes:
- Add `inheritance` class attribute, `get_inheritance_strategy()`, and `is_concrete_table()` to EntityClass in structures.py
- Add `get_all_items()` method for flattened attribute retrieval
- Branch _create_definition_query, _save_query, _update_query, _remove_query, and _find_query on inheritance strategy
- Skip parent table creation/joins for concrete_table entities
- Add migration script (scripts/migrate_inheritance.py) with backup, validation, dry-run, and transactional execution
- Add test entities and unit tests for the new strategy

Closes #25

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
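The flattened attribute retrieval (`get_all_items()`) described above can be sketched as follows; the class names, `items` maps, and last-wins merge policy are illustrative assumptions, not the project's actual EntityClass API:

```python
class Base(object):
    items = dict(object_id="integer", status="integer")

class Person(Base):
    items = dict(name="text", age="integer")

class Employee(Person):
    items = dict(salary="integer")

def get_all_items(cls):
    # walk the MRO from root to leaf so that a subclass attribute
    # with the same name takes precedence over its parent's
    all_items = {}
    for base in reversed(cls.__mro__):
        all_items.update(base.__dict__.get("items", {}))
    return all_items
```

Under this sketch, a concrete table for `Employee` would carry one column per key of `get_all_items(Employee)`, so no joins are needed to reconstruct the entity.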
📝 Walkthrough

Adds selectable Concrete Table Inheritance (per-class concrete tables) with strategy selection on EntityClass, flattened item resolution, query and schema generation updates for concrete-mode CRUD and reads, new migration and benchmark CLI scripts for converting/comparing strategies, expanded concrete-table mocks and extensive tests, plus design docs and changelog updates.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant App as Application
    participant EM as EntityManager
    participant DB as Database
    rect rgba(100, 150, 255, 0.5)
        Note over App,EM: Class Table Inheritance (existing)
        App->>EM: save(Employee)
        EM->>EM: split attrs across ancestor tables
        EM->>DB: INSERT parent table(s)
        DB-->>EM: OK
        EM->>DB: INSERT leaf table
        DB-->>EM: OK
        App->>EM: find(Employee, id)
        EM->>DB: SELECT leaf JOIN parent ... WHERE id=...
        DB-->>EM: joined row set
    end
    rect rgba(100, 255, 150, 0.5)
        Note over App,EM: Concrete Table Inheritance (new)
        App->>EM: save(EmployeeConcrete)
        EM->>EM: flatten inherited + own attrs
        EM->>DB: INSERT INTO employee (all_cols..., _class)
        DB-->>EM: OK
        App->>EM: find(EmployeeConcrete, id)
        EM->>DB: SELECT * FROM employee WHERE id=...
        DB-->>EM: single row
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Fill in previously stubbed tests (test_self_relation, test_database_integrity, test_invalid_type, test_map, test_range) and add 15 new edge case tests:
- test_get_nonexistent: retrieving a non-existent entity returns None
- test_find_empty: find on an empty table returns an empty list
- test_find_filters: equals, greater, lesser, and combined filters
- test_count_with_filters: count with filter conditions
- test_update_nullify_field: setting a field to None
- test_update_relation_change: changing a FK relation owner
- test_update_nullify_relation: removing a FK relation
- test_multiple_inheritance: Employee with Person+Loggable+Taxable
- test_polymorphism_employee: polymorphic find across the hierarchy
- test_save_update_remove_cycle: full CRUD lifecycle
- test_file_data_type: File entity with a data type field
- test_unicode_fields: unicode characters in text fields
- test_find_with_set: result set mode retrieval
- test_breeder_subclass: Breeder/BreedDog subclass relations
- test_remove_with_relations: removing entities with FK relations

Also fixes the concrete table _class column qualifier in SELECT queries to use the entity's own table name instead of the top parent table, and fixes _process_table_name to resolve all attributes to the entity's own table for concrete table inheritance.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add EntityManagerMigrationTestCase with 11 tests covering the migration script functionality:
- test_validate_same_strategy: rejects migration to the current strategy
- test_validate_concrete_same_strategy: same for concrete table
- test_validate_cti_to_concrete: validates CTI->Concrete migration
- test_validate_concrete_to_cti: validates Concrete->CTI migration
- test_dry_run_cti_to_concrete: dry-run generates SQL without changes
- test_dry_run_concrete_to_cti: same for the reverse direction
- test_backup_sqlite: creates a backup file before migration
- test_migrate_cti_to_concrete_schema: verifies CTI table structure
- test_migrate_rollback_on_failure: simulated failure triggers rollback
- test_generate_queries_cti_to_concrete: verifies CREATE/INSERT/DROP/ALTER
- test_generate_queries_concrete_to_cti: same for the reverse direction

Also moves inline imports to top-level per project conventions and uses importlib for robust migration module loading across Colony's plugin cache directory.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
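The importlib-based loading mentioned above is presumably along these lines; the helper name and the absence of cache-directory handling are simplifications, not the test suite's actual code:

```python
import importlib.util

def load_module_from_path(name, path):
    # builds a module spec directly from a file path, so a standalone
    # script (e.g. one living under a plugin cache directory, outside
    # any package) can be imported as a regular module object
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```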
…ests

Expand migration backup test coverage with three new tests:
- test_backup_data_integrity: opens the backup as a separate SQLite connection and verifies table structure and row data match the original database (tables exist, column values preserved)
- test_migrate_creates_backup: verifies that migrate() creates a backup file when skip_backup=False, using a failing connection to isolate the backup step from the actual migration SQL
- test_migrate_aborts_on_backup_failure: verifies that migrate() returns failure with an appropriate message when the backup fails

Also adds a backup naming convention assertion (.backup.timestamp) to the existing test_backup_sqlite test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Guard `import importlib.util` with try/except for Python 2.7 compatibility (PyPy 2.7 and CPython 2.7 do not have importlib.util)
- Add `import sqlite3` at module level instead of inline
- Fix test_backup_data_integrity for Windows: close backup_conn in the finally block before os.remove() to release the file handle
- Add a WAL checkpoint before backup to ensure data is flushed to the main database file on Windows (WAL mode may leave data in the WAL file, which shutil.copy2 does not capture)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
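The checkpoint-before-copy fix can be sketched like this; the function name is illustrative, not the script's actual API:

```python
import shutil
import sqlite3

def backup_sqlite(file_path, backup_path):
    # checkpoint the WAL so pending pages are merged into the main
    # database file before it is copied; otherwise shutil.copy2 can
    # miss rows that still live only in the -wal sidecar file
    conn = sqlite3.connect(file_path)
    try:
        conn.execute("PRAGMA wal_checkpoint(FULL)")
    finally:
        conn.close()
    shutil.copy2(file_path, backup_path)
    return backup_path
```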
…ilable

The migration tests use importlib.util for dynamic module loading, which is not available in Python 2.7. Instead of erroring, the tests now gracefully skip via unittest.skipTest when running on Python 2.7.
- Add a HAS_IMPORTLIB_UTIL flag set during import
- Add a setUp override in EntityManagerMigrationTestCase that calls skipTest when importlib.util is not available
- Guard _load_migration_module to return None on Python 2.7

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
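A minimal sketch of the graceful-skip pattern described above; the flag name mirrors the commit message, while the test case body is illustrative:

```python
import unittest

try:
    import importlib.util  # noqa: F401
    HAS_IMPORTLIB_UTIL = True
except ImportError:
    # Python 2.7 ships importlib, but without the util submodule
    HAS_IMPORTLIB_UTIL = False

class MigrationTestCase(unittest.TestCase):
    def setUp(self):
        # skip the whole case instead of erroring on Python 2.7
        if not HAS_IMPORTLIB_UTIL:
            self.skipTest("importlib.util not available (Python 2.7)")

    def test_loading(self):
        self.assertTrue(HAS_IMPORTLIB_UTIL)
```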
Add scripts/benchmark_inheritance.py that compares Class Table Inheritance vs Concrete Table Inheritance performance across all CRUD operations using SQLite. The script measures schema creation, bulk insert, select by ID, select all, count, bulk update, and bulk delete, printing a formatted report with speedup ratios.

Usage: python scripts/benchmark_inheritance.py [iterations]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
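The measurement presumably follows the usual wall-clock loop; the helper names and the ratio convention below are assumptions about the script, not excerpts from it:

```python
import time

def bench(fn, iterations=1000):
    # total wall-clock seconds for `iterations` repetitions
    # of a single operation (e.g. one insert or one select)
    start = time.time()
    for _ in range(iterations):
        fn()
    return time.time() - start

def speedup(cti_seconds, concrete_seconds):
    # ratio > 1.0 means the concrete table strategy was faster
    return cti_seconds / concrete_seconds
```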
Add a second benchmark section with 20 columns per level (60 total) to demonstrate that concrete table's write advantage narrows with wider rows and its read advantage can reverse entirely when the single-table row becomes too wide for SQLite's page format.

Narrow (8 cols): concrete table ~2x faster on writes, ~same on reads
Wide (60 cols): concrete table ~1.1x on writes, can be slower on reads

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Revise the concrete table inheritance strategy so that every concrete class in the hierarchy gets its own table containing all fields flattened down to that hierarchy level. Writing an entity inserts/updates/deletes rows in every ancestor table, enabling polymorphic queries at any hierarchy level without joins.

Schema example (root -> person -> employee):
- _concrete_root_entity: object_id, status, _class, _mtime
- _concrete_person: object_id, status, name, age, _class, _mtime
- _concrete_employee: object_id, status, name, age, salary, _class, _mtime

Changes:
- create(): remove parent table skip, create all ancestor tables with flattened columns via get_all_items()
- _save_query: build items_map with one entry per ancestor class, each using that class's get_all_items()
- _update_query: same ancestor-level items_map approach
- _remove_query: delete from all ancestor tables (same as CTI)
- Add test_polymorphic_query verifying queries at root, person, and employee levels return correct entity counts
- Add design document COP-002 describing the schema, query behavior, implementation details, and trade-offs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
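The per-ancestor write behavior can be sketched as follows, using the schema example above; the hard-coded table layouts and the tiny query builder are illustrative assumptions, not the ORM's actual _save_query:

```python
# each hierarchy level owns a flat table with all fields down to
# that level, plus the _class discriminator column
HIERARCHY = [
    ("_concrete_root_entity", ["object_id", "status"]),
    ("_concrete_person", ["object_id", "status", "name", "age"]),
    ("_concrete_employee", ["object_id", "status", "name", "age", "salary"]),
]

def save_queries():
    # one insert per ancestor table, so a polymorphic query at any
    # level can read a single table without joins
    queries = []
    for table, columns in HIERARCHY:
        cols = columns + ["_class"]
        placeholders = ", ".join("?" for _ in cols)
        queries.append(
            "insert into %s(%s) values(%s)"
            % (table, ", ".join(cols), placeholders)
        )
    return queries
```

The trade-off this illustrates: saving one Employee touches three tables, which is the write amplification the benchmark commits above measure against the read-side join savings.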
Complete rewrite of scripts/migrate_inheritance.py:
- Uses the ORM's own entity class methods (get_name, get_all_items, get_names_map, get_id) instead of reimplementing name resolution
- Handles multiple inheritance (Employee -> Person + Loggable + Taxable) with duplicate detection in hierarchy traversal
- Filters the hierarchy by table existence when a connection is available, so only tables present in the data source are migrated
- Creates flat tables at every hierarchy level for CTI->Concrete (matching the new ancestor-population design)
- Uses INSERT OR IGNORE for Concrete->CTI to handle duplicates when multiple concrete tables map to the same CTI level
- Recreates all indexes: PK (hash+btree), _mtime (hash+btree), and FK indexes on mapped relation columns
- Adds validate_data() for pre-migration referential integrity checks (orphaned row detection)

End-to-end migration tests:
- test_end_to_end_cti_to_concrete: creates CTI tables via the ORM, inserts data, migrates, verifies schema changes and data integrity at every hierarchy level including polymorphic rows
- test_end_to_end_concrete_to_cti: creates concrete tables via the ORM, inserts data, migrates, verifies CTI table structure and column distribution
- test_migration_preserves_indexes: verifies index count on migrated tables

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a DATA_INHERITANCE environment variable that overrides the
class-level inheritance attribute for all entity hierarchies,
enabling system-wide strategy switching for migration debugging
without modifying entity code.
Resolution order in get_inheritance_strategy():
1. Cached _inheritance_strategy (per-class, avoids repeated lookups)
2. colony.conf("DATA_INHERITANCE") with module-level fallback
3. Class-level inheritance attribute
4. Parent hierarchy traversal
5. Default "class_table"
Usage:
export DATA_INHERITANCE=concrete_table # force all entities
unset DATA_INHERITANCE # use class attributes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
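The resolution order above can be sketched as follows; a plain environment lookup stands in for `colony.conf()`, and the names are illustrative rather than the actual structures.py code:

```python
import os

def get_inheritance_strategy(cls):
    # 1. per-class cache avoids repeated lookups
    if "_inheritance_strategy" in cls.__dict__:
        return cls.__dict__["_inheritance_strategy"]
    # 2. global override (stand-in for colony.conf("DATA_INHERITANCE"))
    strategy = os.environ.get("DATA_INHERITANCE")
    if not strategy:
        # 3/4. class-level attribute, then parent hierarchy traversal
        for base in cls.__mro__:
            strategy = base.__dict__.get("inheritance")
            if strategy:
                break
    # 5. default strategy when nothing else applies
    strategy = strategy or "class_table"
    cls._inheritance_strategy = strategy
    return strategy
```

Note that because of step 1, a strategy resolved once sticks to the class; switching DATA_INHERITANCE in a long-running process requires clearing the per-class cache.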
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 47dbdbb128
```python
# are skipped as all columns reside in the entity's own
# table, for class table inheritance the joins are required
# to reconstruct the full entity from multiple tables
if not is_concrete:
```
Preserve parent aliases for inherited concrete relations
Skipping parent-table joins for every concrete-table query leaves inherited relation joins pointing at aliases that were never introduced. In _join_query_f, relations declared on a parent class still build conditions like _parent_table.<relation> via _prefix, but with this guard those parent aliases are absent, so eager-loading a relation inherited by a concrete subclass (e.g., querying ConcreteEmployee with a relation from ConcretePerson) produces invalid SQL at runtime.
scripts/migrate_inheritance.py (outdated)

```python
"select count(*) from sqlite_master where type='table' and name=?",
(cls.get_name(),),
```
Use non-SQLite metadata queries for non-SQLite engines
The migration query generators always probe table existence with sqlite_master when a connection is provided. migrate() passes the live connection through, so using MySQL/PostgreSQL connections will fail before migration starts because those engines do not expose sqlite_master. This breaks the advertised non-SQLite migration path in migrate().
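One possible shape for the per-engine metadata probe this comment asks for; the exact SQL and placeholder styles are assumptions (qmark for sqlite, format-style for the MySQL/PostgreSQL drivers), not code from the PR:

```python
def table_exists_query(engine):
    # per-engine table-existence probe; sqlite exposes sqlite_master,
    # while mysql and pgsql both offer information_schema.tables
    if engine == "sqlite":
        return (
            "select count(*) from sqlite_master "
            "where type = 'table' and name = ?"
        )
    if engine == "mysql":
        return (
            "select count(*) from information_schema.tables "
            "where table_schema = database() and table_name = %s"
        )
    if engine == "pgsql":
        return (
            "select count(*) from information_schema.tables "
            "where table_name = %s"
        )
    raise ValueError("unsupported engine: %s" % engine)
```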
scripts/migrate_inheritance.py (outdated)

```python
"insert or ignore into %s(%s) select %s from %s "
"where _class = '%s'"
```
Avoid SQLite-only INSERT syntax in cross-engine migration
The concrete→CTI path emits insert or ignore, which is SQLite-specific SQL. When migrate() is used with MySQL/PostgreSQL (the function exposes those engines), this statement is invalid and aborts the migration. Use an engine-specific conflict strategy (or generic inserts plus dedup logic) instead of hardcoding SQLite syntax.
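A sketch of the engine-specific conflict handling suggested here; the function name and parameters are illustrative, and the dialect clauses are the standard ones for each engine:

```python
def insert_ignoring_conflicts(engine, target, columns, source, class_name):
    # same copy statement, with the duplicate-handling clause picked
    # per engine instead of hardcoding SQLite's "insert or ignore"
    cols = ", ".join(columns)
    base = "insert into %s(%s) select %s from %s where _class = '%s'" % (
        target, cols, cols, source, class_name,
    )
    if engine == "sqlite":
        return base.replace("insert into", "insert or ignore into", 1)
    if engine == "mysql":
        return base.replace("insert into", "insert ignore into", 1)
    if engine == "pgsql":
        # PostgreSQL appends the conflict clause after the select
        return base + " on conflict do nothing"
    raise ValueError("unsupported engine: %s" % engine)
```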
Pull request overview
Adds a new ORM inheritance strategy (“concrete table inheritance”) to complement the existing class-table inheritance, plus tooling/docs/tests to support strategy selection and migration.
Changes:
- Introduces `inheritance = "concrete_table"` on entity hierarchies, with strategy resolution (`get_inheritance_strategy()`, `is_concrete_table()`) and flattened field resolution (`get_all_items()`).
- Updates entity manager SQL generation to create flattened tables and to read from a single table (no parent joins) while writing to all ancestor tables.
- Adds migration + benchmark scripts, design documentation (COP-002), changelog entry, and expanded test coverage (including end-to-end migration tests).
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 9 comments.
Summary per file:
| File | Description |
|---|---|
| data/src/entity_manager/structures.py | Adds strategy selector + cache, concrete-table helpers, and get_all_items() for flattened field maps. |
| data/src/entity_manager/system.py | Branches table creation / save / update / find query-building behavior based on inheritance strategy. |
| data/src/entity_manager/mocks.py | Adds concrete-table mock entities mirroring existing inheritance test hierarchies. |
| data/src/entity_manager/test.py | Adds concrete-table tests plus migration/backup tests and fills in previously-stubbed test cases. |
| scripts/migrate_inheritance.py | New standalone migration tool for converting hierarchies between strategies (backup/dry-run/transactional). |
| scripts/benchmark_inheritance.py | New benchmark script comparing CTI vs concrete-table operations on SQLite. |
| doc/design/README.md | Adds COP-002 to design doc index. |
| doc/design/002-concrete_table_inheritance.md | Design document describing schema/query behavior, tradeoffs, and configuration. |
| CHANGELOG.md | Documents the new inheritance strategy, migration tooling, and tests under Added. |
```python
# table inheritance strategy, in which case a single
# insert query is generated for all attributes
```
The comment says concrete-table mode generates “a single insert query”, but the implementation builds an items_map containing all ancestor tables and then iterates it, producing multiple INSERTs (one per ancestor table). Please update the comment to reflect the actual behavior so future maintainers don’t get misled.
Suggested change:

```python
# table inheritance strategy, in which case an insert
# query is generated for each concrete/ancestor table
```
Quoted from the design document:

> unset DATA_INHERITANCE
>
> The override is resolved via `colony.conf("DATA_INHERITANCE", None)` at module load time in `structures.py` and checked as the first step in `get_inheritance_strategy()`.
The design doc says the global override is “checked at module load time”, but structures.py resolves the override inside get_inheritance_strategy() via colony.conf(...) and caches the result per class. Please update this section to reflect the actual runtime + caching behavior (and how to clear the cache when switching overrides).
Suggested change:

> The override is resolved via `colony.conf("DATA_INHERITANCE", None)` inside `get_inheritance_strategy()` at runtime. The effective inheritance strategy is cached per entity class, so changes to `DATA_INHERITANCE` in a long-running process only take effect after clearing the per-class inheritance-strategy cache (or restarting the process) and then reloading/creating the relevant entities.
```python
SQL_TYPES_MAP = dict(
    text="text",
    string="varchar(255)",
    integer="integer",
    long="bigint",
    float="double precision",
    decimal="double precision",
    date="double precision",
    data="text",
    metadata="text",
)
""" The default SQL types map for column type resolution """
```
SQL_TYPES_MAP is defined but never used in this benchmark script. Consider removing it (or wiring it into schema creation) to avoid dead code and keep the benchmark focused on the operations being measured.
Suggested change: remove the unused SQL_TYPES_MAP definition and its docstring.
scripts/migrate_inheritance.py (outdated)

```python
if connection:
    _hierarchy = []
    for cls in hierarchy:
        cursor = connection.cursor()
        cursor.execute(
            "select count(*) from sqlite_master where type='table' and name=?",
            (cls.get_name(),),
        )
        exists = cursor.fetchone()[0] > 0
        cursor.close()
```
migrate()/query generation is written with SQLite-specific assumptions (e.g., querying sqlite_master for table existence) even when engine is set to mysql/pgsql. This makes the advertised non-SQLite support unreliable; consider either (1) explicitly restricting the migration module to SQLite-only, or (2) branching table-existence detection and SQL generation per engine (and using the correct INSERT/DDL dialect).
```python
if not is_valid:
    return False, messages

if validate_only:
```
validate_data() implements the orphan-row / referential consistency checks described in the validation docstrings, but it is never called by migrate() (including --validate). As a result, validation currently only checks strategy configuration and ID presence, not data integrity; either invoke validate_data() during validation, or adjust the validation messaging/docs to match actual behavior.
Suggested change:

```python
if validate_only:
    # performs data-level validation (e.g. orphan rows, referential
    # integrity) without executing any migration queries
    is_data_valid, data_messages = validate_data(
        entity_class,
        connection,
        connection_params,
    )
    messages.extend(data_messages)
    if not is_data_valid:
        return False, messages
```
```python
try:
    cursor = connection.cursor()

    for query in queries:
        messages.append("executing: %s" % query)
        cursor.execute(query)

    connection.commit()
```
The migration executes via a cursor created at the start of the transaction but never closes it. Please ensure the cursor is closed in a finally block (success and failure paths) to avoid leaking resources across repeated runs/tests.
```python
__license__ = "Apache License, Version 2.0"
""" The license for the module """

import os
```
Unused import: os is imported but not used anywhere in this script. Removing it avoids lint noise and keeps the migration module minimal.
Suggested change: remove the unused `import os` line.
```python
# table inheritance strategy, in which case a single
# update query is generated for all attributes
```
Similarly, this comment says concrete-table mode generates “a single update query”, but the code constructs an items_map across ancestors and generates multiple UPDATE statements. Please align the comment with the real multi-table update behavior in concrete-table mode.
Suggested change:

```python
# table inheritance strategy, in which case coordinated
# update statements are generated across the concrete tables
```
scripts/migrate_inheritance.py (outdated)

```python
insert_query = (
    "insert or ignore into %s(%s) select %s from %s "
    "where _class = '%s'"
    % (
        new_table_name,
        ", ".join(select_cols),
        ", ".join(select_cols),
        concrete_table,
        cls.__name__,
    )
)
queries.append(insert_query)
```
generate_concrete_to_cti_queries() uses insert or ignore, which is SQLite syntax. If migrate() is intended to be usable with MySQL/PostgreSQL connections (as implied by the engine parameter and backup support), this should be generated per-engine (e.g., INSERT IGNORE for MySQL, ON CONFLICT DO NOTHING for PostgreSQL) or the script should fail fast when engine != sqlite.
Actionable comments posted: 13
🧹 Nitpick comments (2)
data/src/entity_manager/test.py (2)
3308-3549: Exercise the ORM after each migration, not just raw SQLite reads. These end-to-end cases stop at schema and row-level assertions. That won't catch a migration that produces the right tables but still breaks EntityManager.get/find/save against the migrated strategy. Reopening the entity manager and doing one ORM round-trip here would cover the integration boundary this PR is changing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@data/src/entity_manager/test.py` around lines 3308 - 3549, Add ORM round-trip checks after each migration in test_end_to_end_cti_to_concrete and test_end_to_end_concrete_to_cti: after calling migration.migrate and asserting success, reinitialize or reopen the EntityManager (or at least clear its session), then use EntityManager.get/find/save (and entity_manager.begin/commit where needed) to load an existing migrated entity and to persist a new one to verify full ORM operations work against the migrated schema; reference the existing helpers used in the test (self.entity_manager, self._get_connection(), and migration.migrate) to locate where to insert these checks.
2642-2713: Move the inheritance override state management into setUp/tearDown. This test mutates structures.DATA_INHERITANCE and per-class caches inline. The try/finally helps, but the suite guideline for class-level test state is to capture it in setUp and restore it in tearDown so isolation is uniform across the module. As per coding guidelines, "For class-level state in tests, save the original state in setUp and restore it in tearDown to ensure test isolation".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@data/src/entity_manager/test.py` around lines 2642 - 2713, The test mutates module-global structures.DATA_INHERITANCE and per-class _inheritance_strategy caches inline; move that state management into the test class's setUp and tearDown: in setUp capture structures.DATA_INHERITANCE into self._original_data_inheritance, build the test_classes list and save any existing cls._inheritance_strategy into self._saved_caches, and clear caches (del cls._inheritance_strategy) so each test starts fresh; in tearDown restore structures.DATA_INHERITANCE from self._original_data_inheritance, clear any temporary _inheritance_strategy entries, and reassign saved values from self._saved_caches; then simplify test_global_inheritance_override by removing the try/finally and relying on setUp/tearDown to manage state.
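The save-and-restore pattern that guideline describes looks roughly like this; `FakeStructures` is a stand-in for the project's `structures` module, and the test body is illustrative:

```python
import unittest

class FakeStructures(object):
    # stand-in for the project's structures module
    DATA_INHERITANCE = None

structures = FakeStructures()

class OverrideTestCase(unittest.TestCase):
    def setUp(self):
        # capture module-level state before the test mutates it
        self._original_override = structures.DATA_INHERITANCE

    def tearDown(self):
        # restore unconditionally so a failing test cannot leak state
        structures.DATA_INHERITANCE = self._original_override

    def test_override(self):
        structures.DATA_INHERITANCE = "concrete_table"
        self.assertEqual(structures.DATA_INHERITANCE, "concrete_table")
```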
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@data/src/entity_manager/mocks.py`:
- Around line 539-543: _relation_boss currently sets reverse="children", which
pairs the boss relation to the parent/child metadata; change the reverse to a
dedicated reports collection (e.g., reverse="reports") so boss lookups and lazy
loading traverse the correct relation on ConcretePerson, and ensure
ConcretePerson defines a matching reports relation name; update the dict
returned by _relation_boss to use reverse="reports" (or the actual reports
relation name on ConcretePerson).
In `@data/src/entity_manager/structures.py`:
- Around line 147-151: The code currently reserves the public attribute name
"inheritance" (listed alongside "_inheritance_strategy" in the
INVALID_NAMES-like list), which breaks backwards compatibility; change the
behavior so the framework either uses a private sentinel name (e.g., keep and
use "_inheritance_strategy" as the config key and remove "inheritance" from the
invalid/reserved names) or make the reservation conditional by only treating
"inheritance" as the framework flag when its value equals one of the supported
strategy strings (check membership against your strategy constants) — update the
INVALID_NAMES array and any validation/filtering logic that references
"inheritance" accordingly (also apply the same change in the other occurrences
referenced in the review).
- Around line 1813-1875: get_all_items() currently merges parent and own items
by assigning into all_items such that later entries overwrite earlier ones
(last-wins), which can conflict with the ORM's owner precedence used by
get_names_map() that preserves the first owner; change the merge logic in
get_all_items() (the loop over cls.get_all_parents() and the subsequent
own_items loop that writes into all_items and sets setattr(cls, cache_key,
all_items)) so that on key collisions you preserve the existing owner/value
(first-wins) rather than overwriting, and ensure own_items still take precedence
over parents by only writing a key from own_items if it does not already exist
or by writing parents first then own items using a non-overwriting assign policy
consistent with get_names_map().
In `@data/src/entity_manager/system.py`:
- Around line 4673-4697: The joins for inherited relations still use the
ancestor class name as the SQL prefix, causing predicates to reference Parent.*
even after Parent is removed from FROM; update join_tables() so that when a
relation is declared on an ancestor it computes the SQL prefix from the
concrete/table-owning class (not _entity_class.get_name())—e.g. resolve the
concrete subclass/table name (or the concrete table alias mapping used when
building the FROM joins) and use that for _prefix and any predicate generation;
apply the same fix to the equivalent logic referenced around lines 4771-4780 so
inherited relations always target the concrete table alias rather than the
ancestor name.
- Around line 3063-3065: The concrete-table path flattens parent columns via
entity_class.get_all_items() but schema setup still only calls
index_fields(entity_class) which in turn uses entity_class.get_indexed(), so
inherited indexed fields on the concrete table are not registered; update the
indexing logic so that for concrete tables (when is_concrete and table_items
comes from entity_class.get_all_items()) you call the inherited-aware accessor
(e.g. entity_class.get_all_indexed() or otherwise merge parent indexed fields
into entity_class.get_indexed()) before calling index_fields(entity_class) so
the child's table carries secondary indexes defined on parents.
- Around line 5143-5148: The issue is that ordering resolution still uses
_resolve_name() which consults get_names_map() and returns parent-qualified
names for inherited concrete-table fields; update _resolve_name() (or add the
same early-return logic used in the filter branch) to check
entity_class.is_concrete_table() and return table_name or the entity's own name
(entity_class.get_name() / _own_table_name) for any resolved attribute path so
that _order_query_f() and eager relation paths do not produce Parent.field
aliases when the child uses concrete table inheritance; ensure this same change
covers any code paths where order_by or relation-paths call _resolve_name().
In `@doc/design/002-concrete_table_inheritance.md`:
- Around line 144-145: The doc line incorrectly states that DATA_INHERITANCE is
read at module load time; update the sentence to say that
`colony.conf("DATA_INHERITANCE", None)` is read inside
`get_inheritance_strategy()` at call time (not at import/module load), and
mention `get_inheritance_strategy()` in `structures.py` as the place the
override is checked so the documented lifecycle matches the code.
In `@scripts/benchmark_inheritance.py`:
- Around line 162-176: The benchmark helpers (e.g., insert_concrete) currently
only operate on the leaf concrete table, which underestimates real-world cost
because the real implementation duplicates writes into ancestor tables and
serves parent-level reads via UNION ALL; update insert_concrete and the
corresponding read helpers (the other functions mentioned around lines 239-252,
278-315, 429-510) to perform the same behavior as the real concrete-table
strategy: when inserting a leaf row, also insert into each ancestor table with
the appropriate columns and _class marker, and when running parent-level queries
use a UNION ALL over the relevant ancestor+leaf tables (or simulate that by
querying the ancestor tables combined) so the benchmark reflects the extra write
and read costs of duplicated ancestor rows and UNION ALL scans; locate and
modify functions named insert_concrete and the sibling insert/read helpers to
implement these changes.
- Around line 575-755: The cleanup currently deletes temporary DB files in the
outer finally without guaranteeing the SQLite connections (cti_conn,
concrete_conn, wide_cti_conn, wide_concrete_conn) are closed, which can fail on
Windows and leak wide DB files; modify the function to track these connection
objects and paths (cti_conn, concrete_conn, wide_cti_conn, wide_concrete_conn
and cti_path, concrete_path, wide_cti_path, wide_concrete_path), and in the
finally block first safely close any non-None/open connections (catch and ignore
errors on close) and then remove their corresponding files if they exist; ensure
wide_* variables are defined (init to None) before use so early exceptions still
trigger proper closure and deletion.
In `@scripts/migrate_inheritance.py`:
- Around line 707-725: The migrate() flow currently stops after
validate_hierarchy() and skips validate_data(), so run validate_data() after a
successful hierarchy validation and before proceeding to build queries (i.e.,
after the existing is_valid check and before the current_strategy logic); if
validate_data() returns failures, append its messages to messages and return
False, messages, and if validate_only is true return True, messages only when
both validate_hierarchy() and validate_data() succeed; ensure you call
validate_data(entity_class, connection, types_map) (or the appropriate
signature) and propagate its results before calling
generate_cti_to_concrete_queries() or generate_concrete_to_cti_queries().
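The intended ordering can be sketched with stubbed validators; the function names mirror the comment above, while the signatures are simplified assumptions rather than the script's exact ones:

```python
def validate_hierarchy(entity_class):
    # stub standing in for the script's hierarchy validation
    return True, ["hierarchy ok"]

def validate_data(entity_class, connection):
    # stub standing in for the currently-skipped data validation
    return True, ["data ok"]

def migrate(entity_class, connection, validate_only=False):
    messages = []
    ok, msgs = validate_hierarchy(entity_class)
    messages.extend(msgs)
    if not ok:
        return False, messages
    # data validation must run after hierarchy validation succeeds and
    # before any migration queries are generated
    ok, msgs = validate_data(entity_class, connection)
    messages.extend(msgs)
    if not ok:
        return False, messages
    if validate_only:
        return True, messages
    # ... proceed to generate_cti_to_concrete_queries() / the reverse ...
    return True, messages

success, messages = migrate(object, None, validate_only=True)
```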
- Around line 159-163: The backup path code for SQLite does not flush WAL to the
main DB before copying, so modify the sqlite branch (the block that checks if
engine == "sqlite" and uses connection_params/file_path) to open a sqlite3
connection to file_path (or use the existing connection if available) and
execute "PRAGMA wal_checkpoint(FULL)" (via cursor.execute) and commit/close
before calling shutil.copy2; then perform the existing backup_path creation and
copy as before so the backup includes WAL changes.
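The checkpoint-then-copy sequence is self-contained enough to sketch; `backup_sqlite` below is a simplified stand-in for the script's `backup_database()`:

```python
import os
import shutil
import sqlite3
import tempfile
import time

def backup_sqlite(file_path):
    # flush any pending WAL frames into the main database file so the
    # copy below is complete even when journal_mode=WAL is active
    connection = sqlite3.connect(file_path)
    try:
        connection.execute("PRAGMA wal_checkpoint(FULL)")
        connection.commit()
    finally:
        connection.close()
    backup_path = "%s.backup.%d" % (file_path, int(time.time()))
    shutil.copy2(file_path, backup_path)
    return backup_path

# usage: create a WAL-mode database, write a row, then back it up
fd, path = tempfile.mkstemp(suffix=".db")
os.close(fd)
conn = sqlite3.connect(path)
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("create table t(v integer)")
conn.execute("insert into t values (42)")
conn.commit()
conn.close()

backup = backup_sqlite(path)
bconn = sqlite3.connect(backup)
rows = bconn.execute("select v from t").fetchall()
bconn.close()
os.remove(path)
os.remove(backup)
```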
- Around line 338-345: The code uses SQLite-only SQL (sqlite_master and "insert
or ignore") inside generate_cti_to_concrete_queries and
generate_concrete_to_cti_queries and so will break for mysql/pgsql; update both
functions to accept an engine parameter and branch SQL generation by engine:
keep the sqlite_master query when engine=="sqlite", but use
information_schema.tables or pg_catalog (e.g. query information_schema.tables
with table_schema/name) for mysql and pgsql table-existence checks in place of
sqlite_master, and replace "insert or ignore" with engine-specific upsert syntax
("INSERT IGNORE" for mysql, "INSERT ... ON CONFLICT DO NOTHING" for pgsql) while
preserving behavior; ensure migrate() callers pass the engine through to these
functions and update places that use connection.cursor()/execute to use the
engine-specific SQL strings from generate_cti_to_concrete_queries /
generate_concrete_to_cti_queries.
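The engine branching could be centralized in small query builders; the helper names and string-formatting style below are illustrative, not the script's real API:

```python
def table_exists_query(engine, table_name):
    # per-engine table-existence probe; parameter binding is left out
    # for brevity (the real script should bind values, not interpolate)
    if engine == "sqlite":
        return (
            "select name from sqlite_master "
            "where type = 'table' and name = '%s'" % table_name
        )
    # mysql and pgsql both expose information_schema.tables
    return (
        "select table_name from information_schema.tables "
        "where table_name = '%s'" % table_name
    )

def insert_ignore_prefix(engine, table_name, columns):
    # engine-specific "insert if absent" syntax; for pgsql the
    # "on conflict do nothing" clause is appended after the VALUES part
    columns_s = ", ".join(columns)
    if engine == "sqlite":
        return "insert or ignore into %s(%s)" % (table_name, columns_s)
    if engine == "mysql":
        return "insert ignore into %s(%s)" % (table_name, columns_s)
    return "insert into %s(%s)" % (table_name, columns_s)

query = insert_ignore_prefix("mysql", "person", ["object_id", "name"])
```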
- Around line 70-80: The current _collect recursion relies on __subclasses__()
which only reflects already-imported modules; before calling
_collect(entity_class) ensure all modules that could define subclasses of
entity_class are imported (e.g., use
pkgutil.walk_packages()/importlib.import_module over the package containing
entity_class.__module__ or otherwise import relevant modules), then run the
existing _collect function that appends to classes; keep the same symbols
(_collect, classes, entity_class, __subclasses__()) but add the module
discovery/import step prior to invoking _collect so the full class hierarchy is
loaded.
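The module-discovery step can be sketched against any regular package; here the stdlib `json` package stands in for the application package that holds `entity_class.__module__`:

```python
import importlib
import pkgutil

def import_all_modules(package_name):
    # eagerly import every module under a package so that a later
    # EntityClass.__subclasses__() walk reflects the whole hierarchy
    package = importlib.import_module(package_name)
    imported = [package_name]
    for _finder, name, _is_pkg in pkgutil.walk_packages(
        package.__path__, prefix=package.__name__ + "."
    ):
        importlib.import_module(name)
        imported.append(name)
    return imported

# a migration script would pass the package containing the entity
# modules here instead of a stdlib package
modules = import_all_modules("json")
```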
---
Nitpick comments:
In `@data/src/entity_manager/test.py`:
- Around line 3308-3549: Add ORM round-trip checks after each migration in
test_end_to_end_cti_to_concrete and test_end_to_end_concrete_to_cti: after
calling migration.migrate and asserting success, reinitialize or reopen the
EntityManager (or at least clear its session), then use
EntityManager.get/find/save (and entity_manager.begin/commit where needed) to
load an existing migrated entity and to persist a new one to verify full ORM
operations work against the migrated schema; reference the existing helpers used
in the test (self.entity_manager, self._get_connection(), and migration.migrate)
to locate where to insert these checks.
- Around line 2642-2713: The test mutates module-global
structures.DATA_INHERITANCE and per-class _inheritance_strategy caches inline;
move that state management into the test class's setUp and tearDown: in setUp
capture structures.DATA_INHERITANCE into self._original_data_inheritance, build
the test_classes list and save any existing cls._inheritance_strategy into
self._saved_caches, and clear caches (del cls._inheritance_strategy) so each
test starts fresh; in tearDown restore structures.DATA_INHERITANCE from
self._original_data_inheritance, clear any temporary _inheritance_strategy
entries, and reassign saved values from self._saved_caches; then simplify
test_global_inheritance_override by removing the try/finally and relying on
setUp/tearDown to manage state.
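The fixture pattern being requested is the standard unittest one; `config` below is a stand-in for the `structures` module whose global gets patched:

```python
import io
import unittest

class config:
    # stand-in for the structures module-global being overridden in tests
    DATA_INHERITANCE = "class_table"

class InheritanceOverrideTest(unittest.TestCase):
    def setUp(self):
        # snapshot the module-global so every test starts from a clean baseline
        self._original_data_inheritance = config.DATA_INHERITANCE

    def tearDown(self):
        # restore even when the test body raises, removing the need
        # for a try/finally inside each individual test
        config.DATA_INHERITANCE = self._original_data_inheritance

    def test_global_inheritance_override(self):
        config.DATA_INHERITANCE = "concrete_table"
        self.assertEqual(config.DATA_INHERITANCE, "concrete_table")

runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
result = runner.run(
    unittest.defaultTestLoader.loadTestsFromTestCase(InheritanceOverrideTest)
)
```

After the run, tearDown has restored the global, so later tests never see the override.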
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c8caefd7-f2a1-47cf-91b3-d1b30aeb53db
📒 Files selected for processing (9)
CHANGELOG.md
data/src/entity_manager/mocks.py
data/src/entity_manager/structures.py
data/src/entity_manager/system.py
data/src/entity_manager/test.py
doc/design/002-concrete_table_inheritance.md
doc/design/README.md
scripts/benchmark_inheritance.py
scripts/migrate_inheritance.py
@staticmethod
def _relation_boss():
    return dict(
        type="to-one", target=ConcretePerson, reverse="children", is_mapper=True
    )
boss is paired with the wrong reverse relation.
Using reverse="children" ties manager lookups to the parent/child relation metadata. ConcretePerson needs a dedicated report collection here; otherwise reverse traversal and lazy loading for boss relationships will resolve the wrong side.
🐛 Suggested fix
class ConcretePerson(ConcreteRootEntity):
@@
children = dict(type="relation")
""" The children of the current person """
+ employees = dict(type="relation")
+ """ The employees associated with the person """
+
@@
def _relation_children():
return dict(type="to-many", target=ConcretePerson, reverse="parent")
+    @staticmethod
+    def _relation_employees():
+        return dict(type="to-many", target=ConcreteEmployee, reverse="boss")
+
@@
class ConcreteEmployee(ConcretePerson):
@@
    @staticmethod
    def _relation_boss():
        return dict(
-           type="to-one", target=ConcretePerson, reverse="children", is_mapper=True
+           type="to-one", target=ConcretePerson, reverse="employees", is_mapper=True
        )

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@data/src/entity_manager/mocks.py` around lines 539 - 543, _relation_boss
currently sets reverse="children", which pairs the boss relation to the
parent/child metadata; change the reverse to a dedicated reports collection
(e.g., reverse="reports") so boss lookups and lazy loading traverse the correct
relation on ConcretePerson, and ensure ConcretePerson defines a matching reports
relation name; update the dict returned by _relation_boss to use
reverse="reports" (or the actual reports relation name on ConcretePerson).
"_inheritance_strategy",
"_all_items",
"_all_non_foreign_items",
"abstract",
"inheritance",
inheritance is now a breaking reserved attribute.
Any model that already persisted an inheritance field will silently lose it after this change: the name is filtered out by INVALID_NAMES, and the same attribute is now interpreted as framework strategy config. For a public ORM surface, that is a backward-compatibility break, not just an additive feature. Please move the strategy flag to a private/sentinel name, or only reserve inheritance when its value is one of the supported strategy strings.
Also applies to: 429-433, 2638-2642
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@data/src/entity_manager/structures.py` around lines 147 - 151, The code
currently reserves the public attribute name "inheritance" (listed alongside
"_inheritance_strategy" in the INVALID_NAMES-like list), which breaks backwards
compatibility; change the behavior so the framework either uses a private
sentinel name (e.g., keep and use "_inheritance_strategy" as the config key and
remove "inheritance" from the invalid/reserved names) or make the reservation
conditional by only treating "inheritance" as the framework flag when its value
equals one of the supported strategy strings (check membership against your
strategy constants) — update the INVALID_NAMES array and any
validation/filtering logic that references "inheritance" accordingly (also apply
the same change in the other occurrences referenced in the review).
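The conditional-reservation option can be sketched as a small predicate; the two strategy strings are the ones named in this PR, and the function name is hypothetical:

```python
SUPPORTED_STRATEGIES = ("class_table", "concrete_table")

def is_strategy_config(name, value):
    # treat "inheritance" as framework configuration only when its value
    # is a known strategy string; any other value is left alone as an
    # ordinary persisted attribute, avoiding the compatibility break
    return name == "inheritance" and value in SUPPORTED_STRATEGIES
```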
# retrieves all the parent classes and iterates over
# them to collect their items, this includes abstract
# parents whose items would normally be inherited into
# the first concrete parent
all_parents = cls.get_all_parents()
for parent in all_parents:
    # retrieves the items directly defined in the parent
    # class using the internal items method and filters
    # them using the same criteria as get_items
    parent_items = parent._items()
    for key, value in colony.legacy.iteritems(parent_items):
        if key in INVALID_NAMES:
            continue
        if key.isupper():
            continue
        if type(value) in (
            types.FunctionType,
            types.MethodType,
            staticmethod,
            classmethod,
            property,
        ):
            continue
        if (
            not foreign_relations
            and cls.is_relation(key)
            and not cls.is_mapped(key)
        ):
            continue
        if not hasattr(value, "get"):
            continue
        all_items[key] = value

# retrieves the items directly defined in the current
# class and merges them into the all items map (these
# take precedence over parent items)
own_items = cls._items()
for key, value in colony.legacy.iteritems(own_items):
    if key in INVALID_NAMES:
        continue
    if key.isupper():
        continue
    if type(value) in (
        types.FunctionType,
        types.MethodType,
        staticmethod,
        classmethod,
        property,
    ):
        continue
    if (
        not foreign_relations
        and cls.is_relation(key)
        and not cls.is_mapped(key)
    ):
        continue
    if not hasattr(value, "get"):
        continue
    all_items[key] = value

# caches the all items in the class to provide
# fast access in latter access
setattr(cls, cache_key, all_items)
get_all_items() can disagree with the ORM's existing owner precedence.
This merge overwrites on every collision, so the last parent wins. get_names_map() keeps the first owner instead, which means flattened CTI metadata can diverge from the owner/validation path for duplicate fields in multi-parent hierarchies.
♻️ Suggested fix
- # retrieves all the parent classes and iterates over
- # them to collect their items, this includes abstract
- # parents whose items would normally be inherited into
- # the first concrete parent
- all_parents = cls.get_all_parents()
- for parent in all_parents:
- # retrieves the items directly defined in the parent
- # class using the internal items method and filters
- # them using the same criteria as get_items
- parent_items = parent._items()
- for key, value in colony.legacy.iteritems(parent_items):
- if key in INVALID_NAMES:
- continue
- if key.isupper():
- continue
- if type(value) in (
- types.FunctionType,
- types.MethodType,
- staticmethod,
- classmethod,
- property,
- ):
- continue
- if (
- not foreign_relations
- and cls.is_relation(key)
- and not cls.is_mapped(key)
- ):
- continue
- if not hasattr(value, "get"):
- continue
- all_items[key] = value
-
- # retrieves the items directly defined in the current
- # class and merges them into the all items map (these
- # take precedence over parent items)
- own_items = cls._items()
- for key, value in colony.legacy.iteritems(own_items):
- if key in INVALID_NAMES:
- continue
- if key.isupper():
- continue
- if type(value) in (
- types.FunctionType,
- types.MethodType,
- staticmethod,
- classmethod,
- property,
- ):
- continue
- if (
- not foreign_relations
- and cls.is_relation(key)
- and not cls.is_mapped(key)
- ):
- continue
- if not hasattr(value, "get"):
- continue
- all_items[key] = value
+ # reuse the same ownership order used by get_names_map()
+ # so flattened descriptors match the rest of the ORM
+ names_map = cls.get_names_map()
+ for key, owner in colony.legacy.iteritems(names_map):
+ owner_items = owner.get_items(foreign_relations=foreign_relations)
+ if key in owner_items:
+ all_items[key] = owner_items[key]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@data/src/entity_manager/structures.py` around lines 1813 - 1875,
get_all_items() currently merges parent and own items by assigning into
all_items such that later entries overwrite earlier ones (last-wins), which can
conflict with the ORM's owner precedence used by get_names_map() that preserves
the first owner; change the merge logic in get_all_items() (the loop over
cls.get_all_parents() and the subsequent own_items loop that writes into
all_items and sets setattr(cls, cache_key, all_items)) so that on key collisions
you preserve the existing owner/value (first-wins) rather than overwriting, and
ensure own_items still take precedence over parents by only writing a key from
own_items if it does not already exist or by writing parents first then own
items using a non-overwriting assign policy consistent with get_names_map().
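The requested merge policy (first parent wins, own items still overriding) can be illustrated independently of the actual descriptor types:

```python
def merge_items(parent_maps, own_items):
    # first-wins across parents, matching the ownership order that
    # get_names_map() preserves, while the class's own items still
    # override anything inherited
    all_items = {}
    for parent_items in parent_maps:
        for key, value in parent_items.items():
            all_items.setdefault(key, value)
    all_items.update(own_items)
    return all_items

merged = merge_items(
    [{"name": "from_first_parent"}, {"name": "from_second_parent"}],
    {"salary": "own"},
)
```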
table_items = (
    entity_class.get_all_items() if is_concrete else entity_class.get_items()
)
Carry inherited secondary indexes onto concrete tables.
get_all_items() flattens parent columns into the concrete table here, but schema setup still calls index_fields(entity_class), and that method only indexes entity_class.get_indexed() for the current table. Under class-table that was table-scoped by necessity; with concrete-table, indexed parent fields on the child table are left unindexed, so inherited-field filters/orderings regress to table scans.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@data/src/entity_manager/system.py` around lines 3063 - 3065, The
concrete-table path flattens parent columns via entity_class.get_all_items() but
schema setup still only calls index_fields(entity_class) which in turn uses
entity_class.get_indexed(), so inherited indexed fields on the concrete table
are not registered; update the indexing logic so that for concrete tables (when
is_concrete and table_items comes from entity_class.get_all_items()) you call
the inherited-aware accessor (e.g. entity_class.get_all_indexed() or otherwise
merge parent indexed fields into entity_class.get_indexed()) before calling
index_fields(entity_class) so the child's table carries secondary indexes
defined on parents.
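What an inherited-aware accessor could compute can be sketched as follows; `get_all_indexed` is a hypothetical name, and plain lists of field names stand in for the real index descriptors:

```python
def get_all_indexed(own_indexed, parents_indexed):
    # merge indexed field names from every parent into the class's own
    # set, deduplicated while preserving parent-first declaration order
    seen = []
    for indexed in parents_indexed + [own_indexed]:
        for name in indexed:
            if name not in seen:
                seen.append(name)
    return seen

# a concrete Employee table would carry indexes inherited from its
# Person and RootEntity ancestors in addition to its own
indexed = get_all_indexed(["salary"], [["name"], ["name", "identifier"]])
```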
if not is_concrete:
    # iterates over all the parents to provide
    # the necessary (inner) join of them into
    # the current query context, this is a main step
    # in achieving inheritance compliance in the query
    for parent in all_parents:
        # in case the parent class is abstract no need to join
        # it into the current query
        if parent.is_abstract():
            continue

        # retrieves the parent name, assumes the
        # associated table has the same value
        parent_name = parent.get_name()

        # writes the table inheritance inner join
        # part of the query, ensuring data coherence
        # in the complete inheritance chain
        query_buffer.write(" inner join ")
        query_buffer.write(parent_name)
        query_buffer.write(" on ")
        query_buffer.write(table_name + "." + table_id)
        query_buffer.write(" = ")
        query_buffer.write(parent_name + "." + table_id)
Inherited relation joins still point at parent aliases.
For concrete-table hierarchies, Lines 4673-4697 remove the parent tables from FROM, but join_tables() still builds _prefix from _entity_class.get_name() when the relation is declared on an ancestor. Eager/default loading of an inherited relation will emit predicates against Parent.* even though Parent is no longer joined.
🐛 Minimal fix
- if entity_class == _entity_class:
- _prefix = prefix or _table_name
- else:
- _prefix = (
- (prefix + "___" + _table_name) if prefix else _table_name
- )
+ if entity_class.is_concrete_table():
+ _prefix = prefix or entity_class.get_name()
+ elif entity_class == _entity_class:
+ _prefix = prefix or _table_name
+ else:
+ _prefix = (
+ (prefix + "___" + _table_name) if prefix else _table_name
            )

Also applies to: 4771-4780
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@data/src/entity_manager/system.py` around lines 4673 - 4697, The joins for
inherited relations still use the ancestor class name as the SQL prefix, causing
predicates to reference Parent.* even after Parent is removed from FROM; update
join_tables() so that when a relation is declared on an ancestor it computes the
SQL prefix from the concrete/table-owning class (not
_entity_class.get_name())—e.g. resolve the concrete subclass/table name (or the
concrete table alias mapping used when building the FROM joins) and use that for
_prefix and any predicate generation; apply the same fix to the equivalent logic
referenced around lines 4771-4780 so inherited relations always target the
concrete table alias rather than the ancestor name.
cti_fd, cti_path = tempfile.mkstemp(suffix=".db")
concrete_fd, concrete_path = tempfile.mkstemp(suffix=".db")
os.close(cti_fd)
os.close(concrete_fd)

try:
    cti_conn = sqlite3.connect(cti_path)
    concrete_conn = sqlite3.connect(concrete_path)

    # benchmarks the schema creation
    cti_ms, _ = benchmark("create_schema", create_cti_schema, cti_conn)
    concrete_ms, _ = benchmark(
        "create_schema", create_concrete_schema, concrete_conn
    )
    results.append(("Schema Creation", cti_ms, concrete_ms))

    # benchmarks the bulk insert operation
    def bulk_insert_cti():
        for i in range(1, iterations + 1):
            insert_cti(cti_conn, i, "person_%d" % i, 20 + (i % 50), 1000 + i)
        cti_conn.commit()

    def bulk_insert_concrete():
        for i in range(1, iterations + 1):
            insert_concrete(
                concrete_conn, i, "person_%d" % i, 20 + (i % 50), 1000 + i
            )
        concrete_conn.commit()

    cti_ms, _ = benchmark("bulk_insert", bulk_insert_cti)
    concrete_ms, _ = benchmark("bulk_insert", bulk_insert_concrete)
    results.append(("Bulk INSERT (%d rows)" % iterations, cti_ms, concrete_ms))

    # benchmarks the single select by ID (warm cache, middle of range)
    target_id = iterations // 2

    def repeat_select_cti():
        for _i in range(iterations):
            select_by_id_cti(cti_conn, target_id)

    def repeat_select_concrete():
        for _i in range(iterations):
            select_by_id_concrete(concrete_conn, target_id)

    cti_ms, _ = benchmark("select_by_id", repeat_select_cti)
    concrete_ms, _ = benchmark("select_by_id", repeat_select_concrete)
    results.append(("SELECT by ID (%d lookups)" % iterations, cti_ms, concrete_ms))

    # benchmarks the find all (full table scan)
    def repeat_find_all_cti():
        for _i in range(100):
            select_all_cti(cti_conn)

    def repeat_find_all_concrete():
        for _i in range(100):
            select_all_concrete(concrete_conn)

    cti_ms, _ = benchmark("find_all", repeat_find_all_cti)
    concrete_ms, _ = benchmark("find_all", repeat_find_all_concrete)
    results.append(("SELECT all (100 scans)", cti_ms, concrete_ms))

    # benchmarks the count operation
    def repeat_count_cti():
        for _i in range(iterations):
            count_cti(cti_conn)

    def repeat_count_concrete():
        for _i in range(iterations):
            count_concrete(concrete_conn)

    cti_ms, _ = benchmark("count", repeat_count_cti)
    concrete_ms, _ = benchmark("count", repeat_count_concrete)
    results.append(("COUNT (%d counts)" % iterations, cti_ms, concrete_ms))

    # benchmarks the bulk update operation
    def bulk_update_cti():
        for i in range(1, iterations + 1):
            update_cti(cti_conn, i, "updated_%d" % i, 2000 + i)
        cti_conn.commit()

    def bulk_update_concrete():
        for i in range(1, iterations + 1):
            update_concrete(concrete_conn, i, "updated_%d" % i, 2000 + i)
        concrete_conn.commit()

    cti_ms, _ = benchmark("bulk_update", bulk_update_cti)
    concrete_ms, _ = benchmark("bulk_update", bulk_update_concrete)
    results.append(("Bulk UPDATE (%d rows)" % iterations, cti_ms, concrete_ms))

    # benchmarks the bulk delete operation
    def bulk_delete_cti():
        for i in range(1, iterations + 1):
            delete_cti(cti_conn, i)
        cti_conn.commit()

    def bulk_delete_concrete():
        for i in range(1, iterations + 1):
            delete_concrete(concrete_conn, i)
        concrete_conn.commit()

    cti_ms, _ = benchmark("bulk_delete", bulk_delete_cti)
    concrete_ms, _ = benchmark("bulk_delete", bulk_delete_concrete)
    results.append(("Bulk DELETE (%d rows)" % iterations, cti_ms, concrete_ms))

    # closes the narrow benchmark connections
    cti_conn.close()
    concrete_conn.close()

    # adds a separator for the wide hierarchy benchmarks
    results.append(None)

    # creates the temporary database files for the wide
    # hierarchy benchmarks (20 columns per level, 60 total)
    wide_cti_fd, wide_cti_path = tempfile.mkstemp(suffix=".db")
    wide_concrete_fd, wide_concrete_path = tempfile.mkstemp(suffix=".db")
    os.close(wide_cti_fd)
    os.close(wide_concrete_fd)

    wide_cti_conn = sqlite3.connect(wide_cti_path)
    wide_concrete_conn = sqlite3.connect(wide_concrete_path)

    # benchmarks the wide schema creation
    cti_ms, _ = benchmark(
        "wide_create_schema", create_wide_cti_schema, wide_cti_conn
    )
    concrete_ms, _ = benchmark(
        "wide_create_schema", create_wide_concrete_schema, wide_concrete_conn
    )
    results.append(("Wide Schema Creation (60 cols)", cti_ms, concrete_ms))

    # benchmarks the wide bulk insert
    def wide_bulk_insert_cti():
        for i in range(1, iterations + 1):
            insert_wide_cti(wide_cti_conn, i)
        wide_cti_conn.commit()

    def wide_bulk_insert_concrete():
        for i in range(1, iterations + 1):
            insert_wide_concrete(wide_concrete_conn, i)
        wide_concrete_conn.commit()

    cti_ms, _ = benchmark("wide_bulk_insert", wide_bulk_insert_cti)
    concrete_ms, _ = benchmark("wide_bulk_insert", wide_bulk_insert_concrete)
    results.append(
        ("Wide INSERT (%d rows, 60 cols)" % iterations, cti_ms, concrete_ms)
    )

    # benchmarks the wide select by ID
    target_id = iterations // 2

    def wide_repeat_select_cti():
        for _i in range(iterations):
            select_by_id_wide_cti(wide_cti_conn, target_id)

    def wide_repeat_select_concrete():
        for _i in range(iterations):
            select_by_id_wide_concrete(wide_concrete_conn, target_id)

    cti_ms, _ = benchmark("wide_select_by_id", wide_repeat_select_cti)
    concrete_ms, _ = benchmark("wide_select_by_id", wide_repeat_select_concrete)
    results.append(
        ("Wide SELECT by ID (%d, 60 cols)" % iterations, cti_ms, concrete_ms)
    )

    # closes the wide benchmark connections
    wide_cti_conn.close()
    wide_concrete_conn.close()

    if os.path.exists(wide_cti_path):
        os.remove(wide_cti_path)
    if os.path.exists(wide_concrete_path):
        os.remove(wide_concrete_path)

finally:
    # removes the temporary database files
    if os.path.exists(cti_path):
        os.remove(cti_path)
    if os.path.exists(concrete_path):
        os.remove(concrete_path)

return results
Make temp database cleanup exception-safe.
If any benchmark step raises before the explicit close() calls, the finally block tries to delete still-open SQLite files. That can raise on Windows and mask the real failure, and the wide temp databases leak on early exits.
♻️ Suggested cleanup structure
def run_benchmarks(iterations=None):
@@
- try:
- cti_conn = sqlite3.connect(cti_path)
- concrete_conn = sqlite3.connect(concrete_path)
+ cti_conn = None
+ concrete_conn = None
+ wide_cti_conn = None
+ wide_concrete_conn = None
+ wide_cti_path = None
+ wide_concrete_path = None
+
+ try:
+ cti_conn = sqlite3.connect(cti_path)
+ concrete_conn = sqlite3.connect(concrete_path)
@@
- # closes the wide benchmark connections
- wide_cti_conn.close()
- wide_concrete_conn.close()
-
- if os.path.exists(wide_cti_path):
- os.remove(wide_cti_path)
- if os.path.exists(wide_concrete_path):
- os.remove(wide_concrete_path)
-
finally:
- # removes the temporary database files
- if os.path.exists(cti_path):
- os.remove(cti_path)
- if os.path.exists(concrete_path):
- os.remove(concrete_path)
+ for connection in (
+ wide_cti_conn,
+ wide_concrete_conn,
+ cti_conn,
+ concrete_conn,
+ ):
+ if connection:
+ connection.close()
+ for path in (
+ wide_cti_path,
+ wide_concrete_path,
+ cti_path,
+ concrete_path,
+ ):
+ if path and os.path.exists(path):
+ os.remove(path)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/benchmark_inheritance.py` around lines 575 - 755, The cleanup
currently deletes temporary DB files in the outer finally without guaranteeing
the SQLite connections (cti_conn, concrete_conn, wide_cti_conn,
wide_concrete_conn) are closed, which can fail on Windows and leak wide DB
files; modify the function to track these connection objects and paths
(cti_conn, concrete_conn, wide_cti_conn, wide_concrete_conn and cti_path,
concrete_path, wide_cti_path, wide_concrete_path), and in the finally block
first safely close any non-None/open connections (catch and ignore errors on
close) and then remove their corresponding files if they exist; ensure wide_*
variables are defined (init to None) before use so early exceptions still
trigger proper closure and deletion.
def _collect(cls):
    for subclass in cls.__subclasses__():
        # avoids duplicates from multiple inheritance paths
        # (e.g. Employee inherits from both Person and Taxable)
        if subclass in classes:
            continue
        if not subclass.is_abstract():
            classes.append(subclass)
        _collect(subclass)

_collect(entity_class)
Don't rely on __subclasses__() alone to discover the complete hierarchy.
The CLI only imports the root class module via __import__(). If subclasses reside in separate modules and haven't been imported yet, __subclasses__() will silently discover only the imported portion of the hierarchy, resulting in an incomplete migration plan that skips tables from unimported classes.
Also applies to: 828-831
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/migrate_inheritance.py` around lines 70 - 80, The current _collect
recursion relies on __subclasses__() which only reflects already-imported
modules; before calling _collect(entity_class) ensure all modules that could
define subclasses of entity_class are imported (e.g., use
pkgutil.walk_packages()/importlib.import_module over the package containing
entity_class.__module__ or otherwise import relevant modules), then run the
existing _collect function that appends to classes; keep the same symbols
(_collect, classes, entity_class, __subclasses__()) but add the module
discovery/import step prior to invoking _collect so the full class hierarchy is
loaded.
if engine == "sqlite":
    file_path = connection_params.get("file_path", "")
    backup_path = "%s.backup.%d" % (file_path, timestamp)
    shutil.copy2(file_path, backup_path)
    return backup_path
🧩 Analysis chain
🏁 Script executed:
# First, read the actual code at lines 159-163
head -170 scripts/migrate_inheritance.py | tail -20

Repository: hivesolutions/colony-plugins
Length of output: 785
🏁 Script executed:
# Check for any WAL checkpoint calls in the backup function and surrounding code
rg "wal_checkpoint|WAL|backup" scripts/migrate_inheritance.py -B 3 -A 3

Repository: hivesolutions/colony-plugins
Length of output: 3645
🏁 Script executed:
# Look for test files and their checkpoint patterns
fd "test.*\.py" scripts/

Repository: hivesolutions/colony-plugins
Length of output: 54
🏁 Script executed:
# Check if there are any tests that mention wal_checkpoint or backup
rg "wal_checkpoint|FULL.*backup" . -t py -B 2 -A 2

Repository: hivesolutions/colony-plugins
Length of output: 1419
Call PRAGMA wal_checkpoint(FULL) before copying the SQLite database file to ensure WAL data is flushed.
With WAL enabled, recent transactions can live in the -wal sidecar file, and shutil.copy2() on only the main database file will produce an incomplete backup. The tests already demonstrate awareness of this (they call wal_checkpoint(FULL) in multiple places), but the backup_database() function should apply the same safety measure before performing the file copy.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/migrate_inheritance.py` around lines 159 - 163, The backup path code
for SQLite does not flush WAL to the main DB before copying, so modify the
sqlite branch (the block that checks if engine == "sqlite" and uses
connection_params/file_path) to open a sqlite3 connection to file_path (or use
the existing connection if available) and execute "PRAGMA wal_checkpoint(FULL)"
(via cursor.execute) and commit/close before calling shutil.copy2; then perform
the existing backup_path creation and copy as before so the backup includes WAL
changes.
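A minimal sketch of the suggested fix, assuming a standalone helper rather than the script's actual backup_database() signature:

```python
import shutil
import sqlite3
import time

def backup_sqlite(file_path):
    # flush any pending WAL frames into the main database file
    # so that a plain file copy captures all committed data
    connection = sqlite3.connect(file_path)
    try:
        connection.execute("PRAGMA wal_checkpoint(FULL)")
        connection.commit()
    finally:
        connection.close()
    backup_path = "%s.backup.%d" % (file_path, int(time.time()))
    shutil.copy2(file_path, backup_path)
    return backup_path
```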
# validates the hierarchy first
is_valid, validation_messages = validate_hierarchy(entity_class, target_strategy)
messages.extend(validation_messages)

if not is_valid:
    return False, messages

if validate_only:
    return True, messages

# determines the current strategy and generates the
# appropriate migration queries
current_strategy = entity_class.get_inheritance_strategy()

if current_strategy == "class_table" and target_strategy == "concrete_table":
    queries = generate_cti_to_concrete_queries(entity_class, types_map, connection)
elif current_strategy == "concrete_table" and target_strategy == "class_table":
    queries = generate_concrete_to_cti_queries(entity_class, types_map, connection)
else:
Run data validation before declaring the migration safe.
migrate() returns after validate_hierarchy() and never calls validate_data(). That means orphaned CTI rows will be reported as valid and can then disappear during the INNER JOIN copy in generate_cti_to_concrete_queries().
🛡️ Suggested validation hook
     if not is_valid:
         return False, messages
-
-    if validate_only:
-        return True, messages

     # determines the current strategy and generates the
     # appropriate migration queries
     current_strategy = entity_class.get_inheritance_strategy()
+
+    if current_strategy == "class_table":
+        is_valid = validate_data(connection, entity_class, messages)
+        if not is_valid:
+            return False, messages
+
+    if validate_only:
+        return True, messages

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/migrate_inheritance.py` around lines 707 - 725, The migrate() flow
currently stops after validate_hierarchy() and skips validate_data(), so run
validate_data() after a successful hierarchy validation and before proceeding to
build queries (i.e., after the existing is_valid check and before the
current_strategy logic); if validate_data() returns failures, append its
messages to messages and return False, messages, and if validate_only is true
return True, messages only when both validate_hierarchy() and validate_data()
succeed; ensure you call validate_data(entity_class, connection, types_map) (or
the appropriate signature) and propagate its results before calling
generate_cti_to_concrete_queries() or generate_concrete_to_cti_queries().
…support

Move all migration tests out of the Colony entity manager test bundle into scripts/test/test_migrate_inheritance.py as a self-contained test suite with its own mock entity classes and raw SQLite setup. Remove EntityManagerMigrationTestCase from entity_manager/test.py.

Make the migration script engine-agnostic with helper functions:

- has_table(): engine-specific catalog queries (sqlite_master, information_schema.tables, pg_tables)
- index_query(): engine-specific CREATE INDEX syntax (SQLite plain, MySQL USING after column, PostgreSQL USING before column) with name truncation (64 chars MySQL, 63 chars PostgreSQL)
- insert_ignore_query(): engine-specific duplicate handling (INSERT OR IGNORE for SQLite, INSERT IGNORE for MySQL, ON CONFLICT DO NOTHING for PostgreSQL)

CI now uses pytest to run all scripts/test/ files:

pip install pytest
pytest scripts/test/ -v

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
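Based on the behaviors listed in this commit message, a minimal sketch of how such an engine-dispatching helper might look (the function name comes from the commit message; the `?` placeholder style and exact signature are assumptions, not the repository's actual code):

```python
def insert_ignore_query(table_name, columns, engine="sqlite"):
    # builds an INSERT statement that silently skips duplicate-key
    # rows, using the engine-specific dialect described above;
    # the driver's placeholder token may differ per engine
    placeholders = ", ".join(["?"] * len(columns))
    column_list = ", ".join(columns)
    if engine == "sqlite":
        return "INSERT OR IGNORE INTO %s (%s) VALUES (%s)" % (
            table_name, column_list, placeholders)
    if engine == "mysql":
        return "INSERT IGNORE INTO %s (%s) VALUES (%s)" % (
            table_name, column_list, placeholders)
    if engine == "postgresql":
        return "INSERT INTO %s (%s) VALUES (%s) ON CONFLICT DO NOTHING" % (
            table_name, column_list, placeholders)
    raise ValueError("unsupported engine: %s" % engine)
```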
Actionable comments posted: 1
🧹 Nitpick comments (6)
data/src/entity_manager/test.py (1)
2516-2518: Consider consistent save order for related entities.

The save order here (person saved before address when person.address = address) differs from the pattern established in the base test_one_to_one (lines 361-372), where the referenced entity is saved first. While this may work correctly, maintaining consistency across tests helps with readability and maintainability.

🔧 Suggested fix to match established pattern

 person.address = address
-self.entity_manager.save(person)
 self.entity_manager.save(address)
+self.entity_manager.save(person)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@data/src/entity_manager/test.py` around lines 2516 - 2518, The test saves person before address after setting person.address, which is inconsistent with the established pattern in test_one_to_one; update the save order so the referenced entity is persisted first (call self.entity_manager.save(address) before self.entity_manager.save(person)) after assigning person.address = address to match the base test pattern and improve readability and maintainability.

scripts/migrate_inheritance.py (3)
486-489: Prefix unused loop variables with underscore.
sql_type and is_pk are unpacked but not used. Prefix them with underscore to indicate intentionally unused.

♻️ Proposed fix

-        for col_name, sql_type, is_pk in columns:
+        for col_name, _sql_type, _is_pk in columns:
             source_table = get_source_table_for_column(cls, col_name)
             select_cols.append("%s.%s" % (source_table, col_name))

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/migrate_inheritance.py` around lines 486 - 489, The loop unpacks unused variables sql_type and is_pk which should be prefixed to signal intentional non-use; modify the for-loop over columns to unpack as (col_name, _sql_type, _is_pk) and leave the rest of the logic (calls to get_source_table_for_column(cls, col_name) and appending to select_cols) unchanged so the variables and intent are clear.
404-408: SQL built via string formatting - ensure table/column names are trusted.

The query construction uses string formatting with table_name, table_id, and parent_table. While these values derive from entity class metadata (not user input), this pattern doesn't use parameterized queries for identifiers. This is acceptable since SQL doesn't support parameterized table/column names, but ensure entity class names are always code-defined and never user-controlled.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/migrate_inheritance.py` around lines 404 - 408, The SQL is built via Python string formatting in cursor.execute using table_name, table_id, and parent_table; replace this with a safe identifier composition (e.g., psycopg2.sql.SQL/Identifier) or assert/validate that these values are strictly code-defined entity names (no user input) before use. Locate the cursor.execute call in migrate_inheritance.py and either (1) rebuild the statement using psycopg2.sql.Identifier for table/column identifiers or (2) add explicit validation (regex whitelist or membership check against known entity metadata) for table_name, table_id and parent_table to guarantee they cannot be user-controlled, then use them in the query.
847-853: Consider logging when rollback fails.

The try-except-pass pattern on the rollback silently swallows any rollback failure. While avoiding a cascade of exceptions is reasonable, knowing that rollback failed is critical for debugging and manual recovery.

♻️ Proposed fix

 except Exception as exception:
     # rolls back the transaction on failure
     try:
         connection.rollback()
     except Exception:
-        pass
+        messages.append("warning: rollback also failed")
     messages.append("migration failed: %s" % str(exception))
     messages.append("transaction rolled back")
     return False, messages

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/migrate_inheritance.py` around lines 847 - 853, The rollback exception is being swallowed by a bare except in the block around connection.rollback; catch the exception from connection.rollback (e.g., as rollback_exc) and record/log it instead of pass so failures are visible; update the try/except around connection.rollback to append a message like "rollback failed: %s" % str(rollback_exc) to messages (or call the existing logger if available) before continuing, and keep the existing "transaction rolled back" and return behavior; reference the connection.rollback call and the messages list and exception variable in your change.

scripts/test/test_migrate_inheritance.py (1)
102-105: Use is for class identity comparison.

When comparing class objects, is is more appropriate than == since it checks identity rather than equality.

♻️ Proposed fix

 def _get_entity_parents(cls):
     parents = []
     for base in cls.__bases__:
-        if base == MockEntityBase or base == object:
+        if base is MockEntityBase or base is object:
             continue
         parents.append(base)
     return parents

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/test/test_migrate_inheritance.py` around lines 102 - 105, In the loop that iterates over cls.__bases__ (variables cls and base), replace the equality checks using "==" with identity checks using "is" when comparing against MockEntityBase and object (i.e., use "base is MockEntityBase" and "base is object"); update any similar class comparisons in the same function to use "is" to ensure class identity semantics are used rather than equality.

.github/workflows/main.yml (1)
70-77: Consider adding pytest to PyPy builds (excluding 2.7).

The PyPy build jobs don't include the new pytest step. While PyPy 2.7 should be excluded, the migration tests should work with PyPy 3.x versions when they're uncommented in the matrix.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/main.yml around lines 70 - 77, Add a pytest invocation to the PyPy job so migration tests run on PyPy 3.x: modify the job steps that currently include the "run: pypy setup.py test" step to first install and run pytest (e.g., "pip install pytest" then "pytest") but guard it with the matrix condition (matrix.python-version != '2.7') so PyPy 2.7 is excluded; update the existing "run: pypy setup.py test" step or add a preceding step that references matrix.python-version and the existing "pypy setup.py test" entry to ensure tests run under PyPy 3.x.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/migrate_inheritance.py`:
- Around line 116-121: The current truncation uses index_name =
index_name[-limit:] which slices the tail and can collide for similar suffixes;
modify the logic around INDEX_NAME_LIMITS handling to produce deterministic
unique names by combining a preserved prefix with a short hash of the full
index_name: compute a stable hash (e.g., md5/sha1) of the original "%s_%s_%s" %
(table_name, attribute_name, index_type), then build index_name as prefix + "_"
+ hash_suffix such that len(index_name) <= limit (choose prefix length = limit -
1 - len(hash_suffix)); update the code that reads/writes index_name and ensure
engine, table_name, attribute_name and index_type are used to compute the hash
to guarantee uniqueness across similar-suffixed names.
---
Nitpick comments:
In @.github/workflows/main.yml:
- Around line 70-77: Add a pytest invocation to the PyPy job so migration tests
run on PyPy 3.x: modify the job steps that currently include the "run: pypy
setup.py test" step to first install and run pytest (e.g., "pip install pytest"
then "pytest") but guard it with the matrix condition (matrix.python-version !=
'2.7') so PyPy 2.7 is excluded; update the existing "run: pypy setup.py test"
step or add a preceding step that references matrix.python-version and the
existing "pypy setup.py test" entry to ensure tests run under PyPy 3.x.
In `@data/src/entity_manager/test.py`:
- Around line 2516-2518: The test saves person before address after setting
person.address, which is inconsistent with the established pattern in
test_one_to_one; update the save order so the referenced entity is persisted
first (call self.entity_manager.save(address) before
self.entity_manager.save(person)) after assigning person.address = address to
match the base test pattern and improve readability and maintainability.
In `@scripts/migrate_inheritance.py`:
- Around line 486-489: The loop unpacks unused variables sql_type and is_pk
which should be prefixed to signal intentional non-use; modify the for-loop over
columns to unpack as (col_name, _sql_type, _is_pk) and leave the rest of the
logic (calls to get_source_table_for_column(cls, col_name) and appending to
select_cols) unchanged so the variables and intent are clear.
- Around line 404-408: The SQL is built via Python string formatting in
cursor.execute using table_name, table_id, and parent_table; replace this with a
safe identifier composition (e.g., psycopg2.sql.SQL/Identifier) or
assert/validate that these values are strictly code-defined entity names (no
user input) before use. Locate the cursor.execute call in migrate_inheritance.py
and either (1) rebuild the statement using psycopg2.sql.Identifier for
table/column identifiers or (2) add explicit validation (regex whitelist or
membership check against known entity metadata) for table_name, table_id and
parent_table to guarantee they cannot be user-controlled, then use them in the
query.
- Around line 847-853: The rollback exception is being swallowed by a bare
except in the block around connection.rollback; catch the exception from
connection.rollback (e.g., as rollback_exc) and record/log it instead of pass so
failures are visible—update the try/except around connection.rollback to append
a message like "rollback failed: %s" % str(rollback_exc) to messages (or call
the existing logger if available) before continuing, and keep the existing
"transaction rolled back" and return behavior; reference the connection.rollback
call and the messages list and exception variable in your change.
In `@scripts/test/test_migrate_inheritance.py`:
- Around line 102-105: In the loop that iterates over cls.__bases__ (variables
cls and base), replace the equality checks using "==" with identity checks using
"is" when comparing against MockEntityBase and object (i.e., use "base is
MockEntityBase" and "base is object"); update any similar class comparisons in
the same function to use "is" to ensure class identity semantics are used rather
than equality.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6ddc6489-533c-492d-82d5-f4ec1e4185ae
📒 Files selected for processing (4)
- .github/workflows/main.yml
- data/src/entity_manager/test.py
- scripts/migrate_inheritance.py
- scripts/test/test_migrate_inheritance.py
index_name = "%s_%s_%s" % (table_name, attribute_name, index_type)

# truncates the index name to the engine-specific limit
limit = INDEX_NAME_LIMITS.get(engine, None)
if limit:
    index_name = index_name[-limit:]
Index name truncation from the start may cause collisions.
Using index_name[-limit:] takes the last N characters, which could produce duplicate names when multiple tables share similar suffixes. For example, table_a_very_long_attr_hash and table_b_very_long_attr_hash truncated to 64 chars would both end with the same suffix.
Consider truncating from the middle or using a hash-based suffix to ensure uniqueness.
♻️ Proposed fix using hash for uniqueness
+import hashlib
+
def index_query(table_name, attribute_name, index_type, engine="sqlite"):
...
index_name = "%s_%s_%s" % (table_name, attribute_name, index_type)
# truncates the index name to the engine-specific limit
limit = INDEX_NAME_LIMITS.get(engine, None)
- if limit:
- index_name = index_name[-limit:]
+ if limit and len(index_name) > limit:
+ # use a hash suffix to ensure uniqueness when truncating
+ name_hash = hashlib.md5(index_name.encode()).hexdigest()[:8]
+ prefix_len = limit - 9 # leave room for underscore + 8-char hash
+ index_name = index_name[:prefix_len] + "_" + name_hash🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/migrate_inheritance.py` around lines 116 - 121, The current
truncation uses index_name = index_name[-limit:] which slices the tail and can
collide for similar suffixes; modify the logic around INDEX_NAME_LIMITS handling
to produce deterministic unique names by combining a preserved prefix with a
short hash of the full index_name: compute a stable hash (e.g., md5/sha1) of the
original "%s_%s_%s" % (table_name, attribute_name, index_type), then build
index_name as prefix + "_" + hash_suffix such that len(index_name) <= limit
(choose prefix length = limit - 1 - len(hash_suffix)); update the code that
reads/writes index_name and ensure engine, table_name, attribute_name and
index_type are used to compute the hash to guarantee uniqueness across
similar-suffixed names.
- Update debug, info, warning, error, critical methods to accept
*args and **kwargs for deferred string formatting
- Update class-level logging methods (_class_debug, etc.) with
the same signature change
- Enables logging calls like self.debug("Event %s", name) where
formatting is skipped if the log level is not active
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
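The deferred-formatting pattern this commit describes can be sketched as follows (a simplified standalone illustration; the real Colony logging methods and levels differ):

```python
import logging

class LoggerMixin(object):
    def __init__(self, level=logging.INFO):
        self.level = level

    def is_enabled(self, level):
        # the level check happens before any string work
        return level >= self.level

    def debug(self, message, *args, **kwargs):
        # formatting is deferred: the % interpolation only runs
        # when the debug level is actually active, so inactive
        # calls pay no formatting cost
        if not self.is_enabled(logging.DEBUG):
            return None
        if args:
            message = message % args
        return message
```

With this shape, `self.debug("Event %s", name)` is cheap when debug logging is off, because the format string is never interpolated.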
Summary
- Concrete Table Inheritance (concrete_table) as an alternative to the existing Class Table Inheritance (class_table) strategy
- Opt-in via the inheritance = "concrete_table" class attribute, fully backward compatible
- Migration script (scripts/migrate_inheritance.py) for converting existing hierarchies between strategies

Schema design
For ConcreteRootEntity -> ConcretePerson -> ConcreteEmployee:

Core changes
- structures.py: Added inheritance attribute, get_inheritance_strategy(), is_concrete_table(), and get_all_items() to EntityClass
- system.py: Save/update/delete write to all ancestor tables; select reads from single table without joins; create builds flattened tables at every hierarchy level
- mocks.py: Added ConcreteRootEntity, ConcretePerson, ConcreteEmployee, ConcreteAddress test entities
- test.py: 77 tests covering base ORM, concrete table (including polymorphic queries), migration tooling, and backup
- scripts/migrate_inheritance.py: Migration script with backup, validation, dry-run, and transactional execution
- scripts/benchmark_inheritance.py: Performance benchmark comparing both strategies
- doc/design/002-concrete_table_inheritance.md: Design document (COP-002)

Test plan
- concrete_table through hierarchy
- get_all_items() returns flattened items including inherited fields

Closes #25
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Changed
Tests
Documentation
Chores