Fix TOM model: .Find() fixes and expression capture#1158
Fix TOM model: .Find() fixes and expression capture#1158KornAlexander wants to merge 1 commit intomicrosoft:mainfrom
Conversation
…tition source expressions
There was a problem hiding this comment.
Pull request overview
This PR modifies the TOM model wrapper (src/sempy_labs/tom/_model.py), primarily affecting how Vertipaq statistics are persisted as annotations and how incremental refresh rewrites M partition expressions.
Changes:
- Reduced the set of Vertipaq annotations written to tables/columns/relationships.
- Updated incremental refresh logic to rewrite M partition expressions via
p.Source.Expressionusing a regex-based insertion of a new step.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| row = dfT_filt.iloc[0] | ||
| rowCount = str(row["Row Count"]) | ||
| totalSize = str(row["Total Size"]) | ||
| dict_size = str(row["Dictionary Size"]) | ||
| data_size = str(row["Data Size"]) | ||
| hierarchy_size = str(row["Hierarchy Size"]) | ||
| user_hierarchy_size = str(row["User Hierarchy Size"]) | ||
| relationship_size = str(row["Relationship Size"]) | ||
| pct_db = str(row["% DB"]) | ||
|
|
||
| self.set_annotation(object=t, name="Vertipaq_RowCount", value=rowCount) | ||
| self.set_annotation( | ||
| object=t, name="Vertipaq_TotalSize", value=totalSize | ||
| ) |
There was a problem hiding this comment.
set_vertipaq_annotations() no longer sets several table-level Vertipaq annotations (e.g., Vertipaq_DataSize, Vertipaq_DictionarySize, Vertipaq_HierarchySize, Vertipaq_RelationshipSize, Vertipaq_%DB). semantic_model/_vertipaq_analyzer.py reads these annotations for its output, so this change will regress analyzer results (values become None/0). Please restore these annotations here or update the analyzer to not depend on them.
| row = dfC_filt.iloc[0] | ||
| totalSize = str(row["Total Size"]) | ||
| dataSize = str(row["Data Size"]) | ||
| dictSize = str(row["Dictionary Size"]) | ||
| hierSize = str(row["Hierarchy Size"]) | ||
| is_resident = str(row["Is Resident"]) | ||
| temp = str(row["Temperature"]) | ||
| card = str(row["Column Cardinality"]) | ||
| last_accessed = str(row["Last Accessed"]) | ||
| self.set_annotation( |
There was a problem hiding this comment.
set_vertipaq_annotations() stopped writing column annotations Vertipaq_IsResident, Vertipaq_Temperature, and Vertipaq_LastAccessed, but semantic_model/_vertipaq_analyzer.py still expects these when building its column stats. This will lead to missing/incorrect analyzer fields. Either re-add these annotations (if the source data still exists) or adjust the analyzer to handle their absence explicitly.
| for r in self.model.Relationships: | ||
| dfR_filt = dfR[dfR["Relationship Name"] == r.Name] | ||
| if not dfR_filt.empty: | ||
| relSize = str(dfR_filt["Used Size"].iloc[0]) | ||
| mult = str(dfR_filt["Multiplicity"].iloc[0]) | ||
| self.set_annotation(object=r, name="Vertipaq_UsedSize", value=relSize) |
There was a problem hiding this comment.
Relationship annotation Vertipaq_Multiplicity is no longer set here, but semantic_model/_vertipaq_analyzer.py reads it when generating relationship diagnostics. Please keep setting this annotation (or update the analyzer to not rely on it) to avoid regressing analyzer output.
| # Find the last "in <identifier>" block | ||
| match = re.search(r"in\s+(\S.*?)$", text, re.DOTALL) | ||
| if not match: | ||
| raise ValueError(f"{icons.red_dot} Invalid M-partition expression.") |
There was a problem hiding this comment.
The regex used to locate the final in ... clause (re.search(r"in\s+(\S.*?)$", ...)) will match the first occurrence of in in the expression, not the last, which can corrupt the rewritten M query if in appears earlier (common in strings/identifiers). Consider matching the final in keyword explicitly (e.g., anchor to line start and/or use a greedy prefix / last match via finditer) so the inserted step is placed correctly.
| # Update partition expression | ||
| i = 0 | ||
| for p in t.Partitions: | ||
| if p.SourceType != TOM.PartitionSourceType.M: | ||
| raise ValueError( | ||
| f"{icons.red_dot} Invalid partition source type. Incremental refresh can only be set up if the table's partition is an M-partition." | ||
| ) | ||
|
|
||
| elif i == 0: | ||
| text = p.Expression | ||
| text = text.rstrip() | ||
| text = p.Source.Expression.rstrip() |
There was a problem hiding this comment.
PR description mentions “.Find() method fixes” and “expression capture”, but the diff in this file only changes Vertipaq annotation writing and incremental refresh M-expression rewriting; there are no .Find(...) call sites or new expression-capture surfaces in tom/_model.py. Please update the PR description to reflect the actual changes, or include the missing intended changes.
TOM Model Fixes
Fixes and improvements to the TOM (Tabular Object Model) wrapper class in
tom/_model.py. These changes were needed to support reliable model inspection and modification by the PBI Fixer UI and SM fixers.Changes (~67 lines)
.Find()method fixes: Uses.Find()instead of bracket indexing for safer collection lookups that don't throw on missing items.Files
src/sempy_labs/tom/_model.py(modified)Why These Changes Are Needed
The SM fixers and PBI Fixer UI rely on safely inspecting model objects (tables, columns, measures, relationships). The upstream
.Find()pattern is more robust than bracket indexing when items may not exist. Expression capture is needed for the SM Explorer to display partition expressions and table definitions.PBI Fixer Contribution — Overview
This PR is part of the PBI Fixer contribution to semantic-link-labs — an interactive ipywidgets-based UI for scanning and fixing Power BI reports and semantic models directly in Microsoft Fabric Notebooks.
The PBI Fixer provides a tabbed ipywidgets interface (Semantic Model Explorer, Report Explorer, Perspective Editor, Vertipaq Analyzer) that lets users interactively scan, inspect, and fix Power BI artifacts without leaving the notebook. All underlying fixer functions also work as standalone API calls, so users can integrate them into scripts and pipelines without the UI.
Contribution Structure
The full contribution (~17K lines across 68 files) is split into 22 focused PRs across 6 phases to keep each PR reviewable and self-contained. Only new files are added in Phases 1–4 and 6 — no existing SLL code is modified.
/)..Find()fixes and expression capture (tom/_model.py), Vertipaq analyzer enhancements with memory/column-level analysis (_vertipaq.py, ~1000 lines changed), and various small fixes across_items.py,_item_recovery.py,_helper_functions.py,_export_report.py,_sql.py, andadmin/_tenant.py. These carry higher merge conflict risk and may need closer review or discussion.Dependencies & Review Order
sempy_labs.report.fix_piecharts(...)orsempy_labs.semantic_model.add_calculated_calendar(...).