Add report_prototype: auto-generate report pages from semantic model #1155
KornAlexander wants to merge 1 commit into microsoft:main from
…tic model structure
Pull request overview
Adds a new report utility module intended to generate a schematic “report prototype” (SVG + Excalidraw JSON) from a Power BI report’s page/visual metadata, optionally embedding exported page screenshots.
Changes:
- Introduces generate_report_prototype(...) to analyze pages and navigation links and produce SVG + Excalidraw outputs.
- Adds optional parallel screenshot exporting and embedding as base64 PNGs in the Excalidraw output.
- Adds save_report_prototype(...) to persist the generated artifacts into the attached lakehouse.
    png_bytes = export_report(
        report=report,
        export_format="PNG",
        file_name=f"_prototype_{idx:02d}",
        page_name=pg["name"],
        workspace=workspace,
        _report_id=_resolved_report_id,
        _return_bytes=True,
    )
export_report is invoked with _report_id and _return_bytes, but sempy_labs.report.export_report() does not accept these parameters. As written, screenshots=True will raise TypeError before any export happens. Consider either extending export_report with a supported return_bytes/report_id option (and using public parameters only here), or calling the underlying API directly from this module without private kwargs.
    def _export_one(idx, pg):
        # Suppress all output inside each thread
        import sys as _tsys
        import io as _tio
        _t_old_stdout = _tsys.stdout
        _tsys.stdout = _tio.StringIO()
        try:
            import IPython.display as _tipd
            _tipd_orig = _tipd.display
            _tipd.display = lambda *a, **kw: None
        except Exception:
            _tipd = None
            _tipd_orig = None
        try:
            # Win 3: get PNG bytes directly, skip file I/O
            png_bytes = export_report(
                report=report,
                export_format="PNG",
                file_name=f"_prototype_{idx:02d}",
                page_name=pg["name"],
                workspace=workspace,
                _report_id=_resolved_report_id,
                _return_bytes=True,
            )
            if png_bytes:
                with _lock:
                    page_images[pg["name"]] = base64.b64encode(png_bytes).decode("ascii")
            else:
                with _lock:
                    export_errors.append(f"'{pg['display_name']}': empty response")
        except Exception as e:
            with _lock:
                export_errors.append(f"'{pg['display_name']}': {str(e)[:200]}")
        finally:
            _tsys.stdout = _t_old_stdout
            if _tipd and _tipd_orig:
                _tipd.display = _tipd_orig
            with _lock:
                _done_count[0] += 1
                done = _done_count[0]
            if on_progress:
                try:
                    on_progress(done, total_pages, pg["display_name"])
                except Exception:
                    pass

    # Redirect stdout AND suppress IPython.display to prevent notebook output overflow
    _real_stdout = sys.stdout
    sys.stdout = _io.StringIO()

    # Monkey-patch IPython display to a no-op during exports
    _ipd = None
    _ipd_orig = None
    _idf = None
    _idf_orig = None
    try:
        import IPython.display as _ipd_mod
        _ipd = _ipd_mod
        _ipd_orig = _ipd_mod.display
        _ipd_mod.display = lambda *a, **kw: None
    except Exception:
        pass
    try:
        import IPython.core.display_functions as _idf_mod
        _idf = _idf_mod
        _idf_orig = _idf_mod.display
        _idf_mod.display = lambda *a, **kw: None
    except Exception:
        pass

    try:
        if on_progress:
            on_progress(0, total_pages, "starting exports...")
        from concurrent.futures import ThreadPoolExecutor
        with ThreadPoolExecutor(max_workers=min(5, total_pages)) as pool:
            pool.map(lambda args: _export_one(*args), target_pages)
    finally:
        sys.stdout = _real_stdout
        if _ipd and _ipd_orig:
            _ipd.display = _ipd_orig
        if _idf and _idf_orig:
            _idf.display = _idf_orig
The screenshot export path monkey-patches sys.stdout and IPython.display.display globally while running a ThreadPoolExecutor. These globals are shared across threads, so concurrent calls can race and leak side effects to other notebook code (and threads can restore the wrong stdout/display). Prefer a thread-safe approach (e.g., avoid global monkey-patching; add a quiet mode to export_report/API calls, or use contextlib.redirect_stdout + locks around any global state changes).
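One way to read the suggestion above is to change process-wide state exactly once, in the calling thread, and keep the workers free of any patching. The following is a minimal sketch under that assumption, not the module's implementation: `noisy_export` and `export_quietly` are hypothetical names, and `noisy_export` merely stands in for the real export call.

```python
import contextlib
import io
import sys
from concurrent.futures import ThreadPoolExecutor


def noisy_export(page_name: str) -> bytes:
    # Hypothetical stand-in for the real export call; it prints like a chatty API.
    print(f"exporting {page_name}")
    return page_name.encode("utf-8")


def export_quietly(page_names: list[str], max_workers: int = 5) -> dict[str, bytes]:
    """Run exports in parallel while silencing stdout from a single place."""
    if not page_names:
        # Nothing to do; also avoids creating a pool with zero workers.
        return {}
    sink = io.StringIO()
    # redirect_stdout swaps sys.stdout process-wide, so it is entered once here,
    # in the calling thread, and restored once when the block exits. Workers
    # never save or restore sys.stdout themselves, so they cannot race on it.
    with contextlib.redirect_stdout(sink):
        with ThreadPoolExecutor(max_workers=min(max_workers, len(page_names))) as pool:
            results = list(pool.map(noisy_export, page_names))
    return dict(zip(page_names, results))
```

The redirect is still global while the block runs, so unrelated notebook output printed during the export window would also be swallowed; a quiet flag on the export function itself would avoid that trade-off.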
    if on_progress:
        on_progress(0, total_pages, "starting exports...")
    from concurrent.futures import ThreadPoolExecutor
    with ThreadPoolExecutor(max_workers=min(5, total_pages)) as pool:
        pool.map(lambda args: _export_one(*args), target_pages)
If include_hidden is false and all pages are hidden (or the report has zero pages after filtering), total_pages becomes 0 and ThreadPoolExecutor(max_workers=min(5, total_pages)) will be created with max_workers=0, raising ValueError. Add an early return/skip when total_pages == 0.
    - if on_progress:
    -     on_progress(0, total_pages, "starting exports...")
    - from concurrent.futures import ThreadPoolExecutor
    - with ThreadPoolExecutor(max_workers=min(5, total_pages)) as pool:
    -     pool.map(lambda args: _export_one(*args), target_pages)
    + if total_pages == 0:
    +     if on_progress:
    +         on_progress(0, total_pages, "no pages to export")
    + else:
    +     if on_progress:
    +         on_progress(0, total_pages, "starting exports...")
    +     from concurrent.futures import ThreadPoolExecutor
    +     with ThreadPoolExecutor(max_workers=min(5, total_pages)) as pool:
    +         pool.map(lambda args: _export_one(*args), target_pages)
    def generate_report_prototype(
        report: str,
        workspace: Optional[str | UUID] = None,
        screenshots: bool = False,
        include_hidden: bool = False,
        cols: int = 4,
        thumb_width: int = 480,
        thumb_height: int = 270,
        on_progress=None,
The public parameters cols, thumb_width, and thumb_height are passed into _build_diagram, but _build_diagram currently uses hard-coded IMG_W/IMG_H/COL_W/SLOT_GAP constants and does not use the provided values. This makes the API misleading (callers cannot actually control the output size/layout). Either wire these parameters into the layout calculations or remove them from the public signature/docstring.
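Wiring the parameters into the layout could look roughly like the sketch below. It is illustrative only: `grid_positions` and its `gap` argument are hypothetical and do not reflect how `_build_diagram` is actually structured; the point is that every coordinate is derived from `cols`, `thumb_width`, and `thumb_height` rather than from fixed constants.

```python
from typing import List, Tuple


def grid_positions(
    n_pages: int,
    cols: int = 4,
    thumb_width: int = 480,
    thumb_height: int = 270,
    gap: int = 40,  # hypothetical spacing between thumbnails
) -> List[Tuple[int, int]]:
    """Return the top-left (x, y) of each thumbnail in a row-major grid."""
    if cols < 1:
        raise ValueError("cols must be at least 1")
    positions = []
    for i in range(n_pages):
        # Row-major placement: fill a row of `cols` slots, then wrap.
        row, col = divmod(i, cols)
        positions.append((col * (thumb_width + gap), row * (thumb_height + gap)))
    return positions
```

With a helper of this shape, changing `cols` or the thumbnail size visibly changes the output, which is what the public signature promises.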
    )

    safe_name = report.replace(" ", "_").replace("/", "_")
    from sempy_labs._helper_functions import _mount
save_report_prototype writes into the attached lakehouse via _mount() but does not check lakehouse_attached() first (unlike other report functions). This can surface a less actionable exception when no lakehouse is attached. Add the standard lakehouse_attached() guard and raise a ValueError with the usual user-facing message.
    - from sempy_labs._helper_functions import _mount
    + from sempy_labs._helper_functions import _mount, lakehouse_attached
    + if not lakehouse_attached():
    +     raise ValueError(
    +         "A lakehouse must be attached to the notebook to save the report prototype."
    +     )
    safe_name = report.replace(" ", "_").replace("/", "_")
    from sempy_labs._helper_functions import _mount
    local_path = _mount()

    exc_path = f"{local_path}/Files/{safe_name}_prototype.excalidraw"
    with open(exc_path, "w", encoding="utf-8") as f:
        f.write(result["excalidraw"])
    print(f"\u2713 Saved {safe_name}_prototype.excalidraw")

    svg_path = f"{local_path}/Files/{safe_name}_prototype.svg"
    with open(svg_path, "w", encoding="utf-8") as f:
        f.write(result["svg"])
safe_name only replaces spaces and /. A user-supplied report name can still contain other path separators (e.g., .., \\) or characters that produce unintended file paths. Consider sanitizing more robustly (e.g., allowlist [A-Za-z0-9._-], replace everything else, and use os.path.join) to ensure the output is always written under Files/.
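The allowlist approach described above might be sketched as follows. `safe_output_path` is a hypothetical helper, not something in sempy_labs, and the fallback name "report" is an arbitrary choice made for illustration.

```python
import os
import re


def safe_output_path(base_dir: str, report: str, suffix: str) -> str:
    """Build a path under base_dir/Files from an untrusted report name."""
    # Allowlist: anything outside [A-Za-z0-9._-] becomes an underscore,
    # which removes both "/" and "\\" as path separators.
    safe_name = re.sub(r"[^A-Za-z0-9._-]", "_", report)
    # Strip leading dots so a name such as ".." cannot stand alone.
    safe_name = safe_name.lstrip(".") or "report"
    files_dir = os.path.abspath(os.path.join(base_dir, "Files"))
    path = os.path.abspath(os.path.join(files_dir, f"{safe_name}{suffix}"))
    # Defensive check: refuse anything that resolves outside Files/.
    if os.path.commonpath([files_dir, path]) != files_dir:
        raise ValueError(f"Unsafe output path for report name: {report!r}")
    return path
```

The final containment check is redundant once the allowlist has run, but it keeps the guarantee in place if the sanitizing rule is later relaxed.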
    def generate_report_prototype(
        report: str,
        workspace: Optional[str | UUID] = None,
        screenshots: bool = False,
        include_hidden: bool = False,
        cols: int = 4,
        thumb_width: int = 480,
        thumb_height: int = 270,
        on_progress=None,
    ) -> dict:
        """
This module introduces public functions but does not use the @log decorator pattern that is consistently applied across other sempy_labs.report top-level APIs. If logging/instrumentation is expected for report functions, decorate these public entrypoints as well for consistency.
    import base64
    import json
    import os
    import uuid
    from typing import Optional
    from uuid import UUID


    def generate_report_prototype(
        report: str,
        workspace: Optional[str | UUID] = None,
        screenshots: bool = False,
        include_hidden: bool = False,
        cols: int = 4,
        thumb_width: int = 480,
        thumb_height: int = 270,
        on_progress=None,
    ) -> dict:
        """
        Generates a visual prototype of a Power BI report as SVG + Excalidraw.

        Parameters
        ----------
        report : str
            Name of the Power BI report.
        workspace : str | uuid.UUID, default=None
            The Fabric workspace name or ID.
        screenshots : bool, default=False
            If True, exports each page as PNG via the Export API and embeds as images.
        include_hidden : bool, default=False
            If True, includes hidden pages in the diagram.
        cols : int, default=4
            Number of columns in the page grid layout.
        thumb_width : int, default=480
            Width of each page thumbnail in the diagram.
        thumb_height : int, default=270
            Height of each page thumbnail in the diagram.

        Returns
        -------
        dict
            A dictionary with keys:
            - "svg": str — the SVG diagram as a string
            - "excalidraw": str — the Excalidraw JSON as a string
            - "pages": list — page metadata
            - "screenshots": int — number of screenshots captured
            - "errors": list — export error messages
        """
        from sempy_labs.report import connect_report

        # Layout constants
        pad_x = 40
        pad_y = 60
        header_h = 30
        footer_h = 25
Unused imports/variables add noise: os is imported but not used, and layout constants pad_x/pad_y/header_h/footer_h are defined/passed but not used by _build_diagram. Removing these (or actually using them) will reduce confusion about which knobs affect the output.
    def generate_report_prototype(
        report: str,
        workspace: Optional[str | UUID] = None,
        screenshots: bool = False,
        include_hidden: bool = False,
        cols: int = 4,
        thumb_width: int = 480,
        thumb_height: int = 270,
        on_progress=None,
    ) -> dict:
PR description mentions updating src/sempy_labs/report/__init__.py to export generate_report_prototype, but the PR diff provided only adds this new module. As-is, users won’t be able to call labs.report.generate_report_prototype(...) via package exports; please ensure the report package exports are updated in this PR (or adjust the documented usage accordingly).
Report Prototype Generator
Auto-generates a visual prototype of a Power BI report as SVG and Excalidraw format. Analyzes the report page layout, visual types, field mappings, and page structure to produce a schematic diagram — useful for report reviews, documentation, and restructuring discussions.
Functions Added
generate_report_prototype(report, workspace=None, screenshots=False, include_hidden=False, cols=4, thumb_width=480, thumb_height=270, on_progress=None)
Files
src/sempy_labs/report/_report_prototype.py (new file)
src/sempy_labs/report/__init__.py (updated exports)
Usage
PBI Fixer Contribution — Overview
This PR is part of the PBI Fixer contribution to semantic-link-labs — an interactive ipywidgets-based UI for scanning and fixing Power BI reports and semantic models directly in Microsoft Fabric Notebooks.
The PBI Fixer provides a tabbed ipywidgets interface (Semantic Model Explorer, Report Explorer, Perspective Editor, Vertipaq Analyzer) that lets users interactively scan, inspect, and fix Power BI artifacts without leaving the notebook. All underlying fixer functions also work as standalone API calls, so users can integrate them into scripts and pipelines without the UI.
Contribution Structure
The full contribution (~17K lines across 68 files) is split into 22 focused PRs across 6 phases to keep each PR reviewable and self-contained. Only new files are added in Phases 1–4 and 6 — no existing SLL code is modified.
/)..Find() fixes and expression capture (tom/_model.py), Vertipaq analyzer enhancements with memory/column-level analysis (_vertipaq.py, ~1000 lines changed), and various small fixes across _items.py, _item_recovery.py, _helper_functions.py, _export_report.py, _sql.py, and admin/_tenant.py. These carry higher merge conflict risk and may need closer review or discussion.
Dependencies & Review Order
sempy_labs.report.fix_piecharts(...) or sempy_labs.semantic_model.add_calculated_calendar(...).