Add report_prototype: auto-generate report pages from semantic model #1155

Open
KornAlexander wants to merge 1 commit into microsoft:main from KornAlexander:feature/report-prototype

Conversation

@KornAlexander

Report Prototype Generator

Auto-generates a visual prototype of a Power BI report in SVG and Excalidraw formats. Analyzes the report's page layout, visual types, field mappings, and page structure to produce a schematic diagram that is useful for report reviews, documentation, and restructuring discussions.

Functions Added

Function | Description
generate_report_prototype(report, workspace=None, screenshots=False, include_hidden=False, cols=4, thumb_width=480, thumb_height=270, on_progress=None) | Generates a visual prototype of a Power BI report as SVG + Excalidraw.

Files

  • src/sempy_labs/report/_report_prototype.py (new file)
  • src/sempy_labs/report/__init__.py (updated exports)

Usage

import sempy_labs as labs

# Generate a report prototype
labs.report.generate_report_prototype("My Report")

# Include hidden pages and use wider thumbnails
labs.report.generate_report_prototype("My Report", include_hidden=True, thumb_width=640)
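
The function's docstring in this PR documents the return value as a dictionary with the keys "svg", "excalidraw", "pages", "screenshots", and "errors". A minimal sketch of how a caller might inspect such a result, assuming the keys are present as documented. summarize_prototype is a hypothetical helper (not part of the PR), and the sample dictionary is placeholder data rather than real output:

```python
def summarize_prototype(result: dict) -> str:
    """Build a one-line summary from the documented result keys."""
    n_pages = len(result.get("pages", []))
    n_shots = result.get("screenshots", 0)
    errors = result.get("errors", [])
    summary = f"{n_pages} page(s), {n_shots} screenshot(s), {len(errors)} error(s)"
    if errors:
        # Surface the first export error so it is not silently lost.
        summary += "; first error: " + errors[0]
    return summary


# Placeholder data shaped like the documented return value (an assumption).
sample = {
    "svg": "<svg/>",
    "excalidraw": "{}",
    "pages": [{"name": "p1"}, {"name": "p2"}],
    "screenshots": 0,
    "errors": [],
}
print(summarize_prototype(sample))
```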

PBI Fixer Contribution — Overview

This PR is part of the PBI Fixer contribution to semantic-link-labs — an interactive ipywidgets-based UI for scanning and fixing Power BI reports and semantic models directly in Microsoft Fabric Notebooks.

The PBI Fixer provides a tabbed ipywidgets interface (Semantic Model Explorer, Report Explorer, Perspective Editor, Vertipaq Analyzer) that lets users interactively scan, inspect, and fix Power BI artifacts without leaving the notebook. All underlying fixer functions also work as standalone API calls, so users can integrate them into scripts and pipelines without the UI.

Contribution Structure

The full contribution (~17K lines across 68 files) is split into 22 focused PRs across 6 phases to keep each PR reviewable and self-contained. Only new files are added in Phases 1–4 and 6 — no existing SLL code is modified.

Phase | Focus | PRs | Description
1 | Report Fixers | 7 | Standalone functions that programmatically fix common Power BI report issues: replace pie charts with bar charts, standardize page sizes to Full HD, apply chart formatting best practices, migrate slicers to slicerbars, hide visual-level filters, clean up unused custom visuals, align visuals, migrate report-level measures, and upgrade reports to PBIR format. Each function operates on PBIR-format report definitions.
2 | Semantic Model Fixers | 4 | Functions that fix and enhance semantic models via XMLA/TOM: add calculated calendar and measure tables, add calculation groups for units and time intelligence, discourage implicit measures, and 19 BPA auto-fixers covering formatting conventions, naming standards, data types, column visibility, sort order, and DAX patterns (e.g., use DIVIDE instead of /).
3 | SM Setup & Analysis | 3 | Setup utilities: configure cache warming queries, set up incremental refresh policies, and prepare semantic models for AI/Copilot integration (descriptions, metadata enrichment).
4 | Report Utilities | 3 | Report-level utilities: auto-generate report page prototypes from a semantic model's structure, extract and apply report themes, and generate IBCS-compliant variance charts.
5 | Upstream Enhancements | 3 | ⚠️ These PRs modify existing SLL code (unlike Phases 1–4, which only add new files). Changes include TOM model .Find() fixes and expression capture (tom/_model.py), Vertipaq analyzer enhancements with memory/column-level analysis (_vertipaq.py, ~1000 lines changed), and various small fixes across _items.py, _item_recovery.py, _helper_functions.py, _export_report.py, _sql.py, and admin/_tenant.py. These carry higher merge conflict risk and may need closer review or discussion.
6 | PBI Fixer UI | 2 | The interactive UI layer: shared UI components (theme, icons, tree builders, layout helpers), BPA scan runners, report helpers, and the main PBI Fixer application with its tabbed interface (SM Explorer, Report Explorer, Perspective Editor, Vertipaq Analyzer). Depends on Phases 1–5 but uses lazy imports to degrade gracefully if individual fixers aren't yet merged.

Dependencies & Review Order

  • Phases 1–4 are fully independent — they only add new files and can be reviewed/merged in any order.
  • Phase 5 is also independent but modifies existing code, so it may benefit from early discussion.
  • Phase 6 (the UI) ties everything together. It depends on the earlier phases but works standalone via lazy imports.
  • All fixer functions work without the UI — they can be called directly as sempy_labs.report.fix_piecharts(...) or sempy_labs.semantic_model.add_calculated_calendar(...).
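
The lazy-import behavior mentioned for Phase 6 could look roughly like the sketch below. This is an illustration only, assuming the UI resolves each fixer at call time and treats a missing module as "feature unavailable". optional_callable is a hypothetical helper name and not something taken from the PR:

```python
import importlib
from typing import Callable, Optional


def optional_callable(module_name: str, attr: str) -> Optional[Callable]:
    """Return module_name.attr if it can be imported, otherwise None."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module is not installed or not yet merged: degrade gracefully.
        return None
    return getattr(module, attr, None)


# A stdlib module stands in for an available fixer; the second name is
# deliberately nonexistent to show the fallback path.
sqrt = optional_callable("math", "sqrt")
missing = optional_callable("not_a_real_package_xyz", "fix_piecharts")
print(sqrt is not None, missing is None)
```

A UI built this way can disable the button for any fixer that resolves to None instead of failing at import time.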

Copilot AI review requested due to automatic review settings April 9, 2026 16:27
Contributor

Copilot AI left a comment

Pull request overview

Adds a new report utility module intended to generate a schematic “report prototype” (SVG + Excalidraw JSON) from a Power BI report’s page/visual metadata, optionally embedding exported page screenshots.

Changes:

  • Introduces generate_report_prototype(...) to analyze pages and navigation links and produce SVG + Excalidraw outputs.
  • Adds optional parallel screenshot exporting and embedding as base64 PNGs in the Excalidraw output.
  • Adds save_report_prototype(...) to persist the generated artifacts into the attached lakehouse.

Comment on lines +246 to +254
png_bytes = export_report(
    report=report,
    export_format="PNG",
    file_name=f"_prototype_{idx:02d}",
    page_name=pg["name"],
    workspace=workspace,
    _report_id=_resolved_report_id,
    _return_bytes=True,
)

Copilot AI Apr 9, 2026

export_report is invoked with _report_id and _return_bytes, but sempy_labs.report.export_report() does not accept these parameters. As written, screenshots=True will raise TypeError before any export happens. Consider either extending export_report with a supported return_bytes/report_id option (and using public parameters only here), or calling the underlying API directly from this module without private kwargs.
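
One way to catch this kind of mismatch early is to compare the keyword arguments against the callee's signature before calling it. The sketch below uses the standard inspect module. fake_export is a stand-in with made-up parameters and is NOT the real export_report signature, and unsupported_kwargs is a hypothetical helper:

```python
import inspect


def unsupported_kwargs(func, **kwargs) -> list:
    """Return the keyword names that func's signature would reject."""
    params = inspect.signature(func).parameters
    # A **kwargs catch-all accepts any keyword name.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    accepted = {
        name
        for name, p in params.items()
        if p.kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                      inspect.Parameter.KEYWORD_ONLY)
    }
    return sorted(k for k in kwargs if k not in accepted)


# Stand-in with public parameters only (an assumption for illustration).
def fake_export(report, export_format, page_name=None, workspace=None):
    return None


rejected = unsupported_kwargs(
    fake_export,
    report="My Report",
    export_format="PNG",
    _report_id="x",
    _return_bytes=True,
)
print(rejected)  # ['_report_id', '_return_bytes']
```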

Copilot uses AI. Check for mistakes.
Comment on lines +231 to +313
def _export_one(idx, pg):
    # Suppress all output inside each thread
    import sys as _tsys
    import io as _tio
    _t_old_stdout = _tsys.stdout
    _tsys.stdout = _tio.StringIO()
    try:
        import IPython.display as _tipd
        _tipd_orig = _tipd.display
        _tipd.display = lambda *a, **kw: None
    except Exception:
        _tipd = None
        _tipd_orig = None
    try:
        # Win 3: get PNG bytes directly, skip file I/O
        png_bytes = export_report(
            report=report,
            export_format="PNG",
            file_name=f"_prototype_{idx:02d}",
            page_name=pg["name"],
            workspace=workspace,
            _report_id=_resolved_report_id,
            _return_bytes=True,
        )
        if png_bytes:
            with _lock:
                page_images[pg["name"]] = base64.b64encode(png_bytes).decode("ascii")
        else:
            with _lock:
                export_errors.append(f"'{pg['display_name']}': empty response")
    except Exception as e:
        with _lock:
            export_errors.append(f"'{pg['display_name']}': {str(e)[:200]}")
    finally:
        _tsys.stdout = _t_old_stdout
        if _tipd and _tipd_orig:
            _tipd.display = _tipd_orig
        with _lock:
            _done_count[0] += 1
            done = _done_count[0]
        if on_progress:
            try:
                on_progress(done, total_pages, pg["display_name"])
            except Exception:
                pass

# Redirect stdout AND suppress IPython.display to prevent notebook output overflow
_real_stdout = sys.stdout
sys.stdout = _io.StringIO()

# Monkey-patch IPython display to a no-op during exports
_ipd = None
_ipd_orig = None
_idf = None
_idf_orig = None
try:
    import IPython.display as _ipd_mod
    _ipd = _ipd_mod
    _ipd_orig = _ipd_mod.display
    _ipd_mod.display = lambda *a, **kw: None
except Exception:
    pass
try:
    import IPython.core.display_functions as _idf_mod
    _idf = _idf_mod
    _idf_orig = _idf_mod.display
    _idf_mod.display = lambda *a, **kw: None
except Exception:
    pass

try:
    if on_progress:
        on_progress(0, total_pages, "starting exports...")
    from concurrent.futures import ThreadPoolExecutor
    with ThreadPoolExecutor(max_workers=min(5, total_pages)) as pool:
        pool.map(lambda args: _export_one(*args), target_pages)
finally:
    sys.stdout = _real_stdout
    if _ipd and _ipd_orig:
        _ipd.display = _ipd_orig
    if _idf and _idf_orig:
        _idf.display = _idf_orig

Copilot AI Apr 9, 2026

The screenshot export path monkey-patches sys.stdout and IPython.display.display globally while running a ThreadPoolExecutor. These globals are shared across threads, so concurrent calls can race and leak side effects to other notebook code (and threads can restore the wrong stdout/display). Prefer a thread-safe approach (e.g., avoid global monkey-patching; add a quiet mode to export_report/API calls, or use contextlib.redirect_stdout + locks around any global state changes).
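
A minimal sketch of the contextlib.redirect_stdout option, assuming the redirect is applied once by the calling thread around the whole pool rather than inside each worker. noisy_export is a hypothetical stand-in for a chatty export call. Note that redirect_stdout still swaps the process-wide sys.stdout, so unrelated threads printing during this window would be captured too; it only removes the per-thread save/restore race:

```python
import contextlib
import io
import sys
from concurrent.futures import ThreadPoolExecutor


def noisy_export(page: str) -> str:
    """Stand-in for an export call that prints progress (hypothetical)."""
    print(f"exporting {page}")
    return page.upper()


def run_quietly(pages):
    """Run workers with stdout redirected once, by the calling thread only."""
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        # max(1, ...) keeps the pool valid even for an empty page list.
        with ThreadPoolExecutor(max_workers=max(1, min(5, len(pages)))) as pool:
            results = list(pool.map(noisy_export, pages))
    # sys.stdout is restored exactly once, after every worker has finished.
    return results, captured.getvalue()


before = sys.stdout
results, log_text = run_quietly(["a", "b", "c"])
print(results)
```

Because no worker touches sys.stdout itself, no thread can restore a stale value over another thread's redirect.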

Comment on lines +302 to +306
if on_progress:
    on_progress(0, total_pages, "starting exports...")
from concurrent.futures import ThreadPoolExecutor
with ThreadPoolExecutor(max_workers=min(5, total_pages)) as pool:
    pool.map(lambda args: _export_one(*args), target_pages)

Copilot AI Apr 9, 2026

If include_hidden is false and all pages are hidden (or the report has zero pages after filtering), total_pages becomes 0 and ThreadPoolExecutor(max_workers=min(5, total_pages)) will be created with max_workers=0, raising ValueError. Add an early return/skip when total_pages == 0.

Suggested change
- if on_progress:
-     on_progress(0, total_pages, "starting exports...")
- from concurrent.futures import ThreadPoolExecutor
- with ThreadPoolExecutor(max_workers=min(5, total_pages)) as pool:
-     pool.map(lambda args: _export_one(*args), target_pages)
+ if total_pages == 0:
+     if on_progress:
+         on_progress(0, total_pages, "no pages to export")
+ else:
+     if on_progress:
+         on_progress(0, total_pages, "starting exports...")
+     from concurrent.futures import ThreadPoolExecutor
+     with ThreadPoolExecutor(max_workers=min(5, total_pages)) as pool:
+         pool.map(lambda args: _export_one(*args), target_pages)
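
The failure mode can be reproduced in isolation. The sketch below first confirms that a zero-worker pool is rejected, then shows an early-return guard in a small wrapper. export_all and its worker argument are hypothetical names used only for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

# ThreadPoolExecutor refuses to start with zero workers.
try:
    ThreadPoolExecutor(max_workers=0)
    zero_workers_ok = True
except ValueError:
    zero_workers_ok = False


def export_all(pages, worker):
    """Run worker over pages in a small pool, skipping the pool if empty."""
    if not pages:
        return []
    with ThreadPoolExecutor(max_workers=min(5, len(pages))) as pool:
        # list(...) consumes the iterator so worker exceptions surface here.
        return list(pool.map(worker, pages))


print(zero_workers_ok, export_all([], str.upper), export_all(["a"], str.upper))
```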

Comment on lines +12 to +20
def generate_report_prototype(
    report: str,
    workspace: Optional[str | UUID] = None,
    screenshots: bool = False,
    include_hidden: bool = False,
    cols: int = 4,
    thumb_width: int = 480,
    thumb_height: int = 270,
    on_progress=None,

Copilot AI Apr 9, 2026

The public parameters cols, thumb_width, and thumb_height are passed into _build_diagram, but _build_diagram currently uses hard-coded IMG_W/IMG_H/COL_W/SLOT_GAP constants and does not use the provided values. This makes the API misleading (callers cannot actually control the output size/layout). Either wire these parameters into the layout calculations or remove them from the public signature/docstring.
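
Wiring the parameters into the layout could be as simple as deriving each thumbnail's position from them. The following is a sketch under assumptions, not the PR's _build_diagram: grid_positions and the gap parameter are hypothetical, and the real diagram likely needs extra space for headers and connectors:

```python
from typing import List, Tuple


def grid_positions(
    n_pages: int,
    cols: int = 4,
    thumb_width: int = 480,
    thumb_height: int = 270,
    gap: int = 40,
) -> List[Tuple[int, int]]:
    """Return the top-left (x, y) of each thumbnail in a row-major grid."""
    if cols < 1:
        raise ValueError("cols must be at least 1")
    positions = []
    for i in range(n_pages):
        row, col = divmod(i, cols)
        positions.append((col * (thumb_width + gap), row * (thumb_height + gap)))
    return positions


print(grid_positions(5, cols=2, thumb_width=100, thumb_height=50, gap=10))
# [(0, 0), (110, 0), (0, 60), (110, 60), (0, 120)]
```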

)

safe_name = report.replace(" ", "_").replace("/", "_")
from sempy_labs._helper_functions import _mount

Copilot AI Apr 9, 2026

save_report_prototype writes into the attached lakehouse via _mount() but does not check lakehouse_attached() first (unlike other report functions). This can surface a less actionable exception when no lakehouse is attached. Add the standard lakehouse_attached() guard and raise a ValueError with the usual user-facing message.

Suggested change
- from sempy_labs._helper_functions import _mount
+ from sempy_labs._helper_functions import _mount, lakehouse_attached
+ if not lakehouse_attached():
+     raise ValueError(
+         "A lakehouse must be attached to the notebook to save the report prototype."
+     )

Comment on lines +372 to +383
safe_name = report.replace(" ", "_").replace("/", "_")
from sempy_labs._helper_functions import _mount
local_path = _mount()

exc_path = f"{local_path}/Files/{safe_name}_prototype.excalidraw"
with open(exc_path, "w", encoding="utf-8") as f:
    f.write(result["excalidraw"])
print(f"\u2713 Saved {safe_name}_prototype.excalidraw")

svg_path = f"{local_path}/Files/{safe_name}_prototype.svg"
with open(svg_path, "w", encoding="utf-8") as f:
    f.write(result["svg"])

Copilot AI Apr 9, 2026

safe_name only replaces spaces and /. A user-supplied report name can still contain other path separators (e.g., .., \\) or characters that produce unintended file paths. Consider sanitizing more robustly (e.g., allowlist [A-Za-z0-9._-], replace everything else, and use os.path.join) to ensure the output is always written under Files/.
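
A sketch of the allowlist approach described above, using only the standard library. safe_output_path is a hypothetical helper, the "report" fallback name is an assumption, and the base directory in the usage line is a placeholder rather than a real mount path:

```python
import os
import re


def safe_output_path(base_dir: str, report_name: str, suffix: str) -> str:
    """Build a path under base_dir/Files from an untrusted report name."""
    # Allowlist: anything outside [A-Za-z0-9._-] becomes an underscore.
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", report_name).strip(".")
    if not safe:
        safe = "report"  # fallback for names like ".." (an assumption)
    root = os.path.normpath(os.path.join(base_dir, "Files"))
    path = os.path.normpath(os.path.join(root, safe + suffix))
    # Defensive check: the result must still live under Files/.
    if os.path.commonpath([root, path]) != root:
        raise ValueError(f"Refusing to write outside {root!r}")
    return path


print(os.path.basename(safe_output_path("/lake", "../../etc/passwd", "_prototype.svg")))
# _.._etc_passwd_prototype.svg
```

Since path separators never survive the substitution, the commonpath check is a second line of defense rather than the primary guard.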

Comment on lines +12 to +22
def generate_report_prototype(
    report: str,
    workspace: Optional[str | UUID] = None,
    screenshots: bool = False,
    include_hidden: bool = False,
    cols: int = 4,
    thumb_width: int = 480,
    thumb_height: int = 270,
    on_progress=None,
) -> dict:
    """

Copilot AI Apr 9, 2026

This module introduces public functions but does not use the @log decorator pattern that is consistently applied across other sempy_labs.report top-level APIs. If logging/instrumentation is expected for report functions, decorate these public entrypoints as well for consistency.

Comment on lines +4 to +59
import base64
import json
import os
import uuid
from typing import Optional
from uuid import UUID


def generate_report_prototype(
    report: str,
    workspace: Optional[str | UUID] = None,
    screenshots: bool = False,
    include_hidden: bool = False,
    cols: int = 4,
    thumb_width: int = 480,
    thumb_height: int = 270,
    on_progress=None,
) -> dict:
    """
    Generates a visual prototype of a Power BI report as SVG + Excalidraw.

    Parameters
    ----------
    report : str
        Name of the Power BI report.
    workspace : str | uuid.UUID, default=None
        The Fabric workspace name or ID.
    screenshots : bool, default=False
        If True, exports each page as PNG via the Export API and embeds as images.
    include_hidden : bool, default=False
        If True, includes hidden pages in the diagram.
    cols : int, default=4
        Number of columns in the page grid layout.
    thumb_width : int, default=480
        Width of each page thumbnail in the diagram.
    thumb_height : int, default=270
        Height of each page thumbnail in the diagram.

    Returns
    -------
    dict
        A dictionary with keys:
        - "svg": str — the SVG diagram as a string
        - "excalidraw": str — the Excalidraw JSON as a string
        - "pages": list — page metadata
        - "screenshots": int — number of screenshots captured
        - "errors": list — export error messages
    """
    from sempy_labs.report import connect_report

    # Layout constants
    pad_x = 40
    pad_y = 60
    header_h = 30
    footer_h = 25

Copilot AI Apr 9, 2026

Unused imports/variables add noise: os is imported but not used, and layout constants pad_x/pad_y/header_h/footer_h are defined/passed but not used by _build_diagram. Removing these (or actually using them) will reduce confusion about which knobs affect the output.

Comment on lines +12 to +21
def generate_report_prototype(
    report: str,
    workspace: Optional[str | UUID] = None,
    screenshots: bool = False,
    include_hidden: bool = False,
    cols: int = 4,
    thumb_width: int = 480,
    thumb_height: int = 270,
    on_progress=None,
) -> dict:

Copilot AI Apr 9, 2026

PR description mentions updating src/sempy_labs/report/__init__.py to export generate_report_prototype, but the PR diff provided only adds this new module. As-is, users won’t be able to call labs.report.generate_report_prototype(...) via package exports; please ensure the report package exports are updated in this PR (or adjust the documented usage accordingly).
