
[Enhancement] Add BASE_TABLE_REFRESH_VERSION_TIMES column to information_schema.materialized_views #74717

Open: Youngwb wants to merge 1 commit into StarRocks:main from Youngwb:mv-obs-base-table-version-times

@Youngwb Youngwb commented Jun 12, 2026

Why I'm doing:

information_schema.materialized_views and SHOW MATERIALIZED VIEWS already expose LAST_REFRESH_TIME, but only as a single maximum over all base tables. When a materialized view has multiple base tables, operators cannot see which base table the reported data version time comes from, nor the per-table breakdown. This makes it hard to reason about staleness on a per-base-table basis for external/data-lake-backed materialized views.

What I'm doing:

Add a new column BASE_TABLE_REFRESH_VERSION_TIMES (varchar) to information_schema.materialized_views and SHOW MATERIALIZED VIEWS. It is a JSON object mapping each base table's catalog.database.table name to the latest data version time observed for that table. LAST_REFRESH_TIME is the single maximum of these values; the new column is the per-table detail behind it.
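As a rough illustration of how an operator might consume the new column, the sketch below parses a made-up column value and reports which base table carries the newest and the oldest data version time. The table names, the `YYYY-MM-DD HH:MM:SS` timestamp format, and the helper names are assumptions for illustration only; the PR text does not specify how the times are rendered.

```python
import json
from datetime import datetime

# Hypothetical column value: table names and timestamp format are assumptions.
raw = (
    '{"hive_catalog.sales.orders": "2026-06-11 08:00:00", '
    '"hive_catalog.sales.customers": "2026-06-12 01:30:00"}'
)


def parse_version_times(value):
    """Parse the JSON object into {base table name: datetime}."""
    return {
        name: datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        for name, ts in json.loads(value).items()
    }


def newest_and_oldest(times):
    """Return (newest, oldest) base table names, or (None, None) for '{}'."""
    if not times:
        return None, None
    return max(times, key=times.get), min(times, key=times.get)


times = parse_version_times(raw)
newest, oldest = newest_and_oldest(times)
print("newest:", newest)
print("oldest:", oldest)
```

Under these assumptions, the newest entry corresponds to what LAST_REFRESH_TIME already reports, while the oldest entry points at the base table most likely to be the source of staleness.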

Only external/data lake base tables report a time, because their AsyncRefreshContext partition entries carry the source modified time. OLAP (internal) base partitions store only a version id (lastRefreshTime == -1, see BasePartitionInfo.fromOlapTable) and are omitted. The value is {} when no base table reports a time (e.g. an OLAP-only materialized view).
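The omission rule above can be sketched as follows. This is not the FE implementation (which is Java); it is a minimal Python model that assumes each base table exposes a list of per-partition refresh times as plain integers, uses `-1` as the "no time recorded" sentinel described above, and keeps the raw integer in the output. The function name and input shape are hypothetical.

```python
import json

NO_REFRESH_TIME = -1  # sentinel described in the PR for OLAP base partitions


def base_table_refresh_version_times(partitions_by_table):
    """Build the JSON value from {"catalog.db.table": [partition times]}.

    Tables whose partitions all carry the sentinel are omitted; for the
    rest, the latest (maximum) partition time is reported.
    """
    result = {}
    for table, partition_times in partitions_by_table.items():
        reported = [t for t in partition_times if t != NO_REFRESH_TIME]
        if reported:
            result[table] = max(reported)
    return json.dumps(result, sort_keys=True)


# Mixed MV: one external table with times, one OLAP table without.
mixed = base_table_refresh_version_times({
    "hive_catalog.db.events": [100, 300, NO_REFRESH_TIME],
    "default_catalog.db.dim": [NO_REFRESH_TIME],
})

# OLAP-only MV: no base table reports a time.
olap_only = base_table_refresh_version_times({
    "default_catalog.db.dim": [NO_REFRESH_TIME, NO_REFRESH_TIME],
})
```

In this model the OLAP-only case serializes to an empty JSON object, matching the `{}` value described above.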

The column is wired through every existing column surface: the information_schema table definition, the SHOW result header, the SHOW ... WHERE → SELECT rewrite column list, TMaterializedViewStatus thrift field 39, ShowMaterializedViewStatus (thrift + result row), and the BE schema scanner.

Fixes #issue: N/A — observability enhancement, no tracked issue.

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
    • This pr needs auto generate documentation
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 4.1
    • 4.0
    • 3.5

@github-actions github-actions Bot added the documentation Documentation changes label Jun 12, 2026
@wanpengfei-git wanpengfei-git requested a review from a team June 12, 2026 02:59
@CelerData-Reviewer

@codex review

@github-actions github-actions Bot requested review from HangyuanLiu and wyb June 12, 2026 03:02

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 03d210e527

  "query_rewrite_status_reason",
- "last_freshness_confirmed_at"
+ "last_freshness_confirmed_at",
+ "base_table_refresh_version_times"

[P1] Update the SHOW MV rewrite expectation

When this column is appended to the rewritten SHOW MATERIALIZED VIEWS ... WHERE select list, AstToStringBuilder now emits information_schema.materialized_views.base_table_refresh_version_times AS base_table_refresh_version_times before the FROM clause. The existing assertion in fe/fe-core/src/test/java/com/starrocks/analysis/ShowMaterializedViewTest.java still expects the string to end at last_freshness_confirmed_at, so that FE unit test will fail until the expected rewrite is updated.

@Youngwb (Contributor, Author) replied:

Fixed in 6bfeb8a — updated the expected rewrite string in ShowMaterializedViewTest.testNormal to append information_schema.materialized_views.base_table_refresh_version_times AS base_table_refresh_version_times before the FROM clause, matching the AstToStringBuilder output verbatim. (This Codex review ran on the earlier commit 03d210e5, prior to the fix.)

github-actions Bot commented Jun 12, 2026

🌎 Translation Required?

All translation files are up to date.
Great job! No translation actions are required for this PR.

🕒 Last updated: Fri, 12 Jun 2026 03:37:52 GMT

…ion_schema.materialized_views

Surface a per-base-table data version time as a new column on
information_schema.materialized_views and SHOW MATERIALIZED VIEWS, as a
JSON object mapping each base table's catalog.db.table name to the latest
data version time observed for it.

This complements LAST_REFRESH_TIME, which is the single maximum over all
base tables; the new column is the per-table detail behind that maximum.
Only external/data lake base tables report a time, because their
AsyncRefreshContext partition entries carry the source modified time. OLAP
(internal) base partitions store only a version id (lastRefreshTime == -1,
see BasePartitionInfo.fromOlapTable) and are omitted. The value is "{}"
when no base table reports a time, e.g. an OLAP-only materialized view.

Wired through every column surface: the information_schema table
definition, the SHOW result header, the SHOW ... WHERE -> SELECT rewrite
column list, TMaterializedViewStatus thrift field 39, ShowMaterializedViewStatus
(thrift + result row), and the BE schema scanner.

Signed-off-by: Youngwb <yangwenbo_mailbox@163.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@Youngwb Youngwb force-pushed the mv-obs-base-table-version-times branch from 03d210e to 6bfeb8a Compare June 12, 2026 03:29

Labels

documentation (Documentation changes), PROTO-REVIEW

3 participants