[Enhancement] Add BASE_TABLE_REFRESH_VERSION_TIMES column to information_schema.materialized_views #74717
Youngwb wants to merge 1 commit into
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 03d210e527
  "query_rewrite_status_reason",
- "last_freshness_confirmed_at"
+ "last_freshness_confirmed_at",
+ "base_table_refresh_version_times"
Update the SHOW MV rewrite expectation
When this column is appended to the rewritten SHOW MATERIALIZED VIEWS ... WHERE select list, AstToStringBuilder now emits information_schema.materialized_views.base_table_refresh_version_times AS base_table_refresh_version_times before the FROM clause. The existing assertion in fe/fe-core/src/test/java/com/starrocks/analysis/ShowMaterializedViewTest.java still expects the string to end at last_freshness_confirmed_at, so that FE unit test will fail until the expected rewrite is updated.
Fixed in 6bfeb8a — updated the expected rewrite string in ShowMaterializedViewTest.testNormal to append information_schema.materialized_views.base_table_refresh_version_times AS base_table_refresh_version_times before the FROM clause, matching the AstToStringBuilder output verbatim. (This Codex review ran on the earlier commit 03d210e5, prior to the fix.)
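The effect described in this thread can be sketched as follows: when a column name is appended to the rewrite column list, the rendered select list gains one more "table.column AS column" item before FROM, so an assertion on the old string no longer matches. This is a minimal illustration, not StarRocks code. The class ShowMvRewriteSketch, its methods, and the simplified "SELECT ... FROM ..." shape are assumptions; the real AstToStringBuilder output is not reproduced here.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative only; names and output shape are hypothetical, not StarRocks code. */
final class ShowMvRewriteSketch {
    static final String TABLE = "information_schema.materialized_views";

    // Renders "<table>.<col> AS <col>" for each column, comma-separated.
    static String selectList(List<String> columns) {
        return columns.stream()
                .map(c -> TABLE + "." + c + " AS " + c)
                .collect(Collectors.joining(", "));
    }

    // Simplified stand-in for the SHOW ... WHERE -> SELECT rewrite.
    static String rewrite(List<String> columns) {
        return "SELECT " + selectList(columns) + " FROM " + TABLE;
    }

    public static void main(String[] args) {
        List<String> before = List.of(
                "query_rewrite_status_reason",
                "last_freshness_confirmed_at");
        List<String> after = List.of(
                "query_rewrite_status_reason",
                "last_freshness_confirmed_at",
                "base_table_refresh_version_times");
        // The second string has one extra select item before FROM,
        // so a test that expects the first string must be updated.
        System.out.println(rewrite(before));
        System.out.println(rewrite(after));
    }
}
```

Any test that compares the full rewritten string has to change whenever the column list grows, which is why the expected value in ShowMaterializedViewTest needed the extra item.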
…ion_schema.materialized_views
Surface a per-base-table data version time as a new column on
information_schema.materialized_views and SHOW MATERIALIZED VIEWS, as a
JSON object mapping each base table's catalog.db.table name to the latest
data version time observed for it.
This complements LAST_REFRESH_TIME, which is the single maximum over all
base tables; the new column is the per-table detail behind that maximum.
Only external/data lake base tables report a time, because their
AsyncRefreshContext partition entries carry the source modified time. OLAP
(internal) base partitions store only a version id (lastRefreshTime == -1,
see BasePartitionInfo.fromOlapTable) and are omitted. The value is "{}"
when no base table reports a time, e.g. an OLAP-only materialized view.
Wired through every column surface: the information_schema table
definition, the SHOW result header, the SHOW ... WHERE -> SELECT rewrite
column list, TMaterializedViewStatus thrift field 39, ShowMaterializedViewStatus
(thrift + result row), and the BE schema scanner.
Signed-off-by: Youngwb <yangwenbo_mailbox@163.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
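The omission rule in the commit message can be sketched roughly as below: take the latest time per base table, drop tables whose only value is -1, and emit "{}" when nothing remains. This is a hedged illustration, not the code in this PR. The class BaseTableVersionTimesSketch, the epoch-millisecond unit, the UTC "yyyy-MM-dd HH:mm:ss" rendering, and the sample table names are all assumptions made for the example.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only; class and method names are hypothetical, not StarRocks code. */
final class BaseTableVersionTimesSketch {
    // Assumption: times are epoch milliseconds and are rendered in UTC.
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    /** Maps each base table's qualified name to its partitions' lastRefreshTime values. */
    static String toJson(Map<String, List<Long>> partitionTimes) {
        // Keep the latest time per table; sorted keys make the output deterministic.
        Map<String, Long> latest = new TreeMap<>();
        for (Map.Entry<String, List<Long>> e : partitionTimes.entrySet()) {
            long max = -1L;
            for (long t : e.getValue()) {
                max = Math.max(max, t);
            }
            if (max >= 0) { // -1 means "version id only" (OLAP), so the table is omitted
                latest.put(e.getKey(), max);
            }
        }
        StringBuilder sb = new StringBuilder("{");
        for (Map.Entry<String, Long> e : latest.entrySet()) {
            if (sb.length() > 1) {
                sb.append(',');
            }
            sb.append('"').append(escape(e.getKey())).append("\":\"")
              .append(FMT.format(Instant.ofEpochMilli(e.getValue()))).append('"');
        }
        return sb.append('}').toString();
    }

    // Minimal JSON string escaping for backslashes and double quotes.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        Map<String, List<Long>> input = Map.of(
                "hive_catalog.sales.orders", List.of(1704067200000L, 1704060000000L),
                "default_catalog.sales.dim_region", List.of(-1L));
        System.out.println(toJson(input));    // only the external table appears
        System.out.println(toJson(Map.of())); // "{}" when no table reports a time
    }
}
```

With these sample inputs the first call yields {"hive_catalog.sales.orders":"2024-01-01 00:00:00"} and the second yields {}. The actual time format of the column is not stated in this PR text, so the rendering here should be read as a placeholder.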
03d210e to 6bfeb8a
Why I'm doing:
information_schema.materialized_views and SHOW MATERIALIZED VIEWS already expose LAST_REFRESH_TIME, but only as a single maximum over all base tables. When a materialized view has multiple base tables, operators cannot see which base table the reported data version time comes from, nor the per-table breakdown. This makes it hard to reason about staleness on a per-base-table basis for external/data-lake-backed materialized views.

What I'm doing:
Add a new column BASE_TABLE_REFRESH_VERSION_TIMES (varchar) to information_schema.materialized_views and SHOW MATERIALIZED VIEWS. It is a JSON object mapping each base table's catalog.database.table name to the latest data version time observed for it: the per-table detail behind LAST_REFRESH_TIME (their single maximum).

Only external/data lake base tables report a time, because their AsyncRefreshContext partition entries carry the source modified time. OLAP (internal) base partitions store only a version id (lastRefreshTime == -1, see BasePartitionInfo.fromOlapTable) and are omitted. The value is {} when no base table reports a time (e.g. an OLAP-only materialized view).

The column is wired through every existing column surface: the information_schema table definition, the SHOW result header, the SHOW ... WHERE → SELECT rewrite column list, TMaterializedViewStatus thrift field 39, ShowMaterializedViewStatus (thrift + result row), and the BE schema scanner.

Fixes #issue: N/A (observability enhancement, no tracked issue).
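On the consumer side, the per-table breakdown makes it possible to see which base table accounts for the maximum and how far the others trail it. The sketch below assumes the column's JSON has already been parsed into a map of table name to epoch milliseconds. The class RefreshTimeBreakdownSketch, its methods, and the sample table names are hypothetical and not part of StarRocks.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Illustrative only; operates on an already-parsed table -> time map. */
final class RefreshTimeBreakdownSketch {
    // The base table with the greatest time; empty when the column value was "{}".
    static Optional<Map.Entry<String, Long>> newest(Map<String, Long> perTable) {
        return perTable.entrySet().stream().max(Map.Entry.comparingByValue());
    }

    // How far each base table's time trails the newest one, in the same unit as the input.
    static Map<String, Long> lagBehindNewest(Map<String, Long> perTable) {
        Map<String, Long> lag = new TreeMap<>();
        newest(perTable).ifPresent(top ->
                perTable.forEach((table, time) -> lag.put(table, top.getValue() - time)));
        return lag;
    }

    public static void main(String[] args) {
        // Hypothetical sample values (epoch milliseconds).
        Map<String, Long> perTable = Map.of(
                "hive_catalog.sales.orders", 1704067200000L,
                "iceberg_catalog.sales.items", 1704063600000L);
        newest(perTable).ifPresent(e ->
                System.out.println("newest: " + e.getKey() + " @ " + e.getValue()));
        System.out.println("lag (ms): " + lagBehindNewest(perTable));
    }
}
```

For an OLAP-only materialized view the map is empty, so newest returns an empty Optional and lagBehindNewest returns an empty map.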