[flink] Support stream read Chain Table #8262

Open
yunfengzhou-hub wants to merge 1 commit into apache:master from yunfengzhou-hub:chain-table-streaming

@yunfengzhou-hub

Contributor

Purpose

Chain Table (chain-table.enabled=true) separates data into a snapshot branch (batch-imported full partitions) and a delta branch (incremental updates). Prior to this change, streaming read was not supported because the standard DataTableStreamScan is unaware of the two-branch architecture.

This PR introduces ChainTableFileStoreTable (a wrapper over FallbackReadFileStoreTable) and ChainTableStreamScan, which implements a two-phase streaming scan. Phase 1 performs a full load: it reads delta data pinned to the current snapshot and merges in snapshot-branch files for overlapping partitions. Phase 2 incrementally monitors only the delta branch and returns DataSplit(isStreaming=true) for changelog passthrough. Pinning the snapshot makes the Phase 1 / Phase 2 boundary deterministic: there is no overlap and no data loss, regardless of concurrent commits.
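
To illustrate why pinning a snapshot id gives a clean boundary between the two phases, here is a minimal toy model in plain Java. It is not the code in this PR and does not use any Paimon classes: `TwoPhaseScanSketch`, `DeltaCommit`, `fullLoad`, and `pollIncremental` are hypothetical names, each delta commit is reduced to a single key/value row, and "delta overrides snapshot on the same key" is an assumed merge rule chosen only for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: hypothetical names, not the Paimon ChainTableStreamScan API.
class TwoPhaseScanSketch {

    /** One delta-branch commit, simplified to a single key/value row. */
    static final class DeltaCommit {
        final long snapshotId;
        final String key;
        final String value;

        DeltaCommit(long snapshotId, String key, String value) {
            this.snapshotId = snapshotId;
            this.key = key;
            this.value = value;
        }
    }

    private final Map<String, String> snapshotBranch = new TreeMap<>();
    private final List<DeltaCommit> deltaBranch = new ArrayList<>();
    private Long lastConsumedId = null; // null until Phase 1 has run

    void putSnapshotRow(String key, String value) {
        snapshotBranch.put(key, value);
    }

    /** Appends a delta commit; ids are assumed to start at 1 and strictly increase. */
    void commitDelta(long snapshotId, String key, String value) {
        deltaBranch.add(new DeltaCommit(snapshotId, key, value));
    }

    /** Phase 1: pin the latest delta snapshot id, then merge both branches up to that pin. */
    Map<String, String> fullLoad() {
        long pinned = deltaBranch.isEmpty()
                ? 0L
                : deltaBranch.get(deltaBranch.size() - 1).snapshotId;
        Map<String, String> merged = new TreeMap<>(snapshotBranch);
        for (DeltaCommit c : deltaBranch) {
            if (c.snapshotId <= pinned) {
                merged.put(c.key, c.value); // assumed rule: delta wins on key conflict
            }
        }
        lastConsumedId = pinned;
        return merged;
    }

    /** Phase 2: return only delta commits newer than the last consumed id. */
    List<DeltaCommit> pollIncremental() {
        if (lastConsumedId == null) {
            throw new IllegalStateException("run fullLoad() first");
        }
        List<DeltaCommit> out = new ArrayList<>();
        for (DeltaCommit c : deltaBranch) {
            if (c.snapshotId > lastConsumedId) {
                out.add(c);
                lastConsumedId = c.snapshotId;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TwoPhaseScanSketch scan = new TwoPhaseScanSketch();
        scan.putSnapshotRow("k1", "s1");
        scan.putSnapshotRow("k2", "s2");
        scan.commitDelta(1L, "k2", "d2");
        System.out.println("full load: " + scan.fullLoad());
        scan.commitDelta(2L, "k3", "d3"); // arrives after the pin
        for (DeltaCommit c : scan.pollIncremental()) {
            System.out.println("incremental: " + c.key + "=" + c.value);
        }
    }
}
```

In this model a commit that lands after the pin is excluded from the full load and delivered exactly once by the incremental phase, which is the property the snapshot-pinning strategy is meant to provide.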

Tests

Added FlinkChainTableITCase with 16 tests (all passing, ~75s):

  • Full load with snapshot+delta overlap, empty delta, empty snapshot
  • Changelog passthrough (-U/+U) with changelog-producer=input
  • Snapshot OVERWRITE does not trigger streaming output
  • Stateless restart and stateful restart (MiniCluster checkpoint/restore)
  • WHERE predicate forwarding, withShard forwarding
  • scan.mode=latest bypass, changelog-producer=none rejection
  • restore(id, scanAll=true) and restore(null, scanAll=true) state reset
  • chain-partition-keys group partition streaming

@yunfengzhou-hub force-pushed the chain-table-streaming branch from 4ca1aab to 49a4f20 on June 17, 2026 08:34
@yunfengzhou-hub force-pushed the chain-table-streaming branch from 49a4f20 to f2dc523 on June 17, 2026 14:50
@yunfengzhou-hub marked this pull request as ready for review on June 18, 2026 01:42
@yunfengzhou-hub changed the title from "[POC][flink] Support stream read Chain Table" to "[flink] Support stream read Chain Table" on Jun 18, 2026