[FLINK-39412][cdc-common] Skip duplicate columns in AddColumnEvent to ensure idempotency #4370

Open
dangerousfeng wants to merge 1 commit into apache:master from dangerousfeng:fix-duplicate-add-column-event

@dangerousfeng

Summary

When recovering from a checkpoint/savepoint, binlog events may be replayed, causing AddColumnEvent to be applied for columns that already exist in the cached schema. This leads to duplicate field names in RowType, which throws:

java.lang.IllegalArgumentException: Field names must be unique. Found duplicates: [valid_date]
    at org.apache.flink.cdc.common.types.RowType.validateFields(RowType.java:158)
    at org.apache.flink.cdc.runtime.operators.transform.PreTransformOperator.processElement(...)

Root Cause

SchemaUtils.applyAddColumnEvent() blindly adds columns without checking if a column with the same name already exists. While isSchemaChangeEventRedundant() exists as a utility, PreTransformOperator.cacheChangeSchema() does not call it before applying schema changes.
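The failure mode can be reproduced in miniature. The following is a minimal sketch, assuming a plain list of column names in place of the real Schema and RowType classes; `DuplicateFieldSketch`, `blindAdd`, and `validateFields` are hypothetical names for illustration and are not the Flink CDC implementation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins only; these are not the Flink CDC classes.
final class DuplicateFieldSketch {

    /** Appends a column without checking whether the name already exists. */
    static List<String> blindAdd(List<String> columns, String added) {
        List<String> result = new ArrayList<>(columns);
        result.add(added);
        return result;
    }

    /** Rejects a field list that contains the same name more than once. */
    static void validateFields(List<String> fieldNames) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (String name : fieldNames) {
            if (!seen.add(name)) {
                duplicates.add(name);
            }
        }
        if (!duplicates.isEmpty()) {
            throw new IllegalArgumentException(
                    "Field names must be unique. Found duplicates: " + duplicates);
        }
    }

    public static void main(String[] args) {
        List<String> schema = new ArrayList<>();
        schema.add("id");
        // First application of the add-column change.
        schema = blindAdd(schema, "valid_date");
        // Replayed application after recovery: the name is appended a second time.
        schema = blindAdd(schema, "valid_date");
        try {
            validateFields(schema);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the second `blindAdd` call is what a replayed event amounts to: nothing fails at the point of the add, and the error only surfaces later when the field list is validated.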

Fix

  • Added an idempotency check in SchemaUtils.applyAddColumnEvent() to skip columns that already exist in the schema
  • This is the most defensive fix location, since it protects all callers of applySchemaChangeEvent(), not just PreTransformOperator
  • The existingColumnNames set is also maintained across iterations, so a single event that adds multiple columns is handled correctly
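The name-based guard described in the list above can be sketched as follows. This is a simplified illustration, assuming columns are plain strings and every addition is appended at the end; `IdempotentAddColumnSketch` and `applyAddColumns` are hypothetical names, not the code in SchemaUtils.java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the name-based guard; not the Flink CDC classes.
final class IdempotentAddColumnSketch {

    /** Returns a new column list with the added names appended, skipping names already present. */
    static List<String> applyAddColumns(List<String> existing, List<String> added) {
        List<String> columns = new ArrayList<>(existing);
        Set<String> existingColumnNames = new HashSet<>(existing);
        for (String name : added) {
            // Set.add returns false when the name is already present, so one call
            // both detects a duplicate and records a newly added column.
            if (!existingColumnNames.add(name)) {
                continue;
            }
            columns.add(name);
        }
        return columns;
    }

    public static void main(String[] args) {
        List<String> schema = Arrays.asList("col1", "col2");
        List<String> once = applyAddColumns(schema, Arrays.asList("col3"));
        // Replaying the same change leaves the schema unchanged.
        List<String> replayed = applyAddColumns(once, Arrays.asList("col3"));
        System.out.println(replayed);
    }
}
```

Applying the same change twice yields the same column list as applying it once, which is the idempotency property the fix is after.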

Changes

  • SchemaUtils.java: Added duplicate column name check before adding columns
  • SchemaUtilsTest.java: Added test cases for duplicate AddColumnEvent in both LAST and FIRST positions

Test plan

  • Added unit tests for duplicate AddColumnEvent with LAST position
  • Added unit tests for duplicate AddColumnEvent with FIRST position
  • All existing SchemaUtilsTest tests pass (5/5)

@github-actions github-actions bot added the common label Apr 9, 2026
@dangerousfeng dangerousfeng changed the title from "[FLINK-xxxxx][cdc-common] Skip duplicate columns in AddColumnEvent to ensure idempotency" to "[FLINK-39412][cdc-common] Skip duplicate columns in AddColumnEvent to ensure idempotency" Apr 9, 2026
@dangerousfeng dangerousfeng force-pushed the fix-duplicate-add-column-event branch from 6a7977f to 1f1b0f0 April 9, 2026 04:29
… ensure idempotency

When recovering from a checkpoint/savepoint, binlog events may be replayed,
causing AddColumnEvent to be applied for columns that already exist in the
cached schema. This leads to duplicate field names in RowType, which throws
IllegalArgumentException: "Field names must be unique".

This fix adds an idempotency check in SchemaUtils.applyAddColumnEvent() to
skip columns that already exist in the schema, preventing the duplicate
column error during failover recovery.
@dangerousfeng dangerousfeng force-pushed the fix-duplicate-add-column-event branch from 1f1b0f0 to 0ff420f April 9, 2026 05:58
@lvyanquan lvyanquan requested a review from Copilot April 9, 2026 10:07

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Improve schema-change replay safety by making AddColumnEvent application idempotent, preventing duplicate field names in recovered/replayed binlog scenarios.

Changes:

  • Skip adding columns whose names already exist when applying AddColumnEvent
  • Track existing column names across a single AddColumnEvent application
  • Add unit tests covering duplicate AddColumnEvent behavior for LAST and FIRST

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File: Description

  • flink-cdc-common/src/main/java/org/apache/flink/cdc/common/utils/SchemaUtils.java: Adds duplicate-name guard to make applyAddColumnEvent idempotent during replay/recovery.
  • flink-cdc-common/src/test/java/org/apache/flink/cdc/common/utils/SchemaUtilsTest.java: Adds regression tests ensuring duplicate adds are ignored for LAST and FIRST.

Comment on lines +124 to +133
Set<String> existingColumnNames =
        columns.stream()
                .map(Column::getName)
                .collect(Collectors.toCollection(HashSet::new));
for (AddColumnEvent.ColumnWithPosition columnWithPosition : event.getAddedColumns()) {
    // Skip adding the column if it already exists in the schema to ensure idempotency.
    // This can happen when schema change events are replayed after a failover recovery.
    if (existingColumnNames.contains(columnWithPosition.getAddColumn().getName())) {
        continue;
    }

Copilot AI Apr 9, 2026

Skipping duplicates purely by name can hide a real schema conflict (e.g., replay vs. a different upstream AddColumnEvent that reuses the same column name but with a different type/nullable/default/comment). To keep idempotency while avoiding silent corruption, consider: if the name exists, look up the existing Column and compare with columnWithPosition.getAddColumn(); only skip when they are equivalent, otherwise throw an informative exception.
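One way to read this suggestion is sketched below, assuming a minimal column model with only a name, a type string, and a nullability flag. `StrictAddColumnSketch`, `Col`, and `addColumn` are hypothetical names for illustration; the real Column class carries more attributes, and the exception type here is an arbitrary choice.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical model of "skip only when equivalent"; not the Flink CDC classes.
final class StrictAddColumnSketch {

    /** Minimal column: name, type, and nullability only. */
    static final class Col {
        final String name;
        final String type;
        final boolean nullable;

        Col(String name, String type, boolean nullable) {
            this.name = name;
            this.type = type;
            this.nullable = nullable;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Col)) {
                return false;
            }
            Col other = (Col) o;
            return nullable == other.nullable
                    && name.equals(other.name)
                    && type.equals(other.type);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, type, nullable);
        }

        @Override
        public String toString() {
            return name + " " + type + (nullable ? "" : " NOT NULL");
        }
    }

    /**
     * Adds the column and returns true, returns false for an equivalent replay,
     * and throws when the name is reused with a different definition.
     */
    static boolean addColumn(Map<String, Col> schema, Col incoming) {
        Col current = schema.get(incoming.name);
        if (current == null) {
            schema.put(incoming.name, incoming);
            return true;
        }
        if (current.equals(incoming)) {
            return false; // same definition: treat as a harmless replay
        }
        throw new IllegalStateException(
                "Column '" + incoming.name + "' already exists as [" + current
                        + "] but the event adds it as [" + incoming + "]");
    }

    public static void main(String[] args) {
        Map<String, Col> schema = new LinkedHashMap<>();
        System.out.println(addColumn(schema, new Col("col3", "STRING", true)));
        System.out.println(addColumn(schema, new Col("col3", "STRING", true)));
    }
}
```

The trade-off is that a strict comparison turns a silent skip into a hard failure whenever upstream definitions drift, so whether to throw or only log is a design decision for the PR author.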

Comment on lines +163 to +171
// add duplicate column should be ignored (idempotency)
addedColumns = new ArrayList<>();
addedColumns.add(
        new AddColumnEvent.ColumnWithPosition(
                Column.physicalColumn("col3", DataTypes.STRING()),
                AddColumnEvent.ColumnPosition.LAST,
                null));
addColumnEvent = new AddColumnEvent(tableId, addedColumns);
schema = SchemaUtils.applySchemaChangeEvent(schema, addColumnEvent);

Copilot AI Apr 9, 2026

Current tests cover replay of a single-column AddColumnEvent. Since the implementation now keeps existingColumnNames across iterations, add a unit test where a single AddColumnEvent contains the same column name twice (e.g., two ColumnWithPosition entries for col3), and assert only one column is added. This directly validates the new intra-event dedup behavior.
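The requested scenario can be modelled without the Flink CDC classes. The sketch below assumes a simplified event made of name and position pairs; `IntraEventDedupSketch`, `Added`, and `apply` are hypothetical names, and only the FIRST and LAST positions are represented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of one event that lists the same column name twice.
final class IntraEventDedupSketch {

    enum Position { FIRST, LAST }

    /** Stand-in for a column name paired with its target position. */
    static final class Added {
        final String name;
        final Position position;

        Added(String name, Position position) {
            this.name = name;
            this.position = position;
        }
    }

    static List<String> apply(List<String> existing, List<Added> event) {
        List<String> columns = new ArrayList<>(existing);
        Set<String> seen = new HashSet<>(existing);
        for (Added added : event) {
            // Recording each accepted name in the set is what lets a later entry
            // in the same event be recognised as a duplicate.
            if (!seen.add(added.name)) {
                continue;
            }
            if (added.position == Position.FIRST) {
                columns.add(0, added.name);
            } else {
                columns.add(added.name);
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        List<Added> event = Arrays.asList(
                new Added("col3", Position.LAST),
                new Added("col3", Position.FIRST));
        List<String> result = apply(Arrays.asList("col1", "col2"), event);
        if (!result.equals(Arrays.asList("col1", "col2", "col3"))) {
            throw new AssertionError("expected col3 to be added once, got " + result);
        }
        System.out.println(result);
    }
}
```

If the set were built once and never updated inside the loop, the second col3 entry would pass the check and be inserted again, so a test of this shape exercises exactly the intra-event path.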
