
[CORE] Fix UnsupportedOperationException in input_file_name() with broadcast join build-side LocalRelation #12272

Merged
taiyang-li merged 1 commit into apache:main from taiyang-li:aime/1781072978-fix-input-file-name-bhj-localrelation on Jun 12, 2026


@taiyang-li taiyang-li commented Jun 10, 2026


Problem

The newly added unit test failed with:

java.lang.UnsupportedOperationException: WholeStageCodegen (3) does not implement doExecuteBroadcast
  at org.apache.spark.sql.errors.QueryExecutionErrors$.doExecuteBroadcastNotImplementedError(QueryExecutionErrors.scala:1942)
  at org.apache.spark.sql.execution.SparkPlan.doExecuteBroadcast(SparkPlan.scala:312)
  ...
  at org.apache.gluten.execution.RowToVeloxColumnarExec.doExecuteBroadcast(RowToVeloxColumnarExec.scala:76)
  at org.apache.spark.sql.execution.ColumnarInputAdapter.doExecuteBroadcast(ColumnarCollapseTransformStages.scala:207)
  at org.apache.spark.sql.execution.InputIteratorTransformer.doExecuteBroadcast(ColumnarCollapseTransformStages.scala:72)
  ...
  at org.apache.gluten.execution.VeloxBroadcastNestedLoopJoinExecTransformer.columnarInputRDDs(VeloxBroadcastNestedLoopJoinExecTransformer.scala:47)

Root cause

PushDownInputFileExpression.PreOffload originally injects Project [..., input_file_name() AS attr#N] above every LeafExecNode (with a fallback tag).

When the broadcast build side is a non-file leaf (e.g. LocalTableScanExec from a constant subquery), the rule still wraps that leaf with a fallback ProjectExec. This forces vanilla Spark to wrap the subtree in WholeStageCodegen for the broadcast side. Down the road, VeloxBroadcastNestedLoopJoinExecTransformer.columnarInputRDDs -> InputIteratorTransformer.doExecuteBroadcast -> RowToVeloxColumnarExec.doExecuteBroadcast -> child.executeBroadcast ends up calling WholeStageCodegenExec.doExecuteBroadcast, which is not implemented, hence UnsupportedOperationException.

In short: the rule was injecting input_file_name() on leaves that cannot produce a real file name in the first place (they have no InputFileBlockHolder context), and the resulting fallback Project changes the operator shape on the broadcast side so the broadcast pipeline can no longer be executed.
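The failure mode can be illustrated with a small toy model. This is only a sketch: `Op`, `BroadcastOp`, `ForwardingOp`, `CodegenOp` and `tryBroadcast` are hypothetical names, not Spark or Gluten classes, and the exact position of the codegen wrapper in the real plan is an assumption. It shows how a chain of operators that forward the broadcast call fails as soon as one operator in the chain keeps the throwing default.

```scala
// Toy operators for illustration only; NOT Spark's or Gluten's classes.
abstract class Op(val name: String) {
  // Default behaviour, modelled on the message in the stack trace:
  // operators that do not override this method throw.
  def executeBroadcast(): String =
    throw new UnsupportedOperationException(
      s"$name does not implement doExecuteBroadcast")
}

// An operator that can actually produce a broadcast value.
final class BroadcastOp extends Op("BroadcastExchange") {
  override def executeBroadcast(): String = "broadcast-relation"
}

// An operator that simply forwards the call to its child.
final class ForwardingOp(label: String, child: Op) extends Op(label) {
  override def executeBroadcast(): String = child.executeBroadcast()
}

// A wrapper that keeps the throwing default (assumed shape of the
// codegen wrapper introduced by the fallback Project).
final class CodegenOp(val child: Op) extends Op("WholeStageCodegen (3)")

object BroadcastChainSketch {
  // Runs the broadcast call and reports either the value or the error message.
  def tryBroadcast(root: Op): Either[String, String] =
    try Right(root.executeBroadcast())
    catch { case e: UnsupportedOperationException => Left(e.getMessage) }
}
```

With the wrapper absent, the forwarding chain reaches an operator that can broadcast; with the wrapper inserted, the same chain stops at the operator that never overrode the method.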

Fix

Limit the rewrite to leaves that can actually populate InputFileBlockHolder:

  • FileSourceScanExec
  • v2 BatchScanExec
  • Hive table scan (HiveTableScanExecTransformer.isHiveTableScan)
  • BatchScanExecTransformerBase (already special-cased in community for Iceberg etc.)

In addition, the ProjectExec match in PreOffload now requires that the subtree actually contains at least one such file-aware source via the new hasInputFileRelatedSource helper. Other leaves (LocalTableScanExec, RangeExec, RDDScanExec, ...) are left untouched, so the broadcast build side keeps its original shape and the doExecuteBroadcast path no longer terminates at WholeStageCodegen.
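The whitelist idea can be sketched on a toy plan tree. Everything below is an illustrative assumption rather than the code in this PR: `Plan`, `FileScan`, `LocalTable`, `RangeLeaf`, `Project`, `Join`, `isFileAwareLeaf` and `injectFileName` are hypothetical, and only the name `hasInputFileRelatedSource` is taken from the description above (its body here is a guess at the intent).

```scala
// Toy plan model for illustration only; NOT Spark's or Gluten's classes.
sealed trait Plan { def children: Seq[Plan] }
final case class FileScan(path: String) extends Plan { val children: Seq[Plan] = Nil }
final case class LocalTable(rows: Int) extends Plan { val children: Seq[Plan] = Nil }
final case class RangeLeaf(n: Long) extends Plan { val children: Seq[Plan] = Nil }
final case class Project(exprs: Seq[String], child: Plan) extends Plan {
  val children: Seq[Plan] = Seq(child)
}
final case class Join(left: Plan, right: Plan) extends Plan {
  val children: Seq[Plan] = Seq(left, right)
}

object InputFileSketch {
  val FileNameExpr = "input_file_name() AS fname"

  // Assumption: only file scans can supply a real file name in this toy model.
  def isFileAwareLeaf(p: Plan): Boolean = p match {
    case _: FileScan => true
    case _           => false
  }

  // True if the subtree contains at least one file-aware leaf.
  def hasInputFileRelatedSource(p: Plan): Boolean =
    isFileAwareLeaf(p) || p.children.exists(hasInputFileRelatedSource)

  // Wraps a leaf in a Project only when shouldWrap accepts it.
  def injectFileName(p: Plan, shouldWrap: Plan => Boolean): Plan = p match {
    case leaf if leaf.children.isEmpty =>
      if (shouldWrap(leaf)) Project(Seq(FileNameExpr), leaf) else leaf
    case Project(exprs, child) => Project(exprs, injectFileName(child, shouldWrap))
    case Join(l, r) => Join(injectFileName(l, shouldWrap), injectFileName(r, shouldWrap))
    case other => other
  }
}
```

Passing `_.children.isEmpty` as `shouldWrap` mimics the old behaviour of wrapping every leaf, while passing `isFileAwareLeaf` leaves `LocalTable` and `RangeLeaf` untouched, so the non-file side of a join keeps its original shape.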

As a nice side effect, this also avoids producing a fake empty input_file_name attribute on non-file leaves, which previously could collide with a real file scan's ExprId on the other side of the join and silently return an empty file name.

Test

Added the test "input_file_name() with BHJ build-side LocalRelation must return real path" to ScalarFunctionsValidateSuite in backends-velox. It reproduces the failing scenario (a parquet table broadcast-joined with a literal subquery, projecting input_file_name()) and asserts:

  • The query runs without UnsupportedOperationException.
  • Result matches vanilla Spark via compareResultsAgainstVanillaSpark.
  • Each row's fname is non-empty and contains the real parquet path.

The existing test("input_file_name") (file/Hive scan paths) is unchanged because those scans remain in the whitelist.

github-actions (bot) added the CORE (works for Gluten Core) and VELOX labels Jun 10, 2026
@github-actions


Run Gluten Clickhouse CI on x86

@taiyang-li taiyang-li changed the title [GLUTEN] Fix wrong input_file_name() for BHJ build-side LocalRelation [CORE] Fix wrong input_file_name() for BHJ build-side LocalRelation Jun 10, 2026

@taiyang-li taiyang-li changed the title [CORE] Fix wrong input_file_name() for BHJ build-side LocalRelation [WIP][CORE] Fix wrong input_file_name() for BHJ build-side LocalRelation Jun 10, 2026
@taiyang-li taiyang-li changed the title [WIP][CORE] Fix wrong input_file_name() for BHJ build-side LocalRelation [GLUTEN] Fix UnsupportedOperationException in input_file_name() with broadcast join build-side LocalRelation Jun 10, 2026
@taiyang-li taiyang-li force-pushed the aime/1781072978-fix-input-file-name-bhj-localrelation branch from 82dcdc4 to a9fc169 Compare June 10, 2026 06:49
@taiyang-li taiyang-li requested a review from WangGuangxin June 10, 2026 06:49

@taiyang-li taiyang-li changed the title [GLUTEN] Fix UnsupportedOperationException in input_file_name() with broadcast join build-side LocalRelation [CORE] Fix UnsupportedOperationException in input_file_name() with broadcast join build-side LocalRelation Jun 10, 2026
@taiyang-li taiyang-li merged commit 99f8383 into apache:main Jun 12, 2026
117 of 119 checks passed