Preserve logical cast field semantics during physical lowering with field-aware CastExpr #20836

Merged
kosiew merged 9 commits into apache:main from kosiew:cast-02-20164
Apr 9, 2026

@kosiew kosiew commented Mar 10, 2026

Which issue does this PR close?


Rationale for this change

The current physical planning path for Expr::Cast discards logical field information (name, nullability, and metadata) by lowering casts using only the target DataType. This results in a loss of semantic fidelity between logical and physical plans, particularly for metadata-bearing fields and same-type casts with explicit field intent.

Additionally, the planner previously rejected casts with metadata due to limitations of the type-only casting API, creating inconsistencies with other parts of the system (e.g. adapter-generated expressions).

This change introduces a field-aware casting path that preserves logical intent throughout physical lowering, ensuring consistent semantics across planner and adapter outputs.
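
To make the difference concrete, here is a minimal, self-contained sketch that contrasts the two lowering styles. It is a toy model only: `TargetField`, `lower_type_only`, `lower_field_aware`, and `example_target` are hypothetical names invented for illustration and are not DataFusion or Arrow APIs, and the defaults chosen for the re-synthesized field (empty name, nullable, no metadata) are an assumption.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for an Arrow field (not the real `arrow` type).
#[derive(Debug, Clone, PartialEq)]
pub struct TargetField {
    pub name: String,
    pub data_type: String,
    pub nullable: bool,
    pub metadata: BTreeMap<String, String>,
}

/// Type-only lowering: only `data_type` is carried over, so the name,
/// nullability, and metadata are re-synthesized with assumed defaults.
pub fn lower_type_only(target: &TargetField) -> TargetField {
    TargetField {
        name: String::new(),
        data_type: target.data_type.clone(),
        nullable: true,
        metadata: BTreeMap::new(),
    }
}

/// Field-aware lowering: the full target field is carried through unchanged.
pub fn lower_field_aware(target: &TargetField) -> TargetField {
    target.clone()
}

/// Example target: a non-nullable field with one (hypothetical) metadata entry.
pub fn example_target() -> TargetField {
    let mut metadata = BTreeMap::new();
    metadata.insert("unit".to_string(), "ms".to_string());
    TargetField {
        name: "elapsed".to_string(),
        data_type: "Int64".to_string(),
        nullable: false,
        metadata,
    }
}
```

In this model, `lower_type_only(&example_target())` keeps `Int64` but drops the `unit` entry and the non-nullable flag, while `lower_field_aware` returns a field equal to the target.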


What changes are included in this PR?

  • Introduced cast_with_target_field to construct CastExpr using full FieldRef semantics (name, nullability, metadata).
  • Refactored existing cast_with_options to delegate to the new field-aware helper.
  • Moved is_default_target_field to a shared helper function for reuse.
  • Updated planner (planner.rs) to use cast_with_target_field instead of type-only casting.
  • Removed metadata rejection logic during cast lowering.
  • Ensured same-type casts preserve explicit field semantics unless the target field is default.
  • Adjusted cast construction to validate compatibility before building expressions.
  • Exported cast_with_target_field for internal planner use.
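
The control flow described in the list above (validate first, elide only default same-type casts, otherwise build a field-carrying cast) can be sketched roughly as follows. This is a simplified model, not the code in cast.rs: the types, the `Opaque` variant, the `can_cast` rule, and the exact definition of a "default" target field (empty name, nullable, no metadata) are all assumptions made for illustration, and the real helper's signature may differ.

```rust
use std::collections::BTreeMap;

/// Toy data types; `Opaque` is a hypothetical type that cannot be cast.
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int32,
    Int64,
    Utf8,
    Opaque,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: DataType,
    pub nullable: bool,
    pub metadata: BTreeMap<String, String>,
}

impl Field {
    /// A synthesized field that carries nothing beyond its data type.
    pub fn default_for(data_type: DataType) -> Self {
        Field { name: String::new(), data_type, nullable: true, metadata: BTreeMap::new() }
    }
}

#[derive(Debug, Clone, PartialEq)]
pub enum PhysExpr {
    Column(String),
    Cast { input: Box<PhysExpr>, target: Field },
}

/// Assumed predicate: the field adds no semantics beyond the type itself.
pub fn is_default_target_field(field: &Field) -> bool {
    field.name.is_empty() && field.nullable && field.metadata.is_empty()
}

/// Toy compatibility rule: identical types, or any pair not involving `Opaque`.
pub fn can_cast(from: &DataType, to: &DataType) -> bool {
    from == to || (*from != DataType::Opaque && *to != DataType::Opaque)
}

/// Sketch of a field-aware cast constructor.
pub fn cast_with_target_field(
    input: PhysExpr,
    input_type: &DataType,
    target: Field,
) -> Result<PhysExpr, String> {
    // Validate compatibility before building anything.
    if !can_cast(input_type, &target.data_type) {
        return Err(format!("unsupported cast from {:?} to {:?}", input_type, target.data_type));
    }
    // A same-type cast to a default field carries no intent, so it is elided.
    if *input_type == target.data_type && is_default_target_field(&target) {
        return Ok(input);
    }
    // Otherwise keep the cast together with the full target field.
    Ok(PhysExpr::Cast { input: Box::new(input), target })
}
```

Under these assumptions, casting an `Int64` column to a default `Int64` field returns the column itself, while the same cast with a metadata entry on the target yields a `PhysExpr::Cast` that still holds that metadata.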

Are these changes tested?

Yes.

Added planner-focused unit tests to validate:

  • Preservation of target field metadata during cast lowering
  • Correct propagation of nullability semantics
  • Proper handling of same-type casts with explicit field overrides
  • No regression for standard type-only casts
  • Rejection behavior for unsupported extension type casts via TryCast

These tests ensure both backward compatibility and correctness of the new semantics.


Are there any user-facing changes?

Yes, behaviorally (but not API-breaking):

  • Cast expressions now preserve logical field metadata and nullability in physical plans.
  • Previously rejected metadata-bearing casts are now supported.
  • Same-type casts may now produce a CastExpr when explicit field semantics are provided.

There are no breaking changes to public APIs, but downstream consumers that relied on previous planner behavior (e.g. metadata stripping or cast elision) may observe differences.
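
As an illustration of the kind of downstream adjustment this might require, the sketch below shows a consumer that looks through cast wrappers instead of assuming a same-type cast has been elided. The `Expr` enum and `underlying_column` helper are hypothetical and deliberately minimal; real consumers working with physical expressions would need the equivalent logic against the actual expression types.

```rust
/// Toy physical expression tree (not the DataFusion `PhysicalExpr` trait).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Literal(i64),
    Cast { input: Box<Expr>, target_type: String },
}

/// Returns the column name at the bottom of any chain of casts, if there is one.
/// Code that only matched `Expr::Column` directly would miss a column that is
/// now wrapped in a semantics-preserving cast.
pub fn underlying_column(expr: &Expr) -> Option<&str> {
    match expr {
        Expr::Column(name) => Some(name.as_str()),
        Expr::Cast { input, .. } => underlying_column(input),
        Expr::Literal(_) => None,
    }
}
```

For example, `underlying_column` returns `Some("a")` both for a bare column `a` and for that column wrapped in one or more casts, and `None` for a cast over a literal.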


LLM-generated code disclosure

This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.

@github-actions github-actions bot added the "physical-expr" label (Changes to the physical-expr crates) Mar 10, 2026
@kosiew kosiew changed the title from "Preserve target field metadata and nullability when lowering logical CASTs" to "Preserve logical field semantics during physical cast lowering" Mar 10, 2026
@kosiew kosiew force-pushed the cast-02-20164 branch 2 times, most recently from 1058b83 to df5486c, March 10, 2026 07:22
@kosiew kosiew changed the title from "Preserve logical field semantics during physical cast lowering" to "Preserve logical cast field semantics during physical lowering and schema rewrite" Mar 10, 2026
@kosiew kosiew marked this pull request as ready for review March 10, 2026 09:10
@kosiew kosiew requested a review from adriangb March 13, 2026 03:23
@kosiew kosiew changed the title from "Preserve logical cast field semantics during physical lowering and schema rewrite" to "Preserve logical field semantics in physical cast lowering (field-aware CastExpr / TryCastExpr)" Apr 6, 2026
@github-actions github-actions bot added the "physical-plan" label (Changes to the physical-plan crate) Apr 6, 2026

kosiew commented Apr 6, 2026

@adriangb
PTAL


adriangb commented Apr 6, 2026

Interestingly I was just poking around here myself: #21390

@kosiew kosiew marked this pull request as draft April 7, 2026 03:26

kosiew commented Apr 7, 2026

@adriangb
I am going to rework this after #21390 is merged.

kosiew added 5 commits April 7, 2026 22:11

  • Refactor logical Expr::Cast to use field-aware CastExpr, ensuring target FieldRef metadata is preserved. Enhance tests to confirm metadata retention, validate that same-type casts aren't elided for fields with semantics, and ensure existing TryCast rejection for extension types remains effective.
  • Expose shared cast_with_target_field helper to validate and build field-aware CastExprs in one place. Update planner to directly call this helper, removing the need for temporary type-only casts. Add regression tests to cover standard casts, metadata-bearing casts, and same-type semantic-preserving casts.
  • Consolidate default-target-field predicate and success construction path in cast_with_target_field to reduce duplicate code in cast.rs. Simplify tests in planner.rs by implementing shared setup helpers and caching return_field() results for standard casts.
  • Narrow cast_with_target_field from a public re-export to a crate-only re-export in datafusion/physical-expr/src/expressions/mod.rs. This change allows the planner to still utilize it while reducing the public expressions API surface.
  • Use as_planner_cast(...) helper in planner tests to eliminate repeated downcasting. Update cast_with_target_field documentation to clarify that default synthesized fields are elided while explicit field semantics are preserved.
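
For readers unfamiliar with the downcasting pattern that a test helper like as_planner_cast(...) removes from individual tests, here is a rough, self-contained sketch. The trait and structs below are simplified stand-ins, and the helper's signature is a guess: the actual helper in planner.rs is not shown in this thread.

```rust
use std::any::Any;
use std::sync::Arc;

/// Minimal stand-in for a physical expression trait with `as_any` support.
pub trait PhysicalExpr {
    fn as_any(&self) -> &dyn Any;
}

pub struct Column {
    pub name: String,
}

pub struct CastExpr {
    pub input: Arc<dyn PhysicalExpr>,
    pub target_name: String,
}

impl PhysicalExpr for Column {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl PhysicalExpr for CastExpr {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Non-panicking variant: `None` when the expression is not a cast.
pub fn try_as_cast(expr: &Arc<dyn PhysicalExpr>) -> Option<&CastExpr> {
    expr.as_any().downcast_ref::<CastExpr>()
}

/// Hypothetical test helper: downcast once, panic with a clear message otherwise.
pub fn as_planner_cast(expr: &Arc<dyn PhysicalExpr>) -> &CastExpr {
    try_as_cast(expr).expect("expected the planner to produce a CastExpr")
}
```

A test can then write `as_planner_cast(&expr).target_name` instead of repeating the `as_any().downcast_ref::<CastExpr>()` chain in every assertion.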
@github-actions github-actions bot removed the "physical-plan" label (Changes to the physical-plan crate) Apr 7, 2026
@kosiew kosiew changed the title from "Preserve logical field semantics in physical cast lowering (field-aware CastExpr / TryCastExpr)" to "Preserve logical cast field semantics during physical lowering with field-aware CastExpr" Apr 7, 2026
kosiew added 2 commits April 7, 2026 22:28

  • Refactor the handling of `Expr::Cast` in the `create_physical_expr` function to remove unnecessary line breaks and improve readability. Reformat the cast lowering test that checks target field metadata preservation so its formatting stays consistent.
  • Collapse the nested if statements in datafusion/physical-expr/src/expressions/cast.rs to satisfy Clippy's collapsible_if lint. This change does not alter any existing behavior.
@kosiew kosiew marked this pull request as ready for review April 7, 2026 14:48

kosiew commented Apr 7, 2026

@adriangb
This is ready for review.


@adriangb adriangb left a comment

Nice work! Can we add SLT tests that would reflect these changes?

kosiew added 2 commits April 8, 2026 14:41

  • Introduce a TypePlanner hook in test_context.rs so that a file-specific SLT can handle UUID target-field metadata. Add cast_extension_type_metadata.slt to cover new test cases for CAST and TRY_CAST with FixedSizeBinary. Ensure target metadata is preserved and confirm that the rejection path for TRY_CAST is unchanged.
  • Remove the test_try_cast_to_extension_type_is_rejected test from planner.rs, as the new SLT now covers this case directly. This cleanup improves maintainability and reduces test duplication.
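
To illustrate the idea behind such a type-planning hook, the sketch below maps a SQL type name to a storage type plus field metadata. It is a toy model rather than the hook added in test_context.rs: `PlannedField` and `plan_sql_type` are hypothetical names, and the use of Arrow's canonical UUID extension (`ARROW:extension:name` set to `arrow.uuid` over a 16-byte fixed-size binary) is an assumption about what the SLT exercises.

```rust
use std::collections::BTreeMap;

/// Toy result of planning a SQL type: a FixedSizeBinary width plus field metadata.
#[derive(Debug, Clone, PartialEq)]
pub struct PlannedField {
    pub fixed_size_binary_width: i32,
    pub metadata: BTreeMap<String, String>,
}

/// Hypothetical hook: returns `Some` for types it wants to override and `None`
/// to let the default type planning take over.
pub fn plan_sql_type(sql_type: &str) -> Option<PlannedField> {
    if sql_type.eq_ignore_ascii_case("UUID") {
        let mut metadata = BTreeMap::new();
        // Assumed: the canonical Arrow UUID extension name on 16-byte storage.
        metadata.insert("ARROW:extension:name".to_string(), "arrow.uuid".to_string());
        Some(PlannedField { fixed_size_binary_width: 16, metadata })
    } else {
        None
    }
}
```

With a hook of this shape, a `CAST(... AS UUID)` in a test file would plan a target field whose metadata must survive physical lowering, which is the behavior the new SLT is described as checking.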
@github-actions github-actions bot added the "sqllogictest" label (SQL Logic Tests (.slt)) Apr 8, 2026
@kosiew kosiew added this pull request to the merge queue Apr 9, 2026
Merged via the queue into apache:main with commit e1ad871 Apr 9, 2026
35 checks passed
@kosiew kosiew deleted the cast-02-20164 branch April 9, 2026 01:33

Labels

  • physical-expr: Changes to the physical-expr crates
  • sqllogictest: SQL Logic Tests (.slt)

2 participants