Skip to content

opt: Fold OpCompositeExtract feeding from OpCopyLogical or OpLoad#6614

Open
s-perron wants to merge 2 commits intoKhronosGroup:mainfrom
s-perron:users/s-perron/i6611
Open

opt: Fold OpCompositeExtract feeding from OpCopyLogical or OpLoad#6614
s-perron wants to merge 2 commits intoKhronosGroup:mainfrom
s-perron:users/s-perron/i6611

Conversation

@s-perron
Copy link
Copy Markdown
Collaborator

  • CopyLogicalFeedingExtract: Hoist OpCompositeExtract before OpCopyLogical.
    If the input to an OpCompositeExtract is an OpCopyLogical, we can
    extract from the original composite and then copy the result.
  • LoadFeedingExtract: Change OpLoad + OpCompositeExtract to OpAccessChain + OpLoad.
    If the input to an OpCompositeExtract is an OpLoad, we can load the
    specific element instead of the entire composite. This is restricted
    to non-Function/Private storage classes to avoid interfering with
    local-access-chain-convert.

Updated fold_test.cpp with 8 new test cases and updated existing tests
to use SPV_ENV_UNIVERSAL_1_5.

Fixes #6611

- CopyLogicalFeedingExtract: Hoist OpCompositeExtract before OpCopyLogical.
  If the input to an OpCompositeExtract is an OpCopyLogical, we can
  extract from the original composite and then copy the result.
- LoadFeedingExtract: Change OpLoad + OpCompositeExtract to OpAccessChain + OpLoad.
  If the input to an OpCompositeExtract is an OpLoad, we can load the
  specific element instead of the entire composite. This is restricted
  to non-Function/Private storage classes to avoid interfering with
  local-access-chain-convert.

Updated fold_test.cpp with 8 new test cases and updated existing tests
to use SPV_ENV_UNIVERSAL_1_5.

Fixes KhronosGroup#6611
@s-perron s-perron requested a review from luciechoi March 26, 2026 20:28
@devshgraphicsprogramming
Copy link
Copy Markdown

the only catch is that you don't want to replace every OpLoad -> OpLogicalCopy -> OpCompositeExtract chain with an OpAccessChain -> OpLoad

I'm worried about a temp load geting used a lot getting replaced by dozens of individual redundand loads, so it would only need to consider OpLoad of whole structures for replacement where only one and the same composite component is getting extracted (so that it was really only codegen for an r-value temp returned from a function)

Is this vulnerable to what I'm imagening here?

Comment on lines +2252 to +2265
if (cinst->opcode() != spv::Op::OpLoad) {
return false;
}

Instruction* composite_type_inst = def_use_mgr->GetDef(cinst->type_id());
if (composite_type_inst->opcode() != spv::Op::OpTypeStruct &&
composite_type_inst->opcode() != spv::Op::OpTypeArray) {
return false;
}

// Check the memory operands. We only support None.
if (cinst->NumInOperands() > 1) {
return false;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly cinst is the Instruction that produces the Operand of OpCompositeExtract and you want that to be OpLoad

The problem is that you then refuse to process an OpLoad with more than 1 argument, which would be an OpLoad with Aligned for a BDA

So this optimization pass will work on UBO and SSBO but not BDA, correct ?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'll try to update this.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devshgraphicsprogramming Let me know how this works.

@luciechoi luciechoi self-requested a review April 9, 2026 22:39
- Adds `Type::GetByteOffset` to correctly compute byte offsets into
  composite types, handling structs, arrays, matrices, and vectors.
- Modifies `LoadFeedingExtract` folding rule to:
  - Bail out on `Volatile` loads.
  - Dynamically recalculate `Aligned` loads by leveraging the
    computed byte offset and the power-of-two alignment property.
  - Transparently preserve and propagate all other memory operands
    like `MakePointerVisible` or `Nontemporal` to the new `OpLoad`.
- Adds unit tests for the new `Type` method and updates folding
  tests to ensure correct preservation of the memory operands.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Pass to load-store eliminate OpLoad OpLogicalCopy OpCompositeExtract code generated due to explicit layout

3 participants