Route init-session provider-option shaping through DeviceInterface by qjia7 · Pull Request #2232 · microsoft/onnxruntime-genai

qjia7 · 2026-06-18T03:33:54Z

Summary

Move EP-specific shaping of the EnsureDeviceOrtInit trivial-model session options out of model.cpp and onto a new optional virtual DeviceInterface::ShapeInitSessionProviderOptions. WebGPU and QNN provide overrides; all other EPs inherit the default no-op. The base path becomes EP-agnostic: it looks up the user-supplied entry once, then dispatches.
Expand the WebGPU global-options whitelist from 7 to 14 entries so every key consumed by ParseWebGpuContextConfig (preserveDevice, maxStorageBufferBindingSize, maxNumPendingDispatches, and the four *BufferCacheMode keys) actually reaches the process-wide WebGpuContext singleton.
Make config.h self-sufficient (adds the standard and project headers it already implicitly depended on) so it can be included from smartptrs.h without a transitive chain. A one-line comment in config.h warns against introducing a circular include back to smartptrs.h/generators.h.

Test plan

Build onnxruntime-genai with USE_WEBGPU=ON against a local ORT (onnxruntime-genai.dll, *.pyd, wheel, unit_tests.exe, model_benchmark.exe all produced).
End-to-end WebGPU smoke: confirm a known-good model still loads and generates correctly via verify_model_correctness.py and verify_multi_gen.py (covers the original reason EnsureDeviceOrtInit exists).
QNN compile/load check on hardware (if available) — confirms enable_htp_shared_memory_allocator=1 is still injected.
CPU/CUDA/DML/OpenVINO/RyzenAI/NvTensorRtRtx: compile-only check is sufficient (default no-op virtual; previously never executed the WebGPU/QNN branches either).

EnsureDeviceOrtInit previously contained inline EP-specific branches for WebGPU (whitelist forwarding of global/singleton options) and QNN (injection of enable_htp_shared_memory_allocator=1). This change adds a new optional virtual ShapeInitSessionProviderOptions on DeviceInterface so each EP owns its own init-session shaping. The base EnsureDeviceOrtInit becomes EP-agnostic: it looks up the user-supplied provider options once, then dispatches to the EP override. Most EPs inherit the default no-op. Also expand the WebGPU global-options whitelist from 7 to 14 entries to match all options consumed by ParseWebGpuContextConfig: preserveDevice, maxStorageBufferBindingSize, maxNumPendingDispatches, and the four *BufferCacheMode keys were previously dropped before reaching the process-wide WebGpuContext singleton. config.h becomes self-sufficient (adds the standard and project headers it already implicitly depended on) so it can be included from smartptrs.h without a transitive chain. A warning comment prevents future contributors from introducing a circular include. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot

⚠️ Not ready to approve

config.h still relies on transitive standard-library includes for types it uses (e.g., uint32_t/std::unordered_map/std::byte), which undermines the stated goal of being self-sufficient.

Pull request overview

This PR refactors EnsureDeviceOrtInit so execution-provider-specific shaping of the trivial init-session provider options is routed through a new optional DeviceInterface virtual, with WebGPU and QNN providing overrides. It also expands the WebGPU init-session option forwarding whitelist and makes config.h more directly includable from lower-level headers by adding explicit includes and warning about include cycles.

Changes:

Add DeviceInterface::ShapeInitSessionProviderOptions(...) (default no-op) and route EnsureDeviceOrtInit through it.
WebGPU: forward a larger set of global/singleton WebGPU provider options into the trivial init session.
QNN: inject enable_htp_shared_memory_allocator=1 via the new shaping hook; update config.h include set and document include-cycle constraints.

File summaries

File	Description
src/webgpu/interface.cpp	Implements WebGPU override to forward process-wide WebGPU context options into the init session.
src/smartptrs.h	Adds the new optional shaping virtual to `DeviceInterface` and includes `config.h` for the signature.
src/qnn/interface.cpp	Implements QNN override to inject the shared-memory allocator gating provider option.
src/models/model.cpp	Replaces hard-coded WebGPU/QNN shaping with provider-option lookup + dispatch through `DeviceInterface`.
src/config.h	Adds explicit includes and a warning comment to avoid include cycles with `smartptrs.h`/`generators.h`.

Copilot's findings

Files reviewed: 5/5 changed files
Comments generated: 1

Note

Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.

Add <cstddef>, <cstdint>, <span>, and <unordered_map> so config.h provides every type it uses (uint32_t, size_t, std::byte, std::span<const std::byte>, std::unordered_map) without depending on transitive includes pulled in through onnxruntime_api.h or <vector>.

Copilot

⚠️ Not ready to approve

src/config.h now includes <span>, which can break builds that define USE_CXX17 (the repo’s span.h provides the C++17 polyfill), so the include should use the project wrapper instead.

Copilot's findings

Files reviewed: 5/5 changed files
Comments generated: 1

Note

Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.

…t lint - Replace standard <span> include with project span.h wrapper in config.h to maintain C++17 polyfill compatibility (fixes Copilot review concern) - Fix include comment formatting in smartptrs.h to align with clang-format style (fixes lint-cpp CI failure) These changes address the PR #2232 Copilot review comments and lint check failures.

Copilot

⚠️ Not ready to approve

Device filtering options can be silently dropped during init-session allocator initialization for non-WebGPU EPs, which can initialize the global allocator against the wrong device.

Copilot's findings

Files reviewed: 5/5 changed files
Comments generated: 2

Note

Your feedback helps us improve the quality of this feature.
Please use 👍 or 👎 to tell us whether this assessment is correct.

…artptrs.h DeviceInterface::ShapeInitSessionProviderOptions needs ProviderOptions in its signature. Including config.h from smartptrs.h required making config.h self-sufficient (12 new standard headers), which caused the C++17 span incompatibility and downstream lint issues. Fix: move DeviceFilteringOptions, NamedString, and ProviderOptions to a new self-sufficient provider_options.h. smartptrs.h includes the lightweight new header; config.h adds one #include and replaces the three definitions with transparent type aliases, so all Config::ProviderOptions call sites compile unchanged.

- Add <algorithm> and <type_traits> to smartptrs.h for std::copy and std::remove_const_t which were relying on transitive includes. - Propagate device_filtering_options unconditionally in EnsureDeviceOrtInit before the virtual ShapeInitSessionProviderOptions call so all EPs (CUDA, QNN, DML, etc.) honour the user's device selection for the allocator-init session, not just WebGPU. - Remove the now-redundant device_filtering_options assignment from the WebGPU override.

…-options

kunal-vaishnavi · 2026-06-22T19:08:54Z

 #include <assert.h>
 #include <atomic>
 #include <memory>
+#include <type_traits>


Why is this import needed?

<type_traits> is needed for std::remove_const_t used at DeviceInterface::WrapMemory (line 117 of this file) to strip const before passing to WrapMemoryBase(void*, ...). It was previously pulled in transitively, which Copilot flagged as fragile, so I added it explicitly in commit 2821298. Added a // for std::remove_const_t marker in d792135 to match the existing convention.

kunal-vaishnavi · 2026-06-22T19:11:15Z

 //
 // Modifications Copyright(C) 2026 Advanced Micro Devices, Inc. All rights reserved.
 #pragma once
+#include <algorithm>


Why is this import needed?

<algorithm> is needed for std::copy used in the free function Generators::copy at the bottom of this file. It was previously pulled in transitively, which Copilot flagged as fragile, so I added it explicitly in commit 2821298. Added a // for std::copy marker in d792135 to match the existing convention used by the other includes in this file.

Add inline 'for X' markers next to the two includes to match the existing convention in this file (see span.h, onnxruntime_api.h, provider_options.h above). <algorithm> is used by std::copy in Generators::copy; <type_traits> is used by std::remove_const_t in DeviceInterface::WrapMemory.

qjia7 requested a review from Copilot June 18, 2026 04:18

Copilot started reviewing on behalf of qjia7 June 18, 2026 04:20 View session

Copilot AI reviewed Jun 18, 2026

View reviewed changes

Comment thread src/config.h Outdated

qjia7 requested a review from Copilot June 18, 2026 04:59

Copilot started reviewing on behalf of qjia7 June 18, 2026 05:00 View session

Copilot AI reviewed Jun 18, 2026

View reviewed changes

Comment thread src/config.h Outdated

qjia7 requested a review from Copilot June 18, 2026 06:11

Copilot started reviewing on behalf of qjia7 June 18, 2026 06:13 View session

Copilot AI reviewed Jun 18, 2026

View reviewed changes

Comment thread src/smartptrs.h Outdated

Comment thread src/models/model.cpp

qjia7 added 2 commits June 18, 2026 15:15

qjia7 marked this pull request as ready for review June 18, 2026 09:54

qjia7 requested a review from a team as a code owner June 18, 2026 09:54

Merge remote-tracking branch 'origin/main' into ep-shape-init-session…

4e19090

…-options

kunal-vaishnavi reviewed Jun 22, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Route init-session provider-option shaping through DeviceInterface#2232

Route init-session provider-option shaping through DeviceInterface#2232
qjia7 wants to merge 7 commits into
mainfrom
ep-shape-init-session-options

qjia7 commented Jun 18, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

kunal-vaishnavi Jun 22, 2026

Uh oh!

qjia7 Jun 23, 2026

Uh oh!

kunal-vaishnavi Jun 22, 2026

Uh oh!

qjia7 Jun 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

qjia7 commented Jun 18, 2026

Summary

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

⚠️ Not ready to approve

Copilot's findings

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

⚠️ Not ready to approve

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

⚠️ Not ready to approve

Uh oh!

Uh oh!

Uh oh!

kunal-vaishnavi Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

qjia7 Jun 23, 2026

Choose a reason for hiding this comment

Uh oh!

kunal-vaishnavi Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

qjia7 Jun 23, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants