sidecar: per-CPU overrides for servicing restore with NVMe keepalive#3166
Open
emirceski wants to merge 6 commits into microsoft:main from
Conversation
gurasinghMS
reviewed
Mar 31, 2026
mattkur
reviewed
Mar 31, 2026
gurasinghMS
reviewed
Mar 31, 2026
During servicing restore with NVMe keepalive, sidecar was disabled
entirely if any devices had mapped interrupts — all VPs fell back to
sequential Linux onlining, even if only a few CPUs had outstanding IO.
This change makes it selective: only CPUs with outstanding IO are
excluded from sidecar startup (kernel-started instead), while the rest
keep sidecar's parallel fan-out. This preserves servicing latency
improvements even when NVMe keepalive is active.
This PR continues Matt Kurjanowicz's PR 2477, which introduced the
per-CPU state concept. `start_aps()` was reworked to map each node's
control page individually via scoped mappings, ensuring node-local
correctness in multi-NUMA topologies.
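The selective behavior described above can be sketched as follows. This is a minimal illustration only; the function and parameter names (`is_kernel_started`, `cpus_with_outstanding_io` as a slice) are assumptions, not the actual `openhcl_boot` API.

```rust
/// Sketch: decide per CPU whether to fall back to kernel start.
/// Names are illustrative, not the real openhcl_boot interface.
fn is_kernel_started(cpu: u32, cpus_with_outstanding_io: &[u32]) -> bool {
    // Only IO-busy CPUs are excluded from sidecar startup;
    // every other CPU keeps sidecar's parallel fan-out.
    cpus_with_outstanding_io.contains(&cpu)
}

fn main() {
    // e.g. 2 of 24 VPs had IO in flight at save time.
    let io_busy = [3, 17];
    let kernel_started = (0..24u32)
        .filter(|cpu| is_kernel_started(*cpu, &io_busy))
        .count();
    // Only the 2 IO-busy CPUs are kernel-started; 22 stay on sidecar.
    assert_eq!(kernel_started, 2);
}
```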
- Add `PerCpuState` and `initial_state` to `SidecarParams` in
`sidecar_defs`, supporting up to 400 CPUs within a single 4 KiB
page. VMs exceeding this fall back to disabling sidecar entirely.
- Replace all-or-nothing sidecar disable in `openhcl_boot` DT parsing
with per-CPU overrides: only CPUs in `cpus_with_outstanding_io` are
kernel-started, the rest stay sidecar-started.
- Update `SidecarConfig` and `boot_cpus=` command line generation to
respect per-CPU overrides when `per_cpu_state_specified` is set.
- Rework `start_aps()` to use scoped per-node control page mappings,
skipping REMOVED VPs with a log message.
- Add `create_keepalive_test_config_custom` helper for flexible NVMe
keepalive test configuration (topology, cmdline, NVMe params).
- Add test `servicing_keepalive_sidecar_with_outstanding_io_very_heavy`:
24 VPs, 2 NUMA nodes, NVMe fault injection with 10s delayed
completions, save with IO in-flight, restore exercises per-CPU
override path. Programmatically asserts: per-CPU override fired
(via `inspect_openhcl("vm/runtime_params/bootshim_logs")`), and
all 24 VPs online after restore.
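For context on the size constraint in the first bullet, the layout can be sketched like this. The type and constant names below are assumptions for illustration, not the actual `sidecar_defs` definitions.

```rust
// Hypothetical sketch of the per-CPU state array; the real
// sidecar_defs layout may differ.
const MAX_SIDECAR_CPUS: usize = 400;
const PAGE_SIZE: usize = 4096;

#[repr(u8)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum PerCpuState {
    Sidecar = 0, // sidecar-started (parallel fan-out)
    Kernel = 1,  // kernel-started (CPU had outstanding IO)
}

#[repr(C)]
struct SidecarParams {
    // One byte per CPU: 400 bytes, well under a single 4 KiB page,
    // leaving room on the page for other fields and future growth.
    initial_state: [PerCpuState; MAX_SIDECAR_CPUS],
    // ... other boot parameters share the remainder of the page ...
}

fn main() {
    // The whole per-CPU array must fit in one 4 KiB page.
    assert!(core::mem::size_of::<SidecarParams>() <= PAGE_SIZE);
}
```

One byte per CPU is what makes the 400-CPU cap work out: a larger per-CPU record would shrink the cap or spill past the single-page budget.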
smalis-msft
reviewed
Apr 7, 2026
pub const MAX_NODES: usize = 128;

/// The maximum number of supported sidecar CPUs.
/// Keep small to leave space on the SidecarParams page for future fields.
/// VMs with more CPUs fall back to disabling sidecar on restore.
Contributor
Where do we disable sidecar on restore for too many cpus?
Author
In `openhcl/openhcl_boot/src/host_params/dt/mod.rs`, in the `else` block around line 996.
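The fallback this thread refers to amounts to a cap check; a sketch under assumed names (the actual check in the DT parsing code is more involved):

```rust
// Sketch of the restore-time fallback; constant and function
// names are illustrative, not the real openhcl_boot code.
const MAX_SIDECAR_CPUS: usize = 400;

fn sidecar_enabled_on_restore(cpu_count: usize) -> bool {
    // VMs with more CPUs than fit in the per-CPU state page
    // disable sidecar entirely instead of applying overrides.
    cpu_count <= MAX_SIDECAR_CPUS
}

fn main() {
    assert!(sidecar_enabled_on_restore(24));
    assert!(!sidecar_enabled_on_restore(1024));
}
```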