xen/arch/x86: fix TXT AP gate placement and MONITOR/MWAIT race #27

Open
accek-itl wants to merge 2 commits into TrenchBoot:aem-staging-2026-03-13 from accek-itl:txt-ap-gate

@accek-itl accek-itl commented Apr 15, 2026

Two fixes for the TXT AP gate needed after upstream changes:

  1. Gate TXT AP startup earlier: the gate must come before the cpu_info.cr4 write in the early assembly.
  2. Fix a potential deadlock in the MONITOR/MWAIT TXT AP gate loop.

Tested by booting Trenchboot-XEN on an Asus LCL laptop (from the aem-staging-2026-03-13-accek branch, which included these commits) and verifying that PCRs 17 and 18 are non-zero.

Related issue: TrenchBoot/trenchboot-issues#82

The gate needs to come before the point where non-target APs first
access (uninitialized) per-CPU data, which is the save to the
cpu_info.cr4 field in .L_after_stack_setup.

Signed-off-by: Szymon Acedański <accek@invisiblethingslab.com>
Assisted-by: Claude:claude-opus-4-6

The wait condition needs to be re-checked after MONITOR and before
MWAIT to avoid a deadlock when the trigger fires between an earlier
condition check and the MONITOR instruction.

Signed-off-by: Szymon Acedański <accek@invisiblethingslab.com>
Assisted-by: Claude:claude-opus-4-6
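
For readers following the second commit, the lost-wakeup window can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the code from the patch: MONITOR and MWAIT are privileged instructions, so `model_monitor()` and `model_mwait()` are hypothetical stand-ins, and `gate_apicid`, `NO_TARGET`, `pending_store` and `store_lands_before_monitor` are names invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_TARGET 0xffffffffu  /* hypothetical "no AP released yet" value */

/* Models the word that the BSP writes and the parked APs monitor. */
static volatile unsigned int gate_apicid = NO_TARGET;

/* Knobs for the model only; none of this is a real interface. */
static unsigned int pending_store = NO_TARGET; /* value the "BSP" will write */
static bool store_lands_before_monitor;        /* true reproduces the race */
static unsigned int mwait_calls;               /* times the AP went to sleep */

/* Stand-in for MONITOR: in the racy case the store lands just before arming. */
static void model_monitor(const volatile unsigned int *addr)
{
    (void)addr;
    if (store_lands_before_monitor && pending_store != NO_TARGET) {
        gate_apicid = pending_store;
        pending_store = NO_TARGET;
    }
}

/*
 * Stand-in for MWAIT: only a store made after arming ends the sleep.
 * The model assumes such a store is pending whenever this is reached.
 */
static void model_mwait(void)
{
    mwait_calls++;
    if (pending_store != NO_TARGET) {
        gate_apicid = pending_store;
        pending_store = NO_TARGET;
    }
}

/* Gate loop with the re-check between MONITOR and MWAIT. */
static void wait_at_gate(unsigned int my_apicid)
{
    assert(my_apicid != NO_TARGET);

    while (gate_apicid != my_apicid) {
        model_monitor(&gate_apicid);

        /* A store that landed before arming would never wake MWAIT. */
        if (gate_apicid == my_apicid)
            break;

        model_mwait();
    }
}
```

Without the second comparison, the racy case would enter the sleep with the condition already satisfied and no further store on the way, which is the deadlock the commit message describes.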
@m-iwanicki

Tested on a Dell OptiPlex 7010 (UEFI): Xen + TrenchBoot boots and works, and PCRs 17 and 18 are extended.

accek commented Apr 16, 2026

I think that, for upstreaming, these changes could be squashed into commit 8b76786, though this part of its commit message will no longer hold:

With this patch, every AP goes through assembly part, and only when in
start_secondary() in C they re-enter MONITOR/MWAIT iff they are not the
AP that was asked to boot. The same address is reused for simplicity,
and on next wakeup call APs don't have to go through assembly part
again (GDT, paging, stack setting).

* __high_start in parallel, so we gate non-target APs.
*
* In non-TXT boot APs wake one-by-one via SIPI.
*/
Member

I would explain that this serializes AP initialization, forcing the APs to wake up in the order in which the BSP expects them. The comment could also mention STACK_CPUINFO_FIELD() below, which is what makes it necessary to gate here.
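
The serialization described in this review comment can be modelled in plain C. The sketch below is an illustration only: it assumes that, in the TXT case, all APs are awake at once while the BSP has prepared a single stack-top `cpu_info` block for whichever AP it boots next. `cpu_info_model`, `ap_try_enter()`, `bsp_boot_all()` and the `0x1000` base value are hypothetical names and values, not taken from the Xen sources.

```c
#include <assert.h>
#include <stddef.h>

#define NR_APS 3

/* Hypothetical stand-in for the per-CPU block at the top of an AP stack. */
struct cpu_info_model {
    unsigned long cr4;
    unsigned int  owner_apicid;  /* which AP wrote the block */
    unsigned int  writes;        /* number of writers in this round */
};

/* The single block the BSP has prepared for the AP it boots next. */
static struct cpu_info_model next_cpu_info;
static unsigned int booting_apicid_model;

/* Gate first, then touch per-CPU data. Returns 1 if this AP got through. */
static int ap_try_enter(unsigned int my_apicid, unsigned long cr4)
{
    if (my_apicid != booting_apicid_model)
        return 0;  /* non-target AP: stays parked, never reaches the store */

    next_cpu_info.cr4 = cr4;
    next_cpu_info.owner_apicid = my_apicid;
    next_cpu_info.writes++;
    return 1;
}

/* BSP side: release the APs one by one; every AP polls the gate each round. */
static unsigned int bsp_boot_all(const unsigned int *order, size_t n,
                                 unsigned int *woken)
{
    unsigned int total_writes = 0;

    assert(order != NULL && woken != NULL);

    for (size_t i = 0; i < n; i++) {
        next_cpu_info = (struct cpu_info_model){0};
        booting_apicid_model = order[i];

        for (size_t ap = 0; ap < n; ap++)
            if (ap_try_enter(order[ap], 0x1000 + order[ap]))
                woken[i] = order[ap];

        total_writes += next_cpu_info.writes;
    }

    return total_writes;
}
```

In this model each round has exactly one writer, and the APs get through in the order the BSP chose. Moving the gate after the store would let every awake AP write to the same block in every round.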

Comment thread xen/arch/x86/smpboot.c
cpu_relax();
}

unsigned int txt_booting_apicid;
Member

This value is an int in other places, and I think it can be static.

Suggested change
-unsigned int txt_booting_apicid;
+static int txt_booting_apicid;
