This issue is not a kernel hardening problem. The hardening changes exposed a pre‑existing latent issue in UEFI, which becomes visible after enabling early kernel logging and memblock debugging.
Summary of the issue
With kernel hardening enabled, the system hits stage-2 (S2) faults when HLOS accesses memory at 0x91a80000.
- From the memory map (IP Catalog – MonacoAU), this address falls under the Gunyah MD region.
- This region should be removed by UEFI before handing off to HLOS, but that does not appear to be happening.
Evidence
- Hypervisor log
qhee_hyp_assign_remove_memory: 0x91a80000/0x80000 -> ret 0
- UEFI log
DtPlatformLoadDtbBlob qclinux_fit.img Loading Failed status=0xE
DtPlatformLoadDtbBlob Loading Legacy MultiDtb
OS DTB found. Model = Qualcomm Technologies, Inc. QCS8300 Ride
Reading of OsConfigTableSelection failed, checking DT settings
- Kernel crash
- Synchronous external abort during early boot
- Fault occurs while freeing pages in memblock_free_all(), indicating that invalid memory is being exposed to the kernel.
[ 0.000000] Internal error: synchronous external abort: 0000000096000010 SMP
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Tainted: G M 6.18.18-g7ea01f6a3cff-dirty PREEMPT
[ 0.000000] Tainted: [M]=MACHINE_CHECK
[ 0.000000] Hardware name: Qualcomm Technologies, Inc. QCS8300 Ride (DT)
[ 0.000000] pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 0.000000] pc : clear_page+0x30/0x6c
[ 0.000000] lr : __free_pages_ok+0x144/0x5d0
[ 0.000000] sp : ffffa9e8499d3d00
[ 0.000000] x29: ffffa9e8499d3d00 x28: fffffdffc046a400 x27: 0000000000000000
[ 0.000000] x26: 0000000000000000 x25: 0000000000091a88 x24: ffff000efe6381c0
[ 0.000000] x23: ffffff0000000000 x22: 0000000000000003 x21: 00003c000046a400
[ 0.000000] x20: fffffdffc046a200 x19: 00003c000046a240 x18: 0000000000000006
[ 0.000000] x17: 00000000000ef927 x16: 0000000000000000 x15: 0000000000000018
[ 0.000000] x14: 0000000093b00000 x13: 0000000000000000 x12: ffffa9e849f576c0
[ 0.000000] x11: 00000000a90b0000 x10: 0000000091a88000 x9 : 0000000000000000
[ 0.000000] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
[ 0.000000] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 0.000000] x2 : 0000000000000004 x1 : 0000000000000040 x0 : ffff000011a88000
[ 0.000000] Call trace:
[ 0.000000] clear_page+0x30/0x6c (P)
[ 0.000000] __free_pages_core+0xc4/0x140
[ 0.000000] memblock_free_pages+0x18/0x28
[ 0.000000] memblock_free_all+0x1d8/0x284
[ 0.000000] mm_core_init+0xe8/0x120
[ 0.000000] start_kernel+0x4f0/0x780
[ 0.000000] __primary_switched+0x88/0x90
[ 0.000000] Code: 37200121 12000c21 d2800082 9ac12041 (d50b7420)
[ 0.000000] ---[ end trace 0000000000000000 ]---
This confirms that HLOS is accessing memory that should have been reserved/removed earlier by UEFI.
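As a quick cross-check of the logs above, the sketch below tests whether the address seen in the register dump falls inside the range reported by the hypervisor. The region base and size come from the `qhee_hyp_assign_remove_memory` line, and the register values come from the crash dump. Reading x25 as a page frame number and assuming 4 KiB pages are assumptions, not something the logs state.

```python
# Region reported by the hypervisor log: 0x91a80000/0x80000
REGION_BASE = 0x91A80000
REGION_SIZE = 0x80000

# Values copied from the register dump in the crash log
X10 = 0x91A88000   # looks like a physical address
X25 = 0x91A88      # assumption: page frame number of the page being freed

PAGE_SHIFT = 12    # assumption: 4 KiB pages


def in_region(addr, base=REGION_BASE, size=REGION_SIZE):
    """Return True if addr lies in [base, base + size)."""
    return base <= addr < base + size


pfn_addr = X25 << PAGE_SHIFT
print(hex(pfn_addr), in_region(pfn_addr))
print(hex(X10), in_region(X10))
```

Under those assumptions, both values resolve to the same page, and that page lies inside the 512 KiB range the hypervisor reported.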
Current status
- An internal thread is ongoing with the UEFI team to fix the root cause properly.
- However, the UEFI fix will take time to land and propagate, and we cannot block kernel progress until then.
Temporary workaround
The clear_page call in the trace above is consistent with pages being zeroed as they are freed. Disabling the following hardening configs avoids touching the problematic memory region and allows the device to boot reliably:
CONFIG_INIT_ON_FREE_DEFAULT_ON
CONFIG_INIT_STACK_ALL_ZERO
CONFIG_INIT_ON_ALLOC_DEFAULT_ON
With these options disabled, QCS8300 boots fine.
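To confirm that a given build actually has the workaround applied, a small check over the generated `.config` text can help. The sketch below is a minimal example that relies only on the standard Kconfig output format (`CONFIG_X=y` for enabled, `# CONFIG_X is not set` for disabled). The helper name `enabled_options` and the sample text are hypothetical.

```python
# The three hardening options listed in the workaround
HARDENING_OPTIONS = (
    "CONFIG_INIT_ON_FREE_DEFAULT_ON",
    "CONFIG_INIT_STACK_ALL_ZERO",
    "CONFIG_INIT_ON_ALLOC_DEFAULT_ON",
)


def enabled_options(config_text, options=HARDENING_OPTIONS):
    """Return the sorted subset of options set to 'y' in a .config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_X is not set"
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name in options and value == "y":
            enabled.add(name)
    return sorted(enabled)


# Hypothetical sample; in practice, read the text from the build's .config file
sample = """\
CONFIG_INIT_ON_FREE_DEFAULT_ON=y
# CONFIG_INIT_STACK_ALL_ZERO is not set
# CONFIG_INIT_ON_ALLOC_DEFAULT_ON is not set
"""
print(enabled_options(sample))
```

An empty list means all three options are disabled in that build.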
Plan
- Merge this PR with the above hardening options temporarily disabled.
- Once the UEFI fix is available and merged, these kernel hardening options will be re‑enabled.
Thanks for understanding the situation and the need for this interim solution.
Please let me know if you have any concerns.
Originally posted by @jaihindy in #1904 (comment)