x86: fix extra_bi_size init value#1696
Conversation
40cf1cc to
ae2fe76
Compare
|
I'd need a bit more explanation to figure out what you're trying to do and if that is correct. |
|
The rust-seL4 bootinfo work iterates from 0..extraLen in the bootinfo (which excludes the padding) |
|
ARM's bootinfo excludes the seL4/src/arch/arm/kernel/boot.c Line 351 in ae11cf1 I am not sure if the |
|
Ok, that sounds reasonable. We need to check what people are currently doing with this field and if changing it would break things. Which existing consumers of bootinfo exist? The C capDL loader comes to mind, that does have an x64 mode and must be dealing with this field. Anything else? Maybe in the other seL4 libraries? |
|
What I know is:
|
Yes, they use them (it's necessary for x86 for ACPI stuff). |
|
I agree that Adding So I suppose the question is why we have a |
I don't see how it can be sure that that memory is safe to access, it is not guaranteed that
I think it is, for code that parses the block and stops at
But then it doesn't document the actual padding any more... The information is correct, it just conflicts with |
|
By far easiest solution would be to just remove the end The original idea seems that you can add them to fill holes to required alignments, but the header itself also has alignment requirements, so how is that supposed to work? With that in mind, they can't be used as end markers either, as they can be anywhere, so it should be safe to remove the end one. |
I think this would make this PR consistent with the removal of the header length. As you noted, yes, I suppose that was there for the padding because the padding never increments the size itself. |
|
So to summarise the discussion and to make sure I understand the boot info layout (please correct me if I'm wrong). On x86, there always is an explicit padding block with padding header after the rest of the The current I can't see any specific reason that this last padding block has to exist, so we could do what Indan is suggesting, which is remove the trailing padding header on x86. If we do that, then That all said, as far as I can see, there is no bug in the kernel. If there is one, the bug is in Rust, because the kernel does exactly what is advertised in the manual. We can still make the behaviour consistent between x86 and Arm/RISC-V, but we should probably look at what is going wrong on the Rust side first. |
The Rust side is unhappy because if makes a slice (a fat pointer with size+address) that covers from When you iterate over the bootinfo looking for an ID with the Iterator trait: return Some(BootInfoExtra {
id,
content_with_header: &self.bootinfo.extra_slice()
[content_with_header_start..content_with_header_end],
});
Because the padding tag has a length that goes up to the page aligned end (what is actually mapped), this results in an out of bounds crash from rust as that is outside The other alternative here I suppose would be to make |
But the padding tag length is as specified in the manual, no? So doesn't this mean this code is just wrong? |
|
To be more precise: what stops any other block at some point going up to the page aligned end or any other padding block occurring at some point in the boot info? |
|
Ok, now I think I understand it, and I take back what I said. There is a bug, and the kernel is not doing what it says in the manual. The problem (as I understand it), is that So the fix is either to correct that reporting or to drop the padding block, which also corrects the reporting. And if we ever do add another padding block it just needs to correctly increase I think I also can see why this never crashed in C: because it's seeing that it is a padding block, and increasing the current pointer by the length of the padding block goes > extraLen, which means it is finished. Alright, so with that, I vote for removing the trailing padding block on x86 and setting initial As far as I can see in the C capDL loader and in seL4_libs, this wouldn't change anything, the block is of course ignored anyway. Boot info becomes smaller, but I don't think the padding caused it to cross a page boundary, so even number of allocated pages etc should be the same. |
'extra_bi_size' is meant to report the full size of all the bootinfo blocks, but excluded the body size of trailing padding block in the original implementation. Plus the page boundary alignment issue that the trailing padding block was trying to solve is not really a problem. Therefore, this fix removes the trailing padding block and initialises `extra_bi_size` to 0 as what other architectures do. Signed-off-by: Terry Bai <tianyi.bai@unsw.edu.au>
ae2fe76 to
a51980d
Compare
|
This might not be relative to this PR. Can I please know what is the point of this chunk on ARM and RISCV. This chunk seems never to be touched becase |
That's the code I thought was buggy, but if it always matches, it's fine and should be removed. (For some reason I thought it got rounded up.) Mind that RISCV has exactly the same code. seL4/src/arch/riscv/kernel/boot.c Lines 327 to 331 in c8526a4 |
RISCV was very likely just copy-pasted. I can try to validate that this afternoon, but I think you're right, we should remove all of these (if they never get emitted anyway). |
|
I guess it's there as a guard against getting things wrong, but if it's that it should be an assert instead. |
|
Ok, confirmed that RISC-V is copy/paste from ARM (commit 5fc7346), and the ARM padding bit was introduced in commit 6a31a57. Even there the condition was never true. The comment and commit message do not provide much info for intention, but it's either a guard against having gotten the size/offset wrong or it was trying to do something like x86 did before, but got it wrong. In either case, we should replace them with an assert. |
extraLenis wrong on x86 due to non-zero init value ofextra_bi_size.