Add creation of the build-id to the linker flags #435
|
Forgot that build-id is an ELF only thing, so we can only add it on Linux and FreeBSD. |
|
Can we have some tests? |
|
Side note, I just realised we may not have reproducibility tests in this repo. We have some in BinaryBuilder: https://github.com/search?q=repo%3AJuliaPackaging%2FBinaryBuilder.jl+reproducibility+language%3AJulia+path%3A%2F%5Etest%5C%2F%2F&type=code |
|
I've blind-coded a test for this based on what I think the output should be. Let's see if that passes. It's a bit awkward because we don't have the actual ELF file to query here, so I had to do a
Co-authored-by: Mosè Giordano <765740+giordano@users.noreply.github.com>
|
Can ObjectFile.jl read the notes? |
It can, but I didn't see any place where we can get the actual compiled file out in these tests already (or at least I didn't see any tests doing it), and we didn't have ObjectFile.jl in the deps already, so I didn't dive deeper into trying to make that work. Using ObjectFile.jl, this is pseudo-code to test if the build-id section exists (and can also form the basis for an Audit pass in BinaryBuilder.jl): |
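For illustration only, here is a minimal Python sketch of the same idea. It is not the ObjectFile.jl code discussed above. `find_gnu_build_id` is a hypothetical helper, and it relies on a heuristic: instead of walking the section table, it scans the file for an ELF note header with type 3 (`NT_GNU_BUILD_ID`) and owner name `GNU`. It assumes a little-endian ELF file and does not cover Windows PE/COFF binaries.

```python
import struct

NT_GNU_BUILD_ID = 3  # ELF note type used for GNU build-ids


def find_gnu_build_id(data: bytes):
    """Return the build-id as a hex string, or None if no note is found.

    Heuristic: look for the owner name b"GNU\\0" preceded by a 12-byte
    note header (namesz=4, descsz, type=3). Assumes little-endian data.
    """
    if data[:4] != b"\x7fELF":
        return None  # not an ELF file
    pos = data.find(b"GNU\x00")
    while pos != -1:
        if pos >= 12:
            namesz, descsz, ntype = struct.unpack_from("<III", data, pos - 12)
            desc_end = pos + 4 + descsz
            if (namesz == 4 and ntype == NT_GNU_BUILD_ID
                    and descsz > 0 and desc_end <= len(data)):
                return data[pos + 4:desc_end].hex()
        pos = data.find(b"GNU\x00", pos + 1)
    return None
```

A check in a test or audit pass could then be as simple as `find_gnu_build_id(open(path, "rb").read()) is not None`, where `path` points at the compiled library.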
|
And I guess |
|
It's not quite as easy on Windows. Apparently, the support for build-id was added in binutils 2.25, which means we can only support the build-id on Windows for GCC >= 5.2.0 and not the default GCC 4.8.5 we use. |
|
You should be able to access |
|
Yep, that variable should be accessible. I just went down the rabbit hole of also adding a test for the Windows COFF format including the build-id info. |
|
Ok, after going down yet another rabbit hole to get the tests for the build-id running on the Windows binaries, this should be good to go now. We will add a reproducible build-id to all platforms other than macOS automatically now (but I think I read somewhere macOS compilers provided a build-id already, so that shouldn't be a problem). |
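The platform rules discussed so far can be summarised in a small sketch. This is Python for illustration, not the Julia code in this PR; `build_id_flags`, the OS name strings, and the version tuples are hypothetical, and the Windows cutoff is taken from the GCC >= 5.2.0 remark above.

```python
def build_id_flags(os_name, gcc_version):
    """Return the extra linker flags for a target (illustrative only).

    os_name: one of "linux", "freebsd", "windows", "macos" (assumed names).
    gcc_version: a (major, minor, patch) tuple.
    """
    if os_name == "macos":
        # No flag is added for macOS.
        return []
    if os_name == "windows" and gcc_version < (5, 2, 0):
        # The bundled binutils is too old to support --build-id for PE/COFF.
        return []
    return ["-Wl,--build-id=sha1"]
```

For example, `build_id_flags("windows", (4, 8, 5))` yields no flags, while `build_id_flags("linux", (4, 8, 5))` yields the SHA1 build-id flag.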
|
So I was just thinking... I don't know if this would have covered the Rust, Go or OCaml toolchain binaries. I'll have to look at it a bit later to see if those will also get the build-id from this, or if we will have to inject the linker flag separately into their toolchains. |
|
This broke reproducibility of Windows binaries with GCC 5+: https://github.com/JuliaPackaging/BinaryBuilder.jl/blob/d41ddc84157a8c4b490761b0fb5f66cf32746ce6/test/auditing.jl#L796-L839 |
|
Are you comparing reproducibility between two binaries compiled with this option or against a static hash computed earlier? A one-time change in binary hashes is expected, because this adds a new entry to the headers of the binaries that wasn't there before. |
|
I'm updating the hashes in JuliaPackaging/BinaryBuilder.jl#1401. Compiling twice with the new option gives two different binaries, but apparently only on Windows. |
|
On Windows I'm told that this should appear to be a copy of the CodeView GUID value already in the binary |
|
Reproducer:
julia> BinaryBuilderBase.runshell(Platform("i686", "windows"); preferred_gcc_version=v"6");
sandbox:${WORKSPACE} # echo 'int foo(){ return 42; }' | SUPER_VERBOSE=1 cc -x c -shared - -o libfoo.${dlext} -Wl,--out-implib,libfoo.${dlext}.a
ccache /opt/i686-w64-mingw32/bin/i686-w64-mingw32-gcc -D_GLIBCXX_USE_CXX11_ABI=1 -frandom-seed=0x8ac3fab9 -DWINVER=0x0A00 -D_WIN32_WINNT=0x0A00 -march=pentium4 -mtune=generic -x c -shared - -o libfoo.dll -Wl,--out-implib,libfoo.dll.a -Wl,--no-insert-timestamp -Wl,--build-id=sha1
sandbox:${WORKSPACE} # sha256sum libfoo.dll*
edfd3305aec90e38954c80ef2d3fea1d9a966abf16602a0d157c7cff5c7cfbc1 libfoo.dll
c20ac05a44844b9d3180ac2ba1e206e9a11e247bc6617f4a58cd5ee216991d5c libfoo.dll.a
sandbox:${WORKSPACE} # echo 'int foo(){ return 42; }' | SUPER_VERBOSE=1 cc -x c -shared - -o libfoo.${dlext} -Wl,--out-implib,libfoo.${dlext}.a
ccache /opt/i686-w64-mingw32/bin/i686-w64-mingw32-gcc -D_GLIBCXX_USE_CXX11_ABI=1 -frandom-seed=0x8ac3fab9 -DWINVER=0x0A00 -D_WIN32_WINNT=0x0A00 -march=pentium4 -mtune=generic -x c -shared - -o libfoo.dll -Wl,--out-implib,libfoo.dll.a -Wl,--no-insert-timestamp -Wl,--build-id=sha1
sandbox:${WORKSPACE} # sha256sum libfoo.dll*
19a31cc606dc96cf7aafb8b2a6929a6cd810c9584f7936a36410266c1e4cac14 libfoo.dll
7eb234fd60bd6d17ef6b5753819d6aac7c5a7762ba690d74284eeffab006e145 libfoo.dll.a
I tried to compare the two DLLs with https://try.diffoscope.org, but it crashed with an OOM. However, this may be specific to the i686 toolchain; I believe the x86_64 one is still reproducible. |
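Since diffoscope ran out of memory, a lighter-weight way to localise the difference is to list the byte offsets at which the two outputs disagree. The following Python sketch does that; `sha256_of` and `differing_offsets` are hypothetical helpers, and both files are read fully into memory, which is fine for small test libraries like the one above.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hex SHA-256 digest of a file, comparable to `sha256sum` output."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def differing_offsets(path_a, path_b, limit=16):
    """Return up to `limit` byte offsets where the two files differ.

    If the files have different lengths, the first offset past the end
    of the shorter file is reported as well.
    """
    a = Path(path_a).read_bytes()
    b = Path(path_b).read_bytes()
    offsets = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(a) != len(b):
        offsets.append(min(len(a), len(b)))
    return offsets[:limit]
```

Keeping a copy of the first build (say, as a hypothetical `libfoo.run1.dll`) and calling `differing_offsets("libfoo.run1.dll", "libfoo.dll")` would show whether the differences cluster in one header field or are spread through the file.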
Do you know if that is implemented in
Looking at the current source tree for binutils (https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=ld/ldbuildid.c;h=a4a6dc3c3e9403e72bcd52eaa6dfda39fd912586;hb=HEAD), it seems our options for build-id types are:
Right now, we have it set to SHA1, because that was supposed to be reproducible. |
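To illustrate why a hash-based style is expected to be reproducible while a random one is not, here is a small Python analogy. It does not model which bytes `ld` actually feeds into the hash; `content_build_id` and `random_build_id` are hypothetical names used only for this sketch.

```python
import hashlib
import uuid


def content_build_id(contents: bytes) -> str:
    """Deterministic ID: SHA-1 of the given bytes (in the spirit of --build-id=sha1)."""
    return hashlib.sha1(contents).hexdigest()


def random_build_id() -> str:
    """Non-deterministic ID: a random UUID (in the spirit of --build-id=uuid)."""
    return uuid.uuid4().hex
```

Identical input bytes always give the same `content_build_id`, so if two builds end up with different hash-based IDs, some hashed input must already differ between the runs.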
This is the toolchain part of the build-id implementation. The linker should now embed a build-id in the libraries/executables, using a style that also preserves reproducibility.
ref: JuliaPackaging/Yggdrasil#11013, JuliaPackaging/BinaryBuilder.jl#1272