Skip to content

Fix patchelf concurrency issues. Show subprocess output in errors#1410

Merged
giordano merged 11 commits into
masterfrom
ib/show_subprocess_io
Nov 29, 2025
Merged

Fix patchelf concurrency issues. Show subprocess output in errors#1410
giordano merged 11 commits into
masterfrom
ib/show_subprocess_io

Conversation

@IanButterworth

@IanButterworth IanButterworth commented Nov 28, 2025

Copy link
Copy Markdown
Member

Summary

This PR fixes two issues in the auditor:

  1. Improved error reporting: When patchelf or other subprocess commands fail, the error now includes the subprocess output, making debugging much easier.

  2. Fixed race condition in patchelf operations: The auditor uses nested @threads loops—iterating over files in parallel, and within each file, iterating over libraries in parallel. When multiple threads try to relink different libraries in the same binary, they call patchelf on the same file concurrently, corrupting the ELF headers and causing "Missing ELF header" errors.

Why this wasn't an issue before

In Julia ≤1.7, nested @threads loops effectively serialized because the default :static scheduling pinned iterations to threads and didn't support nesting. Starting in Julia 1.8, @threads defaults to :dynamic scheduling (JuliaLang/julia#43919, #44136), which enables true nested parallelism—exposing the latent race condition.

i.e. in 1.7 the inner @threads becomes a standard for loop.
julia +1.7 -t6

julia> Threads.@threads for i in 1:3
           @info "outer i$i - $(Threads.threadid())"
           Threads.@threads for j in 1:3
               @info "inner j$j i$i - $(Threads.threadid())"
           end
       end
[ Info: outer i1 - 1
[ Info: outer i2 - 2
[ Info: outer i3 - 3
[ Info: inner j1 i3 - 3
[ Info: inner j2 i3 - 3
[ Info: inner j3 i3 - 3
[ Info: inner j1 i2 - 2
[ Info: inner j2 i2 - 2
[ Info: inner j3 i2 - 2
[ Info: inner j1 i1 - 1
[ Info: inner j2 i1 - 1
[ Info: inner j3 i1 - 1

julia +1.12 -t6

julia> Threads.@threads for i in 1:3
           @info "outer i$i - $(Threads.threadid())"
           Threads.@threads for j in 1:3
               @info "inner j$j i$i - $(Threads.threadid())"
           end
       end
[ Info: outer i3 - 4
[ Info: outer i1 - 6
[ Info: outer i2 - 5
[ Info: inner j1 i1 - 2
[ Info: inner j3 i3 - 6
[ Info: inner j2 i1 - 5
[ Info: inner j1 i2 - 3
[ Info: inner j2 i3 - 2
[ Info: inner j1 i3 - 4
[ Info: inner j3 i1 - 4
[ Info: inner j2 i2 - 6
[ Info: inner j3 i2 - 7

Problem

The inner loop in check_dynamic_linkage runs in parallel:

Threads.@threads for libname in collect(keys(libs))

Each iteration may call patchelf (via update_linkage, relink_to_rpath, or ensure_soname) on the same binary file (path(oh)). Without synchronization, concurrent patchelf writes corrupt the file.

Solution

Add per-file locking via with_patchelf_lock(path) to serialize patchelf operations on the same binary
Capture subprocess output and include it in error messages when commands fail

Comment thread azure-pipelines.yml Outdated
@IanButterworth IanButterworth changed the title Show subprocess output in error message when run_with_io fails Fix concurrency issues. Show subprocess output in errors Nov 29, 2025
@IanButterworth IanButterworth changed the title Fix concurrency issues. Show subprocess output in errors Fix patchelf concurrency issues. Show subprocess output in errors Nov 29, 2025
Remove nested Threads.@threads over libraries since the outer loop over
files already provides parallelism. Nested threading caused:
1. Race conditions with patchelf/install_name_tool on the same file
2. ConcurrencyViolationError in OutputCollectors when multiple sandbox
   operations ran in parallel
@giordano giordano force-pushed the ib/show_subprocess_io branch from cb6c03e to 5280b43 Compare November 29, 2025 11:30
Wrap the entire read-modify-verify cycle in with_patchelf_lock() for
Linux/BSD ELF operations. This prevents concurrent ObjectFile reads
from failing with EOFError when another thread is running patchelf
on the same file.

Split update_linkage into _update_linkage_macho and _update_linkage_elf,
and ensure_soname into _ensure_soname_macho and _ensure_soname_elf for
cleaner platform separation.
Sys.isbsd(platform) returns true for macOS since Darwin is BSD-based,
but macOS uses install_name_tool not patchelf. Check Sys.isapple first.
@giordano giordano force-pushed the ib/show_subprocess_io branch from 5280b43 to bb5f705 Compare November 29, 2025 11:31
Comment thread src/auditor/dynamic_linkage.jl Outdated
return
end

# macOS uses install_name_tool (check first since Sys.isbsd is true for macOS too)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this comment

Suggested change
# macOS uses install_name_tool (check first since Sys.isbsd is true for macOS too)
# macOS uses install_name_tool

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude initially didn't realize isbsd covered macos. This is just a correction comment that I should've dropped

Comment thread src/auditor/soname_matching.jl
Comment thread src/auditor/soname_matching.jl Outdated
end

function _ensure_soname_macho(prefix::Prefix, path::AbstractString, platform::AbstractPlatform;
verbose::Bool = false, autofix::Bool = false, subdir::AbstractString="")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that this function takes autofix as argument is code smell: we shouldn't call this function to start with if autofixis false.

Comment thread src/auditor/soname_matching.jl Outdated
Comment on lines +88 to +104
if !retval
@lock AUDITOR_LOGGING_LOCK @warn("Unable to set SONAME on $(rel_path)")
return false
end

# Read the SONAME back in and ensure it's set properly
new_soname = get_soname(path)
if new_soname != soname
@lock AUDITOR_LOGGING_LOCK @warn("Set SONAME on $(rel_path) to $(soname), but read back $(string(new_soname))!")
return false
end

if verbose
@lock AUDITOR_LOGGING_LOCK @info("Set SONAME of $(rel_path) to \"$(soname)\"")
end

return true

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More duplication

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really sure defining the two functions buys us anything, I'm going to remove them.

Comment thread src/auditor/soname_matching.jl Outdated
return true
end

# macOS uses install_name_tool (check first since Sys.isbsd is true for macOS too)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fun, also weird comments are duplicated

@giordano giordano force-pushed the ib/show_subprocess_io branch 2 times, most recently from ac7aa14 to f939257 Compare November 29, 2025 12:16
@giordano giordano force-pushed the ib/show_subprocess_io branch from f939257 to 8440c20 Compare November 29, 2025 12:17
@giordano giordano merged commit ee05c44 into master Nov 29, 2025
10 checks passed
@giordano giordano deleted the ib/show_subprocess_io branch November 29, 2025 13:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants