In Fedora CoreOS we've been running into an odd bug in our upgrade tests in our pipeline. I filed an issue for that over in coreos/fedora-coreos-tracker#2149 (it's a good read if you have time).
Following on from that I had AI analyze to see if the bug was fixed in newer code and it thinks it isn't and needs to be fixed. I generated this AI bug report which I think summarizes the problem well:
Summary
skopeo experimental-image-proxy hangs permanently when an HTTP connection stalls mid-transfer during a blob download. The proxy's FinishPipe handler blocks on wg.Wait() while holding the handler mutex, which deadlocks the entire proxy — no further requests can be processed, and the calling application (rpm-ostree, bootc, etc.) hangs indefinitely.
This was observed during an automated Fedora CoreOS upgrade via zincati/rpm-ostree. The Akamai CDN serving a Quay.io blob stopped sending data at 76% completion while keeping the TCP connection alive. Because there is no idle timeout anywhere in the pipeline (containers/image HTTP client, skopeo proxy goroutine, or the proxy protocol), the entire system deadlocked permanently.
Environment
- OS: Fedora CoreOS (cosa-devsh VM, QEMU NAT networking)
- Package:
rpm-ostree-2025.6-6.fc42.x86_64
- skopeo: bundled with rpm-ostree (protocol version
0.2.6)
- Image:
quay.io/fedora/fedora-coreos@sha256:c6a73015376ade9c5ae50e8cca5356e542a51edb8df4a05590abfaf268edeaa9
- Triggered by:
zincati.service (automatic update agent)
- Proxy invocation:
setpriv --no-new-privs --init-groups --reuid rpm-ostree --bounding-set -all --pdeathsig TERM -- skopeo experimental-image-proxy --no-creds
Symptoms
With CONTAINERS_IMAGE_PROXY_DEBUG=1, the process prints the following and then hangs forever:
DEBUG new_with_config: Spawned skopeo pid=3978 config=ImageProxyConfig { authfile: None, auth_data: None, auth_anonymous: true, ... }
DEBUG new_with_config: Remote protocol version: 0.2.6 ...
Fetching ostree-remote-image:fedora:docker://quay.io/fedora/fedora-coreos@sha256:c6a730...
DEBUG open_image: opening image ...
DEBUG finish_pipe: closing pipe ... pipeid=7
DEBUG finish_pipe: closing pipe ... pipeid=8
layers already present: 36; layers needed: 15 (245.6 MB)
DEBUG Fetching base
DEBUG get_layer_info: Getting layer info ...
DEBUG fetching sha256:a283e9be7391dc4a3d858e321574277c621faad113a8f6b36e0511cd05352448
DEBUG get_blob: fetching blob ... size=2759859
DEBUG layer progress: [0/15] Fetching ostree chunk a283e9be7391dc4a3d8 (2.8 MB)
DEBUG finish_pipe: closing pipe ... pipeid=8
← hangs here forever, no further output
Root Cause Analysis
Three issues combine to produce a permanent deadlock:
1. CDN stalls mid-transfer (trigger)
The Akamai CDN edge server (23.61.255.227, serving cdn01.quay.io) stopped sending data after delivering 76% (2,124,175 of 2,759,859 bytes) of a blob. The TCP connection remained alive — keepalive ACKs continued flowing — but no application data was sent. ss -tnpi output during the hang:
ESTAB 0 0 10.0.2.15:50856 23.61.255.227:443 users:(("skopeo",pid=3978,fd=6))
bytes_received:2124175
lastsnd:1341635 ← last data sent ~22 minutes ago
lastrcv:1341446 ← last data received ~22 minutes ago
lastack:12438 ← TCP keepalive ACKs still flowing (~12 sec ago)
rcv_wnd:2636640 ← skopeo advertising 2.5 MB receive window (not a flow control issue)
2. No idle timeout on HTTP body reads (containers/image)
The containers/image library's HTTP transport (pkg/tlsclientconfig/tlsclientconfig.go:88-101) sets TCP-level timeouts but has no HTTP body read timeout:
func NewTransport() *http.Transport {
direct := &net.Dialer{
Timeout: 30 * time.Second, // TCP connect only
KeepAlive: 30 * time.Second, // TCP keepalive interval
}
tr := &http.Transport{
Proxy: http.ProxyFromEnvironment,
DialContext: direct.DialContext,
TLSHandshakeTimeout: 10 * time.Second,
IdleConnTimeout: 90 * time.Second, // idle conn pool, NOT active reads
MaxIdleConns: 100,
// No ResponseHeaderTimeout
// No overall Timeout on http.Client
// No body-read idle timeout
}
return tr
}
The bodyReader wrapper (docker/body_reader.go:139) tracks lastSuccessTime (line 147) but only uses it for debug logging (line 224-226), never to enforce a deadline. Its reconnection logic only handles hard failures (ECONNRESET, ErrUnexpectedEOF), not stalls where the connection stays open but no data arrives.
3. FinishPipe holds mutex during wg.Wait() — deadlocks entire proxy (skopeo/container-libs)
In the proxy handler (common/pkg/json-proxy/handler.go):
GetBlob (line 427) spawns a goroutine that copies the blob from the HTTP response body to a pipe:
func (h *handler) GetBlob(ctx context.Context, args []any) (replyBuf, error) {
h.lock.Lock()
defer h.lock.Unlock()
// ...
go func() {
defer blobr.Close()
defer f.wg.Done()
// ...
n, err := io.Copy(f.w, tr) // line 472 — blocks if HTTP body stalls
// ...
}()
// returns pipeid + fd to client
}
The goroutine at line 466 does not use any context — it does a raw io.Copy with no cancellation mechanism. If the HTTP body read stalls, this goroutine blocks forever.
FinishPipe (line 664) waits for the goroutine to complete while holding the handler mutex:
func (h *handler) FinishPipe(ctx context.Context, args []any) (replyBuf, error) {
h.lock.Lock() // line 665
defer h.lock.Unlock() // line 666
// ...
f.wg.Wait() // line 682 — blocks forever if goroutine is stuck
// ...
}
Since h.lock is held, all other handler methods (GetBlob, GetManifest, OpenImage, etc.) are blocked. The proxy cannot process any requests — including a potential shutdown/cleanup — until the stalled goroutine completes, which it never will because the CDN has stopped sending data.
Deadlock Diagram
rpm-ostree (pid 3980) skopeo proxy (pid 3986) Akamai CDN
sendto(29, FinishPipe[8]) ────────► recvfrom(3, FinishPipe[8])
h.lock.Lock() ← acquired
f.wg.Wait() ← BLOCKED
recvmsg(29) ← BLOCKED │
waiting for response │ goroutine still running:
that never comes │ io.Copy(pipe, httpBody)
│ │
│ └──► Read(fd=6) ──────────► CDN stalled
│ EAGAIN no data
│ epoll: no events keepalive OK
│ goroutine parked connection alive
│ but silent
▼
wg never Done()
lock never released
proxy fully deadlocked
Protocol Message Trace (from strace -f)
The image proxy protocol uses a SOCK_SEQPACKET Unix socket. JSON requests are sent from rpm-ostree, and JSON responses (with optional file descriptors via SCM_RIGHTS) come back from skopeo. The strace shows 9 requests sent but only 8 responses received:
| # |
Request |
Response |
Status |
| 1 |
version handshake |
"0.2.6" |
OK |
| 2 |
OpenImage(docker://quay.io/...) |
value: 1 |
OK |
| 3 |
GetManifest(1) |
sha256:c6a7..., pipeid=7, fd passed |
OK |
| 4 |
FinishPipe(7) |
null |
OK |
| 5 |
get config/metadata |
null, pipeid=8, fd passed |
OK |
| 6 |
FinishPipe(8) |
null |
OK |
| 7 |
GetLayerInfo(1) |
8178 bytes of layer JSON |
OK |
| 8 |
GetBlob(1, "sha256:a283e9...", 2759859) |
value: 2759859, pipeid=8, fd passed |
OK |
| 9 |
FinishPipe(8) |
NO RESPONSE |
DEADLOCK |
Evidence from strace — skopeo received all 9 requests on fd 3:
[pid 3978] recvfrom(3, "{\"method\":\"OpenImage\",\"args\":[\"docker://quay.io/...\"]}", ...)
[pid 3984] recvfrom(3, "{\"method\":\"GetManifest\",\"args\":[1]}", ...)
[pid 3984] recvfrom(3, "{\"method\":\"FinishPipe\",\"args\":[7]}", ...)
[pid 3986] recvfrom(3, "{\"method\":\"FinishPipe\",\"args\":[8]}", ...)
[pid 3986] recvfrom(3, "{\"method\":\"GetLayerInfo\",\"args\":[1]}", ...)
[pid 3986] recvfrom(3, "{\"method\":\"GetBlob\",\"args\":[1,\"sha256:a283e9...\",2759859]}", ...)
[pid 3986] recvfrom(3, "{\"method\":\"FinishPipe\",\"args\":[8]}", ...) ← received, never responded
But only 8 sendmsg(3, ...) responses were sent — the 9th (for FinishPipe(8)) was never sent because the handler is stuck in wg.Wait().
Evidence from strace — rpm-ostree blocked waiting for the response:
[pid 3980] sendto(29, '{"method":"FinishPipe","args":[8]}', ...) = 34
[pid 3980] recvmsg(29, ... ← BLOCKED, waiting for response that never comes
Evidence from strace — skopeo's blob streaming goroutine stalled:
[pid 3986] write(8, ..., 4264) = 4264 # last chunk written to pipe (fd 8)
[pid 3986] read(6, ...) = -1 EAGAIN # no more data from CDN TLS connection (fd 6)
[pid 3986] epoll_pwait(4, [], ...) = 0 # no events on any fd
[pid 3986] futex(FUTEX_WAIT, ...) # goroutine parked by Go runtime
Download progress at stall
| Metric |
Value |
| Expected blob size |
2,759,859 bytes |
Bytes received from CDN (ss bytes_received) |
2,124,175 bytes |
Bytes written to pipe (summed from strace write(8,...)) |
~2,107,617 bytes |
| Completion |
76.4% |
Why This Only Affects the Proxy Path
This deadlock does not occur with direct podman pull or skopeo copy because:
-
Direct pulls use the caller's context (which can be canceled via ctrl-C, systemd timeout, etc.). When the context is canceled, Go's net/http closes the underlying TCP connection, and the Read() returns an error. The stall is recoverable.
-
The proxy path passes context.Background() to the handler — no deadline, no cancellation. The io.Copy goroutine has no context at all. The only thing that can unblock it is the TCP connection actually dying (via TCP keepalive, which can take hours with the OS defaults).
-
The mutex in FinishPipe amplifies the impact: a stalled blob download doesn't just block that one operation — it deadlocks the entire proxy, preventing any other request from being processed including potential cleanup/shutdown.
Suggested Fixes
Fix 1 (proxy — high priority): Don't hold mutex during wg.Wait()
In handler.go, FinishPipe should release the mutex before waiting:
func (h *handler) FinishPipe(ctx context.Context, args []any) (replyBuf, error) {
h.lock.Lock()
// ... validate pipeid, get f ...
h.lock.Unlock() // release before blocking
f.wg.Wait() // now other requests can proceed while we wait
h.lock.Lock() // re-acquire to clean up
delete(h.activePipes, pipeid)
h.lock.Unlock()
f.w.Close()
return ret, f.err
}
Fix 2 (proxy — high priority): Wire context through to the io.Copy goroutine
The GetBlob goroutine should respect context cancellation:
go func() {
defer blobr.Close()
defer f.wg.Done()
verifier := d.Verifier()
tr := io.TeeReader(blobr, verifier)
// Use a context-aware copy
done := ctx.Done()
buf := make([]byte, 32*1024)
for {
select {
case <-done:
f.err = ctx.Err()
return
default:
}
n, err := tr.Read(buf)
if n > 0 {
if _, werr := f.w.Write(buf[:n]); werr != nil {
f.err = werr
return
}
}
if err != nil {
if err != io.EOF {
f.err = err
}
return
}
}
}()
Fix 3 (containers/image — medium priority): Add idle timeout to bodyReader
The bodyReader in docker/body_reader.go already tracks lastSuccessTime. It should use this to enforce an idle timeout, e.g.:
func (br *bodyReader) Read(p []byte) (int, error) {
// Set a per-read deadline on the underlying connection
if br.idleTimeout > 0 && !br.lastSuccessTime.IsZero() {
if time.Since(br.lastSuccessTime) > br.idleTimeout {
return 0, fmt.Errorf("read idle timeout after %v with no data", br.idleTimeout)
}
}
// ... existing Read logic ...
}
Or better, wrap the response body with a deadline-enforcing reader at the http.Transport level.
This would fix the issue for all consumers of containers/image, not just the proxy.
Workaround
Kill the skopeo process to unblock rpm-ostree:
sudo kill $(pgrep -f 'skopeo experimental-image-proxy')
rpm-ostree/zincati will detect the child exit and can retry the upgrade.
Related
containers/image issue #1145 — RFE for per-layer retry on blob download failures
containers/image PR #1469 — proposed reducing TCP KeepAlive from 30s to 15s for faster dead connection detection
containers/image docker/body_reader.go — reconnection logic handles ECONNRESET/ErrUnexpectedEOF but not stalls
containers/container-libs common/pkg/json-proxy/handler.go — proxy handler with the mutex/wg.Wait deadlock
In Fedora CoreOS we've been running into an odd bug in our upgrade tests in our pipeline. I filed an issue for that over in coreos/fedora-coreos-tracker#2149 (it's a good read if you have time).
Following on from that I had AI analyze to see if the bug was fixed in newer code and it thinks it isn't and needs to be fixed. I generated this AI bug report which I think summarizes the problem well:
Summary
skopeo experimental-image-proxyhangs permanently when an HTTP connection stalls mid-transfer during a blob download. The proxy'sFinishPipehandler blocks onwg.Wait()while holding the handler mutex, which deadlocks the entire proxy — no further requests can be processed, and the calling application (rpm-ostree, bootc, etc.) hangs indefinitely.This was observed during an automated Fedora CoreOS upgrade via zincati/rpm-ostree. The Akamai CDN serving a Quay.io blob stopped sending data at 76% completion while keeping the TCP connection alive. Because there is no idle timeout anywhere in the pipeline (
containers/imageHTTP client, skopeo proxy goroutine, or the proxy protocol), the entire system deadlocked permanently.Environment
rpm-ostree-2025.6-6.fc42.x86_640.2.6)quay.io/fedora/fedora-coreos@sha256:c6a73015376ade9c5ae50e8cca5356e542a51edb8df4a05590abfaf268edeaa9zincati.service(automatic update agent)setpriv --no-new-privs --init-groups --reuid rpm-ostree --bounding-set -all --pdeathsig TERM -- skopeo experimental-image-proxy --no-credsSymptoms
With
CONTAINERS_IMAGE_PROXY_DEBUG=1, the process prints the following and then hangs forever:Root Cause Analysis
Three issues combine to produce a permanent deadlock:
1. CDN stalls mid-transfer (trigger)
The Akamai CDN edge server (
23.61.255.227, servingcdn01.quay.io) stopped sending data after delivering 76% (2,124,175 of 2,759,859 bytes) of a blob. The TCP connection remained alive — keepalive ACKs continued flowing — but no application data was sent.ss -tnpioutput during the hang:2. No idle timeout on HTTP body reads (
containers/image)The
containers/imagelibrary's HTTP transport (pkg/tlsclientconfig/tlsclientconfig.go:88-101) sets TCP-level timeouts but has no HTTP body read timeout:The
bodyReaderwrapper (docker/body_reader.go:139) trackslastSuccessTime(line 147) but only uses it for debug logging (line 224-226), never to enforce a deadline. Its reconnection logic only handles hard failures (ECONNRESET,ErrUnexpectedEOF), not stalls where the connection stays open but no data arrives.3.
FinishPipeholds mutex duringwg.Wait()— deadlocks entire proxy (skopeo/container-libs)In the proxy handler (
common/pkg/json-proxy/handler.go):GetBlob(line 427) spawns a goroutine that copies the blob from the HTTP response body to a pipe:The goroutine at line 466 does not use any context — it does a raw
io.Copywith no cancellation mechanism. If the HTTP body read stalls, this goroutine blocks forever.FinishPipe(line 664) waits for the goroutine to complete while holding the handler mutex:Since
h.lockis held, all other handler methods (GetBlob,GetManifest,OpenImage, etc.) are blocked. The proxy cannot process any requests — including a potential shutdown/cleanup — until the stalled goroutine completes, which it never will because the CDN has stopped sending data.Deadlock Diagram
Protocol Message Trace (from
strace -f)The image proxy protocol uses a
SOCK_SEQPACKETUnix socket. JSON requests are sent from rpm-ostree, and JSON responses (with optional file descriptors viaSCM_RIGHTS) come back from skopeo. The strace shows 9 requests sent but only 8 responses received:"0.2.6"OpenImage(docker://quay.io/...)value: 1GetManifest(1)sha256:c6a7...,pipeid=7, fd passedFinishPipe(7)nullnull,pipeid=8, fd passedFinishPipe(8)nullGetLayerInfo(1)GetBlob(1, "sha256:a283e9...", 2759859)value: 2759859,pipeid=8, fd passedFinishPipe(8)Evidence from strace — skopeo received all 9 requests on fd 3:
But only 8
sendmsg(3, ...)responses were sent — the 9th (forFinishPipe(8)) was never sent because the handler is stuck inwg.Wait().Evidence from strace — rpm-ostree blocked waiting for the response:
Evidence from strace — skopeo's blob streaming goroutine stalled:
Download progress at stall
ss bytes_received)write(8,...))Why This Only Affects the Proxy Path
This deadlock does not occur with direct
podman pullorskopeo copybecause:Direct pulls use the caller's context (which can be canceled via ctrl-C, systemd timeout, etc.). When the context is canceled, Go's
net/httpcloses the underlying TCP connection, and theRead()returns an error. The stall is recoverable.The proxy path passes
context.Background()to the handler — no deadline, no cancellation. Theio.Copygoroutine has no context at all. The only thing that can unblock it is the TCP connection actually dying (via TCP keepalive, which can take hours with the OS defaults).The mutex in
FinishPipeamplifies the impact: a stalled blob download doesn't just block that one operation — it deadlocks the entire proxy, preventing any other request from being processed including potential cleanup/shutdown.Suggested Fixes
Fix 1 (proxy — high priority): Don't hold mutex during
wg.Wait()In
handler.go,FinishPipeshould release the mutex before waiting:Fix 2 (proxy — high priority): Wire context through to the
io.CopygoroutineThe
GetBlobgoroutine should respect context cancellation:Fix 3 (
containers/image— medium priority): Add idle timeout tobodyReaderThe
bodyReaderindocker/body_reader.goalready trackslastSuccessTime. It should use this to enforce an idle timeout, e.g.:Or better, wrap the response body with a deadline-enforcing reader at the
http.Transportlevel.This would fix the issue for all consumers of
containers/image, not just the proxy.Workaround
Kill the skopeo process to unblock rpm-ostree:
rpm-ostree/zincati will detect the child exit and can retry the upgrade.
Related
containers/imageissue #1145 — RFE for per-layer retry on blob download failurescontainers/imagePR #1469 — proposed reducing TCP KeepAlive from 30s to 15s for faster dead connection detectioncontainers/imagedocker/body_reader.go— reconnection logic handlesECONNRESET/ErrUnexpectedEOFbut not stallscontainers/container-libscommon/pkg/json-proxy/handler.go— proxy handler with the mutex/wg.Wait deadlock