
fix(mediation): add ACK handshake to drain buffer before sending stdio fds#55

Merged
kipz merged 1 commit into develop from kipz/fix-emsgsize-stdio-ack on Jun 24, 2026
@kipz kipz commented Jun 24, 2026

Owner

Problem

Commands with large environments fail with:

nono-shim: failed to send stdio fds: Message too long (os error 40)

This is EMSGSIZE. The root cause is a race between the JSON write and the SCM_RIGHTS sendmsg. The shim sends the full process environment in the JSON request, which can be 6–9 KB and can fill the macOS UDS receive buffer (~8 KB by default) before the server has drained it. When the shim then immediately calls sendmsg with the SCM_RIGHTS control message, there is no contiguous buffer space left for the ancillary data, even with all three fds bundled into a single sendmsg.

Fix

Add a 1-byte ACK handshake between the JSON read and the fd receive:

  1. Server reads the JSON body (buffer now drained)
  2. Server sends 0x06 ACK
  3. Shim reads ACK, then calls sendmsg

This guarantees the buffer is empty at the point the SCM_RIGHTS message is sent.

Protocol change — shim and server must ship together.

Updated protocol:

1. Request:  u32 length || JSON body   (unchanged)
2. ACK:      0x06 from server          (new)
3. SCM_RIGHTS sendmsg — stdio fds      (unchanged)
4. Response: u32 length || JSON body   (unchanged)
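
Steps 1 and 2 of the updated protocol can be sketched with the standard library alone. This is a minimal illustration, not the code from this PR: the function names are hypothetical, the big-endian length prefix is an assumption (the description does not state the byte order), and the SCM_RIGHTS transfer in step 3 is left as a comment because it needs `sendmsg` from outside `std`.

```rust
use std::io::{self, Read, Write};
use std::os::unix::net::UnixStream;
use std::thread;

/// ACK byte named in the PR description.
const ACK: u8 = 0x06;

/// Shim side (hypothetical helper): write `u32 length || JSON body`,
/// then block until the server acknowledges that it has read the body.
fn send_request_and_wait_for_ack(sock: &mut UnixStream, json: &[u8]) -> io::Result<()> {
    if json.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "request too large"));
    }
    // Assumption: big-endian length prefix.
    sock.write_all(&(json.len() as u32).to_be_bytes())?;
    sock.write_all(json)?;

    let mut ack = [0u8; 1];
    sock.read_exact(&mut ack)?;
    if ack[0] != ACK {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "expected ACK (0x06)"));
    }
    // Step 3 would go here: sendmsg with SCM_RIGHTS carrying the stdio fds.
    Ok(())
}

/// Server side (hypothetical helper): read the whole request, which drains
/// it from the socket buffer, and only then send the ACK byte.
fn read_request_and_ack(sock: &mut UnixStream) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    sock.read_exact(&mut len_buf)?;
    let mut body = vec![0u8; u32::from_be_bytes(len_buf) as usize];
    sock.read_exact(&mut body)?;
    sock.write_all(&[ACK])?;
    Ok(body)
}

/// Runs both sides over a socket pair, with the server on its own thread,
/// and returns the body the server received.
fn round_trip(json: &[u8]) -> io::Result<Vec<u8>> {
    let (mut shim, mut server) = UnixStream::pair()?;
    let server_thread = thread::spawn(move || read_request_and_ack(&mut server));
    send_request_and_wait_for_ack(&mut shim, json)?;
    server_thread.join().expect("server thread panicked")
}
```

Calling `round_trip` with a body larger than the socket buffer shows the ordering: the shim's write completes only as the server reads, and the shim does not reach the point where the fd transfer would happen until the ACK arrives.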

Test

Added ack_handshake_round_trips to nono-shim — spins up a mock server thread that follows the new protocol, verifies the SCM_RIGHTS transfer succeeds, and checks the exit code.

@github-actions github-actions Bot added the bug Something isn't working label Jun 24, 2026
…o fds

The shim sends a length-prefixed JSON request that can be 6-9 KB in an
AI agent session (full process environment included). macOS UDS receive
buffers default to ~8 KB, so the JSON can fill the buffer before the
server has had a chance to drain it. The subsequent sendmsg with
SCM_RIGHTS then fails with EMSGSIZE because there is no contiguous space
left in the buffer for the ancillary control message — even with all
three fds batched into a single sendmsg call.

Fix: the server sends a single ACK byte (0x06) after reading the JSON
body (which drains the buffer), and the shim reads this ACK before
calling sendmsg. This ensures the buffer is always empty at the point
the SCM_RIGHTS message is sent.

Protocol change (shim and server must be updated together):
  1. Request:  u32 length || JSON body   (unchanged)
  2. ACK:      1 byte 0x06 from server   (new)
  3. SCM_RIGHTS sendmsg — stdin/stdout/stderr   (unchanged)
  4. Response: u32 length || JSON body   (unchanged)

Also adds an end-to-end test in nono-shim that exercises the full ACK
handshake and SCM_RIGHTS transfer with a mock server thread.
@kipz kipz force-pushed the kipz/fix-emsgsize-stdio-ack branch from 485620e to de4dfa9 Compare June 24, 2026 08:16
@kipz kipz marked this pull request as ready for review June 24, 2026 08:17
@kipz kipz merged commit bceedb0 into develop Jun 24, 2026
4 of 5 checks passed