fix: prevent sub-agent hang after OAuth callback server is started #43

Open
blai wants to merge 1 commit into nicobailon:main from blai:main

blai commented Apr 12, 2026

OAuth changes added initializeOAuth() on session_start, which starts an HTTP callback server. Two defects caused sub-agent processes to hang indefinitely after completing their task:

  1. session_shutdown did not call shutdownOAuth(), so the callback server was never closed when a session ended. Fix: call shutdownOAuth() alongside shutdownState() in the session_shutdown handler.

  2. The HTTP server was created without server.unref(), so its listening socket held a strong event-loop reference. Even with the shutdown fix in place, any path that skipped the handler (e.g. an uncaught throw) would keep the process alive. Fix: call server.unref() immediately after a successful bind so Node treats the server as a background resource and exits naturally when no foreground work remains.
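
The two fixes above could look roughly like the sketch below, assuming the callback server is a plain `node:http` server. The module-level `callbackServer` handle, the `port` parameter, and the `onSessionShutdown` wrapper are hypothetical. Only `initializeOAuth()`, `shutdownOAuth()`, `shutdownState()`, and `server.unref()` are named in this PR, and their real signatures may differ.

```typescript
import { createServer, Server } from "node:http";

// Hypothetical module-level handle; the adapter's real state layout is not shown in this PR.
let callbackServer: Server | undefined;

// Start the OAuth callback server. Resolves once the bind succeeds.
function initializeOAuth(port: number): Promise<Server> {
  return new Promise((resolve, reject) => {
    const server = createServer((_req, res) => {
      res.statusCode = 204;
      res.end();
    });
    // A failed bind (e.g. EADDRINUSE) rejects here and never reaches unref().
    server.once("error", reject);
    server.listen(port, "127.0.0.1", () => {
      // Fix 2: the listening socket no longer keeps the event loop alive.
      server.unref();
      callbackServer = server;
      resolve(server);
    });
  });
}

// Fix 1: close the callback server when the session ends.
function shutdownOAuth(): Promise<void> {
  const server = callbackServer;
  callbackServer = undefined;
  if (!server) return Promise.resolve();
  return new Promise((resolve) => server.close(() => resolve()));
}

// Hypothetical session_shutdown handler; shutdownState stands in for the adapter's existing cleanup.
async function onSessionShutdown(shutdownState: () => Promise<void>): Promise<void> {
  await shutdownOAuth();
  await shutdownState();
}
```

The order of `shutdownOAuth()` and `shutdownState()` here is a guess; the PR only says the two are called alongside each other.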

Tests: added regression coverage in two new suites:

  • index-lifecycle: asserts shutdownOAuth is called on session_shutdown
  • mcp-callback-server-unref: asserts server.unref() is called on successful bind and NOT called on a failed bind (EADDRINUSE)
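
As an illustration of what the second suite checks, here is a minimal sketch that uses a hand-rolled fake instead of a real socket. `ServerLike`, `bindCallbackServer`, `FakeServer`, and `checkUnrefBehaviour` are all hypothetical names; the PR's actual tests and test framework are not shown here and may be structured differently.

```typescript
// Minimal structural type covering the parts of http.Server this sketch needs.
interface ServerLike {
  once(event: "error", listener: (err: Error) => void): void;
  listen(port: number, onListening: () => void): void;
  unref(): void;
}

// Hypothetical function under test: unref only after a successful bind.
function bindCallbackServer(server: ServerLike, port: number): Promise<void> {
  return new Promise((resolve, reject) => {
    server.once("error", reject);
    server.listen(port, () => {
      server.unref();
      resolve();
    });
  });
}

// Fake server that counts unref() calls and either binds or fails with EADDRINUSE.
class FakeServer implements ServerLike {
  unrefCalls = 0;
  private onError: ((err: Error) => void) | undefined;
  private failBind: boolean;

  constructor(failBind: boolean) {
    this.failBind = failBind;
  }

  once(_event: "error", listener: (err: Error) => void): void {
    this.onError = listener;
  }

  listen(_port: number, onListening: () => void): void {
    if (this.failBind) {
      const err = Object.assign(new Error("listen EADDRINUSE"), { code: "EADDRINUSE" });
      this.onError?.(err);
    } else {
      onListening();
    }
  }

  unref(): void {
    this.unrefCalls += 1;
  }
}

// Regression checks mirroring the two cases named above.
async function checkUnrefBehaviour(): Promise<void> {
  const ok = new FakeServer(false);
  await bindCallbackServer(ok, 0);
  if (ok.unrefCalls !== 1) throw new Error("unref() should run once after a successful bind");

  const busy = new FakeServer(true);
  await bindCallbackServer(busy, 0).then(
    () => { throw new Error("bind should have failed"); },
    () => undefined,
  );
  if (busy.unrefCalls !== 0) throw new Error("unref() must not run after a failed bind");
}
```

Injecting the server keeps the check independent of which ports happen to be free on the test machine.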
