fix: prevent GenServer.call timeout crash in StreamableHTTP Plug #246
larskluge wants to merge 1 commit into cloudwalk:main
The handle_message_for_sse/4 and handle_message/4 functions used the default 5s GenServer.call timeout. When tool execution takes longer (e.g. LLM calls), this caused an unhandled exit that crashed the HTTP connection; the client received no error response at all.

Two fixes:

1. Default GenServer.call timeout to :infinity for handle_message and handle_message_for_sse, since the actual timeout is already enforced internally via the transport's request_timeout option.
2. Catch :exit in the Plug's handle_sse_request and handle_json_request so that any unexpected exits return a proper JSON-RPC error response instead of silently crashing the connection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Solid fix for a real production pain point. The :infinity default is the right call — the actual timeout budget is already governed by the transport's internal request_timeout, so the outer GenServer.call timeout was just a landmine waiting to go off on any slow tool. The catch :exit guards in the Plug are good defensive layering on top.
Two observations worth addressing in a follow-up:
    {:error, error} ->
      handle_request_error(conn, error, body)
    end
    catch
The catch :exit block here and the identical one in handle_json_request (line ~281) are copy-pasted. If the error handling logic ever changes (e.g. adding a different log level or shaping the error differently), both need updating. Consider extracting into a private handle_exit(conn, reason, body) helper to keep it DRY.
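A minimal sketch of the extraction suggested above, not the PR's actual code. To stay runnable without Plug, `conn` is a plain map here and the wrapped call is passed in as a function; the module name `ExitHandlingSketch`, the `guarded/3` wrapper, and the `:resp_body` key are all hypothetical. The only fixed value is `-32603`, the JSON-RPC 2.0 code for "Internal error".

```elixir
defmodule ExitHandlingSketch do
  # Hypothetical stand-ins for the two Plug functions; both delegate to the
  # same guarded wrapper so the exit handling lives in one place.
  def handle_sse_request(conn, body, fun), do: guarded(conn, body, fun)
  def handle_json_request(conn, body, fun), do: guarded(conn, body, fun)

  # Runs the request and converts any exit into an error response.
  defp guarded(conn, body, fun) do
    fun.()
  catch
    :exit, reason -> handle_exit(conn, reason, body)
  end

  # Single place to shape (and, in real code, log) the error.
  defp handle_exit(conn, _reason, body) do
    error = %{
      "jsonrpc" => "2.0",
      "id" => Map.get(body, "id"),
      # -32603 is the JSON-RPC 2.0 "Internal error" code.
      "error" => %{"code" => -32603, "message" => "Internal error"}
    }

    # Assumption: a real Plug would send this through Plug.Conn instead.
    Map.put(conn, :resp_body, error)
  end
end
```

With this shape, a change to the log level or error payload touches only `handle_exit/3`.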
    {:error, error} ->
      handle_request_error(conn, error, body)
    end
    catch
There's no automated test exercising this catch :exit path — the PR description leaves the manual verification checkbox unchecked. Worth adding an ExUnit test with a stubbed transport that exits (Process.exit(self(), :kill) from a mock handler) to verify the Plug returns a proper JSON-RPC internal_error response rather than dropping the connection. This path is easy to accidentally break in a refactor.
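One way such a test could look, as a self-contained sketch rather than a test against the real Plug. `CrashingTransport` and `PlugSketch` are hypothetical stand-ins: the stub kills itself while serving the call, and the wrapper mimics the catch :exit guard by returning a JSON-RPC internal error map. The stub is started with `GenServer.start/2` (unlinked) so the kill does not take the test process down with it.

```elixir
ExUnit.start()

defmodule CrashingTransport do
  # Hypothetical stub transport: dies while handling any call.
  use GenServer

  @impl true
  def init(:ok), do: {:ok, nil}

  @impl true
  def handle_call(_msg, _from, state) do
    Process.exit(self(), :kill)
    # Fallthrough in case the kill signal is not processed immediately;
    # the caller never gets a reply either way.
    {:noreply, state}
  end
end

defmodule PlugSketch do
  # Hypothetical stand-in for handle_json_request with the exit guard.
  def handle_json_request(transport, body) do
    GenServer.call(transport, {:handle_message, body}, :infinity)
  catch
    :exit, _reason ->
      %{
        "jsonrpc" => "2.0",
        "id" => body["id"],
        "error" => %{"code" => -32603, "message" => "Internal error"}
      }
  end
end

defmodule CatchExitTest do
  use ExUnit.Case, async: true

  test "returns a JSON-RPC internal error when the transport exits" do
    {:ok, transport} = GenServer.start(CrashingTransport, :ok)

    response = PlugSketch.handle_json_request(transport, %{"id" => 1})

    assert response["error"]["code"] == -32603
    assert response["id"] == 1
  end
end
```

A real version would build a test conn and assert on the HTTP response body, but the core check is the same: the caller gets an error value back instead of exiting.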
Problem
handle_message_for_sse/4 and handle_message/4 in StreamableHTTP use the default 5s GenServer.call timeout. When a tool handler takes longer than 5 seconds (common for LLM-backed tools, database queries, or network calls), the GenServer.call raises an unhandled :exit that crashes the Plug process. The MCP client receives a generic HTTP connection error with no indication of what went wrong.

The transport already has a configurable request_timeout (default 30s) that's correctly used for the internal forward_request_to_server call, but the outer GenServer.call from the public API functions times out first.

Fix

1. Default GenServer.call timeout to :infinity for handle_message/5 and handle_message_for_sse/5, since the actual timeout is already enforced internally via the transport's request_timeout option. A new optional timeout parameter is added for callers that want explicit control.
2. Catch :exit in the Plug for both handle_sse_request and handle_json_request, returning a proper JSON-RPC internal error response instead of silently crashing the connection.

Test plan

- streamable_http_test.exs tests pass (9 tests)
- streamable_http/plug tests pass (15 tests)

🤖 Generated with Claude Code
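The timeout behaviour described in this PR can be reproduced in isolation. The sketch below is illustrative only: `SlowServer` is a hypothetical GenServer, not the library's transport, and its `handle_message/3` only mirrors the idea of an optional timeout argument that defaults to :infinity. A short explicit timeout makes the caller exit with `{:timeout, _}`, while the default waits for the reply.

```elixir
defmodule SlowServer do
  # Hypothetical stand-in for a transport that runs a slow tool.
  use GenServer

  def start_link(delay_ms), do: GenServer.start_link(__MODULE__, delay_ms)

  # Optional timeout, defaulting to :infinity as in the fix described above.
  def handle_message(pid, message, timeout \\ :infinity) do
    GenServer.call(pid, {:handle_message, message}, timeout)
  end

  @impl true
  def init(delay_ms), do: {:ok, delay_ms}

  @impl true
  def handle_call({:handle_message, message}, _from, delay_ms) do
    # Simulate a tool that takes a while to respond.
    Process.sleep(delay_ms)
    {:reply, {:ok, message}, delay_ms}
  end
end

defmodule TimeoutDemo do
  def run do
    {:ok, pid} = SlowServer.start_link(200)

    # A 50 ms call timeout against a 200 ms handler: the caller exits,
    # which is the failure mode the PR describes (with 5 s instead of 50 ms).
    short =
      try do
        SlowServer.handle_message(pid, "ping", 50)
      catch
        :exit, {:timeout, _} -> :timed_out
      end

    # With the :infinity default the caller simply waits for the reply.
    long = SlowServer.handle_message(pid, "ping")

    {short, long}
  end
end
```

Calling `TimeoutDemo.run()` exercises both paths; the server itself keeps running after the first call times out, because only the caller gives up.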