fix(connection): reload onRequest in onProcess to avoid dead loop #425
Open
014-code wants to merge 1 commit into
… "When SetOnRequest is called from inside the onConnect callback (or from inside a previous onRequest call) the task goroutine spawned by onProcess kept using the stale handler snapshot captured as a closure parameter. If the old handler (typically a no-op placeholder) failed to consume the input buffer, the for-loop at the START label would never break, the processing lock would never be released, and the goroutine would spin forever, effectively killing the connection. This change reloads the latest handler from c.onRequestCallback at the entry of START and at the end of each onRequest call inside the processing loop, so any handler swap made by the user is picked up immediately. Closes cloudwego#421
What type of PR is this?
What this PR does / why we need it:
Fixes the dead-loop described in #421. When `SetOnRequest` is called from inside the `OnConnect` callback (or from inside a previous `OnRequest` call), the task goroutine spawned by `onProcess` keeps using the stale handler snapshot captured as a closure parameter. If the old handler (typically a no-op placeholder) fails to consume the input buffer, the `for` loop at the `START:` label never breaks, the `processing` lock is never released, and the goroutine spins forever, effectively killing the connection.
Root cause
In `connection_onevent.go::onConnect`, the handler is loaded once from the atomic before `onProcess` is called.
Inside `onProcess`, the snapshot is captured by the `task` closure and reused:

    START:
        if onRequest != nil && c.Reader().Len() > 0 {
            _ = onRequest(c.ctx, c) // stale
        }
        for {
            ...
            _ = onRequest(c.ctx, c) // stale
        }

When the user calls `conn.SetOnRequest(newHandler)` from inside the `OnConnect` callback, the snapshot is no longer current, but the task continues to invoke the old one. A no-op placeholder never consumes the buffered data, `c.Reader().Len() > 0` stays true, the loop spins, and the lock is never released.
Fix
Reload the latest handler from `c.onRequestCallback` at the entry of `START:` and after each `onRequest` call inside the processing loop, so any handler swap is picked up immediately. No API change, no signature change, ~8 lines added.

    START:
        onRequest, _ = c.onRequestCallback.Load().(OnRequest)
        if onRequest != nil && c.Reader().Len() > 0 {
            _ = onRequest(c.ctx, c)
        }
        for {
            ...
            _ = onRequest(c.ctx, c)
            onRequest, _ = c.onRequestCallback.Load().(OnRequest) // pick up SetOnRequest
        }

Trigger conditions
The bug was reproducible when all of the following held:
- A placeholder `OnRequest` was installed via `NewEventLoop(...)` or an early `SetOnRequest`.
- `OnConnect` was configured via `WithOnConnect`.
- `OnConnect` called `conn.SetOnRequest(realHandler)` to swap the placeholder.
- Data was already in `inputBuffer` (or arrived) before the task reached `START:`; this is common under load, or when `OnConnect` did any non-trivial work (handshake, RPC endpoint construction, ~10ms+ latency).
Impact when triggered
- The `processing` lock is never released → the connection becomes a black hole; all subsequent bytes are dropped.
Side benefit
The same re-read in the for-loop body also fixes a symmetric case where `SetOnRequest` is called from inside a regular `OnRequest` handler, not just from `OnConnect`. The original code had the same staleness problem on that path too; the reload now handles both uniformly.
Performance
One additional `atomic.Value.Load` per `START:` entry and per for-loop iteration. Cost is a single pointer read with atomic semantics (~1ns on amd64), negligible compared to the user `OnRequest` call (µs–ms).
Which issue(s) this PR fixes:
Fixes #421
Checklist
- … `go build ./...`, `go vet ./...` on linux/amd64)
- … `gofmt` …
- … `onProcess` dead-loop risk #421 into `connection_test.go` and adding a `-race` run to CI)
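
The stale-snapshot behaviour and the effect of the reload can be modelled without netpoll. The following is a minimal sketch, not the reproducer from #421 and not netpoll's actual code: `conn`, `handler`, `process`, `run`, and the `maxIter` bound are hypothetical names introduced for illustration, `pending` stands in for `c.Reader().Len()`, and `cb` stands in for `c.onRequestCallback`. The model reloads the handler before each call rather than after it, which picks up a swap just the same.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// handler is a hypothetical stand-in for netpoll's OnRequest.
type handler func(c *conn)

// conn is a heavily simplified model of a connection.
type conn struct {
	pending int          // models c.Reader().Len()
	cb      atomic.Value // models c.onRequestCallback; always stores a handler
}

func (c *conn) setOnRequest(h handler) { c.cb.Store(h) }

// load returns the current handler. The comma-ok assertion yields nil
// instead of panicking when nothing has been stored yet.
func (c *conn) load() handler {
	h, _ := c.cb.Load().(handler)
	return h
}

// process models the processing loop. With reload=false it keeps calling the
// snapshot it was given (old behaviour); with reload=true it re-reads the
// handler before every call. maxIter bounds the loop so that the stale case
// terminates in this model instead of spinning forever.
func process(c *conn, snapshot handler, reload bool, maxIter int) int {
	h := snapshot
	iters := 0
	for c.pending > 0 && iters < maxIter {
		if reload {
			h = c.load()
		}
		if h == nil {
			break
		}
		h(c)
		iters++
	}
	return iters
}

// run sets up the trigger conditions: a no-op placeholder, a snapshot taken
// before the swap, then a swap to a handler that actually drains the buffer.
func run(reload bool) (iters, left int) {
	c := &conn{pending: 3}
	c.setOnRequest(func(*conn) {})                  // no-op placeholder
	snapshot := c.load()                            // taken before "OnConnect" runs
	c.setOnRequest(func(c *conn) { c.pending = 0 }) // swap made inside "OnConnect"
	iters = process(c, snapshot, reload, 1000)
	return iters, c.pending
}

func main() {
	fmt.Println(run(false)) // 1000 3: stale placeholder never drains the buffer
	fmt.Println(run(true))  // 1 0: reloaded handler drains it on the first call
}
```

In the stale case the loop only stops because of the artificial `maxIter` bound; in the real code there is no such bound, which is why the goroutine spins.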