fix: improve relay robustness — errgroup, ping handling, PAKE for local relay#1114

Open
abakum wants to merge 1 commit into schollz:main from abakum:fix/local-relay-robustness

abakum commented May 21, 2026

No description provided.

…al relay

- Replace sync.WaitGroup + panic with errgroup.Group (crash prevention)
- Add receiveSkippingPing helper for relay keepalive pings
- PAKE key exchange for encrypted IP discovery in local relay
- Fix race condition: send error to errchan on connection failure
- Gracefully refuse local relay on unexpected data
- Add golang.org/x/sync v0.10.0 dependency
abakum force-pushed the fix/local-relay-robustness branch from 5e8cb3d to 5a5731d on May 21, 2026 16:16

abakum commented May 21, 2026

fix: improve relay robustness — errgroup, ping handling, PAKE for local relay

This PR combines several robustness fixes for relay communication.

Part A: Replace panic with errgroup (critical fix)

Problem: In processMessagePake, panic(err) is called inside an anonymous goroutine launched via go func(j int). There is no recover in this goroutine, and the recover in the parent function processMessagePake cannot catch a panic from a child goroutine. The result: the entire process crashes when connecting to a relay port fails.

Solution: Replace sync.WaitGroup + panic with errgroup.Group, which records the first non-nil error returned by any goroutine and returns it from g.Wait(), so the caller can handle the failure gracefully instead of crashing. Additionally, goroutine-local variables splitErr and connErr replace captures of the outer err, fixing potential data races.
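The pattern can be sketched as follows. This is a minimal illustration, not the code from the diff: to stay free of external dependencies it uses a small stdlib stand-in named `group` that mimics the `Go`/`Wait` calling pattern of `errgroup.Group`, and `connectAll`, `dial`, `errRefused`, and the port numbers are hypothetical names chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// group is a hypothetical stdlib-only stand-in for errgroup.Group:
// Go runs f in its own goroutine, Wait blocks until all have finished
// and returns the first non-nil error that was reported.
type group struct {
	wg   sync.WaitGroup
	once sync.Once
	err  error
}

func (g *group) Go(f func() error) {
	g.wg.Add(1)
	go func() {
		defer g.wg.Done()
		if err := f(); err != nil {
			g.once.Do(func() { g.err = err })
		}
	}()
}

func (g *group) Wait() error {
	g.wg.Wait()
	return g.err
}

// connectAll dials every port concurrently. dial is a hypothetical
// stand-in for the real connection call.
func connectAll(ports []string, dial func(port string) error) error {
	var g group
	for _, port := range ports {
		port := port // per-iteration copy for Go versions before 1.22
		g.Go(func() error {
			// connErr is local to this goroutine; no shared outer err is written.
			if connErr := dial(port); connErr != nil {
				return fmt.Errorf("connect to port %s: %w", port, connErr)
			}
			return nil
		})
	}
	// The failure comes back as a value instead of a panic in a child goroutine.
	return g.Wait()
}

var errRefused = errors.New("connection refused")

func main() {
	err := connectAll([]string{"9010", "9011", "9012"}, func(port string) error {
		if port == "9011" {
			return errRefused
		}
		return nil
	})
	fmt.Println("error:", err) // error: connect to port 9011: connection refused
}
```

With the real package, `var g errgroup.Group` would take the place of the stand-in and the `g.Go` / `g.Wait` calls would stay the same.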

Part B: Skip relay keepalive pings

Problem: The relay sends periodic keepalive pings ([]byte{1}) while waiting for the second peer to connect. All Receive() calls must skip these pings to avoid protocol errors. Previously, ping handling was done inline in some places and missing in others.

Solution: Centralize ping-skipping logic in a new receiveSkippingPing helper method and apply it to all relevant Receive calls in receiveData, Receive, and transfer methods.
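A minimal sketch of such a helper, assuming the connection exposes `Receive() ([]byte, error)` and that a ping is exactly the single byte `1` as described above. The `receiver` interface and `fakeConn` type are hypothetical test scaffolding, and the helper is written as a free function here rather than as the method from the diff.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// receiver is a hypothetical interface covering the one call the helper needs.
type receiver interface {
	Receive() ([]byte, error)
}

// pingMsg is the keepalive payload described in the PR text.
var pingMsg = []byte{1}

// receiveSkippingPing reads messages until one arrives that is not a
// keepalive ping, then returns it. Errors are passed through unchanged.
func receiveSkippingPing(r receiver) ([]byte, error) {
	for {
		data, err := r.Receive()
		if err != nil {
			return nil, err
		}
		if bytes.Equal(data, pingMsg) {
			continue // keepalive from the relay, not protocol data
		}
		return data, nil
	}
}

// fakeConn replays a fixed queue of messages, then reports an error.
type fakeConn struct{ queue [][]byte }

func (f *fakeConn) Receive() ([]byte, error) {
	if len(f.queue) == 0 {
		return nil, errors.New("connection closed")
	}
	msg := f.queue[0]
	f.queue = f.queue[1:]
	return msg, nil
}

func main() {
	conn := &fakeConn{queue: [][]byte{{1}, {1}, []byte("hello")}}
	data, err := receiveSkippingPing(conn)
	fmt.Printf("%q %v\n", data, err) // "hello" <nil>
}
```

One consequence of this approach is that a genuine one-byte message with value `1` cannot be told apart from a ping, so the helper only fits call sites where such a payload is never expected.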

Part C: PAKE key exchange for local relay IP discovery

Problem: The receiver may not discover the sender via multicast (due to firewalls, different subnets, or IPv4/IPv6 mismatches). The receiver needs a fallback to request the sender's local IP addresses through the relay pipe.

Solution: Extend transferOverLocalRelay to support a PAKE-based handshake (pake1/pake2) between sender and receiver over the local relay. After PAKE completes, the receiver can send an encrypted ipRequest to obtain the sender's local IPs. The shared secret ensures only the authorized receiver can decrypt the IP list, preventing leakage to unauthorized peers that might guess the room name.
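The last step, returning the IP list so that only the holder of the shared secret can read it, might look roughly like the sketch below. It assumes a 32-byte session key has already come out of the PAKE exchange and uses AES-GCM from the standard library purely for illustration; the PR text does not say which cipher is used. `seal`, `unseal`, and `demoKey` are hypothetical names, and `demoKey` merely hashes a label to produce test keys, it is not a PAKE.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
)

// demoKey fabricates a 32-byte key for the demo. In the real flow the key
// would be the session key agreed through the pake1/pake2 handshake.
func demoKey(label string) [32]byte {
	return sha256.Sum256([]byte(label))
}

func newGCM(key [32]byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key[:])
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext and prepends the random nonce to the result.
func seal(key [32]byte, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// unseal splits off the nonce and authenticates and decrypts the rest.
func unseal(key [32]byte, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	n := gcm.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("ciphertext too short")
	}
	return gcm.Open(nil, sealed[:n], sealed[n:], nil)
}

func main() {
	shared := demoKey("key agreed by sender and receiver")
	reply, err := seal(shared, []byte("192.168.1.10,fe80::1"))
	if err != nil {
		fmt.Println("seal failed:", err)
		return
	}

	ips, err := unseal(shared, reply)
	fmt.Println("authorized receiver:", string(ips), err)

	// A peer that only guessed the room name holds a different key.
	_, err = unseal(demoKey("some other key"), reply)
	fmt.Println("unauthorized peer rejected:", err != nil)
}
```

Because GCM authenticates as well as encrypts, a peer with the wrong key gets an error rather than garbage, which matches the stated goal of not leaking the list to peers that only know the room name.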

Part D: Race condition fix in local relay

Problem: When ConnectToTCPServer fails in transferOverLocalRelay, the function returns without sending to errchan. The second read from errchan in the caller hangs forever, causing a deadlock.

Solution: Send the error to errchan before returning. Also handle unexpected data gracefully by sending an error and returning instead of continuing to read garbage data.
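The shape of the fix can be illustrated with a small sketch. It assumes, as the description implies, a caller that reads from `errchan` once per transfer path; `localRelaySketch`, `waitBoth`, `connect`, and `errDial` are hypothetical names and the real function does considerably more.

```go
package main

import (
	"errors"
	"fmt"
)

var errDial = errors.New("dial failed")

// localRelaySketch stands in for transferOverLocalRelay. Every exit path
// sends exactly one value on errchan so the reader is never left waiting.
func localRelaySketch(connect func() error, errchan chan<- error) {
	if err := connect(); err != nil {
		// The fix: report the failure before returning. Without this send,
		// the caller's second read below would block forever.
		errchan <- fmt.Errorf("local relay: %w", err)
		return
	}
	errchan <- nil
}

// waitBoth mimics a caller that starts two transfer paths and waits for
// one result from each.
func waitBoth(connect func() error) []error {
	errchan := make(chan error, 2)
	go func() { errchan <- nil }() // stand-in for the other transfer path
	go localRelaySketch(connect, errchan)
	return []error{<-errchan, <-errchan}
}

func main() {
	for _, err := range waitBoth(func() error { return errDial }) {
		fmt.Println("result:", err)
	}
}
```

The order of the two results is not deterministic, since it depends on which goroutine sends first; what matters is that both reads complete.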

Dependency: golang.org/x/sync v0.10.0 (for errgroup)
