Skip to content

Refactor timers#275

Open
jthomas43 wants to merge 9 commits into
holepunchto:mainfrom
jthomas43:refactor-timers
Open

Refactor timers#275
jthomas43 wants to merge 9 commits into
holepunchto:mainfrom
jthomas43:refactor-timers

Conversation

@jthomas43

Copy link
Copy Markdown
Contributor

This PR replaces 5 libuv timers with 1 libuv timer, based on the observation that only one of these timers need to be active at a time:

  1. Retransmission timer (RTO)
  2. Tail-Loss Probe timer (TLP)
  3. Rack-Reorder timer (RACK_REO)
  4. Zero-Window probe timer (ZWP)
  5. KeepAlive probe timer (KeepAlive)

(4) Zero-Window probe and (5) KeepAlive can only be active if there are no packets in flight, ie all data has been acked. If the window size is zero then the ZWP timer is set instead of KeepAlive

(1) RTO, (2) TLP or (3) RACK_REO may be active when data is in flight.. The TLP is set when no loss is detected and the RACK_REO is set when potential loss (in the form of a SACK) is received, both of these timers reset the RTO timer on expiration.

@jthomas43 jthomas43 requested a review from a team June 17, 2026 14:30

@lejeunerenard lejeunerenard left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consolidating the timers is a good idea. I believe I found some edge cases and some other nits.

Reviewed with help from AI.

Comment thread src/udx.c Outdated
}
rto = rto_delta_ms;
}
uv_timer_start(&stream->timer, udx_rto_timeout, rto, 0);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be stream_timer_start? Currently the timer's enum (pending_timer) isn't changed so if this is called via udx_tlp_timeout (and resets the timeout to the delta ms for the next RTO) and a data packet comes in after calling arm_stream_timers which would then set the timer to RTO with the default delay.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! it should be stream_timer_start(), also I think we should move setting stream->next_rto_ts = now + rto into stream_timer_start()

Comment thread src/udx.c Outdated
assert(stream->rto >= 1);
assert(stream->status != UDX_STREAM_CLOSED);
uv_timer_start(&stream->rto_timer, udx_rto_timeout, stream->rto, 0);
stream->next_rto_ts = uv_now(stream->udx->loop) + stream->rto;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be updated when there's an existing RTO timer? Since the RTO isn't restarted below (not one of the pending_timer types that trigger it for good reason), the RTO could fire sooner than anything using next_rto_ts to calculate etc.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right, I think this is cleaned up if we set stream->next_rto_ts as part of stream_timer_start()

Comment thread src/udx.c
} else {
uv_timer_stop(&stream->zwp_timer);
if (stream->ca_state == UDX_CA_OPEN && !stream->sacks && !sent_tlp) {
schedule_loss_probe(stream, false);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the advancing_rto flag be true if there's an existing RTO timer to ensure the TLP is scheduled before the RTO timer?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is OK. If the TLP is scheduled then the existing RTO doesn't matter because after the TLP is sent the RTO is rescheduled from now + RTO, ( not next_rto_ts )

Comment thread src/udx.c Outdated

stream->pending_timer = timer;

uv_timer_cb timer_to_callback[] = {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't this be declared once outside the function? Not 100% sure, but might reduce allocating the array each time the timer is started?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make it static and keep it in the function. GCC is smart enough to just use a pointer to the initializer memory for the lookup but making it static makes it clearer.

Comment thread src/udx.c Outdated

// how many ms into the future should the rto fire
int64_t
udx_rto_delta_ms (udx_stream_t *stream) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally like the idea of just using the time sent from packets to determine RTO, but it gets tricky when the RTO triggers as then it's stream->rto * 2 so any deltas calculated with this function would be incorrect since it uses stream->rto. To get around that we'd end up tracking the next rto anyways.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants