Contributions from Recoco maintainer 🦀 : 2 bugs in stats.rs + ergonomics improvements #1789
bashandbone
started this conversation in
General
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
-
Hi! I'm the maintainer of Recoco, a pure-Rust fork of CocoIndex that we publish on crates.io. We track the
v1branch closely — implementing upstream changes as they land, occasionally adapting them for a no-Python environment.As part of implementing PR #1767 (progress watching API) on our side, I did a detailed Rust-to-Rust comparison of our
crates/coreagainst yourrust/core. I found two correctness bugs introduced in #1767, a few ergonomics improvements we made that might be worth upstreaming, and some trivial code-quality items.I'd like to contribute patches for all of this if you are interested. I'll start with the bugs, then follow with individual issues/PRs for the rest if you're interested. Happy to take guidance on your preferred process — file issues first, open PRs directly, etc.
Bug 1:
notify_terminated()doesn't persist the termination sentinelFile:
rust/core/src/engine/stats.rsIntroduced in: #1767
notify_terminated()sendsTERMINATED_VERSIONto the watch channel but never writes it into theMutex<VersionedProcessingStats>. As a result:Callers who poll via
stats.snapshot()orstats_snapshot()(rather than subscribing to the watch channel) never see that the task has terminated — they continue observing the last real version number.A late update() call after
notify_terminated()succeeds silently, mutating the stats after the termination signal was sent. Watch subscribers would then see a real version number again after having seenTERMINATED_VERSION, which violates the sentinel semantics.Fix: Write
TERMINATED_VERSIONinto the mutex insidenotify_terminated(), and add an early-return guard in update():Bug 2: Watch channel carries only a version number — stats+version are not read atomically (TOCTOU)
File:
rust/core/src/engine/stats.rsIntroduced in: #1767
The watch channel sends a
u64version. To read the actual stats, a subscriber must callstats.snapshot()as a separate step. Betweenrx.changed().awaitandsnapshot(), moreupdate()calls may have run, so the version the watcher saw no longer corresponds to the stats it reads:The original
UpdateHandle::stats_snapshot()returns aVersionedProcessingStatsfrom the mutex (version + stats together), so that call-site is fine — but anyone subscribing via the watch API and then polling stats is exposed to the race.Fix: Send
VersionedProcessingStats(or an equivalent snapshot) through the watch channel instead of justu64. Subscribers then get a coherent(version, stats)pair in onerx.borrow()call, with no auxiliary lock needed:(The extra clone cost on the watch send is negligible at a 1-second reporting interval.)
Tests for both bugs
Both bugs are currently untestable because the relevant test cases are absent. I'd include these with the fix:
Other improvements (lower priority, happy to discuss)
Beyond the bugs, we made several smaller improvements while implementing this engine. Happy to contribute these as separate PRs if useful, or just leave them here as reference:
UpdateHandle→IntoFuture(app.rs): Implementingstd::future::IntoFuturelets callers writehandle.await?instead ofhandle.result().await?. Small ergonomics win, no breaking change.UpdateHandlewatch API (app.rs): Exposing awatch::Receiver<VersionedProcessingStats>directly onUpdateHandle(via awatch_stats()method) makes the subscribe-and-loop pattern idiomatic without requiring access to the rawProcessingStatsobject.DeclaredTargetState→DeclaredEffect(context.rs, execution.rs): The struct represents an action/effect to apply to a target, not a target state value. The rename reads more accurately at call sites throughout execution.rs. (Total: 2 files, ~8 occurrences.)FnCallMemoEntrydefault via#[derive] (context.rs): Replace the manual Default impl with #[derive(Default)] + #[default] on the Pending variant — cleaner and self-documenting.Minor idioms (stable_path.rs, target_state_path.rs): A few places use verbose UFCS (::method(...)) where the method syntax (value.method(...)) works fine. Also write!(f, "{}", uuid.to_string()) can be write!(f, "{}", uuid).
Batcher::new() in utils: We removed the explicit
BatchQueueargument fromBatcher::new()since it's an implementation detail callers shouldn't need to construct. If this is something you'd want, it would require a change incocoindex_utils.Thanks for CocoIndex — the incremental dataflow model is excellent, and we've enjoyed working with the
v1engine. Happy to prepare any of the above as PRs once we've synced on what's wanted.Beta Was this translation helpful? Give feedback.
All reactions