Conversation
WalkthroughThis PR introduces a graceful shutdown mechanism for the JSON-RPC server by refactoring RPC initialization to accept a pre-bound Changes
Sequence Diagram(s)sequenceDiagram
participant Daemon
participant RPC Service
participant Stop Channel
rect rgba(76, 175, 80, 0.5)
Note over Daemon,Stop Channel: RPC Server Startup
Daemon->>RPC Service: Call start_rpc(state, rpc_listener, stop_handle)
RPC Service->>RPC Service: Bind TcpListener & accept connections
RPC Service->>Stop Channel: Register for stop signals
end
rect rgba(244, 67, 54, 0.5)
Note over Daemon,Stop Channel: RPC Server Shutdown
Daemon->>Stop Channel: Send stop signal via StopHandle
Stop Channel->>RPC Service: Notify shutdown
RPC Service->>RPC Service: Close connections & cleanup
RPC Service-->>Daemon: Shutdown complete
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/tool/offline_server/server.rs (1)
219-239: Propagate start_rpc errors instead of discarding them.The select branch currently ignores
Errfromstart_rpc, so failures returnOk(()).🐛 Proposed fix
- _ = start_rpc(state, rpc_listener,stop_handle, None) => Ok(()), + result = start_rpc(state, rpc_listener, stop_handle, None) => result,
🤖 Fix all issues with AI agents
In `@src/daemon/mod.rs`:
- Around line 405-410: The code currently binds the RPC listener with
tokio::net::TcpListener::bind(rpc_address) and then calls .unwrap(), which will
panic; instead propagate the error using ? and add context: replace the
map_err/unwrap chain on the bind call that constructs rpc_listener with a call
that uses .with_context(...) (from anyhow::Context) and then ? so the enclosing
async block (which returns anyhow::Result<()>) returns the error instead of
panicking; target the rpc_listener binding expression and rpc_address to apply
this change.
In `@src/tool/offline_server/server.rs`:
- Around line 81-87: The JoinSet is being created as a temporary (&mut
JoinSet::new()) which is dropped immediately cancelling background tasks spawned
by MessagePool::new; fix by creating a persistent mutable binding (e.g., let mut
join_set = JoinSet::new()) and pass &mut join_set into MessagePool::new so the
join set lives for the required lifetime of the pool; update the surrounding
scope to retain join_set (and return or store it if necessary) so it is not
dropped, and apply the same change in the other occurrences in test_snapshot.rs
and generate_test_snapshot.rs.
🧹 Nitpick comments (2)
src/rpc/methods/auth.rs (1)
4-29: Add rustdoc + error context for the new public helper.This new public helper lacks rustdoc, and adding context to keystore/token failures will improve diagnosability.
♻️ Proposed fix
-use anyhow::Result; +use anyhow::{Context as _, Result};impl AuthNew { + /// Create a JWT token using the keystore's JWT_IDENTIFIER key. pub fn create_token( keystore: &KeyStore, token_exp: Duration, permissions: Vec<String>, ) -> anyhow::Result<String> { - let ki = keystore.get(JWT_IDENTIFIER)?; - Ok(create_token(permissions, ki.private_key(), token_exp)?) + let ki = keystore + .get(JWT_IDENTIFIER) + .context("failed to load JWT key from keystore")?; + create_token(permissions, ki.private_key(), token_exp) + .context("failed to create JWT token") } }As per coding guidelines: Document all public functions and structs with rustdoc comments; Use anyhow::Result for most operations and add context with .context() method.
src/tool/offline_server/server.rs (1)
47-52: Add rustdoc for the new public offline_rpc_state helper.Public API should be documented.
♻️ Proposed fix
+/// Build the offline RPC state and return the shutdown receiver. pub async fn offline_rpc_state<DB>( chain: NetworkChain, db: Arc<DB>, genesis_fp: Option<&Path>, save_jwt_token: Option<&Path>, ) -> anyhow::Result<(RPCState<DB>, mpsc::Receiver<()>)>As per coding guidelines: Document all public functions and structs with rustdoc comments.
Codecov Report❌ Patch coverage is Additional details and impacted files
... and 27 files with indirect coverage changes Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@src/tool/offline_server/server.rs`:
- Around line 100-107: Do not log the raw admin JWT — remove or redact the
info!("Admin token: {token}") call in the block that creates the token (the code
that calls crate::auth::create_token with crate::auth::ADMIN and
ki.private_key()). Replace it with a safe message that does not include the full
token (e.g., log that an admin token was created and its expiry or that it was
saved to save_jwt_token when present), or log a redacted token (only show token
prefix/suffix) if you must indicate its value; ensure create_token, ADMIN,
ki.private_key(), and save_jwt_token usage remain unchanged except for removing
sensitive logging.
- Around line 226-228: The select! branch currently discards start_rpc's return
value which hides failures; change the select! arm to capture the result from
start_rpc (e.g., using `res = start_rpc(...)`) and then propagate any error (by
mapping the join to res and returning Err or using `res?`) so that a failing
start_rpc returns its anyhow::Error instead of Ok(())—update the tokio::select!
arm that references start_rpc and ensure the surrounding logic (the `result`
handling) propagates that error.
- Around line 82-90: The current background task drops JoinSet results silently;
change the spawn closure that consumes services to use while let Some(res) =
services.join_next().await and handle Err(res) instead of ignoring it: inspect
the Result returned by JoinSet::join_next (the res from services.join_next())
and log or propagate panics/cancellations (use the existing logger or return an
error) so task panics/cancellations are not silently swallowed; update the same
pattern in the other instances mentioned (test_snapshot.rs and
generate_test_snapshot.rs) referencing JoinSet, services, and join_next to match
the standard error-checking used in chain_sync/ and daemon/mod.rs.
🧹 Nitpick comments (3)
src/tool/subcommands/api_cmd/generate_test_snapshot.rs (1)
133-141: LGTM — JoinSet drainer pattern is idiomatic.The implementation correctly creates a shared
JoinSet, passes it toMessagePool::new, and spawns a background task to drain completed tasks. This aligns with the AI summary's goal of centralized task management.One optional consideration:
join_next()returnsResult<T, JoinError>, so panicking tasks are currently silently ignored. For test infrastructure this is likely fine, but if visibility into task failures becomes useful, you could log errors:♻️ Optional: Log errors from joined tasks
- tokio::spawn(async move { while services.join_next().await.is_some() {} }); + tokio::spawn(async move { + while let Some(result) = services.join_next().await { + if let Err(e) = result { + tracing::warn!("Background service task failed: {e}"); + } + } + });src/tool/offline_server/server.rs (2)
47-52: Add rustdoc for the new public APIoffline_rpc_state.
This is a public function and currently undocumented.📘 Suggested doc comment
+/// Build an offline RPCState and return the shutdown receiver. +/// +/// Initializes chain state, message pool, keystore, and networking context. pub async fn offline_rpc_state<DB>(As per coding guidelines, Document all public functions and structs with rustdoc comments.
107-109: Use async write with error context.
std::fs::writeblocks the async runtime and lacks helpful context on failure. The codebase usestokio::fsfor async file operations throughout.⚙️ Suggested async write with context
- if let Some(path) = save_jwt_token { - std::fs::write(path, token)?; - } + if let Some(path) = save_jwt_token { + tokio::fs::write(path, token) + .await + .with_context(|| format!("failed to write JWT to {}", path.display()))?; + }
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/tool/offline_server/server.rs`:
- Around line 47-53: Add a rustdoc comment for the new public function
offline_rpc_state that explains its purpose, parameters, and return values;
specifically state that it initializes and returns an RPCState<DB> for the given
NetworkChain and database (Arc<DB>), describe what genesis_fp and save_jwt_token
do (optional file paths) and that the function returns a Result containing the
(RPCState<DB>, mpsc::Receiver<()>) on success or an anyhow::Error on failure;
include any concurrency or lifetime implications and note that DB is generic and
must satisfy the trait bounds used in the function signature.
- Around line 118-123: Replace the direct use of std::fs::write when persisting
the admin JWT with the project's sensitive-file helper to ensure secure
permissions and safe overwrite semantics: in the block that checks
save_jwt_token and writes token (references: save_jwt_token, token,
std::fs::write), call the existing sensitive-file utility function (e.g.,
sensitive_file::write_sensitive or whatever helper is exported in your crate) to
create the file with restrictive permissions and proper atomic/overwrite
behavior, handling and propagating errors the same way so the info!("Admin token
is saved...") branch remains unchanged.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/rpc/mod.rs`:
- Around line 859-862: The await on shutdown_recv.recv() currently discards its
return value causing a must_use warning; change it to explicitly handle the
result (e.g. let _ = shutdown_recv.recv().await; or match
shutdown_recv.recv().await { Some(_) => (), None => () }) so the value is used
and the clippy warning is resolved, keeping the surrounding
shutdown_send.send(()).await.unwrap() and server_handle.stop().unwrap() calls
intact.
🧹 Nitpick comments (1)
src/rpc/mod.rs (1)
527-533: Add rustdoc forstart_rpcand its new parameters.
This is a public API and now includes listener/stop-handle semantics.📝 Suggested rustdoc
-pub async fn start_rpc<DB>( +/// Starts the JSON-RPC server using a pre-bound listener and a stop handle. +/// +/// # Arguments +/// - `state`: Shared RPC state. +/// - `rpc_listener`: Pre-bound listener for incoming connections. +/// - `stop_handle`: Handle used to signal graceful shutdown. +/// - `filter_list`: Optional filter list for the filter middleware. +pub async fn start_rpc<DB>(As per coding guidelines, "Document all public functions and structs with rustdoc comments".
|
I had to revert the JoinSet handling changes suggested by AI with 666ffb0 to unblock CI. The issue is now tracked in filecoin-project/go-f3#1062 |
Summary of changes
Changes introduced in this pull request:
Reference issue to close (if applicable)
Closes
Other information and links
Change checklist
Outside contributions
Summary by CodeRabbit
Bug Fixes
Improvements
✏️ Tip: You can customize this high-level summary in your review settings.