Skip to content

Scan workspaces in background tasks#1245

Open
lionel- wants to merge 12 commits into
oak/salsa-12-scan-workspacefrom
oak/salsa-13-scan-async
Open

Scan workspaces in background tasks#1245
lionel- wants to merge 12 commits into
oak/salsa-12-scan-workspacefrom
oak/salsa-13-scan-async

Conversation

@lionel-

@lionel- lionel- commented May 29, 2026

Copy link
Copy Markdown
Contributor

Branched from #1244
Progress towards #1212

This PR moves workspace scanning off the LSP main loop, to avoid blocking frontend requests on initialize and on every didChangeWorkspaceFolders. This PR introduces the same setup rust-analyzer and ty use: the handler dispatches scanning work to an asynchronous task pool and replies immediately.

The new piece is oak_scan::ScanScheduler, which owns the policy (when to scan, how to handle events arriving mid-scan) but not the runtime. The LSP layer is free to use any appropriate mechanism for running off-thread tasks: The LSP takes a ScanRequest from the oak_scan scheduler, run it via ScanRequest::run() off-thread task, and hand the resulting ScanCompleted results back to oak_scan via ScanScheduler::apply_scan_completed().

In Ark's case we spawn each request on lsp::spawn_blocking() (which handles things like crashes gracefully with frontend reports) and ships the result back via a new Event::OakScanCompleted for the main loop. Scheduler state and Salsa inputs are mutated only on the main loop, which reduces the surface for race handling issues. We just need to ensure proper event ordering in oak_scan, there are no issues of concurrent accesses at all in the LSP layer.

See module doc in scheduler.rs for how race conditions are resolved.

@lionel- lionel- force-pushed the oak/salsa-12-scan-workspace branch from 48eb900 to a29744e Compare May 29, 2026 13:51
@lionel- lionel- force-pushed the oak/salsa-13-scan-async branch from 1abf718 to 6c0c303 Compare May 29, 2026 13:51
Comment thread crates/ark/src/lsp/main_loop.rs Outdated
Comment on lines +495 to +507
pub(super) fn dispatch_scan_requests(
events_tx: &TokioUnboundedSender<Event>,
requests: Vec<ScanRequest>,
) {
for req in requests {
let tx = events_tx.clone();
spawn_blocking(move || {
let scan = req.run();
tx.send(Event::OakScanCompleted(scan)).log_err();
Ok(None)
});
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than tweaking the value of each LSP Request to possibly return ScanRequests, I think I would much rather allow state_handlers::initialize() and state_handlers::did_change_workspace_folders() to kick off the scan requests themselves.

i.e. if you just pass events_tx to each of those, they should have everything they need to kick off a spawn_blocking() call.

That way you can, for example, keep the dispatch close to the lsp_state.oak_scheduler.set_workspace_paths() call that requires it, which seems nice to me.

I also really like that that would mean that we would not need the respond_with() changes (I'm not super comfortable with how the ::default() behavior there works) and tweaks to the return values just for this one feature.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This one feels important enough to me to talk about)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this idea has merit, with dispatch_scan_requests() calls moving right after oak_scheduler usage like this

Screenshot 2026-06-08 at 2.59.55 PM.png

and then this nicely simplifies back to where we started

Screenshot 2026-06-08 at 3.00.24 PM.png

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea

Comment thread crates/oak_scan/src/scheduler.rs Outdated
Comment thread crates/oak_scan/src/scheduler.rs Outdated
Comment thread crates/oak_scan/src/scheduler.rs Outdated
Comment thread crates/oak_scan/src/scheduler.rs Outdated
Comment thread crates/oak_scan/src/scheduler.rs Outdated
Comment on lines +336 to +340
self.state.insert(root, ScanState::Scanning);
let Some(path) = root.path(db).to_file_path().warn_on_err() else {
return Vec::new();
};
vec![ScanRequest { root, path }]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The early return Vec::new() seems like a problem. You set this root up as Scanning but if you don't return a ScanRequest for it, then it will never get scanned and will be infinitely left in Scanning mode

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you want to do this there?

            self.state.remove(&root);
            self.buffered.remove(&root);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

Comment thread crates/oak_scan/src/scheduler.rs Outdated
@lionel- lionel- force-pushed the oak/salsa-12-scan-workspace branch from a29744e to eec3212 Compare June 10, 2026 10:12
@lionel- lionel- force-pushed the oak/salsa-13-scan-async branch 2 times, most recently from 8f13f9c to 5f3ecf2 Compare June 10, 2026 14:23
@lionel- lionel- force-pushed the oak/salsa-13-scan-async branch from ec1435d to a2787c2 Compare June 10, 2026 20:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants