Add workspace scanning to `oak_scan` and wire it to the LSP #1244
lionel- wants to merge 30 commits
48eb900 to a29744e
    // Try to load package from this workspace folder and set as
    // root if found. This means we're dealing with a package
    // source.
    if state.root.is_none() {
Are we eventually going to remove this `root` field? Probably, right? It's not really correct for multi-root workspaces anyway, and I feel like we will have this stored elsewhere.
Yes, we're heading towards a single notion of workspace root that can contain any number of scripts and packages. This way monorepos are properly supported.
    pub(crate) library: Library,

    /// Salsa input tree for Oak queries.
    pub(crate) oak: OakDatabase,
I would like to strongly advocate for the `db` naming scheme, like everywhere else!
I would like to strongly agree with you!
    // Start first round of indexing. `state.documents` is empty at init since
    // no `didOpen` has fired yet, but build the set through the same shape we
    // use elsewhere so the call site reads consistently.
    let editor_owned: HashSet<UrlId> = state
        .documents
        .keys()
        .map(|url| UrlId::from_url(url.clone()))
        .collect();
    state
        .oak
        .set_workspace_paths(&workspace_paths, &editor_owned);
I think just seeing something like

    // Start first round of indexing. We are initializing, so no documents have been opened yet.
    set_workspace_paths(&workspace_paths, &HashSet::new())

is less confusing.
    /// every package. Mirrors the placement the bulk scanner would pick.
    pub(crate) fn add_watched_file<DB: Db + DbInputs>(db: &mut DB, url: UrlId, contents: String) {
I'm mildly nervous about the fact that we have 2 implementations of file classification: one for bulk updates and one for incremental-ish updates. It seems likely they could easily get out of sync :/
I've extracted a placement classifier to prevent some of the potential drift
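
To illustrate the idea of a shared classifier, here is a minimal sketch, not the code from this PR. It assumes packages are identified by their directory, that files directly under `<pkg>/R/` are package sources, and that anything else inside a package is a package script. The `RootScript` variant name and the `classify` function are hypothetical; only `Placement::PackageFile` and `Placement::PackageScript` appear in the diff.

```rust
use std::path::{Path, PathBuf};

// Hypothetical stand-in for the PR's `Placement`. The real variants carry
// Salsa handles; here a package is just its directory path.
#[derive(Debug, PartialEq)]
enum Placement {
    RootScript, // hypothetical name for "script directly under the root"
    PackageFile(PathBuf),
    PackageScript(PathBuf),
}

/// Decide where `file` belongs, given the known package directories.
/// Both the bulk scanner and the single-file update path would call this,
/// so the placement rules live in exactly one spot.
fn classify(file: &Path, packages: &[PathBuf]) -> Placement {
    // Pick the innermost package directory that contains the file.
    let owner = packages
        .iter()
        .filter(|pkg| file.starts_with(pkg))
        .max_by_key(|pkg| pkg.components().count());

    match owner {
        None => Placement::RootScript,
        Some(pkg) => {
            // Assumption: only files directly under `<pkg>/R/` are package sources.
            let r_dir = pkg.join("R");
            if file.parent() == Some(r_dir.as_path()) {
                Placement::PackageFile(pkg.clone())
            } else {
                Placement::PackageScript(pkg.clone())
            }
        }
    }
}

fn main() {
    let packages = vec![PathBuf::from("/ws/pkg")];
    for file in ["/ws/analysis.R", "/ws/pkg/R/utils.R", "/ws/pkg/inst/demo.R"] {
        println!("{file}: {:?}", classify(Path::new(file), &packages));
    }
}
```

With a single function like this, a drift between the bulk and incremental paths would show up as a change to one place rather than two.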
            let mut scripts = root.scripts(db).clone();
            if !scripts.contains(&file) {
                scripts.push(file);
                root.set_scripts(db).to(scripts);
            }
        },
        Placement::PackageFile(pkg) => {
            let mut files = pkg.files(db).clone();
            if !files.contains(&file) {
                files.push(file);
                pkg.set_files(db).to(files);
            }
        },
        Placement::PackageScript(pkg) => {
            let mut scripts = pkg.scripts(db).clone();
            if !scripts.contains(&file) {
                scripts.push(file);
                pkg.set_scripts(db).to(scripts);
            }
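For all 3, only clone if `!contains`: check membership on the borrowed vector first, and clone only when the file actually needs to be added.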
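
A minimal sketch of that reordering, using a plain struct as a hypothetical stand-in for the Salsa input (the real code goes through `root.scripts(db)` and `root.set_scripts(db).to(...)`). The `Root` struct, its `writes` counter, and `push_script_if_missing` are made up for illustration.

```rust
// Hypothetical stand-in for a Salsa input holding a `Vec` field.
#[derive(Default)]
struct Root {
    scripts: Vec<u32>,
    writes: usize, // counts setter calls, for demonstration only
}

impl Root {
    fn scripts(&self) -> &Vec<u32> {
        &self.scripts
    }

    fn set_scripts(&mut self, scripts: Vec<u32>) {
        self.scripts = scripts;
        self.writes += 1;
    }
}

/// Adds `file` to the root's scripts. Returns `true` if it was added.
fn push_script_if_missing(root: &mut Root, file: u32) -> bool {
    // Check the borrowed vector first: the "already present" case
    // then costs neither a clone nor a setter call.
    if root.scripts().contains(&file) {
        return false;
    }
    let mut scripts = root.scripts().clone();
    scripts.push(file);
    root.set_scripts(scripts);
    true
}

fn main() {
    let mut root = Root::default();
    assert!(push_script_if_missing(&mut root, 1));
    assert!(!push_script_if_missing(&mut root, 1)); // no clone, no write
    assert_eq!(root.writes, 1);
}
```

The same shape would apply to the `PackageFile` and `PackageScript` arms.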
d50f230 to 76a82b0
a29744e to eec3212
1132f3d to 779653a
Branched from #1243
Progress towards #1212
This PR adds workspace scanning to `oak_scan` and wires it into the LSP. The scanner walks a workspace folder, finds packages (any `DESCRIPTION` file in the tree with a `Package` field, honoring `.gitignore`) and top-level R scripts, and registers them under a `Root` in `oak_db`.

The LSP feeds three event types into the scanner:

- `initialize` and `didChangeWorkspaceFolders` update workspace roots in the DB.
- `didChangeWatchedFiles` applies surgical updates for single R files. When a change to a `DESCRIPTION` file is detected, this triggers a full rescan of the containing root (package files might be demoted to scripts).

Editor-owned URLs (files the editor has open) get special handling because we ignore disk state for those. The editor contents are the source of truth. When a workspace is removed, these files survive and get placed in `OrphanRoot`. This way analysis keeps working when a user closes a workspace folder while a buffer from it is still open.

Files that leave a live workspace (folder removed, file deleted, buffer closed) route to `StaleRoot`. We keep them there for as long as they remain deleted, and we bring them back if they are added again. This allows reusing the `File` inputs without creating new ones. Salsa inputs are never garbage collected, so reusing them (based on their URL key) avoids unbounded memory growth when e.g. switching git branches, which can delete and bring back many files at a time.

Workspace scans run synchronously in the main loop of the LSP in this PR. Since these operations are too heavy to be run synchronously, and would stall requests from user interactions, the next PR moves them off the main loop.
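
The `StaleRoot` reuse described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the PR's implementation: Salsa is replaced by a plain `HashMap` keyed by URL, and `FileId`, `Home`, and `Files` are hypothetical names. The point it shows is that a delete followed by a re-add hands back the same input rather than allocating a new one.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a Salsa `File` input handle.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct FileId(u32);

// Where a file currently lives: a live workspace root or the stale root.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Home {
    Live,
    Stale,
}

#[derive(Default)]
struct Files {
    by_url: HashMap<String, (FileId, Home)>,
    next_id: u32,
}

impl Files {
    /// Add a file, reviving a stale entry with the same URL if one exists.
    fn add(&mut self, url: &str) -> FileId {
        if let Some((id, home)) = self.by_url.get_mut(url) {
            *home = Home::Live;
            return *id;
        }
        let id = FileId(self.next_id);
        self.next_id += 1;
        self.by_url.insert(url.to_string(), (id, Home::Live));
        id
    }

    /// "Remove" a file by parking it in the stale set; the input is kept.
    fn remove(&mut self, url: &str) {
        if let Some((_, home)) = self.by_url.get_mut(url) {
            *home = Home::Stale;
        }
    }

    /// Number of inputs ever allocated.
    fn allocated(&self) -> usize {
        self.by_url.len()
    }
}

fn main() {
    let mut files = Files::default();
    let first = files.add("file:///ws/a.R");
    // Simulate a branch switch that deletes the file and brings it back.
    files.remove("file:///ws/a.R");
    let second = files.add("file:///ws/a.R");
    assert_eq!(first, second);
    assert_eq!(files.allocated(), 1);
}
```

Because inputs that are never collected would otherwise pile up on every delete and re-add cycle, keying reuse on the URL keeps the number of allocated inputs bounded by the number of distinct URLs seen.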