Conversation
1. Both
|
|
I haven't reviewed the code but I'm not too keen on the idea. This makes projects non-portable: e.g. if you have r_path set, I would need to modify rproject.toml to make it work on my machine and remember not to commit it. It most likely won't work for CI either. |
|
I will let @dpastoor also weigh in, but this is the solution we landed on together. The problem space is wanting to run CI with a matrix of R versions (e.g. 4.4 and 4.5) without needing to change the config between syncs, but not just make We considered using something like |
|
That sounds like the flag --r-version could be added to sync then? |
|
There is absolutely a chance of not being portable, but we also need some way to actually loosen the portability/precision. I think flags make a lot of sense, but I don't want it to be only flags: it's too easy to forget to use a flag. If it becomes a situation where you'd need to use that flag every time you do an activity, it should be a config var, and this is a case where it can be used for one-offs but also could be used. We also have the issue at the moment that we're only auto-discovering the 'well-known' locations where rstudio installs R versions; if a group installs R anywhere else on the system, we have no way of telling rv where to find it without doing path fiddling in the specific terminal session. So we should absolutely have some path disclaimers, but I think it's worth giving the option. The most common path usage I'd see is literally This would work very well for CI, since CI installs the specific version of R requested in the matrix and puts it on the default path. |
|
That does throw the reproducible part of rv out the window though.
I would expect to not have a config option that breaks reproducibility/portability tbh. IMO if we really want something like that, the R version should be required in the rproject.toml still just so we keep the reproducibility aspect. Eg And the validation stays the same as in this PR. This way at least it's kinda reproducible since the R version is explicit, otherwise rv will just give me random stuff depending on when I'm running the sync and what's in my /usr/bin |
|
I don't think it throws reproducibility out the window. It does open the door to defining the level of precision needed; so far we've said "reproducible" = matching R plus package versions. Here is what we can say about reproducibility:
We can also say that, for many projects, getting the same package versions with a different version of R will often give consistent results. Finally, I think we're opening the developer door a bit with rv: stricter reproducibility should be used for projects, and should generally be the "default state" of rv, but the developer persona gets more fluidity. For example, setting use_lockfile = false and pointing to a moving target of latest is a very convenient way to consistently and purposefully keep pulling in the latest of everything. I think we can and should discuss the specific implementation and nomenclature to make sure these options and the associated risks are well articulated.
To use an explicit example here for us: I would like us to get to the point that we can include rv setup for all our R packages; use https://github.com/A2-ai/dvs as an example. This should NOT need an explicit R version, as a developer should be able to come and develop dvs against different versions of R on purpose and be able to switch between them locally, or otherwise not have to find a specific version. To tie this to a cargo example, Cargo.toml/Cargo.lock do not force a specific version of rustc, though one can be specified if needed.
As for the use of the r_path variable itself as the way to get there, the jury is still out. I honestly don't love it, but I'm not sure what we should do. r_path gives us a way to do it via the shorthand of "R" to say "whatever is on that path" in a pseudo-portable way. At the same time, it gives people a way to be super explicit if they want, e.g. if they have a custom version of R at /opt/custom-r/path/to/bin/R it gives a way to make sure rv uses that too, so a "two-birds-one-stone" situation. I would also be open to, and wonder if we could have something like |
|
Agreed on the system changes impacting reproducibility; here I meant reproducibility as in I will pull the same set of deps on my machine every time for a project.
This is like doing "*" for versions in Cargo.toml which works, but no one uses it in practice because it breaks all the time. Maybe it's better in R.
I agree, it's just that it probably shouldn't be in the config file. If it's in the file, a contributor will still need to remove it or change to their own hardcoded paths and then remember to not commit that part to the PR. We could allow |
|
I would be fine with
though I wouldn't have that definitely force lockfile to be false. Plenty of package combos should run fine across R versions, so that shouldn't mean throwing the lockfile out for sure. If the lockfile can't be resolved due to dep constraints for a different R version, that is their issue to then solve. |
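To make the r_path semantics discussed in this thread concrete, here is a minimal Python sketch of how a bare name such as "R" versus an explicit location might be resolved. It is not rv's implementation: the function name resolve_r_binary and its search_path parameter are hypothetical, and it assumes only what the comments above state, namely that "R" means "whatever is on the PATH" and that an explicit path is used as given.

```python
import os
import shutil
from typing import Optional


def resolve_r_binary(r_path: Optional[str],
                     search_path: Optional[str] = None) -> Optional[str]:
    """Hypothetical sketch: turn an r_path setting into an R executable.

    Returns the path to the executable, or None if nothing usable is found.
    search_path overrides the PATH environment variable (useful for tests).
    """
    if r_path is None:
        # No r_path configured: the tool's normal discovery would apply,
        # which is not modelled in this sketch.
        return None

    has_dir = os.sep in r_path or (os.altsep is not None and os.altsep in r_path)
    if not has_dir:
        # A bare name such as "R" means "whatever is first on the PATH".
        return shutil.which(r_path, path=search_path)

    # Anything with a directory component is an explicit location:
    # look there and nowhere else.
    if os.path.isfile(r_path) and os.access(r_path, os.X_OK):
        return r_path
    return None


if __name__ == "__main__":
    # The result depends on the machine, which is the portability
    # trade-off debated above.
    print(resolve_r_binary("R"))
    print(resolve_r_binary("/opt/custom-r/path/to/bin/R"))
```

The bare-name branch is what would make a CI matrix work without config changes, since each job puts its own R on the PATH; the explicit branch covers installs outside the auto-discovered locations, at the cost of a machine-specific value in the config.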
For build processes, it is often helpful to not have to specify an R version and instead just use the version found on the path. Additionally, R installed in non-standard locations has previously not been accessible to rv (except if it is added to the PATH). To address both of these issues, this PR adds the r_path field: if it is specified, rv will look for R at that location (and that location only). This allows r_version to be optional, while still requiring specificity around which R is to be used.
The rules for the two fields are as follows: