Contracts for file backups#522054
Conversation
|
@ofborg test restic borgbackup contracts.fileBackup-borgbackup contracts.fileBackup-restic |
|
@ofborg test contracts.fileBackup-demo |
drupol
left a comment
There was a problem hiding this comment.
Two minor optional nit I saw while quickly glancing at it.
| if config.fileBackupContract == null then | ||
| null | ||
| else | ||
| config.fileBackupContract.input.sourceDirectories; |
There was a problem hiding this comment.
Nit
| if config.fileBackupContract == null then | |
| null | |
| else | |
| config.fileBackupContract.input.sourceDirectories; | |
| lib.optional (config.fileBackupContract != null) config.fileBackupContract.input.sourceDirectories; |
There was a problem hiding this comment.
in the falsy case the suggestion returns empty list rather than null, which as per the declared type i'm not sure is intended here
There was a problem hiding this comment.
Indeed, I'm not aware of a function to do that.
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/pre-rfc-decouple-services-using-structured-typing/58257/46 |
| example = [ "--cleanup-commits" ]; | ||
| }; | ||
|
|
||
| fileBackupContract = lib.mkOption { |
There was a problem hiding this comment.
| fileBackupContract = lib.mkOption { | |
| fileBackup = lib.mkOption { |
there's more places like this, but i'd consider this for perhaps brevity + discoverability (going by the contract name)
| type = lib.types.nullOr ( | ||
| lib.types.submodule { | ||
| options = { | ||
| input = { |
There was a problem hiding this comment.
| input = { | |
| request = { |
may be a more robust name when switching context between provider vs consumer, on top of facilitating calling the consumer (in pre-resolution context) a requester
| default = [ ]; | ||
| }; | ||
| }; | ||
| output = { |
There was a problem hiding this comment.
| output = { | |
| result = { |
so, input/output terminology takes the perspective from the provider, yet gets more confusing from the perspective of the consumer, where that output is mostly something that comes in
There was a problem hiding this comment.
Thinking about this more, I think now response makes more sense than result. Request-response is a common combination.
|
For #506343 what the plan for that PR now with this? |
as per #506343 (comment), while we managed to get a sense of where to go there, we feel reviewers deserve more sensible chunking than such a big bang PR. This PR would make for a start to a more layered approach. |
I wouldn't say it's a start. That PR's goal is to take the concepts as far as possible, so we could answer any potential questions about interoperability with other concepts like modular services, or about how this "options machinery" (cf. You could see the current PR as step 1 cherry-picked from #506343. |
| example = "vaultwarden"; | ||
| }; | ||
|
|
||
| sourceDirectories = lib.mkOption { |
There was a problem hiding this comment.
The naming is unfortunate, the fileBackup contract only allows backing up sourceDirectories. Something like paths sounds broader.
Another option that does seem lacking is retention. Disk space is not unlimited so pruning of backups is pretty much essential. Having that in its own contract feels like it would decouple things more than necessary.
There was a problem hiding this comment.
The naming is unfortunate, the fileBackup contract only allows backing up sourceDirectories. Something like paths sounds broader.
That’s a good point. The name file comes from the need to distinguish between backing up actual files on disk and not streams. Streams would be coming from the stdout of programs, for example for backing up a database. This is purposely not proposed here to keep the scope somewhat small but I do have an implementation.
I don’t know if there’s a better name than path but path does encapsulate both files and directories. At least to my knowledge. Then we could rename the option sourcePaths.
Another option that does seem lacking is retention. Disk space is not unlimited so pruning of backups is pretty much essential. Having that in its own contract feels like it would decouple things more than necessary.
That’s indeed a valid concern but IMO it doesn’t belong in this contract which is only used to let a service tell what to back up. The contract sits between the consumer and the provider. Retention is more of a concern between the backup service and the user, so between the provider and the user.
Now having the user be able to tell its preference for backup retention once and that be applied to all backup services is an interesting consideration.
There was a problem hiding this comment.
Do services just get to request backups? Because just because the service maintainer thinks something needs backups doesn't mean that I as a user of the service agree. I'd expect an opt-in mechanism and if I'm opting in explicitly, having the retention settings right there as well makes it a lot easier to maintain an overview.
Doing a systemwide default for backup retention is good but control over individual backups is also important I don't use the same retention for my various backups.
There was a problem hiding this comment.
That's fair. Currently, it's all opt-in anyway so it matches your expectations. I was a bit extrapolating there.
Also, since the contract option is just one more option in the restic service, you can still access all existing restic options and set retention per service this way:
{
services.restic.backups."home-assistant" = {
// Here goes "normal" Restic options
initialize = true;
repository = "/opt/restic-repos/${name}";
passwordFile = "${pkgs.writeText "password" "restic-${name}-password"}";
pruneOpts = [
"--keep-daily 7"
"--keep-weekly 5"
"--keep-monthly 12"
"--keep-yearly 75"
];
// Contract option
fileBackupContract.input = config.services.home-assistant.fileBackupContract.input;
};
}
Converting back the PR to draft because I got some sizable amount of changes I want to make and you should know that before continuing.
This PR follows the RFC 189 to implement a contract for file backups.
File backups means specifically backing up files on a filesystem. Backing up a database using a dump command or taking snapshots of a snapshottable filesystem (e.g. ZFS) are out of scope. Separate contracts will be made for those as seen fit.
Previous work
This PR supersedes #495303 which was introducing a contract for secrets. I am of the opinion that PR, although having similar goals, was doomed to fail from the start because:
This contract does not suffer from the issues.
Two other prior PRs exist that introduced contracts by the wrong end. They started by adding a complex mechanism without a concrete use case and it made it hard to distinguish between intrinsic complexity and accidental complexity. This PR does not suffer from this issue either as it introduces no such mechanism and focuses on the essence of the contract. The PRs are linked at the end of the PR description.
There is also the PR #506343 which goes very far, a bit like a research project, and sees what pushing the contracts mechanism can lead to. It answers to the questions like does this work with modular services and others with success. The implementation might change in the future but the answers are there.
Finally, I do use this contract, although with a different naming convention, in my project and it works great for about 10 services https://shb.skarabox.com/contracts-backup.html. The options I chose are the minimal necessary ones to satisfy all those services.
Contract
This PR creates a standardized option submodule that allows maintainers of a service to codify how their service should be backed up. The submodule provides the following option:
The
input.useroption defines which Unix user the backup service should run under to have the correct permission to backup the wanted files.The
input.sourceDirectoriesoption defines which directories to back up.The
input.excludePatternsoption defines which patterns to exclude from the backup. For example, setting[ "_exclude" "*.rnd" ]would exclude the file whose name exactly matches_excludeand all files ending in.rnd. Note that uniformizing the handling of this option across all providers is important although must still be further expanded in this PR (I'm happy to get suggestions here, TBH I didn’t want to spend too much time making this option perfect before getting more feedback on the PR). The generic test is used to enforce all providers implement this option as expected. See theProvidersection below for the generic test.The
output.scriptoption must be a package whose goal is to provide a uniformized helper on the command line:These exact same commands must be understood by all providers. These are enforced by the generic tests, see the
Providersection below for the generic test.The user may expose the package provided under the
output.scriptoption in any way they see fit, including adding them in theenvironment.systemPackagesoption.This package, although not strictly necessary from a user perspective, is necessary to make the tests generic as it can then trigger a backup or restore a snapshot ion a standard way for each test. After writing those tests I realized extracting this script was super nice and thus why it appeared as an option here instead of staying internal in the tests.
Consumer
A service that wants to be backed up, also called a consumer of the file backup contract, must declare and define at least the
input.userandinput.sourceDirectoriesoptions and optionally theinput.excludePatternsoption. Although the usefulness is limited, theoutput.scriptoption may be referenced by the service. A minimal example is:The
nextcloudandhome-assistantmodules are modified to implement this contract. You will notice the implementation is fully backwards compatible and is quite minimal.Provider
The
resticandborgbackupmodules are modified to implement this contract. Although the diff here is bigger than for the consumers, the diff is merely a massaging of the options of the contract to match those of the existingresticandborgbackupproviders. Most of the diff is related to the implementation of the script that goes into theoutput.scriptoption. The modifications are fully backwards compatible.A service that can back up files which wants to become a provider of this contract must:
Each option must be handled the same way by all providers. This is enforced by making the provider pass the NixOS VM generic test described hereunder.
The goal of the generic test is to make sure each provider behaves in the same uniformized way, making them interchangeable.
A provider wanting to implement this contract must add a file at
which imports the generic test function
and uses it to define a test. A line referencing this new file must be added to
nixos/tests/contracts/default.nix. The test can then be run withnix-build -A nixosTests.contracts.fileBackup-<provider name>.All generic test functions (for this contract and others) accept one
attrsetas argument which must define the following attributes:providerName (str)providerRoot (listOf str)this is the path in the option tree to the provider option implementing thefileBackupContract. It is a list because the generic test usessetAttrByPathandgetAttrFromPathto respectively set theinputand get theoutputof the contract.maintainersto know who to ping in case of an issue.importswhich is a list of modules which is used both to import the actual provider module to be tested but also to setup any required options needed to initialize the provider, would this be necessary.Furthermore, the generic test function defines the
test.inputandtest.outputoptions which follows exactly the same definition as the contract. These can be used in the modules in theimportsif required by the provider.Usage
To make this PR more concrete, a demo file has been added at
nixos/tests/contracts/fileBackup/demo.nixwith a NixOS VM test which shows how the end user would use thefileBackupcontract.The demo shows the
resticandborgbackupservices using the file backup contract to backup bothnextcloudandhome-assistant. This demo file could be moved to the manual instead of living as a NixOS VM test but I wanted to showcase this is actually evaluating and building correctly.Out of Scope
Options Machinery
All in all, this PR does not do anything magic as far as the module system is concerned and that is its primary goal, keep it simple.
Clearly, the options for the contract are repeated in various places. It is quite evident that it would be beneficial in the long run to somehow define the options once in a source of truth location and be able to reuse them when needed. After all, the diff in the consumers repeat a lot of stuff.
I specifically want to not include this consideration in this PR. The goal here is to focus on the contract itself. What it means to have a file backup contract, what options should go in there, etc. Attempts were made to introduce the machinery to avoid repeating the options in other PRs but these did not receive a lot of attention. If you have ideas though, feel free to comment on those PRs:
Service Orchestration
Or is it out of scope? Should we stop a service before backing up the files? Should the contract include a "hook" option that triggers a dump of a consumer service's files?
Streaming Backup
I do have an implementation ready for this but it should probably go into its own follow-up PR.
Things done
passthru.tests.nixpkgs-reviewon this PR. See nixpkgs-review usage../result/bin/.