fix(crypto): add extra 16MiB space to engine and replicas#4624
fix(crypto): add extra 16MiB space to engine and replicas#4624mantissahz wants to merge 5 commits intolonghorn:masterfrom
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@mantissahz Is the PR ready? |
|
Hi @derekbit, |
9d03fb8 to
36a457b
Compare
be389bb to
baa81df
Compare
There was a problem hiding this comment.
Pull request overview
This PR addresses encrypted volume sizing by accounting for the fixed 16 MiB LUKS2 header (Issue longhorn/longhorn#9205; issue context unavailable in this review environment), and adds safeguards so migrations/upgrades only proceed when the engine image supports the required crypto behavior.
Changes:
- Add a shared “backend size” calculation (volume size + 16 MiB header when applicable) and use it when creating/upgrading engine/replica processes and when reconciling expansion.
- Add webhook validations to block encrypted live migration actions when the engine image CLI API version is too old to support the new sizing behavior.
- Introduce cryptsetup version detection and a pre-live-upgrade encrypted device resize step for the fixed-header behavior.
Reviewed changes
Copilot reviewed 14 out of 17 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| webhook/resources/volumeattachment/validator.go | Adds encrypted migratable-volume validation gated on engine CLI API version. |
| webhook/resources/volume/validator.go | Adds validation when setting MigrationNodeID for encrypted volumes to require sufficient engine CLI API version. |
| util/util.go | Adds cryptsetup version detection and a “precise backend size” helper used by controllers/engine process args. |
| types/types.go | Adds StorageClass parameter keys for node-stage secret lookup. |
| controller/engine_controller.go | Adds pre-live-upgrade expansion/resize flow for encrypted volumes and updates expansion logic to compare against backend size. |
| controller/replica_controller.go | Passes encryption flag into replica instance creation requests. |
| controller/volume_controller.go | Uses backend size when determining expansion state and cancellation. |
| engineapi/types.go | Updates VolumeExpand interface to accept an explicit size. |
| engineapi/proxy_volume.go | Plumbs explicit size through proxy VolumeExpand. |
| engineapi/instance_manager.go | Adjusts engine/replica process --size args for encrypted volumes; adds encryption flag to create/upgrade requests. |
| engineapi/enginesim.go | Updates simulator signature for VolumeExpand(engine, size). |
| engineapi/engine.go | Updates engine-binary signature for VolumeExpand(engine, size). |
| csi/crypto/crypto.go | Reuses shared LUKS2 header size constant from go-common-libs. |
| vendor/modules.txt | Updates vendored go-common-libs version. |
| vendor/github.com/longhorn/go-common-libs/types/crypto.go | Adds constants and GetBackendSize() helper for LUKS2 header sizing. |
| go.mod | Bumps github.com/longhorn/go-common-libs dependency version. |
| go.sum | Updates checksums for the bumped go-common-libs version. |
@copilot Do you review PRs based on the instructions? |
| } | ||
|
|
||
| func IsValidForExpansion(engine *longhorn.Engine, cliAPIVersion, imAPIVersion int) (bool, error) { | ||
| func IsValidForExpansion(engine *longhorn.Engine, cliAPIVersion, imAPIVersion int, encrypted bool) (bool, int64, error) { |
There was a problem hiding this comment.
| func IsValidForExpansion(engine *longhorn.Engine, cliAPIVersion, imAPIVersion int, encrypted bool) (bool, int64, error) { | |
| func isValidForExpansion(engine *longhorn.Engine, cliAPIVersion, imAPIVersion int, encrypted bool) (bool, int64, error) { |
| return false | ||
| } | ||
|
|
||
| func GetPreciseBackendSize(size int64, encrypted bool, cliAPIVersion int) (int64, error) { |
There was a problem hiding this comment.
Precise is a bit weird. Unable to understand the purpose of the volume from the name. Can we use GetActualBackendSize?
| func IsCryptsetupVerWithFixed16MiBHeaderSize() (bool, error) { | ||
| ver, err := getCryptsetupVersion() | ||
| if err != nil { | ||
| return false, err | ||
| } | ||
|
|
||
| parsedVer, err := version.ParseSemantic(ver) | ||
| if err != nil { | ||
| return false, fmt.Errorf("failed to parse cryptsetup version %q: %w", ver, err) | ||
| } | ||
|
|
||
| // The fixed header size 16 MiB was introduced in cryptsetup 2.1.0. See: https://www.kernel.org/pub/linux/utils/cryptsetup/v2.1/v2.1.0-ReleaseNotes. | ||
| minVer, err := version.ParseSemantic("2.1.0") | ||
| if err != nil { | ||
| return false, fmt.Errorf("failed to parse minimum cryptsetup version: %w", err) | ||
| } | ||
|
|
||
| return parsedVer.AtLeast(minVer), nil | ||
| } |
There was a problem hiding this comment.
Move the function to go-common-libs
| func GetPreciseBackendSize(size int64, encrypted bool, cliAPIVersion int) (int64, error) { | ||
| if !encrypted { | ||
| return size, nil | ||
| } | ||
|
|
||
| if is16MiBHeaderPkgVersion, err := IsCryptsetupVerWithFixed16MiBHeaderSize(); err != nil || !is16MiBHeaderPkgVersion { | ||
| return size, err | ||
| } | ||
|
|
||
| return lhtypes.GetBackendSize(size, encrypted, cliAPIVersion), nil | ||
| } |
There was a problem hiding this comment.
Move the function to go-common-libs.
| func getCryptsetupVersion() (string, error) { | ||
| var err error | ||
|
|
||
| namespaces := []lhtypes.Namespace{lhtypes.NamespaceMnt, lhtypes.NamespaceNet} | ||
| nsexec, err := lhns.NewNamespaceExecutor(lhtypes.ProcessNone, lhtypes.HostProcDirectory, namespaces) | ||
| if err != nil { | ||
| return "", err | ||
| } | ||
|
|
||
| result, err := nsexec.Execute(nil, lhtypes.BinaryCryptsetup, []string{"--version"}, time.Hour) | ||
| if err != nil { | ||
| return "", errors.Wrap(err, "cannot find cryptsetup version info on host") | ||
| } | ||
|
|
||
| //command: cryptsetup --version; result: cryptsetup 2.4.3\n | ||
| fields := strings.Fields(result) | ||
| if len(fields) < 2 { | ||
| return "", fmt.Errorf("failed to parse cryptsetup version from output %q", result) | ||
| } | ||
| return fields[1], nil | ||
| } |
There was a problem hiding this comment.
Move the function to go-common-libs.
| "currentSize": e.Status.CurrentSize, | ||
| "expectedBackendSize": needBacknedSize, | ||
| "volume": e.Spec.VolumeName, | ||
| }).Info("Wait for the expected volume size to be updated before engine live upgrade for encrypted volume") |
There was a problem hiding this comment.
| }).Info("Wait for the expected volume size to be updated before engine live upgrade for encrypted volume") | |
| }).Info("Waiting for the expected volume size to be updated before engine live upgrade for encrypted volume") |
| } | ||
|
|
||
| e.Status.IsExpanding = true | ||
| ec.enqueueEngineAfter(e, 30*time.Second) |
There was a problem hiding this comment.
30 seconds is quite long. Why do we set a long duration?
| if is16MiBHeaderPkgVersion { | ||
| if isExpanding, err := ec.expandEncryptedVolumeBeforeLiveUpgrade(volume, engine); err != nil || isExpanding { | ||
| // Wait for the expected volume size to be updated before engine live upgrade for encrypted volume | ||
| return err |
There was a problem hiding this comment.
Is it reasonable to return nil err if the returned values are "isExpanding == true" and "err == nil"?
| if !v.Spec.Encrypted || e.Status.CurrentState != longhorn.InstanceStateRunning { | ||
| return false, nil | ||
| } |
There was a problem hiding this comment.
How do we handle the detached encrypted volume?
| } | ||
|
|
||
| e.Status.CurrentSize = volumeInfo.Size | ||
| needBacknedSize := lhtypes.GetBackendSize(e.Spec.VolumeSize, v.Spec.Encrypted, cliAPIVersion) |
There was a problem hiding this comment.
Typo. needBackendSize
BTW, can we find a better name for needBackendSize. The name is not easy to understand the purppse.
| } | ||
| } | ||
|
|
||
| e.Status.IsExpanding = true |
There was a problem hiding this comment.
You reuse e.Status.IsExpanding in the PR. What will happen if user trigger REAL volume exapsion? Should we reject the REAL volume exapsion request"
| return e.Status.IsExpanding, nil | ||
| } | ||
|
|
||
| func (ec *EngineController) resizeEncryptedDeviceBeforeUpgradeFor16MiBHeader(v *longhorn.Volume) error { |
There was a problem hiding this comment.
nit: the function name is too long
|
After disussing with @mantissahz, the PR doesn't handle the non-running volumes expansionl. We need to handle the logics. |
of the encryption volume. ref: longhorn/longhorn 9205 Signed-off-by: James Lu <james.lu@suse.com>
ref: longhorn/longhorn 9205 Signed-off-by: James Lu <james.lu@suse.com>
ref: longhorn/longhorn 9205 Signed-off-by: James Lu <james.lu@suse.com>
ref: longhorn/longhorn 9205 Signed-off-by: James Lu <james.lu@suse.com>
ref: longhorn/longhorn 9205 Signed-off-by: James Lu <james.lu@suse.com>
Which issue(s) this PR fixes:
Issue # longhorn/longhorn#9205
What this PR does / why we need it:
Special notes for your reviewer:
Additional documentation or context
Merge after this PR longhorn/go-common-libs#174