fix(crypto): add extra 16MiB space to engine and replicas#4624

Open
mantissahz wants to merge 5 commits into longhorn:master from mantissahz:issue9205

@mantissahz
Contributor

@mantissahz mantissahz commented Mar 31, 2026

Which issue(s) this PR fixes:

Issue # longhorn/longhorn#9205

What this PR does / why we need it:

Special notes for your reviewer:

Additional documentation or context

Merge this PR after longhorn/go-common-libs#174.

@mantissahz mantissahz requested a review from a team March 31, 2026 13:34
@mantissahz mantissahz self-assigned this Mar 31, 2026
@coderabbitai

coderabbitai bot commented Mar 31, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

@derekbit derekbit requested a review from shuo-wu April 2, 2026 00:56
@derekbit
Member

derekbit commented Apr 6, 2026

@mantissahz Is the PR ready?

@mantissahz
Contributor Author

Hi @derekbit,
No. longhorn/go-common-libs#174 needs to be merged first, and then the package dependency in this PR needs to be updated.

@mantissahz mantissahz force-pushed the issue9205 branch 2 times, most recently from 9d03fb8 to 36a457b Compare April 7, 2026 04:57
@mantissahz mantissahz marked this pull request as ready for review April 8, 2026 10:51
@mantissahz mantissahz marked this pull request as draft April 8, 2026 10:52
@mantissahz mantissahz requested review from a team April 8, 2026 10:53
@mantissahz mantissahz force-pushed the issue9205 branch 2 times, most recently from be389bb to baa81df Compare April 8, 2026 11:26
@mantissahz mantissahz marked this pull request as ready for review April 9, 2026 02:22
@mantissahz mantissahz requested review from Copilot and derekbit April 9, 2026 02:22
Contributor

Copilot AI left a comment

Pull request overview

This PR addresses encrypted volume sizing by accounting for the fixed 16 MiB LUKS2 header (Issue longhorn/longhorn#9205; issue context unavailable in this review environment), and adds safeguards so migrations/upgrades only proceed when the engine image supports the required crypto behavior.

Changes:

  • Add a shared “backend size” calculation (volume size + 16 MiB header when applicable) and use it when creating/upgrading engine/replica processes and when reconciling expansion.
  • Add webhook validations to block encrypted live migration actions when the engine image CLI API version is too old to support the new sizing behavior.
  • Introduce cryptsetup version detection and a pre-live-upgrade encrypted device resize step for the fixed-header behavior.
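The first bullet's arithmetic can be sketched as follows. This is a minimal illustration, assuming the header is added only for encrypted volumes. The names `luks2HeaderSize` and `backendSize` are hypothetical and are not the go-common-libs API; the real `GetBackendSize()` also takes a CLI API version, which this sketch leaves out.

```go
package main

import "fmt"

// luks2HeaderSize is the fixed 16 MiB LUKS2 header discussed in this PR.
// The constant name is hypothetical, chosen for this sketch only.
const luks2HeaderSize int64 = 16 * 1024 * 1024

// backendSize is a hypothetical helper: it returns the size the engine and
// replicas would need so that an encrypted volume still exposes volumeSize
// bytes after the LUKS2 header is accounted for.
func backendSize(volumeSize int64, encrypted bool) int64 {
	if !encrypted {
		return volumeSize
	}
	return volumeSize + luks2HeaderSize
}

func main() {
	const oneGiB int64 = 1 << 30
	fmt.Println(backendSize(oneGiB, false)) // 1073741824
	fmt.Println(backendSize(oneGiB, true))  // 1090519040
}
```

For a 1 GiB encrypted volume the backend is 1 GiB plus 16 MiB, which is the gap the expansion reconciliation has to compare against.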

Reviewed changes

Copilot reviewed 14 out of 17 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| webhook/resources/volumeattachment/validator.go | Adds encrypted migratable-volume validation gated on engine CLI API version. |
| webhook/resources/volume/validator.go | Adds validation when setting MigrationNodeID for encrypted volumes to require sufficient engine CLI API version. |
| util/util.go | Adds cryptsetup version detection and a “precise backend size” helper used by controllers/engine process args. |
| types/types.go | Adds StorageClass parameter keys for node-stage secret lookup. |
| controller/engine_controller.go | Adds pre-live-upgrade expansion/resize flow for encrypted volumes and updates expansion logic to compare against backend size. |
| controller/replica_controller.go | Passes encryption flag into replica instance creation requests. |
| controller/volume_controller.go | Uses backend size when determining expansion state and cancellation. |
| engineapi/types.go | Updates VolumeExpand interface to accept an explicit size. |
| engineapi/proxy_volume.go | Plumbs explicit size through proxy VolumeExpand. |
| engineapi/instance_manager.go | Adjusts engine/replica process --size args for encrypted volumes; adds encryption flag to create/upgrade requests. |
| engineapi/enginesim.go | Updates simulator signature for VolumeExpand(engine, size). |
| engineapi/engine.go | Updates engine-binary signature for VolumeExpand(engine, size). |
| csi/crypto/crypto.go | Reuses shared LUKS2 header size constant from go-common-libs. |
| vendor/modules.txt | Updates vendored go-common-libs version. |
| vendor/github.com/longhorn/go-common-libs/types/crypto.go | Adds constants and GetBackendSize() helper for LUKS2 header sizing. |
| go.mod | Bumps github.com/longhorn/go-common-libs dependency version. |
| go.sum | Updates checksums for the bumped go-common-libs version. |

@derekbit
Member

derekbit commented Apr 9, 2026

@copilot Do you review PRs based on the instructions?

}

- func IsValidForExpansion(engine *longhorn.Engine, cliAPIVersion, imAPIVersion int) (bool, error) {
+ func IsValidForExpansion(engine *longhorn.Engine, cliAPIVersion, imAPIVersion int, encrypted bool) (bool, int64, error) {
Member

Suggested change
- func IsValidForExpansion(engine *longhorn.Engine, cliAPIVersion, imAPIVersion int, encrypted bool) (bool, int64, error) {
+ func isValidForExpansion(engine *longhorn.Engine, cliAPIVersion, imAPIVersion int, encrypted bool) (bool, int64, error) {

return false
}

func GetPreciseBackendSize(size int64, encrypted bool, cliAPIVersion int) (int64, error) {
Member

"Precise" is a bit odd here. The purpose is not clear from the name. Can we use GetActualBackendSize?

Comment on lines +941 to +959
func IsCryptsetupVerWithFixed16MiBHeaderSize() (bool, error) {
    ver, err := getCryptsetupVersion()
    if err != nil {
        return false, err
    }

    parsedVer, err := version.ParseSemantic(ver)
    if err != nil {
        return false, fmt.Errorf("failed to parse cryptsetup version %q: %w", ver, err)
    }

    // The fixed header size 16 MiB was introduced in cryptsetup 2.1.0. See: https://www.kernel.org/pub/linux/utils/cryptsetup/v2.1/v2.1.0-ReleaseNotes.
    minVer, err := version.ParseSemantic("2.1.0")
    if err != nil {
        return false, fmt.Errorf("failed to parse minimum cryptsetup version: %w", err)
    }

    return parsedVer.AtLeast(minVer), nil
}
Member

Move the function to go-common-libs
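For reference while relocating it: the version gate above reduces to a component-wise comparison against 2.1.0. Below is a standard-library-only sketch of that comparison. `parseVersion`, `atLeast`, and `hasFixedHeader` are hypothetical names for illustration, and unlike `version.ParseSemantic` this sketch accepts only plain `MAJOR.MINOR.PATCH` strings (no pre-release or build suffixes).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits a plain "MAJOR.MINOR.PATCH" string into three integers.
// Hypothetical helper; pre-release and build suffixes are rejected.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(v, ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("expected MAJOR.MINOR.PATCH, got %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return out, fmt.Errorf("invalid component %q in version %q", p, v)
		}
		out[i] = n
	}
	return out, nil
}

// atLeast reports whether v >= minimum, comparing major, then minor, then patch.
func atLeast(v, minimum [3]int) bool {
	for i := range v {
		if v[i] != minimum[i] {
			return v[i] > minimum[i]
		}
	}
	return true
}

// hasFixedHeader mirrors the 2.1.0 threshold used in the snippet under review.
func hasFixedHeader(ver string) (bool, error) {
	parsed, err := parseVersion(ver)
	if err != nil {
		return false, err
	}
	return atLeast(parsed, [3]int{2, 1, 0}), nil
}

func main() {
	for _, ver := range []string{"2.0.6", "2.1.0", "2.4.3"} {
		ok, err := hasFixedHeader(ver)
		fmt.Println(ver, ok, err)
	}
}
```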

Comment on lines +929 to +939
func GetPreciseBackendSize(size int64, encrypted bool, cliAPIVersion int) (int64, error) {
    if !encrypted {
        return size, nil
    }

    if is16MiBHeaderPkgVersion, err := IsCryptsetupVerWithFixed16MiBHeaderSize(); err != nil || !is16MiBHeaderPkgVersion {
        return size, err
    }

    return lhtypes.GetBackendSize(size, encrypted, cliAPIVersion), nil
}
Member

Move the function to go-common-libs.

Comment on lines +961 to +981
func getCryptsetupVersion() (string, error) {
    var err error

    namespaces := []lhtypes.Namespace{lhtypes.NamespaceMnt, lhtypes.NamespaceNet}
    nsexec, err := lhns.NewNamespaceExecutor(lhtypes.ProcessNone, lhtypes.HostProcDirectory, namespaces)
    if err != nil {
        return "", err
    }

    result, err := nsexec.Execute(nil, lhtypes.BinaryCryptsetup, []string{"--version"}, time.Hour)
    if err != nil {
        return "", errors.Wrap(err, "cannot find cryptsetup version info on host")
    }

    //command: cryptsetup --version; result: cryptsetup 2.4.3\n
    fields := strings.Fields(result)
    if len(fields) < 2 {
        return "", fmt.Errorf("failed to parse cryptsetup version from output %q", result)
    }
    return fields[1], nil
}
Member

Move the function to go-common-libs.
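If the parsing is moved, it may be worth isolating it from the namespace executor so it can be unit-tested on its own. A rough sketch of just the output-parsing step, assuming the `cryptsetup 2.4.3\n` format quoted in the code comment above; `versionFromOutput` is a hypothetical name, and the extra check on the first token is an assumption of this sketch, not something the PR does.

```go
package main

import (
	"fmt"
	"strings"
)

// versionFromOutput extracts the version token from `cryptsetup --version`
// output such as "cryptsetup 2.4.3\n". Hypothetical helper for illustration.
func versionFromOutput(output string) (string, error) {
	fields := strings.Fields(output)
	// Assumption of this sketch: the first token names the binary. The code
	// under review only checks that at least two tokens are present.
	if len(fields) < 2 || fields[0] != "cryptsetup" {
		return "", fmt.Errorf("failed to parse cryptsetup version from output %q", output)
	}
	return fields[1], nil
}

func main() {
	samples := []string{
		"cryptsetup 2.4.3\n",
		"cryptsetup 2.4.3 trailing tokens\n", // made-up sample with extra tokens
		"",
	}
	for _, s := range samples {
		ver, err := versionFromOutput(s)
		fmt.Printf("%q -> %q, err=%v\n", s, ver, err)
	}
}
```

Because `strings.Fields` splits on any whitespace, the trailing newline and any tokens after the version are ignored.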

"currentSize": e.Status.CurrentSize,
"expectedBackendSize": needBacknedSize,
"volume": e.Spec.VolumeName,
}).Info("Wait for the expected volume size to be updated before engine live upgrade for encrypted volume")
Member

Suggested change
- }).Info("Wait for the expected volume size to be updated before engine live upgrade for encrypted volume")
+ }).Info("Waiting for the expected volume size to be updated before engine live upgrade for encrypted volume")

}

e.Status.IsExpanding = true
ec.enqueueEngineAfter(e, 30*time.Second)
Member

30 seconds is quite long. Why do we set a long duration?

if is16MiBHeaderPkgVersion {
    if isExpanding, err := ec.expandEncryptedVolumeBeforeLiveUpgrade(volume, engine); err != nil || isExpanding {
        // Wait for the expected volume size to be updated before engine live upgrade for encrypted volume
        return err
Member

Is it reasonable to return nil err if the returned values are "isExpanding == true" and "err == nil"?
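To make the question concrete: the combined condition `err != nil || isExpanding` folds two different outcomes into one `return err`. A small sketch that spells out the three cases separately is shown below. `decide` and `errBoom` are hypothetical names for illustration only, and whether the "still expanding" case should return nil is exactly the open design question, not something this sketch settles.

```go
package main

import (
	"errors"
	"fmt"
)

// errBoom is a placeholder error used only by this sketch.
var errBoom = errors.New("expansion check failed")

// decide is a hypothetical stand-in for the caller's branching. It reports
// whether the live upgrade may proceed and which error, if any, to return.
func decide(isExpanding bool, err error) (proceed bool, retErr error) {
	if err != nil {
		// The expansion step itself failed: surface the error.
		return false, err
	}
	if isExpanding {
		// Expansion is still in progress: skip the upgrade this round.
		// Returning nil here matches what the combined condition does today.
		return false, nil
	}
	// Nothing pending: the upgrade may continue.
	return true, nil
}

func main() {
	fmt.Println(decide(false, errBoom)) // false expansion check failed
	fmt.Println(decide(true, nil))      // false <nil>
	fmt.Println(decide(false, nil))     // true <nil>
}
```

Splitting the branches this way leaves room for a distinct log message or requeue in the "still expanding" case.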

Comment on lines +407 to +409
if !v.Spec.Encrypted || e.Status.CurrentState != longhorn.InstanceStateRunning {
    return false, nil
}
Member

How do we handle the detached encrypted volume?

}

e.Status.CurrentSize = volumeInfo.Size
needBacknedSize := lhtypes.GetBackendSize(e.Spec.VolumeSize, v.Spec.Encrypted, cliAPIVersion)
Member

Typo: this should be needBackendSize.
BTW, can we find a better name for needBackendSize? Its purpose is not easy to understand from the name.

}
}

e.Status.IsExpanding = true
Member

You reuse e.Status.IsExpanding in the PR. What will happen if the user triggers a REAL volume expansion? Should we reject the REAL volume expansion request?

return e.Status.IsExpanding, nil
}

func (ec *EngineController) resizeEncryptedDeviceBeforeUpgradeFor16MiBHeader(v *longhorn.Volume) error {
Member

nit: the function name is too long

@derekbit
Member

After discussing with @mantissahz, the PR doesn't handle expansion of non-running volumes. We need to handle that logic.

of the encryption volume.

ref: longhorn/longhorn 9205

Signed-off-by: James Lu <james.lu@suse.com>