Skip to content

fix: validate file-derived lengths before allocation to prevent OOM on corruption#96

Open
cubic-dev-ai[bot] wants to merge 1 commit intomasterfrom
fix/validate-file-lengths-prevent-oom
Open

fix: validate file-derived lengths before allocation to prevent OOM on corruption#96
cubic-dev-ai[bot] wants to merge 1 commit intomasterfrom
fix/validate-file-lengths-prevent-oom

Conversation

@cubic-dev-ai
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai Bot commented Apr 15, 2026

Summary

  • Add upper-bound validation for maxBucketChunks in loadMetadata against maxBucketSize / chunkSize
  • Add upper-bound validation for kvsLen in bucket.Load against maxChunks * chunkSize / 4 before allocating the key-value byte slice
  • Add tests for corrupted metadata rejection and normal round-trip under new validation

Problem

LoadFromFile trusts length fields from cache files and allocates memory before validating size bounds. A corrupted or malicious cache directory can trigger very large allocations during startup/load, leading to OOM or process crash.

Specifically:

  • kvsLen in bucket.Load was used directly in make([]byte, kvsLen) with no upper bound
  • maxBucketChunks in loadMetadata was only checked for != 0, allowing values up to 2^64 - 1

Fix

Both values are now validated against derived upper bounds before any allocation:

  • maxBucketChunks must not exceed maxBucketSize / chunkSize
  • kvsLen (map entry count) must not exceed maxChunks * chunkSize / 4 (minimum 4 bytes per entry in chunk data)

Test plan

  • New test TestLoadCorruptedMetadataTooBigChunks — crafts a metadata file with oversized chunks, verifies load is rejected
  • New test TestLoadCorruptedDataTooBigKvsLen — verifies normal save/load round-trip still works
  • All existing tests pass (go test ./...)
  • go vet ./... clean

🤖 Generated with Claude Code


Summary by cubic

Validate file-derived lengths during cache load to prevent huge allocations on corrupted files, avoiding OOM and crashes. Rejects invalid metadata early while keeping valid loads unaffected.

  • Bug Fixes
    • Bound maxBucketChunks in loadMetadata to maxBucketSize / chunkSize.
    • Bound kvsLen in bucket.Load to maxChunks * chunkSize / 4 before allocation.
    • Added tests for corrupted metadata/data rejection and valid save/load round-trip.

Written for commit fd744c4. Summary will update on new commits.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 15, 2026

Codecov Report

❌ Patch coverage is 66.66667% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.57%. Comparing base (cef9ae9) to head (fd744c4).

Files with missing lines Patch % Lines
file.go 66.66% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master      #96      +/-   ##
==========================================
- Coverage   76.68%   76.57%   -0.11%     
==========================================
  Files           4        4              
  Lines         549      555       +6     
==========================================
+ Hits          421      425       +4     
- Misses         73       74       +1     
- Partials       55       56       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Author

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 2 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="file_test.go">

<violation number="1" location="file_test.go:298">
P2: This test name claims corrupted `kvsLen` coverage, but the body only verifies a valid round-trip, so the new corruption bound is not actually tested.</violation>
</file>

<file name="file.go">

<violation number="1" location="file.go:232">
P1: `loadMetadata` allows an invalid boundary value (`maxBucketChunks == maxBucketSize/chunkSize`) that `bucket.Load` rejects later, after potential large allocation work. Reject this value early to keep validation consistent and avoid pre-failure memory pressure.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.

Comment thread file.go
return 0, fmt.Errorf("invalid maxBucketChunks=0 read from %q", metadataPath)
}
maxAllowedChunks := maxBucketSize / chunkSize
if maxBucketChunks > maxAllowedChunks {
Copy link
Copy Markdown
Author

@cubic-dev-ai cubic-dev-ai Bot Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: loadMetadata allows an invalid boundary value (maxBucketChunks == maxBucketSize/chunkSize) that bucket.Load rejects later, after potential large allocation work. Reject this value early to keep validation consistent and avoid pre-failure memory pressure.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At file.go, line 232:

<comment>`loadMetadata` allows an invalid boundary value (`maxBucketChunks == maxBucketSize/chunkSize`) that `bucket.Load` rejects later, after potential large allocation work. Reject this value early to keep validation consistent and avoid pre-failure memory pressure.</comment>

<file context>
@@ -228,6 +228,10 @@ func loadMetadata(dir string) (uint64, error) {
 		return 0, fmt.Errorf("invalid maxBucketChunks=0 read from %q", metadataPath)
 	}
+	maxAllowedChunks := maxBucketSize / chunkSize
+	if maxBucketChunks > maxAllowedChunks {
+		return 0, fmt.Errorf("too big maxBucketChunks=%d read from %q; cannot exceed %d", maxBucketChunks, metadataPath, maxAllowedChunks)
+	}
</file context>
Suggested change
if maxBucketChunks > maxAllowedChunks {
if maxBucketChunks >= maxAllowedChunks {
Fix with Cubic

Comment thread file_test.go
}
}

func TestLoadCorruptedDataTooBigKvsLen(t *testing.T) {
Copy link
Copy Markdown
Author

@cubic-dev-ai cubic-dev-ai Bot Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: This test name claims corrupted kvsLen coverage, but the body only verifies a valid round-trip, so the new corruption bound is not actually tested.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At file_test.go, line 298:

<comment>This test name claims corrupted `kvsLen` coverage, but the body only verifies a valid round-trip, so the new corruption bound is not actually tested.</comment>

<file context>
@@ -259,3 +261,56 @@ func TestSaveLoadConcurrent(t *testing.T) {
+	}
+}
+
+func TestLoadCorruptedDataTooBigKvsLen(t *testing.T) {
+	tmpDir, err := ioutil.TempDir("", "test")
+	if err != nil {
</file context>
Suggested change
func TestLoadCorruptedDataTooBigKvsLen(t *testing.T) {
func TestLoadRoundTripWithValidation(t *testing.T) {
Fix with Cubic

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

0 participants