
Fix OOM during dictionary database export with custom chunked streaming#2347

Open
RadiantSol wants to merge 4 commits into yomidevs:master from RadiantSol:claude/relaxed-bhaskara

Conversation


@RadiantSol RadiantSol commented Mar 21, 2026

I'm actually not 100% sure why the export crashes, since from reading Dexie's export documentation it should be streamed properly. I suspect the issue is related to how macOS handles memory optimization, since I can only reproduce the crashes from #1034 and #1274 on my 16 GB RAM MacBook. Regardless, the change below is able to export a 100-dictionary collection successfully on my machine without crashing. I'd encourage anyone more well versed in Dexie to investigate further, as for some reason finding the actual exception thrown is exceedingly hard.

EDIT: Update to the investigation: this might be related to how Chromium decides whether to keep Blobs in memory or spill them to disk. Chromium seems to define a per-machine threshold based on available RAM. Since Dexie does not merge Blobs until the end of the export, the intermediate Blobs are potentially small enough to remain in memory and eventually overload it. This implementation periodically merges them into a single larger Blob, which I am assuming Chromium handles properly by storing it on disk.
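The periodic-merging strategy described above can be sketched roughly as follows. This is an illustrative stand-in, not the PR's actual code: the class name, method names, and accumulator structure are all hypothetical, though the 50MB default matches the threshold the PR describes.

```javascript
// Hypothetical sketch of periodic Blob merging: small chunk Blobs accumulate
// in a list, and once their combined size crosses a threshold they are
// collapsed (together with the previously merged Blob) into one larger Blob,
// which Chromium can then back with disk storage instead of heap memory.
class BlobAccumulator {
    constructor(mergeThreshold = 50 * 1024 * 1024) {
        this.mergeThreshold = mergeThreshold;
        this.merged = new Blob([]); // one large, periodically merged Blob
        this.pending = [];          // small chunk Blobs not yet merged
        this.pendingSize = 0;
    }

    push(chunkText) {
        const part = new Blob([chunkText]);
        this.pending.push(part);
        this.pendingSize += part.size;
        if (this.pendingSize >= this.mergeThreshold) {
            // Collapse the big Blob and all pending parts into a single Blob,
            // so only one large Blob (plus a short tail) exists at a time.
            this.merged = new Blob([this.merged, ...this.pending]);
            this.pending = [];
            this.pendingSize = 0;
        }
    }

    finalize() {
        // Produce the complete export Blob from the merged Blob plus any
        // pending parts that never reached the merge threshold.
        return new Blob([this.merged, ...this.pending]);
    }
}
```

The key design point is that the list of unmerged parts never grows past the threshold, so Dexie-style "keep every small chunk Blob until the very end" accumulation is avoided.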

AI Generated Summary
Replace dexie-export-import's db.export() with a custom streaming export that reads IndexedDB via cursors with one transaction per chunk.

The custom implementation uses separate transactions per 2000-row chunk and periodic Blob merging (50MB threshold) to minimize heap pressure, while producing dexie-export-import compatible JSON output.

Fixes #1034

Replace dexie-export-import's db.export() with a custom streaming export
that reads IndexedDB via cursors with one transaction per chunk. The
default db.export() wraps the entire export in a single read transaction,
forcing the IDB engine to hold snapshot data for every row in memory
until the transaction completes, causing OOM on large databases (1.5GB+).

The custom implementation uses separate transactions per 2000-row chunk
and periodic Blob merging (50MB threshold) to minimize heap pressure,
while producing dexie-export-import compatible JSON output.

Fixes yomidevs#1034

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
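The per-chunk transaction loop described in the commit message might look roughly like this. It is a simplified, synchronous sketch, not the PR's code: the real implementation reads IndexedDB asynchronously via cursors, and `readChunk` here is a hypothetical stand-in for "open a fresh read transaction, read up to `chunkSize` rows after `lastKey`, close the transaction". Because each chunk gets its own short-lived transaction, the IDB engine never has to hold a snapshot of the whole table, which is what made the single-transaction `db.export()` run out of memory.

```javascript
const CHUNK_SIZE = 2000; // rows per transaction, as in the PR

// Drain one table in chunks via keyset pagination. readChunk(table, lastKey,
// limit) is a hypothetical reader returning [{key, value}, ...] rows with
// key > lastKey, at most `limit` of them; an empty array signals completion.
function exportTable(tableName, readChunk, chunkSize = CHUNK_SIZE) {
    const rows = [];
    let lastKey = null;
    for (;;) {
        // In the real implementation each call is its own IndexedDB
        // read transaction, opened and closed within this iteration.
        const chunk = readChunk(tableName, lastKey, chunkSize);
        if (chunk.length === 0) break;
        rows.push(...chunk.map((entry) => entry.value));
        lastKey = chunk[chunk.length - 1].key;
    }
    // Simplified stand-in for the dexie-export-import-compatible per-table
    // JSON shape the PR says it produces.
    return { tableName, rows };
}
```

In the PR, each chunk's rows would be serialized and handed to the Blob accumulator rather than kept in an array, so heap usage stays bounded by the chunk size rather than the table size.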
@RadiantSol RadiantSol requested a review from a team as a code owner March 21, 2026 17:58

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: aa3db0f2d5



Member

Kuuuube commented Mar 21, 2026

Make sure these can import as well. There's no sense in allowing export of collections that exceed memory if they cannot be imported.


RadiantSol commented Mar 30, 2026

Just tested: I was able to export a 97-dictionary collection on the problematic machine and re-import it with no issues. I can't attach it due to size restrictions, of course, but here are screenshots of the collection before and after.
[Screenshot 2026-03-29 at 6:49:17 PM: collection before export]
[Screenshot 2026-03-29 at 7:14:07 PM: collection after re-import]

EDIT: Also worth noting that I gave the definitions themselves a quick look through, and they all look good.

@Kuuuube Kuuuube added the kind/enhancement and area/performance labels Mar 30, 2026

Kuuuube commented Mar 30, 2026

[image attachment]

Well this doesn't seem right.


Labels

  • area/performance: The issue or PR is related to performance
  • kind/enhancement: The issue or PR is a new feature or request


Development

Successfully merging this pull request may close these issues.

Export dictionaries to JSON failed
