This update adds memory management and OCR model control features to NuoYi to address "CUDA out of memory" errors when processing large batches of PDFs.
Purpose: Save ~1.5GB VRAM by not loading OCR-related models for digital PDFs.
Usage:

    nuoyi paper.pdf --disable-ocr-models

When to use:
- Digital PDFs with embedded text (not scanned documents)
- PDFs without complex tables requiring OCR
- PDFs without mathematical formulas requiring OCR
Warning: OCR features will not work when this flag is enabled.
Improvements:
- More aggressive memory optimization
- Better VRAM threshold detection (4GB and 6GB thresholds)
- Automatic cleanup every N files during batch processing
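
The periodic cleanup during batch processing could look roughly like the sketch below. It is an illustration rather than NuoYi's actual code: `convert_batch`, `convert_one`, `cleanup`, and the default `cleanup_every=5` are hypothetical names and values, since the notes only say that cleanup runs every N files.

```python
import gc
from typing import Callable, Iterable, List


def convert_batch(
    paths: Iterable[str],
    convert_one: Callable[[str], str],
    cleanup: Callable[[], object] = gc.collect,
    cleanup_every: int = 5,  # hypothetical value for "N"
) -> List[str]:
    """Convert each path, running `cleanup` after every `cleanup_every` files."""
    if cleanup_every < 1:
        raise ValueError("cleanup_every must be at least 1")
    results = []
    for index, path in enumerate(paths, start=1):
        results.append(convert_one(path))
        if index % cleanup_every == 0:
            cleanup()  # e.g. garbage collection plus a GPU cache flush
    return results
```

Passing the cleanup routine in as a parameter keeps the loop testable without a GPU; in practice it would presumably be a GPU-aware cleanup function.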
Models are only loaded when the first file is processed, not at initialization. This allows:
- Checking available memory before loading
- Providing helpful error messages if memory is insufficient
- Automatic cleanup between files in batch mode
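
A minimal sketch of the lazy-loading pattern described above, assuming a loader callable and a free-memory probe are injected. `LazyModelHolder`, `get_models`, `release`, and the error text are hypothetical and do not reflect the real `MarkerPDFConverter` internals.

```python
from typing import Callable, Dict, Optional


class LazyModelHolder:
    """Defers model loading until the first conversion request."""

    def __init__(
        self,
        load_models: Callable[[], Dict[str, object]],
        free_memory_mb: Callable[[], float],
        required_mb: float,
    ) -> None:
        self._load_models = load_models
        self._free_memory_mb = free_memory_mb
        self._required_mb = required_mb
        self._models: Optional[Dict[str, object]] = None  # nothing loaded yet

    def get_models(self) -> Dict[str, object]:
        """Load the models on first use, after checking free memory."""
        if self._models is None:
            free_mb = self._free_memory_mb()
            if free_mb < self._required_mb:
                raise MemoryError(
                    f"Need about {self._required_mb:.0f} MB but only "
                    f"{free_mb:.0f} MB is free; try --low-vram or "
                    "--disable-ocr-models."
                )
            self._models = self._load_models()
        return self._models

    def release(self) -> None:
        """Drop the models so that the next call reloads them."""
        self._models = None
```

Because nothing is loaded in `__init__`, constructing the holder is cheap, and the memory check runs at the moment the models are actually needed.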
When CUDA OOM occurs during conversion:
- Automatic aggressive memory cleanup
- Retry with cleaned cache
- Helpful suggestions if the retry still fails
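
The retry flow above can be sketched as follows. This is an assumption-laden illustration: `run_with_oom_retry`, its parameters, and the suggestion text are hypothetical, and `MemoryError` stands in for the CUDA out-of-memory exception so the example runs without a GPU (with PyTorch one would likely pass `torch.cuda.OutOfMemoryError` instead).

```python
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def run_with_oom_retry(
    convert: Callable[[], T],
    cleanup: Callable[[], object],
    oom_errors: Tuple[Type[BaseException], ...] = (MemoryError,),
    max_retries: int = 1,
) -> T:
    """Run `convert`; on an out-of-memory error, clean up and try again."""
    attempt = 0
    while True:
        try:
            return convert()
        except oom_errors as exc:
            cleanup()  # free cached memory before deciding what to do next
            if attempt >= max_retries:
                raise RuntimeError(
                    "Still out of memory after cleanup; try --low-vram, "
                    "--disable-ocr-models, or CPU mode."
                ) from exc
            attempt += 1
```

Chaining the final error with `from exc` keeps the original traceback available while still surfacing the suggestions.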
New utility functions:
- `get_current_memory_usage()` - Get detailed GPU memory statistics
- `check_memory_available(required_mb)` - Check if enough memory is available
- `aggressive_memory_cleanup()` - Force cleanup of GPU memory
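
For orientation, helpers of this kind could be built on PyTorch's CUDA memory APIs roughly as below. This is a sketch under assumptions, not the code in `src/nuoyi/utils.py`: the `*_sketch` names, the returned dictionary keys, and the zero-valued fallback when CUDA is unavailable are all hypothetical.

```python
import gc

try:
    import torch
except ImportError:  # keep the sketch importable on machines without PyTorch
    torch = None


def _cuda_ready() -> bool:
    return torch is not None and torch.cuda.is_available()


def memory_usage_sketch() -> dict:
    """Return GPU memory figures in MB (all zeros when CUDA is unavailable)."""
    if not _cuda_ready():
        return {
            "allocated_mb": 0.0,
            "reserved_mb": 0.0,
            "free_mb": 0.0,
            "total_mb": 0.0,
        }
    free_bytes, total_bytes = torch.cuda.mem_get_info()
    mb = 1024 ** 2
    return {
        "allocated_mb": torch.cuda.memory_allocated() / mb,
        "reserved_mb": torch.cuda.memory_reserved() / mb,
        "free_mb": free_bytes / mb,
        "total_mb": total_bytes / mb,
    }


def memory_available_sketch(required_mb: float) -> bool:
    """True when at least `required_mb` of GPU memory is currently free."""
    return memory_usage_sketch()["free_mb"] >= required_mb


def cleanup_sketch() -> None:
    """Collect Python garbage, then release PyTorch's cached GPU memory."""
    gc.collect()
    if _cuda_ready():
        torch.cuda.empty_cache()
        torch.cuda.ipc_collect()
```

Running `gc.collect()` first matters because `torch.cuda.empty_cache()` can only release blocks that no live tensor still references.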
- `src/nuoyi/utils.py`
  - Added `aggressive_memory_cleanup()`
  - Added `check_memory_available()`
  - Added `get_current_memory_usage()`
  - Added `VERY_LOW_VRAM_THRESHOLD_GB` constant
  - Added `_setup_directml_env()` function
- `src/nuoyi/converter.py`
  - `MarkerPDFConverter`:
    - Added `disable_ocr_models` parameter
    - Lazy model loading
    - `_create_minimal_model_dict()` for OCR-disabled mode
    - OOM retry mechanism with cleanup
    - `cleanup()` method for explicit resource release
- `src/nuoyi/cli.py`
  - Added `--disable-ocr-models` CLI flag
  - Updated `convert_single_file()` and `convert_directory()`
  - Better memory management in batch processing
  - Conflict warning between `--disable-ocr-models` and `--force-ocr`
- `src/nuoyi/api.py`
  - Updated `convert_file()` with `disable_ocr_models` parameter
  - Updated `convert_directory()` with `disable_ocr_models` parameter
- `src/nuoyi/gui.py`
  - Added "No OCR Models" checkbox
  - Updated `ConverterWorker` to support `disable_ocr_models`
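
The conflict warning between `--disable-ocr-models` and `--force-ocr` might be wired up along these lines. The sketch assumes an `argparse`-based CLI; `parse_args`, the positional `input` argument, and the warning text are hypothetical, and the real `cli.py` may report the conflict differently.

```python
import argparse
import warnings
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the two OCR flags and warn when they are combined."""
    parser = argparse.ArgumentParser(prog="nuoyi")
    parser.add_argument("input", help="PDF file or directory")
    parser.add_argument("--disable-ocr-models", action="store_true")
    parser.add_argument("--force-ocr", action="store_true")
    args = parser.parse_args(argv)
    # OCR cannot run without its models, so the combination is contradictory.
    if args.disable_ocr_models and args.force_ocr:
        warnings.warn(
            "--force-ocr has no effect when --disable-ocr-models is set",
            stacklevel=2,
        )
    return args
```

A warning rather than a hard error lets the conversion proceed, which matches the notes describing this as a conflict warning.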
| Mode | VRAM Required | Features |
|---|---|---|
| Full models | ~3GB | All features (OCR, tables, formulas) |
| Minimal models (`--disable-ocr-models`) | ~1.5GB | Layout detection only, no OCR |
| CPU mode | 0GB | All features, slower |
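
As a rough illustration of how the table might guide mode selection, the sketch below maps free VRAM to one of the three modes. `suggest_mode` and its return labels are hypothetical; the two constants simply restate the approximate figures from the table.

```python
FULL_MODELS_MB = 3 * 1024   # "~3GB" for full models, from the table
MINIMAL_MODELS_MB = 1536    # "~1.5GB" for minimal models, from the table


def suggest_mode(free_vram_mb: float, needs_ocr: bool) -> str:
    """Pick "full", "minimal", or "cpu" from free VRAM and OCR needs."""
    if free_vram_mb >= FULL_MODELS_MB:
        return "full"
    # Minimal models have no OCR, so they only suit digital PDFs.
    if not needs_ocr and free_vram_mb >= MINIMAL_MODELS_MB:
        return "minimal"
    return "cpu"  # slower, but keeps every feature available
```

For example, a scanned PDF on a GPU with 2GB free falls through to CPU mode, because the minimal models cannot perform OCR.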
All tests pass (49 passed, 3 deselected):
- `tests/test_memory_management.py` - 13 new tests for memory features
- All existing tests continue to pass
- README.md - Added low VRAM options section, updated CLI options table
- README_CN.md - Chinese documentation updated with same content
- CLI help text includes new `--disable-ocr-models` option
Command line:

    nuoyi paper.pdf --disable-ocr-models --low-vram
    nuoyi ./papers --batch --disable-ocr-models
    nuoyi scanned.pdf --low-vram

Python API:

    from nuoyi.converter import MarkerPDFConverter

    # Minimal models for digital PDFs
    converter = MarkerPDFConverter(
        disable_ocr_models=True,
        low_vram=True,
        device="auto",
    )
    markdown, images = converter.convert_file("digital.pdf")
    # Don't forget to clean up when done
    converter.cleanup()

Potential future enhancements:
- Auto-detect PDF type (digital vs scanned) and auto-disable OCR models
- Model swapping: unload OCR models after OCR-heavy pages
- Streaming batch processing to reduce memory footprint
- Integration with system memory monitors for adaptive processing