Is there an existing issue for this?
None of the currently open issues directly addresses the "thread explosion" (CPU thrashing) caused by ONNX Runtime, though a few, such as #1034 and #1033, sound closely related.
What happened?
The application currently experiences significant performance degradation (high Load Average, UI stutters, system unresponsiveness) during the image syncing/indexing process.
After some investigation, I found that the backend initializes onnxruntime.InferenceSession with the default constructor, without specifying any thread limits. Combined with the existing ProcessPoolExecutor architecture, this causes a "thread explosion" that thrashes the CPU and degrades performance.
Cause
The backend handles parallelism by spawning multiple worker processes. Inside each worker, the ONNX models (YOLO.py, FaceNet.py) are initialized with default settings. By default, each ONNX Runtime session attempts to use all available CPU cores.
If the host machine has 8 cores and the ProcessPool spawns 4 workers:
- 4 workers × 8 threads per worker = 32 heavy compute threads competing for 8 physical cores.
Result: The OS scheduler spends more time context-switching than doing useful work, which starves the UI thread and makes indexing slower rather than faster.
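To make the arithmetic above concrete, here is a minimal sketch that computes the oversubscription ratio for a given setup. The function name `oversubscription` is hypothetical and only for illustration; the numbers are the ones from this issue, not measurements.

```python
def oversubscription(workers: int, threads_per_worker: int, cores: int) -> float:
    """Return how many compute threads compete for each core."""
    return (workers * threads_per_worker) / cores


# Scenario from this issue: 4 workers, each session using 8 threads, on 8 cores.
print(oversubscription(workers=4, threads_per_worker=8, cores=8))  # 4.0

# Same pool with each session limited to a single thread.
print(oversubscription(workers=4, threads_per_worker=1, cores=8))  # 0.5
```

A ratio above 1.0 means there are more runnable compute threads than cores, so the scheduler has to time-slice them.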
Proposed Solution
Since we already achieve parallelism at the process level (multiple images processed at once), we should disable parallelism at the model level.
We need to explicitly configure SessionOptions to limit each model instance to a single thread.
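A minimal sketch of what this could look like, assuming the sessions run on the CPU execution provider. The helper names `limit_threads` and `create_session` are hypothetical and do not refer to existing functions in the codebase; the actual change would go wherever YOLO.py and FaceNet.py construct their sessions.

```python
def limit_threads(opts, threads: int = 1):
    """Cap both ONNX Runtime thread pools on a SessionOptions object."""
    opts.intra_op_num_threads = threads  # threads used inside a single operator
    opts.inter_op_num_threads = threads  # threads used to run operators in parallel
    return opts


def create_session(model_path: str):
    """Build an InferenceSession limited to one thread (hypothetical helper)."""
    # Imported here so each worker process loads onnxruntime only when needed.
    import onnxruntime as ort

    opts = limit_threads(ort.SessionOptions(), threads=1)
    # Run graph nodes one after another instead of scheduling them in parallel.
    opts.execution_mode = ort.ExecutionMode.ORT_SEQUENTIAL
    return ort.InferenceSession(
        model_path,
        sess_options=opts,
        providers=["CPUExecutionProvider"],  # assumption: CPU inference
    )
```

With this in place, the process pool alone decides how many cores are in use: 4 workers would mean 4 compute threads instead of 32.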