inotify exhausted under large parallel campaigns #130

@azchin

Summary

libCRS register-submit-dir watchers consume inotify instances, so parallel OSS-CRS runs can hit the default Linux host limit for fs.inotify.max_user_instances.

On a host with the default limit of 128, running multiple OSS-CRS fuzzer containers caused new containers to fail during startup with:

OSError(24, 'inotify instance limit reached')

What we observed

The processes consuming these inotify instances were libCRS watcher processes launched from /run_fuzzer.sh inside fuzzer containers.

Examples:

/root/.local/share/uv/tools/libcrs/bin/python /usr/local/bin/libCRS register-submit-dir pov /work/out/main/submitted_povs --log /tmp/pov_submit_main.log
/root/.local/share/uv/tools/libcrs/bin/python /usr/local/bin/libCRS register-submit-dir seed /work/out/worker_1/queue --log /tmp/seed_submit_worker_1.log

These were spawned in large numbers under active fuzzer containers, with parent process:

/bin/bash /run_fuzzer.sh

On the affected host, root was exactly at the inotify instance cap:

fs.inotify.max_user_instances = 128
root: 128 inotify instances

At that point, additional root-owned/container processes trying to create an inotify instance failed.
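For anyone trying to reproduce the count above, here is a minimal Python sketch that tallies open inotify file descriptors per UID by scanning /proc. It is not part of libCRS; the function name and the `proc_root` parameter are made up for illustration. It counts descriptors rather than kernel instances, so an inotify descriptor inherited across a fork is counted once per process and the result may overestimate the kernel's per-user figure.

```python
import os
from collections import Counter
from pathlib import Path


def count_inotify_fds(proc_root="/proc"):
    """Return a Counter mapping UID -> number of open inotify descriptors.

    Hypothetical helper: scans <proc_root>/<pid>/fd and counts symlinks whose
    target is "anon_inode:inotify". Processes we cannot inspect are skipped.
    """
    counts = Counter()
    for pid_dir in Path(proc_root).iterdir():
        if not pid_dir.name.isdigit():
            continue  # not a process directory
        try:
            uid = pid_dir.stat().st_uid
            fds = list((pid_dir / "fd").iterdir())
        except OSError:
            continue  # process exited or permission denied
        for fd in fds:
            try:
                target = os.readlink(fd)
            except OSError:
                continue  # descriptor closed while we were scanning
            if target == "anon_inode:inotify":
                counts[uid] += 1
    return counts


if __name__ == "__main__":
    for uid, n in count_inotify_fds().most_common():
        print(f"uid {uid}: {n} inotify descriptors")
```

Run as root on the host to see descriptors held by all users, including root-owned container processes.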

Why this is a problem

This makes OSS-CRS fragile on otherwise normal Linux hosts using the default kernel setting. Users can hit a non-obvious host-level failure that looks like a container/runtime problem, but is really exhaustion caused by many register-submit-dir watchers.

Suggested improvements

Possible fixes / mitigations:

Replace register-submit-dir and similar APIs with simple polling.

Document raising fs.inotify.max_user_instances on the host, for example:

fs.inotify.max_user_instances=1024

Potentially also recommend a larger fs.inotify.max_user_watches as a companion setting.
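As a rough illustration of the polling option mentioned above, the sketch below scans a directory on a timer instead of holding an inotify instance. It is not the libCRS implementation: `poll_new_files`, `watch`, the `submit` callback, and the `min_age` and `interval` defaults are all assumptions chosen for the example.

```python
import os
import time


def poll_new_files(directory, seen, min_age=1.0):
    """Return not-yet-seen regular files in `directory`, sorted by path.

    Files modified less than `min_age` seconds ago are skipped for now so that
    a file still being written is picked up on a later pass instead.
    """
    now = time.time()
    fresh = []
    try:
        with os.scandir(directory) as entries:
            for entry in entries:
                if entry.path in seen or not entry.is_file():
                    continue
                if now - entry.stat().st_mtime < min_age:
                    continue
                seen.add(entry.path)
                fresh.append(entry.path)
    except FileNotFoundError:
        pass  # directory missing or a file vanished mid-scan; retry next pass
    return sorted(fresh)


def watch(directory, submit, interval=2.0, should_stop=lambda: False):
    """Call `submit(path)` once per new file until `should_stop()` is true."""
    seen = set()
    while not should_stop():
        for path in poll_new_files(directory, seen):
            submit(path)
        time.sleep(interval)
```

The trade-off is latency of up to `interval` plus `min_age` seconds per file, in exchange for using no inotify instances at all.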

Expected behavior

OSS-CRS should either:

  • avoid consuming so many inotify instances by default, or
  • fail with a clear, actionable diagnostic, or
  • document the required host tuning clearly enough that operators can avoid this upfront.
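For the "clear, actionable diagnostic" option, one possible shape is a helper that turns the raw OSError into an operator-facing message. This is only a sketch: the function names are hypothetical, and it assumes the failure surfaces as errno 24 (EMFILE), as in the error quoted above. EMFILE can also mean the per-process open file limit was reached, so the message mentions both.

```python
import errno

# Standard procfs location of the per-user inotify instance limit on Linux.
INOTIFY_LIMIT_PATH = "/proc/sys/fs/inotify/max_user_instances"


def read_inotify_instance_limit(path=INOTIFY_LIMIT_PATH):
    """Return the current limit as an int, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def explain_inotify_error(exc, limit=None):
    """Return a hint for EMFILE errors, or None for unrelated errors.

    `limit` may be passed explicitly; otherwise it is read from procfs.
    """
    if exc.errno != errno.EMFILE:
        return None
    if limit is None:
        limit = read_inotify_instance_limit()
    shown = "unknown" if limit is None else str(limit)
    return (
        "Could not create an inotify instance (EMFILE). The host may have "
        "reached fs.inotify.max_user_instances (current value: "
        f"{shown}), or this process hit its open file limit. "
        "Consider raising the sysctl on the host, e.g. "
        "sysctl -w fs.inotify.max_user_instances=1024"
    )
```

A caller would wrap watcher startup in `try/except OSError`, log the returned hint when it is not None, and re-raise otherwise.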
