Summary
With Milvus Lite 2.5.1, a long-lived MilvusClient stops working after roughly 60 seconds of idle. The client object is never recreated, never closed, and runs in a single sync Python process — but the next RPC after the idle gap raises:
MilvusException: (code=2, message=Fail connecting to server on unix:/tmp/tmpXXX_<db>.sock,
illegal connection params or server unavailable)
Shorter idles (≤30s) work fine. The unix socket file still exists on disk, but connections to it fail, suggesting the milvus-lite daemon subprocess has exited while the Python side still holds its MilvusClient.
This hurts any long-lived Python process that talks to Milvus Lite infrequently — MCP servers, long-running web backends with low QPS, interactive Jupyter notebooks with multi-minute gaps between cells, scheduled jobs, REPL-style tools, etc.
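The socket observation above (file present, connections refused) can be checked independently of pymilvus. The following is a minimal diagnostic sketch, not part of the original report: `probe_unix_socket` is a hypothetical helper name, and the path to pass in is whatever `.sock` path appears in the exception message. It only uses the Python standard library.

```python
import os
import socket


def probe_unix_socket(path: str, timeout: float = 1.0) -> str:
    """Classify a unix socket path as 'missing', 'dead', or 'alive'.

    Hypothetical helper for diagnosis only; it is not a pymilvus or
    milvus-lite API.
    """
    if not os.path.exists(path):
        return "missing"
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
    except OSError:
        # The file exists but the connect failed (refused or timed out),
        # which is consistent with the listening process having exited.
        return "dead"
    finally:
        sock.close()
    return "alive"
```

If this returns "dead" for the path in the error message right after the failure, that would support the reading that the server side went away while the socket file was left behind. It does not by itself say why the server exited.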
Minimal reproduction (no threads, no asyncio, no MCP)
import os, time
from pymilvus import MilvusClient, DataType
db = "/tmp/lite_repro.db"
if os.path.exists(db):
    os.unlink(db)
client = MilvusClient(uri=db)
schema = client.create_schema(auto_id=False)
schema.add_field(field_name="id", datatype=DataType.VARCHAR, is_primary=True, max_length=64)
schema.add_field(field_name="vector", datatype=DataType.FLOAT_VECTOR, dim=4)
idx = client.prepare_index_params()
idx.add_index(field_name="vector", index_type="AUTOINDEX", metric_type="COSINE")
client.create_collection(collection_name="probe", schema=schema, index_params=idx)
client.insert("probe", [{"id": "a", "vector": [0.1, 0.2, 0.3, 0.4]}])
print("first insert OK")
time.sleep(60) # <-- key
client.insert("probe", [{"id": "b", "vector": [0.5, 0.6, 0.7, 0.8]}])
# MilvusException: Fail connecting to server on unix:/tmp/tmp1ns3wqia_lite_repro.db.sock,
# illegal connection params or server unavailable
Idle-duration matrix (same script, only time.sleep(N) changes)
| idle N (s) | second insert |
| --- | --- |
| 5 | ✅ OK |
| 15 | ✅ OK |
| 30 | ✅ OK |
| 60 | ❌ FAIL (server unavailable) |
The threshold sits somewhere between 30s and 60s in this environment. The behavior is the same with a freshly created ./milvus.db file, so it is not caused by file locking or stale state.
Environment
- OS: Ubuntu 22.04 (Linux 5.15.0-174-generic, x86_64)
- Python: 3.12.9
- pymilvus: 2.6.8
- milvus-lite: 2.5.1 (latest stable on PyPI as of 2026-04-16)
What I'd like to understand / propose
- Is this intentional? i.e. does milvus-lite deliberately shut down the daemon subprocess after an idle timeout to free resources?
- If intentional, is there a supported way to keep the daemon alive, or a documented recommendation for long-lived clients (e.g. "periodically call list_collections() as a keepalive" or "reconstruct MilvusClient on code=2 errors")?
- If not intentional, this looks like a real regression for any long-lived low-QPS user of Milvus Lite. Would be great to have either a fix (client-side auto-reconnect on server unavailable / closed channel) or explicit documentation of the unsupported usage pattern.
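To make the "reconstruct the client on failure" idea from the list above concrete, here is a rough sketch of what a caller-side workaround could look like. It is an assumption-laden illustration, not a confirmed fix: `ReconnectingClient`, `factory`, and `should_reconnect` are hypothetical names, and whether building a new MilvusClient actually brings the Milvus Lite server back is exactly what this issue asks to have confirmed. The wrapper is deliberately generic so it runs without pymilvus installed.

```python
from typing import Any, Callable


class ReconnectingClient:
    """Proxy that rebuilds the wrapped client once when a call fails.

    `factory` and `should_reconnect` are hypothetical hooks supplied by
    the caller; nothing here is a pymilvus or milvus-lite API.
    """

    def __init__(
        self,
        factory: Callable[[], Any],
        should_reconnect: Callable[[Exception], bool],
    ) -> None:
        self._factory = factory
        self._should_reconnect = should_reconnect
        self._client = factory()

    def __getattr__(self, name: str) -> Any:
        # __getattr__ only runs for names not found on the proxy itself.
        # Refuse private names so a half-initialised proxy cannot recurse.
        if name.startswith("_"):
            raise AttributeError(name)

        def call(*args: Any, **kwargs: Any) -> Any:
            try:
                return getattr(self._client, name)(*args, **kwargs)
            except Exception as exc:
                if not self._should_reconnect(exc):
                    raise
                # Rebuild the client and retry exactly once; a second
                # failure propagates to the caller unchanged.
                self._client = self._factory()
                return getattr(self._client, name)(*args, **kwargs)

        return call
```

With pymilvus, this might be wired up as `ReconnectingClient(lambda: MilvusClient(uri=db), lambda exc: isinstance(exc, MilvusException) and exc.code == 2)`, assuming the exception exposes the same code shown in the error above. Note that a blind retry is only safe if the failed call never reached the server, which seems likely for a connection failure but is not guaranteed for writes such as insert.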
Related issues I found, none of which cover this specific pattern: #88, #152, #195, #216, #263, #264.
Happy to contribute a fix or docs patch once the intended behavior is confirmed.