Milvus Lite daemon becomes unreachable after ~60s of idle on a long-lived MilvusClient

## Summary

With Milvus Lite 2.5.1, a long-lived `MilvusClient` stops working after roughly 60 seconds of idle time. The client object is never recreated or closed, and it runs in a single synchronous Python process, yet the next RPC after the idle gap raises:

```
MilvusException: (code=2, message=Fail connecting to server on unix:/tmp/tmpXXX_<db>.sock,
illegal connection params or server unavailable)
```

Shorter idle periods (≤30s) work fine. After the failure, the unix socket file still exists on disk but connections to it fail, which suggests that the `milvus-lite` daemon subprocess has exited while the Python side still holds its `MilvusClient`.
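To separate "socket file is gone" from "socket file exists but nothing is listening", a small stdlib-only probe can help. This is a diagnostic sketch, not part of pymilvus or milvus-lite; `probe_unix_socket` is a hypothetical helper name, and the path to pass in is whatever appears after `unix:` in the exception message.

```python
import os
import socket


def probe_unix_socket(path: str, timeout: float = 1.0) -> str:
    """Classify a unix socket path as 'missing', 'dead', or 'accepting'."""
    if not os.path.exists(path):
        return "missing"
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
    except OSError:
        # The file is on disk, but no process is accepting connections on it.
        return "dead"
    finally:
        sock.close()
    return "accepting"
```

In the failing case described above, the expectation would be `"dead"`: the file is present, but the connect attempt is refused or times out.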

This affects any long-lived Python process that talks to Milvus Lite infrequently: MCP servers, long-running web backends with low QPS, interactive Jupyter notebooks with multi-minute gaps between cells, scheduled jobs, and REPL-style tools.

## Minimal reproduction (no threads, no asyncio, no MCP)

```python
import os, time
from pymilvus import MilvusClient, DataType

db = "/tmp/lite_repro.db"
if os.path.exists(db):
    os.unlink(db)

client = MilvusClient(uri=db)

schema = client.create_schema(auto_id=False)
schema.add_field(field_name="id", datatype=DataType.VARCHAR, is_primary=True, max_length=64)
schema.add_field(field_name="vector", datatype=DataType.FLOAT_VECTOR, dim=4)
idx = client.prepare_index_params()
idx.add_index(field_name="vector", index_type="AUTOINDEX", metric_type="COSINE")
client.create_collection(collection_name="probe", schema=schema, index_params=idx)

client.insert("probe", [{"id": "a", "vector": [0.1, 0.2, 0.3, 0.4]}])
print("first insert OK")

time.sleep(60)   # <-- key

client.insert("probe", [{"id": "b", "vector": [0.5, 0.6, 0.7, 0.8]}])
# MilvusException: Fail connecting to server on unix:/tmp/tmp1ns3wqia_lite_repro.db.sock,
# illegal connection params or server unavailable
```

## Idle-duration matrix (same script, only `time.sleep(N)` changes)

| idle N (s) | second insert |
|------------|---------------|
| 5          | ✅ OK         |
| 15         | ✅ OK         |
| 30         | ✅ OK         |
| 60         | ❌ FAIL (`server unavailable`) |

The threshold sits somewhere between 30s and 60s in this environment. The behavior is the same with a fresh `./milvus.db` file, so it is not caused by file locking or stale state.
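To narrow the threshold further, the matrix can be automated. The harness below is a generic sketch: `run_idle_matrix` and its parameters are hypothetical names, and `make_client`, `first_op`, and `second_op` are callables you supply (for example, wrapping the `MilvusClient(...)` construction and the two `insert` calls from the reproduction script). It assumes each delay should get a fresh client, which mirrors rerunning the script once per `N`.

```python
import time
from typing import Callable, Iterable


def run_idle_matrix(
    make_client: Callable[[], object],
    first_op: Callable[[object], None],
    second_op: Callable[[object], None],
    delays: Iterable[float] = (5, 15, 30, 60),
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Run first_op, idle for each delay, then record whether second_op succeeds."""
    results = {}
    for delay in delays:
        client = make_client()  # fresh client per delay
        first_op(client)
        sleep(delay)  # the idle gap under test
        try:
            second_op(client)
            results[delay] = "OK"
        except Exception as exc:
            # Record the exception type so different failure modes stay visible.
            results[delay] = f"FAIL ({type(exc).__name__})"
    return results
```

Passing a finer set of delays, such as `(35, 40, 45, 50, 55)`, would bracket the cutoff more tightly than the four values in the table.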

## Environment

- OS: Ubuntu 22.04 (Linux 5.15.0-174-generic, x86_64)
- Python: 3.12.9
- pymilvus: 2.6.8
- milvus-lite: 2.5.1 (latest stable on PyPI as of 2026-04-16)

## What I'd like to understand / propose

1. Is this intentional? i.e. does `milvus-lite` deliberately shut down the daemon subprocess after an idle timeout to free resources?
2. If intentional, is there a supported way to keep the daemon alive, or a documented recommendation for long-lived clients (e.g. "periodically call `list_collections()` as a keepalive" or "reconstruct `MilvusClient` on `code=2` errors")?
3. If not intentional, this looks like a real regression for any long-lived low-QPS user of Milvus Lite. Would be great to have either a fix (client-side auto-reconnect on `server unavailable` / `closed channel`) or explicit documentation of the unsupported usage pattern.
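For discussion, here is roughly what the two workarounds mentioned above could look like on the caller's side. This is a sketch under several unverified assumptions: that the failure exposes a `code` attribute equal to 2 (matching `code=2` in the message), that constructing a new `MilvusClient` actually brings the daemon back, that retrying the failed call is safe because it never reached the server, and that the client tolerates being pinged from a background thread. `ReconnectingClient`, `is_unavailable`, and `start_keepalive` are hypothetical names, not pymilvus APIs.

```python
import threading
from typing import Any, Callable


def is_unavailable(exc: Exception) -> bool:
    # Assumption: the error carries `code == 2`, as in the reported message.
    return getattr(exc, "code", None) == 2


class ReconnectingClient:
    """Rebuild the wrapped client once when a call looks like a lost connection."""

    def __init__(self, factory: Callable[[], Any],
                 should_reconnect: Callable[[Exception], bool] = is_unavailable):
        self._factory = factory
        self._should_reconnect = should_reconnect
        self._client = factory()

    def call(self, method: str, *args: Any, **kwargs: Any) -> Any:
        try:
            return getattr(self._client, method)(*args, **kwargs)
        except Exception as exc:
            if not self._should_reconnect(exc):
                raise
            self._client = self._factory()  # rebuild, then retry exactly once
            return getattr(self._client, method)(*args, **kwargs)


def start_keepalive(ping: Callable[[], Any], interval_s: float = 20.0) -> threading.Event:
    """Call ping() every interval_s seconds on a daemon thread until the event is set."""
    stop = threading.Event()

    def loop() -> None:
        # Event.wait returns False on timeout, True once stop is set.
        while not stop.wait(interval_s):
            try:
                ping()
            except Exception:
                pass  # best effort; real calls will surface the error

    threading.Thread(target=loop, daemon=True).start()
    return stop


# Hypothetical usage with the reproduction script's database:
#   from pymilvus import MilvusClient
#   client = ReconnectingClient(lambda: MilvusClient(uri="/tmp/lite_repro.db"))
#   client.call("insert", "probe", [{"id": "b", "vector": [0.5, 0.6, 0.7, 0.8]}])
#   stop = start_keepalive(lambda: client.call("list_collections"))
```

The 20-second default interval is a guess chosen to stay under the 30s idle that still worked in the matrix. Neither workaround is a substitute for knowing the intended behavior.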

Related issues I found, none of which cover this specific pattern: #88, #152, #195, #216, #263, #264.

Happy to contribute a fix or docs patch once the intended behavior is confirmed.
