perf: ruflo memory store needs batch/pipe mode — CLI startup overhead makes bulk import impractical

## Summary

Each `ruflo memory store` invocation spawns a full Node.js process and takes ~5-10s end to end. For bulk operations (e.g., importing 76 documentation files into HNSW-indexed memory), that adds up to 10+ minutes for what should be a ~30-second operation.

**Request**: Add a batch import mode — either `ruflo memory import-dir` or `ruflo memory store --batch` reading from stdin/file.

## Environment

- **ruflo**: v3.5.48 (via `npx @claude-flow/cli@latest`)
- **agentic-flow**: v3.0.0-alpha.1
- **agentdb**: v1.3.9 (project-local), v3.0.0-alpha.10 (npx-bundled)
- **Node.js**: v22.x
- **Platform**: macOS Darwin 25.0.0 (Apple Silicon)

## Reproduction

```bash
# Single store works but is slow (~5-10s per call)
time ruflo memory store --namespace adr --key test --value "hello world"
# real    0m7.234s  ← 7 seconds for 11 bytes

# Attempting to import 76 files:
for FILE in docs/adr/*.md docs/ddd/*.md docs/specification/*.md; do
  KEY=$(basename "$FILE" .md)
  NS=$(basename "$(dirname "$FILE")")  # quote the inner substitution so paths with spaces survive
  ruflo memory store --namespace "$NS" --key "$KEY" --value "$(cat "$FILE")"
done
# Result: ETIMEDOUT errors, ~10+ minutes, most files fail
```

## Root Cause Analysis

Each `ruflo memory store` invocation:

1. **Node.js process startup** (~1s)
2. **Package resolution** — loads @claude-flow/cli, agentic-flow, agentdb (~2s)
3. **AgentDB runtime patch attempt** — searches `process.cwd()/node_modules/agentdb/dist/controllers/index.js`, fails with warning if wrong distribution found (~0.5s)
4. **SQLite database open** — opens/creates the memory.db file (~0.5s)
5. **HNSW embedding generation** — computes 384-dim vector for the value (~1-2s)
6. **Actual store operation** — SQLite INSERT (~10ms)
7. **Process exit** (~0.1s)

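**Steps 1-4 and 7 are pure overhead** (~4s by the estimates above), repeated for every single CLI call. The useful per-value work is the embedding (step 5) and the insert (step 6), and the insert itself takes only ~10ms.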
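To make the overhead concrete, the estimates in the list above can be summed. This is a rough sketch, not a measurement: it uses the midpoint (1.5s) of the 1-2s embedding estimate, and the object and function names are illustrative only.

```javascript
// Per-step estimates in seconds, taken from the list above.
// The embedding step uses the midpoint of its 1-2s range.
const steps = {
  nodeStartup: 1.0,
  packageResolution: 2.0,
  runtimePatch: 0.5,
  sqliteOpen: 0.5,
  embedding: 1.5,
  insert: 0.01,
  processExit: 0.1,
};

// Cost paid once per process vs. cost paid once per stored value.
const perProcess =
  steps.nodeStartup + steps.packageResolution + steps.runtimePatch +
  steps.sqliteOpen + steps.processExit;
const perValue = steps.embedding + steps.insert;

// One process per file: every file pays both costs.
function processPerFile(fileCount) {
  return fileCount * (perProcess + perValue);
}

// One process for the whole batch: setup is paid once.
function singleProcess(fileCount) {
  return perProcess + fileCount * perValue;
}

const files = 76;
console.log(`process per file: ${(processPerFile(files) / 60).toFixed(1)} min`);
console.log(`single process:   ${(singleProcess(files) / 60).toFixed(1)} min`);
```

Under these estimates, amortizing setup removes roughly 4s per file, and embedding becomes the dominant remaining cost, which is why batching the HNSW work matters as well.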

## Secondary Issue: cwd()-based agentdb Resolution

The `agentdb-runtime-patch.js` (line 87) searches for agentdb starting at `process.cwd()`:

```javascript
const possiblePaths = [
  join(process.cwd(), 'node_modules', 'agentdb'),  // ← finds project-local copy
  // ...
];
```

When the project has its own `agentdb` installation (e.g., as a transitive dependency), the patch finds that copy first. It is a **different distribution** (browser/WASM) that has no `dist/controllers/`, so the patch emits this misleading warning:

```
[AgentDB Patch] Controller index not found: /project/node_modules/agentdb/dist/controllers/index.js
```

The patch should prefer the **npx-bundled copy** (which has the correct Node.js distribution) over the project-local copy. Related: #80, #111, #132.

## Proposed Solutions

### Option A: `ruflo memory import-dir` (preferred)

A single-process command that imports all files from a directory tree:

```bash
ruflo memory import-dir --source docs/ --namespace-from-dir
# Maps: docs/adr/*.md → namespace "adr", docs/ddd/*.md → namespace "ddd"
# One process, one SQLite connection, batch HNSW indexing
```
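The `--namespace-from-dir` mapping could work like the shell loop in the Reproduction section: namespace from the parent directory, key from the file name. The sketch below is one possible reading of that rule. `entryFromPath` and `groupByNamespace` are hypothetical helpers, not ruflo APIs, the file names are examples, and POSIX-style `/` separators are assumed.

```javascript
// Derive { namespace, key } from a file path:
// namespace = parent directory name, key = file name without ".md".
// Hypothetical helper; assumes "/" as the path separator.
function entryFromPath(filePath) {
  const parts = filePath.split('/').filter(Boolean);
  if (parts.length < 2) {
    throw new Error(`expected <dir>/<file>, got: ${filePath}`);
  }
  const fileName = parts[parts.length - 1];
  const namespace = parts[parts.length - 2];
  const key = fileName.replace(/\.md$/, '');
  return { namespace, key };
}

// Group paths by namespace so each namespace can be stored in one pass.
function groupByNamespace(filePaths) {
  const groups = new Map();
  for (const filePath of filePaths) {
    const { namespace, key } = entryFromPath(filePath);
    if (!groups.has(namespace)) groups.set(namespace, []);
    groups.get(namespace).push({ key, filePath });
  }
  return groups;
}

// Example file names are illustrative only.
const groups = groupByNamespace([
  'docs/adr/ADR-001.md',
  'docs/adr/ADR-002.md',
  'docs/ddd/bounded-contexts.md',
]);
for (const [namespace, entries] of groups) {
  console.log(namespace, entries.map((e) => e.key));
}
```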

### Option B: `ruflo memory store --batch` (stdin JSONL)

Read multiple store operations from stdin:

```bash
cat <<'EOF' | ruflo memory store --batch
{"namespace":"adr","key":"ADR-001","value":"# ADR-001..."}
{"namespace":"adr","key":"ADR-002","value":"# ADR-002..."}
EOF
```
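On the receiving side, a batch mode would need to parse and validate each JSONL record before storing it. A minimal sketch follows, assuming the three string fields shown in the example records; `parseBatch` and its error shape are hypothetical, not an existing ruflo function.

```javascript
// Parse JSONL batch input into validated store operations.
// Required fields match the example records above; the function name and
// error format are assumptions made for illustration.
function parseBatch(text) {
  const operations = [];
  const errors = [];
  text.split('\n').forEach((line, index) => {
    if (line.trim() === '') return; // skip blank lines
    let record;
    try {
      record = JSON.parse(line);
    } catch (err) {
      errors.push({ line: index + 1, message: `invalid JSON: ${err.message}` });
      return;
    }
    const missing = ['namespace', 'key', 'value'].filter(
      (field) => typeof record?.[field] !== 'string'
    );
    if (missing.length > 0) {
      errors.push({
        line: index + 1,
        message: `missing or non-string: ${missing.join(', ')}`,
      });
      return;
    }
    operations.push({
      namespace: record.namespace,
      key: record.key,
      value: record.value,
    });
  });
  return { operations, errors };
}

const input = [
  '{"namespace":"adr","key":"ADR-001","value":"# ADR-001..."}',
  'not json',
  '{"namespace":"adr","key":"ADR-002"}',
].join('\n');
const result = parseBatch(input);
console.log(result.operations.length, result.errors.map((e) => e.line));
```

Collecting errors per line, rather than aborting on the first bad record, lets a large import report every failure in one run.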

### Option C: `ruflo memory store --file`

Read the value from a file instead of the `--value` argument. This also avoids shell-escaping issues with large markdown content containing backticks, quotes, etc.:

```bash
ruflo memory store --namespace adr --key ADR-001 --file docs/adr/ADR-001.md
```

### Option D: Long-running daemon mode

Keep a `ruflo memory` daemon process alive and send operations via IPC/socket:

```bash
ruflo memory daemon &
ruflo memory store --via-daemon --namespace adr --key ADR-001 --value "..."
```
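If the daemon speaks newline-delimited JSON over a socket (an assumption, since the issue does not fix a wire format), it has to reassemble requests from arbitrary chunk boundaries. The sketch below shows only that framing step; `createLineFramer` is a hypothetical helper and the `Map` is a stand-in for the daemon's open SQLite connection and HNSW index.

```javascript
// Accumulates stream chunks and emits complete newline-terminated messages.
// A socket delivers arbitrary chunk boundaries, so one request may arrive
// split across chunks, or several requests may arrive in a single chunk.
function createLineFramer(onLine) {
  let buffer = '';
  return function push(chunk) {
    buffer += chunk;
    let newlineIndex;
    while ((newlineIndex = buffer.indexOf('\n')) !== -1) {
      const line = buffer.slice(0, newlineIndex);
      buffer = buffer.slice(newlineIndex + 1);
      if (line.trim() !== '') onLine(line);
    }
  };
}

// Hypothetical in-memory stand-in for the daemon's long-lived store.
const store = new Map();
const push = createLineFramer((line) => {
  const { namespace, key, value } = JSON.parse(line);
  store.set(`${namespace}:${key}`, value);
});

// Simulate one request split across two chunks, followed by a second request.
push('{"namespace":"adr","key":"ADR-001",');
push('"value":"first"}\n{"namespace":"adr","key":"ADR-002","value":"second"}\n');
console.log([...store.keys()]);
```

In a real daemon, `push` would be called from the socket's data handler, so setup cost is paid once while each client call only pays for connecting and sending one line.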

### Fix for cwd() resolution

In `agentdb-runtime-patch.js`, prioritize the npx/CLI-bundled agentdb over project-local:

```javascript
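import { existsSync } from 'node:fs';
import { dirname, join } from 'node:path';
import { fileURLToPath } from 'node:url';

function findAgentDBPath() {
  const here = dirname(fileURLToPath(import.meta.url));
  const possiblePaths = [
    // Prefer the bundled copy that ships with ruflo/agentic-flow
    join(here, '..', '..', 'node_modules', 'agentdb'),
    join(here, '..', '..', '..', 'agentdb'),
    // Then fall back to project-local
    join(process.cwd(), 'node_modules', 'agentdb'),
    // ...
  ];
  // Assumed selection rule: return the first candidate that exists on disk
  return possiblePaths.find((candidate) => existsSync(candidate)) ?? null;
}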
```
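Ordering alone may not be enough, because the warning comes from picking a copy that lacks `dist/controllers/index.js`. A complementary sketch is to accept a candidate only if that file is present. `selectAgentDBRoot` and its return shape are hypothetical, the paths are made up for the demonstration, and the `exists` check is injected so the logic can be exercised without touching the file system (in real use it would be `fs.existsSync`).

```javascript
// Pick the first candidate agentdb root that actually contains the
// controller index the patch needs. Hypothetical helper, not the patch's
// current code. `exists` is a predicate such as fs.existsSync.
function selectAgentDBRoot(candidates, exists) {
  for (const root of candidates) {
    const controllerIndex = `${root}/dist/controllers/index.js`;
    if (exists(controllerIndex)) {
      return { root, controllerIndex };
    }
  }
  return null; // caller can warn once, listing every path that was tried
}

// Simulated layout: the project-local copy lacks dist/controllers/,
// the bundled copy has it. Both paths are illustrative.
const present = new Set([
  '/bundled/node_modules/agentdb/dist/controllers/index.js',
]);
const selected = selectAgentDBRoot(
  ['/project/node_modules/agentdb', '/bundled/node_modules/agentdb'],
  (p) => present.has(p)
);
console.log(selected.root);
```

With this check, a project-local browser/WASM copy is skipped silently even when it appears earlier in the candidate list.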

## Impact

This blocks the `/prd2build --import=from-files` workflow in Turbo-Flow, which needs to index 76+ documentation files into HNSW memory for semantic search. Currently users must either:
- Wait 10+ minutes (with most files timing out)
- Skip memory indexing entirely and use file-based access only

## Workaround

We wrote a batch import script (`scripts/ruflo-batch-import.mjs`) that reuses a single ruflo binary path, but it still spawns a process per file. The real fix needs to happen inside ruflo to keep a single process/connection alive across multiple store operations.

## Related Issues

- #80 — agentdb missing `dist/controllers/index.js` (same root cause for the warning)
- #111 — runtime patch uses incorrect controller path
- #132 — controllers at wrong path in dist (closed)
- #72 — broken package.json exports in agentdb
