Releases: VecGrep/vecgrep
v1.7.0 - BYOK cloud embedding providers
What's new in v1.7.0
BYOK cloud embedding providers
Bring your own API key to use OpenAI, Voyage AI, or Google Gemini embeddings instead of (or alongside) the default local model.
Pass `provider=` to `index_codebase`:

```python
index_codebase("/path/to/myproject", provider="openai")
```
| Provider | Model | Dims | API key env var | Install extra |
|---|---|---|---|---|
| `local` | `all-MiniLM-L6-v2-code-search-512` | 384 | — | (default) |
| `openai` | `text-embedding-3-small` | 1536 | `VECGREP_OPENAI_KEY` | `vecgrep[openai]` |
| `voyage` | `voyage-code-3` | 1024 | `VECGREP_VOYAGE_KEY` | `vecgrep[voyage]` |
| `gemini` | `gemini-embedding-exp-03-07` | 3072 | `VECGREP_GEMINI_KEY` | `vecgrep[gemini]` |
Install a cloud provider extra:

```shell
pip install 'vecgrep[openai]'   # or voyage / gemini / cloud (all three)
```

Strategy-pattern embedder refactor
`EmbeddingProvider` ABC — all providers share the same `embed(texts) -> np.ndarray` interface. Adding a new provider requires only subclassing `EmbeddingProvider`.
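A minimal sketch of what such a strategy interface could look like. Only `EmbeddingProvider` and `embed(texts) -> np.ndarray` are named in the release notes; the subclass below is an illustrative stand-in, not VecGrep's actual code:

```python
from abc import ABC, abstractmethod

import numpy as np


class EmbeddingProvider(ABC):
    """Strategy interface: every provider maps texts to a 2-D float array."""

    dims: int  # embedding dimensionality, persisted in the index metadata

    @abstractmethod
    def embed(self, texts: list[str]) -> np.ndarray:
        """Return an array of shape (len(texts), self.dims)."""


class FakeProvider(EmbeddingProvider):
    """Illustrative stand-in for a real provider (e.g. an OpenAI-backed one)."""

    dims = 4

    def embed(self, texts: list[str]) -> np.ndarray:
        # Deterministic toy "embedding": simple character statistics per text.
        rows = [[len(t), sum(map(ord, t)) % 97, t.count(" "), 1.0] for t in texts]
        return np.asarray(rows, dtype=np.float32)
```

The indexer can then hold any `EmbeddingProvider` and call `embed` without knowing whether vectors come from a local ONNX model or a cloud API.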
Dynamic vector dimensions
VectorStore now persists the provider's embedding dimensionality in the per-project meta table. Existing 384-dim indexes open without migration.
Provider lock — switching providers requires force=True, which drops and recreates the chunks table to avoid silent dimension mismatches.
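The lock logic might be sketched like this (function name, metadata shape, and error message are assumptions for illustration):

```python
def ensure_provider(meta: dict, provider: str, dims: int, force: bool = False) -> bool:
    """Return True if the chunks table must be dropped and rebuilt.

    `meta` is the persisted per-project metadata, e.g.
    {"provider": "local", "dims": 384}; an empty dict means a fresh index.
    """
    if not meta:
        return False  # fresh index: nothing to reconcile
    if meta["provider"] == provider and meta["dims"] == dims:
        return False  # same provider and dimensionality: vectors are compatible
    if not force:
        raise ValueError(
            f"Index was built with provider={meta['provider']!r} "
            f"({meta['dims']} dims); pass force=True to rebuild with "
            f"{provider!r} ({dims} dims)."
        )
    return True  # force=True: caller drops and recreates the chunks table
```

Failing loudly here is the point of the lock: a 384-dim query vector searched against 1536-dim stored vectors would otherwise degrade silently.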
Live-sync guard
watch=True is blocked for cloud providers — live file-change sync would incur unbounded API costs.
get_index_status now reports provider metadata
```text
Index status for: /path/to/myproject
...
Provider: openai
Model: text-embedding-3-small
Dimensions: 1536
```
New test suite
tests/test_providers.py — provider registry, LocalProvider, and all three cloud providers tested with mocked API responses.
Full changelog: https://github.com/VecGrep/vecgrep/blob/main/CHANGELOG.md
v1.6.0 - Merkle Tree Change Detection & Auto-Reindex
What's New
Merkle Tree Change Detection
Incremental re-indexing now uses a Merkle tree to detect file changes. Only files that actually changed since the last index are re-embedded, reducing unnecessary work on large codebases.
Auto-Reindex on Startup
When watch=True, VecGrep detects any files that changed while the watcher was offline and re-indexes them automatically on the next startup — no manual index_codebase call needed.
stop_watching MCP Tool
New tool to stop the background file watcher for a project without restarting the server.
Watch State Persistence
Watched project paths are now persisted to disk so the watcher can be restored after a server restart.
Bug Fixes
- Fixed `create_index` positional argument misuse in `VectorStore` that could cause index builds to fail silently
- Fixed a race condition in the Merkle tree watcher that could cause duplicate re-index events on rapid file saves
Upgrade
```shell
pip install --upgrade vecgrep
```

No breaking changes. Existing indexes are compatible with 1.6.0.
v1.5.0
What's new in 1.5.0
Breaking change
The SQLite + numpy vector store has been replaced with LanceDB. Existing indexes at ~/.vecgrep/ need to be rebuilt — just run index_codebase on your project again.
5x faster MCP server startup
| Metric | 1.0.0 | 1.5.0 |
|---|---|---|
| MCP server startup | ~6.6s | ~1.25s |
| Model load (first embed) | ~2–3s | ~100ms |
| Change detection | O(chunks) SHA-256 | O(files) mtime+size |
The default embedding backend switches from sentence-transformers (PyTorch) to fastembed (ONNX Runtime) — no PyTorch required for the default path.
New: user-selectable backend and model
```shell
# Use PyTorch backend with any HuggingFace model
VECGREP_BACKEND=torch VECGREP_MODEL=sentence-transformers/all-MiniLM-L6-v2 vecgrep

# Use a custom ONNX model
VECGREP_MODEL=my-org/my-onnx-model vecgrep
```

| Variable | Default | Description |
|---|---|---|
| `VECGREP_BACKEND` | `onnx` | `onnx` (fastembed) or `torch` (sentence-transformers) |
| `VECGREP_MODEL` | `isuruwijesiri/all-MiniLM-L6-v2-code-search-512` | HuggingFace model ID |
New: fine-tuned code search model
Default model switched to isuruwijesiri/all-MiniLM-L6-v2-code-search-512 — fine-tuned specifically for semantic code search.
Other improvements
- IVF-PQ ANN index for sub-linear search on large codebases
- `file_stats` table — O(files) incremental re-indexing
- `stop_watching` MCP tool — stop watching a path explicitly
- Watch state persistence — watchers survive MCP server restarts
- Auto device detection — Metal / CUDA / CPU selected automatically
- `.githooks/pre-commit` — local lint check for contributors (`git config core.hooksPath .githooks`)
Full changelog
See CHANGELOG.md for the complete list of changes.
Installation
```shell
uv tool install --python 3.12 vecgrep
claude mcp add --scope user vecgrep -- vecgrep
```

v1.0.0 - First Stable Release
VecGrep v1.0.0
Cursor-style semantic code search as an MCP plugin for Claude Code. Instead of grepping 50 files and sending 30,000 tokens to Claude, VecGrep returns the top 8 semantically relevant code chunks (~1,600 tokens) — a ~95% token reduction for codebase queries.
Install
```shell
# Run directly (no install needed)
uvx vecgrep

# Or install permanently
pip install vecgrep
uv tool install vecgrep
```

Claude Code integration
```json
{
  "mcpServers": {
    "vecgrep": {
      "command": "uvx",
      "args": ["vecgrep"]
    }
  }
}
```

```shell
claude mcp add vecgrep -- uvx vecgrep
```

What's in v1.0.0
Performance
- In-memory embedding cache — search no longer re-reads all embeddings from disk on every query
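The caching idea can be sketched as a lazy-loaded matrix that survives across queries; an illustrative sketch, not VecGrep's actual class:

```python
from typing import Callable, Optional

import numpy as np


class EmbeddingCache:
    """Load embeddings from disk once, then serve every query from memory."""

    def __init__(self, load_fn: Callable[[], np.ndarray]):
        self._load_fn = load_fn          # e.g. reads a .npy file from disk
        self._matrix: Optional[np.ndarray] = None

    def matrix(self) -> np.ndarray:
        if self._matrix is None:         # only the first query pays the disk read
            self._matrix = self._load_fn()
        return self._matrix

    def invalidate(self) -> None:
        """Drop the cached copy, e.g. after a re-index run."""
        self._matrix = None
```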
Data integrity
- Atomic file updates via `replace_file_chunks()` — DELETE + INSERT in a single transaction, no partial writes on crash
- Orphan cleanup — chunks for deleted files are removed automatically on the next index run
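Since this release used the SQLite + numpy store, the delete-then-insert pattern might look like the following (table and column names are assumptions for illustration):

```python
import sqlite3


def replace_file_chunks(conn: sqlite3.Connection, path: str,
                        chunks: list[tuple[int, str]]) -> None:
    """Swap all chunks for one file atomically.

    The `with conn:` block opens a transaction that commits on success and
    rolls back on any exception, so either both the DELETE and the INSERTs
    land, or neither does.
    """
    with conn:
        conn.execute("DELETE FROM chunks WHERE path = ?", (path,))
        conn.executemany(
            "INSERT INTO chunks (path, idx, text) VALUES (?, ?, ?)",
            [(path, i, text) for i, text in chunks],
        )
```

If the process crashes between the DELETE and the last INSERT, the rollback leaves the previous chunks intact rather than a half-written file.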
Correctness fixes
- Negative `top_k` values no longer crash (clamped to `max(1, min(top_k, 20))`)
- Query validation — empty queries and queries over 500 characters return a clean error string
- `VectorStore` context-manager support guarantees connection closure on exceptions
- `followlinks=False` prevents symlink loops during directory walk
- Per-path threading lock prevents concurrent indexing corruption
Tests
- 52 tests across store, chunker, embedder, and server integration
- CI with ruff, pytest + Codecov coverage, and pyright
See CHANGELOG.md for full details.