The Model Context Protocol (MCP) is an open protocol that enables seamless integration between LLM applications and external data sources and tools. Whether you're building an AI-powered IDE, enhancing a chat interface, or creating custom AI workflows, MCP provides a standardized way to connect LLMs with the context they need.
This repository is an example of how to create an MCP server for Qdrant, a vector search engine.
An official Model Context Protocol server for keeping and retrieving memories in the Qdrant vector search engine. It acts as a semantic memory layer with advanced RAG (Retrieval-Augmented Generation) capabilities including:
- Intelligent Document Chunking - Automatically split large documents using semantic, sentence, or fixed strategies
- Bulk Document Ingestion - CLI tool for ingesting entire directories of documents
- Set-Based Organization - Organize documents into knowledge bases with semantic filtering
- Multiple Embedding Providers - Support for FastEmbed, Model2Vec, OpenAI-compatible, and Gemini
- Hybrid Search - Combine dense and sparse vectors for improved retrieval accuracy
- `qdrant-store` - Store some information in the Qdrant database
  - Input:
    - `information` (string): Information to store
    - `metadata` (JSON): Optional metadata to store
    - `collection_name` (string, optional): Name of the collection to store the information in
  - Returns: Confirmation message
- `qdrant-find` - Retrieve relevant information from the Qdrant database using semantic search
  - Input:
    - `query` (string): Query to use for searching
    - `collection_name` (string, optional): Name of the collection to search in
  - Returns: Information stored in the Qdrant database as separate messages
- `qdrant-hybrid-find` - Retrieve information using hybrid search (combining dense and sparse vectors)
  - Input:
    - `query` (string): Query to use for searching
    - `collection_name` (string, optional): Name of the collection to search in
    - `fusion_method` (string): Fusion method - 'rrf' or 'dbsf' (default: 'rrf')
    - `dense_limit` (int): Limit for dense vector prefetch (default: 10)
    - `sparse_limit` (int): Limit for sparse vector prefetch (default: 10)
    - `final_limit` (int): Final limit after fusion (default: 10)
  - Returns: Hybrid search results
- `qdrant-get-point` - Retrieve a specific point by ID from a collection
  - Input:
    - `point_id` (string): The ID of the point to retrieve
    - `collection_name` (string, optional): Name of the collection
  - Returns: Point data with content and metadata
- `qdrant-delete-point` - Delete a specific point by ID from a collection
  - Input:
    - `point_id` (string): The ID of the point to delete
    - `collection_name` (string, optional): Name of the collection
  - Returns: Confirmation message
- `qdrant-update-point-payload` - Update the metadata/payload for a specific point
  - Input:
    - `point_id` (string): The ID of the point to update
    - `metadata` (JSON): The new metadata to set
    - `collection_name` (string, optional): Name of the collection
  - Returns: Confirmation message
- `qdrant-list-points` - List points in a collection with pagination
  - Input:
    - `collection_name` (string, optional): Name of the collection
    - `limit` (int): Maximum number of points to return (default: 10)
    - `offset` (int): Number of points to skip (default: 0)
  - Returns: List of points with IDs, content, and metadata
- `qdrant-get-collections` - List all available collections in Qdrant
  - Returns: List of collection names
- `qdrant-get-collection-details` - Get detailed information about a specific collection
  - Input:
    - `collection_name` (string): Name of the collection
  - Returns: Collection details including status, vector count, and configuration
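Under the hood, an MCP client invokes these tools with JSON-RPC 2.0 `tools/call` requests, as defined by the MCP specification. A sketch of what a `qdrant-store` invocation looks like on the wire (the argument values here are illustrative):

```python
import json

# Illustrative JSON-RPC 2.0 request an MCP client would send to invoke
# the qdrant-store tool; the argument values are made up.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "qdrant-store",
        "arguments": {
            "information": "Prefer httpx over requests for async HTTP calls",
            "metadata": {"topic": "python"},
            "collection_name": "my-collection",  # optional
        },
    },
}

print(json.dumps(request, indent=2))
```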
The server is configured using environment variables:
| Name | Description | Default Value |
|---|---|---|
| `QDRANT_URL` | URL of the Qdrant server | None |
| `QDRANT_API_KEY` | API key for the Qdrant server | None |
| `COLLECTION_NAME` | Name of the default collection to use | None |
| `COLLECTION_NAMES` | List of collection names for multiple collections support | None |
| `QDRANT_LOCAL_PATH` | Path to the local Qdrant database (alternative to `QDRANT_URL`) | None |
| `QDRANT_READ_ONLY` | Enable read-only mode (disables write operations) | false |
| `EMBEDDING_PROVIDER` | Embedding provider: "fastembed", "model2vec", "oai_compat", or "gemini" | fastembed |
| `EMBEDDING_MODEL` | Name of the embedding model to use | sentence-transformers/all-MiniLM-L6-v2 |
| `USE_UNNAMED_VECTORS` | Use Qdrant's unnamed vector field instead of named vectors | false |
| `SPARSE_EMBEDDING_MODEL` | Sparse embedding model for hybrid search | None |
| `OAI_COMPAT_ENDPOINT` | OpenAI-compatible API endpoint URL | https://api.openai.com/v1 |
| `OAI_COMPAT_API_KEY` | API key for OpenAI-compatible endpoint | None |
| `OAI_COMPAT_VEC_SIZE` | Vector size override for OpenAI-compatible embeddings | None (auto-detected) |
| `GEMINI_API_KEY` | API key for Google Gemini embeddings | None |
| `ENABLE_CHUNKING` | Enable automatic document chunking for RAG | false |
| `MAX_CHUNK_SIZE` | Maximum chunk size in tokens/characters | 512 |
| `CHUNK_OVERLAP` | Overlap between consecutive chunks | 50 |
| `CHUNK_STRATEGY` | Chunking strategy: "semantic", "sentence", or "fixed" | semantic |
| `QDRANT_ENABLE_SEMANTIC_SET_MATCHING` | Enable set-based document filtering | false |
| `QDRANT_SETS_CONFIG` | Path to document sets configuration file | .qdrant_sets.json |
| `TOOL_STORE_DESCRIPTION` | Custom description for the store tool | See default in settings.py |
| `TOOL_FIND_DESCRIPTION` | Custom description for the find tool | See default in settings.py |
| `TOOL_HYBRID_FIND_DESCRIPTION` | Custom description for the hybrid find tool | See default in settings.py |
> [!NOTE]
> You cannot provide both `QDRANT_URL` and `QDRANT_LOCAL_PATH` at the same time.

> [!IMPORTANT]
> Command-line arguments are no longer supported. Please use environment variables for all configuration.
Since mcp-server-qdrant is based on FastMCP, it also supports all the FastMCP environment variables. The most
important ones are listed below:
| Environment Variable | Description | Default Value |
|---|---|---|
| `FASTMCP_DEBUG` | Enable debug mode | false |
| `FASTMCP_LOG_LEVEL` | Set logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) | INFO |
| `FASTMCP_HOST` | Host address to bind the server to | 127.0.0.1 |
| `FASTMCP_PORT` | Port to run the server on | 8000 |
| `FASTMCP_WARN_ON_DUPLICATE_RESOURCES` | Show warnings for duplicate resources | true |
| `FASTMCP_WARN_ON_DUPLICATE_TOOLS` | Show warnings for duplicate tools | true |
| `FASTMCP_WARN_ON_DUPLICATE_PROMPTS` | Show warnings for duplicate prompts | true |
| `FASTMCP_DEPENDENCIES` | List of dependencies to install in the server environment | [] |
When using `uvx`, no specific installation is needed to run mcp-server-qdrant directly:

```shell
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
EMBEDDING_MODEL="sentence-transformers/all-MiniLM-L6-v2" \
uvx mcp-server-qdrant
```

The server supports different transport protocols that can be specified using the `--transport` flag:
```shell
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
uvx mcp-server-qdrant --transport sse
```

Supported transport protocols:

- `stdio` (default): Standard input/output transport, can only be used by local MCP clients
- `sse`: Server-Sent Events transport, well suited for remote clients
- `streamable-http`: Streamable HTTP transport, well suited for remote clients and more recent than SSE
When the SSE transport is used, the server listens on the specified port and waits for incoming connections. The default port is 8000, but it can be changed using the `FASTMCP_PORT` environment variable.
```shell
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
FASTMCP_PORT=1234 \
uvx mcp-server-qdrant --transport sse
```

A Dockerfile is available for building and running the MCP server:
```shell
# Build the container
docker build -t mcp-server-qdrant .

# Run the container
docker run -p 8000:8000 \
  -e FASTMCP_HOST="0.0.0.0" \
  -e QDRANT_URL="http://your-qdrant-server:6333" \
  -e QDRANT_API_KEY="your-api-key" \
  -e COLLECTION_NAME="your-collection" \
  mcp-server-qdrant
```

> [!TIP]
> We set `FASTMCP_HOST="0.0.0.0"` to make the server listen on all network interfaces. This is necessary when running the server in a Docker container.
To install Qdrant MCP Server for Claude Desktop automatically via Smithery:
```shell
npx @smithery/cli install mcp-server-qdrant --client claude
```

To use this server with the Claude Desktop app, add the following configuration to the "mcpServers" section of your `claude_desktop_config.json`:
```json
{
  "qdrant": {
    "command": "uvx",
    "args": ["mcp-server-qdrant"],
    "env": {
      "QDRANT_URL": "https://xyz-example.eu-central.aws.cloud.qdrant.io:6333",
      "QDRANT_API_KEY": "your_api_key",
      "COLLECTION_NAME": "your-collection-name",
      "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2"
    }
  }
}
```

For local Qdrant mode:
```json
{
  "qdrant": {
    "command": "uvx",
    "args": ["mcp-server-qdrant"],
    "env": {
      "QDRANT_LOCAL_PATH": "/path/to/qdrant/database",
      "COLLECTION_NAME": "your-collection-name",
      "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2"
    }
  }
}
```

This MCP server will automatically create a collection with the specified name if it doesn't exist.
By default, the server will use the `sentence-transformers/all-MiniLM-L6-v2` embedding model to encode memories.
The server supports multiple embedding providers:
- FastEmbed (default) - Local embedding models via FastEmbed
- Model2Vec - Fast, lightweight static embeddings
- OpenAI-compatible - Any OpenAI-compatible API endpoint
- Gemini - Google's Gemini embedding models
To use Model2Vec embeddings:

```shell
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
EMBEDDING_PROVIDER="model2vec" \
EMBEDDING_MODEL="minishlab/potion-base-8M" \
uvx mcp-server-qdrant
```

Popular Model2Vec models:

- `minishlab/potion-base-8M` - Efficient 8M parameter model
- `minishlab/potion-base-4M` - Compact 4M parameter model
- `minishlab/M2V_base_output` - Base output model
To use OpenAI or a compatible API:
```shell
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
EMBEDDING_PROVIDER="oai_compat" \
EMBEDDING_MODEL="text-embedding-3-small" \
OAI_COMPAT_API_KEY="your-api-key" \
uvx mcp-server-qdrant
```

To use Google's Gemini embedding models:
```shell
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
EMBEDDING_PROVIDER="gemini" \
EMBEDDING_MODEL="text-embedding-004" \
GEMINI_API_KEY="your-gemini-api-key" \
uvx mcp-server-qdrant
```

The Gemini provider supports `text-embedding-004` (768 dimensions) and other Gemini embedding models.
Qdrant supports unnamed vectors as a simpler alternative to named vectors. To enable:
```shell
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
USE_UNNAMED_VECTORS="true" \
uvx mcp-server-qdrant
```

You can configure the server to work with multiple collections by setting `COLLECTION_NAMES`:
```shell
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAMES='["collection1", "collection2", "collection3"]' \
uvx mcp-server-qdrant
```

When using multiple collections, the `collection_name` parameter becomes available in all tools.
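Note that `COLLECTION_NAMES` is a JSON array encoded as a string, which is why the shell example quotes the whole value. A quick sketch of how such a value decodes (the server's actual parsing may differ in detail):

```python
import json
import os

# Simulate how a JSON-array environment variable is decoded.
os.environ["COLLECTION_NAMES"] = '["collection1", "collection2", "collection3"]'

names = json.loads(os.environ["COLLECTION_NAMES"])
print(names)  # ['collection1', 'collection2', 'collection3']
```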
```json
{
  "qdrant": {
    "command": "uvx",
    "args": ["mcp-server-qdrant"],
    "env": {
      "QDRANT_URL": "https://xyz-example.eu-central.aws.cloud.qdrant.io:6333",
      "QDRANT_API_KEY": "your_api_key",
      "COLLECTION_NAMES": "[\"personal\", \"work\", \"research\"]",
      "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2"
    }
  }
}
```

The server supports hybrid search combining dense and sparse vectors for improved accuracy. This requires configuring sparse embeddings in your Qdrant collection. The `qdrant-hybrid-find` tool supports two fusion methods:
- RRF (Reciprocal Rank Fusion) - Default method
- DBSF (Distribution-Based Score Fusion) - Alternative fusion strategy
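Reciprocal Rank Fusion is simple enough to sketch: each document's fused score is the sum of 1/(k + rank) over the result lists it appears in, with k = 60 as the common constant. This is the textbook formula, not necessarily the server's exact implementation:

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked result lists; each list holds doc IDs ordered best-first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

dense = ["a", "b", "c"]   # dense-vector results, best first
sparse = ["b", "d", "a"]  # sparse-vector results, best first
print(rrf_fuse([dense, sparse]))  # ['b', 'a', 'd', 'c']
```

Documents that appear near the top of both lists ("b" and "a" here) outrank documents that score well in only one, which is why fusion tends to be more robust than either retriever alone.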
A docker-compose.yml file is provided for easy development:
```shell
make up     # Start containers
make logs   # View logs
make down   # Stop containers
make clean  # Remove containers and volumes
```

The compose file connects to an external `qdrantnet` network. Make sure you have a Qdrant instance running on this network.
The provided Makefile includes convenient shortcuts:
- `make up` - Launch containers in detached mode
- `make down` - Stop and remove containers
- `make logs` - Stream logs from the mcp-server-qdrant service
- `make ps` - Display active container status
- `make rebuild` - Rebuild without cache
- `make clean` - Terminate containers and remove volumes
Automatically split large documents into optimal chunks for better retrieval performance.
```shell
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="large-docs" \
ENABLE_CHUNKING=true \
MAX_CHUNK_SIZE=512 \
CHUNK_OVERLAP=50 \
CHUNK_STRATEGY=semantic \
uvx mcp-server-qdrant
```
- **Semantic Chunking** (recommended)
  - Splits at natural boundaries (paragraphs, sentences)
  - Preserves context and meaning
  - Best for general documents
- **Sentence Chunking**
  - Splits only at sentence boundaries
  - Good for preserving structural integrity
  - Ideal for code and technical documents
- **Fixed Chunking**
  - Splits at fixed token/character boundaries
  - Predictable chunk sizes
  - Useful for consistent processing
| Use Case | Size | Overlap | Strategy |
|---|---|---|---|
| General documents | 512 | 50 | semantic |
| Code snippets | 300 | 30 | sentence |
| Long articles | 1024 | 100 | semantic |
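To make concrete what the size and overlap settings control, here is a minimal sketch of the fixed strategy; the server's actual chunkers, especially the semantic one, are more sophisticated:

```python
def fixed_chunks(text, max_chunk_size=512, chunk_overlap=50):
    """Split text into fixed-size chunks; consecutive chunks share an overlap."""
    step = max_chunk_size - chunk_overlap  # how far the window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chunk_size])
        if start + max_chunk_size >= len(text):
            break  # last window already covers the end of the text
    return chunks

sizes = [len(c) for c in fixed_chunks("x" * 1000)]
print(sizes)  # [512, 512, 76]
```

Each chunk ends with the same 50 characters the next chunk begins with, so a sentence falling on a boundary is still retrievable from at least one chunk.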
Bulk ingest documents from directories with the `qdrant-ingest` command.
```shell
# Ingest all documents from a directory
qdrant-ingest ingest /path/to/docs \
  --collection my-knowledge-base \
  --knowledge-base "Company Docs" \
  --enable-chunking

# Ingest with filtering
qdrant-ingest ingest /path/to/code \
  --collection codebase \
  --include "\.py$" \
  --exclude "test_.*" \
  --enable-chunking \
  --chunk-strategy sentence
```

The CLI automatically processes 25+ file types:
- Documents: .txt, .md, .markdown
- Code: .py, .js, .ts, .java, .go, .rs, .c, .cpp, .rb, .php
- Config: .json, .yaml, .yml, .toml, .xml, .ini
- Web: .html, .css, .scss
- Data: .csv, .sql
```shell
# Ingest documents
qdrant-ingest ingest <path> [options]

# List all collections
qdrant-ingest list
```

Options:

- `--url` - Qdrant server URL
- `--api-key` - Qdrant API key
- `--collection` - Collection name
- `--embedding-model` - Embedding model to use
- `--knowledge-base` - Knowledge base name (added to metadata)
- `--doc-type` - Document type (added to metadata)
- `--include` - Regex pattern for files to include
- `--exclude` - Regex pattern for files to exclude
- `--enable-chunking` - Enable document chunking
- `--chunk-strategy` - Chunking strategy (semantic/sentence/fixed)
- `--max-chunk-size` - Maximum chunk size
- `--chunk-overlap` - Chunk overlap
Organize documents into logical groups with semantic matching.
Create a `.qdrant_sets.json` file:

```json
{
  "sets": [
    {
      "slug": "backend_services",
      "description": "Backend microservices and API documentation",
      "aliases": ["backend", "api", "services"]
    },
    {
      "slug": "frontend_code",
      "description": "Frontend React components and UI code",
      "aliases": ["frontend", "ui", "react"]
    }
  ]
}
```

Then enable set matching:

```shell
QDRANT_ENABLE_SEMANTIC_SET_MATCHING=true \
QDRANT_SETS_CONFIG=.qdrant_sets.json \
uvx mcp-server-qdrant
```

The system will automatically match natural language queries to the appropriate document sets based on semantic similarity.
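To illustrate the routing idea, here is a toy matcher that picks a set by alias overlap. The real server matches on semantic similarity of the set descriptions, so treat this purely as a sketch:

```python
# Sets mirroring the .qdrant_sets.json example above.
SETS = [
    {"slug": "backend_services", "aliases": ["backend", "api", "services"]},
    {"slug": "frontend_code", "aliases": ["frontend", "ui", "react"]},
]

def match_set(query):
    """Route a query to a document set by alias overlap (sketch only;
    the server uses embedding-based semantic similarity instead)."""
    words = set(query.lower().split())
    best, best_hits = None, 0
    for s in SETS:
        hits = len(words & set(s["aliases"]))
        if hits > best_hits:
            best, best_hits = s["slug"], hits
    return best

print(match_set("search the backend api docs"))  # backend_services
```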
This MCP server can be used with any MCP-compatible client. For example, you can use it with Cursor and VS Code, which provide built-in support for the Model Context Protocol.
You can configure this MCP server to work as a code search tool for Cursor or Windsurf by customizing the tool descriptions:
```shell
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="code-snippets" \
TOOL_STORE_DESCRIPTION="Store reusable code snippets for later retrieval. \
The 'information' parameter should contain a natural language description of what the code does, \
while the actual code should be included in the 'metadata' parameter as a 'code' property. \
The value of 'metadata' is a Python dictionary with strings as keys. \
Use this whenever you generate some code snippet." \
TOOL_FIND_DESCRIPTION="Search for relevant code snippets based on natural language descriptions. \
The 'query' parameter should describe what you're looking for, \
and the tool will return the most relevant code snippets. \
Use this when you need to find existing code snippets for reuse or reference." \
uvx mcp-server-qdrant --transport sse  # Enable SSE transport
```

In Cursor/Windsurf, you can then configure the MCP server in your settings by pointing to this running server using the SSE transport protocol. A description of how to add an MCP server to Cursor can be found in the Cursor documentation. If you are running Cursor/Windsurf locally, you can use the following URL:
http://localhost:8000/sse
> [!TIP]
> We suggest SSE transport as the preferred way to connect Cursor/Windsurf to the MCP server, as it supports remote connections. That makes it easy to share the server with your team or use it in a cloud environment.
This configuration transforms the Qdrant MCP server into a specialized code search tool that can:
- Store code snippets, documentation, and implementation details
- Retrieve relevant code examples based on semantic search
- Help developers find specific implementations or usage patterns
You can populate the database by storing natural language descriptions of code snippets (in the `information` parameter) along with the actual code (in the `metadata.code` property), and then search for them using natural language queries that describe what you're looking for.
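Concretely, the arguments the agent passes to `qdrant-store` in this setup might look like the following (all values are illustrative):

```python
# Illustrative qdrant-store arguments for the code-search setup: a natural
# language description in 'information', the code itself in metadata['code'].
arguments = {
    "information": "Function that retries an HTTP request with exponential backoff",
    "metadata": {
        "code": (
            "def fetch_with_retry(url, attempts=3):\n"
            "    for i in range(attempts):\n"
            "        ...\n"
        ),
        "language": "python",  # extra metadata keys are up to you
    },
}

# A later qdrant-find query then describes the desired behavior in plain
# language, and semantic search matches it against the stored descriptions.
query = "retry a request with exponential backoff"
print(arguments["information"])
```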
> [!NOTE]
> The tool descriptions provided above are examples and may need to be customized for your specific use case. Consider adjusting the descriptions to better match your team's workflow and the specific types of code snippets you want to store and retrieve.
If you have successfully installed mcp-server-qdrant but still can't get it to work with Cursor, consider creating Cursor rules so the MCP tools are always used when the agent produces a new code snippet. You can restrict the rules to certain file types, to avoid using the MCP server for documentation or other types of content.
You can enhance Claude Code's capabilities by connecting it to this MCP server, enabling semantic search over your existing codebase.
1. Add the MCP server to Claude Code:

   ```shell
   # Add mcp-server-qdrant configured for code search
   claude mcp add code-search \
     -e QDRANT_URL="http://localhost:6333" \
     -e COLLECTION_NAME="code-repository" \
     -e EMBEDDING_MODEL="sentence-transformers/all-MiniLM-L6-v2" \
     -e TOOL_STORE_DESCRIPTION="Store code snippets with descriptions. The 'information' parameter should contain a natural language description of what the code does, while the actual code should be included in the 'metadata' parameter as a 'code' property." \
     -e TOOL_FIND_DESCRIPTION="Search for relevant code snippets using natural language. The 'query' parameter should describe the functionality you're looking for." \
     -- uvx mcp-server-qdrant
   ```

2. Verify the server was added:

   ```shell
   claude mcp list
   ```
The tool descriptions specified in `TOOL_STORE_DESCRIPTION` and `TOOL_FIND_DESCRIPTION` guide Claude Code on how to use the MCP server. The ones provided above are examples and may need to be customized for your specific use case. However, Claude Code should already be able to:
- Use the `qdrant-store` tool to store code snippets with descriptions.
- Use the `qdrant-find` tool to search for relevant code snippets using natural language.
The MCP server can be run in development mode using the `fastmcp dev` command. This will start the server and open the MCP inspector in your browser.
```shell
COLLECTION_NAME=mcp-dev fastmcp dev src/mcp_server_qdrant/server.py
```
Add the following JSON block to your User Settings (JSON) file in VS Code. You can do this by pressing Ctrl + Shift + P and typing Preferences: Open User Settings (JSON).
```json
{
  "mcp": {
    "inputs": [
      {
        "type": "promptString",
        "id": "qdrantUrl",
        "description": "Qdrant URL"
      },
      {
        "type": "promptString",
        "id": "qdrantApiKey",
        "description": "Qdrant API Key",
        "password": true
      },
      {
        "type": "promptString",
        "id": "collectionName",
        "description": "Collection Name"
      }
    ],
    "servers": {
      "qdrant": {
        "command": "uvx",
        "args": ["mcp-server-qdrant"],
        "env": {
          "QDRANT_URL": "${input:qdrantUrl}",
          "QDRANT_API_KEY": "${input:qdrantApiKey}",
          "COLLECTION_NAME": "${input:collectionName}"
        }
      }
    }
  }
}
```

Or if you prefer using Docker, add this configuration instead:
```json
{
  "mcp": {
    "inputs": [
      {
        "type": "promptString",
        "id": "qdrantUrl",
        "description": "Qdrant URL"
      },
      {
        "type": "promptString",
        "id": "qdrantApiKey",
        "description": "Qdrant API Key",
        "password": true
      },
      {
        "type": "promptString",
        "id": "collectionName",
        "description": "Collection Name"
      }
    ],
    "servers": {
      "qdrant": {
        "command": "docker",
        "args": [
          "run",
          "-p", "8000:8000",
          "-i",
          "--rm",
          "-e", "QDRANT_URL",
          "-e", "QDRANT_API_KEY",
          "-e", "COLLECTION_NAME",
          "mcp-server-qdrant"
        ],
        "env": {
          "QDRANT_URL": "${input:qdrantUrl}",
          "QDRANT_API_KEY": "${input:qdrantApiKey}",
          "COLLECTION_NAME": "${input:collectionName}"
        }
      }
    }
  }
}
```

Alternatively, you can create a `.vscode/mcp.json` file in your workspace with the following content:
```json
{
  "inputs": [
    {
      "type": "promptString",
      "id": "qdrantUrl",
      "description": "Qdrant URL"
    },
    {
      "type": "promptString",
      "id": "qdrantApiKey",
      "description": "Qdrant API Key",
      "password": true
    },
    {
      "type": "promptString",
      "id": "collectionName",
      "description": "Collection Name"
    }
  ],
  "servers": {
    "qdrant": {
      "command": "uvx",
      "args": ["mcp-server-qdrant"],
      "env": {
        "QDRANT_URL": "${input:qdrantUrl}",
        "QDRANT_API_KEY": "${input:qdrantApiKey}",
        "COLLECTION_NAME": "${input:collectionName}"
      }
    }
  }
}
```

For workspace configuration with Docker, use this in `.vscode/mcp.json`:
```json
{
  "inputs": [
    {
      "type": "promptString",
      "id": "qdrantUrl",
      "description": "Qdrant URL"
    },
    {
      "type": "promptString",
      "id": "qdrantApiKey",
      "description": "Qdrant API Key",
      "password": true
    },
    {
      "type": "promptString",
      "id": "collectionName",
      "description": "Collection Name"
    }
  ],
  "servers": {
    "qdrant": {
      "command": "docker",
      "args": [
        "run",
        "-p", "8000:8000",
        "-i",
        "--rm",
        "-e", "QDRANT_URL",
        "-e", "QDRANT_API_KEY",
        "-e", "COLLECTION_NAME",
        "mcp-server-qdrant"
      ],
      "env": {
        "QDRANT_URL": "${input:qdrantUrl}",
        "QDRANT_API_KEY": "${input:qdrantApiKey}",
        "COLLECTION_NAME": "${input:collectionName}"
      }
    }
  }
}
```

If you have suggestions for how mcp-server-qdrant could be improved, or want to report a bug, open an issue! We welcome any and all contributions.
The MCP inspector is a developer tool for testing and debugging MCP servers. It runs both a client UI (default port 5173) and an MCP proxy server (default port 3000). Open the client UI in your browser to use the inspector.
```shell
QDRANT_URL=":memory:" COLLECTION_NAME="test" \
fastmcp dev src/mcp_server_qdrant/server.py
```

Once started, open your browser to http://localhost:5173 to access the inspector interface.
This MCP server is licensed under the Apache License 2.0. This means you are free to use, modify, and distribute the software, subject to the terms and conditions of the Apache License 2.0. For more details, please see the LICENSE file in the project repository.