Merged
19 changes: 18 additions & 1 deletion .devcontainer/docker-compose.yml
@@ -1,4 +1,3 @@
-version: "3.8"
 services:
   devcontainer:
     build:
@@ -25,6 +24,11 @@ services:
       GRAMPSWEB_SECRET_KEY: QAVoeYDzkQves9iDZ5PkxfdUoVElVMVYPqz-QXha6yE
       GRAMPSWEB_CORS_ORIGINS: "*"
       GRAMPSWEB_VECTOR_EMBEDDING_MODEL: sentence-transformers/distiluse-base-multilingual-cased-v2
+      # To use Ollama for embeddings instead of local SentenceTransformer,
+      # start with: docker compose --profile ollama up -d
+      # Then uncomment the following lines and comment out the line above:
+      # GRAMPSWEB_EMBEDDING_BASE_URL: http://ollama:11434
+      # GRAMPSWEB_VECTOR_EMBEDDING_MODEL: nomic-embed-text
       GRAMPSWEB_CELERY_CONFIG__broker_url: redis://redis:6379/0
       GRAMPSWEB_CELERY_CONFIG__result_backend: redis://redis:6379/0
       GRAMPSWEB_LOG_LEVEL: DEBUG
@@ -35,3 +39,16 @@ services:
     restart: unless-stopped
     ports:
       - "6379:6379"
+
+  ollama:
+    image: ollama/ollama:latest
+    profiles:
+      - ollama
+    restart: unless-stopped
+    ports:
+      - "11434:11434"
+    volumes:
+      - ollama-data:/root/.ollama
+
+volumes:
+  ollama-data:
23 changes: 23 additions & 0 deletions CONTRIBUTING.md
@@ -15,6 +15,29 @@ Welcome, and thank you for your interest in contributing to Gramps Web API! Your
- Follow the [developer documentation](https://www.grampsweb.org/development/dev/) for setup, coding standards, and API details.
- Ensure your changes include appropriate tests and documentation updates where applicable.

#### Testing Remote Embeddings (Optional)

The devcontainer includes an optional [Ollama](https://ollama.com/) service for testing the remote embedding API without external dependencies.

1. Start the Ollama service:

   ```bash
   docker compose -f .devcontainer/docker-compose.yml --profile ollama up -d ollama
   ```

2. Pull an embedding model:

   ```bash
   docker compose -f .devcontainer/docker-compose.yml exec ollama ollama pull nomic-embed-text
   ```

3. In `.devcontainer/docker-compose.yml`, comment out the local `GRAMPSWEB_VECTOR_EMBEDDING_MODEL` line and uncomment the Ollama lines:

   ```yaml
   # GRAMPSWEB_VECTOR_EMBEDDING_MODEL: sentence-transformers/distiluse-base-multilingual-cased-v2
   GRAMPSWEB_EMBEDDING_BASE_URL: http://ollama:11434
   GRAMPSWEB_VECTOR_EMBEDDING_MODEL: nomic-embed-text
   ```

4. Restart the devcontainer to pick up the new environment variables.
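As an offline sanity check of the values in step 3, you can reproduce the request the backend will construct. Nothing is sent here; the URL and payload shape mirror `create_remote_embedding_function` in `gramps_webapi/api/search/embeddings.py`, and the sample input text is only an illustration:

```python
# Offline sketch: the request the backend would send to the Ollama service.
base_url = "http://ollama:11434"   # GRAMPSWEB_EMBEDDING_BASE_URL
model_name = "nomic-embed-text"    # GRAMPSWEB_VECTOR_EMBEDDING_MODEL

# Same construction as in create_remote_embedding_function:
url = f"{base_url.rstrip('/')}/v1/embeddings"
payload = {"model": model_name, "input": ["Birth of John Doe, 1876"]}

print(url)  # http://ollama:11434/v1/embeddings
```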

### Code of Conduct
- Please read and adhere to our [Code of Conduct](CODE_OF_CONDUCT.md) to ensure a welcoming and inclusive environment for all contributors.

27 changes: 27 additions & 0 deletions README.md
@@ -12,6 +12,33 @@ Gramps Web API is the backend of [Gramps Web](https://www.grampsweb.org/), a gen
- Developer documentation for Gramps Web API: https://www.grampsweb.org/dev-backend/
- Documentation for Gramps Web: https://www.grampsweb.org

## Remote Embedding API

By default, Gramps Web API uses a local [SentenceTransformers](https://www.sbert.net/) model for semantic search embeddings. You can optionally use a remote OpenAI-compatible embedding API (e.g. Ollama, OpenAI, LiteLLM) instead.

| Environment Variable | Description | Default |
|---|---|---|
| `GRAMPSWEB_VECTOR_EMBEDDING_MODEL` | Model name for semantic search embeddings | `""` (disabled) |
| `GRAMPSWEB_EMBEDDING_BASE_URL` | Base URL for a remote OpenAI-compatible embedding API | `None` (use local model) |
| `GRAMPSWEB_EMBEDDING_API_KEY` | API key for authenticated embedding providers | `None` |

**Ollama example:**

```bash
GRAMPSWEB_VECTOR_EMBEDDING_MODEL=nomic-embed-text
GRAMPSWEB_EMBEDDING_BASE_URL=http://localhost:11434
```

**OpenAI example:**

```bash
GRAMPSWEB_VECTOR_EMBEDDING_MODEL=text-embedding-3-small
GRAMPSWEB_EMBEDDING_BASE_URL=https://api.openai.com/v1
GRAMPSWEB_EMBEDDING_API_KEY=sk-...
```

> **Review comment (Copilot AI, Apr 1, 2026):** The OpenAI example sets `GRAMPSWEB_EMBEDDING_BASE_URL=https://api.openai.com/v1`, but `create_remote_embedding_function()` appends `/v1/embeddings` to the base URL. With this example config, the effective URL becomes `https://api.openai.com/v1/v1/embeddings` and will fail. Update the example (e.g. base URL without `/v1`: `GRAMPSWEB_EMBEDDING_BASE_URL=https://api.openai.com`) or adjust the code to accept both formats.

> **Note:** Changing the embedding model requires reindexing all records, since different models produce vectors with different dimensions.
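The effective request URL is the configured base URL with `/v1/embeddings` appended. A short sketch of that construction (mirroring `create_remote_embedding_function` in `gramps_webapi/api/search/embeddings.py`) shows why an OpenAI-style base URL should be given without a trailing `/v1`:

```python
def build_embeddings_url(base_url: str) -> str:
    # Mirrors the URL construction in create_remote_embedding_function.
    return f"{base_url.rstrip('/')}/v1/embeddings"

# A plain host works as expected:
print(build_embeddings_url("http://localhost:11434"))
# http://localhost:11434/v1/embeddings

# A base URL that already ends in /v1 doubles the path segment:
print(build_embeddings_url("https://api.openai.com/v1"))
# https://api.openai.com/v1/v1/embeddings
```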

## Related projects

- Gramps Web frontend repository: https://github.com/gramps-project/gramps-web
6 changes: 3 additions & 3 deletions gramps_webapi/api/search/__init__.py
@@ -68,9 +68,9 @@ def get_search_indexer(tree: str) -> SearchIndexer:
def get_semantic_search_indexer(tree: str) -> SemanticSearchIndexer:
     """Get the search indexer for the tree."""
     db_url = _get_search_index_db_url()
-    model = current_app.config.get("_INITIALIZED_VECTOR_EMBEDDING_MODEL")
-    if not model:
+    embedding_function = current_app.config.get("_EMBEDDING_FUNCTION")
+    if not embedding_function:
         raise ValueError("VECTOR_EMBEDDING_MODEL option not set")
     return SemanticSearchIndexer(
-        db_url=db_url, tree=tree, embedding_function=model.encode
+        db_url=db_url, tree=tree, embedding_function=embedding_function
     )
27 changes: 27 additions & 0 deletions gramps_webapi/api/search/embeddings.py
@@ -1,5 +1,9 @@
"""Functions to compute vector embeddings."""

from typing import Callable, List, Optional

import requests

from ..util import get_logger


@@ -16,3 +20,26 @@ def load_model(model_name: str):
    model = SentenceTransformer(model_name)
    logger.debug("Done initializing embedding model.")
    return model


def create_remote_embedding_function(
    base_url: str, model_name: str, api_key: Optional[str] = None
) -> Callable[[List[str]], List[List[float]]]:
    """Create an embedding function that calls a remote OpenAI-compatible API.

    Returns a callable with signature (texts: list[str]) -> list[list[float]].
    """
    url = f"{base_url.rstrip('/')}/v1/embeddings"

    def _embed(texts: List[str]) -> List[List[float]]:
        headers = {"Content-Type": "application/json"}
        if api_key:
            headers["Authorization"] = f"Bearer {api_key}"
        payload = {"model": model_name, "input": texts}
        response = requests.post(url, json=payload, headers=headers)
        response.raise_for_status()
        data = response.json()["data"]
        data.sort(key=lambda item: item["index"])
        return [item["embedding"] for item in data]

    return _embed

> **Review comment (Copilot AI, Apr 1, 2026):** `create_remote_embedding_function()` always appends `/v1/embeddings` to `base_url`. If a user supplies a base URL that already includes `/v1` (a common OpenAI-style base URL, also shown in this PR's README), the resulting request URL becomes `.../v1/v1/embeddings` and will 404. Consider normalizing `base_url` to accept both forms (strip a trailing `/v1`) or clearly document that `base_url` must not include `/v1`. (A second thread on the result ordering was marked resolved by DavidMStraub.)
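One way to tolerate base URLs given both with and without a trailing `/v1` (e.g. `https://api.openai.com/v1`) is to normalize before appending. This is a sketch only, not part of the merged code; the helper name is hypothetical:

```python
def normalize_embeddings_url(base_url: str) -> str:
    """Build the embeddings endpoint URL, tolerating a trailing /v1 on base_url.

    Hypothetical helper, not part of the merged PR.
    """
    stripped = base_url.rstrip("/")
    if stripped.endswith("/v1"):
        stripped = stripped[: -len("/v1")]
    return f"{stripped}/v1/embeddings"

# Both forms resolve to the same endpoint:
print(normalize_embeddings_url("https://api.openai.com"))     # https://api.openai.com/v1/embeddings
print(normalize_embeddings_url("https://api.openai.com/v1"))  # https://api.openai.com/v1/embeddings
```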
14 changes: 10 additions & 4 deletions gramps_webapi/app.py
@@ -36,7 +36,7 @@
from .api import api_blueprint
from .api.cache import persistent_cache, request_cache, thumbnail_cache
from .api.ratelimiter import limiter
-from .api.search.embeddings import load_model
+from .api.search.embeddings import create_remote_embedding_function, load_model
from .api.tasks import run_task, send_telemetry_task
from .api.telemetry import should_send_telemetry
from .api.util import close_db, get_tree_from_jwt
@@ -258,9 +258,15 @@ def close_user_db_connection(exception) -> None:
user_db.session.remove() # pylint: disable=no-member

     if app.config.get("VECTOR_EMBEDDING_MODEL"):
-        app.config["_INITIALIZED_VECTOR_EMBEDDING_MODEL"] = load_model(
-            app.config["VECTOR_EMBEDDING_MODEL"]
-        )
+        if app.config.get("EMBEDDING_BASE_URL"):
+            app.config["_EMBEDDING_FUNCTION"] = create_remote_embedding_function(
+                base_url=app.config["EMBEDDING_BASE_URL"],
+                model_name=app.config["VECTOR_EMBEDDING_MODEL"],
+                api_key=app.config.get("EMBEDDING_API_KEY"),
+            )
+        else:
+            model = load_model(app.config["VECTOR_EMBEDDING_MODEL"])
+            app.config["_EMBEDDING_FUNCTION"] = model.encode

@app.route("/ready", methods=["GET"])
def ready():
4 changes: 3 additions & 1 deletion gramps_webapi/config.py
@@ -79,7 +79,9 @@ class DefaultConfig(object):
     LLM_MODEL = ""
     LLM_MAX_CONTEXT_LENGTH = 50000
     LLM_SYSTEM_PROMPT = None
-    VECTOR_EMBEDDING_MODEL = ""
+    VECTOR_EMBEDDING_MODEL = ""  # Model name for semantic search embeddings
+    EMBEDDING_BASE_URL = None  # If set, use remote OpenAI-compatible API instead of local model
+    EMBEDDING_API_KEY = None  # Optional API key for authenticated embedding providers
> **Review comment (Copilot AI, Apr 1, 2026):** The `EMBEDDING_BASE_URL` line exceeds the repository's configured flake8 `max-line-length = 88` (see `.flake8`). Please wrap the inline comment (or move it above the assignment) to avoid lint failures.
DISABLE_TELEMETRY = False
OIDC_ISSUER = ""
OIDC_CLIENT_ID = ""
Expand Down
130 changes: 130 additions & 0 deletions tests/test_embeddings.py
@@ -0,0 +1,130 @@
"""Tests for remote embedding function."""

from unittest.mock import patch

import pytest

from gramps_webapi.api.search.embeddings import create_remote_embedding_function


@pytest.fixture
def mock_response_data():
    """Sample embedding API response with out-of-order indices."""
    return {
        "object": "list",
        "data": [
            {"object": "embedding", "index": 1, "embedding": [0.4, 0.5, 0.6]},
            {"object": "embedding", "index": 0, "embedding": [0.1, 0.2, 0.3]},
            {"object": "embedding", "index": 2, "embedding": [0.7, 0.8, 0.9]},
        ],
        "model": "test-model",
        "usage": {"prompt_tokens": 10, "total_tokens": 10},
    }


class TestCreateRemoteEmbeddingFunction:
    """Tests for create_remote_embedding_function."""

    @patch("gramps_webapi.api.search.embeddings.requests.post")
    def test_returns_embeddings_in_order(self, mock_post, mock_response_data):
        mock_post.return_value.json.return_value = mock_response_data
        mock_post.return_value.raise_for_status.return_value = None

        embed = create_remote_embedding_function(
            base_url="http://localhost:11434",
            model_name="test-model",
        )
        result = embed(["hello", "world", "foo"])

        assert result == [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]

    @patch("gramps_webapi.api.search.embeddings.requests.post")
    def test_posts_to_correct_url(self, mock_post, mock_response_data):
        mock_post.return_value.json.return_value = mock_response_data
        mock_post.return_value.raise_for_status.return_value = None

        embed = create_remote_embedding_function(
            base_url="http://localhost:11434",
            model_name="test-model",
        )
        embed(["hello"])

        mock_post.assert_called_once()
        call_args = mock_post.call_args
        assert call_args[0][0] == "http://localhost:11434/v1/embeddings"

    @patch("gramps_webapi.api.search.embeddings.requests.post")
    def test_strips_trailing_slash_from_base_url(self, mock_post, mock_response_data):
        mock_post.return_value.json.return_value = mock_response_data
        mock_post.return_value.raise_for_status.return_value = None

        embed = create_remote_embedding_function(
            base_url="http://localhost:11434/",
            model_name="test-model",
        )
        embed(["hello"])

        call_args = mock_post.call_args
        assert call_args[0][0] == "http://localhost:11434/v1/embeddings"

> **Review comment (Copilot AI, Apr 1, 2026):** There is no test covering a `base_url` that already includes a `/v1` path segment (even though the README example uses that form). A regression test for this case would prevent accidental reintroduction of `.../v1/v1/embeddings` URL construction issues — e.g. a `test_base_url_with_v1_segment` that builds the function with `base_url="http://localhost:11434/v1"` and asserts the request URL is `http://localhost:11434/v1/embeddings`.

    @patch("gramps_webapi.api.search.embeddings.requests.post")
    def test_sends_model_and_input(self, mock_post, mock_response_data):
        mock_post.return_value.json.return_value = mock_response_data
        mock_post.return_value.raise_for_status.return_value = None

        embed = create_remote_embedding_function(
            base_url="http://localhost:11434",
            model_name="nomic-embed-text",
        )
        embed(["hello", "world"])

        call_args = mock_post.call_args
        assert call_args[1]["json"] == {
            "model": "nomic-embed-text",
            "input": ["hello", "world"],
        }

    @patch("gramps_webapi.api.search.embeddings.requests.post")
    def test_sends_api_key_header(self, mock_post, mock_response_data):
        mock_post.return_value.json.return_value = mock_response_data
        mock_post.return_value.raise_for_status.return_value = None

        embed = create_remote_embedding_function(
            base_url="http://localhost:11434",
            model_name="test-model",
            api_key="sk-test-key-123",
        )
        embed(["hello"])

        call_args = mock_post.call_args
        headers = call_args[1]["headers"]
        assert headers["Authorization"] == "Bearer sk-test-key-123"

    @patch("gramps_webapi.api.search.embeddings.requests.post")
    def test_no_auth_header_without_api_key(self, mock_post, mock_response_data):
        mock_post.return_value.json.return_value = mock_response_data
        mock_post.return_value.raise_for_status.return_value = None

        embed = create_remote_embedding_function(
            base_url="http://localhost:11434",
            model_name="test-model",
        )
        embed(["hello"])

        call_args = mock_post.call_args
        headers = call_args[1]["headers"]
        assert "Authorization" not in headers

    @patch("gramps_webapi.api.search.embeddings.requests.post")
    def test_raises_on_http_error(self, mock_post):
        from requests.exceptions import HTTPError

        mock_post.return_value.raise_for_status.side_effect = HTTPError("500 Server Error")

        embed = create_remote_embedding_function(
            base_url="http://localhost:11434",
            model_name="test-model",
        )

        with pytest.raises(HTTPError):
            embed(["hello"])