Add remote OpenAI-compatible embedding API support #776
DavidMStraub merged 5 commits into gramps-project:master from
Conversation
Allow using a remote embedding provider (Ollama, OpenAI, LiteLLM, etc.) instead of a local SentenceTransformer model for semantic search. This adds EMBEDDING_BASE_URL and EMBEDDING_API_KEY config options, documentation in README and CONTRIBUTING, and an optional Ollama service in the devcontainer. Closes gramps-project#775 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
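For illustration, a deployment pointing semantic search at a local Ollama instance might set something like the following (env var names as described in this PR; `nomic-embed-text` is the model used in the PR's test plan, and the commented key is a hypothetical placeholder):

```shell
# Base URL of the OpenAI-compatible embeddings provider (here: local Ollama)
GRAMPSWEB_EMBEDDING_BASE_URL=http://localhost:11434
# API key is optional; omit it for unauthenticated providers like local Ollama
# GRAMPSWEB_EMBEDDING_API_KEY=sk-...
# Embedding model name to request from the provider
GRAMPSWEB_VECTOR_EMBEDDING_MODEL=nomic-embed-text
```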
Do the open checkboxes mean that this is a draft?

Hi @DavidMStraub I've tested the changes locally and confirmed that sentence transformers remain the default, and ollama is enabled when
Pull request overview
Adds support for using a remote OpenAI-compatible /v1/embeddings endpoint for semantic search embeddings (e.g., Ollama/OpenAI/LiteLLM) as an alternative to local SentenceTransformer models, along with related configuration, dev tooling, documentation, and tests.
Changes:
- Introduces `EMBEDDING_BASE_URL`/`EMBEDDING_API_KEY` config and wires semantic search to use either a remote embedding function or local `SentenceTransformer.encode`.
- Adds a focused unit test suite for the remote embedding function behavior (ordering, URL construction, auth header, error propagation).
- Updates docs and devcontainer compose to support optional Ollama-based local integration testing.
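To make the review discussion concrete, here is a minimal sketch of what such a remote embedding function can look like. This is illustrative only: it follows the OpenAI embeddings request/response shape, and `build_embedding_request` is a hypothetical helper introduced here for testability, not a function in the PR.

```python
def build_embedding_request(base_url, api_key, model, texts):
    """Assemble the URL, headers and JSON payload for an OpenAI-compatible
    /v1/embeddings call. Kept pure so it is trivial to unit-test."""
    url = f"{base_url.rstrip('/')}/v1/embeddings"
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return url, headers, {"model": model, "input": texts}


def create_remote_embedding_function(base_url, model_name, api_key=None):
    """Return a callable (texts: list[str]) -> list[list[float]]."""

    def embed(texts):
        # Imported lazily so the request builder above stays dependency-free.
        import requests

        url, headers, payload = build_embedding_request(
            base_url, api_key, model_name, texts
        )
        resp = requests.post(url, json=payload, headers=headers, timeout=30)
        resp.raise_for_status()  # let HTTP errors propagate to the caller
        data = resp.json()["data"]
        # Providers may return items out of order; sort by index so the
        # output stays aligned with the input texts.
        return [item["embedding"] for item in sorted(data, key=lambda d: d["index"])]

    return embed
```

Note the explicit request timeout and the index-based sort, which correspond to the ordering and error-propagation behaviors the test suite exercises.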
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| `gramps_webapi/api/search/embeddings.py` | Adds `create_remote_embedding_function()` using `requests.post` against an OpenAI-compatible embeddings endpoint. |
| `gramps_webapi/app.py` | Initializes `_EMBEDDING_FUNCTION` at app startup (remote function or local model `.encode`). |
| `gramps_webapi/api/search/__init__.py` | Switches semantic search indexer construction to use `_EMBEDDING_FUNCTION`. |
| `gramps_webapi/config.py` | Adds default config entries for remote embedding base URL and API key. |
| `tests/test_embeddings.py` | New unit tests covering the remote embedding function behavior. |
| `README.md` | Documents remote embedding configuration and examples. |
| `CONTRIBUTING.md` | Adds optional steps for testing remote embeddings with Ollama via devcontainer compose profile. |
| `.devcontainer/docker-compose.yml` | Adds an optional Ollama service behind a Docker Compose profile and related env var guidance. |
```python
    Returns a callable with signature (texts: list[str]) -> list[list[float]].
    """
    url = f"{base_url.rstrip('/')}/v1/embeddings"
```
create_remote_embedding_function() always appends /v1/embeddings to base_url. If a user supplies a base URL that already includes /v1 (a common OpenAI-style base URL, and also shown in this PR’s README), the resulting request URL becomes .../v1/v1/embeddings and will 404. Consider normalizing base_url to accept both forms (strip a trailing /v1) or clearly document that base_url must not include /v1.
```diff
-url = f"{base_url.rstrip('/')}/v1/embeddings"
+stripped = base_url.rstrip("/")
+if stripped.endswith("/v1"):
+    stripped = stripped[:-3]
+url = f"{stripped}/v1/embeddings"
```
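The suggested normalization is easy to isolate and test as a standalone helper. The following is a sketch based on the reviewer's suggestion; `normalize_embeddings_url` is a hypothetical name, not a function in the PR:

```python
def normalize_embeddings_url(base_url: str) -> str:
    """Build the full /v1/embeddings URL, accepting base URLs both with
    and without a trailing /v1 segment (and an optional trailing slash)."""
    stripped = base_url.rstrip("/")
    if stripped.endswith("/v1"):
        # Drop the /v1 suffix so it is not duplicated below.
        stripped = stripped[: -len("/v1")]
    return f"{stripped}/v1/embeddings"
```

With this, both `http://localhost:11434` and `https://api.openai.com/v1` resolve to the correct `.../v1/embeddings` endpoint.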
```python
EMBEDDING_BASE_URL = None  # If set, use remote OpenAI-compatible API instead of local model
EMBEDDING_API_KEY = None  # Optional API key for authenticated embedding providers
```
This line exceeds the repository’s configured flake8 max-line-length = 88 (see .flake8). Please wrap the inline comment (or move it above the assignment) to avoid lint failures.
```diff
-EMBEDDING_BASE_URL = None  # If set, use remote OpenAI-compatible API instead of local model
-EMBEDDING_API_KEY = None  # Optional API key for authenticated embedding providers
+# If set, use remote OpenAI-compatible API instead of local model
+EMBEDDING_BASE_URL = None
+# Optional API key for authenticated embedding providers
+EMBEDDING_API_KEY = None
```
```bash
GRAMPSWEB_VECTOR_EMBEDDING_MODEL=text-embedding-3-small
GRAMPSWEB_EMBEDDING_BASE_URL=https://api.openai.com/v1
```
The OpenAI example sets GRAMPSWEB_EMBEDDING_BASE_URL=https://api.openai.com/v1, but create_remote_embedding_function() appends /v1/embeddings to the base URL. With this example config, the effective URL becomes https://api.openai.com/v1/v1/embeddings and will fail. Update the example (e.g., base URL without /v1) or adjust the code to accept both formats.
```diff
-GRAMPSWEB_EMBEDDING_BASE_URL=https://api.openai.com/v1
+GRAMPSWEB_EMBEDDING_BASE_URL=https://api.openai.com
```
```python
call_args = mock_post.call_args
assert call_args[0][0] == "http://localhost:11434/v1/embeddings"


@patch("gramps_webapi.api.search.embeddings.requests.post")
```
There’s no test covering a base_url that already includes a /v1 path segment (even though the README example uses that form). Adding a regression test for this case would prevent accidental reintroduction of .../v1/v1/embeddings URL construction issues.
```diff
 @patch("gramps_webapi.api.search.embeddings.requests.post")
+def test_base_url_with_v1_segment(self, mock_post, mock_response_data):
+    mock_post.return_value.json.return_value = mock_response_data
+    mock_post.return_value.raise_for_status.return_value = None
+    embed = create_remote_embedding_function(
+        base_url="http://localhost:11434/v1",
+        model_name="test-model",
+    )
+    embed(["hello"])
+    call_args = mock_post.call_args
+    assert call_args[0][0] == "http://localhost:11434/v1/embeddings"
 @patch("gramps_webapi.api.search.embeddings.requests.post")
```
- Normalize base_url to strip trailing /v1 to prevent /v1/v1/embeddings
- Add timeout=30 to requests.post() to prevent indefinite hangs
- Move inline comments above assignments in config.py for flake8
- Fix OpenAI example URL in README to not include /v1
- Add test for base_url containing /v1 path segment

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Hi @DavidMStraub - all issues raised by the code review have been addressed.

Thank you! I'll do the human review in the next days.
This looks good to me, thanks! Just two things:
Please do open a PR against github.com/gramps-project/gramps-web-docs with
I'll merge the doc and web API PRs back to back then. Thanks again!
…move README docs

Use VECTOR_EMBEDDING_BASE_URL and VECTOR_EMBEDDING_API_KEY for consistency with the existing VECTOR_EMBEDDING_MODEL config option. Move documentation to the gramps-web-docs repo per maintainer review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Hi @DavidMStraub thanks for the thorough review! I've updated the PR as per your recommendations, and created a separate PR in the documentation repo: gramps-project/gramps-web-docs#73 Please let me know if there's anything else I can do to assist in prep for merging these two.

Looks good, thanks!
Summary
- Adds `EMBEDDING_BASE_URL` and `EMBEDDING_API_KEY` configuration options

Closes #775
Documentation PR: gramps-project/gramps-web-docs#73
Test plan
- Run `docker compose --profile ollama up -d ollama`, pull `nomic-embed-text`, switch env vars, and confirm semantic search works through the remote API
- Run `pytest tests/test_embeddings.py -v` to confirm unit tests pass

🤖 Generated with Claude Code
This PR was generated with the help of AI, and reviewed by a Human