
Add remote OpenAI-compatible embedding API support#776

Merged
DavidMStraub merged 5 commits into gramps-project:master from cbcoutinho:feature/remote-embedding-api
Apr 7, 2026

Conversation

@cbcoutinho
Contributor

@cbcoutinho cbcoutinho commented Feb 18, 2026

Summary

  • Add support for remote OpenAI-compatible embedding APIs (Ollama, OpenAI, LiteLLM, etc.) as an alternative to local SentenceTransformer models for semantic search
  • Introduce EMBEDDING_BASE_URL and EMBEDDING_API_KEY configuration options
  • Add optional Ollama service to devcontainer (behind a Docker Compose profile) for local integration testing
  • Document the new config variables in README, CONTRIBUTING, and inline code comments
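The core mechanism can be sketched as follows. This is a minimal sketch, not the PR's exact code: the factory name `create_remote_embedding_function` and the `_embed` closure appear in the PR, but the `normalize_embeddings_url` helper and the exact request/response handling here are illustrative.

```python
import requests  # the PR's review comments show requests.post being used


def normalize_embeddings_url(base_url: str) -> str:
    """Build the /v1/embeddings endpoint URL, tolerating base URLs that
    already end in /v1 (common for OpenAI-style configurations)."""
    stripped = base_url.rstrip("/")
    if stripped.endswith("/v1"):
        stripped = stripped[: -len("/v1")]
    return f"{stripped}/v1/embeddings"


def create_remote_embedding_function(base_url, model_name, api_key=None):
    """Return a callable mapping a list of texts to a list of vectors."""
    url = normalize_embeddings_url(base_url)
    headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}

    def _embed(texts):
        resp = requests.post(
            url,
            json={"model": model_name, "input": texts},
            headers=headers,
            timeout=30,  # avoid hanging indefinitely on a stalled server
        )
        resp.raise_for_status()
        data = resp.json()["data"]
        # Sort by "index" so output order matches input order
        return [item["embedding"] for item in sorted(data, key=lambda d: d["index"])]

    return _embed
```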

Closes #775

Documentation PR: gramps-project/gramps-web-docs#73

Test plan

  • Verify default local SentenceTransformer path still works (no env vars changed)
  • Start Ollama via `docker compose --profile ollama up -d ollama`, pull `nomic-embed-text`, switch env vars, and confirm semantic search works through the remote API
  • Run `pytest tests/test_embeddings.py -v` to confirm unit tests pass
  • Verify devcontainer builds and starts without Ollama profile (default path)
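The remote path in the test plan assumes environment variables along these lines (names as introduced in this PR, before the later `VECTOR_` rename; values illustrative for the devcontainer's Ollama service):

```shell
# Model served by Ollama (must be pulled first: ollama pull nomic-embed-text)
GRAMPSWEB_VECTOR_EMBEDDING_MODEL=nomic-embed-text
# Base URL of the OpenAI-compatible embedding API exposed by the Ollama service
GRAMPSWEB_EMBEDDING_BASE_URL=http://ollama:11434
```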

🤖 Generated with Claude Code


This PR was generated with the help of AI and reviewed by a human.

Allow using a remote embedding provider (Ollama, OpenAI, LiteLLM, etc.)
instead of a local SentenceTransformer model for semantic search. This adds
EMBEDDING_BASE_URL and EMBEDDING_API_KEY config options, documentation in
README and CONTRIBUTING, and an optional Ollama service in the devcontainer.

Closes gramps-project#775

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@DavidMStraub
Member

Do the open checkboxes mean that this is a draft?

@cbcoutinho
Contributor Author

cbcoutinho commented Feb 18, 2026

Hi @DavidMStraub, I've tested the changes locally and confirmed that sentence transformers remain the default and that Ollama is used when the EMBEDDING_BASE_URL env var is set. For the remote test I set VECTOR_EMBEDDING_MODEL to nomic-embed-text and pulled the model into Ollama before running the tests.

Both paths confirmed working through create_app:

| Scenario | EMBEDDING_BASE_URL | Function type | Dimension | Result |
|---|---|---|---|---|
| Local SentenceTransformer | not set | `model.encode` (bound method) | 512 | PASSED |
| Remote Ollama | `http://ollama:11434` | `_embed` (closure from `create_remote_embedding_function`) | 768 | PASSED |

Contributor

Copilot AI left a comment


Pull request overview

Adds support for using a remote OpenAI-compatible /v1/embeddings endpoint for semantic search embeddings (e.g., Ollama/OpenAI/LiteLLM) as an alternative to local SentenceTransformer models, along with related configuration, dev tooling, documentation, and tests.

Changes:

  • Introduces EMBEDDING_BASE_URL / EMBEDDING_API_KEY config and wires semantic search to use either a remote embedding function or local SentenceTransformer.encode.
  • Adds a focused unit test suite for the remote embedding function behavior (ordering, URL construction, auth header, error propagation).
  • Updates docs and devcontainer compose to support optional Ollama-based local integration testing.
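The ordering behavior the tests cover comes from the OpenAI-style response format, where each embedding carries an explicit `index` field; a minimal sketch (helper name illustrative, response shape per the OpenAI embeddings API):

```python
def order_embeddings(response_json: dict) -> list[list[float]]:
    """Sort embeddings by their 'index' field so output order matches
    the order of the input texts, even if the server reorders items."""
    data = response_json["data"]
    return [item["embedding"] for item in sorted(data, key=lambda d: d["index"])]


# Items deliberately out of order, as a server is allowed to return them
response = {
    "data": [
        {"index": 1, "embedding": [0.3, 0.4]},
        {"index": 0, "embedding": [0.1, 0.2]},
    ]
}
print(order_embeddings(response))  # [[0.1, 0.2], [0.3, 0.4]]
```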

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
gramps_webapi/api/search/embeddings.py Adds create_remote_embedding_function() using requests.post against an OpenAI-compatible embeddings endpoint.
gramps_webapi/app.py Initializes _EMBEDDING_FUNCTION at app startup (remote function or local model .encode).
gramps_webapi/api/search/__init__.py Switches semantic search indexer construction to use _EMBEDDING_FUNCTION.
gramps_webapi/config.py Adds default config entries for remote embedding base URL and API key.
tests/test_embeddings.py New unit tests covering the remote embedding function behavior.
README.md Documents remote embedding configuration and examples.
CONTRIBUTING.md Adds optional steps for testing remote embeddings with Ollama via devcontainer compose profile.
.devcontainer/docker-compose.yml Adds an optional Ollama service behind a Docker Compose profile and related env var guidance.


Comment thread gramps_webapi/api/search/embeddings.py Outdated

```python
    Returns a callable with signature (texts: list[str]) -> list[list[float]].
    """
    url = f"{base_url.rstrip('/')}/v1/embeddings"
```

Copilot AI Apr 1, 2026


create_remote_embedding_function() always appends /v1/embeddings to base_url. If a user supplies a base URL that already includes /v1 (a common OpenAI-style base URL, and also shown in this PR’s README), the resulting request URL becomes .../v1/v1/embeddings and will 404. Consider normalizing base_url to accept both forms (strip a trailing /v1) or clearly document that base_url must not include /v1.

Suggested change

```diff
-url = f"{base_url.rstrip('/')}/v1/embeddings"
+stripped = base_url.rstrip("/")
+if stripped.endswith("/v1"):
+    stripped = stripped[:-3]
+url = f"{stripped}/v1/embeddings"
```

Comment thread gramps_webapi/api/search/embeddings.py
Comment thread gramps_webapi/config.py Outdated
Comment on lines +83 to +84

```python
EMBEDDING_BASE_URL = None  # If set, use remote OpenAI-compatible API instead of local model
EMBEDDING_API_KEY = None  # Optional API key for authenticated embedding providers
```

Copilot AI Apr 1, 2026


This line exceeds the repository’s configured flake8 max-line-length = 88 (see .flake8). Please wrap the inline comment (or move it above the assignment) to avoid lint failures.

Suggested change

```diff
-EMBEDDING_BASE_URL = None  # If set, use remote OpenAI-compatible API instead of local model
-EMBEDDING_API_KEY = None  # Optional API key for authenticated embedding providers
+# If set, use remote OpenAI-compatible API instead of local model
+EMBEDDING_BASE_URL = None
+# Optional API key for authenticated embedding providers
+EMBEDDING_API_KEY = None
```

Comment thread README.md Outdated

```bash
GRAMPSWEB_VECTOR_EMBEDDING_MODEL=text-embedding-3-small
GRAMPSWEB_EMBEDDING_BASE_URL=https://api.openai.com/v1
```

Copilot AI Apr 1, 2026


The OpenAI example sets GRAMPSWEB_EMBEDDING_BASE_URL=https://api.openai.com/v1, but create_remote_embedding_function() appends /v1/embeddings to the base URL. With this example config, the effective URL becomes https://api.openai.com/v1/v1/embeddings and will fail. Update the example (e.g., base URL without /v1) or adjust the code to accept both formats.

Suggested change

```diff
-GRAMPSWEB_EMBEDDING_BASE_URL=https://api.openai.com/v1
+GRAMPSWEB_EMBEDDING_BASE_URL=https://api.openai.com
```
Comment thread tests/test_embeddings.py
```python
        call_args = mock_post.call_args
        assert call_args[0][0] == "http://localhost:11434/v1/embeddings"

    @patch("gramps_webapi.api.search.embeddings.requests.post")
```

Copilot AI Apr 1, 2026


There’s no test covering a base_url that already includes a /v1 path segment (even though the README example uses that form). Adding a regression test for this case would prevent accidental reintroduction of .../v1/v1/embeddings URL construction issues.

Suggested change

```diff
-@patch("gramps_webapi.api.search.embeddings.requests.post")
+@patch("gramps_webapi.api.search.embeddings.requests.post")
+def test_base_url_with_v1_segment(self, mock_post, mock_response_data):
+    mock_post.return_value.json.return_value = mock_response_data
+    mock_post.return_value.raise_for_status.return_value = None
+    embed = create_remote_embedding_function(
+        base_url="http://localhost:11434/v1",
+        model_name="test-model",
+    )
+    embed(["hello"])
+    call_args = mock_post.call_args
+    assert call_args[0][0] == "http://localhost:11434/v1/embeddings"
+
+@patch("gramps_webapi.api.search.embeddings.requests.post")
```

cbcoutinho and others added 2 commits April 1, 2026 09:10
- Normalize base_url to strip trailing /v1 to prevent /v1/v1/embeddings
- Add timeout=30 to requests.post() to prevent indefinite hangs
- Move inline comments above assignments in config.py for flake8
- Fix OpenAI example URL in README to not include /v1
- Add test for base_url containing /v1 path segment

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@cbcoutinho
Contributor Author

Hi @DavidMStraub - all issues raised by the code review have been addressed

@DavidMStraub
Member

Thank you! I'll do the human review in the next days.

@DavidMStraub
Member

This looks good to me, thanks!

Just two things:

  • I find it inconsistent that some config options start with VECTOR_EMBEDDING and some with EMBEDDING. Although the latter is more concise, we already have precedent for the former, so I think it makes more sense to use VECTOR_ everywhere.
  • The README is not the right place for the docs; please remove them again.

Please do open a PR against github.com/gramps-project/gramps-web-docs with

  • the additional config options documented at install_setup/configuration
  • the setup docs (with docker compose deployment example) at install_setup/chat. That page is probably going to be way too long at this point, so we might want to split it into a few parts.

I'll merge the doc and web API PRs back to back then.

Thanks again!

cbcoutinho and others added 2 commits April 7, 2026 15:00
…move README docs

Use VECTOR_EMBEDDING_BASE_URL and VECTOR_EMBEDDING_API_KEY for consistency
with existing VECTOR_EMBEDDING_MODEL config option. Move documentation to
gramps-web-docs repo per maintainer review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@cbcoutinho
Contributor Author

Hi @DavidMStraub thanks for the thorough review! I've updated the PR as per your recommendations, and created a separate PR in the documentation repo: gramps-project/gramps-web-docs#73

Please let me know if there's anything else I can do to assist in prep for merging these two
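After the rename, the config defaults presumably read roughly as follows (a sketch based on the review comments in this thread; the exact comment wording in gramps_webapi/config.py may differ):

```python
# If set, use a remote OpenAI-compatible API instead of the local model
VECTOR_EMBEDDING_BASE_URL = None

# Optional API key for authenticated embedding providers
VECTOR_EMBEDDING_API_KEY = None
```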

@DavidMStraub
Member

Looks good, thanks!

@DavidMStraub DavidMStraub merged commit f1e528f into gramps-project:master Apr 7, 2026
2 checks passed


Development

Successfully merging this pull request may close these issues.

Feature Request: Support for remote embedding APIs and external vector store backends
