fix: make observer/observed optional in conclusions query endpoint by TreyDong · Pull Request #492 · plastic-labs/honcho

TreyDong · 2026-04-05T12:32:56Z

Summary

Previously, the conclusions query endpoint required both observer and observed to be specified, raising ValidationException if either was missing.

Changes

query_documents: observer/observed now optional (default None)
_query_documents_pgvector: conditionally applies WHERE clause only when param is not None
query_conclusions router: removed ValidationException guard
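
The conditional-filter pattern described in the list above can be sketched as follows. This is a minimal illustration using the standard-library `sqlite3` module rather than the project's SQLAlchemy/pgvector code; the table layout, the `make_demo_db` helper, and this `query_documents` signature are assumptions made for the example, and no vector search is performed.

```python
import sqlite3


def query_documents(conn, workspace, observer=None, observed=None):
    """Return document contents, filtering by peer only when a value is given."""
    clauses = ["workspace = ?"]
    params = [workspace]
    # A filter is appended only when the caller supplied a value;
    # None means "do not filter on this column".
    if observer is not None:
        clauses.append("observer = ?")
        params.append(observer)
    if observed is not None:
        clauses.append("observed = ?")
        params.append(observed)
    sql = (
        "SELECT content FROM documents WHERE "
        + " AND ".join(clauses)
        + " ORDER BY content"
    )
    return [row[0] for row in conn.execute(sql, params)]


def make_demo_db():
    """Build a small in-memory table (hypothetical schema, for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents "
        "(workspace TEXT, observer TEXT, observed TEXT, content TEXT)"
    )
    conn.executemany(
        "INSERT INTO documents VALUES (?, ?, ?, ?)",
        [
            ("ws", "alice", "bob", "a-on-b"),
            ("ws", "alice", "carol", "a-on-c"),
            ("ws", "bob", "alice", "b-on-a"),
        ],
    )
    return conn
```

Calling `query_documents(make_demo_db(), "ws")` with no peer arguments returns rows for every peer pair in the workspace, while passing `observer="alice"` narrows the result to that observer's rows.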

Behavior Change

Scenario	Before	After
Without filters	ValidationException	Returns results
SDK path	Works	Works (unchanged)

Summary by CodeRabbit

New Features
- Document and conclusion semantic search now supports optional peer filtering—users can omit observer and/or observed parameters to query across all peer relationships in the workspace.

Previously, query_conclusions required both observer and observed to be specified in filters, raising ValidationException if either was missing. This made the REST API awkward to use for ad-hoc queries and was inconsistent with the SDK (which always passes these params).

Changes:
- query_documents: observer/observed are now optional (default None)
- _query_documents_pgvector: conditionally applies WHERE clause only when the param is not None (None = unfiltered)
- query_conclusions router: removed the ValidationException guard, docstring updated to explain the new behavior

Fixes the awkward UX where calling the conclusions query endpoint without observer/observed filters returned an opaque error instead of empty or cross-peer results.

coderabbitai · 2026-04-05T12:33:12Z

Walkthrough

The PR makes observer and observed parameters optional in document querying functions throughout the CRUD and routing layers. Conditional filtering is applied only when these values are provided, and validation requiring both parameters is removed from the router layer.

Changes

Cohort / File(s)	Summary
CRUD Document Query Layer `src/crud/document.py`	Updated `query_documents` and `_query_documents_pgvector` signatures to accept `observer: str | None = None` and `observed: str | None = None`. Conditional filters now append only when values are not `None`, allowing flexible peer filtering.
Router Conclusions Layer `src/routers/conclusions.py`	Removed `ValidationException` handling that required both `observer` and `observed` to be present. Updated docstring to clarify optional filtering behavior; omitting both parameters returns conclusions across all peer relationships in the workspace.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 Optional peers now dance with grace,
No longer bound by validation's embrace,
When None flows through, filters stay in place,
A flexible query finds its space! ✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly describes the main change: making observer/observed parameters optional in the conclusions query endpoint, which is the primary objective of the PR.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.


coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

src/crud/document.py (2)
386-394: ⚠️ Potential issue | 🔴 Critical

Critical: External vector store path will raise ValueError when observer or observed is None.

The query_documents function now accepts None for observer/observed, but when the external vector store path is taken (i.e., _uses_pgvector() returns False), it calls query_external_vector_document_ids which explicitly raises ValueError("observer and observed are required for document namespaces") per src/vector_store/__init__.py:110-112.

This creates a runtime failure for any deployment using external vector stores (LanceDB, Turbopuffer) when clients omit observer/observed filters.

Options to resolve:

Add a guard in query_documents to require observer/observed when using external vector stores

Update query_external_vector_document_ids to support None values (if the stores can handle workspace-wide queries)

Fall back to pgvector path when observer/observed are None
🐛 Option 1: Guard against None in external vector store path
     # External vector store — network call first, DB only for the ID fetch
+    if observer is None or observed is None:
+        raise ValidationException(
+            "observer and observed filters are required when using external vector stores"
+        )
     document_ids = await query_external_vector_document_ids(
         workspace_name=workspace_name,
         observer=observer,
         observed=observed,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/crud/document.py` around lines 386 - 394, The external-vector-store path
currently calls query_external_vector_document_ids which requires non-None
observer/observed, causing runtime errors when query_documents accepts None; fix
query_documents by checking _uses_pgvector(workspace_name) before calling
query_external_vector_document_ids and if it returns False and (observer is None
or observed is None) raise a clear ValueError (or return a controlled error)
stating that observer and observed are required for external vector stores;
update the error message to reference
query_external_vector_document_ids/_uses_pgvector so it's clear where the
constraint comes from.
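
To make the failure mode and the Option 1 guard concrete, here is a minimal, self-contained sketch. Only the two error messages are taken from the review above; the `ValidationException` stand-in class, the namespace string format, and the `resolve_namespace` helper are hypothetical and do not reflect the actual code in `src/vector_store/__init__.py`.

```python
class ValidationException(Exception):
    """Stand-in for the project's exception class (hypothetical here)."""


def get_vector_namespace(workspace, observer, observed):
    """Hypothetical namespace builder; only the error message is from the review."""
    if observer is None or observed is None:
        raise ValueError("observer and observed are required for document namespaces")
    # The real namespace format is unknown; this one is invented for the sketch.
    return f"{workspace}.{observer}.{observed}"


def resolve_namespace(workspace, observer, observed, uses_pgvector):
    """Return None for pgvector (no namespace needed), else a namespace string."""
    if uses_pgvector:
        # pgvector path: peers are optional, filtering happens in SQL.
        return None
    # External store path: fail early with a clear, client-facing error
    # instead of letting the lower-level ValueError surface.
    if observer is None or observed is None:
        raise ValidationException(
            "observer and observed filters are required when using external vector stores"
        )
    return get_vector_namespace(workspace, observer, observed)
```

With this shape, a pgvector deployment accepts missing peers, while an external-store deployment rejects the request before any network call is made.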
399-419: ⚠️ Potential issue | 🟡 Minor

Type mismatch: fetch_documents_by_ids expects non-optional observer/observed.

The fetch_documents_by_ids function signature (lines 252-254) requires observer: str and observed: str as non-optional parameters. Passing None here would cause SQLAlchemy to filter for observer IS NULL rather than skipping the filter.

Currently this is masked because the external vector store call fails first (as noted above), but if that's ever fixed, this function would also need updating for consistency.
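
The difference between filtering for NULL and skipping the filter can be shown with a small sketch. In SQLAlchemy, comparing a column to `None` with `==` renders as `IS NULL`; the example below mimics that with raw SQL in the standard-library `sqlite3` module. The table, the `make_db` helper, and both function names are assumptions for illustration, not the project's `fetch_documents_by_ids`.

```python
import sqlite3


def make_db():
    """Two documents, both with a non-NULL observer (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (id INTEGER, observer TEXT)")
    conn.executemany(
        "INSERT INTO documents VALUES (?, ?)", [(1, "alice"), (2, "bob")]
    )
    return conn


def fetch_ids_is_null(conn, ids):
    """What an unconditional comparison against None amounts to: observer IS NULL."""
    marks = ",".join("?" for _ in ids)
    sql = (
        f"SELECT id FROM documents WHERE id IN ({marks}) "
        "AND observer IS NULL ORDER BY id"
    )
    return [row[0] for row in conn.execute(sql, list(ids))]


def fetch_ids_optional(conn, ids, observer=None):
    """Apply the observer filter only when a value was provided."""
    marks = ",".join("?" for _ in ids)
    sql = f"SELECT id FROM documents WHERE id IN ({marks})"
    params = list(ids)
    if observer is not None:
        sql += " AND observer = ?"
        params.append(observer)
    return [row[0] for row in conn.execute(sql + " ORDER BY id", params)]
```

Because every stored row has an observer, the `IS NULL` variant matches nothing, whereas the optional variant returns all requested IDs when no observer is given.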
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/crud/document.py` around lines 399 - 419, fetch_documents_by_ids requires
non-optional observer: str and observed: str, but this call site may pass None;
before invoking fetch_documents_by_ids (both branches where db is provided and
inside the tracked_db block) validate that observer and observed are non-None
and of type str (or coerce to a meaningful non-empty string) and if they are
None raise a clear ValueError (or return an error) so SQLAlchemy won't
accidentally filter for IS NULL; update the calls in this file to ensure
observer and observed are validated and passed as strings to
fetch_documents_by_ids (refer to symbols: fetch_documents_by_ids, observer,
observed, db, tracked_db, managed_db).
src/routers/conclusions.py (1)
102-122: ⚠️ Potential issue | 🟠 Major

External vector store will fail with None observer/observed; docstring overpromises flexibility.

The updated docstring claims "omitting both is allowed," but this only works for pgvector deployments. When using an external vector store (Turbopuffer, LanceDB), the query_external_vector_document_ids function passes observer/observed directly to get_vector_namespace, which raises ValueError at lines 110-113 of src/vector_store/__init__.py: "observer and observed are required for document namespaces."

The pgvector path correctly handles None values via conditional checks (lines 292-295 in src/crud/document.py), but the external vector store path does not. Either restrict the API to require both fields, or update the external vector store query logic to handle optional observer/observed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routers/conclusions.py` around lines 102 - 122, The external vector store
path fails when observer/observed are None because
query_external_vector_document_ids passes them into get_vector_namespace which
requires both; update the logic in crud.query_documents or
query_external_vector_document_ids so external stores either (A) enforce
observer and observed are present and raise a clear error (validate in
query_documents before calling query_external_vector_document_ids), or (B)
handle optional values like the pgvector branch does (check for None and avoid
calling get_vector_namespace with None, using a workspace/global namespace or
building namespaces only for provided fields); modify get_vector_namespace only
if you choose option B to accept optional args, and reference these symbols:
crud.query_documents (or query_documents), query_external_vector_document_ids,
and get_vector_namespace when making the change.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7ebee6c1-a93c-4500-b94f-047adda7a86d

📥 Commits

Reviewing files that changed from the base of the PR and between e487358 and 427c674.

📒 Files selected for processing (2)

src/crud/document.py
src/routers/conclusions.py

ajspig · 2026-04-07T22:17:32Z

Hey @TreyDong — thanks for this PR!

Currently, the /conclusions/query endpoint requires both observer and observed to be specified. Omitting them raises a ValidationException.

I opened an issue to track this as a feature (#520), since the full solution touches the external vector store path (Turbopuffer/LanceDB) in a non-trivial way: namespace hashing requires both observer and observed, so omitting them crashes on those backends. There's also a type mismatch in fetch_documents_by_ids that would need addressing.

Feel free to weigh in there and/or try and tackle the improvements needed!

VVoruganti · 2026-04-08T02:49:39Z

Generally we want to keep a strict boundary for conclusions queries: each query targets a single representation. For external vector DBs such as Turbopuffer or LanceDB we enforce a strict collection boundary. Supporting cross-peer queries would instead require a fanout query across multiple representations simultaneously, followed by a rerank.
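
A fanout-then-rerank query of the kind mentioned above might look roughly like the sketch below. Everything here is hypothetical: the in-memory `store` dict stands in for an external vector store, `search_one` stands in for a per-namespace network call, and the "rerank" is simply a global sort by distance, which a real reranker may not match.

```python
import asyncio
import heapq


async def search_one(store, namespace, query, top_k):
    """Hypothetical per-namespace search returning (distance, doc_id) pairs."""
    await asyncio.sleep(0)  # stand-in for a network round trip
    # The toy store ignores `query`; a real store would embed and compare it.
    hits = store.get(namespace, [])
    return sorted(hits)[:top_k]


async def fanout_search(store, namespaces, query, top_k):
    """Query every namespace concurrently, then keep the globally closest hits."""
    per_namespace = await asyncio.gather(
        *(search_one(store, ns, query, top_k) for ns in namespaces)
    )
    merged = [hit for hits in per_namespace for hit in hits]
    # Rerank step: smallest distance first across all namespaces.
    return heapq.nsmallest(top_k, merged)
```

For example, with `store = {"ws.a.b": [(0.3, "d1"), (0.1, "d2")], "ws.a.c": [(0.2, "d3")]}`, running `asyncio.run(fanout_search(store, ["ws.a.b", "ws.a.c"], "q", 2))` merges both namespaces and keeps the two closest hits. The cost grows with the number of peer pairs, which is one reason a strict per-representation boundary is simpler to operate.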

coderabbitai bot reviewed Apr 5, 2026


ajspig mentioned this pull request Apr 7, 2026

Cross-peer semantic search for conclusions #520

Open

ajspig mentioned this pull request Apr 10, 2026

fix(api): allow optional observer and observed parameters for cross-peer search #541

Closed

TreyDong wants to merge 1 commit into plastic-labs:main from TreyDong:fix/conclusions-query-optional-observer-observed
