
Fix generate-embeddings row payload shape for CurateGPT insertion#13

Merged: caufieldjh merged 2 commits into main from copilot/fix-type-error-embeddings on Feb 21, 2026

Copilot AI commented Feb 21, 2026


generate-embeddings was passing each CSV row to CurateGPT as a bare dict. The insertion path treated that dict as an iterable of documents, so it iterated over the row's string keys and handled each key as if it were a document. Every row therefore failed with 'str' object has no attribute 'items', and zero rows were embedded. This update aligns the CLI embedding path with CurateGPT’s expected document input structure.

  • Embedding insertion contract

    • Updated embedding generation to pass each row as a single-document list when calling CurateGPT store insertion.
    • This prevents per-row dicts from being interpreted as iterables of string keys.
  • Focused test updates

    • Adjusted embedding generation tests to assert the inserted payload shape ([row]).
    • Kept existing skip/ordering assertions, updated for the new call structure.
# before
store.insert(row, collection=collection_name)

# after
store.insert([row], collection=collection_name)
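
The failure mode described above can be reproduced without CurateGPT. The sketch below is illustrative only: `insert` is a hypothetical stand-in, not CurateGPT's implementation, and the row values are made up. It assumes only what this PR states, namely that the insertion path iterates over its argument and calls `.items()` on each element.

```python
from typing import Any, Dict, Iterable, List


def insert(objs: Iterable[Dict[str, Any]]) -> List[str]:
    """Hypothetical stand-in for a store's insert method (NOT CurateGPT's code).

    Expects an iterable of documents and flattens each document to text.
    """
    texts = []
    for obj in objs:
        # Each element is assumed to be a dict; a str has no .items().
        texts.append(" ".join(f"{key}: {value}" for key, value in obj.items()))
    return texts


# Made-up example row, standing in for one parsed CSV row.
row = {"id": "X:001", "label": "soil temperature"}

# Before: passing the bare dict. Iterating a dict yields its string keys,
# so the first "document" seen by insert() is the string "id".
error_message = ""
try:
    insert(row)
except AttributeError as exc:
    error_message = str(exc)

# After: wrapping the row in a list makes it a single-document batch.
embedded = insert([row])

print(error_message)
print(embedded)
```

Wrapping the row in a list keeps the call site a one-token change while matching a batch-of-documents contract, which is why the fix did not need to touch how rows are parsed.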
Original prompt

This section details the original issue you should resolve

<issue_title>type error when generating embeddings with trowel CLI</issue_title>
<issue_description>Trying to generate embeddings with generate-embeddings gives a 'str' object has no attribute 'items' WARNING and leads to no embedding generation. Seems like at some point, strings aren't digested into a dictionary. Same file runs fine with curategpt.

trowel embeddings generate-embeddings -i bervo_prepared_small.csv
INFO: Starting embedding generation...
/Users/ukaraoz/Work/bioepic/ldrd/trowel/.venv/lib/python3.11/site-packages/eutils/__init__.py:4: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
INFO: Initializing CurateGPT store with DuckDB at ./backup/db.duckdb...
INFO: Loading data from bervo_prepared_small.csv...
WARNING: Failed to embed row 0: 'str' object has no attribute 'items'
WARNING: Failed to embed row 1: 'str' object has no attribute 'items'
WARNING: Failed to embed row 2: 'str' object has no attribute 'items'
INFO: Successfully embedded 0 rows from bervo_prepared_small.csv
INFO: Embeddings stored in collection 'embeddings' at ./backup/db.duckdb
INFO: Successfully generated 0 embeddings
INFO: Database saved at: ./backup/db.duckdb
</issue_description>

Comments on the Issue (you are @copilot in this section)

@caufieldjh Thanks @ukaraoz - I'll take a look


Co-authored-by: caufieldjh <7987729+caufieldjh@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix type error in embedding generation with trowel CLI" to "Fix generate-embeddings row payload shape for CurateGPT insertion" on Feb 21, 2026
Copilot AI requested a review from caufieldjh February 21, 2026 23:18
@caufieldjh caufieldjh marked this pull request as ready for review February 21, 2026 23:31
@caufieldjh caufieldjh merged commit f04b1e9 into main Feb 21, 2026
2 checks passed
@caufieldjh caufieldjh deleted the copilot/fix-type-error-embeddings branch February 21, 2026 23:33
Development

Successfully merging this pull request may close these issues.

type error when generating embeddings with trowel CLI

2 participants