
Reduce token usage in conversation search #302

Merged
mbklein merged 3 commits into deploy/prototype from 5373-too-much-input-filter
Mar 25, 2025
Conversation

@mbklein mbklein commented Mar 25, 2025

Building on PR #296, which reduced cumulative token usage across multiple interactions by removing all non-current ToolMessages, this PR reduces token counts within a single interaction by:

  • Reducing redundancy in tool results, by returning only content rather than both content and an artifact containing the same data in different forms
  • Filtering out unneeded fields from the documents returned by the index

On average, this reduces tool content byte size by about 90%, and token counts per interaction by about 80%. This allows complex questions to succeed by keeping tool recursion from overwhelming the LLM's max token count.
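The two reductions above can be sketched as follows. This is a hypothetical illustration, not the PR's actual code: the kept field names (`title`, `description`, etc.) and the shape of the tool result are assumptions.

```python
# Sketch of the two token-reduction strategies described above.
# Field names and the result shape are illustrative assumptions.

# Only these document fields are passed back to the LLM.
KEPT_FIELDS = {"id", "title", "description", "subject", "work_type"}

def slim_document(doc: dict) -> dict:
    """Drop index fields the LLM does not need (embeddings, full text, ...)."""
    return {k: v for k, v in doc.items() if k in KEPT_FIELDS}

def tool_result(docs: list[dict]) -> dict:
    """Return content only -- no redundant artifact carrying the same data."""
    return {"content": [slim_document(d) for d in docs]}

# Example: a raw index hit with large fields that would inflate token counts.
hits = [
    {
        "id": "1",
        "title": "Map of Chicago",
        "embedding": [0.1] * 1024,   # large vector, useless to the LLM
        "full_text": "...",          # redundant with description
    },
]
result = tool_result(hits)
```

Dropping a 1024-element embedding and full text from every hit, and omitting the duplicate artifact entirely, is where the bulk of the ~90% byte-size reduction would come from in a sketch like this.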

The PR also adds language to the system prompt capping tool usage at 6 calls per interaction, with instructions to summarize the results gathered so far and ask the user for clarification if the model still cannot answer the question.
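A prompt-level cap like the one described might be wired in as below. The wording and the `BASE_PROMPT` variable are assumptions for illustration; the PR's actual prompt text is not reproduced here.

```python
# Hypothetical system-prompt addition capping tool usage; the exact
# wording used in the PR is an assumption.
BASE_PROMPT = "You are a search assistant for a digital collections index."

TOOL_CAP_INSTRUCTIONS = (
    "Use at most 6 tool calls per interaction. If you reach that limit "
    "without a complete answer, summarize what you have found so far and "
    "ask the user a clarifying question instead of calling more tools."
)

system_prompt = BASE_PROMPT + "\n\n" + TOOL_CAP_INSTRUCTIONS
```

A hard cap expressed in the prompt complements the token-size reductions: even with slimmer tool results, unbounded tool recursion could still exhaust the model's context window.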

@mbklein mbklein requested review from charlesLoder and kdid March 25, 2025 16:35
@mbklein mbklein merged commit 0fe80e4 into deploy/prototype Mar 25, 2025
1 check passed
@mbklein mbklein deleted the 5373-too-much-input-filter branch March 25, 2025 20:35
