Skip to content

Feat/dynamic tool routing#1096

Merged
murdore merged 1 commit into
juspay:releasefrom
rab1prasad:feat/dynamic-tool-routing
Jun 17, 2026
Merged

Feat/dynamic tool routing#1096
murdore merged 1 commit into
juspay:releasefrom
rab1prasad:feat/dynamic-tool-routing

Conversation

@rab1prasad

@rab1prasad rab1prasad commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

plane ticket: https://plane.breezehq.dev/breeze/browse/BZ-3640/
dev proof: Screenshot 2026-06-16 at 2 20 54 AM

Pull Request

Description

What does this PR do?

Adds an optional pre-call tool-routing layer to NeuroLink. When enabled via
the constructor's toolRouting config, a cheap router LLM runs once per
stream() turn, decides which tool servers the user's query actually needs, and
appends the unpicked servers' tools to options.excludeTools before the main
model call. The result is fewer tool schemas sent to the chat model per turn
(lower token cost, better selection accuracy) with zero behavior change when the
feature is off.

Related Issues

Relates to BZ-3640 (dynamic per-turn tool filtering for chat)

Type of Change

  • New feature (non-breaking change which adds functionality)

Motivation and Context

Why is this change needed? What problem does it solve?

Hosts (e.g. Lighthouse chat) register large MCP tool sets — 100+ tools across
many servers — and the full schema set is re-sent to the model on every step of
the agentic loop. Tool-schema payload dominates prompt tokens, and selection
accuracy degrades well past ~30–50 tools.

This PR moves the routing decision into the SDK so any consumer gets it through
config alone, instead of re-implementing a router per host. A separate, cheap
router model classifies the query against a server catalog (id + description)
and narrows the offered tool set for that turn.

The mechanism is a deliberate denylist (append to excludeTools): the
router only knows the declared catalog — a strict subset of the real tool set —
so built-in direct tools, always-include servers, and any tools outside the
catalog are never touched. Everything fails open, so a router problem can only
ever degrade to "all tools offered" (today's behavior), never to a broken turn.

Changes Made

What specific changes were made?

  • New src/lib/core/toolRouting.tsbuildToolRoutingCatalog() (groups
    registered tool names by ${serverId}_ prefix) and
    resolveToolRoutingExclusions() (runs the router LLM, validates JSON output
    with zod, returns the tool names to exclude). Includes buildRouterPrompt(),
    parseRouterJson(), DEFAULT_ROUTER_PROMPT_PREFIX, and a
    MAX_ROUTER_QUERY_CHARS bound on the interpolated query.
  • New src/lib/types/toolRouting.tsToolRoutingConfig,
    ToolRoutingServerDescriptor, ToolRoutingModelConfig,
    ToolRoutingCatalogEntry, ToolRoutingResolutionParams.
  • src/lib/neurolink.tstoolRouting constructor wiring, the
    applyToolRoutingExclusions() stream-path hook, and setToolRoutingServers()
    to supply/replace the catalog after construction (for hosts that register
    tools per session). The router call runs inside the neurolink.stream span's
    OTel context so its observation nests under the turn's trace.
  • src/lib/types/config.ts — adds toolRouting?: ToolRoutingConfig to
    NeurolinkConstructorConfig.
  • src/lib/core/constants.tsDEFAULT_TOOL_ROUTING_TIMEOUT_MS (15s).
  • src/lib/types/index.ts — exports the new types.
  • test/toolRouting.test.ts — unit tests for catalog building, set-math,
    every fail-open path, and the stream-hook wiring.

Configurable surface: enabled, servers / setToolRoutingServers(),
alwaysIncludeServerIds (never routed, never excluded), timeoutMs,
routerModel (provider/model/region/temperature), and routerPromptPrefix
(overrides the instruction header; defaults to DEFAULT_ROUTER_PROMPT_PREFIX).

Breaking Changes

Does this PR introduce breaking changes?

  • No breaking changes

With no toolRouting config, toolRoutingConfig stays undefined and the hook
returns at its first guard before doing any work — byte-identical to prior
behavior. disableTools and an empty catalog also short-circuit it.

Testing

How has this been tested?

  • Unit tests added/updated
  • Manual testing completed
  • Tested with multiple providers: Vertex (router gemini-3.1-flash-lite-preview; main chat claude-sonnet-4-5)

Test Coverage

  • All new code is covered by tests
  • Existing tests pass
  • Coverage percentage maintained or improved

test/toolRouting.test.ts — 13 tests passing. Covers: prefix grouping; drop of
zero-tool servers; exclude only unpicked routable servers; always-include never
offered/excluded; markdown-fenced output parsing; hallucinated-id filtering; and
fail-open on missing query, ≤1 routable server, non-JSON, schema-invalid,
empty/hallucinated pick, and router throw. Plus the private stream-hook appends
to excludeTools correctly.

Manual Testing Steps

  1. Construct NeuroLink with toolRouting: { enabled: true, alwaysIncludeServerIds: [...], routerModel: {...} }.
  2. registerTools(...), then setToolRoutingServers([{ id, description }, ...]).
  3. Send a stream() turn with a domain-specific query (e.g. "find surcharge").
  4. Verify in Langfuse: a router generation (cheap model, {"servers":[...]}
    output) appears nested in the turn trace, before the main model call.
  5. Verify the main model's offered tool count / input tokens drop to the routed
    subset + always-include set.
  6. Verify a conversational/nonsense query and a forced router failure both fall
    open (all tools offered, turn proceeds normally).

Code Quality

  • Code follows the project's style guidelines (ESLint passes)
  • Code is properly formatted (Prettier applied)
  • Self-review of code completed
  • No console.log statements (using logger instead)
  • No hardcoded API keys or secrets
  • TypeScript strict mode compliance
  • Proper error handling implemented (fail-open; resolver never throws)
  • TODO/FIXME comments reference issues

Documentation

  • JSDoc comments added/updated for public APIs
  • README.md updated (if needed)
  • Documentation in /docs updated (if needed)
  • Code examples added/updated (if needed)
  • CHANGELOG.md updated (if applicable)
  • Migration guide provided (if breaking changes) — N/A, no breaking changes

Commit Message Format

  • Commit message follows format: type(scope): description
  • Valid type used: feat
  • Scope specified

feat(tool-routing): pre-call per-turn tool routing via config

Dependencies

  • No dependency changes

(Uses the already-present zod for router-output validation.)

Performance Impact

Does this change affect performance?

  • Performance improved (provide metrics)

When enabled, the offered tool set for a turn shrinks from the full catalog to
the routed subset + always-include servers, cutting per-step tool-schema tokens.
Cost: one extra cheap router LLM call before the first token (bounded by
timeoutMs, default 15s). Example observed turn — query "find surcharge":
router selected ["surcharge-server"] for ~$0.00035 on gemini-flash-lite, and
the main model was offered the narrowed set instead of all 100+ tools.

When disabled (no config), there is no added work or latency.

Security Considerations

Are there any security implications?

  • Security review needed (low risk; described below)

  • The user query is interpolated into the router prompt. It is treated as
    untrusted: framed explicitly as "data to classify, not instructions," length
    is bounded by MAX_ROUTER_QUERY_CHARS, and the blast radius of any injection
    is bounded — the router can only pick from the declared catalog
    (routableServerIds.has(...) filters hallucinated ids), so the worst case is
    "keep more already-registered tools," never tool escalation outside the
    catalog.

  • Router output is parsed defensively (fence-strip + JSON-object extraction) and
    zod-validated; any failure fails open.

Deployment Notes

  • No special deployment steps

Inert until a consumer opts in via the toolRouting constructor config.

Reviewer Checklist

  • Code follows project style and conventions
  • Changes are well-documented
  • Tests provide adequate coverage
  • No obvious performance issues
  • No security vulnerabilities introduced
  • Breaking changes are properly documented (N/A)
  • Documentation is clear and accurate

Additional Notes

  • Trace nesting tradeoff (intentional): the router call runs inside
    executeStreamRequest so its span nests under the turn's neurolink.stream
    trace. A consequence is that on a model-access-denied fallback retry the
    router runs again for that attempt. This is accepted: fallback is rare and the
    router is the cheap flash-lite model with a hard timeout. Exclusions do not
    compound across attempts — applyToolRoutingExclusions reassigns
    options.excludeTools to a new array on a per-attempt clone, and stream()
    isolates options from the caller at entry.
  • The dedicated neurolink.toolRouting span is omitted on this branch; the
    router call is still visible as its own generation within the turn trace.

Pre-submission Checklist

  • Read and followed the Contributing Guidelines
  • Verified all automated pre-commit checks pass
  • Tested changes locally with pnpm test (pnpm exec vitest run test/toolRouting.test.ts)
  • Built the project successfully with pnpm build
  • Run pnpm run validate:all and all checks pass
  • Reviewed your own code for obvious issues
  • Ensured commit messages follow semantic format
  • Updated relevant documentation
  • Added tests for new functionality
  • Checked that CI/CD pipeline passes (after creating PR)

Summary by CodeRabbit

  • New Features
    • Added pre-call tool routing for streaming responses to automatically infer relevant tool servers and exclude unselected tools.
    • Added setToolRoutingServers() to supply/replace the routing server catalog after construction.
  • Configuration
    • Added optional toolRouting setting (including timeout and optional router model/prompt overrides); router failures now fail open (tools are not restricted).
  • Bug Fixes
    • Improved telemetry truncation to account for truncation suffix length.
  • Tests
    • Added coverage for routing catalog building, router parsing, exclusion computation, and routing history shaping.

Copilot AI review requested due to automatic review settings June 15, 2026 20:51
@vercel

vercel Bot commented Jun 15, 2026

Copy link
Copy Markdown

@rab1prasad is attempting to deploy a commit to the Sachin Sharma's projects Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai

coderabbitai Bot commented Jun 15, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a82305fd-656e-43b7-84ac-9ecbf75db620

📥 Commits

Reviewing files that changed from the base of the PR and between c8bada7 and f162f86.

📒 Files selected for processing (8)
  • src/lib/core/constants.ts
  • src/lib/core/toolRouting.ts
  • src/lib/neurolink.ts
  • src/lib/telemetry/attributes.ts
  • src/lib/types/config.ts
  • src/lib/types/index.ts
  • src/lib/types/toolRouting.ts
  • test/toolRouting.test.ts
✅ Files skipped from review due to trivial changes (1)
  • src/lib/types/index.ts
🚧 Files skipped from review as they are similar to previous changes (7)
  • src/lib/telemetry/attributes.ts
  • src/lib/types/config.ts
  • src/lib/types/toolRouting.ts
  • src/lib/core/constants.ts
  • test/toolRouting.test.ts
  • src/lib/core/toolRouting.ts
  • src/lib/neurolink.ts

📝 Walkthrough

Walkthrough

Adds pre-call tool routing to NeuroLink's stream() pipeline. A router LLM is called before each stream turn with the user query and a JSON catalog of configured MCP servers; unselected servers' tool names are returned as an exclusion denylist. All failure paths fail open (return []). A new setToolRoutingServers() public method allows post-construction server catalog updates. Also fixes spanJsonAttribute truncation to keep the string within maxChars.

Changes

Pre-call Tool Routing

Layer / File(s) Summary
Tool routing types, config, and timeout constant
src/lib/types/toolRouting.ts, src/lib/types/config.ts, src/lib/types/index.ts, src/lib/core/constants.ts
Defines ToolRoutingServerDescriptor, ToolRoutingModelConfig, ToolRoutingConfig, ToolRoutingCatalogEntry, ToolRoutingResolutionParams, extends NeurolinkConstructorConfig with toolRouting?, re-exports from the barrel, and adds DEFAULT_TOOL_ROUTING_TIMEOUT_MS = 15000.
Core routing logic
src/lib/core/toolRouting.ts
Implements buildToolRoutingCatalog (server-id prefix grouping), buildRoutingQueryFromHistory (bounded transcript with char/message caps), router prompt builder with embedded JSON catalog, parseRouterJson with fence stripping and fallback extraction, and resolveToolRoutingExclusions end-to-end with Zod validation and fail-open guarantees.
NeuroLink streaming routing wiring
src/lib/neurolink.ts
Adds toolRoutingConfig instance field, clones config in constructor, wires applyToolRoutingExclusions into the turn-scoped streamSpanContext, and implements applyToolRoutingExclusions, fetchRecentRoutingHistory, and public setToolRoutingServers.
spanJsonAttribute truncation fix
src/lib/telemetry/attributes.ts
Computes keepLength to reserve space for the truncation suffix so the returned string stays within maxChars.
Tool routing tests
test/toolRouting.test.ts
Covers catalog prefix grouping, exclusion computation, all fail-open paths (missing query, ≤1 routable server, invalid/non-JSON output, hallucinated picks, errors), NeuroLink wiring integration, and buildRoutingQueryFromHistory transcript/truncation behavior.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant NeuroLink_stream as NeuroLink.stream()
  participant applyToolRoutingExclusions
  participant resolveToolRoutingExclusions
  participant RouterLLM as Router LLM (generateFn)

  Client->>NeuroLink_stream: stream(options, prompt)
  NeuroLink_stream->>applyToolRoutingExclusions: options, userQuery
  applyToolRoutingExclusions->>applyToolRoutingExclusions: buildToolRoutingCatalog(servers, registeredToolNames)
  applyToolRoutingExclusions->>applyToolRoutingExclusions: fetchRecentRoutingHistory(options)
  applyToolRoutingExclusions->>applyToolRoutingExclusions: buildRoutingQueryFromHistory(history, userQuery)
  applyToolRoutingExclusions->>resolveToolRoutingExclusions: catalog, alwaysInclude, query, routerModel, timeout, generateFn
  resolveToolRoutingExclusions->>RouterLLM: router prompt with JSON server catalog
  RouterLLM-->>resolveToolRoutingExclusions: { servers: ["serverA"] }
  resolveToolRoutingExclusions->>resolveToolRoutingExclusions: parse + validate + filter hallucinated ids
  resolveToolRoutingExclusions-->>applyToolRoutingExclusions: excludedToolNames[]
  applyToolRoutingExclusions->>NeuroLink_stream: options.excludeTools += excludedToolNames
  NeuroLink_stream->>Client: stream response (with routing applied)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • punyamsingh
  • murdore
  • akkulg
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Feat/dynamic tool routing' accurately describes the main feature being added—a dynamic pre-call tool routing system for NeuroLink—and is concise and clear.
Docstring Coverage ✅ Passed Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

src/lib/core/constants.ts

Parsing error: Unable to parse the specified 'tsconfig' file. Ensure it's correct and has valid syntax.

error TS5012: Cannot read file '/.svelte-kit/tsconfig.json': ENOENT: no such file or directory, open '/.svelte-kit/tsconfig.json'.

src/lib/core/toolRouting.ts

Parsing error: Unable to parse the specified 'tsconfig' file. Ensure it's correct and has valid syntax.

error TS5012: Cannot read file '/.svelte-kit/tsconfig.json': ENOENT: no such file or directory, open '/.svelte-kit/tsconfig.json'.

src/lib/neurolink.ts

Parsing error: Unable to parse the specified 'tsconfig' file. Ensure it's correct and has valid syntax.

error TS5012: Cannot read file '/.svelte-kit/tsconfig.json': ENOENT: no such file or directory, open '/.svelte-kit/tsconfig.json'.

  • 4 others

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/core/toolRouting.ts`:
- Around line 49-62: The buildToolRoutingCatalog function uses startsWith to
match tools to server IDs, but this approach causes ambiguity when server IDs
have overlapping prefixes (e.g., "analytics" and "analytics_checkout" both match
"analytics_checkout_*" tools). Fix this by implementing a longest matching
prefix strategy instead of simple prefix matching: iterate through each tool in
registeredToolNames, find the server whose ID is the longest matching prefix for
that tool (accounting for the underscore separator), and assign the tool
exclusively to that server. This ensures each tool is owned by exactly one
server and prevents tools from being incorrectly excluded or assigned.

In `@src/lib/neurolink.ts`:
- Around line 1213-1215: The constructor assigns config.toolRouting directly by
reference to this.toolRoutingConfig, which allows subsequent mutations by
setToolRoutingServers() to affect the original config object. If the same config
is reused across multiple NeuroLink instances, mutations in one instance will
leak into others. Create a defensive clone of config.toolRouting before storing
it in this.toolRoutingConfig to prevent unintended cross-instance contamination.
This fix should also be applied at the other location where toolRouting config
is assigned (as noted in the comment).
- Around line 7748-7763: The `generateFn` callback in the
`resolveToolRoutingExclusions` call re-enters the public `generate()` method,
which has a `finally` block that resets the `_disableToolCacheForCurrentRequest`
flag to `false`. This causes the stream-scoped cache setting to be lost before
the main tool execution path runs. Save the current value of
`_disableToolCacheForCurrentRequest` before invoking the router call via
`generateFn`, and restore it afterward to preserve the original cache state.
Alternatively, implement an internal/silent generation path that avoids
re-entering the public `generate()` method and emitting public events for the
router turn.
- Around line 7742-7743: The routing catalog is being built from only custom
tools via getCustomTools().keys(), but this excludes external or in-memory MCP
server tools. To fix this, modify the registeredToolNames variable to include
all available MCP tool names from all sources - both custom tools and the tools
available from the servers collection. Instead of relying solely on
getCustomTools(), extract the complete set of tool names from the servers
parameter (which is passed to buildToolRoutingCatalog) to ensure the routing
catalog includes all MCP tools and the feature works for all tool registrations.

In `@src/lib/telemetry/attributes.ts`:
- Around line 164-166: The spanJsonAttribute function at Line 164 slices the
serialized string to maxChars and then appends a truncation suffix, which causes
the returned string to exceed the declared maxChars hard cap. To fix this,
calculate the length of the truncation suffix message and reduce the slice point
by that amount so the final concatenated result (slice plus suffix) does not
exceed maxChars. This ensures the hard-ceiling guarantee is actually enforced
and respects strict downstream attribute-size limits.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e91b1d7a-2300-4b62-a558-321f86649563

📥 Commits

Reviewing files that changed from the base of the PR and between 2588300 and 9e29708.

📒 Files selected for processing (15)
  • package.json
  • src/lib/core/constants.ts
  • src/lib/core/toolRouting.ts
  • src/lib/neurolink.ts
  • src/lib/providers/googleVertex.ts
  • src/lib/services/server/ai/observability/instrumentation.ts
  • src/lib/telemetry/attributes.ts
  • src/lib/telemetry/index.ts
  • src/lib/types/config.ts
  • src/lib/types/index.ts
  • src/lib/types/toolRouting.ts
  • src/lib/utils/anthropicTraceSanitizer.ts
  • src/lib/utils/mcpErrorText.ts
  • test/continuous-test-suite-vertex-langfuse-spans.ts
  • test/toolRouting.test.ts

Comment thread src/lib/core/toolRouting.ts
Comment thread src/lib/neurolink.ts
Comment thread src/lib/neurolink.ts Outdated
Comment thread src/lib/neurolink.ts Outdated
Comment thread src/lib/telemetry/attributes.ts
@rab1prasad rab1prasad force-pushed the feat/dynamic-tool-routing branch from 9e29708 to 743fe33 Compare June 16, 2026 20:21

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/core/toolRouting.ts`:
- Around line 236-249: The router call in the generateFn invocation currently
delegates timeout enforcement to the callee by passing the timeout parameter,
which can allow the operation to hang if the callee ignores it. Remove the
timeout property from the generateFn options object and instead wrap the entire
awaited generateFn call with the withTimeout utility, passing timeoutMs as the
second argument. This ensures timeoutMs is enforced as a true hard ceiling
locally rather than relying on the callee to respect it.
- Around line 254-262: The logger.warn call in the ToolRouting validation block
(around the rawResponsePreview parameter) is logging raw router output that may
contain sensitive user query or history content. Remove the rawResponsePreview
line entirely from the logger.warn call, or if router output logging is needed
for debugging, replace it with non-sensitive metadata instead. If raw output
logging is necessary, consider using transformParamsForLogging() to safely strip
secrets before logging as per the coding guidelines.
- Around line 112-126: The early return statement when priorTurns.length === 0
bypasses the maxChars constraint by returning currentQuery directly. Apply the
same character limiting logic used for the full transcript to the currentQuery
in the early return case: check if currentQuery.length exceeds maxChars and if
so, slice it to return only the last maxChars characters, otherwise return
currentQuery unchanged. This ensures the helper respects the caller-provided
character ceiling even when there is no prior conversation history.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 142c120e-5de6-40c1-adc1-77804373d263

📥 Commits

Reviewing files that changed from the base of the PR and between 9e29708 and 743fe33.

📒 Files selected for processing (8)
  • src/lib/core/constants.ts
  • src/lib/core/toolRouting.ts
  • src/lib/neurolink.ts
  • src/lib/telemetry/attributes.ts
  • src/lib/types/config.ts
  • src/lib/types/index.ts
  • src/lib/types/toolRouting.ts
  • test/toolRouting.test.ts
🚧 Files skipped from review as they are similar to previous changes (5)
  • src/lib/types/index.ts
  • src/lib/telemetry/attributes.ts
  • src/lib/types/toolRouting.ts
  • src/lib/types/config.ts
  • src/lib/neurolink.ts

Comment thread src/lib/core/toolRouting.ts
Comment thread src/lib/core/toolRouting.ts Outdated
Comment thread src/lib/core/toolRouting.ts
@rab1prasad rab1prasad force-pushed the feat/dynamic-tool-routing branch from 743fe33 to c8bada7 Compare June 17, 2026 07:02

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
src/lib/telemetry/attributes.ts (1)

165-167: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Truncation count and hard-cap can still be wrong in edge cases.

On Line 165-Line 167, the suffix count is computed from serialized.length - maxChars, but once keepLength is reduced, the real truncated count is larger. Also, when the suffix itself is longer than maxChars, the returned value can still exceed the cap.

Suggested fix
   if (serialized.length > maxChars) {
-    const truncationSuffix = `...[truncated ${serialized.length - maxChars} chars]`;
-    const keepLength = Math.max(0, maxChars - truncationSuffix.length);
-    return `${serialized.slice(0, keepLength)}${truncationSuffix}`;
+    const buildSuffix = (truncatedChars: number) =>
+      `...[truncated ${truncatedChars} chars]`;
+
+    // First pass to estimate room for prefix.
+    let suffix = buildSuffix(serialized.length - maxChars);
+    if (suffix.length >= maxChars) {
+      return suffix.slice(0, maxChars);
+    }
+
+    let keepLength = maxChars - suffix.length;
+    // Recompute so suffix count reflects actual truncation after keepLength is known.
+    const truncatedChars = serialized.length - keepLength;
+    suffix = buildSuffix(truncatedChars);
+
+    if (suffix.length >= maxChars) {
+      return suffix.slice(0, maxChars);
+    }
+    keepLength = maxChars - suffix.length;
+    return `${serialized.slice(0, keepLength)}${suffix}`;
   }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/telemetry/attributes.ts` around lines 165 - 167, The truncation logic
in the serialized value truncation has two issues: the truncation count reported
in the suffix is calculated as serialized.length - maxChars, but this doesn't
account for the space consumed by the suffix itself, resulting in an inaccurate
count. Additionally, when the truncationSuffix is itself longer than maxChars,
the final returned string can exceed the maxChars limit. Fix this by calculating
the actual truncation count as serialized.length - keepLength (which reflects
the real number of characters removed) and enforce a hard cap on the final
result by slicing the concatenated string to ensure it never exceeds maxChars in
length.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/neurolink.ts`:
- Around line 7817-7823: The code unconditionally returns an empty array
whenever context.sessionId is missing, but stream execution can accept
caller-provided conversationMessages for inline multi-turn conversations without
needing a sessionId. Modify the condition to check not only for the presence of
sessionId but also for the availability of inline conversationMessages in the
context or options. Only return the empty array if both sessionId is absent AND
no conversationMessages are provided, allowing inline callers to proceed with
their conversation history without requiring server-side session memory.
- Around line 7748-7795: The setup calls to getCustomTools() and
buildToolRoutingCatalog() can throw exceptions before the resolver's fail-open
boundary, and the router's generate() call does not receive the abort signal,
allowing aborted streams to wait for the routing timeout. Wrap the
getCustomTools() and buildToolRoutingCatalog() calls in a try-catch block that
returns early (fail-open) on error, pass options.abortSignal to the generateFn
call within resolveToolRoutingExclusions to respect stream cancellations
immediately, and add an abort signal check after the resolver completes to stop
routing setup if the stream was aborted during the router call.

---

Duplicate comments:
In `@src/lib/telemetry/attributes.ts`:
- Around line 165-167: The truncation logic in the serialized value truncation
has two issues: the truncation count reported in the suffix is calculated as
serialized.length - maxChars, but this doesn't account for the space consumed by
the suffix itself, resulting in an inaccurate count. Additionally, when the
truncationSuffix is itself longer than maxChars, the final returned string can
exceed the maxChars limit. Fix this by calculating the actual truncation count
as serialized.length - keepLength (which reflects the real number of characters
removed) and enforce a hard cap on the final result by slicing the concatenated
string to ensure it never exceeds maxChars in length.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 101cfa2d-3963-4f5e-a9cf-e46501e76b10

📥 Commits

Reviewing files that changed from the base of the PR and between 743fe33 and c8bada7.

📒 Files selected for processing (8)
  • src/lib/core/constants.ts
  • src/lib/core/toolRouting.ts
  • src/lib/neurolink.ts
  • src/lib/telemetry/attributes.ts
  • src/lib/types/config.ts
  • src/lib/types/index.ts
  • src/lib/types/toolRouting.ts
  • test/toolRouting.test.ts
🚧 Files skipped from review as they are similar to previous changes (5)
  • src/lib/core/constants.ts
  • src/lib/types/index.ts
  • test/toolRouting.test.ts
  • src/lib/types/toolRouting.ts
  • src/lib/core/toolRouting.ts

Comment thread src/lib/neurolink.ts Outdated
Comment thread src/lib/neurolink.ts
Add an optional pre-call tool-routing layer enabled through the constructor's
`toolRouting` config. Once per stream() turn a cheap router LLM is asked which
tool servers the user's query needs; the unpicked servers' registered tool
names are appended to options.excludeTools before the main model call. The
server catalog can be supplied or replaced after construction via
setToolRoutingServers() for hosts that register tools per session.

- alwaysIncludeServerIds servers are never offered to the router and never
  excluded; the router prompt prefix is overridable via routerPromptPrefix
  (defaults to DEFAULT_ROUTER_PROMPT_PREFIX), and the router LLM
  provider/model/region/temperature/timeout are configurable.
- Fails open: missing query, <=1 routable server, parse/validation failure,
  empty/hallucinated pick, or any error yields no exclusions, so all tools stay
  available — byte-identical to routing disabled. The resolver never throws.
- Untrusted query text is bounded (MAX_ROUTER_QUERY_CHARS) and framed as data,
  not instructions.
- The router call runs inside the stream span's OTel context so its observation
  nests under the turn's trace rather than starting a separate one.
@rab1prasad rab1prasad force-pushed the feat/dynamic-tool-routing branch from c8bada7 to f162f86 Compare June 17, 2026 07:52
@rab1prasad

Copy link
Copy Markdown
Contributor Author

@coderabbitai please review this pr

@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Copy link
Copy Markdown

@rab1prasad Sure! I'll start reviewing PR #1096 now. The PR description image shows a Langfuse trace of the neuroLink.executeGeneration span handling the tool-routing router call — with gemini-3.1-flash-lite-preview running at ~1.70s latency and ~$0.000352, classifying the user query "Where are customers dropping off in my checkout flow today?" against a catalog of servers (math-operations, juspay-analytics, breeze-analytics, campaign-tools, etc.). The full turn traces nested under neuroLink.stream show the router completing well before the main neuroLink.provider.stream span.

I'll review the changes across all the new and modified files shortly.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@murdore murdore merged commit d78d691 into juspay:release Jun 17, 2026
11 of 12 checks passed
@github-actions

Copy link
Copy Markdown
Contributor

🎉 This PR is included in version 9.72.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants