fix(tts): unify voice resolution — dashboard /tts settings now affect the tts tool#1175

Open
vanducng wants to merge 1 commit into nextlevelbuilder:dev from dataplanelabs:upstream-fix/tts-system-configs-fallback

@vanducng

Real-world bug

A user configured Vietnamese voice "HoaiMy" (`vi-VN-HoaiMyNeural`) on the dashboard `/tts` page. The Test Playground worked. But when the LLM invoked the `tts` tool, Edge synthesized English instead.

Root cause — two TTS storage locations

| Location | Read by | Written by |
|---|---|---|
| `system_configs[tts.<provider>.voice]` | Test Playground, auto-apply at construction time | Dashboard `/tts` page |
| `builtin_tool_tenant_configs[tts].default_voice_id` | `tts` tool (LLM-invoked) | (not exposed in any UI) |

When the LLM calls `tts` without an explicit `voice` arg, the resolution chain (`args > agent OtherConfig > tenant builtin`) finds no override → empty voice → Edge falls back to its built-in default (English).

Fix

Add `system_configs` as a 4th-level fallback so the dashboard is the single source of truth:

```
args > agent OtherConfig > tenant builtin > system_configs[tts.<provider>.voice]
```

  • New `TtsTool.SetSystemConfigStore(s)` setter
  • `resolveVoiceAndModel` now takes `providerName` and reads the right key
  • Effective provider resolved BEFORE voice/model so the right lookup key is used
  • Wired at gateway boot in `cmd/gateway.go` after `pgStores` is ready
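
The precedence chain above can be sketched in Go as follows. This is a minimal illustration, not the PR's actual code: the `SystemConfigStore` interface, its `Get` method, the `mapStore` stand-in, and the `resolveVoice` signature are all assumptions made for the example. Only the ordering (`args > agent OtherConfig > tenant builtin > system_configs`) and the `tts.<provider>.voice` key shape come from the description.

```go
package main

import "fmt"

// SystemConfigStore is a hypothetical read-only view of system_configs.
// The real store interface in the repository may look different.
type SystemConfigStore interface {
	Get(key string) (string, bool)
}

// mapStore is an in-memory stand-in used only for this sketch.
type mapStore map[string]string

func (m mapStore) Get(key string) (string, bool) {
	v, ok := m[key]
	return v, ok
}

// firstNonEmpty returns the first non-empty candidate, or "" if all are empty.
func firstNonEmpty(candidates ...string) string {
	for _, c := range candidates {
		if c != "" {
			return c
		}
	}
	return ""
}

// resolveVoice applies the order:
// args > agent OtherConfig > tenant builtin > system_configs[tts.<provider>.voice].
// The provider must already be resolved, because it is part of the lookup key.
func resolveVoice(argVoice, agentVoice, tenantVoice, provider string, store SystemConfigStore) string {
	if v := firstNonEmpty(argVoice, agentVoice, tenantVoice); v != "" {
		return v
	}
	// No store wired or no provider known: skip the fallback entirely.
	if store == nil || provider == "" {
		return ""
	}
	if v, ok := store.Get("tts." + provider + ".voice"); ok {
		return v
	}
	return ""
}

func main() {
	store := mapStore{"tts.edge.voice": "vi-VN-HoaiMyNeural"}

	// No arg, agent, or tenant override: the dashboard value is used.
	fmt.Println(resolveVoice("", "", "", "edge", store)) // vi-VN-HoaiMyNeural

	// An explicit arg still takes precedence over system_configs.
	fmt.Println(resolveVoice("en-US-AriaNeural", "", "", "edge", store)) // en-US-AriaNeural
}
```

Resolving the provider first matters here because an empty provider would produce the key `tts..voice`, which never matches a dashboard entry.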

Tests

4 new unit tests in `internal/tools/tts_systemconfigs_fallback_test.go`:

  • `TestResolveVoiceAndModel_SystemConfigsFallback` — voice + model resolve from system_configs
  • `TestResolveVoiceAndModel_ArgWinsOverSystemConfigs` — explicit arg precedence preserved
  • `TestResolveVoiceAndModel_NoStoreNoFallback` — graceful when store unwired
  • `TestResolveVoiceAndModel_EmptyProviderSkipsFallback` — no key lookup without provider

All existing `tts` tests still pass.

Verification

  • `go build ./...` (PG) clean
  • `go vet ./...` clean
  • `go test ./internal/tools/...` green
  • Already running on the fork's prod cluster (v3.23.43) — verified TTS tool now uses dashboard-configured voice.

Cherry-picked from fork's PR dataplanelabs#172.

… the tts tool (#172)

The dashboard /tts page writes to system_configs[tts.<provider>.voice],
but the LLM-invoked tts tool was only checking args > agent OtherConfig
> builtin_tool_tenant_configs[tts].default_voice_id. Two different
storage locations → the user's chosen voice was ignored, Edge defaulted
to en-US-AriaNeural even when "HoaiMy" (vi-VN-HoaiMyNeural) was set
correctly in the dashboard.

Add system_configs as a 4th-level fallback so the dashboard becomes
the single source of truth.

- TtsTool gains SetSystemConfigStore(s) setter
- resolveVoiceAndModel takes providerName + looks up
  tts.<provider>.voice/model when no higher-precedence source set
- effectiveProvider resolved BEFORE voice/model so the right key is hit
- Wired at gateway boot in cmd/gateway.go (after pgStores ready)
- 4 unit tests covering: fallback, arg precedence, no-store, empty provider

Verified against production trace 019e6036-44bb-703b-85fa-dee34f7ab2c0
where the tts tool was called with provider=edge, no voice arg, and
defaulted to English instead of the configured Vietnamese voice.
@mrgoonie

Backlog review: merge-candidate. The PR fixes the dashboard-vs-tool TTS voice/model drift with explicit precedence and regression coverage for system_configs fallback. go and web checks are passing, and the scope is focused enough for merge review.

@mrgoonie

review-pr --fix --reply result

Verdict: Request changes until the prepared fix can be applied to the PR branch.

Summary: The PR correctly adds system_configs as a final TTS voice/model fallback so dashboard /tts settings affect LLM-invoked tts tool calls.

Iterations: 1 review loop, 1 local fix attempt.
Important finding fixed locally: TtsTool.systemConfigs is written under TtsTool.mu but was read without that mutex in resolveVoiceAndModel. That is a race risk if future startup/reload wiring overlaps with Execute. The prepared local fix snapshots systemConfigs under RLock before fallback lookups.

Prepared local commit: 2d63455d fix: guard tts system config fallback access
Merge/conflict state: GitHub reports CLEAN.
CI state: Existing PR checks are green (go, web), but no new checks ran for the prepared fix because push was blocked.

Local verification run after the fix:

  • go test ./internal/tools ./internal/audio ./cmd
  • go build ./...
  • go vet ./...
  • go build -tags sqliteonly ./...

External blocker: GitHub rejected pushing to dataplanelabs/goclaw:upstream-fix/tts-system-configs-fallback as mrgoonie (Permission denied). Branch owner or maintainer needs to apply the mutex snapshot fix, or grant push access.

Unresolved blockers/questions: fork push permission blocked applying the prepared fix to the PR head.

@mrgoonie left a comment

Summary: The PR fixes the real dashboard-vs-LLM tts voice drift by using system_configs as the final voice/model fallback, with focused tests for fallback precedence. The remaining blocker is concurrency safety around the new store pointer.
Risk level: Medium
Mandatory gates:

  • Duplicate/prior implementation: clear
  • Project standards: issue found
  • Strategic necessity: clear value
  • CI/checks: green

Findings:

  • Critical: none
  • Important: TtsTool.systemConfigs is written under TtsTool.mu in SetSystemConfigStore, but the fallback lookup path reads it without taking the same lock. Today wiring is mostly startup-time, but this creates an unnecessary race if gateway reload/reconfiguration overlaps with Execute; the resolver should snapshot the store under RLock before using it.
  • Suggestion: none

Verdict: REQUEST_CHANGES

Please apply the small mutex snapshot fix before merge. The feature itself is valuable and scoped, but this should not land with a known race in the tool execution path.
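
For readers unfamiliar with the pattern, the requested "snapshot under RLock" change might look roughly like the sketch below. It is an illustration under assumptions, not the prepared commit: the `SystemConfigStore` interface, the `mapStore` stand-in, and the `systemConfigStore` and `fallbackVoice` helpers are hypothetical, and `mu` is assumed to be a `sync.RWMutex`. Only the `TtsTool`, `systemConfigs`, `mu`, and `SetSystemConfigStore` names come from the review.

```go
package main

import (
	"fmt"
	"sync"
)

// SystemConfigStore is a hypothetical read-only view of system_configs.
type SystemConfigStore interface {
	Get(key string) (string, bool)
}

// mapStore is an in-memory stand-in used only for this sketch.
type mapStore map[string]string

func (m mapStore) Get(key string) (string, bool) {
	v, ok := m[key]
	return v, ok
}

// TtsTool is reduced to the two fields the review discusses.
type TtsTool struct {
	mu            sync.RWMutex
	systemConfigs SystemConfigStore
}

// SetSystemConfigStore writes the store pointer under the write lock.
func (t *TtsTool) SetSystemConfigStore(s SystemConfigStore) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.systemConfigs = s
}

// systemConfigStore returns a snapshot of the store taken under RLock, so
// callers never read the field while a concurrent setter is writing it.
func (t *TtsTool) systemConfigStore() SystemConfigStore {
	t.mu.RLock()
	defer t.mu.RUnlock()
	return t.systemConfigs
}

// fallbackVoice does the lookup against the snapshot and holds no lock
// while calling into the store.
func (t *TtsTool) fallbackVoice(provider string) string {
	store := t.systemConfigStore()
	if store == nil || provider == "" {
		return ""
	}
	v, _ := store.Get("tts." + provider + ".voice")
	return v
}

func main() {
	tool := &TtsTool{}

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		tool.SetSystemConfigStore(mapStore{"tts.edge.voice": "vi-VN-HoaiMyNeural"})
	}()

	// Safe to call concurrently with the setter; the result here depends on timing.
	_ = tool.fallbackVoice("edge")

	wg.Wait()
	fmt.Println(tool.fallbackVoice("edge")) // vi-VN-HoaiMyNeural
}
```

Taking the snapshot and releasing the lock before the lookup keeps the critical section tiny, so a slow store call cannot block `SetSystemConfigStore`. Running a test like this with `go test -race` is one way to confirm the unguarded read is gone.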

@mrgoonie added the status:blocked, agent:github-maintain, and maintain:triaged labels Jun 20, 2026

@mrgoonie left a comment

Summary: I re-checked the latest head after the recent activity. The functional fix and tests are still good, and go test ./internal/tools passes locally, but the previous concurrency blocker is still present in this head.

Risk level: Medium

Mandatory gates:

  • Duplicate/prior implementation: clear; no competing PR for the same dashboard /tts → tool voice fallback path was found.
  • Project standards: still blocked by the mutex/store-pointer access pattern already called out.
  • Strategic necessity: clear value; this fixes a real dashboard-configured voice drift for LLM-invoked tts calls.
  • CI/checks: GitHub go and web checks are green; local go test ./internal/tools passed.

Findings:

  • Critical: none.
  • Important: TtsTool.systemConfigs is still written in SetSystemConfigStore under t.mu, but resolveVoiceAndModel reads t.systemConfigs directly without taking the same lock or a snapshot. Please snapshot the store under RLock before using it for the fallback lookup, or otherwise make the field immutable after construction with a clear guarantee. This is the same unresolved finding from the prior review.
  • Suggestion: none.

Verdict: COMMENT_ONLY / still blocked

Next step: apply the small mutex snapshot fix, then this should be a straightforward approve/merge candidate.
