
fix(acp): tool exposure + spec compliance + vendor default args (consolidated) #1078

Closed
codebit0 wants to merge 4 commits into nextlevelbuilder:dev from codebit0:work/2026-05-02

Conversation

codebit0 commented May 2, 2026

Contributor

Summary

Consolidated branch covering ACP-mode agent fixes verified end-to-end against Gemini CLI 0.40.1. Combines the tool-exposure work from #1072 with the spec-compliant ACP standardization, in the same provider-layer architecture as #1069.

Relationship to existing PRs

Reviewers may prefer to merge #1069 first and rebase this on top, or pick one branch to consolidate everything.

Major themes

1. ACP agent tool exposure (from #1072)

  • internal/skills/dep_checker.go — raise depCheckTimeout 5s→30s; on DeadlineExceeded log a warning and assume packages present (the previous behaviour falsely flagged everything missing under load and triggered a reinstall loop every ~30s).
  • cmd/gateway_builtin_tools.go — seed six tools that ACP-mode agents could not see (datetime, delegate, mcp_tool_search, list_group_members, vault_read, vault_search). Native-provider agents read the registry directly and were unaffected, masking the gap.
  • internal/mcp/bridge_server.go — replace the hardcoded BridgeToolNames whitelist with BuiltinToolStore.ListEnabled() lookup. The UI builtin-tools toggle is now the single source of truth for ACP exposure; bridgeAlwaysExcluded keeps internal-only tools (spawn, create_forum_topic, heartbeat) off the bridge.
  • cmd/gateway_setup.go — always construct an mcp.Manager so mcp_tool_search is available even when no external MCP servers are configured (returns "no tools found" until servers are added; no redeploy on add).
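
The bridge filtering described above can be sketched roughly as follows. This is a minimal illustration, not the code in internal/mcp/bridge_server.go: the `enabledLister` interface, the `staticStore` type, and the `bridgeToolNames` helper are hypothetical names, and the real `BuiltinToolStore.ListEnabled()` signature is not shown in this PR. Only `bridgeAlwaysExcluded` and its three entries come from the description.

```go
package main

import (
	"fmt"
	"sort"
)

// bridgeAlwaysExcluded lists the internal-only tools named in the PR text.
var bridgeAlwaysExcluded = map[string]bool{
	"spawn":              true,
	"create_forum_topic": true,
	"heartbeat":          true,
}

// enabledLister is a hypothetical stand-in for BuiltinToolStore; the real
// ListEnabled signature may take a context or return richer records.
type enabledLister interface {
	ListEnabled() []string
}

// staticStore is a toy in-memory store used only for this sketch.
type staticStore []string

func (s staticStore) ListEnabled() []string { return s }

// bridgeToolNames returns the enabled tools minus the always-excluded set,
// sorted by name, together with the number of tools that were skipped.
func bridgeToolNames(store enabledLister) (exposed []string, skipped int) {
	for _, name := range store.ListEnabled() {
		if bridgeAlwaysExcluded[name] {
			skipped++
			continue
		}
		exposed = append(exposed, name)
	}
	sort.Strings(exposed)
	return exposed, skipped
}

func main() {
	store := staticStore{"datetime", "spawn", "vault_read", "mcp_tool_search"}
	exposed, skipped := bridgeToolNames(store)
	fmt.Println("registered:", exposed, "skipped_excluded:", skipped)
}
```

With this shape, toggling a tool in the UI changes what `ListEnabled()` returns, and the exclusion set stays the only hardcoded list on the bridge side.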

2. ACP spec compliance for Gemini CLI 0.40.1

  • Snake_case method aliases — Gemini CLI 0.38.x+ sends spec-compliant fs/read_text_file, fs/write_text_file, terminal/wait_for_exit, session/request_permission. Each tool-bridge switch case now accepts both snake_case and the legacy camelCase. Without this, requests fall through to "unknown method", which Gemini's catch path stringifies as [object Object] in tool_call_update.content because the rejection isn't a JS Error instance.
  • Kind-based permission selection — new SessionRequestPermissionRequest/Response types match the spec's nested-outcome shape ({outcome: {outcome: "selected", optionId}} / {outcome: {outcome: "cancelled"}}). handleSessionPermission selects by PermissionOption.Kind (allow_once / allow_always / reject_once / reject_always) rather than agent-defined OptionID strings, since Kind is stable across Gemini versions while OptionID values are not. Approve-all prefers allow_once so operator policy still gates each tool call.
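
As a rough sketch of the Kind-based selection and the nested-outcome response shape: the struct layouts below are simplified assumptions, not the actual `SessionRequestPermissionRequest`/`Response` definitions in types.go, and `selectByKind` is a hypothetical helper. The preference for `allow_once` over `allow_always` follows the description above; the `reject_once`-first ordering on denial and the fallback to `cancelled` are assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PermissionOption is a simplified, assumed version of the type named in the PR.
type PermissionOption struct {
	OptionID string `json:"optionId"`
	Kind     string `json:"kind"`
}

// outcome is the inner object of the nested-outcome shape.
type outcome struct {
	Outcome  string `json:"outcome"`
	OptionID string `json:"optionId,omitempty"`
}

// permissionResponse serializes as {"outcome": {"outcome": ..., "optionId": ...}}.
type permissionResponse struct {
	Outcome outcome `json:"outcome"`
}

// selectByKind returns the first option whose Kind matches the preferred
// kinds, tried in order. OptionID strings are never inspected, only echoed.
func selectByKind(opts []PermissionOption, approve bool) permissionResponse {
	prefs := []string{"reject_once", "reject_always"} // assumed order for denial
	if approve {
		prefs = []string{"allow_once", "allow_always"} // allow_once first, per the PR
	}
	for _, kind := range prefs {
		for _, o := range opts {
			if o.Kind == kind {
				return permissionResponse{Outcome: outcome{Outcome: "selected", OptionID: o.OptionID}}
			}
		}
	}
	// Assumption: with no matching option, report a cancelled outcome.
	return permissionResponse{Outcome: outcome{Outcome: "cancelled"}}
}

func main() {
	opts := []PermissionOption{
		{OptionID: "opt-a", Kind: "allow_always"},
		{OptionID: "opt-b", Kind: "allow_once"},
	}
	b, err := json.Marshal(selectByKind(opts, true))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

Because the agent's `OptionID` is only echoed back, a Gemini release that renames its option identifiers would not change which option gets picked.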

3. Provider-layer vendor and include-directories injection

Removes the cmd-level enrichGeminiACPArgs in favour of two cleaner primitives in the providers package:

  • WithIncludeDirectories(dirs []string) ACPOption — callers pass candidate skill directories; NewACPProvider stat-filters and emits --include-directories <dir> pairs only for the gemini binary.
  • applyVendorDefaultArgs(binary, args) — appends per-binary CLI defaults that goclaw's deployment requires unconditionally. Currently injects --skip-trust for gemini so MCP discovery runs even when the per-session cwd inherits DO_NOT_TRUST from ~/.gemini/trustedFolders.json. ACP sessions always run inside a goclaw-managed sandbox, so the user-facing trust gate is moot. New vendor rules go here rather than scattering binary-name checks across call sites.

cmd/gateway_providers.go now only enumerates candidate skill directories via acpSkillCandidateDirs(workspace) and passes them as the WithIncludeDirectories option; binary gating and vendor flag emission live in the provider.
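
A minimal sketch of how the two primitives might behave, assuming the binary is recognized by its base name and that an already-present `--skip-trust` is not duplicated (neither detail is stated in this PR). `isGemini` and `includeDirArgs` are hypothetical helpers; the real logic lives inside `NewACPProvider` and the `WithIncludeDirectories` option.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// isGemini reports whether the binary's base name is "gemini".
// Assumption: the provider gates on the base name of the configured path.
func isGemini(binary string) bool {
	return filepath.Base(binary) == "gemini"
}

// includeDirArgs stat-filters candidate directories and emits one
// "--include-directories <dir>" pair per existing directory, gemini only.
func includeDirArgs(binary string, dirs []string) []string {
	if !isGemini(binary) {
		return nil
	}
	var out []string
	for _, d := range dirs {
		if info, err := os.Stat(d); err == nil && info.IsDir() {
			out = append(out, "--include-directories", d)
		}
	}
	return out
}

// applyVendorDefaultArgs appends per-binary defaults. For gemini it adds
// --skip-trust unless the caller already passed it (assumed dedupe rule).
func applyVendorDefaultArgs(binary string, args []string) []string {
	if !isGemini(binary) {
		return args
	}
	for _, a := range args {
		if a == "--skip-trust" {
			return args
		}
	}
	// Copy before appending so the caller's slice is never mutated.
	return append(append([]string(nil), args...), "--skip-trust")
}

func main() {
	binary := "/usr/local/bin/gemini"
	args := includeDirArgs(binary, []string{".", "./does-not-exist"})
	args = applyVendorDefaultArgs(binary, args)
	fmt.Println(args)
}
```

Keeping both checks behind one binary test means a new vendor rule is a single added branch in the provider rather than another name check at each call site.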

Test plan

  • go build ./... (PG)
  • go build -tags sqliteonly ./... (Desktop / SQLite)
  • go vet ./...
  • go test ./internal/providers/acp/...
  • go test ./internal/skills/...
  • Live verified against Gemini CLI 0.40.1: ACP session/new → session/prompt → session/request_permission (Kind-based, approved with allow_once) → MCP tools/call for datetime → tool result text returned to LLM → final user-facing message contains the correct timestamp. No [object Object] content in tool_call_update.
  • mcp.bridge: tools registered count=37 skipped_excluded=1 — the seeded tools appear in tools/list results.
  • Reviewer to run integration tests with pgvector pg18 on port 5433.
  • Reviewer to verify the dep-check loop is gone under skill install load.

codebit0 and others added 4 commits May 2, 2026 10:26
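- ACP MCP bridge exposure: Gemini can now use goclaw built-in tools such as skill_search
  - NewBridgeServer dynamically loads the list of enabled tools from BuiltinToolStore
  - Tools in bridgeAlwaysExcluded (spawn, create_forum_topic, heartbeat) are excluded
- Fix the Gemini McpServer header format: use the []McpServerKV array form that
  matches the actual Gemini CLI 0.36.x implementation instead of the spec's object form
- ACP per-session cwd isolation: create an independent workspace per session, the same way cli-workspaces does
- enrichGeminiACPArgs: add five skill source directories (skills-store, etc.) via --include-directories
- buildACPMcpServersFunc: combine context headers, HMAC authentication, and UI-registered MCP servers
- Clean up is_system semantics: UpsertSystemSkill no longer forces is_system
  - INSERT: is_system=false (the operator sets it explicitly)
  - UPDATE: keep the is_system column (re-running the seeder does not overwrite the operator's setting)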

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Python dependency check used a 5s context deadline. Importing a
typical skill bundle (numpy + vectorbt + pandas + psycopg2 + dotenv)
takes ~4.4s on idle and easily crosses 5s under load. On context
deadline, the previous code marked every package as missing and
re-triggered pip install for the whole set, which produced more load
and made the next check time out again — an install loop visible in
journalctl as the same packages being installed every ~30 seconds.

- Raise depCheckTimeout from 5s to 30s.
- On context.DeadlineExceeded, log a warning and return nil (assume
  present) instead of treating every import as missing.
- On other exec errors, surface the captured stderr so future failures
  are diagnosable instead of opaque.
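
The timeout handling in the list above might look roughly like this. It is a sketch under assumptions, not the code in internal/skills/dep_checker.go: `checkFunc`, `missingDeps`, `slowCheck`, `failingCheck`, and `runDemo` are hypothetical names, and the real checker shells out to Python rather than taking a callback. Only `depCheckTimeout` and the three behaviours come from the commit message.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log"
	"time"
)

// depCheckTimeout is the raised deadline from the commit message (was 5s).
const depCheckTimeout = 30 * time.Second

// checkFunc is a hypothetical stand-in for the Python import check. It
// returns the missing packages plus any captured stderr.
type checkFunc func(ctx context.Context) (missing []string, stderr string, err error)

// missingDeps runs the check under a deadline. On timeout it logs a warning
// and reports nothing missing; on other errors it surfaces stderr.
func missingDeps(parent context.Context, timeout time.Duration, check checkFunc) ([]string, error) {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()

	missing, stderr, err := check(ctx)
	switch {
	case err == nil:
		return missing, nil
	// A killed subprocess often reports "signal: killed" rather than
	// DeadlineExceeded, so the context's own error is checked as well.
	case errors.Is(err, context.DeadlineExceeded) || errors.Is(ctx.Err(), context.DeadlineExceeded):
		log.Printf("dep check timed out after %s; assuming packages are present", timeout)
		return nil, nil
	default:
		return nil, fmt.Errorf("dep check failed: %w (stderr: %s)", err, stderr)
	}
}

// slowCheck simulates an import check that never finishes before the deadline.
func slowCheck(ctx context.Context) ([]string, string, error) {
	<-ctx.Done()
	return nil, "", ctx.Err()
}

// failingCheck simulates a non-timeout exec failure with captured stderr.
func failingCheck(ctx context.Context) ([]string, string, error) {
	return nil, "python3: command not found", errors.New("exit status 127")
}

// runDemo uses a short deadline so the sketch finishes quickly; production
// code would pass depCheckTimeout instead.
func runDemo(check checkFunc) ([]string, error) {
	return missingDeps(context.Background(), 50*time.Millisecond, check)
}

func main() {
	fmt.Println(runDemo(slowCheck))
	fmt.Println(runDemo(failingCheck))
}
```

Returning no missing packages on timeout is what breaks the loop: a slow check no longer triggers a reinstall that would slow the next check further.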
ACP-mode agents (Atlas, Cortana, etc.) reach goclaw tools only through
the /mcp/bridge MCP server, which exposes the subset of tools listed in
builtin_tools with enabled=true. Several tools registered in the Go
tool registry were never seeded into builtin_tools, so ACP agents could
not call them and reported errors like 'datetime tool is not available'.
Native-provider agents read tools.Registry directly and were unaffected,
which masked the gap.

- Seed datetime, delegate, mcp_tool_search, list_group_members,
  vault_read, vault_search in builtinToolSeedData. The seed reconcile
  step (DELETE WHERE name != ALL) means manual INSERTs were wiped on
  every restart; the seed function is the single source of truth.
- Always construct an mcp.Manager (empty when no servers configured)
  and register mcp_tool_search. This lets the BM25 discovery tool be
  available to LLMs immediately; it returns 'no tools found' until
  external MCP servers are added, then becomes useful without a
  redeploy.
…or default args

End-to-end Gemini CLI 0.40.1 ACP fixes, verified live with `datetime` tool
returning correct time text through the full ACP→MCP bridge path.

Changes:

1. ACP spec method name aliases (tool_bridge.go)
   The ACP spec uses snake_case (`fs/read_text_file`,
   `terminal/wait_for_exit`, `session/request_permission`); only Claude CLI
   ever emitted camelCase. Each switch case now accepts both, so newer
   Gemini builds no longer fall through to "unknown method" — the JSON-RPC
   error that Gemini's catch path stringifies as `[object Object]` in
   tool_call_update content because the rejection isn't a JS Error.

2. Kind-based session permission (tool_bridge.go, types.go)
   Replace the legacy `RequestPermissionRequest`/`Response` (flat outcome)
   with new `SessionRequestPermissionRequest`/`Response` that match the
   spec nested-outcome shape: `{outcome: {outcome: "selected", optionId}}`
   or `{outcome: {outcome: "cancelled"}}`. handleSessionPermission selects
   by `PermissionOption.Kind` (allow_once / allow_always / reject_once /
   reject_always) since OptionID is agent-defined and unstable across
   versions. Approve-all prefers allow_once over allow_always.

3. Provider-layer vendor-arg + include-directories injection
   (acp_provider.go, gateway_providers.go)
   Removes the cmd-level `enrichGeminiACPArgs` in favor of two cleaner
   primitives in the providers package, mirroring the architecture in
   merge/2026-04-29:
   - `WithIncludeDirectories(dirs []string)` option: callers pass
     candidate skill directories; NewACPProvider stat-filters and emits
     `--include-directories <dir>` pairs only for gemini.
   - `applyVendorDefaultArgs(binary, args)` helper: appends per-binary
     CLI defaults that goclaw's deployment requires regardless of user
     state. For gemini, injects `--skip-trust` so MCP discovery runs
     even when the per-session cwd inherits DO_NOT_TRUST from
     `~/.gemini/trustedFolders.json` (ACP sessions always run inside a
     goclaw-managed sandbox, so the user-facing trust gate is moot).
   The cmd layer now only enumerates candidate skill dirs via
   `acpSkillCandidateDirs(workspace)` and passes them as an option;
   binary gating and vendor flag emission live in the provider.

4. ACP ClientInfo.Name reset to empty
   Restored to "" (no ClientInfo identity broadcast); aligns with the
   pre-debug state.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
codebit0 commented May 2, 2026

Contributor Author

Superseded — folding the same content (consolidated ACP work) directly into #1072 instead of opening a parallel PR.

@codebit0 codebit0 closed this May 2, 2026