Add async support across providers, client, tools, and the agent Runner by rohitprasad15 · Pull Request #296 · andrewyng/aisuite

rohitprasad15 · 2026-06-08T22:26:57Z

Summary

Makes the core path genuinely async — provider → client → tools → Runner — so agent workloads (parallel tool calls, concurrent subagents, many simultaneous runs) can actually overlap. Design: async-core with thin sync wrappers and a single tool loop, so there's one source of truth rather than two divergent code paths.

Stacked on #295 (base branch is feat/ollama-openai-compat). Review/merge #295 first; this PR's diff will collapse to just the async commits once #295 lands.

Commits

Async provider contract — Provider.achat_completions_create with a thread-executor default (asyncio.to_thread), so every provider is awaitable immediately; OpenaiProvider overrides it natively via openai.AsyncOpenAI (Ollama/LM Studio inherit it).
Async tool execution — Tools.aexecute_tool awaits async def tools and runs sync tools off the loop; per-call validation/policy/events/message-building are factored into shared helpers so sync and async stay in lockstep.
Async client core — Completions.acreate + an async _atool_runner; provider resolution, runner setup, model-event emission, and finalization are shared with the sync path (no duplicated tool loop).
Async Runner — run / continue_run await acreate; run_sync / continue_sync become thin wrappers (with a nest_asyncio fallback when invoked inside a running loop). A use_async_client flag keeps the sync wrappers driving the provider's sync create, so existing callers and tests are unaffected. Concurrent Runner.run calls are now gatherable.

Scope / follow-ups

First-pass native async covers the OpenAI-compatible providers; all others run correctly via the thread-executor default and can get native async incrementally.
Subagent tools (agent_tool) run via the thread executor in the async path, and a turn's tool calls execute sequentially. Parallel tool-call fan-out within a turn is a clean follow-up.

Testing

Full suite shows zero new failures vs. main (pre-existing failures are optional-dep collection errors + the known prerelease/trace tests). New async tests cover the provider contract, async tool execution, acreate, async Runner.run parity, gather concurrency, and run_sync inside a running loop.

Introduce achat_completions_create on Provider as the async completion entrypoint. The base implementation offloads each provider's synchronous chat_completions_create to a worker thread via asyncio.to_thread, so every provider is awaitable immediately. OpenaiProvider overrides it with a native openai.AsyncOpenAI client for true non-blocking I/O; the Ollama and LM Studio subclasses inherit that override for free.

Add Tools.aexecute_tool, which awaits async def tool callables and runs blocking sync tools in a worker thread via asyncio.to_thread. The per-call work (argument validation, policy evaluation, trace events, and tool-result message building) is factored into shared helpers, so the sync execute_tool and the new async path stay in lockstep with no duplicated tool-loop logic.

Add Completions.acreate and an async _atool_runner that await the provider's achat_completions_create and the async tool execution. Provider resolution, tool-runner setup, model-event emission, and response finalization are factored into shared helpers so the sync create/_tool_runner and the async paths stay in lockstep instead of duplicating the tool loop. MCP client cleanup remains synchronous via ExitStack.

Invert Runner so the async path is the core: run/continue_run await the client's acreate, and run_sync/continue_sync become thin wrappers that drive the coroutine to completion (falling back to nest_asyncio when called from within a running loop). The shared _run_impl/_continue_impl take a use_async_client flag so the sync wrappers still drive the provider's synchronous create — keeping existing callers and tests that patch client.chat.completions.create working unchanged. Concurrent Runner.run calls can now be awaited and gathered.

The async client replicated the manual tool-calling regression fixed in #266: acreate() pops `tools` from kwargs and never handed them to the provider when max_turns is absent. Both create() and acreate() now share a _provider_ready_tools() helper (schema dicts pass through, callables become OpenAI-format specs) and an async regression test mirrors the sync one.

rohitprasad15 force-pushed the feat/async-core branch from 25e85c9 to 1948b62 Compare June 8, 2026 23:18

rohitprasad15 force-pushed the feat/ollama-openai-compat branch from c213b18 to 6f71017 Compare June 10, 2026 22:10

rohitprasad15 added 5 commits June 10, 2026 15:19

rohitprasad15 force-pushed the feat/async-core branch from 1948b62 to 5397963 Compare June 10, 2026 22:22

rohitprasad15 mentioned this pull request Jun 10, 2026

Fix CI: black-format the repo and guard tomllib for py3.10 #300

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add async support across providers, client, tools, and the agent Runner#296

Add async support across providers, client, tools, and the agent Runner#296
rohitprasad15 wants to merge 5 commits into
feat/ollama-openai-compatfrom
feat/async-core

rohitprasad15 commented Jun 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

rohitprasad15 commented Jun 8, 2026

Summary

Commits

Scope / follow-ups

Testing

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant