Add async support across providers, client, tools, and the agent Runner#296
Open
rohitprasad15 wants to merge 5 commits into
Open
Add async support across providers, client, tools, and the agent Runner#296rohitprasad15 wants to merge 5 commits into
rohitprasad15 wants to merge 5 commits into
Conversation
25e85c9 to
1948b62
Compare
c213b18 to
6f71017
Compare
Introduce achat_completions_create on Provider as the async completion entrypoint. The base implementation offloads each provider's synchronous chat_completions_create to a worker thread via asyncio.to_thread, so every provider is awaitable immediately. OpenaiProvider overrides it with a native openai.AsyncOpenAI client for true non-blocking I/O; the Ollama and LM Studio subclasses inherit that override for free.
Add Tools.aexecute_tool, which awaits async def tool callables and runs blocking sync tools in a worker thread via asyncio.to_thread. The per-call work (argument validation, policy evaluation, trace events, and tool-result message building) is factored into shared helpers, so the sync execute_tool and the new async path stay in lockstep with no duplicated tool-loop logic.
Add Completions.acreate and an async _atool_runner that await the provider's achat_completions_create and the async tool execution. Provider resolution, tool-runner setup, model-event emission, and response finalization are factored into shared helpers so the sync create/_tool_runner and the async paths stay in lockstep instead of duplicating the tool loop. MCP client cleanup remains synchronous via ExitStack.
Invert Runner so the async path is the core: run/continue_run await the client's acreate, and run_sync/continue_sync become thin wrappers that drive the coroutine to completion (falling back to nest_asyncio when called from within a running loop). The shared _run_impl/_continue_impl take a use_async_client flag so the sync wrappers still drive the provider's synchronous create — keeping existing callers and tests that patch client.chat.completions.create working unchanged. Concurrent Runner.run calls can now be awaited and gathered.
The async client replicated the manual tool-calling regression fixed in #266: acreate() pops `tools` from kwargs and never handed them to the provider when max_turns is absent. Both create() and acreate() now share a _provider_ready_tools() helper (schema dicts pass through, callables become OpenAI-format specs) and an async regression test mirrors the sync one.
1948b62 to
5397963
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Makes the core path genuinely async —
provider → client → tools → Runner— so agent workloads (parallel tool calls, concurrent subagents, many simultaneous runs) can actually overlap. Design: async-core with thin sync wrappers and a single tool loop, so there's one source of truth rather than two divergent code paths.Commits
Provider.achat_completions_createwith a thread-executor default (asyncio.to_thread), so every provider is awaitable immediately;OpenaiProvideroverrides it natively viaopenai.AsyncOpenAI(Ollama/LM Studio inherit it).Tools.aexecute_toolawaitsasync deftools and runs sync tools off the loop; per-call validation/policy/events/message-building are factored into shared helpers so sync and async stay in lockstep.Completions.acreate+ an async_atool_runner; provider resolution, runner setup, model-event emission, and finalization are shared with the sync path (no duplicated tool loop).run/continue_runawaitacreate;run_sync/continue_syncbecome thin wrappers (with anest_asynciofallback when invoked inside a running loop). Ause_async_clientflag keeps the sync wrappers driving the provider's synccreate, so existing callers and tests are unaffected. ConcurrentRunner.runcalls are now gatherable.Scope / follow-ups
agent_tool) run via the thread executor in the async path, and a turn's tool calls execute sequentially. Parallel tool-call fan-out within a turn is a clean follow-up.Testing
Full suite shows zero new failures vs.
main(pre-existing failures are optional-dep collection errors + the known prerelease/trace tests). New async tests cover the provider contract, async tool execution,acreate, asyncRunner.runparity,gatherconcurrency, andrun_syncinside a running loop.