Skip to content

Commit 323c840

Browse files
committed
fix(python): hand PyRuntime drop to shutdown_background inside a tokio context
Addresses review feedback on the ctx.fs PR after #2009 (deterministic teardown) landed and changed PyRuntime::drop underneath it. #2009's PyRuntime::drop does a blocking runtime join (join_without_gil) while the interpreter is alive. The ctx.fs adapters now hold PyRuntime clones, so when a Bash is dropped while `await execute()` is in flight, the in-flight future drops the last Arc<Mutex<Bash>> — and thus the last PyRuntime clone — on a pyo3-async-runtimes worker thread. A blocking runtime drop inside a tokio context panics ("Cannot drop a runtime in a context where blocking is not allowed"), surfacing as RustPanic to the awaiting caller. Neither suite caught it because the merge stays green. - PyRuntime::drop: also use shutdown_background() when Handle::try_current().is_ok(), keeping #2009's deterministic join_without_gil only on the normal Python-thread path. - Regression test: drop a Bash while an async-callback execute() is in flight, then await it (test_teardown_determinism.py). Verified it panics without the fix and passes with it. - Document and cover the third with_fs dispatch path: a Static handle used from a thread with no tokio context (async callbacks under execute_sync's private loop / await execute's caller loop) falls through to self.rt.block_on. Adds execute_sync + async-callback + ctx.fs tests (test_jupyter_compat.py). - Docs: note that a stashed ctx.fs handle outlives the Bash and keeps the VFS + runtime alive (specs + .pyi), and the in-tokio-context drop handoff in the teardown-determinism section.
1 parent 4b382a8 commit 323c840

5 files changed

Lines changed: 124 additions & 5 deletions

File tree

crates/bashkit-python/bashkit/_bashkit.pyi

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -95,7 +95,11 @@ class BuiltinContext:
9595
files created by earlier commands and writes are visible to later
9696
ones. Same API as ``Bash.fs()`` — e.g. ``ctx.fs.read_file(path)``
9797
returns ``bytes``. Operates directly on the interpreter's VFS, so
98-
it is safe to use from inside the callback.
98+
it is safe to use from inside the callback. Ops are synchronous and,
99+
while a runtime is active, may run off-thread (relatively expensive
100+
per call), so batch filesystem work rather than looping over many
101+
tiny ops. May be retained past the callback: a stashed handle keeps
102+
the VFS (and its runtime) alive even after the ``Bash`` is dropped.
99103
"""
100104

101105
name: str

crates/bashkit-python/src/lib.rs

Lines changed: 33 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -238,7 +238,12 @@ fn join_without_gil<F: FnOnce() + Send>(f: F) {
238238
/// hang in `test_async_callback_execute_sync_honors_timeout`). Once the
239239
/// interpreter is exiting, joining is unsafe (threads attaching during
240240
/// finalization abort the process on CPython < 3.13), so the drop falls
241-
/// back to `shutdown_background()` and the OS reclaims resources.
241+
/// back to `shutdown_background()` and the OS reclaims resources. The same
242+
/// hands-off path is taken when the last clone drops *inside* a tokio context
243+
/// (e.g. a `Bash` dropped mid-`await execute()` finishes on a runtime worker
244+
/// thread, dropping its custom-builtin adapters' `PyRuntime` clones there): a
245+
/// blocking runtime drop in that context panics, so `shutdown_background()`
246+
/// is used instead.
242247
struct PyRuntime(Option<Arc<Runtime>>);
243248

244249
impl Clone for PyRuntime {
@@ -259,7 +264,21 @@ impl Drop for PyRuntime {
259264
let Some(rt) = self.0.take().and_then(Arc::into_inner) else {
260265
return;
261266
};
262-
if interpreter_at_exit() {
267+
// Two cases must avoid the blocking join:
268+
// - `interpreter_at_exit()`: joining threads that may re-attach is
269+
// unsafe during finalization (see the type doc).
270+
// - `Handle::try_current().is_ok()`: the last clone is dropping
271+
// *inside* a tokio context. This happens when a `Bash` is dropped
272+
// while `await execute()` is still in flight: the future holds the
273+
// last `Arc<Mutex<Bash>>` and, on completion, drops it on a
274+
// pyo3-async-runtimes worker thread — cascading into the
275+
// custom-builtin adapters that hold `PyRuntime` clones. Dropping a
276+
// runtime (a blocking join) from within another runtime panics
277+
// with "Cannot drop a runtime in a context where blocking is not
278+
// allowed", so hand off to `shutdown_background()` instead.
279+
// The normal path — last clone dropping on a Python thread with no
280+
// tokio context — keeps #2009's deterministic `join_without_gil`.
281+
if interpreter_at_exit() || Handle::try_current().is_ok() {
263282
rt.shutdown_background();
264283
} else {
265284
join_without_gil(move || drop(rt));
@@ -1634,6 +1653,18 @@ impl PyFileSystem {
16341653
// reused across ops via a channel (note: `block_in_place` is not an
16351654
// option while `make_runtime` builds a current-thread runtime).
16361655
//
1656+
// A `Static` handle used from a thread with *no* tokio context falls
1657+
// through to `self.rt.block_on` below. This is the async-callback case:
1658+
// the callback body runs on an asyncio loop thread — the caller's loop
1659+
// under `await execute()`, the private loop under `execute_sync()` — not
1660+
// a tokio worker, so `Handle::try_current()` is `Err`. Under
1661+
// `execute_sync()` that `block_on` runs on a *second* thread while the
1662+
// main thread holds the current-thread scheduler core; it completes only
1663+
// because tokio's parker fallback polls the future without owning the
1664+
// core. That is fine for VFS ops (they never need the runtime's
1665+
// timer/IO drivers) — but a timer- or IO-dependent fs future added here
1666+
// would stall this path, so keep `inner.resolve()` + fs ops driver-free.
1667+
//
16371668
// `Live` handles keep the original fast path: they lock the interpreter.
16381669
// Re-entrant use of a `Live` handle from inside the shared runtime is
16391670
// unsupported — it re-enters `self.rt.block_on` and panics with the same

crates/bashkit-python/tests/test_jupyter_compat.py

Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -282,3 +282,50 @@ async def rw(ctx: BuiltinContext) -> str:
282282
assert result.exit_code == 0
283283
assert result.stdout == "await\n"
284284
assert captured == [caller_loop]
285+
286+
287+
# ---------------------------------------------------------------------------
288+
# ctx.fs from an *async* callback driven by execute_sync — the third dispatch
289+
# path. The async callback body runs on an asyncio loop thread (the private
290+
# loop, or the background daemon loop under a live loop), which has no tokio
291+
# context, so ctx.fs ops fall through to `self.rt.block_on` rather than the
292+
# worker-thread branch the sync-callback case takes. Sync callbacks +
293+
# execute_sync (worker-thread branch) and async callbacks + await execute
294+
# (caller loop) are covered above; this fills the remaining combination.
295+
# ---------------------------------------------------------------------------
296+
297+
298+
@pytest.mark.parametrize("factory", [Bash, BashTool], ids=["bash", "bash_tool"])
299+
def test_execute_sync_async_ctx_fs_builtin(factory):
300+
"""An async ctx.fs builtin driven by execute_sync() (no running loop) runs on
301+
the private loop thread; ctx.fs resolves via the block_on fallback there."""
302+
303+
async def rw(ctx: BuiltinContext) -> str:
304+
await asyncio.sleep(0) # force a real await so the body runs on the loop
305+
ctx.fs.write_file("/async-sync.txt", b"async-sync\n")
306+
return ctx.fs.read_file("/async-sync.txt").decode()
307+
308+
shell = factory(custom_builtins={"rw": rw})
309+
result = shell.execute_sync("rw")
310+
311+
assert result.exit_code == 0
312+
assert result.stdout == "async-sync\n"
313+
assert shell.fs().read_file("/async-sync.txt").decode() == "async-sync\n"
314+
315+
316+
def test_execute_sync_async_ctx_fs_inside_asyncio_run():
317+
"""Same third path under a live outer loop: execute_sync() inside
318+
asyncio.run() drives the async callback on a background daemon loop, and
319+
ctx.fs works there too."""
320+
321+
async def rw(ctx: BuiltinContext) -> str:
322+
await asyncio.sleep(0)
323+
return ctx.fs.read_file("/seed.txt").decode()
324+
325+
async def jupyter_cell():
326+
bash = Bash(custom_builtins={"rw": rw}, files={"/seed.txt": "seeded\n"})
327+
return bash.execute_sync("rw")
328+
329+
result = asyncio.run(jupyter_cell())
330+
assert result.exit_code == 0
331+
assert result.stdout == "seeded\n"

crates/bashkit-python/tests/test_teardown_determinism.py

Lines changed: 29 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -18,7 +18,7 @@
1818

1919
import pytest
2020

21-
from bashkit import ScriptedTool
21+
from bashkit import Bash, ScriptedTool
2222

2323
PROC_AVAILABLE = os.path.exists("/proc/self/task")
2424

@@ -78,6 +78,34 @@ def test_fds_stable_across_tool_churn():
7878
assert _fd_count() <= baseline
7979

8080

81+
@pytest.mark.asyncio
82+
async def test_drop_bash_while_async_execute_in_flight_does_not_panic():
83+
"""A `Bash` dropped while `await execute()` is in flight must not panic.
84+
85+
The in-flight future holds the last `Arc<Mutex<Bash>>`; when it completes on
86+
a pyo3-async-runtimes worker thread, the `Bash` — and its custom-builtin
87+
adapters, which each hold a `PyRuntime` clone — drop *there*, so the last
88+
`PyRuntime` clone drops inside a tokio context. `PyRuntime::drop` must hand
89+
off to `shutdown_background()` rather than a blocking runtime join, which in
90+
that context panics ("Cannot drop a runtime in a context where blocking is
91+
not allowed") and surfaces as `pyo3_async_runtimes.RustPanic` to the
92+
awaiting caller. Regression for the #2009/#2010 interaction.
93+
"""
94+
95+
async def slow(ctx):
96+
await asyncio.sleep(0.3)
97+
return "done\n"
98+
99+
b = Bash(custom_builtins={"slow": slow})
100+
fut = asyncio.ensure_future(b.execute("slow"))
101+
await asyncio.sleep(0.05) # let execution start; callback is in flight
102+
del b
103+
gc.collect()
104+
result = await fut # RustPanic here before the drop-path fix
105+
assert result.exit_code == 0
106+
assert result.stdout == "done\n"
107+
108+
81109
def test_dropped_tool_cancels_abandoned_callback():
82110
"""Teardown is bounded by cancellation, not callback duration."""
83111
started = threading.Event()

specs/python-package.md

Lines changed: 10 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -309,6 +309,11 @@ calling back into the owning instance's `Bash.fs()` / `Bash.read_file()`, which
309309
is unsupported re-entrancy: it re-enters the interpreter's runtime and panics
310310
with a nested-runtime error (not a deadlock, and not caught by the
311311
`external_handler` reentry guard, which does not fire for custom builtins).
312+
A callback may retain `ctx.fs` beyond the invocation: the handle stays valid
313+
even after the `Bash` instance is dropped, and keeps the underlying VFS and
314+
its tokio runtime alive until the handle itself is released — so stashing it
315+
extends resource lifetime past `del bash` (see the teardown-determinism
316+
notes below).
312317

313318
**Sync callbacks** are called directly under the session's captured `contextvars`
314319
snapshot and may return either a stdout string or a `BuiltinResult` with
@@ -341,7 +346,11 @@ Callbacks that block without awaiting (e.g. `time.sleep` inside `async def`)
341346
cannot be cancelled mid-section; teardown then waits for the current section
342347
to reach an await point or return. At interpreter exit (boundary: an `atexit`
343348
handler registered at module import), teardown goes hands-off — native threads
344-
must not touch a finalizing CPython — and the OS reclaims resources.
349+
must not touch a finalizing CPython — and the OS reclaims resources. The same
350+
hands-off path applies when the last runtime handle is dropped *inside* a tokio
351+
context (a `Bash` dropped while `await execute()` is still in flight finishes on
352+
a runtime worker thread): a blocking runtime join there would panic, so the drop
353+
falls back to `shutdown_background()` instead of the deterministic join.
345354

346355
**ContextVar propagation**: ContextVars set before `execute()` or
347356
`execute_sync()` are captured at call time and replayed inside each callback

0 commit comments

Comments
 (0)