[KYUUBI #XXXX] Do not block server shutdown on batch sessions#7498

Closed
turboFei wants to merge 2 commits into apache:master from turboFei:donot_close_batch

Conversation

turboFei (Member) commented Jun 4, 2026

Why

ServiceDiscovery.stopGracefully waits for getActiveUserSessionCount to reach zero before shutting down. This count includes batch sessions, which are just handles to jobs running on external clusters (YARN/K8s). Those jobs can run for hours or days, so the server can hang indefinitely on shutdown.

What

Introduces a closeOnServerStop flag on the Session trait (default true — fully backward-compatible with all existing session types).

KyuubiBatchSession overrides it to false. ServiceDiscovery.stopGracefully now waits only for sessions where closeOnServerStop = true, so batch sessions no longer block server shutdown.

Changes

  • Session trait: add closeOnServerStop: Boolean = true
  • KyuubiBatchSession: override to false
  • ServiceDiscovery.stopGracefully: switch from getActiveUserSessionCount to allSessions().count(_.closeOnServerStop)
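
To make the intent of these changes concrete, here is a minimal, self-contained sketch of the idea. It is not the actual Kyuubi code: `InteractiveSession`, `BatchSession`, `ShutdownSketch`, `blockingSessionCount`, `waitForSessions`, and the `pollIntervalMs` / `maxPolls` parameters are hypothetical stand-ins, and the real `Session` trait and `ServiceDiscovery.stopGracefully` have more members and different signatures. Only the `closeOnServerStop` flag and the `count(_.closeOnServerStop)` filter come from this PR.

```scala
// Hypothetical, simplified stand-ins for the Kyuubi types named in this PR.
trait Session {
  def user: String
  // Sessions block graceful shutdown unless they opt out (default: true).
  def closeOnServerStop: Boolean = true
}

// Stand-in for any existing session type; inherits the default.
final case class InteractiveSession(user: String) extends Session

// Stand-in for KyuubiBatchSession.
final case class BatchSession(user: String) extends Session {
  // A batch session is only a handle to a job on an external cluster,
  // so it should not hold up server shutdown.
  override def closeOnServerStop: Boolean = false
}

object ShutdownSketch {
  // Number of sessions the shutdown loop still has to wait for.
  def blockingSessionCount(sessions: Iterable[Session]): Int =
    sessions.count(_.closeOnServerStop)

  // Polls until no blocking session remains or the poll budget runs out.
  // Returns true if no blocking session is left (the server may stop).
  // The poll budget is an assumption added so the sketch always terminates.
  def waitForSessions(
      currentSessions: () => Iterable[Session],
      pollIntervalMs: Long,
      maxPolls: Int): Boolean = {
    var polls = 0
    while (blockingSessionCount(currentSessions()) > 0 && polls < maxPolls) {
      Thread.sleep(pollIntervalMs)
      polls += 1
    }
    blockingSessionCount(currentSessions()) == 0
  }
}
```

With only `BatchSession` instances active, `blockingSessionCount` is zero and `waitForSessions` returns immediately. As soon as one `InteractiveSession` is present, the loop keeps waiting, which matches the two shutdown behaviours listed in the test plan.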

Test plan

  • Server shuts down promptly when only batch sessions are active
  • Server still waits for active interactive sessions before shutting down
  • Batch jobs continue running on cluster after server restart

Batch sessions are handles to jobs submitted to external clusters
(YARN/K8s). The actual computation runs on the cluster and continues
regardless of Kyuubi server state. Including batch sessions in the
shutdown wait loop caused the server to hang indefinitely when long-
running batch jobs were active.

Introduces `closeOnServerStop` on the Session trait (default: true,
backward-compatible). KyuubiBatchSession overrides it to false.
ServiceDiscovery.stopGracefully now waits only for sessions where
closeOnServerStop is true.
@turboFei turboFei closed this Jun 6, 2026
@turboFei turboFei deleted the donot_close_batch branch June 6, 2026 05:28