Fix parentless spawning. by hjoliver · Pull Request #7237 · cylc/cylc-flow

hjoliver · 2026-03-17T23:47:07Z

Premature shutdown or stall can result if a parented instance of a sometimes parentless task ends up at the runahead limit.

See #7235 (comment)

The correct way to do this is: for any task that is parentless in one or more recurrences, always spawn the next parentless instance - which may lie beyond multiple parented instances and/or beyond the runahead limit.

In effect, a parentless instance should always spawn the next parentless instance, at runahead release time.
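To make the rule concrete, here is a minimal sketch assuming integer cycling and a pre-computed, sorted list of the cycle points at which a task is parentless. The helper `next_parentless_point` and the variable names are hypothetical and are not part of cylc-flow. It only shows that the next parentless instance is looked up directly, skipping any parented instances in between and ignoring the runahead limit.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def next_parentless_point(
    parentless_points: Sequence[int], current: int
) -> Optional[int]:
    """Return the first parentless cycle point strictly after `current`.

    `parentless_points` must be sorted ascending (toy assumption).
    Returns None if there is no later parentless instance.
    """
    idx = bisect_right(parentless_points, current)
    if idx < len(parentless_points):
        return parentless_points[idx]
    return None


# Toy task: parentless at cycles 1, 4 and 7; parented at all other cycles.
parentless_points = [1, 4, 7]
runahead_limit = 3  # hypothetical current limit

nxt = next_parentless_point(parentless_points, 1)
# The next parentless instance may lie beyond the runahead limit;
# it is still the one that gets spawned.
beyond_limit = nxt is not None and nxt > runahead_limit
```

In this toy case the instance at cycle 1 spawns the instance at cycle 4, even though cycles 2 and 3 are parented and cycle 4 is past the limit.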

NOTE my first attempt at this bug fix ran into trouble because the task pool spawning logic has gradually become too complicated to follow easily, so I bit the bullet and tried to rethink it.

As a result, this PR significantly simplifies the scheduler core. E.g. calls to compute_runahead() in the code drop from ~10 to ~1, and git diff master cylc/flow reports 114 insertions(+), 218 deletions(-).

On master: whenever anything spawns a task, the task pool recomputes the runahead limit, then spawns, releases, and queues any parentless instances of that task all the way to the limit (recursively, within a single main loop iteration).

On this branch: the task pool only spawns the one instance, not downstream consequences of it. The main loop then computes the limit, releases instances below the limit, and (on release) spawns the single next parentless instance (if there is one). ~~However, I do a single spawn-to-rh-limit at startup (not necessary, but many current integration tests expect that).~~
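The branch behaviour can be illustrated with a toy model. This is a sketch under stated assumptions, not the cylc-flow implementation: `ToyPool`, its attributes, and `main_loop_iteration` are hypothetical, it models a single task that is parentless at every integer cycle, and it treats "below the limit" as "at or below". The method names `compute_runahead` and `release_runahead_tasks` are borrowed from the diff in this PR, but their bodies here are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ToyPool:
    """Toy stand-in for a task pool (hypothetical, integer cycling)."""

    runahead_width: int = 2
    # Spawned but not yet released (waiting on the runahead limit).
    held_back: Set[int] = field(default_factory=lambda: {1})
    # Released to run, in release order.
    released: List[int] = field(default_factory=list)
    finished: Set[int] = field(default_factory=set)

    def compute_runahead(self) -> int:
        """Limit = oldest unfinished cycle + width (toy rule)."""
        unfinished = [p for p in self.released if p not in self.finished]
        base = min(unfinished) if unfinished else min(self.held_back)
        return base + self.runahead_width

    def release_runahead_tasks(self, limit: int) -> None:
        """Release instances at or below the limit.

        Each release spawns exactly one successor. The successor is not
        released in the same call, so there is no recursion.
        """
        for point in sorted(self.held_back):  # snapshot, safe to mutate
            if point <= limit:
                self.held_back.discard(point)
                self.released.append(point)
                self.held_back.add(point + 1)


def main_loop_iteration(pool: ToyPool) -> None:
    """One iteration: compute the limit once, then release below it."""
    limit = pool.compute_runahead()
    pool.release_runahead_tasks(limit)


pool = ToyPool()
for _ in range(4):
    main_loop_iteration(pool)
```

Each iteration advances the pool by at most one cycle: after four iterations cycles 1 to 3 are released and cycle 4 is spawned but held back at the limit.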

Note zero functional tests broke despite the many changes to the scheduler guts here.

The only consequences to be aware of are:

If the runahead limit suddenly jumps multiple cycles ahead, the workflow will spawn out to it at one cycle per main loop iteration instead of immediately. (This really doesn't matter, but I could put a single spawn-to-rh-limit in the main loop if desired.)
Integration tests need to await schd._main_loop() and/or schd.pool.spawn_to_runahead_limit() before checking downstream consequences (i.e. beyond the immediate spawn) of operations such as trigger and set. (This actually didn't break very many existing integration tests, and they were easily fixed.)
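The testing consequence can be sketched with a toy asyncio example. Everything here is hypothetical: `ToyScheduler`, `trigger`, and `main_loop` are illustrative stand-ins and do not reproduce the real cylc-flow scheduler or its test fixtures. The point is only the ordering: an operation has its immediate effect, and downstream effects appear after a main loop iteration has been awaited.

```python
import asyncio
from typing import List


class ToyScheduler:
    """Hypothetical stand-in, not the cylc-flow Scheduler."""

    def __init__(self) -> None:
        self.spawned: List[int] = []     # direct result of an operation
        self.downstream: List[int] = []  # produced by the main loop

    def trigger(self, point: int) -> None:
        # Immediate effect only: spawn the one requested instance.
        self.spawned.append(point)

    async def main_loop(self) -> None:
        # One iteration: follow up each spawned instance with its
        # single next instance (toy rule: next integer cycle).
        await asyncio.sleep(0)
        for point in self.spawned:
            if point + 1 not in self.downstream:
                self.downstream.append(point + 1)


async def check_trigger() -> List[int]:
    schd = ToyScheduler()
    schd.trigger(1)
    # Before the main loop runs, nothing downstream exists yet.
    assert schd.downstream == []
    # Await one iteration before checking downstream consequences.
    await schd.main_loop()
    return schd.downstream


result = asyncio.run(check_trigger())
```

A test that asserts on downstream state straight after `trigger` would see an empty list in this model, which mirrors why affected integration tests needed an extra await.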

Check List

I have read CONTRIBUTING.md and added my name as a Code Contributor.
Contains logically grouped changes (else tidy your branch by rebase).
Does not contain off-topic changes (use other PRs for other changes).
Applied any dependency changes to both setup.cfg (and conda-environment.yml if present).
Tests are included (or explain why tests are not needed).
Changelog entry included if this is a change that can affect users
Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

oliver-sanders · 2026-04-14T12:35:28Z

Branch base is master, but milestone is 8.6.x.

This changes some quite fundamental stuff, so it might make sense to leave this on master?

hjoliver · 2026-04-15T02:19:56Z

Yeah, that's why I put it on master initially, but I'm not entirely sure - it still fixes an important bug and does not add any new features.

(And stating the obvious, but the fundamental stuff needed changing because over time it had become a right mess).

oliver-sanders · 2026-04-15T08:21:52Z

Ok, will continue reviewing against master for now...

MetRonnie · 2026-05-20T10:47:46Z


+        self.pool.compute_runahead()
+        self.pool.release_runahead_tasks()
+        await self.workflow_shutdown()


Perhaps take this opportunity to rename workflow_shutdown to e.g. set_stop_mode

Can we punt that as off-topic? From a quick look it doesn't just set the stop mode, it might also shut the scheduler down.

It's a little confusing that we would attempt shutdown so soon after the start of main loop... Could you explain why this has been moved here?

At the least, I think a comment would be useful:

Suggested change

-        await self.workflow_shutdown()
+        # If applicable, set stop mode or shutdown on task failure:
+        await self.workflow_shutdown()

Co-authored-by: Ronnie Dutta <61982285+MetRonnie@users.noreply.github.com>

hjoliver · 2026-06-08T04:50:54Z

All review comments addressed @MetRonnie

hjoliver marked this pull request as draft March 18, 2026 01:11

hjoliver self-assigned this Mar 18, 2026

hjoliver added this to the 8.6.x milestone Mar 18, 2026

hjoliver added the bug Something is wrong :( label Mar 18, 2026

hjoliver mentioned this pull request Mar 18, 2026

mixed parentless/non-parentless task cause premature shutdown #5730

Open

hjoliver force-pushed the fix-parentless-spawning branch 4 times, most recently from a485c53 to b39e758 April 12, 2026 09:14

Fix parentless spawning.

0dcfc96

hjoliver force-pushed the fix-parentless-spawning branch from b39e758 to 0dcfc96 April 12, 2026 20:50

Simplify previous.

fa6accf

hjoliver force-pushed the fix-parentless-spawning branch from 5b3b179 to fa6accf April 12, 2026 22:52

hjoliver marked this pull request as ready for review April 12, 2026 23:40

hjoliver added 2 commits April 14, 2026 09:51

tweak main loop

b7bb55d

Added a new integration test, and change log entry.

81119e7

hjoliver mentioned this pull request Apr 24, 2026

Add future final-incomplete tasks to n=0 for visibility. #7248

Draft

8 tasks

oliver-sanders requested review from MetRonnie and oliver-sanders May 6, 2026 10:14

MetRonnie reviewed May 20, 2026

MetRonnie modified the milestones: 8.6.x, 8.7.0 May 20, 2026

MetRonnie self-requested a review May 20, 2026 13:36

MetRonnie reviewed May 20, 2026

Comment thread cylc/flow/scheduler.py Outdated

MetRonnie self-requested a review May 20, 2026 13:53

hjoliver and others added 2 commits June 8, 2026 16:15

Apply suggestion from @MetRonnie

ce20af3

Co-authored-by: Ronnie Dutta <61982285+MetRonnie@users.noreply.github.com>

Apply suggestion from @MetRonnie

d9dff26

Co-authored-by: Ronnie Dutta <61982285+MetRonnie@users.noreply.github.com>

hjoliver and others added 2 commits June 8, 2026 16:24

Update cylc/flow/task_pool.py

584fa86

Co-authored-by: Ronnie Dutta <61982285+MetRonnie@users.noreply.github.com>

Address review comments.

d189ba1

hjoliver wants to merge 8 commits into cylc:master from hjoliver:fix-parentless-spawning
