Add corona profile #64

Merged
tbennun merged 2 commits into LBANN:main from xorJane:corona on Apr 1, 2026

Conversation

@xorJane xorJane commented Mar 17, 2026

Contributor

I was looking at launch vs. flux run performance using a small distributed training example and noticed Corona didn't have a system profile. Absent a profile, hpc-launcher falls back to a generic AMD configuration on Corona, and in my tests launch runs did not outperform flux run.

This PR adds a Corona system profile (drafted with Codex help). With this profile, my example improves from ~11 seconds/epoch with flux run to ~8 seconds/epoch with launch. The launch vs flux run gap is still smaller than what I see on El Cap systems, so there may be additional Corona-specific tuning opportunities; feedback on other settings to try is welcome! @tbennun @bvanessen
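As a rough sanity check on the figures above: both timings are approximate per-epoch wall-clock numbers, so the result is only indicative. Going from ~11 s to ~8 s works out to roughly a 1.4x speedup, or about 27% less time per epoch. A small sketch of the arithmetic, with illustrative function names that are not part of hpc-launcher:

```python
def speedup(baseline_s: float, improved_s: float) -> float:
    """Ratio of baseline time to improved time (higher is better)."""
    return baseline_s / improved_s


def time_reduction_pct(baseline_s: float, improved_s: float) -> float:
    """Percentage of the baseline time that was saved."""
    return 100.0 * (baseline_s - improved_s) / baseline_s


# Approximate seconds per epoch quoted in the comment above.
flux_run_s = 11.0
launch_s = 8.0

print(f"speedup: {speedup(flux_run_s, launch_s):.3f}x")
print(f"time saved per epoch: {time_reduction_pct(flux_run_s, launch_s):.1f}%")
```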

@tbennun tbennun left a comment

Collaborator

minor comment but otherwise looks good to go!

Comment thread hpc_launcher/systems/autodetect.py Outdated
return Sierra(sys)

# Try to find current system via other means
if sys in ("corona"):

Collaborator

Suggested change
- if sys in ("corona"):
+ if sys == "corona":
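
Some background on why this suggestion matters: in Python, `("corona")` is just a parenthesized string, not a one-element tuple (that would be `("corona",)`). So `sys in ("corona")` is a substring test and also matches names such as `"cor"` or the empty string. A minimal sketch of the difference, where the two helper functions are hypothetical stand-ins and `sys` is the system-name string, as in the snippet under review:

```python
# Hypothetical helpers that isolate the two conditions from the review.
# Note: `sys` here is a system-name string, not the standard `sys` module.

def matches_with_in(sys: str) -> bool:
    # ("corona") is a plain str, so `in` performs a substring check.
    return sys in ("corona")


def matches_with_eq(sys: str) -> bool:
    # Exact comparison: only the full name "corona" matches.
    return sys == "corona"


for name in ("corona", "cor", "", "tioga"):
    print(repr(name), matches_with_in(name), matches_with_eq(name))

# The parentheses alone do not create a tuple; the trailing comma does.
print(type(("corona")).__name__)   # str
print(type(("corona",)).__name__)  # tuple
```

If more system names need to share the branch later, a real tuple such as `("corona",)` with additional entries keeps the `in` form correct.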

Contributor Author

Thanks for taking a look! I just made that update. :)

@xorJane xorJane requested a review from tbennun April 1, 2026 19:16

xorJane commented Apr 1, 2026

Contributor Author

Hey Tal, I made the change manually. Let me know if there's anything else I should do, and thanks a lot!

tbennun commented Apr 1, 2026

Collaborator

Thank you! LGTM

@tbennun tbennun merged commit 68bf3d3 into LBANN:main Apr 1, 2026
2 checks passed