refactor: split machine state handler into per-state modules#725
refactor: split machine state handler into per-state modules#725chet wants to merge 1 commit intoNVIDIA:mainfrom
Conversation
Sooo, I'm not sure how we feel about this, but it's come up before, and it's been in the back of my mind, and after restructuring the entire `admin-cli` crate, I wanted to poke at this and see what the vibe was.
If we hate it, I can close it!
In this case the machine state handler was over 10k lines long -- it was/is a single file with the entire machine lifestyle all in ther (all four sub-handlers, firmware upgrade logic, shared functions, etc). In a sense, it's *easy* to grep through, because you can just jump around the file, but at the same time, it has also grown so much that it has become hard to look at as well. It's growth is mainly a testament to its success, fwiw.
So, this refactor splits it into a module-per-state structure. The idea is that the module path tells you exactly what state you're looking at (which I think will help with discovery), e.g.
```
managed_host_state::verify_rms_membership::handle(...)
managed_host_state::waiting_for_cleanup::handle(...)
managed_host_state::failed::handle(...)
```
Each state enum gets its own directory, each variant gets its own file with a `.handle()` function, and `handler.rs` is now a < 1k line "dispatch" hub that reads like a table of contents.
New structure:
```
handler.rs
handler/
managed_host_state/ ManagedHostState::
dpu_machine_state/ DpuDiscoveringState + DpuInitState
host_machine_state/ MachineState::*
instance_state/ InstanceState::*
host_reprovision_state/ HostReprovisionState::*
common.rs Shared utility functions.
```
There were **zero logic changes** with this -- it's purely a structural refactor. All ~1100 tests still pass as-is; I set out with the intent that all existing tests should continue to pass to consider success.
Signed-off-by: Chet Nichols III <chetn@nvidia.com>
🔐 TruffleHog Secret Scan✅ No secrets or credentials found! Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉 🕐 Last updated: 2026-03-25 22:19:26 UTC | Commit: dc973d4 |
|
I never really thought about the file path being related to a state. typically we've been breaking it up by component, like machine validation states are all in machine_validation.rs this is going to blow peoples minds |
Lol @wminckler. I'm not sure if that's meant as a good or bad thing. I'm also not sure how people feel about this in general. CamelCase <-> snake_case is a natural mapping for me so this seems like a sane way to organize, if we wanted to. |
Description
Sooo, I'm not sure how we feel about this, but it's come up before, and it's been in the back of my mind, and after restructuring the entire
admin-clicrate, I wanted to poke at this and see what the vibe was.If we hate it, I can close it!
In this case the machine state handler was over 10k lines long -- it was/is a single file with the entire machine lifestyle all in ther (all four sub-handlers, firmware upgrade logic, shared functions, etc). In a sense, it's easy to grep through, because you can just jump around the file, but at the same time, it has also grown so much that it has become hard to look at as well. It's growth is mainly a testament to its success, fwiw.
So, this refactor splits it into a module-per-state structure. The idea is that the module path tells you exactly what state you're looking at (which I think will help with discovery), e.g.
Each state enum gets its own directory, each variant gets its own file with a
.handle()function, andhandler.rsis now a < 1k line "dispatch" hub that reads like a table of contents.New structure:
There were zero logic changes with this -- it's purely a structural refactor. All ~1100 tests still pass as-is; I set out with the intent that all existing tests should continue to pass to consider success.
Signed-off-by: Chet Nichols III chetn@nvidia.com
Type of Change
Related Issues (Optional)
Breaking Changes
Testing
Additional Notes
No testing required on purpose. The intent was that all ~1100 existing tests should pass.