Summary
The Cloudbank UMD hub (and potentially the skyline hub) had an outage because its storage was full. We did not receive any alerts for this and only learned of it from users.
| | Time |
|---|---|
| Reported to us | 2026 Apr 23 10:21 AM |
| Reported manually | yes |
| Incident resolved | 2026 Apr 23 10:30 AM |
What happened
Cloudbank's UMD hub ran out of disk space, so students could not save files. This was reported to us via the #dsep-pilot-hubs Slack channel. Luckily, someone was around and watching that channel, and they resized the disk. Our alerts did not fire.
Where we got lucky
The UMD folks were able to tell the Cloudbank classroom rep (Sean Morris), who in turn let us know via Slack. This also happened at a time when we had some engineering capacity in the Pacific timezone, so they were able to fix it.
Action Items
kubelet_volume_stats_capacity_bytes missing (infrastructure#8164)
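
A minimal sketch of how a missing capacity metric could keep a disk-full alert silent. It assumes, which this report does not confirm, that the alert is computed as used bytes divided by capacity bytes. The function name volume_alert, the 0.9 threshold, and the state labels are hypothetical and are not taken from our alerting configuration. The point is that an absent sample gets its own "no_data" state instead of being treated as healthy.

```python
from typing import Optional


def volume_alert(
    used_bytes: Optional[float],
    capacity_bytes: Optional[float],
    threshold: float = 0.9,  # hypothetical threshold, not our real setting
) -> str:
    """Classify one volume from two metric samples that may be absent.

    used_bytes stands in for a sample of kubelet_volume_stats_used_bytes and
    capacity_bytes for a sample of kubelet_volume_stats_capacity_bytes.
    None means the metric was not reported at all.
    """
    # A missing (or nonsensical) sample is reported explicitly. Without this
    # branch the ratio cannot be computed and nothing would ever fire.
    if used_bytes is None or capacity_bytes is None or capacity_bytes <= 0:
        return "no_data"
    if used_bytes / capacity_bytes >= threshold:
        return "disk_nearly_full"
    return "ok"
```

With this shape, a volume that is 95% full but has no capacity sample yields "no_data" rather than "ok", so the gap in monitoring would itself be visible.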