TJ, Jonathan, EJ, Dave:
Around 01:30 this morning we had a Dolphin crash of all the frontends at EY (h1susey, h1seiey, h1iscey). h1susauxey is not on the Dolphin network and so was not impacted.
We could not ping these machines, but were able to get some diagnostics from their IPMI management ports.
At 07:57 we powered down h1[sus,sei,isc]ey for about a minute and then powered them back on.
We checked that the IX Dolphin switch at EY was responsive on the network.
All the systems came back with no issues. SWWD and model WDs were cleared. TJ is recovering H1.
Crash time: 01:43:47 PDT
FRS31922
Reboot/Restart LOG:
Mon26Aug2024
LOC TIME HOSTNAME MODEL/REBOOT
07:59:27 h1susey ***REBOOT***
07:59:30 h1seiey ***REBOOT***
08:00:04 h1iscey ***REBOOT***
08:01:04 h1seiey h1iopseiey
08:01:17 h1seiey h1hpietmy
08:01:30 h1seiey h1isietmy
08:01:32 h1susey h1iopsusey
08:01:45 h1susey h1susetmy
08:01:47 h1iscey h1iopiscey
08:01:58 h1susey h1sustmsy
08:02:00 h1iscey h1pemey
08:02:11 h1susey h1susetmypi
08:02:13 h1iscey h1iscey
08:02:26 h1iscey h1caley
08:02:39 h1iscey h1alsey
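For anyone wanting to tabulate the restart log above, here is a minimal Python sketch. It assumes each entry is a whitespace-separated "time hostname entry" line as shown, and the function name summarize_restarts is hypothetical, not part of any CDS tool. It counts the models started on each host and the seconds from the host reboot to its last model start.

```python
from datetime import datetime

# Entries copied from the restart log above.
LOG = """\
07:59:27 h1susey ***REBOOT***
07:59:30 h1seiey ***REBOOT***
08:00:04 h1iscey ***REBOOT***
08:01:04 h1seiey h1iopseiey
08:01:17 h1seiey h1hpietmy
08:01:30 h1seiey h1isietmy
08:01:32 h1susey h1iopsusey
08:01:45 h1susey h1susetmy
08:01:47 h1iscey h1iopiscey
08:01:58 h1susey h1sustmsy
08:02:00 h1iscey h1pemey
08:02:11 h1susey h1susetmypi
08:02:13 h1iscey h1iscey
08:02:26 h1iscey h1caley
08:02:39 h1iscey h1alsey
"""

REBOOT_MARK = "***REBOOT***"


def summarize_restarts(text):
    """Return {host: (model_count, seconds_from_reboot_to_last_model)}.

    Hypothetical helper: assumes 'HH:MM:SS host entry' lines, all on one day,
    with the reboot line for a host appearing before its model starts.
    """
    reboot_at = {}
    last_model_at = {}
    model_count = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip headers or blank lines
        stamp, host, entry = parts
        when = datetime.strptime(stamp, "%H:%M:%S")
        if entry == REBOOT_MARK:
            reboot_at[host] = when
        else:
            model_count[host] = model_count.get(host, 0) + 1
            last_model_at[host] = when
    return {
        host: (
            model_count.get(host, 0),
            int((last_model_at[host] - reboot_at[host]).total_seconds()),
        )
        for host in reboot_at
        if host in last_model_at
    }


summary = summarize_restarts(LOG)
for host, (count, seconds) in sorted(summary.items()):
    print(f"{host}: {count} models, {seconds} s from reboot to last model start")
```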
FYI: There was a pending filter module change for h1susetmypi which got installed when this model was restarted this morning.