Lockloss at 2025-07-17 04:11 UTC due to a power issue with ETMX and TMSX. Currently in contact with Dave, and Fil is on his way in.
ETMX M0 and R0 watchdogs tripped
ETMX and TMSX OSEMs are in FAULT
ETMX ESD off
ETMX HWWD warned that it would trip soon, so SEI_ETMX was preemptively put into ISI_OFFLINE_HEPI_ON to protect the ISI when the HWWD trips
H1SUSETMX ADC channels zeroed out at 21:11:39 PDT (04:11:39 UTC). The SWWDs did not trip because there is no RMS on the OSEM signals, but the HWWD completed its 20-minute countdown and powered down the three ISI coil drivers at 21:32. This indicates ETMX's top-stage OSEMs have lost power.
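The timing above can be cross-checked with a few lines of Python. This is only a sketch: it assumes the 21:11:39 and 21:32 times are site-local PDT (UTC-7, modeled here as a fixed offset) and that the HWWD countdown started at the moment the ADC channels zeroed.

```python
from datetime import datetime, timedelta, timezone

# Assumption: local times in the log are PDT (UTC-7); a fixed offset is used for simplicity.
PDT = timezone(timedelta(hours=-7), "PDT")

# Time the H1SUSETMX ADC channels zeroed out (local), taken from the log entry.
adc_zeroed = datetime(2025, 7, 16, 21, 11, 39, tzinfo=PDT)

# HWWD countdown length stated in the log entry.
hwwd_countdown = timedelta(minutes=20)

# Convert to UTC to compare with the reported lockloss time (04:11 UTC).
adc_zeroed_utc = adc_zeroed.astimezone(timezone.utc)

# Expected coil-driver power-down time if the countdown began when the ADCs zeroed.
expected_trip = adc_zeroed + hwwd_countdown

print("ADC zeroed (UTC):  ", adc_zeroed_utc.isoformat())
print("Expected HWWD trip:", expected_trip.strftime("%H:%M:%S %Z"))
```

Under these assumptions the expected trip lands within a minute of the reported 21:32 power-down, and the UTC time matches the lockloss time.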
I've opened WP12692 to cover Fil going to EX to investigate.
During the recovery, the +24VDC power supply for the SUS IO Chassis was glitched, which stopped all the h1susex and h1susauxex models. To recover, I first did a straightforward reboot of h1susauxex (no Dolphin); it came back with no issues.
Rebooting h1susex was more involved. Remember that the EX Dolphin switch was damaged by the 06 April 2025 power outage and has no network control. The procedure I used to reboot h1susex was:
When h1susex came back, I verified that all the IO Chassis cards were present (they were).
I unpaused the SEI and ISC IPC by writing a 0 to their IPC_PAUSE channels.
The HWWD came back in nominal state.
I reset the SUS SWWD DACKILLs and unbypassed the SEI SWWD.
I ran DIAG_RESET to clear all the IPC errors and the DAQ CRCs; both cleared.
Handed systems over to control room (Oli and Ryan S).
From Fil:
-18VDC Power supply had failed and was replaced.
Power supply is in rack VDD-2, location U25-U28, right-hand supply, label [SUS-C1 C2]
old supply (removed) S1202024
new supply (installed) S1300288
Last night's HWWD sequence is shown below. As a reminder, at +40 min the SUS part of the HWWD trips, which sets bit 2 of the STAT. This opens internal relay switches, but since we don't route the SUS drives through the HWWD unit (too noisy), this has no effect on operations. The delay between 22:52 and 23:20 is because h1iopsusex was down between 23:01 and 23:20.
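For anyone decoding the STAT word by hand, a minimal sketch of the bit test follows. It assumes bits are counted from 0 (so bit 2 has value 4) and that STAT is read as a plain integer. The function name and the example values are hypothetical, not taken from the HWWD documentation.

```python
# Assumption: bits are numbered from 0, so "bit 2" corresponds to the value 4.
SUS_TRIP_BIT = 2


def sus_part_tripped(stat: int) -> bool:
    """Return True if the SUS-trip bit is set in an integer STAT word."""
    return bool(stat & (1 << SUS_TRIP_BIT))


# Hypothetical STAT values, for illustration only.
for stat in (0b000, 0b011, 0b100, 0b111):
    print(f"STAT={stat:03b} -> SUS part tripped: {sus_part_tripped(stat)}")
```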
FRS34786
Fan motor seized on failed power supply.
Wed16Jul2025
LOC TIME HOSTNAME MODEL/REBOOT
23:15:13 h1susauxex h1iopsusauxex
23:15:26 h1susauxex h1susauxex
23:20:21 h1susex h1iopsusex
23:20:34 h1susex h1susetmx
23:20:47 h1susex h1sustmsx
23:21:00 h1susex h1susetmxpi
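If the reboot table needs to be summarized per front end, it can be parsed with a short script. This is a sketch that assumes each data row is exactly three whitespace-separated fields (local time, hostname, model); the function name is made up for illustration.

```python
from collections import defaultdict

# Data rows copied from the reboot table above (local time, hostname, model).
REBOOT_LOG = """\
23:15:13 h1susauxex h1iopsusauxex
23:15:26 h1susauxex h1susauxex
23:20:21 h1susex h1iopsusex
23:20:34 h1susex h1susetmx
23:20:47 h1susex h1sustmsx
23:21:00 h1susex h1susetmxpi
"""


def parse_reboots(text):
    """Group (time, model) pairs by hostname, keeping the order of the log."""
    by_host = defaultdict(list)
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or header-like lines
        time_str, host, model = parts
        by_host[host].append((time_str, model))
    return dict(by_host)


reboots = parse_reboots(REBOOT_LOG)
for host, entries in reboots.items():
    models = ", ".join(model for _, model in entries)
    print(f"{host}: first start {entries[0][0]}, models: {models}")
```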