Reports until 08:51, Saturday 19 April 2025
H1 CDS
david.barker@LIGO.ORG - posted 08:51, Saturday 19 April 2025 - last comment - 09:26, Saturday 19 April 2025(84008)
h1susauxh2 crash Fri 18 April 2025 23:43:02 PDT

The h1susauxh2 models stopped running at 23:43:02 PDT on Fri 18 April 2025 with an ADC timing error.

I am able to ssh into the machine, and first scans suggest we have lost an ADC in this system (lspci shows only 7 of the 8 ADC cards). We will need to power cycle the IO Chassis before deciding whether an ADC replacement is needed.
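The lspci check above could be scripted roughly as follows. This is a minimal sketch, not the procedure used on the night: the helper names and the `pattern` argument are hypothetical, and the regex that identifies an ADC card in the lspci output has to be supplied by the operator, because the report does not say how the cards are listed.

```python
import re
import subprocess

# From the report: 8 ADC cards are expected in this system.
EXPECTED_ADC_COUNT = 8


def count_matching_devices(lspci_text: str, pattern: str) -> int:
    """Count lspci output lines matching a regex (case-insensitive)."""
    regex = re.compile(pattern, re.IGNORECASE)
    return sum(1 for line in lspci_text.splitlines() if regex.search(line))


def adc_cards_missing(lspci_text: str, pattern: str,
                      expected: int = EXPECTED_ADC_COUNT) -> int:
    """Return how many expected cards are absent from the lspci listing."""
    found = count_matching_devices(lspci_text, pattern)
    return max(expected - found, 0)


def read_lspci() -> str:
    """Run `lspci` on the local machine and return its text output."""
    result = subprocess.run(["lspci"], capture_output=True, text=True, check=True)
    return result.stdout
```

On the front end itself, `adc_cards_missing(read_lspci(), pattern)` would return 1 for the state described here, given a `pattern` that matches the ADC cards and nothing else.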

This is an auxiliary SUS frontend for HAM2, meaning it has ADCs only, so no control function has been lost.

Dmesg:

[Fri Apr 18 23:43:12 2025] rts_cpu_isolator: LIGO code is done, calling regular shutdown code
[Fri Apr 18 23:43:12 2025] h1iopsusauxh2: ERROR - An ADC timeout error has been detected, waiting for an exit signal.
[Fri Apr 18 23:43:12 2025] h1susauxh2: ERROR - An ADC timeout error has been detected, waiting for an exit signal.
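For reference, lines like the ones above can be pulled out of a dmesg dump with a short script. This is a sketch that assumes the human-readable timestamp format shown in this report and matches only the exact error text quoted here; the names `AdcTimeout` and `find_adc_timeouts` are hypothetical.

```python
import re
from datetime import datetime
from typing import List, NamedTuple


class AdcTimeout(NamedTuple):
    when: datetime   # naive local time, as printed by dmesg
    model: str       # name of the model that logged the error


# Matches "[<timestamp>] <model>: ERROR - An ADC timeout error has been detected..."
_ADC_TIMEOUT = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<model>\S+): "
    r"ERROR - An ADC timeout error has been detected"
)


def find_adc_timeouts(dmesg_text: str) -> List[AdcTimeout]:
    """Return one entry per ADC timeout line found in the dmesg text."""
    events = []
    for line in dmesg_text.splitlines():
        match = _ADC_TIMEOUT.match(line.strip())
        if match:
            when = datetime.strptime(match.group("ts"), "%a %b %d %H:%M:%S %Y")
            events.append(AdcTimeout(when, match.group("model")))
    return events
```

Run against the three lines above, this reports the IOP model and the user model, both at 23:43:12, and skips the rts_cpu_isolator line.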
 

Comments related to this report
david.barker@LIGO.ORG - 09:15, Saturday 19 April 2025 (84009)

h1susauxh2 is running again, no hardware issues.

When opening an FRS ticket for this, I found a similar one from 20 April 2019 (FRS12775), at which time a reboot of the computer fixed it. At 09:02 I stopped the models and powered down h1susauxh2 from the command line. After a minute I powered it back up using IPMI. All 8 ADC cards are visible, and the models started with no problems.
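The IPMI power-up step could be wrapped along these lines. This is only a sketch: the report does not say which IPMI client or options were used, so it assumes the standard ipmitool CLI over the lanplus interface, with the password taken from the IPMI_PASSWORD environment variable via `-E`. The function name, BMC host, and user are hypothetical placeholders.

```python
from typing import List

# Subset of `ipmitool chassis power` subcommands used in this sketch.
VALID_ACTIONS = {"status", "on", "off", "cycle"}


def ipmi_power_command(bmc_host: str, user: str, action: str) -> List[str]:
    """Build (but do not run) an ipmitool chassis power command."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported power action: {action!r}")
    return [
        "ipmitool",
        "-I", "lanplus",   # IPMI v2.0 over LAN
        "-H", bmc_host,    # management controller address (placeholder)
        "-U", user,        # IPMI user name (placeholder)
        "-E",              # read the password from IPMI_PASSWORD
        "chassis", "power", action,
    ]
```

The returned list can be passed to `subprocess.run` once the host and credentials are filled in. Building the command separately from running it makes it easy to inspect before anything is power cycled.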

david.barker@LIGO.ORG - 09:26, Saturday 19 April 2025 (84010)

FRS for today's issue: FRS33903