Reports until 17:47, Thursday 14 January 2016
H1 SUS (CDS)
david.barker@LIGO.ORG - posted 17:47, Thursday 14 January 2016 - last comment - 15:42, Friday 15 January 2016(24950)
SUS watchdog resets stopped working this afternoon

Jeff K., Jeff B., Jenne, TJ, Jim, Dave:

Around 4pm PST TJ reported that OMC had tripped and the watchdog could not be untripped. Jeff K. recommended a model restart. Unfortunately, due to a communication problem, we first mistakenly restarted the OMC model on the LSC front end (sorry OMC). Then we restarted the correct SUS-OMC model on SUSH56. This did not fix it. We then restarted all the models on SUSH56 (including the IOP). This did not fix it either. We then stopped all models and started only the IOP and SUS-SRM to do further debugging. (In the meantime, the SWWD on the IOP had tripped SEI for HAM5 and HAM6.)

After some debugging we found that the PERL script sus/common/scripts/wdreset_all.pl was throwing an error about not finding the PERL CA LIBRARY. Jim tracked this down to a missing CaTools.pm perl module in the userapps/release/guardian directory. It turns out this file was removed from the SVN repository way back on 2nd March 2015, and the LHO working directory was only updated this afternoon by Jenne and TJ. This ties in nicely with the watchdog resets working last night but not this afternoon.

In the meantime, we had manually reset the watchdogs for SUS-SRM/SR3/OMC and SEI HAM5,6 and set the SDF back to OBSERVE for SUSH56IOP, SUSSRM/SR3/OMC and OMC.

For now we have manually copied the CaTools.pm file into userapps/release/sus/common/scripts to get the watchdog reset script working again. 

This raises an FRS:

A perl module which is used by the watchdog systems has been deprecated. The watchdog system should be changed to no longer use PERL and instead use PYTHON (or perhaps BASH for exceptionally simple scripts).
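
As a rough illustration of what a Python replacement for wdreset_all.pl might look like, here is a minimal sketch. Everything in it is an assumption for illustration: the `reset_watchdogs` helper, the placeholder channel names, and the reset value of 1 are hypothetical and are not taken from the existing script. The channel writer is passed in as a function so the logic can be exercised without a live EPICS connection; in practice something like pyepics' `caput` could play that role.

```python
from typing import Callable, Iterable, List

# Placeholder channel names for illustration only (hypothetical, not real channels).
EXAMPLE_RESET_CHANNELS = [
    "X1:SUS-EXAMPLE1_WD_RESET",
    "X1:SUS-EXAMPLE2_WD_RESET",
]


def reset_watchdogs(
    channels: Iterable[str],
    write: Callable[[str, int], object],
    reset_value: int = 1,  # assumed reset value, not confirmed by the source
) -> List[str]:
    """Write reset_value to each channel and return the channels whose write raised."""
    failed = []
    for name in channels:
        try:
            write(name, reset_value)
        except Exception:
            # Keep going so one bad channel does not block the remaining resets.
            failed.append(name)
    return failed
```

Returning the list of failed channels, rather than stopping at the first error, means a single unreachable channel would be reported without preventing the others from being reset.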

Comments related to this report
david.barker@LIGO.ORG - 17:51, Thursday 14 January 2016 (24951)
stuart.aston@LIGO.ORG - 07:06, Friday 15 January 2016 (24962)CDS
We experienced a seemingly identical occurrence of this issue at LLO last Wednesday (see LLO aLOG entry 24156). However, as well as the SUS/SEI Watchdog reset scripts, our initial alignment script was also affected, since it has Perl dependencies.

It is still unknown how the symbolic link to CaTools.pm became broken at LLO, see #4180.
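
A broken link of this kind could in principle be caught right after an SVN update by scanning the working copy for dangling symbolic links. The following is a minimal sketch using only the Python standard library; the `find_broken_symlinks` helper name is hypothetical, and the directory to scan is left to the caller.

```python
import os
from typing import List


def find_broken_symlinks(root: str) -> List[str]:
    """Return sorted paths under root that are symlinks whose target does not exist."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Dangling links are listed among the file names; links to
        # directories are listed among the directory names.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() inspects the link itself; exists() follows it to the target.
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return sorted(broken)
```

Running this over the scripts directories after an update and alerting on a non-empty result would flag a missing CaTools.pm target before a watchdog reset is needed.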
thomas.shaffer@LIGO.ORG - 15:22, Friday 15 January 2016 (24974)

Stuart, it was broken because I updated the same folder when I was visiting LLO. I am at fault for the CaTools.pm links being broken at both sites, though I had no idea that simply updating the SVN could cause this.

stuart.aston@LIGO.ORG - 15:42, Friday 15 January 2016 (24975)
Thanks for shedding light on this mystery! I would suspect that svn'ing up pushed the changes to deprecate the Perl module sooner than was intended.