Paul, Dave:
Reference: Cheryl's alog from Monday evening, 3rd July, and Paul's alog from today.
If the psinject (continuous wave calibration) process on h1hwinj1 dies, the monit system on that computer restarts it. It was found that a sudden restart of the excitation could cause issues (loss of lock?), so additional code was added in 2015 to slowly ramp up the CW injection signal on startup (and slowly ramp it down on shutdown).
Full details of the restart procedure are in this alog. To summarize: the gain is zeroed, psinject is started, and after a two-minute wait the gain is set back to unity over a ten-second ramp. The GAIN and TRAMP channels were being monitored by h1calex SDF, so when the restart changed them and SDF reported differences, the DIAG_SDF guardian node took H1 out of observation.
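For reference, the restart sequence summarized above can be sketched roughly as follows. This is only an illustration, not the actual 2015 code: the channel names, the `write` and `start_psinject` callables, and the function name are all hypothetical, and it assumes TRAMP holds the ramp time in seconds and that the front end performs the ramp when GAIN is written.

```python
import time

# Hypothetical channel names, for illustration only (not the real h1calex names).
GAIN_CHANNEL = "CW_INJ_GAIN"
TRAMP_CHANNEL = "CW_INJ_TRAMP"


def restart_with_ramp(write, start_psinject, sleep=time.sleep,
                      settle_s=120.0, ramp_s=10.0):
    """Zero the gain, start psinject, wait, then ramp the gain back to unity.

    write(channel, value) and start_psinject() are supplied by the caller;
    they stand in for whatever channel-write and process-start mechanism
    is really used.
    """
    write(GAIN_CHANNEL, 0.0)      # zero the gain before the excitation starts
    start_psinject()              # bring the CW injection process up
    sleep(settle_s)               # two-minute wait with the gain at zero
    write(TRAMP_CHANNEL, ramp_s)  # assumption: TRAMP is the ramp time in seconds
    write(GAIN_CHANNEL, 1.0)      # request unity gain; ramp happens over ramp_s
```

A sequence like this writes to both GAIN and TRAMP, which is consistent with SDF reporting differences on those two channels during a restart.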
Given the rarity of a psinject crash/restart, we could continue to monitor the GAIN channel as a precaution against accidental change of the CW calibration amplitude. Perhaps we should go further and change the DIAG_EXC guardian node to take H1 out of observation if the h1calex excitation is missing (currently it is being ignored). Looking at the Verbal Alarms code, I'm not sure a verbal alert is raised when psinject crashes. If none is raised, we should add one.