Displaying report 1-1 of 1.
Reports until 16:55, Friday 28 March 2014
H1 SEI
sheila.dwyer@LIGO.ORG - posted 16:55, Friday 28 March 2014 - last comment - 11:45, Monday 31 March 2014(11070)
HAM3 ISI trip

This trip happened around the time of a Beckhoff restart.  A Beckhoff restart causes a crash of the IMC guardian, which causes it to stop.  It's not clear to me why this should cause MC2 to get a large signal, but it seems to.  This causes a cascade of trips.  Even though I don't know why this happens, creating a safe state in the IMC guardian so that it handles missing channels better could help with the problem.

Another solution is to have the SUS WDs not trip HEPI.  Can we get that fix soon?

HAM2 also tripped at the same time; Jeff and Hugh brought it back.

Images attached to this report
Comments related to this report
jameson.rollins@LIGO.ORG - 17:54, Friday 28 March 2014 (11075)

To be clear, IMC guardian did not "crash" in this particular situation.  The guardian responded exactly as it's currently programmed to respond, which is to go into ERROR when it loses communication with any of the channels it's monitoring.  I want to distinguish an ERROR condition, which is something that guardian handles, from a "crash", which means that the guardian process died unexpectedly.

Here's my guess for the sequence of events:

  1. Beckhoff action somehow caused one of the PSL channels (H1:PSL-PERISCOPE_A_DC_ERROR_FLAG) to momentarily disappear.
  2. IMC guardian, which is monitoring the above channel, goes into ERROR when said channel disappears.
  3. Around this time, the IMC loses lock, presumably for the same reason the PSL channels disappeared.  Since the IMC guardian is in ERROR and no longer monitoring the IMC situation, it doesn't shut off the feedback to MC2.
  4. Large control signals into MC2 cause it to trip.  A typical watchdog cascade causes everything from here to Walla Walla to trip.

It's possible guardian could be made slightly more robust against loss of some of its channels, but that only helps up to a point.  Eventually guardian has to drop into some error condition if it can't talk to whatever it's trying to control.  It could try to move everything to some sort of safe state, but that only works if it can talk to the front-ends to actually change their state.
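
The trade-off described above can be sketched in a few lines of Python. This is only an illustration, not the actual guardian code or API: `ChannelUnavailable`, `handle_lost_monitor`, the `write_channel` callable and the channel name in the usage example are all hypothetical. The point it shows is that a fallback to a safe state is only possible while the writes to the front-ends still succeed; otherwise the only honest outcome is ERROR.

```python
class ChannelUnavailable(Exception):
    """Hypothetical error raised when a channel cannot be reached."""


def handle_lost_monitor(write_channel, safe_settings):
    """Try to apply safe settings after a monitored channel disappears.

    write_channel is a hypothetical callable (name, value) that raises
    ChannelUnavailable if the front-end cannot be reached.
    Returns "SAFE" if every write succeeded, otherwise "ERROR".
    """
    for name, value in safe_settings.items():
        try:
            write_channel(name, value)
        except ChannelUnavailable:
            # The front-end is unreachable, so no safe state can be guaranteed.
            return "ERROR"
    return "SAFE"


if __name__ == "__main__":
    # Hypothetical channel name, for illustration only.
    settings = {"HYPOTHETICAL:MC2_FEEDBACK_GAIN": 0}
    print(handle_lost_monitor(lambda name, value: None, settings))
```

With a writer that always succeeds, as in the usage lines, the result is "SAFE"; if the writer raises `ChannelUnavailable`, the function returns "ERROR".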

daniel.sigg@LIGO.ORG - 09:30, Saturday 29 March 2014 (11084)

A Beckhoff restart also causes the IMC servo board to be reset, as well as all whitening for photodiodes, wavefront sensors and QPDs. I assume that the resulting transient caused the MC to trip. It would be interesting to know whether this is due to the length or the alignment system. Is it the initial transient or a run-away integrator? In either case this should not result in a trip. A better action would be to simply turn off the ISC inputs.
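
As a rough illustration of "turn off the ISC inputs" instead of tripping, here is a minimal Python sketch. It is not the watchdog code: the function name `gate_isc_input`, the threshold and the sample values are assumptions made up for the example. It latches the ISC input off the first time the signal exceeds a threshold, so the suspension keeps running with zero ISC drive.

```python
def gate_isc_input(isc_samples, threshold):
    """Pass ISC samples through until one exceeds the threshold in magnitude.

    From that sample onward the input is latched off and 0.0 is passed
    instead. The threshold and its units are hypothetical.
    """
    gated = []
    input_on = True
    for sample in isc_samples:
        if input_on and abs(sample) > threshold:
            input_on = False  # latch off; a real system would need a manual reset
        gated.append(sample if input_on else 0.0)
    return gated


if __name__ == "__main__":
    # Arbitrary illustrative numbers, not measured data.
    print(gate_isc_input([1.0, 2.0, 50.0, 3.0], threshold=10.0))
```

The latch is deliberate in this sketch: once a large signal has been seen, the input stays off until someone re-enables it, rather than switching back on when the signal happens to drop.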

arnaud.pele@LIGO.ORG - 11:45, Monday 31 March 2014 (11094)

After taking a look at the times of the WD trips, it seems that the HAM3 ISI trips before MC2; see the green plot vs. the red plot (the X axis is the number of seconds after gps=1080083200).
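
For readers who want to relate the GPS offset on the X axis to wall-clock time, a small Python conversion is sketched below. It assumes a fixed GPS minus UTC offset of 16 leap seconds, which holds only for dates between July 2012 and June 2015; the helper name `gps_to_utc` is made up for this example, and a general tool would need a leap-second table.

```python
from datetime import datetime, timedelta, timezone

# GPS time starts at 1980-01-06 00:00:00 UTC.
GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)

# Assumption: hard-coded offset, valid only from July 2012 to June 2015.
GPS_MINUS_UTC_SECONDS = 16


def gps_to_utc(gps_seconds):
    """Convert GPS seconds to a UTC datetime using the fixed offset above."""
    return GPS_EPOCH + timedelta(seconds=gps_seconds - GPS_MINUS_UTC_SECONDS)


if __name__ == "__main__":
    print(gps_to_utc(1080083200))
```

For the offset quoted above this gives 2014-03-28 23:06:24 UTC.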

Images attached to this comment