The following is a list of lock loss messages from the Guardian log. We've had a bunch of lock losses during the transition from locked to low noise this evening. As you can see, there are a few different culprits, but one of the big ones is LOWNOISE_ESD_ETMY. It would be handy if someone could check out these lock losses and home in on what precisely went wrong during this transition (e.g. ramping, switching, etc.). Then we can get back to SRCL FF tuning.
2015-07-17 06:47:05.694550 ISC_LOCK LOWNOISE_ESD_ETMY -> LOCKLOSS
2015-07-17 06:48:22.162260 ISC_LOCK LOCKING_ARMS_GREEN -> LOCKLOSS
2015-07-17 07:14:34.431170 ISC_LOCK NOMINAL_LOW_NOISE -> LOCKLOSS
2015-07-17 07:26:40.249110 ISC_LOCK CARM_10PM -> LOCKLOSS
2015-07-17 07:34:59.720030 ISC_LOCK PREP_TR_CARM -> LOCKLOSS
2015-07-17 07:42:08.269350 ISC_LOCK LOCKING_ALS -> LOCKLOSS
2015-07-17 08:02:41.773620 ISC_LOCK LOWNOISE_ESD_ETMY -> LOCKLOSS
2015-07-17 08:21:58.665420 ISC_LOCK LOWNOISE_ESD_ETMY -> LOCKLOSS
2015-07-17 08:31:29.035330 ISC_LOCK REDUCE_CARM_OFFSET_MORE -> LOCKLOSS
2015-07-17 08:48:32.514870 ISC_LOCK LOWNOISE_ESD_ETMY -> LOCKLOSS
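A quick tally of the list above by originating state can be done in a few lines of Python. This is only a minimal sketch: it assumes every line has exactly the format shown (date, time, node, state, "->", LOCKLOSS), and the function name is made up for illustration.

```python
from collections import Counter

LOG = """\
2015-07-17 06:47:05.694550 ISC_LOCK LOWNOISE_ESD_ETMY -> LOCKLOSS
2015-07-17 06:48:22.162260 ISC_LOCK LOCKING_ARMS_GREEN -> LOCKLOSS
2015-07-17 07:14:34.431170 ISC_LOCK NOMINAL_LOW_NOISE -> LOCKLOSS
2015-07-17 07:26:40.249110 ISC_LOCK CARM_10PM -> LOCKLOSS
2015-07-17 07:34:59.720030 ISC_LOCK PREP_TR_CARM -> LOCKLOSS
2015-07-17 07:42:08.269350 ISC_LOCK LOCKING_ALS -> LOCKLOSS
2015-07-17 08:02:41.773620 ISC_LOCK LOWNOISE_ESD_ETMY -> LOCKLOSS
2015-07-17 08:21:58.665420 ISC_LOCK LOWNOISE_ESD_ETMY -> LOCKLOSS
2015-07-17 08:31:29.035330 ISC_LOCK REDUCE_CARM_OFFSET_MORE -> LOCKLOSS
2015-07-17 08:48:32.514870 ISC_LOCK LOWNOISE_ESD_ETMY -> LOCKLOSS
"""


def count_locklosses(log_text):
    """Count LOCKLOSS transitions per originating state (hypothetical helper)."""
    counts = Counter()
    for line in log_text.splitlines():
        fields = line.split()
        # Expected fields: date, time, node, state, '->', 'LOCKLOSS'
        if len(fields) == 6 and fields[4] == "->" and fields[5] == "LOCKLOSS":
            counts[fields[3]] += 1
    return counts


counts = count_locklosses(LOG)
for state, n in counts.most_common():
    print(f"{n:2d}  {state}")
```

LOWNOISE_ESD_ETMY accounts for four of the ten lock losses; every other state appears once.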
Guardian error causing lock loss in LOWNOISE_ESD_ETMY (Evan, Keita)
Summary:
Out of four lock losses in LOWNOISE_ESD_ETMY that Rana and Sheila listed, one lock loss (15-07-17-06-47-05) was due to the guardian running main() of LOWNOISE_ESD_ETMY twice.
Running main() twice (sometimes, but not always) is apparently a known problem with the guardian, but this specific state is written such that running main() twice is not safe.
Details:
Looking at the lock loss, I found that the ETMY_L3_LOCK_L ramp time (left of the attached, red CH16) was set to zero at the same time as, or right after, the ETMX and ETMY L3 gains (blue CH3 and brown CH5) were set to their final values (0 and 1.25 respectively). There was a huge glitch in the EY actuators at that point, but not in EX.
This transition is supposed to happen with a ramp time of 10 seconds, so setting the ramp time to 0 after setting the gain kills the lock.
Looking at the guardian code (attached right), the ramp time is set to zero at the beginning and set to 10 at the end.
Evan told me that main() could be executed twice, so we looked at the log (attached middle) and, sure enough, right after LOWNOISE_ESD_ETMY.main finished at 2015-07-17T06:46:50.39059, the gain was set to zero again.
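To illustrate why a second pass through main() is harmful here, below is a toy model in Python. It is not the guardian code or the real front-end ramping logic. The class, the function, the starting gain of 0.0, and the 0.5 s delay between the two calls are all hypothetical. It also assumes that zeroing the ramp time collapses a ramp already in progress, which is my reading of the glitch described above.

```python
class ToyRampedGain:
    """Toy model of a gain with a ramp time (NOT the real front-end logic)."""

    def __init__(self, value):
        self.tramp = 0.0      # ramp-time setting applied to the next gain change
        self.start = value    # value at the start of the ramp in progress
        self.target = value   # value the ramp is heading to
        self.t0 = 0.0         # time the ramp in progress started
        self.duration = 0.0   # duration of the ramp in progress

    def value(self, now):
        """Gain at time `now`, linearly interpolated while a ramp is running."""
        if self.duration <= 0.0 or now >= self.t0 + self.duration:
            return self.target
        frac = (now - self.t0) / self.duration
        return self.start + frac * (self.target - self.start)

    def set_tramp(self, tramp):
        self.tramp = tramp
        if tramp == 0.0:
            # Assumption: zeroing the ramp time collapses a ramp in progress.
            self.duration = 0.0

    def set_gain(self, target, now):
        self.start = self.value(now)
        self.target = target
        self.t0 = now
        self.duration = self.tramp


def toy_main(ey_gain, now):
    """Simplified reading of the state: ramp time 0 first, then 10 s, then final gain."""
    ey_gain.set_tramp(0.0)
    ey_gain.set_tramp(10.0)
    ey_gain.set_gain(1.25, now)


# main() run once: the gain is still near its start 0.5 s into the 10 s ramp.
once = ToyRampedGain(0.0)    # starting gain of 0.0 is hypothetical
toy_main(once, now=0.0)
print("run once, t=0.5 s: ", once.value(0.5))

# main() run twice, 0.5 s apart (hypothetical delay): the gain steps to its final value.
twice = ToyRampedGain(0.0)
toy_main(twice, now=0.0)
toy_main(twice, now=0.5)
print("run twice, t=0.5 s:", twice.value(0.5))
```

In this toy picture the first call starts a smooth 10 second ramp, and the second call's "ramp time = 0" turns the rest of that ramp into a step.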
I have identified the source of the double main execution and have a patch ready that fixes the problem:
https://bugzilla.ligo-wa.caltech.edu/bugzilla3/show_bug.cgi?id=879#c7
If needed we can push out a point release essentially immediately, maybe during next Tuesday's maintenance period.
Bounce rang up during the EX-EY transition gain ramping in 3 of the 4 lock losses last night.
In three out of four lock losses in LOWNOISE_ESD_ETMY that Rana and Sheila listed, the guardian made it all the way to the gain ramping at the end, and it did not run main() twice.
However, about 7 to 8 seconds after the ramping started, a 9.8 Hz oscillation built up in DARM, then fast glitches appeared in the ETMY L2 drive, and then the IFO lost lock.
This looks like a bounce mode, but I have no idea why it was suddenly rung up.
See attached. The first attachment shows the very end of the lock losses, where the DARM oscillation is clearly visible.
The second plot shows the same lock losses zoomed out, so you can see that each lock loss happened 7 to 8 seconds after the ramping started.
The last attachment shows one of the DARM oscillations, so you can see that 6 cycles = 0.309 seconds (i.e. 9.8Hz signal).
Update: After the bounce was rung up, the OMC DCPDs saturated before the IFO lost lock.
In the attached, while the 9.8 Hz was getting bigger (top left), if you high-pass DARM_IN1_DQ (middle left) you can see that the high-frequency part, dominated by 2.5 kHz, suddenly quenched at about t=18 sec.
The same thing is observed in the OMC DCPDs (middle middle and bottom middle), and even though we don't have a fast channel for the DCPD ADCs, it seems they were very close to saturation at 18 sec (bottom left).
Though we don't know why 9.8 Hz was excited, at least we know that the DCPDs saturated and caused the lock loss.
Since the same thing happened 3 times, each time 7 to 8 seconds after the ETMX and ETMY L3 LOCK_L gains started ramping, you could set the gains to the values corresponding to this in-between state, hold them there for a minute or so, and see if the IFO can stay locked. If it fails to stay locked, it's a sure sign that this instability is somehow related to the L3 actuator balance between X and Y, or the L3-L2 crossover in Y (or in X), or both.
The in-between gains would be something like 1.1 for the EY L3 lock and 0.125 for EX.
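For reference, here is a rough way to get such in-between numbers. It is only a sketch under assumptions: the final gains (1.25 for EY, 0 for EX) are the ones quoted earlier, but the starting gains (0 for EY, 1 for EX), the linear interpolation, and the chosen fraction of 0.875 are guesses for illustration, and the real ramp shape may differ.

```python
def gain_at_fraction(start, final, fraction):
    """Linearly interpolate a gain between its start and final values."""
    return start + fraction * (final - start)


FRACTION = 0.875  # hypothetical point in the EX-to-EY hand-over

ey = gain_at_fraction(0.0, 1.25, FRACTION)  # EY start value 0.0 is assumed
ex = gain_at_fraction(1.0, 0.0, FRACTION)   # EX start value 1.0 is assumed
print(f"EY L3: {ey:.3f}, EX L3: {ex:.3f}")
```

With these assumptions EY comes out at about 1.09 and EX at 0.125, in line with the rough values above.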