Displaying report 1-1 of 1.
Reports until 20:04, Friday 12 June 2015
H1 GRD (CDS, GRD, ISC)
sheila.dwyer@LIGO.ORG - posted 20:04, Friday 12 June 2015 - last comment - 17:29, Thursday 18 June 2015(19108)
some locking difficulties today

We had four known reasons for having difficulty locking today, one is an unsolved mystery that might be hurting us more often than we realize.

I reloaded both ISC_DRMI and ISC_LOCK guardians today, to incorporate these changes.  

Good luck TJ!

Images attached to this report
Comments related to this report
jameson.rollins@LIGO.ORG - 09:45, Saturday 13 June 2015 (19117)

I don't think the log snippet included above shows the problem, but I found where in the log it does:

2015-06-13T05:36:35.76316 ISC_LOCK [LOWNOISE_ESD_ETMY.enter]
2015-06-13T05:36:35.78009 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L3_LOCK_L_TRAMP => 0
2015-06-13T05:36:35.78025 ISC_LOCK [LOWNOISE_ESD_ETMY.main] Preparing ETMY for DARM actuation transition...
2015-06-13T05:36:36.03624 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_M0_LOCK_L => OFF: INPUT
2015-06-13T05:36:36.03791 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L1_LOCK_L_GAIN => 0.16
2015-06-13T05:36:36.03886 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L3_LOCK_L_GAIN => 0
2015-06-13T05:36:36.04021 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L1_LOCK_L_SW1S => 20804
2015-06-13T05:36:36.29496 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L1_LOCK_L => ONLY ON: INPUT, FM2, FM3, FM5, FM6, FM7, FM8, OUTPUT, DECIMATION
2015-06-13T05:36:36.55096 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L2_LOCK_L => ONLY ON: INPUT, FM6, OUTPUT, DECIMATION
2015-06-13T05:36:36.55229 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L3_LOCK_L_SW1S => 16388
2015-06-13T05:36:36.80324 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L3_LOCK_L => ONLY ON: INPUT, FM6, FM8, FM9, FM10, OUTPUT, DECIMATION
2015-06-13T05:36:37.05938 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L2_DRIVEALIGN_L2L => ONLY ON: INPUT, FM2, OUTPUT, DECIMATION
2015-06-13T05:36:37.31538 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L3_DRIVEALIGN_L2L => ONLY ON: INPUT, FM3, FM4, FM5, OUTPUT, DECIMATION
2015-06-13T05:36:37.31694 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L3_LOCK_L_TRAMP => 10
2015-06-13T05:36:37.32341 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMX_L1_LOCK_L_SW1 => 16
2015-06-13T05:36:37.57613 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMX_L1_LOCK_L => OFF: FM1
2015-06-13T05:36:38.57792 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L3_LOCK_L_GAIN => 0.7
2015-06-13T05:36:38.57846 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMX_L3_LOCK_L_GAIN => 0.5
2015-06-13T05:36:49.58941 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L1_LOCK_L_SW1 => 16
2015-06-13T05:36:49.84053 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L1_LOCK_L => ON: FM1
2015-06-13T05:36:50.84245 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L3_LOCK_L_GAIN => 1.25
2015-06-13T05:36:50.84585 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMX_L3_LOCK_L_GAIN => 0
2015-06-13T05:36:50.84640 ISC_LOCK [LOWNOISE_ESD_ETMY.main] timer['ETMswap'] = 10.0
2015-06-13T05:36:50.85290 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L3_LOCK_L_TRAMP => 0
2015-06-13T05:36:50.85341 ISC_LOCK [LOWNOISE_ESD_ETMY.main] Preparing ETMY for DARM actuation transition...
2015-06-13T05:36:51.10580 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_M0_LOCK_L => OFF: INPUT
2015-06-13T05:36:51.11470 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L3_LOCK_L_GAIN => 0
2015-06-13T05:36:51.11670 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L1_LOCK_L_SW1S => 20804
2015-06-13T05:36:51.37015 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L1_LOCK_L => ONLY ON: INPUT, FM2, FM3, FM5, FM6, FM7, FM8, OUTPUT, DECIMATION
2015-06-13T05:36:51.62486 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L2_LOCK_L => ONLY ON: INPUT, FM6, OUTPUT, DECIMATION
2015-06-13T05:36:51.88017 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L3_LOCK_L => ONLY ON: INPUT, FM6, FM8, FM9, FM10, OUTPUT, DECIMATION
2015-06-13T05:36:52.13170 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L2_DRIVEALIGN_L2L => ONLY ON: INPUT, FM2, OUTPUT, DECIMATION
2015-06-13T05:36:52.38753 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L3_DRIVEALIGN_L2L => ONLY ON: INPUT, FM3, FM4, FM5, OUTPUT, DECIMATION
2015-06-13T05:36:52.39457 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L3_LOCK_L_TRAMP => 10
2015-06-13T05:36:52.65332 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMX_L1_LOCK_L => OFF: FM1
2015-06-13T05:36:53.65498 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L3_LOCK_L_GAIN => 0.7
2015-06-13T05:36:53.65562 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMX_L3_LOCK_L_GAIN => 0.5
2015-06-13T05:37:04.66682 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L1_LOCK_L_SW1 => 16
2015-06-13T05:37:04.91783 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L1_LOCK_L => ON: FM1
2015-06-13T05:37:05.91955 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMY_L3_LOCK_L_GAIN => 1.25
2015-06-13T05:37:05.92183 ISC_LOCK [LOWNOISE_ESD_ETMY.main] ezca: H1:SUS-ETMX_L3_LOCK_L_GAIN => 0
2015-06-13T05:37:05.92216 ISC_LOCK [LOWNOISE_ESD_ETMY.main] timer['ETMswap'] = 10.0
2015-06-13T05:37:05.94222 ISC_LOCK [LOWNOISE_ESD_ETMY.run] MC not locked

Based on the ezca and log output during LOWNOISE_ESD_ETMY.main it does in fact look like main() was executed twice in a row.  That should never happen under any circumstances.  I'm investigating.

jameson.rollins@LIGO.ORG - 16:50, Saturday 13 June 2015 (19127)
sheila.dwyer@LIGO.ORG - 17:29, Thursday 18 June 2015 (19231)

I think that there are potentially two different issues, one being what is shown in the original alog, where the run should return true, but the guardian state doesn't change even though the current state is not the requested state.  We could re-wrtie the guardains (or at tleast this state) to reduce the harm from this, but it still seems like a bug in the way the gaurdian is working.  

On the other hand, the problem that Jamie pointed out is more serious.  For other reasons, I have been looking at histograms of how long the guardian spends in each state.  Some states should take the same amount of time to execute each time, but analog carm for example has 3 possibilities.  We often detect a lockloss in the first second of the state, if the state executes normally it takes 18 seconds, but there were 5 times that it took 35 seconds because it repeated main.  GPS times of these events are:

1117983542.06250
> 1118037936.06250
> 1118148903.93750
> 1118294947.06250
> 1118295997.06250

I looked at guardian log for the first three of these events, and indeed they
are times when the main was repeated. These are mosly sucsesfull locks, so the bug isn't causing locklosses here although it easily could.  

The ISC_LOCK code that was running at the time is in the svn as revision 10776.
Images attached to this comment
Displaying report 1-1 of 1.