Reports until 22:55, Wednesday 27 April 2016
H1 GRD
sheila.dwyer@LIGO.ORG - posted 22:55, Wednesday 27 April 2016 - last comment - 18:28, Thursday 28 April 2016(26840)
mystery DRMI guardian lockloss

Jenne, Sheila, Chris, TJ, Evan

It seems that tonight we have been sabotaged by some code that we have been using for a long time (we have only caught this happening once, although we have had a lot of unexplained locklosses tonight).

In the attached screenshot you can see that the DRMI guardian was sitting at DRMI_3F_LOCKED (130) when it decided to go to LOCK_DRMI_1F (30).  There is a decorator in DRMI_3F_LOCKED that apparently returned LOCK_DRMI_1F because it thought DRMI was unlocked (DRMI was fine, as you can see from the power buildups in the top row).

The code that checks for DRMI lock is: 

def DRMI_locked():
    #log('checking DRMI lock')
    return ezca['LSC-MICH_TRIG_MON'] and ezca['LSC-PRCL_TRIG_MON'] and ezca['LSC-SRCL_TRIG_MON']
 
However, as you can see from the plot, all of these trig mons were 1 for the whole time.  Returning LOCK_DRMI_1F resets the DRMI settings for reacquisition, so we would expect that to break the lock.
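
For readers who have not seen this pattern, here is a minimal sketch of how a lock-check decorator can redirect a state. It is plain Python, not the real guardian code: the `ezca` dict, the decorator name `assert_drmi_locked`, and the function `drmi_3f_locked_run` are hypothetical stand-ins, and the actual decorator in the DRMI guardian may be written differently.

```python
import functools

# Hypothetical stand-in for the EPICS access object; the real ezca reads live channels.
ezca = {
    'LSC-MICH_TRIG_MON': 1,
    'LSC-PRCL_TRIG_MON': 1,
    'LSC-SRCL_TRIG_MON': 1,
}

def DRMI_locked():
    # Same check as quoted above.
    return ezca['LSC-MICH_TRIG_MON'] and ezca['LSC-PRCL_TRIG_MON'] and ezca['LSC-SRCL_TRIG_MON']

def assert_drmi_locked(run_method):
    """Hypothetical decorator: check the lock before running the state body."""
    @functools.wraps(run_method)
    def wrapper(*args, **kwargs):
        if not DRMI_locked():
            # Returning a state name stands for "jump to that state".
            return 'LOCK_DRMI_1F'
        return run_method(*args, **kwargs)
    return wrapper

@assert_drmi_locked
def drmi_3f_locked_run():
    # Stand-in for the body of DRMI_3F_LOCKED; True means nothing to do.
    return True
```

With all three monitors at 1 the body runs normally; if a single read of any monitor comes back as 0, the wrapper returns 'LOCK_DRMI_1F' and the state body never runs.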
 
Jenne used the new guardlog to grab the DRMI log from that time; it is attached.
Images attached to this report
Non-image files attached to this report
Comments related to this report
sheila.dwyer@LIGO.ORG - 23:56, Wednesday 27 April 2016 (26842)

It happened again at 6:53:07

Images attached to this comment
sheila.dwyer@LIGO.ORG - 18:28, Thursday 28 April 2016 (26863)

Sheila, Jenne, Jamie, Chris, Evan, Dave

We still don't understand why this would have happened, although we should be able to debug it a little bit better if it happens again. 

Jenne and Jamie edited the DRMI_locked function so that there will be more information in the guardian log in the future:

def DRMI_locked():

    MichMon = ezca['LSC-MICH_TRIG_MON']
    PrclMon = ezca['LSC-PRCL_TRIG_MON']
    SrclMon = ezca['LSC-SRCL_TRIG_MON']
    if (MichMon > 0.5) and (PrclMon > 0.5) and (SrclMon > 0.5):
        # We're still locked and triggered, so return True
        return True
    else: 
        # Eeep!  Not locked.  Log some stuff
        log('DRMI TRIGGERED NOT LOCKED:')
        log('LSC-MICH_TRIG_MON = %s' % MichMon)
        log('LSC-PRCL_TRIG_MON = %s' % PrclMon)
        log('LSC-SRCL_TRIG_MON = %s' % SrclMon)
        return False
 
This also avoids the question of what might happen if the ezca calls don't return a bool.
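
To make the bool question concrete: Python's `and` returns one of its operands rather than a strict True/False, so the original one-line check hands back whatever the channel read returned. The sketch below compares the two styles using plain dicts in place of ezca. The value 0.3 is a made-up example of a non-integer read, not something we have observed on these channels.

```python
CHANNELS = ['LSC-MICH_TRIG_MON', 'LSC-PRCL_TRIG_MON', 'LSC-SRCL_TRIG_MON']

def original_check(readings):
    # Chained `and` returns the first falsy operand, or else the last operand.
    return readings[CHANNELS[0]] and readings[CHANNELS[1]] and readings[CHANNELS[2]]

def thresholded_check(readings):
    # Explicit comparison always yields a real bool.
    return all(readings[ch] > 0.5 for ch in CHANNELS)

all_on = dict.fromkeys(CHANNELS, 1.0)
one_off = dict(all_on, **{'LSC-PRCL_TRIG_MON': 0.0})
# Hypothetical in-between value, for illustration only.
in_between = dict(all_on, **{'LSC-PRCL_TRIG_MON': 0.3})

print(original_check(all_on), thresholded_check(all_on))
print(original_check(one_off), thresholded_check(one_off))
print(original_check(in_between), thresholded_check(in_between))
```

The two checks agree for clean 0/1 values, but the original returns a float rather than a bool, and any nonzero value below the threshold would count as locked in the original and as unlocked in the new version.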
 
Dave tells us that the data recorded in the DAQ is not necessarily synchronous with the EPICS data, so looking at H1:LSC-MICH_TRIG_MON using nds2 doesn't necessarily give us the same data that the guardian gets (this would explain why nothing showed up in the lockloss plots even though the guardian apparently saw one of the TRIG_MONs change). Dave is going to add the TRIG_MON channels to conlog.
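
As a toy illustration of how two readers of the same channel can disagree, the sketch below samples a signal with a brief dropout at the same rate but with different timing offsets. All of the numbers (a 20 ms dropout, 10 Hz sampling, 5 ms and 50 ms offsets) are invented for the example and are not the real DAQ or EPICS rates. We also don't know that a short dropout is what actually happened.

```python
def trig_mon(t_ms):
    # Hypothetical monitor: 1 everywhere except a 20 ms dropout starting at 300 ms.
    return 0 if 300 <= t_ms < 320 else 1

def sample(offset_ms, period_ms=100, duration_ms=1000):
    # Read the monitor once per period, starting at the given offset.
    return [trig_mon(t) for t in range(offset_ms, duration_ms, period_ms)]

guardian_view = sample(offset_ms=5)   # happens to read at 305 ms, inside the dropout
daq_view = sample(offset_ms=50)       # reads at 250 ms and 350 ms, missing it

print('guardian saw a zero:', 0 in guardian_view)
print('DAQ saw a zero:', 0 in daq_view)
```

In this toy case the first reader catches the dropout and the second records a flat line of ones, which is the kind of mismatch that logging the values seen by the guardian itself should expose.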
 
We looked at POP18_I_ERR during the time that this was happening, and it should have been above the threshold the entire time, so there seems to be no reason the trigger should have gone off.  One new suspicious thing is that this happens at the same time that the triggering is swapped over to POPDC by the ISC_LOCK guardian.  However, you can see in the attached plots that the thresholds are lowered (to -100) when the DRMI guardian thinks that the lock is lost.  Chris and I looked through the triggering in the model, and it doesn't seem like lowering the threshold should turn off the trigger in any case, although based on the timing it looks like lowering the thresholds is what caused the problem. I added a sleep after the trigger matrix is reset to POPDC and before the thresholds are reset, although I don't think this was the problem, since DRMI seems to think it is unlocked before the thresholds are reset.
 
 
Images attached to this comment