Displaying report 1-1 of 1.
Reports until 15:15, Monday 24 October 2016
H1 GRD
sheila.dwyer@LIGO.ORG - posted 15:15, Monday 24 October 2016 - last comment - 09:06, Saturday 29 October 2016(30815)
ezca connection error

Ed, Sheila

Are ezca connection errors becoming more frequent?  Ed has had two in the last hour or so, one of which contributed to a lockloss (ISC_DRMI).

The first one was from ISC_LOCK; the screenshot is attached.

Images attached to this report
Comments related to this report
thomas.shaffer@LIGO.ORG - 18:15, Monday 24 October 2016 (30828)

It happened again, this time for a different channel, H1:SUS-ITMX_L2_DAMP_MODE2_RMSLP_LOG10_OUTMON (Sheila's post was for H1:LSC-PD_DOF_MTRX_7_4). I trended both channels and found data for each at the times of the connection errors. During the second error I could also caget the channel while ISC_LOCK still could not connect. I'll keep digging and see what I find.

Relevant ISC_LOCK log:

2016-10-25_00:25:57.034950Z ISC_LOCK [COIL_DRIVERS.enter]
2016-10-25_00:26:09.444680Z Traceback (most recent call last):
2016-10-25_00:26:09.444730Z   File "_ctypes/callbacks.c", line 314, in 'calling callback function'
2016-10-25_00:26:12.128960Z ISC_LOCK [COIL_DRIVERS.main] USERMSG 0: EZCA CONNECTION ERROR: Could not connect to channel (timeout=2s): H1:SUS-ITMX_L2_DAMP_MODE2_RMSLP_LOG10_OUTMON
2016-10-25_00:26:12.129190Z   File "/ligo/apps/linux-x86_64/epics-3.14.12.2_long-ubuntu12/pyext/pyepics/lib/python2.6/site-packages/epics/ca.py", line 465, in _onConnectionEvent
2016-10-25_00:26:12.131850Z     if int(ichid) == int(args.chid):
2016-10-25_00:26:12.132700Z TypeError: int() argument must be a string or a number, not 'NoneType'
2016-10-25_00:26:12.162700Z ISC_LOCK EZCA CONNECTION ERROR. attempting to reestablish...
2016-10-25_00:26:12.175240Z ISC_LOCK CERROR: State method raised an EzcaConnectionError exception.
2016-10-25_00:26:12.175450Z ISC_LOCK CERROR: Current state method will be rerun until the connection error clears.
2016-10-25_00:26:12.175630Z ISC_LOCK CERROR: If CERROR does not clear, try setting OP:STOP to kill worker, followed by OP:EXEC to resume.
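
To judge whether these errors are becoming more frequent, one option is to count them per channel directly from the node logs. The following is a minimal sketch, assuming the USERMSG line format shown in the excerpt above stays the same; the function name `find_connection_errors` and the regular expression are illustrative and not part of Guardian or ezca.

```python
import re
from collections import Counter

# Pattern for the "Could not connect to channel" USERMSG line as it appears
# in the log excerpt above. Assumption: other nodes log the same format.
PATTERN = re.compile(
    r"^(?P<ts>\S+Z)\s+(?P<node>\S+)\s+\[(?P<state>[^\]]+)\].*"
    r"EZCA CONNECTION ERROR: Could not connect to channel "
    r"\(timeout=(?P<timeout>\d+)s\): (?P<channel>\S+)"
)


def find_connection_errors(lines):
    """Return one dict per connection-error line (hypothetical helper)."""
    hits = []
    for line in lines:
        match = PATTERN.match(line.strip())
        if match:
            hits.append(match.groupdict())
    return hits


# Two lines copied from the log excerpt above; only the first should match.
sample = [
    "2016-10-25_00:26:12.128960Z ISC_LOCK [COIL_DRIVERS.main] USERMSG 0: "
    "EZCA CONNECTION ERROR: Could not connect to channel (timeout=2s): "
    "H1:SUS-ITMX_L2_DAMP_MODE2_RMSLP_LOG10_OUTMON",
    "2016-10-25_00:26:12.162700Z ISC_LOCK EZCA CONNECTION ERROR. "
    "attempting to reestablish...",
]

errors = find_connection_errors(sample)
per_channel = Counter(error["channel"] for error in errors)

for channel, count in per_channel.items():
    print(channel, count)
```

Feeding the full log lines of each node through the same function would give counts per channel and per state, which could then be compared across days.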

sheila.dwyer@LIGO.ORG - 21:12, Friday 28 October 2016 (30977)

It happened again just now. 

Images attached to this comment
david.barker@LIGO.ORG - 09:06, Saturday 29 October 2016 (30993)

Opened an FRS ticket on this and marked it as a high-priority fault.

ticket 6558
