H1 General
edmond.merilh@LIGO.ORG - posted 22:10, Friday 30 September 2016 - last comment - 09:40, Saturday 01 October 2016(30132)
Shift Summary - Eve
 
 
TITLE: 10/01 Eve Shift: 23:00-07:00 UTC (16:00-00:00 PST), all times posted in UTC
STATE of H1: Commissioning
INCOMING OPERATOR: None
SHIFT SUMMARY:
Strange instabilities causing locklosses were the order of the evening, and then the Guardian went south. Stefan was the only commissioner here at the outset, and he decided he had had enough. Nutsinee is here for the Owl shift, but there doesn't seem to be much reason to stay now. Heading home.
LOG:

01:26 Switched ISI blends to EQ v2. EQ Z axis reached up to ≈1 µm/s. ISC_LOCK Guardian set to DOWN waiting for ringdown.

03:43 Switched ISI config back to nominal state - WINDY

04:52 Guardian had an error party - connection errors, first on ALS_YARM, then ALS_DIFF, and then ISC_LOCK

Comments related to this report
jameson.rollins@LIGO.ORG - 02:11, Saturday 01 October 2016 (30133)

ISC_LOCK, ALS_DIFF, and IFO were all showing connection errors because they lost connection to the ALS_YARM guardian, which had unceremoniously died:

2016-10-01T04:47:53.36265 ALS_YARM [INITIAL_ALIGNMENT.enter]
2016-10-01T04:47:53.36943 ALS_YARM [INITIAL_ALIGNMENT.main] ezca: H1:ALS-C_LOCK_REQUESTY => End Locked
2016-10-01T04:47:53.39004 ALS_YARM [INITIAL_ALIGNMENT.main] timer['pause'] = 10
2016-10-01T04:48:03.38190 ALS_YARM [INITIAL_ALIGNMENT.run] timer['pause'] done
2016-10-01T04:50:05.98295 ALS_YARM REQUEST: GREEN_WFS_OFFLOADED
2016-10-01T04:50:05.98317 ALS_YARM calculating path: INITIAL_ALIGNMENT->GREEN_WFS_OFFLOADED
2016-10-01T04:50:05.98360 ALS_YARM new target: OFFLOAD_GREEN_WFS
2016-10-01T04:50:06.04919 ALS_YARM EDGE: INITIAL_ALIGNMENT->OFFLOAD_GREEN_WFS
2016-10-01T04:50:06.04958 ALS_YARM calculating path: OFFLOAD_GREEN_WFS->GREEN_WFS_OFFLOADED
2016-10-01T04:50:06.04995 ALS_YARM new target: GREEN_WFS_OFFLOADED
2016-10-01T04:50:06.05074 ALS_YARM executing state: OFFLOAD_GREEN_WFS (-21)
2016-10-01T04:50:06.05096 ALS_YARM [OFFLOAD_GREEN_WFS.enter]
2016-10-01T04:50:06.05214 ALS_YARM [OFFLOAD_GREEN_WFS.main] starting smooth offload
2016-10-01T04:50:06.05215 ALS_YARM [OFFLOAD_GREEN_WFS.main] ['ITMY', 'ETMY', 'TMSY']
2016-10-01T04:50:06.55046 ALS_YARM stopping daemon...
2016-10-01T04:50:06.62930 ALS_YARM daemon stopped.
2016-10-01T04:50:07.48941 Traceback (most recent call last):
2016-10-01T04:50:07.48946   File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
2016-10-01T04:50:07.48954     "__main__", fname, loader, pkg_name)
2016-10-01T04:50:07.48959   File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
2016-10-01T04:50:07.48963     exec code in run_globals
2016-10-01T04:50:07.48968   File "/ligo/apps/linux-x86_64/guardian-1.0.0/lib/python2.7/site-packages/guardian/__main__.py", line 262, in <module>
2016-10-01T04:50:07.49240     guard.run()
2016-10-01T04:50:07.49263   File "/ligo/apps/linux-x86_64/guardian-1.0.0/lib/python2.7/site-packages/guardian/daemon.py", line 452, in run
2016-10-01T04:50:07.49308     raise GuardDaemonError("worker exited unexpectedly, exit code: %d" % self.worker.exitcode)
2016-10-01T04:50:07.49380 guardian.daemon.GuardDaemonError: worker exited unexpectedly, exit code: -11
2016-10-01T04:50:07.61754 guardian process stopped: 255 0

As the error indicates, the worker process apparently died without explanation, which is not at all nice. (An exit code of -11 from a multiprocessing worker means the process was terminated by signal 11, SIGSEGV, i.e. it segfaulted rather than raising a Python exception.)
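A minimal sketch of that convention (plain Python, not guardian code; the deliberate segfault below is purely for demonstration):

    # Not guardian code: demonstrates the multiprocessing exit-code
    # convention behind "worker exited unexpectedly, exit code: -11".
    # A child killed by a signal reports exitcode = -signum; signal 11
    # is SIGSEGV on Linux.
    from __future__ import print_function
    import ctypes
    import multiprocessing

    def crash():
        # Read a C string at address 0 to make the child segfault.
        ctypes.string_at(0)

    if __name__ == '__main__':
        worker = multiprocessing.Process(target=crash)
        worker.start()
        worker.join()
        print('worker exitcode: %d' % worker.exitcode)  # -11 == -SIGSEGV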

I restarted the ALS_YARM node with guardctrl and it came back up fine.  The rest of the nodes recovered their connections soon after.  As of right now all nodes appear to be functioning normally.
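For the record, the recovery amounted to something like the sketch below (hedged: the guardctrl restart/status subcommands are as I remember them, and the status-string match is a guess, so check both against the site install):

    from __future__ import print_function
    import subprocess
    import time

    NODE = 'ALS_YARM'

    # Ask the supervisor to restart the dead node (subcommand assumed).
    subprocess.check_call(['guardctrl', 'restart', NODE])

    # Poll until the supervisor reports the node running again; the exact
    # status string depends on the backend, so this match is a guess.
    for _ in range(30):
        status = subprocess.check_output(['guardctrl', 'status', NODE])
        if b'up' in status.lower() or b'run' in status.lower():
            print('%s is back up' % NODE)
            break
        time.sleep(1)
    else:
        raise RuntimeError('%s did not come back after restart' % NODE)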

This "worker exited unexpectedly" error isn't one I've seen much at all, so I'm very curious what could have caused it.

corey.gray@LIGO.ORG - 09:40, Saturday 01 October 2016 (30137)

Logged an FRS (6338) for this.  Jamie was able to get everything back before ~2am, but since we had canceled the Owl shift (in this pre-ER10 epoch, operators are instructed NOT to wake up help in the middle of the night), the interferometer was down for more like 10 hours instead of 4.

Since this issue was fixed, the above FRS can now be closed.
