Reports until 09:10, Sunday 27 September 2015
H1 CDS
david.barker@LIGO.ORG - posted 09:10, Sunday 27 September 2015 - last comment - 12:53, Thursday 01 October 2015(21989)
restarted ext_alert.py, need to get this to autostart

The ext_alert.py script, which periodically queries GraceDB, had failed. I have just restarted it; instructions for restarting are at https://lhocds.ligo-wa.caltech.edu/wiki/ExternalAlertNotification

Getting this process to autostart is now on our high priority list (FRS3415).

Here is the error message displayed before I did the restart:

  File "ext_alert.py", line 150, in query_gracedb
    return query_gracedb(start, end, connection=connection, test=test)
  File "ext_alert.py", line 150, in query_gracedb
    return query_gracedb(start, end, connection=connection, test=test)
  File "ext_alert.py", line 135, in query_gracedb
    external = log_query(connection, 'External %d .. %d' % (start, end))
  File "ext_alert.py", line 163, in log_query
    return list(connection.events(query))
  File "/usr/lib/python2.7/dist-packages/ligo/gracedb/rest.py", line 441, in events
    uri = self.links['events']
  File "/usr/lib/python2.7/dist-packages/ligo/gracedb/rest.py", line 284, in links
    return self.service_info.get('links')
  File "/usr/lib/python2.7/dist-packages/ligo/gracedb/rest.py", line 279, in service_info
    self._service_info = self.request("GET", self.service_url).json()
  File "/usr/lib/python2.7/dist-packages/ligo/gracedb/rest.py", line 325, in request
    return GsiRest.request(self, method, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/ligo/gracedb/rest.py", line 201, in request
    response = conn.getresponse()
  File "/usr/lib/python2.7/httplib.py", line 1038, in getresponse
    response.begin()
  File "/usr/lib/python2.7/httplib.py", line 415, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.7/httplib.py", line 371, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib/python2.7/socket.py", line 476, in readline
    data = self._sock.recv(self._rbufsize)
  File "/usr/lib/python2.7/ssl.py", line 241, in recv
    return self.read(buflen)
  File "/usr/lib/python2.7/ssl.py", line 160, in read
    return self._sslobj.read(len)
ssl.SSLError: The read operation timed out

Comments related to this report
duncan.macleod@LIGO.ORG - 12:53, Thursday 01 October 2015 (22151)

I have patched the ext_alert.py script to catch SSLError exceptions and retry the query [r11793]. The script will retry up to 5 times before crashing completely; we may want to revisit that limit if it turns out not to be enough.
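For anyone not looking at the diff, the retry idea is roughly as follows. This is only a sketch and not the code committed in r11793: the helper name query_with_retry, the wait between attempts, and reading "up to 5 times" as 5 attempts in total are all assumptions made for illustration. It catches ssl.SSLError because that is the exception type shown in the traceback above.

```python
import ssl
import time


def query_with_retry(query_func, max_tries=5, wait=1.0, sleep=time.sleep):
    """Call query_func(), retrying whenever it raises ssl.SSLError.

    Hypothetical helper for illustration; not the code in r11793.
    max_tries counts attempts in total (an assumed reading of "up to 5 times").
    """
    for attempt in range(1, max_tries + 1):
        try:
            return query_func()
        except ssl.SSLError:
            if attempt == max_tries:
                # Out of attempts: let the error propagate so the
                # script still crashes visibly, as it does today.
                raise
            # Short pause before the next attempt (assumed value).
            sleep(wait)
```

Inside log_query, the call from the traceback could then be wrapped as query_with_retry(lambda: list(connection.events(query))). Re-raising on the last attempt keeps a persistent outage visible instead of looping forever.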

I have requested that both sites svn up and restart the ext_alert.py process at the next convenient opportunity (the next time it crashes).