Reports until 09:39, Wednesday 17 January 2018
H1 SEI (OpsInfo)
jim.warner@LIGO.ORG - posted 09:39, Wednesday 17 January 2018 (40151)
Some changes to seismon code, troubleshooting added to wiki

Our local copy of the Seismon code has been running pretty well for the last several months, but one of the Python EPICS scripts has been crashing about once a week. Typically, it's pretty easy to recover: ssh in to the hwinj2 machine and restart the crashed code. This week, more of the seismon earthquake code had crashed, which is new, but I'm assuming this had something to do with the computer restarts yesterday. While I was fixing this, I finally got smart enough to ask for some help with the more routine crash, and Michael Coughlin suggested adding some try-except blocks to the EPICS code. Previously, lines 209 and 239 were both:

 (eq_file, eq_gps, eq_mag, p_arr, s_arr, r20_arr, r35_arr, r50_arr, rvel, gps0, gps1, lat, lng, dist, depth, azimuth, dum, location, ifo) =  line.split()

These lines are now:

try:
    (eq_file, eq_gps, eq_mag, p_arr, s_arr, r20_arr, r35_arr, r50_arr, rvel, gps0, gps1, lat, lng, dist, depth, azimuth, dum, location, ifo) = line.split()
except:
    continue

Still running so far. I'll keep an eye on it.
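
For anyone who wants to see which lines are being skipped, here is a minimal sketch of a slightly more verbose variant. It is not the actual seismon code: the names FIELDS, parse_eq_line and parse_eq_lines are hypothetical, and it assumes the crash comes from a line that does not split into exactly the 19 whitespace-separated fields listed above (in which case the tuple unpacking raises ValueError). Instead of a bare except, it checks the field count explicitly and logs the offending line before skipping it.

```python
import logging

# Field names copied from the unpacking statement in the post above.
FIELDS = (
    "eq_file", "eq_gps", "eq_mag", "p_arr", "s_arr",
    "r20_arr", "r35_arr", "r50_arr", "rvel", "gps0", "gps1",
    "lat", "lng", "dist", "depth", "azimuth", "dum", "location", "ifo",
)


def parse_eq_line(line):
    """Return a dict of the 19 fields, or None if the line is malformed.

    Hypothetical helper: a line is treated as malformed when it does not
    split into exactly len(FIELDS) whitespace-separated tokens.
    """
    parts = line.split()
    if len(parts) != len(FIELDS):
        # Log the bad line so the cause of a skip can be checked later.
        logging.warning(
            "Skipping malformed line (%d fields, expected %d): %r",
            len(parts), len(FIELDS), line,
        )
        return None
    return dict(zip(FIELDS, parts))


def parse_eq_lines(lines):
    """Parse an iterable of lines, silently dropping the malformed ones."""
    events = []
    for line in lines:
        event = parse_eq_line(line)
        if event is None:
            continue  # same effect as the try/except/continue above
        events.append(event)
    return events
```

The values stay as strings here, as they would after line.split(); any conversion to floats would happen afterwards, as in the existing code.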

I've also added a troubleshooting section to Dave's seismon wiki page (https://cdswiki.ligo-wa.caltech.edu/wiki/seismon). I think the instructions I added under LHO Troubleshooting should be enough to recover the code from most of the failures I've seen so far.