The frame writers became unstable after the PI model changes on Thursday 24th March. Attached is a plot of their restarts since that time. Initially fw0 was unstable; it then became stable and fw1 went unstable for periods of time. Note the regularity of the h1fw1 restarts: roughly every hour around the 30-minute mark, with occasional restarts around the 45-minute mark.
The hourly restarts around the 30-minute mark correlate with the hourly run of the wiper script (crontab starts it at 23 minutes past the hour, and it finishes around the 30-minute mark). I confirmed this by changing the crontab time from 23 minutes to 03 minutes at 10:40 today. From that time onwards the restarts happened around the 10-minute mark. h1fw1 restart times for today (most recent first) are:
27_Sunday_March_2016_14:06:49_PDT
27_Sunday_March_2016_13:08:09_PDT
27_Sunday_March_2016_12:11:29_PDT
27_Sunday_March_2016_11:08:50_PDT
27_Sunday_March_2016_10:45:11_PDT
--------------------------------- cronjob changed 23 to 03 minutes
27_Sunday_March_2016_10:29:01_PDT
27_Sunday_March_2016_09:30:52_PDT
27_Sunday_March_2016_08:28:12_PDT
27_Sunday_March_2016_07:44:03_PDT
27_Sunday_March_2016_07:30:23_PDT
27_Sunday_March_2016_06:27:18_PDT
27_Sunday_March_2016_05:42:08_PDT
27_Sunday_March_2016_05:27:59_PDT
27_Sunday_March_2016_02:28:18_PDT
27_Sunday_March_2016_01:28:08_PDT
27_Sunday_March_2016_00:26:28_PDT
This suggests SAMFS disk access or NFS file sharing as a possible cause of the problem. I'll work with Greg and Dan tomorrow to see what diagnostics we can run on these file systems.
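The shift in restart minute across the crontab change can be checked with a short script over the h1fw1 times listed above. This is only a minimal sketch: the names `parse_restart` and `minutes_by_period` and the hard-coded list are illustrative assumptions, not part of any existing tool.

```python
from datetime import datetime

# h1fw1 restart times copied from the list above (most recent first).
RESTARTS = [
    "27_Sunday_March_2016_14:06:49_PDT",
    "27_Sunday_March_2016_13:08:09_PDT",
    "27_Sunday_March_2016_12:11:29_PDT",
    "27_Sunday_March_2016_11:08:50_PDT",
    "27_Sunday_March_2016_10:45:11_PDT",
    "27_Sunday_March_2016_10:29:01_PDT",
    "27_Sunday_March_2016_09:30:52_PDT",
    "27_Sunday_March_2016_08:28:12_PDT",
    "27_Sunday_March_2016_07:44:03_PDT",
    "27_Sunday_March_2016_07:30:23_PDT",
    "27_Sunday_March_2016_06:27:18_PDT",
    "27_Sunday_March_2016_05:42:08_PDT",
    "27_Sunday_March_2016_05:27:59_PDT",
    "27_Sunday_March_2016_02:28:18_PDT",
    "27_Sunday_March_2016_01:28:08_PDT",
    "27_Sunday_March_2016_00:26:28_PDT",
]

# Crontab entry was moved from minute 23 to minute 03 at 10:40 PDT.
CRON_CHANGE = datetime(2016, 3, 27, 10, 40)

# English month names, spelled out to avoid locale-dependent parsing.
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def parse_restart(stamp):
    """Parse 'DD_Weekday_Month_YYYY_HH:MM:SS_TZ' into a naive local datetime."""
    day, _weekday, month, year, clock, _tz = stamp.split("_")
    hour, minute, second = (int(part) for part in clock.split(":"))
    return datetime(int(year), MONTHS.index(month) + 1, int(day),
                    hour, minute, second)


def minutes_by_period(stamps, change):
    """Return minute-of-hour values (oldest first) before and after `change`."""
    before, after = [], []
    for t in sorted(parse_restart(s) for s in stamps):
        (after if t >= change else before).append(t.minute)
    return before, after


before, after = minutes_by_period(RESTARTS, CRON_CHANGE)
print("minute of hour, before crontab change:", before)
print("minute of hour, after crontab change: ", after)
```

The same grouping could be applied to a longer restart history to see whether the restarts near the 45-minute mark follow any pattern of their own.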
Opened an FRS ticket, #5221: https://services.ligo-la.caltech.edu/FRS/show_bug.cgi?id=5221