EPICS variables that originate from the RCG don't seem to be updating this morning, although ones that originate in Beckhoff are updating. I don't see anything wrong on the CDS overview except the timing system errors, which have been there since Tuesday.
On the GDS screens, the GPS times are all stuck at slightly different times around Sep 03 2017 11:42:43 UTC. So far the times I have seen are within about 10 seconds of each other, and all models on the same IOP stopped at the same time.
We have had what looks like many nearby EQs over the last 16 hours.
h1boot locked up around 04:40 PDT. Sheila is rebooting it.
h1boot is back and the front ends look good. Sheila will try some testpoints and excitations.
Here are h1boot's system messages for early this morning. The last message before the freeze was an ntpd status change at approximately the time of the freeze. The next message is from the reboot at 09:39:08.
Sep 3 01:19:48 h1boot -- MARK --
Sep 3 01:39:48 h1boot -- MARK --
Sep 3 01:59:48 h1boot -- MARK --
Sep 3 02:19:48 h1boot -- MARK --
Sep 3 02:39:48 h1boot -- MARK --
Sep 3 02:59:48 h1boot -- MARK --
Sep 3 03:19:48 h1boot -- MARK --
Sep 3 03:39:48 h1boot -- MARK --
Sep 3 03:59:49 h1boot -- MARK --
Sep 3 04:19:49 h1boot -- MARK --
Sep 3 04:39:49 h1boot -- MARK --
Sep 3 04:41:40 h1boot ntpd[4865]: kernel time sync status change 6001
Sep 3 09:39:08 h1boot syslog-ng[4227]: Syslog connection established; fd='7', server='AF_INET(10.99.0.99:514)', local='AF_INET(0.0.0.0:0)'
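For reference, the freeze window can be picked out of the messages above by looking for a gap longer than the 20 minute MARK interval. This is only a rough sketch: the function names and the one minute slack are my own choices, and the year (2017) is assumed because syslog does not record it.

```python
from datetime import datetime, timedelta

# Excerpt of the h1boot messages quoted above, copied verbatim.
LOG_TEXT = """\
Sep 3 03:39:48 h1boot -- MARK --
Sep 3 03:59:49 h1boot -- MARK --
Sep 3 04:19:49 h1boot -- MARK --
Sep 3 04:39:49 h1boot -- MARK --
Sep 3 04:41:40 h1boot ntpd[4865]: kernel time sync status change 6001
Sep 3 09:39:08 h1boot syslog-ng[4227]: Syslog connection established; fd='7', server='AF_INET(10.99.0.99:514)', local='AF_INET(0.0.0.0:0)'
"""


def parse_timestamp(line, year=2017):
    """Parse the leading 'Mon D HH:MM:SS' of a syslog line (year is assumed)."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def find_gaps(lines, expected=timedelta(minutes=20), slack=timedelta(minutes=1)):
    """Return (earlier, later) pairs whose spacing exceeds expected + slack."""
    times = [parse_timestamp(line) for line in lines if line.strip()]
    return [
        (earlier, later)
        for earlier, later in zip(times, times[1:])
        if later - earlier > expected + slack
    ]


gaps = find_gaps(LOG_TEXT.splitlines())
for earlier, later in gaps:
    print(f"no messages between {earlier} and {later} ({later - earlier})")
```

On the excerpt above this flags a single gap, from the ntpd message at 04:41:40 to the reboot message at 09:39:08.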
Impact of h1boot freeze up:
The front end real-time processes were not affected by the freeze, nor was their data transfer to the DAQ. All EPICS IOCs on the front ends froze up, which mainly impacted the Guardian nodes, which received stuck data. MEDMs were also frozen at their 04:41 PDT values, and conlog did not receive any updates either. I suspect testpoint and excitation operations would have been unavailable during the freeze.
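Since the stuck GPS times were the visible symptom, a check along the following lines might catch this kind of freeze sooner. It is a minimal sketch only: the function name, the 5 second threshold, and the sample values are hypothetical, and in practice the samples would come from periodically reading a front end GPS time channel with an EPICS client.

```python
def stuck_since(samples, min_stall_s=5.0):
    """Report when a value that should keep changing stopped changing.

    samples: time-ordered list of (wall_clock_seconds, value) readings.
    Returns the wall-clock time of the last observed change if the value
    has been constant for at least min_stall_s, otherwise None.
    """
    if not samples:
        return None
    last_change_t, last_value = samples[0]
    for t, value in samples[1:]:
        if value != last_value:
            last_change_t, last_value = t, value
    latest_t = samples[-1][0]
    if latest_t - last_change_t >= min_stall_s:
        return last_change_t
    return None


# Hypothetical readings: a counter that ticks once per second...
healthy = [(0, 100), (1, 101), (2, 102)]
# ...and one that stops advancing after t = 1.
frozen = [(0, 100), (1, 101)] + [(t, 101) for t in range(2, 9)]

print(stuck_since(healthy))
print(stuck_since(frozen))
```

Comparing against the wall clock rather than against another EPICS value matters here, because during the freeze every value served by the front end IOCs was stale at once.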