Reports until 11:57, Monday 30 September 2013
H1 CDS
james.batch@LIGO.ORG - posted 11:57, Monday 30 September 2013 (7909)
Recover from power failure
(David Barker, Cyrus Reed, James Batch)

LHO experienced a 2-second power failure at 00:40:43 PDT and a power glitch at 06:06 PDT, which caused front-end reboots.  Both CDS file servers were also affected.

The main file server entered a read-only state, while the backup file server locked up and was unresponsive.  Both file servers were restored by 08:30 PDT.

All front-end computers were powered off, as several were not logically attached to their I/O chassis.  On power-up, the following computers still could not find their I/O chassis:

h1seib1
h1seih45
h1sush2b
h1susquadtst
h1susex
h1susexaux

In each case, toggling the power switch on the front of the I/O chassis (on, off, then on again) caused the chassis to power up.  The computers were then power cycled so that they reattached to their I/O chassis.

At this point, there were IPC problems: channels could not be received from the following computers:

h1lsc0
h1asc0
h1seih23
h1seih16
h1seib2

To correct this problem, the Dolphin switches in the MSR needed to be power cycled, which in turn required power cycling every computer in the MSR attached to the Dolphin network.  Once this was accomplished and a few models were manually restarted, all appears to be well.  The system has been back in operation since 11:30 PDT.