Displaying report 1-1 of 1.
Reports until 19:03, Tuesday 18 March 2014
H1 SUS (CDS, DAQ, INS, ISC, SEI, SYS, VE)
jeffrey.kissel@LIGO.ORG - posted 19:03, Tuesday 18 March 2014 (10849)
Loose Wires in the Software
J. Kissel, S. Dwyer, D. Sigg, J. Rollins, A. Pele, H. Radkins, D. Gustafson, F. Clara, C. Reed, J. Batch, D. Barker

So, maintenance day was awesome. In summary, we had two problems that took us 4 hours to identify:
(1) The h1sush2a IOP model's DAC enable bit had gone bad after Sheila's restart of the h1suspr3 model. This left the mode cleaner misaligned, because alignment offsets were not getting out to the SUS. This has happened several times before; see, for example, LHO aLOGs 10375 and 8964.
(2) Some of the /opt/rtcds/lho/h1/ directory structure has disappeared for an as yet unknown reason. This resulted in the mode cleaner blowing up the FSS every time it tried to LOCK.

The story, in chronological order for the record:

- Kiwamu and I flip HXTS signs, store new safe snapshots, and leave with the IMC locked. (see LHO aLOG 10837)
- Sheila installs optical lever BLRMS into PR3 by restarting the h1suspr3 front end code, which is on h1sush2a computer. (see LHO aLOG 10837)
- Restarting the SUS front end model trips the HAM2-ISI watchdogs (but NOT HPI), because the ISI loses communication with the SUS.
- Found the IMC TRANS and REFL cameras grayed out, apparently broken / not reporting real data.
- Sent Fil out to check whether the cameras' analog path had been screwed up somehow. He confirms it's OK.
- Fil launched Cyrus and had him reboot the camera server processes on the relevant cameras, to no effect.
- Sheila, Arnaud, Hugh, and Daniel launch on the "has the alignment changed since we last had a good lock, and to what SEI trips does the change correspond" DataViewer trending game. Red herrings everywhere. Daniel finally identifies that the IMC is just misaligned. The cameras looked bad because some automatic exposure, gain, or aperture setting was searching for light and found none.
- Jim restarts the DAQ (see LHO aLOG 10832)
- Jamie, from his previous experience with the problem, identifies that the h1sush2a IOP model's DAC enable bit had gone bad. We all look at the h1sush2a's IOP CDS bit word (one bit in the middle of a bit word, which is unintuitively related to MC1 and MC3), and find, sure enough, the bit is red.
- Talk to Jim, who says the way to solve the problem is to kill all user front-end processes, restart the IOP process, and then restart all the user processes.
- Performing Jim's FIXIT works, and we can drive out of MC1 and MC3 (and the rest on h1sush2a, PRM and PR3) again.
- We used the guardian to bring each SUS back into the aligned state, and turned on the IMC guardian.
- The IMC aligns, but is constantly flashing, breaking the FSS lock on every attempt to lock.
- After bringing up all the front end processes, while the DAC enable problems disappeared, we were left with IPC errors in the CDS bit words for 3 of the 4 SUS.
- Looking to clear these errors because they're a suspect, we tried opening the GDS TP screens to hit the "Diag Reset" button, only to discover that MEDM throws an error when SUS GDS_TP screens are called, complaining they don't exist.
- Looking into the /opt/rtcds/lho/h1/medm/ folder, we find that *all* SUS, and several more automatically generated MEDM directories have disappeared.
- Launched Jim and Dave to investigate, continuing to explore why the IMC won't lock (see LHO aLOG 10850)
- Sheila and I toggle switches, BURT restore h1lscepics, h1ascimcepics, h1ecatc1plc2 to 2014-03-18 14:00 PT with no effect.
- Sheila checks analog IMC error signal (demod phase) and cabling at ISC racks. Everything looks great.
- Jim, to replace all of the missing MEDM files, begins systematically make-installing every model with missing files.
- When he gets to h1susmc2 BLAMO -- the IMC locks right up.
- Dick titles the aLOG.
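
For the record, the ordering in Jim's FIXIT (kill the user models, restart the IOP, bring the user models back) can be sketched as below. This is only an illustration of the sequence: the function name is hypothetical, the model names other than h1suspr3 are assumed from the optics listed above, and the code just builds an ordered list of steps rather than calling any real front-end start/stop scripts.

```python
def iop_recovery_plan(iop_model, user_models):
    """Return ordered (action, model) steps: stop users, restart IOP, start users."""
    steps = [("stop", m) for m in user_models]
    steps.append(("restart", iop_model))
    steps += [("start", m) for m in user_models]
    return steps


# Assumed model names for the h1sush2a front end (MC1, MC3, PRM, PR3).
plan = iop_recovery_plan(
    "h1iopsush2a",
    ["h1susmc1", "h1susmc3", "h1susprm", "h1suspr3"],
)
for action, model in plan:
    print(action, model)
```

The point of writing it down is the order: every user process has to be down before the IOP is restarted, and none come back until it is up again.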


As I said, it's still unclear why all of these files went missing, but some ~200 files were apparently *non-MEDM* files, and *some* of those files were essential for the IMC to lock. YAAAAAAAY MAINTENANCE DAY YAAAAAAAAY.

The investigation into why we lost files from /opt/rtcds continues. In the meantime, Jim has reinstalled all models with the missing files, and things appear functional, but we've tested little other than the IMC.
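
A quick check for this failure mode might look like the sketch below. It assumes each installed model has its own subdirectory under the MEDM folder (inferred from the missing directories described above, not confirmed); the function name and the model list are hypothetical. The demo runs against a throwaway directory so it can be tried anywhere; on site one would point it at /opt/rtcds/lho/h1/medm instead.

```python
import tempfile
from pathlib import Path


def missing_model_dirs(medm_root, models):
    """Return the models that have no subdirectory under medm_root."""
    root = Path(medm_root)
    return [m for m in models if not (root / m).is_dir()]


# Demo: a temporary "medm" folder containing only h1suspr3.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "h1suspr3").mkdir()
    demo_missing = missing_model_dirs(tmp, ["h1suspr3", "h1susmc2"])

print(demo_missing)
```

This only catches missing per-model MEDM directories; it would not have caught the ~200 non-MEDM files, which need their own inventory.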
