H1 SUS (CDS, DAQ)
jeffrey.kissel@LIGO.ORG - posted 22:52, Wednesday 30 October 2013 - last comment - 05:21, Thursday 31 October 2013(8325)
ECR E1300578 and E1300261 Progress -- HLTS Models -- And crashed the Data Concentrator / Framebuilder
J. Kissel

I've now updated the HLTS front-end Simulink models as per ECR E1300578, similar to the QUADs, TMTS, and BSFM, as described in G1301192. After successful compilation of both H1SUSPR3 and H1SUSSR3, Fabrice informed me that Arnaud was gathering data to look for long-term drift on PR3, so I only installed and restarted SR3. Of course, upon successful compilation and install, I went to restart the front end with the new process and it hung halfway through, completely taking down the data concentrator / frame builder / DAQ and the entire h1sush56 front end. I attach a screenshot of the CDS overview screen. *sigh*. The reigning king of finding crazy obscure bugs in CDS and exercising them wins again. The only debugging I've done is to reboot the data concentrator once, by doing the following:

controls@opsws3:models 0$ telnet h1dc0 8087
Trying 10.101.0.20...
Connected to h1dc0.cds.ligo-wa.caltech.edu.
Escape character is '^]'.
daqd> shutdown
OK
Connection closed by foreign host.
controls@opsws3:models 1$

This brought *some* of the front ends back up and to green status (the SEI and SUS computers at the end stations), but the corner is cooked.

Sorry Arnaud, and anyone else who was gathering data overnight.

Giving up for the night and will continue fighting the good fight tomorrow morning.

I tried uploading the source from the target area, but the aLOG doesn't like tar.gz's at all.

------
Here's the status of the sus corner of the SVN repo that's a result of my work:

MM      common/models/HAUX_MASTER.mdl                  # haven't started on the HAUX yet
MM      common/models/HLTS_MASTER.mdl                  # changes complete, but don't want to commit until I can successfully start the front end process
M       common/models/SIXOSEM_T_STAGE_MASTER.mdl       # same as above
MM      common/models/MC_MASTER.mdl                    # haven't started on the HSTS yet
M       common/models/OMCS_MASTER.mdl                  # haven't started on the OMCS yet
M       common/models/SIXOSEM_T_WD_AC_MASTER.mdl       # changes complete, but don't want to commit until I can successfully start the front end process
M       common/models/SIXOSEM_T_WD_DC_MASTER.mdl       # changes complete, but don't want to commit until I can successfully start the front end process
MM      common/models/HSTS_MASTER.mdl                  # haven't started on the HSTS yet

M       h1/filterfiles/H1SUSTMSX.txt                   # Haven't committed since new code has been installed, these still need a hand clean up of now-vestigial filter banks
M       h1/filterfiles/H1SUSTMSY.txt                   #     | 
M       h1/filterfiles/H1SUSBS.txt                     #     | 
M       h1/filterfiles/H1SUSSR3.txt                    #     | 
M       h1/filterfiles/H1SUSETMX.txt                   #     | 
M       h1/filterfiles/H1SUSETMY.txt                   #     | 
M       h1/filterfiles/H1SUSITMX.txt                   #     | 
M       h1/filterfiles/H1SUSITMY.txt                   #     V
 
M       h1/models/h1susprm.mdl                         # haven't started on the HSTS yet
M       h1/models/h1sussrm.mdl                         #     |
M       h1/models/h1suspr2.mdl                         #     V
M       h1/models/h1suspr3.mdl                         # changes complete, but don't want to commit until I can successfully start the front end process
M       h1/models/h1sussr2.mdl                         # haven't started on the HSTS yet
M       h1/models/h1sussr3.mdl                         # changes complete, but don't want to commit until I can successfully start the front end process
M       h1/models/h1susomc.mdl                         # haven't started on the OMCS yet
M       h1/models/h1susmc1.mdl                         # haven't started on the HSTS yet
M       h1/models/h1susmc2.mdl                         #     |
M       h1/models/h1susmc3.mdl                         #     V
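
A note for reading the listing above: in `svn status` output the first column reports content changes and the second column reports property changes, so `MM` marks a file with both modified. A minimal sketch of a filter that picks out only the property-modified files follows. The function name `prop_modified` is hypothetical, and it assumes paths contain no spaces.

```shell
#!/bin/bash
# Hypothetical helper: print paths whose SVN *properties* are modified,
# i.e. the second status column is 'M' (as in the 'MM' entries).
# Assumes paths contain no whitespace.
prop_modified() {
    awk 'substr($0, 2, 1) == "M" { print $2 }'
}

# Demo on two lines copied from the listing (no SVN checkout needed).
printf '%s\n' \
    'MM      common/models/HLTS_MASTER.mdl' \
    'M       common/models/OMCS_MASTER.mdl' | prop_modified
# prints: common/models/HLTS_MASTER.mdl
```

Against a real working copy this would be used as `svn status common/models | prop_modified`.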


Comments related to this report
keith.thorne@LIGO.ORG - 05:21, Thursday 31 October 2013 (8328)
The front-end models are running - however data shipping to the data concentrator is not working (or only partially).
What is needed is to restart the mx_stream processes on each front-end.
  ** There should be a script 'restart_all_mxstreams.sh' in /opt/rtcds/lho/h1/target/h1dc0.  If you log into the boot server as 'controls' you should be able to run this script smoothly.
All this script (should) do is ssh onto each front-end, then do /etc/init.d/mx_stream stop, /etc/init.d/mx_stream start.  
* You can do this manually on each front-end to see if it fixes the problem.
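
For anyone who cannot find the script, here is a rough sketch of the loop described above. It is not the actual contents of 'restart_all_mxstreams.sh'. The host list is a placeholder (only h1sush56 is named in this entry), the function name is made up, and it assumes the init script can be run by the ssh user without further privileges. It defaults to a dry run that only prints the commands.

```shell
#!/bin/bash
# Sketch only: NOT the real restart_all_mxstreams.sh.
# FRONT_ENDS is a placeholder list; substitute the real front-end hostnames.
FRONT_ENDS="${FRONT_ENDS:-h1sush56}"
# DRY_RUN=1 (default) prints the ssh commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: stop, then start, mx_stream on one front end.
restart_mx_stream() {
    local host="$1"
    # ';' rather than '&&' so that start still runs if stop reports an error
    local cmd="/etc/init.d/mx_stream stop; /etc/init.d/mx_stream start"
    if [ "$DRY_RUN" = "1" ]; then
        echo "ssh $host '$cmd'"
    else
        ssh "$host" "$cmd"
    fi
}

for host in $FRONT_ENDS; do
    restart_mx_stream "$host"
done
```

Run it once as-is to check the printed commands, then rerun with DRY_RUN=0 and the full host list in FRONT_ENDS.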

[ and yes, we need more complete info, helpful docs consistent at both sites]