Reports until 14:44, Tuesday 08 January 2019
H1 CDS (DAQ)
david.barker@LIGO.ORG - posted 14:44, Tuesday 08 January 2019 - last comment - 15:42, Tuesday 08 January 2019(46288)
CDS Maintenance Summary: Tuesday 8th January 2019

WP8030 New Dolphin EEPROM

Jonathan, Dave:

All dolphin'ed machines (except h1psl0) had their EEPROM version changed from 08 to 97. This firmware forces the dolphin IXH611 card to run in PCIe Gen1 mode (max bandwidth 2.5GT/s).

Procedure for this upgrade can be found in my wiki page

After each reboot, we verified the EEPROM version number and that the card reports x8 slot width with reduced bandwidth (excerpt from dis_diag shown):

            PCIe slot state            : x8, Gen1 (2.5 GT/s)
            PCIe slot capabilities     : x8, Gen2 (5 GT/s)
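A minimal sketch of how the check above could be automated, assuming the dis_diag output contains lines in exactly the format of the excerpt. The function names and the regular expression are illustrative and are not part of dis_diag or of the procedure on the wiki page.

```python
import re

# Matches lines such as "PCIe slot state : x8, Gen1 (2.5 GT/s)".
# The pattern is an assumption based only on the excerpt above.
SLOT_LINE = re.compile(
    r"PCIe slot (state|capabilities)\s*:\s*x(\d+),\s*Gen(\d+)\s*\(([\d.]+)\s*GT/s\)"
)


def parse_slot_lines(text):
    """Return a dict keyed by 'state' / 'capabilities' with width, gen and rate."""
    result = {}
    for match in SLOT_LINE.finditer(text):
        kind, width, gen, rate = match.groups()
        result[kind] = {
            "width": int(width),
            "gen": int(gen),
            "gt_per_s": float(rate),
        }
    return result


def is_gen1_x8(text):
    """True if the reported slot state is x8 running in Gen1 mode."""
    state = parse_slot_lines(text).get("state")
    return state is not None and state["width"] == 8 and state["gen"] == 1
```

Feeding the captured dis_diag text for one machine to is_gen1_x8 would give a quick pass/fail for the slot state after its reboot.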


No problems were encountered; the machine reboot order was closely coordinated with detector engineering.

WP8026 Reboot of DAQ computers close to their 208-day limit

Dave:

Close to the time for a DAQ restart due to model changes, h1tw1 was rebooted and h1broadcast0 was issued a poweroff command.

At the time of the DAQ restart, h1dc0 was rebooted. It did not automatically start daqd on startup, so I had to manually start the process from monit.

Soon afterwards I started h1broadcast0.

In all cases the reboot/poweroff process got stuck and the machine needed its front panel RESET button to be pressed (a known problem with gen1 front ends). Also in all cases an FSCK was needed due to the long uptime, slowing down the recovery time.
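A minimal sketch for tracking how close a machine is to the uptime limit, assuming the limit is 208 days as stated in this report and that uptime is read from the Linux /proc/uptime file (first field is seconds since boot). The function names are illustrative.

```python
SECONDS_PER_DAY = 86400.0
LIMIT_DAYS = 208  # uptime limit quoted in this report


def uptime_days(proc_uptime_text):
    """Convert the contents of /proc/uptime to days of uptime."""
    seconds = float(proc_uptime_text.split()[0])
    return seconds / SECONDS_PER_DAY


def days_until_limit(proc_uptime_text, limit_days=LIMIT_DAYS):
    """Days remaining before the machine reaches the uptime limit."""
    return limit_days - uptime_days(proc_uptime_text)
```

On a running machine the input would come from open("/proc/uptime").read(); a small or negative result suggests scheduling a reboot at the next maintenance period.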

WP7996 Removal of EZCAREAD parts from h1odcmaster

Dave:

I modified h1odcmaster to replace EzcaRead input parts with EpicsIn parts. A DAQ restart was required. I still need to add the CA data transfer to the CDS_CA_COPY guardian node.

WP8027 h1guardian1 removal of swap space

Jonathan, TJ:

h1guardian1 was configured to not have swap space (overflow memory on disk) and was rebooted.

WP7929 ISI Senscor HAM and BSC

Jim W, Jeff K, Dave:

New models for all BSC and HAM ISI plus h1seiproc were installed today. We found a problem with the front end model startup when FIR filters were removed but the FIR filter file still existed. Details of this upgrade are in Jim's alog.

Crash of h1iscey apparently due to running dis_diag

Dave:

After the Dolphin upgrades, I ran my script on h1build which ssh's onto each Dolphined front end and runs dis_diag. It proceeded through the corner station machines but when it got to h1iscey this machine rebooted itself, Dolphin crashing h1susey and h1seiey in the process.
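A minimal sketch of a loop of this kind, not the actual script on h1build. It assumes key-based ssh access and that dis_diag runs without arguments; the function names and the stop-on-timeout behaviour are illustrative choices. Stopping at the first unresponsive host avoids running the diagnostic on further machines while one is down.

```python
import subprocess


def build_command(host, remote_cmd="dis_diag"):
    """ssh command line for one host; BatchMode avoids password prompts."""
    return ["ssh", "-o", "BatchMode=yes", host, remote_cmd]


def run_on_hosts(hosts, runner=subprocess.run, timeout=60):
    """Run the remote command on each host in order.

    Returns {host: return code}, with None for a host that timed out.
    The loop stops at the first timeout so later hosts are left untouched.
    """
    results = {}
    for host in hosts:
        try:
            proc = runner(
                build_command(host),
                capture_output=True,
                text=True,
                timeout=timeout,
            )
            results[host] = proc.returncode
        except subprocess.TimeoutExpired:
            results[host] = None
            break
    return results
```

The runner argument exists only so the loop can be exercised with a stand-in function before it is pointed at real front ends.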

h1susb123 IOP Dackill

Dave:

After recovering the systems, h1iopsusb123 went into a DACKILL state which suggested a possible hardware problem with the 18bit AI chassis. I verified no one was in the CER, and then restarted all the models on h1susb123.

Crash of h1oaf0

Jim W, Dave:

During one of the h1seiproc restarts, the h1oaf0 models stopped running. When I ssh'd into the machine and ran "lsmod", the login session froze. I eventually rebooted the computer via the front panel RESET button.

 

Comments related to this report
david.barker@LIGO.ORG - 15:42, Tuesday 08 January 2019 (46291)

CAL/PCAL changes

Jeff K, Dave:

New h1cal[cs,ex,ey] models were installed. A DAQ restart was required.