J. Oberling, P. King
The investigation
Today we investigated what appeared to be a frozen PSL channel: H1:PSL-OSC_DCHILFLOW. Upon entering the laser diode room and comparing the data on the PSL Beckhoff computer's CHIL screen for the diode chiller to the front panel of the chiller itself, we found that none of the channels were reading properly (flow rate, temperature set point, actual temperature, conductivity). These data are read into the Beckhoff computer via RS-232 connections; the flow rate is read into the PSL Interlock Control Box, and the rest of the data are read directly into the Beckhoff computer via an add-in PCI RS-232 card. Digging into the Beckhoff code, it appeared that the Beckhoff data channels were not properly reading in the data from the diode chiller.
What we think happened
Back in May we performed several tests of the interlock system for the chillers to see how they behaved when certain cables were unplugged. See Peter's alog here, under "The Evidence," point e, where he talks about opening and closing the interlock switch. We unplugged the cables to open the interlock and plugged them back in to close it. When this was done, we did not restart the PSL Beckhoff PC; we should have, as RS-232 is not hot-swappable. As a result the channels for the diode chiller got stuck (interestingly enough, the crystal chiller's channels have all been fine, even though we performed the same test on it).
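For reference, the failure mode here is that a serial readback simply stops updating rather than throwing an error, so nothing downstream complains. The actual readout lives in the Beckhoff/TwinCAT code, but a minimal Python/pyserial sketch of the idea, with made-up port, baud rate, and timeout values, would look something like:

import time
import serial  # pyserial; illustrative only, the real readout is Beckhoff/TwinCAT code

# Made-up port and settings for illustration.
ser = serial.Serial("/dev/ttyS0", baudrate=9600, timeout=2.0)
last_update = time.time()

while True:
    line = ser.readline()          # returns b"" on timeout
    if line:
        last_update = time.time()  # fresh data from the chiller
    elif time.time() - last_update > 30:
        # No data for 30 s, e.g. the cable was unplugged and replugged:
        # the port has to be closed and reopened (which, in effect, is what
        # restarting the Beckhoff PC did for us).
        ser.close()
        ser.open()
        last_update = time.time()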
The solution
To fix this, we restarted the PSL Beckhoff PC. We first called the control room to inform them of our intentions, since restarting the PC shuts the PSL down. They put the IFO in a safe configuration (in this case they simply unlocked the IMC), and we shut the laser down and restarted the Beckhoff PC. Everything came back up just fine, and the PSL fired right up without issue. Looking again at the CHIL screen, the diode chiller data now match those shown on the front panel of the diode chiller. Problem solved!
One caveat
While we are now reading data on H1:PSL-OSC_DCHILFLOW, we do not trust the actual reported flow rate. The same holds true for the crystal chiller, H1:PSL-OSC_XCHILFLOW. Back in May we replaced the turbine-style flow sensors in the PSL chillers with vortex-style flow sensors (the new sensors have no moving parts). With the sensor change we also had to change the flow sensor calibration, a number we received from LZH, who had tested these flow sensors (we went from ~550 pulses/liter to 970 pulses/liter). Upon this change the flow rate reported by the crystal chiller "dropped" from ~18 lpm to 9.7 lpm, a change of almost a factor of 2. There was no associated drop in the HPO laser head flow rates (each of the 4 laser heads has its own flow sensor), so we do not trust this calibration number and therefore do not trust the flow rate being reported by the chillers. That said, we can still use the reported flow rate to monitor for a change in flow and as an indicator that something might be wrong: the flow rate should not change, and if it does then something is not working correctly, regardless of the absolute flow rate being output by the chillers. We do, however, trust the flow rates being reported by the HPO laser head flow sensors and by the water-cooled power meter flow sensors. To fix this, we will take the next opportunity when we have to swap the running chillers for the spare chillers. Before putting the spare chiller into operation, we will hook up an external flow meter on its own water circuit (before connecting the chiller to the PSL water circuit), measure the flow of the chiller, and adjust the pulses/liter calibration number until the chiller reports the same flow rate as the external flow meter.
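As a sanity check on the numbers above, and as a sketch of how the calibration would be set against the external flow meter, assuming the chiller simply divides the measured pulse rate by the pulses/liter setting (a back-of-the-envelope Python sketch, not chiller firmware; the external-meter numbers are made up):

# Reported flow [lpm] = pulse rate [pulses/min] / calibration [pulses/liter]
old_cal = 550.0    # ~pulses/liter, turbine-style sensor
new_cal = 970.0    # pulses/liter, LZH number for the vortex sensor
old_flow = 18.0    # lpm reported with the old calibration

pulse_rate = old_flow * old_cal   # pulses/min implied by the old reading
print(pulse_rate / new_cal)       # ~10.2 lpm, close to the 9.7 lpm now reported

# Planned fix: with the spare chiller on its own circuit, scale the
# pulses/liter setting until the chiller agrees with the external flow meter.
reported = 9.7     # lpm shown by the chiller (made-up example)
external = 17.5    # lpm from the external flow meter (made-up example)
corrected_cal = new_cal * reported / external
print(corrected_cal)              # pulses/liter value to dial in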
The h1fw1 computer is now only writing raw minute files to its locally attached SSD RAID file system. The h1fw2 computer is writing science and commissioning frames to the /cds-h1-frames file system through the h1ldasgw1 computer. We will monitor the performance of this configuration, and if it works reliably we will change the computer names to reflect their new usage. This should have no effect on h1nds1, other than the gap in data that occurred during the reconfiguration and restart of the frame and trend writers.
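While we monitor the new configuration, a quick way to confirm each writer is still producing files is to check the age of the newest .gwf on each file system; a minimal Python sketch (the local-SSD path here is a placeholder; only /cds-h1-frames is from above):

import glob
import os
import time

def newest_frame_age(top):
    # Age in seconds of the newest .gwf file anywhere under 'top'.
    frames = glob.glob(os.path.join(top, "**", "*.gwf"), recursive=True)
    if not frames:
        return None
    newest = max(frames, key=os.path.getmtime)
    return time.time() - os.path.getmtime(newest)

for fs in ["/cds-h1-frames", "/frames-local-ssd"]:   # second path is a placeholder
    age = newest_frame_age(fs)
    print(fs, "no frames found" if age is None else "newest frame %.0f s old" % age)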
I am starting work on WP 5471, which will move the DMT to EPICS IOC from a test install to a permanent fixture. Dave will then configure the EDCU to capture the channels in the frames. There will be some interruptions on H1:CDS-SENSMON_CAL_SNSW_EFFECTIVE_RANGE_MPC and H1:CDS-SENSMON_CAL_SNSW_EFFECTIVE_RANGE_MPC_GPS.
This is follow-up work on previous work listed in the alog at: https://alog.ligo-wa.caltech.edu/aLOG/index.php?callRep=20925
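In the meantime, i.e. until the EDCU change and DAQ restart are in, the values can still be read live over EPICS; a minimal sketch using pyepics (assuming pyepics is installed and the machine has EPICS access to these channels):

from epics import caget

# Read the DMT-to-EPICS SenseMon channels directly over EPICS; requires
# pyepics and network access to the H1 CDS EPICS channels.
range_mpc = caget("H1:CDS-SENSMON_CAL_SNSW_EFFECTIVE_RANGE_MPC")
range_gps = caget("H1:CDS-SENSMON_CAL_SNSW_EFFECTIVE_RANGE_MPC_GPS")
print("SenseMon range: %s Mpc (GPS %s)" % (range_mpc, range_gps))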
I have added the following three guardian DIAG nodes:
DAQ restart required to acquire new channel names.
These nodes will all be folded under the IFO top node, but have not been yet. Waiting for the go-ahead from ops...
I'm also in the process of updating the GUARD_OVERVIEW screen to add these new nodes. I will post when that screen has been updated.
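For context, each DIAG node is a small Guardian module; a minimal sketch of what one generally looks like, assuming the standard GuardState interface (the state names, channel, and check below are illustrative placeholders, not the contents of the new nodes):

# Illustrative Guardian-style DIAG node, not one of the new nodes. In a
# Guardian module, 'ezca' (EPICS access) and 'notify' are provided by the
# Guardian environment; the channel and threshold here are placeholders.
from guardian import GuardState

class INIT(GuardState):
    def main(self):
        return 'MONITOR'

class MONITOR(GuardState):
    def run(self):
        if ezca['PSL-OSC_DCHILFLOW'] < 0.1:    # placeholder check
            notify('example DIAG condition flagged')
        return True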
In discussion, we are proceeding with tying this into the READY bit. Expect an alog from Jamie, but be aware that you may need to look further at some things if you cannot hit the INTENT bit. For example, all SDF must be clean (not red) now. You may need to consult the on-call commissioner if you cannot interpret and address any red SDF differences before going into observation mode. We have some time before the start of O1, so we'll be troubleshooting this as needed this week and next.
At least twice it was just stage 2. Is something going wrong with the ISI? Or is this a side effect of the SR3 problems, which could cause the MICH ASC to get a kick?
Over the weekend, while it was windy, Evan noticed that we were close to saturating PRM M3. This is because the offloading I did (alog 19850) didn't do as good a job as the M2 offloading, even though it solved the M2 range issue.
Tonight I had a second look. I moved the roll-off up in the sus comp filter, moved the complex zeros down a bit to gain some phase, and removed the 3 Hz pole (this design was originally intended for a crossover above the sus resonances and didn't include the low-frequency pole we are using). The filter comparison is the second attached screenshot. The first one shows the current crossover measurement, the third shows the drives. The prominent peak at 0.6 Hz could be from an ISI. The last screenshot shows the current configuration of the filter.
None of this is in guardian, so the next time things are locked it should revert to the old offloading.
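To make the kind of change described above concrete, here is a rough scipy sketch of comparing the phase of two zero/pole designs near a notional crossover; the frequencies, Q, and crossover point below are placeholders, not the installed filter (the real comparison is in the attached screenshots):

import numpy as np
from scipy import signal

def complex_pair(f0, q):
    # Left-half-plane complex pair at natural frequency f0 [Hz] with quality factor q.
    w0 = 2 * np.pi * f0
    re = -w0 / (2 * q)
    im = w0 * np.sqrt(1 - 1 / (4 * q ** 2))
    return [re + 1j * im, re - 1j * im]

f = np.logspace(-1, 1, 400)          # 0.1-10 Hz
w = 2 * np.pi * f

# Placeholder designs: "old" has complex zeros near 2 Hz and a 3 Hz pole,
# "new" moves the zeros down and drops the pole to gain phase at the crossover.
designs = {
    "old": (complex_pair(2.0, 3.0), [-2 * np.pi * 3.0], 1.0),
    "new": (complex_pair(1.2, 3.0), [], 1.0),
}

for name, (z, p, k) in designs.items():
    _, h = signal.freqs_zpk(z, p, k, worN=w)
    i = np.argmin(np.abs(f - 1.0))   # notional 1 Hz crossover
    print("%s: phase at 1 Hz = %.1f deg" % (name, np.degrees(np.angle(h[i]))))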
Tonight was our chance to work with PRMI.
We had a look at the phasing of AS36, following the same procedure as we used with MICH bright the other day (alog 20961): locked PRMI, aligned by hand, used OM1 and OM2 to align all the light onto one quadrant, and maximized the Q signal.
For AS A the results were satisfying.
For AS B the situation was also reasonable, but everything was a little less good (the phases are similar for all quadrants but not quite the same, the signal levels varied more, and when exciting the BS the phases were almost right for PIT and YAW). Overall this seems reasonable to use. The resulting phases are in the attached screenshot; the new phases are the EPICS values, the old phases are the setpoints. AS B quadrant 3 did not change; it is 63 degrees.
I will revert these momentarily, but the next steps would be to set the matrix for AS A yaw back to something reasonable (the current setup doesn't use segments 1 and 2), set the phases to the values in the screenshots, lock DRMI, excite SRM, and move the phase of all 4 quadrants on each WFS in common to minimize the SRM signal in the Q phase. Since DRMI locking is frustrating tonight with the SR3 glitches, this will have to wait.
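For reference, the phasing adjustment above amounts to rotating the demodulated I/Q pair of each quadrant; a small numpy sketch of scanning that rotation to put an excitation line entirely into one quadrature (illustrative only, with a made-up signal, not the actual tooling):

# Scan a demod phase rotation and find the angle that puts an injected
# excitation line entirely into one quadrature. Purely illustrative.
import numpy as np

def rotate_iq(i, q, phi_deg):
    phi = np.radians(phi_deg)
    return (i * np.cos(phi) + q * np.sin(phi),     # rotated I
            -i * np.sin(phi) + q * np.cos(phi))    # rotated Q

rng = np.random.default_rng(0)
t = np.arange(0, 10, 1e-3)
line = np.sin(2 * np.pi * 7.5 * t)                 # drive line (e.g. a BS excitation)
theta = 40.0                                       # arbitrary "true" signal phase
i = line * np.cos(np.radians(theta)) + 1e-2 * rng.standard_normal(t.size)
q = line * np.sin(np.radians(theta)) + 1e-2 * rng.standard_normal(t.size)

phis = np.arange(0.0, 180.0, 0.5)
amp_in_i = [np.std(rotate_iq(i, q, p)[0]) for p in phis]
best = phis[int(np.argmax(amp_in_i))]
print("rotation putting the line into I: %.1f deg (injected %.1f deg)" % (best, theta))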
[Sheila, Anamaria, Jenne, Cheryl, Dan, Evan. Richard via phone]
SR3 glitches are still causing us grief. This is a continuation of the story started last night (alog 21046), and worked on earlier in the day (alog 21062). After some heroic efforts, we have determined that we cannot lock in this state.
To prove to ourselves that it was indeed a problem with the analog actuation chain, we investigated turning different pieces of the analog electronics off. The first attached Dataviewer screenshot shows the NOISEMON channels of the SR3 M1 stage throughout this investigation. When the local damping is on, we cannot see the glitches very clearly in the noisemon channels, but we do see them in the voltmon channels (the second attached screenshot shows that the T2 voltmon channel does in fact see the glitches, so it's not a broken monitor). When the local damping is off, we only see the glitches in the noisemon channels.
Since we do not see the glitches when the AI chassis is powered off, we infer that the noise is not generated in the coil driver board. (Note that we also borrowed a triple top coil driver chassis from the H2 storage racks, S1100192, but did not swap it in since we don't think the noise is coming from there). We have not, however, determined whether the noise is coming from the AI board (probably not, since it's still there after a swap?) or the DAC.
We tried a few times to lock the IFO after the AI board swap, but we were continually losing lock before the CARM reduction was complete. Lockloss investigations showed that the problem for most of these was SR3 motion.
At this point, we have determined that we need more expert brains to have a look at the analog electronics. The owl operator shift has been cancelled, since there will be no more full IFO locking happening until this problem is resolved.
Looking at the noisemons for a 5 hour stretch when I knew no one was on site. See attached.
I would like to help here, but I am not 100% sure I follow the picture. After the AI swap (which seemed logical to me), you saw no more glitches (presumably in the noise monitors for those channels?). Then the story gets hazy. You once more tried to lock and still see SR3 motion as a lockloss initiator. What exactly is left that you suspect is causing inadvertent motion in SR3?
At LLO there are problems with engaging the PRC2 ASC loop when the recycling gain is low, but we don't seem to have this problem at LHO. One theory about the difference could be that we simply arrive in lock with a higher recycling gain. Although I believed this myself, the data I downloaded from the first 13 days of ER8 seem to indicate that this is not the difference. I looked at times when we first arrived on resonance (112 examples), when we transitioned to DC readout (after the ASC, including the soft loops, has been on for about 1 minute; 85 examples), and after we powered up to the maximum available power (60 examples).
The recycling gain corresponding to a critically matched carrier TEM00 is ~33.5. The successful plots show 32.5 as the lowest value. One might conclude that we typically arrive at the over-coupled case after initial lock. However, the conclusion that the wavefront sensor sign flips at the critically matched point is only true if we neglect mode matching. If the incoming beam is larger than the cavity beam, the second-order mode will support the over-coupled case, whereas a small incoming beam will reduce it.
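For a rough picture of where the sign flip sits in the simple view (plane wave, lossless PRM, one effective mirror for the rest of the IFO; a sketch of the standard cavity result, not a full modal calculation), the reflected carrier amplitude on resonance is

\[ \left.\frac{E_{\mathrm{refl}}}{E_{\mathrm{in}}}\right|_{\mathrm{res}} = \frac{r_2 - r_1}{1 - r_1 r_2}, \]

where r_1 is the PRM amplitude reflectivity and r_2 the effective amplitude reflectivity of the rest of the IFO. The reflected carrier, and with it the sign of the WFS error signal, changes sign at the critically matched point r_1 = r_2, with r_2 > r_1 the over-coupled case. With imperfect mode matching, part of the input power sits in higher-order modes that do not build up in the cavity, so the recycling gain at which the flip effectively occurs shifts away from the ideal TEM00 value of ~33.5, in the direction described above.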
SR3 is still glitching - IFO did relock a few times, enough to verify that the problem is not fixed.
OWL shift called off - Jim was OWL OPS, but staying home.
Sheila, Jenne, Anamaria - staying to make alogs.
shift started with PRM satellite box swaps (8/31 23:00 - 9/1 00:10UTC, one hour)
relocking issues: 9/1 00:10UTC to 5:14UTC
- alignment issues that I corrected without a full realignment (ended 9/1 1:00 UTC)
- delays from the BS ISI tripping, which happened twice (ended 2:14 UTC)
- delays from SR3 glitching, preventing the IFO from going to CARM_5PM (2:14 UTC to 5:14 UTC)
Having called Vern, Richard, Dave: 5:14UTC
Dave restarted h1nds1, so it's back
- option later tonight is to power cycle the computer and then log in and restart the process
Richard
- options are to power cycle AI chassis and or coil drivers for SR3
- and then change the coil driver if power cycle doesn't work
Vern
- ok to call off OWL shift if no SR3 fix and therefore no locking
- have emailed Jim who's on OWLs, to let him know
Currently:
Sheila and Jenne out to LVEA to power cycle AI chassis and coil driver on SR3 (5:46UTC)
rebooted AI and coil driver - no fix
- no change
powering off AI for a few minutes 6:08:11UTC
- voltmon power spectra show the T2 noise is now gone
AI chassis swapped out 6:24UTC
- SR3 looks better, going to CARM_10PM to look at it there
Anamaria's comment about voltmon:
when moving SR3 at DC using the alignment sliders, we see that 1 urad of pitch on M0 shows up as 0.5 urad of pitch on the oplev and 2 cts on the voltmon.
h1nds1 daqd process stopped running at 21:47 this evening
computer did not lock up. I logged in and started the process but it quickly stopped again.
I restarted a second time and now it is running. Investigation is continuing.
Monit is not restarting the process; as root I started it with /etc/init.d/daqd_nds1 start
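For reference, a minimal Python sketch of wrapping that manual restart in a check (illustrative only; the process name and the need to run as root are assumptions):

import subprocess

def daqd_running():
    # pgrep exits 0 if a matching process exists
    return subprocess.call(["pgrep", "-x", "daqd"]) == 0

if not daqd_running():
    # same command used by hand above; needs to run as root
    subprocess.call(["/etc/init.d/daqd_nds1", "start"])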
If h1nds1 stops running overnight and will not restart itself, we have two options:
The work has been completed. The IOC is installed on h1fescript0. It will start at system boot. Dave has put the DAQ changes in so that it will be captured in the frames after the DAQ restart.