Sat amp S1100173 was brought to the shop missing -14VDC. A short between this supply and ground was found, as well as a burnt trace. C340 was found to be shorting -14VDC to ground. Replaced the 10µF 25VDC tantalum cap and repaired the burnt trace. See aLog 29304. This action has also been recorded in E-traveler.
Let the saga commence...
Jenne called me this morning and told me that while checking the status of the IFO before she came in today she noticed that the PSL was off. I logged in remotely to confirm and drove out to investigate. I found the laser and the crystal chiller off, diode chiller still running, and the crystal chiller flow, FE flow, and NPRO OK interlocks all tripped. Trends indicate this happened on 8/28/2016 at 01:19:38 UTC (8/27/2016 at 18:19:38 PDT).
Attempt #1
I turned the crystal chiller on, and it turned off within 5 seconds, this time with a head 1-4 flow error (H1:PSL-OSC_FLOWERR) interlock trip. The cause here was the HPO laser head 4 flow meter was not getting above the minimum threshold of 0.4 lpm. I called Peter to consult, and we decided to lower the threshold to 0.2 lpm. I then attempted to restart the crystal chiller and everything came up fine. The flow for head 4 was sitting at ~0.38 lpm, and over the next 10 or so seconds it increased to 0.51 lpm. After a minute or so it dropped to ~0.46 lpm and then slowly worked its way back to ~0.53 lpm (maybe something working its way through the head 4 laser head water circuit?). I let the chillers run for a few minutes to ensure everything was working properly. Trends indicate that this was NOT the original cause of the PSL trip.
Attempt #2
I then restarted the HPO, and everything came up OK so I let it warm up for a few minutes. After about 7 minutes or so the laser turned off again, this time with the HPO DB heatsinks overtemp (H1:PSL-OSC_DBHTSNKOVRTEMP) interlock tripped. Looking at the chiller screen, the temp of the water in the diode chiller had slowly increased from its setpoint of 20 °C to ~28 °C, which caused the interlock to trip. Checking on the diode chiller, the front panel was showing an 'F-3 error' message. I don't know what this message means, but coupled with the fact the water had gotten so warm, I'm assuming something with the chiller cooling system stopped working (fans maybe? F-3 could be fan 3? Will look into it more tomorrow). I had to power cycle the diode chiller to clear the error. After restarting the diode chiller, everything appeared to work fine. Trends indicate that this was also NOT the original cause of the PSL trip.
Attempt #3
I restarted the HPO and let things sit for a while to warm up. The front end laser came up without issue, and the system injection locked as expected so I let things warm up for a bit before activating the PSL stabilization subsystems.
While waiting for everything to warm up, I trended every crystal chiller based interlock we monitor at the time of the trip to try to see which one killed the laser first (see first attachment). As can be seen, the only interlocks that tripped were H1:PSL-IL_XCHILFLOW (crystal chiller flow interlock), H1:PSL-AMP_FLOWERR (Front End flow error interlock), and H1:PSL-AMP_NPROOK (is the NPRO running interlock). They all tripped within the same second, so my best guess at this point is that something happened with the Front End flow that tripped the interlock. What that could be, I'm not sure; trending the humidity over the same period shows no evidence of a water leak.
While the injection locked system was warming up the laser tripped out again, once again with the HPO Head 1-4 Flow Error interlock tripped. Full data trends show no reason why the interlock would trip; none of the channels appear to drop below 0.2 lpm (see 2nd attachment). The MEDM version of the Beckhoff interlock screen also showed that the HPO Diode temp guard interlock had tripped, although the Beckhoff screen did not indicate a trip of this interlock and a full data trend also showed no trip of this interlock. The reason for the discrepancy is unclear at this time.
Attempt #4
I turned the chiller back on and let it run, all OK. I brought up the HPO (no issues), the front end (also no issues), and turned on the injection locking (also also no issues). While waiting for the system to warm up, I set up a StripTool of the 4 laser head flows and sat here and watched it. Less than 10 minutes later the laser tripped out again, same HPO Head 1-4 Flow Error interlock tripped. I happened to be looking at the StripTool when it happened and noticed the flow for laser head 4 drop, come back up, and then the laser tripped. Zooming in on the spot (see 3rd attachment) shows this. It doesn't show the flow drop below the threshold of 0.2 lpm, but I don't think StripTool is fast enough to catch it. It appears we have an issue with the water circuit for the 4th laser head in the HPO. Humidity trend doesn't indicate a leak, so something is likely going wrong with the flow sensor.
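To illustrate why a slowly updating display could miss a brief dip below the interlock threshold, here is a minimal sketch. Everything except the 0.2 lpm threshold is made up for illustration: the 1 kHz record, the 20 ms dip to 0.15 lpm, and the 0.5 s display interval are assumptions, not measured values or actual StripTool settings.

```python
def min_seen(signal, dt_signal, dt_sample):
    """Minimum value observed when `signal` is read once every dt_sample seconds."""
    step = max(1, round(dt_sample / dt_signal))
    return min(signal[::step])

THRESHOLD = 0.2  # lpm, the lowered head 1-4 flow threshold quoted above

# Hypothetical flow record: 10 s at 1 kHz, steady 0.5 lpm with a 20 ms dip.
dt = 0.001
flow = [0.5] * 10000
for i in range(4010, 4030):
    flow[i] = 0.15

fast_min = min_seen(flow, dt, dt)    # reads every sample
slow_min = min_seen(flow, dt, 0.5)   # reads one sample every 0.5 s

print("fast reader sees trip:", fast_min < THRESHOLD)
print("slow reader sees trip:", slow_min < THRESHOLD)
```

In this toy case the fast reader catches the dip and the slow reader does not, which would be consistent with an interlock tripping on a drop that never appears in the plotted trace.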
I've left the laser and the crystal chiller OFF. We will go in to the PSL tomorrow and attempt to figure out what the problem is. Please do not try to restart the laser.
As an aside
I took the opportunity while the laser was down to watch the current for HPO diode box 3 (DB3). We have had an issue where the current shows 100.2A drawn, which is impossible as the power supplies max out at ~60A (originally reported here and FRS 5955). I watched the reported current for HPO DB3 during the PSL restart process to see if it always reports 100A or if there is a spot during the startup procedure where it changes. See the below table:
PSL State | HPO DB3 Reported Current Draw (A) |
HPO off | 0.0 |
HPO on | 50.2 |
FE on | 50.2 |
Injection Locked | 50.2 |
Unfortunately I couldn't get any farther in the startup procedure (see above), but there was no point today where the reported current draw of HPO DB3 was 100A. Will monitor this the next time we restart the laser to see if there's a point where the reported current draw changes.
Filed FRS #6105.
Forgot to mention: with all the turning on and off of the crystal chiller, I ended up adding 175 mL of water to keep it full.
C. Cahillane
I had a look at the ER9 actuation measurements vs. the new DARM model, looking for a comparison to O1 covariance values. The results are shown in the hard-to-see plot below. I cut off the frequency vectors below 6 Hz because the noise was very high and the DARM model did not fit well with the measurements there. The results indicate greatly inflated variance and covariance, but this is probably due to the lack of linear fitting. The UIM stage, for instance, is a factor of 2 off at 100 Hz again, just like in O1. This time I haven't yet removed these systematics, I'm just looking purely at measurements/DARM model.
For a better idea of what the covariance values mean, I have made a correlation matrix. The correlation of two variables has the following definition:
corr(x,y) = cov(x,y) / (std(x) std(y)),    -1 <= corr(x,y) <= 1
So if I take the covariance matrix and divide each element (i,j) by std(i) std(j), i.e. by the square roots of the corresponding diagonal elements, I will end up with a correlation matrix. The covariance matrix is shown in the titles of the plot attached to this alog. The correlation matrix is shown here:

            Re(A_U)   Im(A_U)   Re(A_P)   Im(A_P)   Re(A_T)   Im(A_T)
Re(A_U) |   1.0000   -0.7111    0.5866    0.1474   -0.3010    0.1088 |
Im(A_U) |  -0.7111    1.0000   -0.2323    0.2400    0.1260    0.1124 |
Re(A_P) |   0.5866   -0.2323    1.0000    0.0672   -0.0510    0.3610 |
Im(A_P) |   0.1474    0.2400    0.0672    1.0000   -0.4935    0.3717 |
Re(A_T) |  -0.3010    0.1260   -0.0510   -0.4935    1.0000   -0.3554 |
Im(A_T) |   0.1088    0.1124    0.3610    0.3717   -0.3554    1.0000 |
The diagonal of the correlation matrix is unity, as expected. The largest correlation is between the real and imaginary parts of the UIM stage, which is to be expected since the UIM stage has a large systematic error polluting its statistics. The other stages do not suffer from large systematic errors, and their correlations with each other all have magnitudes below 0.5. Once I've made the systematic error fits I'll post this matrix again in a comment for comparison.
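The covariance-to-correlation step can be sketched in a few lines. This is only an illustration of the definition given above, not the script used for the plot; the 2x2 covariance values are made up.

```python
import math

def correlation_from_covariance(cov):
    """Divide each cov[i][j] by std_i * std_j, where std_i = sqrt(cov[i][i])."""
    n = len(cov)
    std = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (std[i] * std[j]) for j in range(n)] for i in range(n)]

# Made-up covariance matrix for two variables (variances 4.0 and 1.0).
cov = [[4.0, -1.2],
       [-1.2, 1.0]]

corr = correlation_from_covariance(cov)
for row in corr:
    print(["%7.4f" % c for c in row])
```

The diagonal comes out as unity by construction, and every off-diagonal element of a valid covariance matrix lands between -1 and 1.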
New ISS outer loop prototype board was modified and put in. Modification details will be in a separate alog.
PD5, 6, 7, and 8 anode and cathode cables were disconnected from the old transimpedance box under HAM2 and were connected to the PD5-8 cable for the new prototype board. One strange thing about the old setup was that the "LR C" and "LR A" cables that come from the chamber feedthrough were connected to the anode and cathode inputs of the old box, respectively. Since apparently they gave us a correct signal (no forward bias across the diode), I connected the "LR C" chamber side cable to the "LR A" rack side cable for the new setup.
I connected PD5-8 cable to PD1-4 input in the new prototype box. We can conveniently switch back and forth between the old and the new by just switching one cable that goes to the "outer loop input" connector on the 1st loop chassis.
I briefly connected the output of the new box to the 1st loop and enabled the servo, but it didn't work. It's not clear if the sign is correct, and there's no way to change the sign externally; I would need to open the box. I'll make more measurements next week to see if the sign is indeed wrong.
I connected the old second loop back to the 1st loop.
Cold solder joint in the short circuit protection box.
Initially I was confused by the gain control of the VGA chip in the new box, as there seemed to be a factor of 2 missing in the gain control chain. It turns out that the PSL people put a small box on the anti-imaging chassis front panel in the CER. These boxes are just DB9 pass-throughs, except that all traces have a 100 Ohm resistor in series.
Anyway, pin 1 of this box, which corresponds to the positive leg of the VGA gain output, didn't have any signal coming through. When I opened the box I was baffled to find no visible fault, but it turns out that just a tiny amount of bending would cut the connection. I re-soldered the resistor and the connectors, and now it seems to be good.
But this is not the reason why the new servo didn't work.
This is the box that was fixed.
There appears to be an entry in DCC for S1201761, but I don't have permission to view that.
Update:
Vern has the permission to view S1201761 and has found that the corresponding D document is D1102351, aLIGO PSL DAQ DB9 breakout. I can view that D document without problem.
The summary on that page doesn't sound like it was for short circuit protection as I had initially guessed, but rather an in-line damper against high frequency oscillation of the AI output amplifier. I don't know if we need this, but if we do, I'd strongly suggest improving the mechanical design (or moving them inside the AI chassis).
While I was working on ISS outer loop at the PSL field rack, H1:PSL-MIS_FLOW_OK tripped.
Peter King walked me through the diagnostics over the phone and I was able to open the shutter.
2.5 hours into being locked at 50 W, we saw two PI ring up: ITMX 15522 Hz (Mode 2) and ETMX 15541 Hz (Mode 17). We couldn't damp them and they broke the lock (22:22 UTC).
We had about 10 minutes before lockloss to try different damp settings. For Mode 2, any phase/gain changes I made rang up the mode more; the existing settings were already providing the most shallow slope. For Mode 17, a slight phase change helped but instability still bounced back. Below is Mode RMS monitor after the struggle. (Green and Red modes only appear to be ringing up from bleed over).
I've also attached the trend of the two modes.
It looks like something really kicked Mode 2; there are strong slope changes before we started trying to adjust damp settings. Both modes stayed rung up into the next lock acquisition, which I've never seen before; we could see them on the RMS monitor before DC READOUT. I stopped at 30 W to tweak phase settings again and allow the modes to damp. These settings have been saved to the guardian. After that we continued to 50 W and stayed locked for almost 2 hours without incident (lockloss unrelated to PI). Overall, after tweaking at 30 W for full peak minimization, I've lowered gains (perhaps the high gains were 'kicking' things too much?).
This is the same time scale into the lock as we saw the other night, though note that ETMX did not ring up and ITMX did not have any active damping that night. Also note the oscillation in Mode 17 ~1156310300 in the trend.
We're leaving the interferometer locked at 50 W with the tweaked PI settings so we'll see in the morning.
Most of tonight's work was measuring the HARD Pit loops, to give them some low noise settings to match the pre-existing Yaw low noise settings. I also tried moving around the spot position on the AS_C QPD, which is pretty much the only QPD whose offsets we haven't tried tweaking lately.
It's clear on the AS camera as well as on the ALS X green camera that something moves significantly between 2W and 50W in yaw. I was hoping that perhaps adjusting the SRC alignment would help with this somehow, but it didn't. What I do think it might help with though is the fact that previously, we've seen our sideband powers drop significantly when moving POP and Soft offsets to increase the carrier power recycling gain. I haven't tested this yet though. The first two attachments show the improvement in AS90 and the AS90/POP90 ratio when the AS_C spot is moved. The max value in yaw that I moved to was 0.94, but this was obviously too far. Perhaps tomorrow I'll picomotor a little bit so I can go a little farther, if this indeed helps the sideband buildups.
I moved the filters in the CHARD loops around today so that I could fit in a new lowpass at 200Hz. CHARD has been acting funny lately, and is likely why we've been tripping the quads' rms watchdogs the last day or so. To at least help, I put in a high-freq cutoff so that we can at least limit the amount of sensor noise we send to the quads. Probably we should add this to the DHARD loops too, since none of them have had cutoffs of any kind during the acquisition sequence for a long time now. I modified the ISC_LOCK guardian, as well as the sensing matrix code to be aware of the new locations of the different filters.
I measured the HARD pitch loops several times while modifying the loop shapes. Both CHARD and DHARD now have 50W modifying filters that move the plant compensation frequency response to match the 50W resonant frequencies rather than the 2W freqs. They also both have a 25Hz cutoff. For CHARD pit I can lower the gain by a factor of 2, but for DHARD pit I can't. Most of the measurements in the attached screenshots are with the BoostHBW still on. These should have been turned off, and are turned off for the Yaw loops when we go to low noise mode, in order to win back enough phase to reasonably have our cutoffs. I need to think more carefully about how much gain we need at what frequencies before I finalize these loop shapes, but it seems (from the DHARD pit measurement with the boost removed) that I should have enough phase to move the cutoff lower, if I don't require all that gain at low frequency.
The guardian is up to date with these changes - other than helping the still-cooling IFO with some alignment while trying to catch DRMI lock, you can just request Lownoise_ASC, and the IFO will go up to 50W and hang out. (I've commented out all of the offsets in the Adjust_offset state, but other than that, things are all as they were, modulo the changes mentioned above.)
Terra is writing up our PI experiences and mysteries from today, but I expect that the IFO will stay locked for just under 3 hours, so I'm going to put it in Observe just in case the data groups want to test pipelines. The noise is still bad though, especially the intensity noise coupling as the IFO thermalizes. To do this, I changed the nominal IMC_LOCK state to ISS_TR_CLOSED rather than the usual ISS_ON (which implies 2nd loop on), since we can't close the 2nd loop right now - we need the new hardware so that we can have both 2nd and 3rd simultaneously.
Oooh, I almost forgot. Sheila and I tried closing a dither loop around the PRM to keep the spot position on PR2 constant. This seemed like it was doing good things; however, once she fixed the 3rd loop's SR560 we decided that it would be good to get back to a campaign of pursuing low noise for a while, rather than chasing offsets around. We should consider coming back to this though when we come back to offset moving for PRC gain improvement.
After the nearly 3hr long lock at ~50W, we were paused at Locking Arms Green due to an X-arm which started low in power and slowly recovered (probably due to cooling).
Backing up: During the previous long lock at high power, Jenne noticed that ALS X was visibly misaligned in yaw (could easily see in camera).
After the lockloss and attempting to get H1 back up, we could see the ALSX was slowly struggling to recover. The theory was this is due to some sort of cooling effect after the 50W lock. Attached is a screenshot of trends for oplevs, TMS pit/yaw, ALS transmission, and the PSL power. The story here (starting from left to right) is:
(For the plots, oplev pitches are the plots on the top & yaws are on the bottom)
Commissioners (Jenne, Terra, and Sheila) worked on H1 addressing ISS 3rd loop earlier, PRM spot servo, and then with some longer high power locks toward the end.
Notes from the evening:
General Operator note regarding DRMI alignment: Due to a few DRMI locklosses, stopped at LOCK_DRMI_1F and tweaked the usual suspects: BS, PR3, PRM, IM4, & SRM (for my session, the BS & PR3 were the only optics that helped). Then locked up the DRMI with no issue.
J. Kissel, R. Abbott, D. Coyne, P. Fritschel, V. Sandberg
As a part of the engineering change request to remove/disable the problematic, entirely analog, RMS current watchdog of the QUAD PUM driver (ECR E1600270, and FRS Ticket 6100), I've looked into several of today's episodes of watchdog tripping. I show four examples:
18:15 UTC -- We ride through an acquisition event where the watchdog surpasses its threshold but does not trip.
19:20 UTC -- A lock loss that we believe is caused by the PUM watchdog tripping.
19:37 UTC -- A lock loss caused by watchdog tripping.
01:00 UTC -- A "normal" lock acquisition.
These examples plot the time series of the RMS Current Monitor EPICS channels (e.g. H1:SUS-ITMY_L2_RMSIMON_UL_MON), which serve as the trigger signal for the Flip Flop circuit in the driver. Check out pg. 3 of T1100378 for a block diagram; a more detailed copy-and-paste drawing that shows the inter-circuit connections is in D1600332.
From the PUM driver board and monitor board schematics (D070483 and D070480, respectively), with the help of the block diagram, the calibration of these RMS IMON channels is as follows:
RMSIMON [A] = RMSIMON [ct] * ADC Gain [V/ct] * RMS Input Gain [V/V] * RMS Output Gain [V/V] * Current Monitor Gain [V/V] * Output Impedance [A/V]
            = RMSIMON [ct] * 40 [V] / 2^16 [ct] * 1 / 10 [V/V] * 10 / 1 [V/V] * 3/2 [V/V] * (1 / 15) [A/V]
            = RMSIMON [ct] * 6.1035e-5 [A/ct]
As is reported in the PUM driver's design study (T0900277), user guide (T0900290), and several test reports (e.g. S1101561), the RMS watchdog trips at ~100 [mA]. Thus, in the attached examples (except for the 18:15 UTC example), the watchdog works as designed. The StripTool isn't calibrated, but after looking at the raw DTT time series and playing around with the calibration at various points: the voltage going into the Flip Flop circuits that corresponds to ~100 [mA] is ~7 [V] RMS, which is ~2000 [ct] in the RMS I MONs.
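The calibration chain above can be checked numerically with a short sketch. The gain values are copied from the expression in this entry; the variable and function names are made up for illustration.

```python
# Gains as quoted in the calibration expression above.
ADC_GAIN_V_PER_CT = 40.0 / 2**16       # [V/ct]
RMS_INPUT_GAIN = 1.0 / 10.0            # [V/V]
RMS_OUTPUT_GAIN = 10.0 / 1.0           # [V/V]
CURRENT_MONITOR_GAIN = 3.0 / 2.0       # [V/V]
OUTPUT_IMPEDANCE_A_PER_V = 1.0 / 15.0  # [A/V]

AMPS_PER_COUNT = (ADC_GAIN_V_PER_CT * RMS_INPUT_GAIN * RMS_OUTPUT_GAIN
                  * CURRENT_MONITOR_GAIN * OUTPUT_IMPEDANCE_A_PER_V)

def rmsimon_counts_to_amps(counts):
    """Convert an RMSIMON reading in counts to amps RMS."""
    return counts * AMPS_PER_COUNT

print("calibration: %.4e A/ct" % AMPS_PER_COUNT)
print("2000 ct -> %.0f mA" % (1e3 * rmsimon_counts_to_amps(2000)))
print("100 mA  -> %.0f ct" % (0.100 / AMPS_PER_COUNT))
```

With these numbers 2000 [ct] works out to roughly 120 [mA], which agrees with the quoted ~100 [mA] trip level at the level of precision the "~" implies.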
The StripTool for the 01:00 UTC lock acquisition sequence therefore shows that quiescent current levels are a few hundred counts, or a few 10s of [mA] RMS, well below the threshold. The humps and bumps you see are transitions between states as various DARM / QUAD fed ASC loops turn on. I'll also note, for the record, that Jenne installed a low pass filter (aLOG to come later) between the DTT measurements and the StripTool, so we may be OK again. Maybe the watchdog being problematic every few months is because that's about the timescale on which we usually overhaul our ASC schemes, and the trips are just indicative of transitional, mid-commissioning loop outputs. Finally, the design study (T0900277) and user guide (T0900290) mention that the original choice of a ~10 [sec] time constant at a current level of ~100 [mA] was motivated entirely by outgassing concerns of a hot OSEM. Research is on-going as to whether this makes sense. Preliminary investigations turned up an early result (T0900611) showing that the time scale is more along the lines of minutes rather than ~10 [s]. More details to come as the picture solidifies. Follow along in the FRS ticket comments!
Keita pointed out that the ISS 3rd loop 560 was overloaded, and was in low noise when it should have been in high dynamic range. I went to check it out and indeed, it saturates even without any input or with a terminator.
Fil didn't have a spare working 560, but loaned me an SR650. We are using an AC coupled low pass at 30 Hz, gain of 0 dB, positive polarity, to imitate the settings Keita used in 27895.
We don't know how long the 560 has been overloaded, but this probably is the reason why we have had so much difficulty with the CSOFT instability in the last week.
First lock attempt after Sheila's fix we were able to go to 50W without any instability. No fancy offsets or anything, so the PRC gain still dropped (to ~24), but we didn't have any trouble acquiring or holding the lock.
Experiment on CP4 today:
Started at 90% full
Filled to well over 100% at 45% open on LLCV
At 12:10pm local, CP reached 100% full (took ~3 hr 40 min)
I sat at MY for about two hours with nothing happening (at this flow rate it takes a lot longer than 35 min. to overfill till LN2 comes out)
Finally I began to ramp the LLCV 5%, and then 10%, every 20 min. or so
At 72% open the signal from the flow meter was very noisy
After a few min. at 72% the exhaust pressure started to rapidly rise so I set to PI-mode with min. allowance set to 39% (love the new code!)
The pressure was still high and unstable and because I didn't want LN2 to spew onto the flow meter I lowered the PI-code min. to 37%
The exhaust pressure has slowly ramped back down to nominal
Leaving in PI-mode (at 37% min) over the weekend and will periodically monitor pump level and pressure
There should be NO alarms for the exhaust pressure.
The pump level will alarm for a few hours until it falls below 98% full (it is well over 100% right now).
It appears that something goes wrong when zooming in that creates random lines across the plots.
Pyplot does some strange things with the data when you zoom in, sometimes. Maybe this is a result of data gaps being handled poorly by pyplot? I've been able to get these artifacts to go away by resetting the plots and zooming in slightly less, but Patrick and I weren't able to get these particular ones to clean up. I'll see if I can make this a little nicer on Monday.
It looks like using NaN to fill in the gaps in the data was not the right thing to do. Filling with POF_INF seems to eliminate the glitching in the plots. I've also set some hard coded Y-axis limits on the pit and yaw plots and scaled the sum plots to the max in each dataset, so the plots should start out closer to a finished product.
I've also added the HAM2 oplev, which doesn't have any DQ channels, so I've used the OUT16 channel. Shouldn't matter much, since the plots are of m-trends.
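For reference, the gap filling discussed above can be sketched as follows. This is not the actual plotting script: the function name, the 60 s step, and the sample values are assumptions, and `math.inf` is used only as a stand-in since the fill value is left as a parameter so that NaN or an infinity can both be tried.

```python
import math

def fill_gaps(times, values, step, fill_value):
    """Insert one (time, fill_value) sample after every gap longer than 1.5 * step."""
    out_t, out_v = [], []
    for i, (t, v) in enumerate(zip(times, values)):
        if i > 0 and t - times[i - 1] > 1.5 * step:
            out_t.append(times[i - 1] + step)  # marker placed one step into the gap
            out_v.append(fill_value)
        out_t.append(t)
        out_v.append(v)
    return out_t, out_v

# Hypothetical minute-trend with samples missing between t=120 s and t=300 s.
times = [0, 60, 120, 300, 360]
values = [1.0, 1.1, 1.2, 0.9, 1.0]

t_filled, v_filled = fill_gaps(times, values, 60, math.inf)
print(t_filled)
print(v_filled)
```

Keeping the fill value as an argument makes it easy to compare how the plotting library treats each choice when zooming.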
[Sheila, Jenne]
This is starting to feel a bit like we're using popsicle sticks and duct tape to make a splint and hold things together, but we're still having trouble with the vertex ASC at high powers, so we have tried implementing a dither/ezcaservo combo loop for MICH pitch. As of now, Sheila has a script that will turn on a dither line for BS pitch and set up the demodulators in the ADS. The script then sets up a cdsutils ezca servo to move the offset of ASC-MICH_P to zero the output of the demodulator.
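The idea of servoing an offset until a demodulated signal reads zero can be sketched with a toy model. This does not use cdsutils or ezca and is not Sheila's script: the linear "demodulator" response, its slope and optimum, and the servo gain are all made-up values chosen only to show the integrating behavior.

```python
def run_integrating_servo(read_error, offset, gain, n_steps):
    """Repeatedly step `offset` against the measured error; return the offset history."""
    history = [offset]
    for _ in range(n_steps):
        offset -= gain * read_error(offset)
        history.append(offset)
    return history

# Toy plant: the demodulated dither signal is assumed linear in the offset
# and crosses zero at some unknown optimum.
OPTIMUM = 3.0  # hypothetical offset at which the demodulator output is zero
SLOPE = 0.5    # hypothetical demodulator response per unit offset

def toy_demod_output(offset):
    return SLOPE * (offset - OPTIMUM)

history = run_integrating_servo(toy_demod_output, offset=0.0, gain=0.4, n_steps=50)
print("final offset: %.4f" % history[-1])
print("final demod output: %.2e" % toy_demod_output(history[-1]))
```

In this toy the remaining error shrinks by a fixed fraction each step, so starting the offset closer to the expected value shortens the time needed to converge. That is the same motivation as starting the pit offset from a previously found value rather than from zero.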
Earlier, we tried just a regular dither servo, moving the BS to minimize the demodulated version of AS90, but that wasn't working. Note that earlier today we removed the AS90 element from the SRM dither loop, so the SRM dither now only looks at POP90.
Since adding MICH offsets worked okay over the weekend to fix the sideband buildup after the POP offsets are adjusted, we thought we'd give the demodulator-adjusted offsets a try. This is well after fixing the TCS situation described in alog 29237.
On at least one occasion, this new system didn't keep up with the IFO heating when starting the offset from zero, so now it starts the pit offset from the value that Sheila et al. found the other day.
We've tried it now with both BS pitch and yaw under this new additive-offset-like configuration, but the sideband buildups still seem to be decaying. Perhaps SRM dither should go back to the AS/POP ratio, and BS should be servoed to maximize POP18?
Side note: We have got to find time to re-look at the OMC ASC. Engaging the dither loops once again rails the OMC suspension. This was happening last summer, and then the problem somewhat mysteriously went away, so I don't have any magical solution right now. But this certainly can't be good for our noise.
Reminder of Dan's parallel-universe HAM6 dc centering scheme: no OMC sus actuation needed, but requires giving up centering on one of the two AS WFS.
In case the DC range of the OMC SUS is the issue:
The WFS spot centering pushes OM1 and OM2, and causes OM3 and the OMC SUS to struggle to align the beam to the OMC.
To avoid this, align the OMC with OM1 and OM2, then use picomotors to align WFS spots.
This offloading is not trivial to do when the WFS spot servo and the OMC ASC are running.
So, a single bounce beam will help with doing this offloading.