J. Kissel

Albert asked me to describe the recovery from the Jul 5-6th 2017 Montana EQ for his upcoming LVC meeting LIGO status talk, so in doing so, to make sure I reported correct and citable information, I created a detailed timeline of the recovery. I post it here for the record, and in case it might jog some folks' memory in figuring out a smoking gun for the new noise source.

%%%%%%%%%%%
Exec Summary:
We were able to return to nominal low noise within a day. We had problems with higher-than-1.0 kHz harmonics (mostly 3rd harmonics at 1.5 kHz) rung up for only a few days. We had problems with the automated damping of the 0.5 kHz and 1.0 kHz modes because of coupling changes, more EQs, and HEPI pump failures, but only for about 8 days after the EQ. However, because of a bunch of other problems (cancelled shifts from defeated employees, regular maintenance days, 2 weekends with only the operator on site, and broken scripts coupled with operator confusion), it wasn't until Jul 19th 2017 (local), the Wednesday after maintenance (only 14 days after the Jul 5-6th 2017 EQ), that we recovered the new-normal sensitivity of ~55-60 Mpc. I back up this statement with the detailed timeline below, with aLOG citations, for your perusal.

%%%%%%%%%%%
Detailed Timeline:
- Jul 6th 2017 6:31 UTC (last half hour of the Eve shift, Wednesday local time): EQ hits.
- The Owl shift and Day shift Thursday morning (Jul 6th 2017) are spent recovering the SEI and SUS systems and restoring alignment. LHO aLOG 37347
- We were able to recover ALS DIFF by mid-day Thursday, after hand-damping some bounce and roll modes. LHO aLOG 37357
- We were up to DC readout by that evening (with no stages of DCPD whitening), but then one of the end station HEPI pump stations tripped. LHO aLOG 37359
- Jul 7th 2017: Once recovered, I worked on actively damping, for the first time, the 1.5 kHz 3rd harmonics that had rung up during the EQ. LHO aLOG 37361  These were the only new modes that were particularly problematic, and it was only a few of them, because their mode separation was in the ~mHz range and their coupling was weak.
- Then the weekend hit, and operators were still relatively inexperienced in tuning violin mode damping. While they were mostly successful by skipping the automated damping and slowly inching up the gain on the active damping, modes would still occasionally ring up, as they do after any normal lock loss, but when they tried to adjust the settings things got more rung up; e.g. LHO aLOG 37394
- Jul 9 2017 at ~2 UTC, on Travis' Eve shift, something went belly up as the range slowly decayed and the lock broke. LHO aLOG 37399  Since it was a "slow" lock loss, we were likely sending junk to all of the test masses for an extended period of time. That, coupled with another EQ (LHO aLOG 37403) and high winds (2017-07-10 Wind Summary Page), reverted all the hard work on violin modes, including some of the new 1.5 kHz modes. Upon recovery all modes had rung back up, and modes began to become unresponsive to normal settings. LHO aLOG 37402
- Monday morning / Day shift (Jul 10 2017), we picked up focused efforts on the new modes and made sure no other modes went bad, LHO aLOG 37412 and LHO aLOG 37426. And by that evening (~8:00 pm local, when the A team left), we had what we now know to be the new normal (LHO aLOG 37433), and we thought violin modes were under control.
- Then that night (Jul 10-11 2017) Patrick got his first exposure to what we only found out later: the coupling of some of the modes had changed (because of the Jul 6th EQ? because of the Jul 11th slow lock loss? Dunno.), and those modes would now run away with time. LHO aLOG 37438  Not yet understanding that the mode coupling had changed, and coupled with more EQs and PSL trips due to continuing problems with flow sensors overnight, we were dead, not debugging, most of Monday (Jul 10-11 UTC), and then we went into normal maintenance on Tuesday morning.
- Recovery after maintenance (Jul 11 2017) went OK, but because the fundamentals were rung up so high from the night before, we'd lose lock earlier in the lock acquisition sequence than normal. Confused and bewildered, we cancelled 12 hours' worth of shifts; LHO aLOG 37467 and LHO aLOG 37468. We found out later that it was just another PD saturating from the now re-rung-up violins, LHO aLOG 37500. At this point I'll emphasize: we were no longer having trouble with high-order violin modes. It was that the "normal" damping of the fundamental (~500 Hz) and 1st harmonic (1 kHz) modes wasn't working. Further, we were able to regularly get to reasonable sensitivity; we just didn't hit the science mode button because we were still actively playing with violin mode damping filters to get them to work. This was Wednesday Jul 12, so only 7 days after the EQ.
- Another HEPI pump trip at EX while the IFO was down likely rang the normal 0.5 and 1.0 kHz modes back up. LHO aLOG 37475
- Jul 12 2017, Wednesday: we began to realize the normal automated damping wasn't working, LHO aLOG 37484, but it was unclear which modes were going bad. Over the next few shifts, we slowly and systematically checked the phase and gain of every loop, and updated the LHO Violin Mode Table and Guardian accordingly.
- The Beam Splitter CPS started glitching, for no known reason, LHO aLOG 37499, which also decreased productivity.
- By Jul 13 2017, Thursday, 8 days after the EQ, we had re-configured all of the automated damping and quantified how many of the modes now had changed coupling, LHO aLOG 37504, and violin modes were no longer a problem.

The remainder of the time before we regularly went back to observing was spent exploring why the sensitivity was much worse, and/or dealing with other unrelated problems that resulted in more defeated shift cancelling; none of it was related to violin modes:
- Spot position moves (LHO aLOG 37506 and LHO aLOG 37536)
- Fast shutter problems (LHO aLOG 37553)
- More PSL trips (LHO aLOG 37560)
- A broken script and a mis-calculated SDF accept left us on only one DCPD for a few days (LHO aLOG 37585)
- etc.

And our sensitivity has not changed since then. So I would say that, aside from this new mystery noise that we still don't understand, we were fully recovered by Wednesday Jul 19th 2017, 14 days after the EQ.
TITLE: 08/17 Owl Shift: 07:00-15:00 UTC (00:00-08:00 PST), all times posted in UTC
STATE of H1: Observing at 53Mpc
INCOMING OPERATOR: Cheryl
SHIFT SUMMARY:
H1 running smoothly with a lock of about 40hrs.
LOG:
H1 locked for ~36hrs. Winds died down 2hrs ago.
L1 dropped out of lock 30min ago.
V1 no longer appears on our SenseMon/Range DMT plot. On GWIstat their status is "Info too old". On TeamSpeak, I saw that they got bumped off the LIGO-VIRGO Control Rooms channel (earlier they had entered the chat saying they could not access GraceDB).
OK, it looks like Virgo's GraceDB issue was from 2 days ago (the TeamSpeak chat window lists a time but no date, so I was curious). Anyway, I chatted with Virgo when they returned to the Control Rooms channel and they cleared this up.
TITLE: 08/17 Owl Shift: 07:00-15:00 UTC (00:00-08:00 PST), all times posted in UTC
STATE of H1: Observing at 52Mpc
OUTGOING OPERATOR: Jeff
CURRENT ENVIRONMENT:
Wind: 18mph Gusts, 13mph 5min avg
Primary useism: 0.07 μm/s
Secondary useism: 0.11 μm/s
Windy walking in tonight, but might be calming down.
QUICK SUMMARY:
The Lock Clock must have been restarted, but looking at Verbal's records we've been locked since the beginning of the EVE shift on Tues (so we've been locked roughly 32.5 hrs, with some commissioning in there today).
Jeff informed me that there is an nds issue which will prevent various scripts/tools from working on various Control Room work stations---this work station (zotws3) is OK, so I will leave it logged in as Ops.
Seismon is currently dead (Jeff made a subentry to Jim's entry about this).
Current Triple Coincidence started about 50min ago with V1 relocking after a recent lockloss.
Locked and Triple-C Observing for first half of the shift. Wind is up a bit, bringing up the X & Y Axis primary microseism as well. No other issues or problems to report.
I have started making some ESD actuation measurements while other measurements are ongoing. The idea is to measure 4 parameters in Eq. 2 of G1600699 by putting an excitation onto the bias electrode with and without a bias, and onto the signal electrodes with and without a DC offset sent to the signal electrodes.
Jeff K kindly let me inject some lines during his calibration measurements today, and here are the results I get. I might have gotten some signs wrong; that needs to be double-checked. These measurements could be scripted, and should not take more than 15 minutes per optic.
Parameter                                 | Value
alpha                                     | -2.7e-10 N/V^2
beta                                      | -3.3e-9 N/V
beta2                                     | -7e-9 N/V
gamma                                     | 8.45e-11 N/V^2
Veff = (beta - beta2)/(2*(alpha - gamma)) | -5.2 V
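The quoted Veff can be reproduced directly from the four fitted parameters above; here is a minimal sketch of that arithmetic (the physical interpretation of each parameter is left to Eq. 2 of G1600699, which is not restated here):

```python
# Minimal sketch: reproduce the quoted effective bias voltage from the four
# measured parameters in the table above. The relation
#     Veff = (beta - beta2) / (2 * (alpha - gamma))
# is the one stated in the table; the numbers are the measured values.

alpha = -2.7e-10   # N/V^2
beta  = -3.3e-9    # N/V
beta2 = -7.0e-9    # N/V
gamma = 8.45e-11   # N/V^2

v_eff = (beta - beta2) / (2 * (alpha - gamma))
print(f"Veff = {v_eff:.2f} V")   # expect ~ -5.2 V
```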
Due to issues with running the report script locally (being worked on), Sudarshan will post the results of this past Tuesday's End Y Pcal calibration. Attached are pics of the logsheet so he has the pertinent information.
In addition to the standard end station calibration measurements, I also removed the alignment irises that we installed the last time we fixed a clipping issue, and tweaked the steering mirrors in the receiver module to recenter the beams into the RX integrating sphere, as described in WP 7111.
TravisS, SudarshanK
Below is the link to the calibration trend document that includes the measurement done on Tuesday.
https://dcc.ligo.org/DocDB/0118/T1500131/016/D20170815_LHOY_PD_trend.pdf
After Travis relieved the clipping on the receiver side by centering the beam on the optics and the integrating sphere, the TxPD and RxPD calibrations seem to have moved back to their non-clipping values. This also suggests that all the clipping we had seen during the last few months was happening outside the vacuum, in the receiver-side module. Thanks, Travis.
Date       | TxPD (N/V)  | RxPD (N/V) | Remarks
2017/05/24 | 1.5193e-09  | 1.0500e-09 | Before clipping
2017/08/08 | 1.5202e-09* | 1.1111e-09 | During Rx clipping
2017/08/15 | 1.5202e-09  | 1.0502e-09 | After clipping is relieved
* This TxPD calibration was estimated by using the optical efficiency number from the 2017/05/24 measurement because of the RxPD clipping. Details in LHO aLOG 38090.
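For a quick read of the table, here is a small sketch that computes how far the RxPD calibration moved during the Rx clipping and how closely it returned to the pre-clipping value after the fix, using only the numbers quoted above:

```python
# Quick arithmetic on the Pcal table above: how far the RxPD calibration moved
# during the Rx clipping, and how closely it returned to the pre-clipping value
# after the beam was recentered. Uses only the values quoted in the table.

rxpd = {
    "2017/05/24 (before clipping)":   1.0500e-09,  # N/V
    "2017/08/08 (during clipping)":   1.1111e-09,  # N/V
    "2017/08/15 (clipping relieved)": 1.0502e-09,  # N/V
}

reference = rxpd["2017/05/24 (before clipping)"]
for label, value in rxpd.items():
    shift_pct = 100.0 * (value - reference) / reference
    print(f"{label}: {value:.4e} N/V ({shift_pct:+.2f}% vs. before clipping)")

# During clipping the RxPD calibration was off by ~+5.8%;
# after the fix it agrees with the original to ~0.02%.
```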
TITLE: 08/16 Day Shift: 15:00-23:00 UTC (08:00-16:00 PST), all times posted in UTC
STATE of H1: Observing at 51Mpc
INCOMING OPERATOR: JeffB
SHIFT SUMMARY: Quiet shift, with a few hours of calibration measurements in the middle. JeffK was operator for the first hour, then I took over for the rest of the shift.
LOG:
(JeffK) Aug 16 2017 15:15:44 UTC New remote user, ROOT
(JeffK) Aug 16 2017 15:18:13 UTC New remote user, ROOT
(JeffK) Aug 16 2017 15:51:12 UTC GRB Alert
Aug 16 2017 16:00:00 UTC Take over from JeffK
Aug 16 2017 16:10:06 UTC Parking lot crew through gate
Aug 16 2017 17:12:53 UTC Karen opening receiving roll-up door to remove cardboard
Aug 16 2017 17:31:09 UTC Out of Observe for JeffK calibration measurements
Aug 16 2017 18:12:26 UTC JohnW working on chiller fault at EndY
Aug 16 2017 19:09:33 UTC Bubba and JohnW to EndY chiller yard
Aug 16 2017 19:28:50 UTC Bubba and JohnW to EndY air handler mechanical room
Aug 16 2017 19:46:25 UTC Bubba and JohnW leaving EndY
Aug 16 2017 20:20:04 UTC TJ to optics lab to check on parts
Aug 16 2017 20:45:22 UTC TJ out of optics lab
Aug 16 2017 21:00:45 UTC JohnW to MidY
Aug 16 2017 21:11:28 UTC Measurements finished
Aug 16 2017 21:11:59 UTC Back to Observing
Aug 16 2017 21:19:52 UTC JohnW back
Aug 16 2017 22:50:08 UTC Hand over early to JeffB
J. Kissel

Managed to get all of the other QUADs' worth of UIM characterization -- to confirm whether what's seen on ETMY (see LHO aLOG 31603) is normal or abnormal. Will process and post detailed results in due time. Preliminary results indicate that all QUADs show the 167 Hz feature, but only some QUADs show the ~110 Hz feature or the 300 Hz forest. Interesting! We'll see how the UIM blade dampers impact this stuff.

Data lives here:
2017-08-16_H1SUSETMX_L1_iEXC2DARM_HFDynamicsTest_100-250Hz.xml
2017-08-16_H1SUSETMX_L1_iEXC2DARM_HFDynamicsTest_250-350Hz.xml
2017-08-16_H1SUSETMX_L1_iEXC2DARM_HFDynamicsTest_300-500Hz.xml
2017-08-16_H1SUSETMX_L1_iEXC2DARM_HFDynamicsTest_90-400Hz_SweptSine.xml
2017-08-16_H1SUSETMX_L1_PCAL2DARM_HFDynamicsTest_100-250Hz.xml
2017-08-16_H1SUSETMX_L1_PCAL2DARM_HFDynamicsTest_250-350Hz.xml
2017-08-16_H1SUSETMX_L1_PCAL2DARM_HFDynamicsTest_300-500Hz.xml
2017-08-16_H1SUSETMX_L1_PCAL2DARM_HFDynamicsTest_90-400Hz_SweptSine.xml
2017-08-16_H1SUSITMX_L1_iEXC2DARM_HFDynamicsTest_100-250Hz.xml
2017-08-16_H1SUSITMX_L1_iEXC2DARM_HFDynamicsTest_250-350Hz.xml
2017-08-16_H1SUSITMX_L1_iEXC2DARM_HFDynamicsTest_300-500Hz.xml
2017-08-16_H1SUSITMX_L1_iEXC2DARM_HFDynamicsTest_90-400Hz_SweptSine.xml
2017-08-16_H1SUSITMX_L1_PCAL2DARM_HFDynamicsTest_100-250Hz.xml
2017-08-16_H1SUSITMX_L1_PCAL2DARM_HFDynamicsTest_250-350Hz.xml
2017-08-16_H1SUSITMX_L1_PCAL2DARM_HFDynamicsTest_300-500Hz.xml
2017-08-16_H1SUSITMX_L1_PCAL2DARM_HFDynamicsTest_90-400Hz_SweptSine.xml
2017-08-16_H1SUSITMY_L1_iEXC2DARM_HFDynamicsTest_100-250Hz.xml
2017-08-16_H1SUSITMY_L1_iEXC2DARM_HFDynamicsTest_250-350Hz.xml
2017-08-16_H1SUSITMY_L1_iEXC2DARM_HFDynamicsTest_300-500Hz.xml
2017-08-16_H1SUSITMY_L1_iEXC2DARM_HFDynamicsTest_90-400Hz_SweptSine.xml
2017-08-16_H1SUSITMY_L1_PCAL2DARM_HFDynamicsTest_100-250Hz.xml
2017-08-16_H1SUSITMY_L1_PCAL2DARM_HFDynamicsTest_250-350Hz.xml
2017-08-16_H1SUSITMY_L1_PCAL2DARM_HFDynamicsTest_300-500Hz.xml
2017-08-16_H1SUSITMY_L1_PCAL2DARM_HFDynamicsTest_90-400Hz_SweptSine.xml
J. Kissel

Was able to grab an entire suite of calibration measurements this afternoon. Data files listed below; will process in due time. Preliminary results are as expected -- attached is a screenshot of the "perfect" reference calibration against the current CAL-DELTAL_EXTERNAL, which we know is *not* corrected for time-dependent correction factors -- namely kappa_TST, which is up at ~10% at the moment, likely causing the shown discrepancy.

Actuation: /ligo/svncommon/CalSVN/aligocalibration/trunk/Runs/O2/H1/Measurements/FullIFOActuatorTFs/2017-08-16
2017-08-16_H1SUSETMY_L1_iEXC2DARM_25min.xml
2017-08-16_H1SUSETMY_L1_PCAL2DARM_8min.xml
2017-08-16_H1SUSETMY_L2_iEXC2DARM_17min.xml
2017-08-16_H1SUSETMY_L2_PCAL2DARM_8min.xml
2017-08-16_H1SUSETMY_L3_iEXC2DARM_8min.xml
2017-08-16_H1SUSETMY_L3_PCAL2DARM_8min.xml

Sensing: /ligo/svncommon/CalSVN/aligocalibration/trunk/Runs/O2/H1/Measurements/SensingFunctionTFs
2017-08-16_H1DARM_OLGTF_4to1200Hz_25min.xml
2017-08-16_H1_OMCDCPDSUM_to_DARMIN1.xml
2017-08-16_H1_PCAL2DARMTF_4to1200Hz_8min.xml
2017-08-16_H1_PCAL2DARMTF_BB_5to1000Hz_0p25BW_250avgs_5min.xml
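To illustrate why an uncorrected kappa_TST of ~1.1 would produce a discrepancy of roughly this size, here is a toy sketch of the actuation scaling; it is not the production calibration pipeline, the stage-by-stage transfer-function magnitudes are invented placeholders, and only the structure (kappa_TST scaling the test-mass stage of the actuation) is taken from the standard DARM model:

```python
# Toy illustration (NOT the production calibration pipeline) of why an
# uncorrected kappa_TST shows up as a discrepancy in CAL-DELTAL_EXTERNAL.
# In the standard DARM model the total actuation is a sum over stages, and the
# time-dependent factor kappa_TST scales only the test-mass (L3/ESD) stage:
#     A(f) = kappa_TST * A_TST(f) + A_PUM(f) + A_UIM(f)
# All transfer-function magnitudes below are made-up placeholders, just to show
# that a ~10% kappa_TST change maps to ~10% of the actuation wherever A_TST dominates.
import numpy as np

freq = np.array([2.0, 10.0, 50.0, 200.0])   # Hz, sample frequencies
A_tst = 5.0e-13 / freq**2                   # placeholder TST (ESD) actuation [m/ct]
A_pum_uim = 2.0e-12 / freq**4               # placeholder PUM+UIM actuation [m/ct]

def total_actuation(kappa_tst=1.0):
    """Total actuation strength with a scalable TST gain."""
    return kappa_tst * A_tst + A_pum_uim

# Fractional change of the actuation if kappa_TST is really 1.10 but the
# front-end reconstruction assumes 1.00:
ratio = total_actuation(1.10) / total_actuation(1.00)
for f, r in zip(freq, ratio):
    print(f"{f:6.1f} Hz : actuation off by {100 * (r - 1):.1f}%")
```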
CHWP1 tripped off this morning for some reason so I have started CHWP2. This has allowed the second chiller to start. Chilled water temps rose to 52F and are now back to 37F.
There may be a small blip in the VEA space temperature, although it appears it may only move by 1 or 2 tenths of a degree F.
Over the weekend, the seismon code died. I don't know why it died, but it looks like I may have restarted some wrong versions of the epics code. Corey had found it dead again after my last shift, probably a result of the epics code crashing in a way that's not totally obvious on the medm display. I restarted the right versions of the code this morning, but it's hard to tell if it's actually running, so it needs monitoring. If the code is found not running again, I'd like to know, so I can try to diagnose it. Some symptoms are: the gps clock on the medm is dead (this will turn on a red light on my new medm & there is a DIAG_MAIN test), or it's not catching new earthquakes. Also, the new version of the code seems to have more or less killed the old display, so there is no need to report on the old, single-event seismon not updating.
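For reference, the kind of staleness check described above (the DIAG_MAIN test / dead gps clock symptom) could look roughly like the sketch below; the heartbeat channel name is a placeholder, not necessarily the channel the real test uses:

```python
# Minimal sketch of a seismon staleness check: watch a heartbeat channel and
# flag seismon as dead if the value stops advancing. The channel name below is
# a hypothetical placeholder, not necessarily the real H1 seismon EPICS channel.
import time
from epics import caget  # pyepics

HEARTBEAT_CHANNEL = "H1:SEI-SEISMON_GPS"   # hypothetical heartbeat channel
STALE_AFTER_S = 120                        # no change for this long counts as "dead"

def seismon_alive(poll_s=10):
    """Return True if the heartbeat channel changes within STALE_AFTER_S seconds."""
    last_value = caget(HEARTBEAT_CHANNEL)
    waited = 0
    while waited < STALE_AFTER_S:
        time.sleep(poll_s)
        waited += poll_s
        if caget(HEARTBEAT_CHANNEL) != last_value:
            return True
    return False

if __name__ == "__main__":
    print("seismon updating" if seismon_alive() else "seismon looks dead - needs attention")
```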
Seismon looks dead again. The Dead box on the wall display is red. DIAG_MAIN is showing the "Seismon system not updating" message. Seismon-5 has not updated since 15:53:27 UTC, and Terramon shows several EQs arriving after 15:53.
J. Kissel, J. Warner, P. Thomas (R. McCarthy and D. Barker remotely)

I happened to be on site giving this Saturday's tour, for better or worse, when Jim and Patrick informed me that End X had some sort of electronics failure that took down everything (see the first entry here: LHO aLOG 382517). After a debrief from both of them while driving to the end station, we knew to start our investigation at the SUS rack power supplies: the symptom, they informed me (having already gone to the end station), was that the coil drivers and AA/AI chassis in the SUS rack showed only one power leg ON via the LEDs on the back. We don't know why this happened, and have not investigated. As a consequence, however, this explains why the OSEM sensor report to the independent software watchdog went dead, which tripped that watchdog and killed the seismic system via its watchdogs.

Upon arrival, we went immediately to the power supplies, and indeed found the left leg of the 18V power supply with its ON/OFF rocker switch in the OFF position. We turned it back ON, it came to life (see "after" picture IMG_2234.jpg), and this restored the SUS electronics and its OSEMs (see "after" picture IMG_2235.jpg).

However, unfortunately, we also noticed that the EX:ISC 12V power supply LED lights were flickering (see video on YouTube, because .m4v format is not accepted by the aLOG). Not knowing that this was normal (we found out later once we got a hold of Richard), we also power cycled both legs of those ISC 12V power supplies hoping to clear the blinking. Sadly, (a) this did not clear the blinking; (b) it killed the power to the neighboring IRIGB timing system, which provides timing to (and thus killed) the end station ISC I/O chassis, which in turn killed the entire end station Dolphin network; and (c) (we found out later) it killed the power supply to the relay that feeds power to the ESD HV. Namely, once we got up to looking at the ESD HV power supplies (we worked our way up the rack from the bottom), we found that with their rocker switches ON, their displays were dark, and they were unresponsive to flipping the ON/OFF rocker switches (as pictured in IMG_2233.jpg).

In order to fix (b), we got a hold of Dave, who is having to restart all front-ends and I/O chassis remotely. He'll likely aLOG later. In order to fix (c), we followed the power cables from the HV power supplies up to the top of the power supply rack, where we found what we now know is a relay, in the state pictured in IMG_2236.jpg and IMG_2237.jpg. Eventually, we found the big red button on the "front" of the chassis (see IMG_2238.jpg), and like all good red buttons, it drove us to push it (cautiously: we had left the ESD power supplies in the OFF position before doing so, however). We heard a few clicks, and all the green lights lit up. Once that was happy, the ESD HV power supplies came back to life with a simple flick of the power switch.

As of now, Jim is re-isolating tables and resuming recovery with a fully functional system. "Normal" recovery from here, but we should be cautious of violin modes. Again, we have not investigated why the power supply tripped off in the first place; we'll leave that for the work week, or for those more curious than us.
Corresponds to (now closed) FRS Ticket 8762.
TCSX is back - H1 in Observe - details to follow
Initial exit of H1 from Observe:
H1 recovery / return to Observe: