Distribution of hours at which Scratchy glitches occurred according to the ML output from GravitySpy. In addition, a histogram of the amount of O1 time spent in analysis-ready mode is provided. I have uploaded omega scans and FFT spectrograms of what the Scratchy glitch looked like in O1.
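For reference, a minimal sketch of how such an hour-of-day histogram could be regenerated from a GravitySpy classification export. The file name scratchy_O1.csv and its columns (GPStime, ml_label) are hypothetical placeholders, not the actual GravitySpy output format.

# Hypothetical sketch: histogram of Scratchy glitch times by hour of day.
# Assumes a CSV export with columns "GPStime" and "ml_label" (placeholders).
import pandas as pd
import matplotlib.pyplot as plt
from astropy.time import Time

df = pd.read_csv("scratchy_O1.csv")
scratchy = df[df["ml_label"] == "Scratchy"]

# Convert GPS times to UTC datetimes and extract the hour of day.
hours = [Time(t, format="gps").to_datetime().hour for t in scratchy["GPStime"]]

plt.hist(hours, bins=range(25))
plt.xlabel("Hour of day (UTC)")
plt.ylabel("Number of Scratchy glitches")
plt.savefig("scratchy_hours.png")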
For those of us who haven't been on DetChar calls and so haven't heard this latest DetChar nickname... "Scratchy glitches"?
Hi Jeff,
Scotty's comment above refers to Andy's comment to the range drop alog 30797 (see attachment here and compare to Andy's spectrogram). We're trying to help figure out its cause. It's a good lead that they seem to be related to RM1 and RM2 motion.
"Scratchy" is the name used in GravitySpy for these glitches. They are called that because they sound like scratches in audio https://wiki.ligo.org/DetChar/InstrumentSounds . In FFT they look like mountains, or if you look closer, like series of wavy lines. They were one of the most numerous types of H1 glitches in O1. In DetChar we also once called them "Blue mountains." Confusing, I know. But there is a DCC entry disambiguating (in this case equating) scratchy and blue mountain https://dcc.ligo.org/LIGO-G1601301 and a further entry listing all of the major glitch types https://dcc.ligo.org/G1500642 and the notes on the GravitySpy page.
The results of cdsutils.avg() in guardian sometimes give us very weird values.
We use this function to measure the offset value of the trans QPDs in Prep_TR_CARM. At one point, the result of the average gave the same (wrong) value for both the X and Y QPDs, to within 9 decimal places (right side of screenshot, about halfway down). Obviously this isn't right, but the fact that the values are identical will hopefully help track down what happened.
The next lock, it correctly got a value for the TransX (left side of screenshot, about halfway down), but didn't write a value for the TransY QPD, which indicates that it was trying to write the exact same value that was already there (epics writes aren't logged if they don't change the value).
So, why did 3 different cdsutils averages all return a value of 751.242126465?
This isn't the first time that this has happened. Stefan recalls at least one time from over the weekend, and I know Cheryl and I found this sometime last week.
This is definitely a very strange behavior. I have no idea why that would happen.
As with most things guardian, it's good to try to get independent verification of the effect. If you make the same cdsutils avg calls from the command line do you get similarly strange results? Could the NDS server be getting into a weird state?
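Following that suggestion, a quick independent check could be run from a Python shell outside guardian. This is a sketch only: it assumes the usual cdsutils.avg(duration, channels) interface (a negative duration averages over the most recent seconds), and the channel names are illustrative; the exact readback channels averaged by the state may differ.

# Sketch of an independent check of the averages outside guardian.
import cdsutils

# Average the last 10 seconds of each trans QPD sum and compare against
# the values guardian wrote to the corresponding *_OFFSET channels.
for chan in ["H1:LSC-TR_X_QPD_B_SUM", "H1:LSC-TR_Y_QPD_B_SUM"]:
    val = cdsutils.avg(-10, chan)
    print(chan, val)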
On the one hand, it works just fine right now in a guardian shell. On the other hand, it also worked fine for the latest acquisition. So, no conclusion at this time.
This happened again, but this time the numbers were not identical. I have added a check to the Prep_TR_CARM state: if the absolute value of an offset is larger than 5 (normally they're around 0.2 and 0.3, and the bad values have all been above several hundred), notify and don't move on.
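A minimal sketch of what such a sanity check could look like in a guardian state, assuming guardian's notify() USERMSG helper, the injected ezca object, and the cdsutils.avg() call already in use. The channel names, threshold handling, and sign convention are illustrative, not a copy of the real Prep_TR_CARM code.

# Illustrative guardian-style check only, not the actual Prep_TR_CARM code.
import cdsutils
from guardian import GuardState

OFFSET_LIMIT = 5.0  # normal values are ~0.2-0.3; the bad ones have been several hundred

class PREP_TR_CARM(GuardState):
    def main(self):
        self.bad_offset = False
        for arm in ["X", "Y"]:
            # Placeholder channel name; the real state averages the trans QPD sums.
            avg = cdsutils.avg(-5, "H1:LSC-TR_%s_QPD_B_SUM" % arm)
            if abs(avg) > OFFSET_LIMIT:
                self.bad_offset = True
            else:
                ezca["LSC-TR_%s_QPD_B_SUM_OFFSET" % arm] = -avg

    def run(self):
        if self.bad_offset:
            notify("Check Trans QPD offsets!")
            return False  # hold the state; don't move on with a bad offset
        return True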
Operators: If you see the notification Check Trans QPD offsets!
then look at H1:LSC-TR_X_QPD_B_SUM_OFFSET and H1:LSC-TR_Y_QPD_B_SUM_OFFSET. If you do an ezca read on that number and it's giant, you can "cheat" and try +0.3 for X, and +0.2 for Y, then go back to trying to find IR.
This happened again to Jim and Cheryl today and caused multiple locklosses.
I've commented out the averaging of the offsets in the guardian.
We used to not do this averaging, and just rely on the dark offsets not changing. Maybe we could go back to that.
For operators, until this is fixed you might need to set these by hand:
If you are having trouble with FIND IR, this is something to check. From the LSC overview screen, click on the yellow TRX_A_LF TRY_A_LF button toward the middle of the left part of the screen. Then click on the R Input button circled in the attachment, and from there check that both the X and Y arm QPD SUMs have reasonable offsets. (If there is no IR in the arms, the offset should be about -1*INMON.)
Opened as high priority fault in FRS:
I drove a 222.3 Hz line in the 9 MHz RFAM stabilization error point (giving 3.4×10⁻⁴ RAN RMS) and then watched the resulting lines in REFL LF and ASC NSUM as we powered up from 2 W to 25 W. [Note that the DC readback of the RFAM servo really does give us a RAN, not a RIN. This can be verified by noting that the DC value changes by a factor of 2 when the RF power is reduced by 6 dB.]
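A quick numerical check of the bracketed note, assuming the DC readback scales with the RF field amplitude (RAN) rather than the RF power (RIN):

# Reducing the RF power by 6 dB:
power_ratio = 10 ** (-6.0 / 10.0)       # ~0.25: power drops by a factor of ~4
amplitude_ratio = power_ratio ** 0.5    # ~0.50: amplitude drops by a factor of ~2
print(power_ratio, amplitude_ratio)
# The DC readback was observed to change by a factor of 2, i.e. it follows the
# amplitude, so the servo readback is a relative amplitude noise (RAN), not a RIN.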
At 2 W, we have 0.013 W of 9 MHz power reflected from the PRM and 0.0007 W of 9 MHz power coming out of the AS port.
At 25 W, we have 0.11 W of 9 MHz power reflected from the PRM and 0.034 W of 9 MHz power coming out of the AS port.
The lock stretch happens around 2016-10-25 00:21:00 Z if anyone wants to look at it.
The values for the reflected PRM power still seem to imply that the 9 MHz sideband either is not strongly overcoupled to the PRC, or the modulation depth is smaller than the old PSL OSA measurement (0.22 rad). For 0.22 rad of modulation depth and strong overcoupling, we'd expect something like 0.045 W reflected off the PRM at 2 W of input power. Also, the amount of 9 MHz leaking out the AS port evidently does not scale linearly with the input power.
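A short check of the quoted expectation, assuming the standard phase-modulation result that both first-order sidebands carry a fraction 2*J1(Γ)² of the input power, and that for strong overcoupling essentially all of the 9 MHz power is reflected off the PRM:

from scipy.special import jv  # Bessel function of the first kind

gamma = 0.22   # modulation depth from the old PSL OSA measurement (rad)
P_in = 2.0     # input power (W)

# Power in both first-order 9 MHz sidebands for phase modulation depth gamma.
P_sb = 2 * jv(1, gamma) ** 2 * P_in
print(P_sb)  # ~0.048 W, consistent with the ~0.045 W quoted above,
             # versus the ~0.013 W actually measured in reflection at 2 W.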
Summary: The Pcal Y laser beam is likely clipping somewhere in the beam path. This will need to be addressed ASAP. In the future we need to keep a close eye on the Pcal summary spectra on the DetChar web pages.

Details: Jeff K. and I noticed that the spectrum for the Y-end Pcal seemed particularly noisy. I plotted some TX and RX PD channels at different times since Oct. 11. On several days since Oct. 11 the Pcal team has been to EY to perform some Pcal maintenance. One of those times (I think Oct. 18, but we don't have an aLOG reference for this), we realigned the beams on the test mass. Potentially, this change caused some clipping.

Attached are the spectra for TX and RX. Notice that there are no dramatic changes in the TX spectra. In the RX spectra, there is structure becoming more apparent with time in the 15-30 Hz and 90-140 Hz regions, and various other peaks are growing. Also attached is a minute trend of the TX and RX PD mean values. On Oct. 18, after the realignment (the step down), the RX PD starts to drift downward while the TX PD power holds steady. The decrease in RX PD is nearly 10% from the start of the realignment. The Pcal team should address this ASAP, hopefully during tomorrow's maintenance time.
Evan, it seems they are ~14% off: on top of the ~10% drift, there is also a ~4% difference between the RX and TX PDs immediately after the alignment. The alignment itself seems to have ended up with some clipping.
Here is the sequence of events, which happened within the 18:23 minute PDT on Saturday 22 October:
18:23:10 - 18:23:24 | h1fw0 asked for data retransmissions
18:23:24            | h1fw0 stopped running
18:23:26 - 18:23:36 | h1fw1 asked for data retransmissions
18:23:36            | h1fw1 stopped running
Having both frame writers down meant we lost three full frames for the GPS times 1161220928, 1161220992, 1161221056
Clearly, fixing the retransmission errors will become a higher priority if they cascade like this rather than remaining random as they have been in the past. Our third frame writer, h1fw2, did not crash and could have been used to fill in the gap if it were connected to LDAS.
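For the record, a quick check of how much data was lost, assuming the standard 64-second frame length implied by the GPS start times above:

frame_len = 64  # seconds per full frame (assumed)
lost_frames = [1161220928, 1161220992, 1161221056]
gap = lost_frames[-1] + frame_len - lost_frames[0]
print(gap)  # 192 seconds of raw frame data missing with both frame writers down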
3:32 pm local: Overfilled CP3 by doubling the LLCV to 34%. Took 18 min. to see vapor and tiny amounts of LN2 out of the exhaust line. The TCs responded but still did not read out the high negative numbers I'd expect, or saw with a 1/2 turn on the bypass fill valve. So I left the LLCV at 35% for an additional 7 min. but did not see a drastic change in temps. Never saw much flow out of the bypass exhaust pipe. Bumped the LLCV nominal from 17% to 18% open.
Removed and replaced battery packs for all vacuum rack UPSes (Ends/Mids/Corner station). No glitches noted on racks.
Work done under WP#6270.
If FAMIS were allowed to digest this activity, it could expect to become more "regular" (I'm laughing at my own jokes!)
The front end diode box that was removed from service some months ago, S/N DB 12-07, had one of its Lumina power supplies replaced (the one on the right-hand side as you face the front panel and key switch). Old: S/N 38226; new: S/N 118533. Fil/Gerardo/Peter
J. Kissel
Integration Issue 6463
ECR: E1600316
WP: 6263

P. Fritschel, S. Aston, and I have revamped the SUS channel list that is stored in the frames in order to (1) reduce the overall channel frame rate in light of the new scheme of storing only one science frame (no commissioning frames), and (2) replace what had become an unorganized hodgepodge of inconsistent ideas, accumulated over more than 6 years, of what to store. The new channel list (and the rationale for each channel and its rate) can be found in T1600432, and will not change throughout O2.

I've spent the day modifying front-end models so that they all conform to this new channel list. This impacts *every* SUS model, and we'll install the changes tomorrow (including the removal of ISIWIT channels, prepped on Friday; LHO aLOG 30728).

For the SUS models used in any control system, the channel list was changed in the respective suspension type's library part:
Sending BSFM_MASTER.mdl
Sending HLTS_MASTER.mdl
Sending HSSS_MASTER.mdl
Sending HSTS_MASTER.mdl
Sending MC_MASTER.mdl
Sending OMCS_MASTER.mdl
Sending QUAD_ITM_MASTER.mdl
Sending QUAD_MASTER.mdl
Sending RC_MASTER.mdl
Sending TMTS_MASTER.mdl
Transmitting file data ..........
Committed revision 14509.
and committed to the userapps repo.

For monitor models, the changes are done at the secondary level, but that layer is not made of library parts, so they have to be individually changed per suspension. These models,
Sending models/h1susauxb123.mdl
Sending models/h1susauxex.mdl
Sending models/h1susauxey.mdl
Sending models/h1susauxh2.mdl
Sending models/h1susauxh34.mdl
Sending models/h1susauxh56.mdl
Transmitting file data ......
Committed revision 14508.
are also now committed to the userapps repo.

We're ready for a rowdy maintenance day tomorrow -- hold on to your butts!
Note that to permit SUS-AUX channels to be acquired at 4 kHz, the models h1susauxh2, h1susauxh34, and h1susauxh56 will be modified from 2K models to 4K models as part of tomorrow's work.
Today I did some more cleanup work on the /opt/rtcds file system following yesterday's full-filesystem errors.
We perform hourly zfs snapshots on this file system, and zfs-sync them to the backup machine h1fs1 at the same rate. h1fs0 had hourly snapshots going back to May 2016.
Yesterday I had deleted all of May and thinned June down to one-per-day. Today we made the decision that, since all the files are backed up to tape, we can delete all snapshots older than 30 days. This will ensure that disk allocated to a deleted file is recovered once the last snapshot which references it is destroyed, after 30 days. I destroyed all snapshots up to 26 September 2016.
After the snapshot cleanup, the 928G file system is using 728G (split as 157G used by snapshots and 571G used by the file system itself). This is a usage of 78%, which is what df reports.
16:45 Small cube truck on site for Bubba. (Gutter Kings) Installation of gutters on the North Side of OSB. WP#6266
17:27 Peter and Jason into the PSL . WP#6268
17:39 Kyle out to EX to take measurements
17:45 Took IMC to 'OFFLINE' as per the request of Peter and Jason
17:56 ISI config changed to no BRS for both end stations for maintenance purposes
17:58 Fil at EY
18:00 Gerardo out to execute WP#6270
18:27 Kyle back from EX
18:37 Fil leaving EY
19:00 Fil at EX
19:03 Jason and Peter out. Circumstances did not allow the intended task to be performed
19:08 Fil leaving EX
19:10 Begin bringing the IFO back. IMC locking a bit daunting. Cheryl assisting.
19:12 BRS turned on at both end stations
19:35 reset PSL Noise Eater
20:39 having some difficulty aligning X-arm
20:40 Rick S and a guest up to the observation deck
20:59 Sheila out to PSL rack
21:11 Jeff into CER
21:41 Bubba to MX to inspect a fan
22:25 Chandra to MY to do CP3
Ed, Sheila
Are ezca connection errors becoming more frequent? Ed has had two in the last hour or so, one of which contributed to a lockloss (ISC_DRMI).
The first one was from ISC_LOCK, the screenshot is attached.
Happened again but for a different channel H1:SUS-ITMX_L2_DAMP_MODE2_RMSLP_LOG10_OUTMON ( Sheila's post was for H1:LSC-PD_DOF_MTRX_7_4). I trended and found data for both of those channels at the connection error times, and during the second error I could also caget the channel while ISC_LOCK still could not connect. I'll keep trying to dig and see what I find.
Relevant ISC_LOCK log:
2016-10-25_00:25:57.034950Z ISC_LOCK [COIL_DRIVERS.enter]
2016-10-25_00:26:09.444680Z Traceback (most recent call last):
2016-10-25_00:26:09.444730Z File "_ctypes/callbacks.c", line 314, in 'calling callback function'
2016-10-25_00:26:12.128960Z ISC_LOCK [COIL_DRIVERS.main] USERMSG 0: EZCA CONNECTION ERROR: Could not connect to channel (timeout=2s): H1:SUS-ITMX_L2_DAMP_MODE2_RMSLP_LOG10_OUTMON
2016-10-25_00:26:12.129190Z File "/ligo/apps/linux-x86_64/epics-3.14.12.2_long-ubuntu12/pyext/pyepics/lib/python2.6/site-packages/epics/ca.py", line 465, in _onConnectionEvent
2016-10-25_00:26:12.131850Z if int(ichid) == int(args.chid):
2016-10-25_00:26:12.132700Z TypeError: int() argument must be a string or a number, not 'NoneType'
2016-10-25_00:26:12.162700Z ISC_LOCK EZCA CONNECTION ERROR. attempting to reestablish...
2016-10-25_00:26:12.175240Z ISC_LOCK CERROR: State method raised an EzcaConnectionError exception.
2016-10-25_00:26:12.175450Z ISC_LOCK CERROR: Current state method will be rerun until the connection error clears.
2016-10-25_00:26:12.175630Z ISC_LOCK CERROR: If CERROR does not clear, try setting OP:STOP to kill worker, followed by OP:EXEC to resume.
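For context, the TypeError in the traceback comes from the pyepics connection callback comparing channel IDs when one of them is None. The following is an illustration of the failing comparison and the kind of guard that would avoid it; it is not a patch to the installed pyepics.

# Illustration only: why the callback raises, and a guard that would avoid it.
def chids_match(ichid, chid):
    # int(None) raises "int() argument must be a string or a number, not 'NoneType'",
    # which matches the traceback above when a connection event arrives with no
    # channel ID attached.
    if ichid is None or chid is None:
        return False
    return int(ichid) == int(chid)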
It happened again just now.
Opened FRS on this, marked a high priority fault.
[Jenne, Daniel, Stefan]
There seems to be an offset somewhere in the ISS second loop. When the 2nd loop comes on, even though it is supposed to be AC coupled, the diffracted power decreases significantly. This is very repeatable with on/off/on/off tests. One bad thing about this (other than having electronics with unknown behavior) is that the diffracted power is very low, and can hit the bottom rail, causing lockloss - this happened just after we started trending the diffracted power to see why it was so low.
Daniel made it so the second loop doesn't change the DC level of diffracted power by changing the input offset for the AC coupling servo (H1:PSL-ISS_SECONDLOOP_AC_COUPLING_SERVO_OFFSET from 0.0 to -0.5), the output bias of the AC coupling servo (H1:PSL-ISS_SECONDLOOP_AC_COUPLING_INT_BIAS from 210 to 200), and the input offset of the 2nd loop (H1:PSL-ISS_THIRDLOOP_OUTPUT_OFFSET from 24.0 to 23.5 - this is just summed in to the error point of the 2nd loop servo). What we haven't checked yet is if we can increase the laser power with these settings.
Why is there some offset in the ISS 2nd loop that changes the diffracted power?? When did this start happening?
We were able to increase power to 25W okay, but turning off the AC coupling made things go crazy and we lost lock. The diffracted power went up, and we lost lock around the time it hit 10%.
The 2nd loop output offset observed by the 1st loop was about 30 mV (attached, CH8). With the 2nd loop ISS gain slider set at 13 dB and a fixed gain stage of 30, this corresponds to a 0.2 mV offset at the AC coupling point. This offset is relatively small.
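The back-of-the-envelope conversion from the 30 mV seen by the first loop to the offset referred to the AC coupling point, using the 13 dB slider and the fixed gain of 30 quoted above:

V_observed = 30e-3                   # V, 2nd loop output offset seen by the 1st loop (CH8)
slider_gain = 10 ** (13.0 / 20.0)    # 13 dB gain slider as a linear voltage gain (~4.5)
fixed_gain = 30                      # fixed gain stage

offset_at_ac_coupling = V_observed / (slider_gain * fixed_gain)
print(offset_at_ac_coupling)         # ~2.2e-4 V, i.e. ~0.2 mV as stated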
One thing that has happened in the past two weeks or so is that the power the 1st loop sensor (PDA) receives was cut by about half (second attachment). This was caused by the move of the PD from its old position to its new position.
Since the sensing gain of the 1st loop was reduced by a factor of two, as seen from the 2nd loop the 1st loop is twice as efficient an actuator. Apparently the second loop gain slider was not changed (it is still at 13 dB), so even if the same offset was there before, its effect would have been a factor of two smaller.
Another possibility, which is kind of far-fetched, is that I switched off the DBB crate completely; we know that opening/closing the DBB and frontend/200W shutters caused some offset change in the 2nd loop board.
At 5:20 am local time we saw a significant range drop (from about 70 Mpc to 60 Mpc) that seems to be due to a significant increase of the line structure in the bucket that always lingers around the noise floor.
Attached are two spectra - 1h12min apart (from 11:18:00 and 12:20:00 UTC on 2016/10/24), showing that structure clearly.
Plot two shows the seismic BLRMS from the three LEAs; the corner shows the clearest increase. We are now chasing any particularly bad correlations around 12:20 this morning in the hope that it will give us a hint where this scatter is from.
Per your request, I ran BruCo on those times. The results are as follows:
bad time (12:20 UTC) : https://ldas-jobs.ligo-wa.caltech.edu/~youngmin/BruCo/PRE-ER10/H1/Oct24/H1-1161346817-600/
good reference(11:08 UTC): https://ldas-jobs.ligo-wa.caltech.edu/~youngmin/BruCo/PRE-ER10/H1/Oct24/H1-1161342497-600/
These could give you a hint about the range drop.
Here is a plot of the auxiliary loops, again comparing good vs. bad.
Note the two broad noise humps around 11.8Hz and 23.6Hz. They both increased at the bad time compared to the good time.
Interestingly, the peaks showing up in the DARM spectrum are the 4th, 5th, and so on up to the 12th harmonic of that ~11.8 Hz.
It all smells to me like some form of scatter in the input chain.
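For reference, the harmonic frequencies implied by the ~11.8 Hz fundamental mentioned above:

f0 = 11.8  # Hz, approximate fundamental of the broad noise humps
harmonics = [round(n * f0, 1) for n in range(4, 13)]
print(harmonics)
# [47.2, 59.0, 70.8, 82.6, 94.4, 106.2, 118.0, 129.8, 141.6]
# i.e. the 4th through 12th harmonics that appear as peaks in the DARM spectrum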
The IMs do not change their motion between Stefan's good time (11:08 UTC) and bad time (12:20 UTC). But, the RMs, particularly RM2, see elevated motion, almost a factor of 2 more motion between 8Hz - 15Hz.
First screenshot is IMs, second is RMs. In both, the references are the good 11:08 time, and the currents are the bad 12:20 time.
Stefan and TeamSEI are looking at the HEPI and ISI motion in the input HAMs right now.
EDIT: As one would expect, the REFL diodes see an increase in jitter at these same frequencies, predominantly in pitch. See 3rd attachment.
I quickly grabbed a time during O1 when this type of noise was happening, and it also corresponds to elevated motion around 6 Hz in RM1 and RM2. Attached are a spectrogram of DARM, and the pitch and yaw of RM2 at the time compared to a reference. There is a vertical mode of the RMs at 6.1 Hz (that's the LLO value, couldn't find it for LHO). Maybe those are bouncing more, and twice that is what's showing up in PRCL?
There should not be any ~6 Hz mode from the RM suspensions (HSTS or HLTS), so I am puzzled as to what this is. For a list of expected resonance frequencies for the HSTS and HLTS, see the links on this page: https://awiki.ligo-wa.caltech.edu/aLIGO/Resonances
@Norna: the RMs, for "REFL Mirrors," are HAM Tip-Tilt Suspensions, or HTTS (see, e.g., G1200071). These, indeed, have been modeled to have their highest (and only) vertical mode at 6.1 Hz (see the HTTS model on the aWiki). I can confirm there is no data committed to the SUS repo on the measured vertical mode frequencies of these not-officially-SUS-group suspensions at H1. Apologies! Remember, these suspensions don't have transverse/vertical/roll sensors or actuators, so one has to rely on dirt coupling showing up in the ASDs of the longitudinal/pitch/yaw sensors. We'll grab some free-swinging ASDs during tomorrow's maintenance period.
Stefan has had Hugh and me looking at SEI coupling to PRCL over this period, and so far I haven't found anything, but HAM1 HEPI is coherent with the RM damp channels, and RM2 shows some coherence to CAL_DELTAL around 10 Hz. The attached plot shows coherence from RM2_P to the HEPI Z L4Cs (blue), RM2_P to CAL_PRCL (brown), and RM2_P to CAL_DELTAL (pink). The HAM1_Z to PRCL coherence is similar to the RM2_P to CAL_PRCL one, so I didn't include it. HAM1 X and RY showed less coherence, and X was at lower frequency. There are some things we can do to improve the HAM1 motion if it's deemed necessary, like increasing the gain on the Z isolation loops, but there's not a lot of extra margin there.
Here are ASDs of the HAM3 HEPI L4Cs (~in-line DOFs: RY, RZ & X) and CAL-CS_PRCL_DQ. The HAM2 and HAM1 HEPI channels would be assessed the same way. The increase in motion seen on the HAM HEPIs is much broader than that seen on the PRC signal. Also, none of these inertial sensor channels sees any broadband coherence with the PRC; an example is also attached.
Free-swing PSDs of the RMs and OM are in alog 30852.