I didn't fully anticipate all of the effects of separating DOWN from the rest of the graph. In particular, one really bad unanticipated effect was that after lockloss, when ISC_LOCK jumps to the LOCKLOSS state, it doesn't find any path from LOCKLOSS to the last requested state, which causes it to just stall out in LOCKLOSS and not proceed to DOWN. In other words, DOWN was not run after the lockloss this morning after last night's 10 hour lock.
When I came in this morning I therefore found a bit of a poo show that I then had to clean up. None of the control signals had been shut off, multiple SUS and SEI systems were tripped, and bounce and roll modes were rung up. Evan and I eventually wrangled everything back under control, and we're now back to locking.
I have reconnected DOWN to the rest of the graph. NOTE, however, that this problem is not inherent in the fact that DOWN was disconnected. It's just that once you do something like that you remove the ability of guardian to find the right path for you, so you have to be careful to make sure you have all the appropriate jumps to get you where you need to be. I'll rethink things.
Some notable issues:
Lessons learned:
It seems like the rate of EPICS freezes has increased today; I have seen more than 5 in the last 2 hours.
Sheila, Evan, Jeff B, Corey
Both yesterday and this morning, we had extremely rung up bounce and roll modes (both times because the IFO lost lock and DOWN was not run, yesterday for the reasons explained in comments to alog 20103, today because of a different snafu).
When this happens, we need to damp bounce on ETMY while locked on ALS. To do this, it seems that we need to use a phase that is +150 degrees compared to the phase we use in full lock. This phase shift comes from the difference between the DARM loop here and in full lock. When locked on RF DARM, we need to use +120 degrees compared to the normal settings.
We also had difficulty yesterday with rung up roll modes. To damp roll we use AS WFS, so we need to get to RF DARM before we try to damp them this way. One difficulty we had was that the roll mode notches in the PUMs were not wide enough (Evan adds that the notch needs to be wide because of the Shapiro effect), so that DHARD could saturate because of the roll mode.
Bringing these things down when they are very rung up is very slow, because the actuation authority is small compared to the amount of energy in the mode. Fortunately, we are normally in the regime where the mode is small and it only takes a few minutes to damp them.
Evan, Lisa
This entry is to clarify that the impact of this excess high-frequency noise is actually bigger than the coherence with the ASC channels suggests, as can clearly be seen by comparing OMC NULL and SUM. For example, around 2 kHz, the discrepancy in the noise floor between OMC SUM (total noise) and OMC NULL (shot + dark noise) is about 15%, corresponding to a noise that is about 0.6 times shot + dark. The attachment shows OMC SUM/NULL in H1 at low noise (left) compared to L1 (right). So, the message is that we are looking for something quite big here.
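The step from a 15% discrepancy to "0.6 times shot + dark" can be checked with a short calculation. This is a minimal sketch that assumes the excess noise is uncorrelated with shot + dark noise, so that the two add in quadrature in OMC SUM; the variable names are illustrative only.

```python
import math

# Ratio of the OMC SUM to OMC NULL amplitude spectral densities near 2 kHz,
# taken from the ~15% discrepancy quoted in the entry.
sum_over_null = 1.15

# Assumption: the excess noise is uncorrelated with shot + dark noise,
# so the ASDs add in quadrature: SUM^2 = NULL^2 + excess^2.
excess_over_null = math.sqrt(sum_over_null**2 - 1.0)

print(f"excess / (shot + dark) = {excess_over_null:.2f}")
```

Under that assumption the ratio comes out at roughly 0.57, consistent with the rounded figure of 0.6.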
After 17:30 UTC the interferometer was not undisturbed: I was making PEM injections.
The interferometer has been locked undisturbed for several hours in low noise before Robert started his injections. The range degraded slowly over time, and it has been polluted by some huge glitches, similarly to what has been observed in the past.
It turns out that the range was degraded by a changing ISS coupling during the lock. Evan and Matt had left the ISS second loop open, as they were having problems with it. You would see a plot with a DARM spectrum at the beginning and at the end of this lock, showing large peaks appearing in DARM (a factor of a few above the noise floor), if DTT hadn't crashed on me twice while trying to save the plot as PDF...
Measured temps of heated areas of RGA -> 95C < temps < 120C -> Made slight changes to variac settings -> Aux. cart @ 2.5 x 10^-5 torr (seems high for this configuration)
It's been a week since the DAQ reconfiguration which reduced the NFS/QFS disk loading, and both framewriters continue to be 100% stable. The attached plot shows the restarts of h1fw0 (red circles), h1fw1 (green circles) and the DAQ system as a whole (blue squares) for the month of July. The magenta lines show when h1fw0 and h1fw1 were modified. In the past 7 days, the only restarts of the framewriters are associated with complete DAQ restarts.
This indicates that the existing aLIGO DAQ frame writers meet or exceed the original design requirement (~10 MB/sec frames to disk). They do not meet the current needs of ~30-40 MB/sec, of course.
Matt, Lisa, Evan
Tonight we looked at the coherences between the OMC DCPD channels and ASC AS C, this time at several different interferometer powers. In the attached plots, green is at 11 W, violet is at 17 W, and apricot is at 24 W.
Evidently, the appearance of excess high-frequency noise in OMC DCPD sum (and the coherence of OMC DCPD sum with ASC AS C) grows as the power is increased. We believe that this behavior rules out the possibility that this excess noise is caused by RIN in the AS port carrier, assuming that any such RIN is independent of the DARM offset and of the PSL power. Since the DARM offset is adjusted during power-up to maintain a constant dc current on the DCPDs, RIN in the AS carrier should result in an optical power fluctuation whose ASD (in W/rtHz) does not vary during the power-up. This is the behavior that we see in the null stream, where the constant DCPD dc currents ensure that the shot-noise-induced power fluctuation is independent of the PSL power.
On a semi-related note, the slope in the OMC DCPDs at high frequencies is mostly explained by the uncompensated preamp poles and the uncompensated AA filter.
I modified the ISC_LOCK guardian to revert the DOWN state back to being a 'goto'. This allows you to select the state directly, without having to go to MANUAL.
The reason it had been removed as a 'goto' was that occasionally someone would accidentally request a lower state while the IFO was locked, which would cause the IFO to go back through DOWN to get to the errantly requested state. To avoid this I implemented some graph shenanigans: I disconnected DOWN from the rest of the graph, but told it to jump to a new READY state at the bottom of the main connected part of the graph once it's done:

This allows DOWN to be a goto, so it's always directly requestable, but prevents guardian from seeing a path through it to the rest of the graph. Once DOWN is done, though, it jumps into the main part of the graph at which point guardian will pick up with the last request and move on up as expected.
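The path-finding consequence of disconnecting a state can be illustrated with a toy graph search. This is only a sketch: it is not guardian's actual implementation, the breadth-first search is a stand-in for whatever path selection guardian uses, and the LOCKING and LOW_NOISE state names and edges are hypothetical.

```python
from collections import deque

def find_path(edges, start, goal):
    """Breadth-first search over directed edges; returns a list of states or None."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Hypothetical, heavily simplified main graph. DOWN has no edges at all,
# so the search can never route a request through it.
edges = [("READY", "LOCKING"), ("LOCKING", "LOW_NOISE")]

through_down = find_path(edges, "DOWN", "LOW_NOISE")   # no path exists
from_ready = find_path(edges, "READY", "LOW_NOISE")    # reachable after the jump
print(through_down, from_ready)
```

In this picture the explicit jump from DOWN to READY is what gets the node back into the connected part of the graph; any other state without an edge or a jump into that part would have no path to the request either.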
Well, that didn't work. See alog 20134. Separating DOWN from the rest of the graph caused some unanticipated bad effects. This is actually not inherent in disconnecting DOWN from the rest of the graph, but it needed to be considered a bit more carefully. See the other post for more info.
J. Kissel
WP 5395, ECR E1500230, II 1054
I've removed all redundant IPC Error EPICS channels from the top-level models of all SUS this evening. This is in accordance with ECR E1500230. The models compile, and have been committed to the SVN. They will be installed this coming Tuesday. Once installed, this closes out the ECR and Integration Issue for LHO.
Jeff, Sheila
We have had three locklosses in the last 2 days that were caused by the ETMX UIM coil driver rocker switch tripping. The only solution is to drive to the end station, and flip the rocker switch back on. Something is wrong with this coil driver (Jeff thinks it should just be replaced).
The only way to notice this is by looking at the OSEM centering medm screen. It would be great to add it to SYS DIAG (it's not caught by the current noisemon check), and to the ops overview screen and the quad overview in a more obvious way.
Over the past two days h1susey has IOP glitched 10 times compared with only two times the previous two days. Here is a log of the recent glitches
With the IMC unlocked, these are some numbers for power on the IMC REFL PD (S1203775):
Attached is a file listing the channel differences between the L1 and H1 science frames.
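A comparison like this can be sketched as follows. This is not the tool that produced the attached file; it assumes each site's channel list is available as a mapping from channel name to data rate, and the helper names are hypothetical. The example channels are taken from the differences discussed in this thread.

```python
def strip_ifo(channel):
    """Drop the 'H1:' / 'L1:' interferometer prefix from a channel name."""
    return channel.split(":", 1)[1]

def compare_channels(h1, l1):
    """Compare two {channel: rate} dicts; returns (only_h1, only_l1, rate_mismatch)."""
    h = {strip_ifo(name): rate for name, rate in h1.items()}
    l = {strip_ifo(name): rate for name, rate in l1.items()}
    only_h1 = sorted(set(h) - set(l))
    only_l1 = sorted(set(l) - set(h))
    # Channels present at both sites but recorded at different rates.
    rate_mismatch = {n: (h[n], l[n]) for n in sorted(set(h) & set(l)) if h[n] != l[n]}
    return only_h1, only_l1, rate_mismatch

h1 = {"H1:ISI-GND_BRS_ETMX_RY_OUT_DQ": 256, "H1:ODC-X_CHANNEL_OUT_DQ": 16384}
l1 = {"L1:OAF-CAL_DARM_DQ": 16384, "L1:ODC-X_CHANNEL_OUT_DQ": 32768}

only_h1, only_l1, rate_mismatch = compare_channels(h1, l1)
print(only_h1, only_l1, rate_mismatch)
```

Stripping the interferometer prefix first is what lets same-named channels with different rates show up as a rate mismatch rather than as two unrelated entries.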
It looks like almost all of the non-PEM differences can be explained by differences in hardware, control scheme/choices, and non-deprecated channels due to little-to-no maintenance.

LHO has a beam rotation sensor, and LLO does not.
< H1:ISI-GND_BRS_ETMX_REF_OUT_DQ 256
< H1:ISI-GND_BRS_ETMX_RY_OUT_DQ 256

LHO uses a different tidal scheme (T1400733).
< H1:LSC-Y_ARM_OUT_DQ 256
< H1:LSC-Y_TIDAL_OUT_DQ 256

LHO has not yet updated the CAL-CS calibration for IMC-F, so it remains in OAF.
< H1:OAF-CAL_IMC_F_DQ 16384

LLO has PI damping and LHO does not?
> L1:LSC-X_EXTRA_AI_1_OUT_DQ 2048
> L1:LSC-X_EXTRA_AI_2_OUT_DQ 2048
> L1:LSC-X_EXTRA_AI_3_OUT_DQ 2048
> L1:LSC-Y_EXTRA_AI_1_OUT_DQ 2048
> L1:LSC-Y_EXTRA_AI_2_OUT_DQ 2048
> L1:LSC-Y_EXTRA_AI_3_OUT_DQ 2048

Regardless of what was decided via the formal process, Daniel hasn't visited LLO recently and force-reduced the ODC data rate.
< H1:ODC-X_CHANNEL_OUT_DQ 16384
< H1:ODC-Y_CHANNEL_OUT_DQ 16384
< H1:PSL-ODC_CHANNEL_OUT_DQ 16384
---
> L1:ODC-X_CHANNEL_OUT_DQ 32768
> L1:ODC-Y_CHANNEL_OUT_DQ 32768
> L1:PSL-ODC_CHANNEL_OUT_DQ 32768

LLO has not completely deprecated OAF for all of its LSC DOF calibrations.
> L1:OAF-CAL_CARM_X_DQ 16384
> L1:OAF-CAL_DARM_DQ 16384
> L1:OAF-CAL_MICH_DQ 16384
> L1:OAF-CAL_PRCL_DQ 16384
> L1:OAF-CAL_SRCL_DQ 16384
> L1:OAF-CAL_XARM_DQ 16384
> L1:OAF-CAL_YARM_DQ 16384

LLO uses a different ISS second loop scheme (or hasn't deprecated one of its attempts that is no longer used)?
> L1:PSL-ISS_SECONDLOOP_PD_14_SUM_OUT_DQ 16384
> L1:PSL-ISS_SECONDLOOP_PD_58_SUM_OUT_DQ 16384

LLO has a HV ESD driver on its ITMX, LHO does not.
> L1:SUS-ITMX_L3_ESDAMON_DC_DQ 256
> L1:SUS-ITMX_L3_ESDAMON_LL_DQ 256
> L1:SUS-ITMX_L3_ESDAMON_LR_DQ 256
> L1:SUS-ITMX_L3_ESDAMON_UL_DQ 256
> L1:SUS-ITMX_L3_ESDAMON_UR_DQ 256
> L1:SUS-ITMX_L3_ESDDMON_CAS_DQ 256
> L1:SUS-ITMX_L3_ESDDMON_HVN_DQ 256
> L1:SUS-ITMX_L3_ESDDMON_HVP_DQ 256
> L1:SUS-ITMX_L3_ESDDMON_LVN_DQ 256
> L1:SUS-ITMX_L3_ESDDMON_LVP_DQ 256
> L1:SUS-ITMX_L3_ESDDMON_MCU_DQ 256
> L1:SUS-ITMX_L3_ESDDMON_TM1_DQ 256
> L1:SUS-ITMX_L3_ESDDMON_TM2_DQ 256
> L1:SUS-ITMX_L3_ESDDMON_TM3_DQ 256
> L1:SUS-ITMX_L3_ESDDMON_TM4_DQ 256
> L1:SUS-ITMX_L3_ESDDMON_TM5_DQ 256
> L1:SUS-ITMX_L3_ESDDMON_TM6_DQ 256
Kyle, Gerardo
In and out of X-end VEA ~1030 - 1230 hrs. local
Added 1.5" O-ring valve in series with existing 1.5" metal angle valve
-> Wrapped RGA with heater tapes and foil
-> Elevated pump cart off of floor (resting on foam)
-> NW40 inlet 50 L/s turbo (no vent valve) backed by aux. cart (no vent valve)
-> Begin 100C bake
In and out of X-end VEA between 1405 - 1425 hrs. local, 1450 - 1455 hrs. local and 1600 - 1605 hrs. local.
NOTE: Will need to enter X-end VEA to make adjustments Saturday morning
~1710 - 1800 hrs. local
I realized that I had a CFF inlet 50 L/s turbo on the shelf as well as a UHV 1.5" valve
-> Swapped out 1.5" O-ring valve and NW40 inlet turbo for their drier cousins
-> Resumed bakeout
Something happened at lock loss and the IFO did not reach the defined DOWN state.
Symptoms:
Dave, Jamie, Sheila, operators, and others are investigating.
Let me try to give a slightly more detailed narrative as we were able to reconstruct it:
So what's the take away:
bug number 1: guardian should have caught the NDS connection error during the NDS restart and gone into a "connection error" (CERROR) state. In that case, it would have continually checked the NDS connection until it was re-established, at which point it would have continued normal operation. This is in contrast to the ERROR state where it waits for operator intervention. I will work on fixing this for the next release.
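The difference between the two behaviors can be sketched as a polling loop. This is a hypothetical illustration of the "keep checking until the connection is re-established" idea, not guardian code; `wait_for_connection`, `FlakyServer`, and their parameters are invented for the example.

```python
import time

def wait_for_connection(connect, poll_interval=1.0, max_attempts=None):
    """Call connect() until it succeeds; return the number of attempts made.

    With max_attempts=None this never gives up, which mirrors a
    connection-error state that recovers on its own instead of waiting
    for an operator.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            connect()
            return attempts
        except ConnectionError:
            if max_attempts is not None and attempts >= max_attempts:
                raise
            time.sleep(poll_interval)

class FlakyServer:
    """Stand-in for a data server that is unavailable for a few attempts."""
    def __init__(self, failures):
        self.failures = failures

    def connect(self):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("server unavailable")

server = FlakyServer(failures=2)
attempts = wait_for_connection(server.connect, poll_interval=0.0)
print(f"connected after {attempts} attempts")
```

Setting `max_attempts` reproduces the other behavior: after a bounded number of tries the error propagates and someone has to intervene.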
bug number 2: The operators didn't know or didn't respond to the fact that the IMC_LOCK guardian had gone into ERROR. This is not good, since we need to respond quickly to these things to keep the IFO operating robustly. I propose we set up an alarm in case any guardian node goes into ERROR. I'll work with Dave et al. to get this set up.
As an aside, I'm going to be working over the next week to clean up the guardian and SDF/SPM situation to eliminate all the spurious warnings. We've got too many yellow lights on the guardian screen, which means that we're now in the habit of just ignoring them. They're supposed to be there to inform us of problems that require human intervention. If we just leave them yellow all the time they end up having zero effect and we're left with a noisy alarm situation that everyone just ignores.
A series of events led to the ISC_LOCK Guardian not understanding that there was a lockloss.
To prevent this from happening in the future, Jamie will have Guardian continue to wait for the NDS server to reconnect, rather than stopping and waiting for user intervention before becoming active again. I also added a verbal alarm for Guardian nodes in Error to alert Operators/Users that action is required.
(If I missed something here please let me know)
Matt, Evan
Why do the TMSX RT and SD OSEMs have such huge spikes at 1821 Hz and harmonics? These spikes are about 4000 ct pp in the time series. In comparison, the other OSEMs on TMSX are 100 ct pp or less (F1 and LF shown for comparison).
Also attached are the spectra and time series of the corresponding IOP channels.
On a possibly related note: in full lock, the TMSX QPDs see more than 100 times more noise at 10 Hz than the TMSY QPDs do.
From Gabriele's bruco report, the X QPDs have some coherence with DARM around 78 Hz and 80 Hz. A coherence plot is attached.
It seems similar to the problem from log 12465. Power cycling the AA chassis fixed the issue at the time.
Quenched the oscillation for now (Vern, Keita)
We were able to clearly hear some kHz-ish sound from the satellite amplifier of TMSX that is connected to SD and RT. Power cycling (i.e. removing the cable powering the BOSEM and connecting it again) didn't fix it despite many trials.
We moved to the driver, power cycled the driver chassis, and it didn't help either.
The tone of the audible oscillation changed when we wiggled the cable on the satellite amp, but that didn't fix it.
Vern gave the DB25 connector on the satellite amp a hard whack in a direction to push the connector further into the box, and that fixed the problem for now.
Summary
To decrease uncertainty in calculation of actuation function correction factor, kappa_A, sensing function correction factor, kappa_C, and CC pole frequency, f_c, we've recently increased calibration line amplitudes to give SNR of 100 with 10s FFT (see LHO alog #19792). Earlier Kiwamu posted his investigation of CC pole frequency over the last weekend in LHO alog comment #19988. In this alog we show kappa_A, kappa_C and f_c calculated according to the method described in T1500377-v3 for the same time interval (2015-07-25 00:00 UTC to 2015-07-27 UTC, when GRD-ISC_LOCK_STATE_N >= 501, 1 min FFTs).
Statistical uncertainties of kappa_A, kappa_C and f_c within the 1.5-hour time interval highlighted in green are:
Xctrl(34.7) and PcalX(33.1), std(kappa_A) = +/- 0.45 % (1 sigma)
PcalX(325.1), std(kappa_C) = +/- 1.12 % (1 sigma); std(f_c) = +/- 5.20 Hz (1 sigma)
PcalY(331.9), std(kappa_C) = +/- 1.43 % (1 sigma); std(f_c) = +/- 5.55 Hz (1 sigma)
PcalX(534.7), std(kappa_C) = +/- 0.70 % (1 sigma); std(f_c) = +/- 2.08 Hz (1 sigma)
PcalY(540.7), std(kappa_C) = +/- 0.78 % (1 sigma); std(f_c) = +/- 2.68 Hz (1 sigma)
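For reference, a percent-level 1-sigma figure of this kind can be computed from a time series as sketched below. This is an assumption about the convention (sample standard deviation divided by the mean), not a description of the analysis code used here, and the kappa values are made up for illustration.

```python
import statistics

def percent_std(samples):
    """1-sigma sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(samples) / statistics.mean(samples)

# Made-up per-minute estimates of a correction factor near 1.
kappa = [1.002, 0.998, 1.005, 0.995, 1.000]

print(f"std(kappa) = +/- {percent_std(kappa):.2f} % (1 sigma)")
```

For f_c the figures are quoted in Hz, so the plain `statistics.stdev` of the samples would apply without the division by the mean.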
Notice that kappa_C and f_c on the left subplots were calculated from low SNR 325.1 Hz and 331.9 Hz Pcal lines set by Evan (see LHO alog comment #19823). Calculation of these parameters using higher SNR 534.7 Hz and 540.7 Hz Pcal lines (right subplots) gave less noisy results.
Details
C_0, D_0 and A_0 were taken from LHO ER7 DARM model.
To make kappa_C calculations consistent between results from the 4 Pcal lines, a manual correction to the phases of the Pcal lines, corresponding to 130 us of time advance, was applied. On the plot we report only changes in f_c, by subtracting a mean value of about 300 Hz. In order to obtain an absolute value of f_c using this method, we must take into account the exact time delay/advance of the PCAL RXPD channel w.r.t. DARM_ERR; possibly a frequency-independent phase shift (however, we do not know any reason for that); and the DARM model TFs at the reference time, C_0, D_0 and A_0.
Plots of 1 min FFT dewhitened calibration line amplitudes and phases are given below.
Calibration line uncertainties in DARM_ERR readout in a 1.5-hour interval (highlighted in green) are as follows:
Xctrl( 34.7) = 2.2000e-01 (+/- 0.00 %); Derr( 34.7) = 2.9738e-10 (+/- 0.15 %)
PcalX( 33.1) = 2.4587e-02 (+/- 0.00 %); Derr( 33.1) = 3.0817e-10 (+/- 0.26 %)
PcalX(325.1) = 1.0724e-01 (+/- 0.00 %); Derr(325.1) = 2.0593e-10 (+/- 1.48 %)
PcalY(331.9) = 9.3791e-02 (+/- 0.01 %); Derr(331.9) = 2.0150e-10 (+/- 1.55 %)
PcalX(534.7) = 7.1100e-01 (+/- 0.00 %); Derr(534.7) = 5.8223e-10 (+/- 0.52 %)
PcalY(540.7) = 6.3845e-01 (+/- 0.01 %); Derr(540.7) = 5.8948e-10 (+/- 0.45 %)
P.S.
After today's calibration telecon we've changed calibration lines that will be used for estimation of kappa_A, kappa_C and f_c to (see LHO alog #20063):
We are also planning to add an ESD line close to low frequency PCALY line and another high frequency low SNR PCALX line at 3001.3 Hz after completing power budget investigations of PCALX module.
A plot of kappa_A, kappa_C and f_c calculated from the new calibration lines (see LHO alogs #20063 and #20052) for last night's undisturbed lock stretches is given below.
As reported in LHO alog #20089, undisturbed data was collected for ~25 minutes in the interval [Jul 31 2015 09:21:13 UTC, Jul 31 2015 09:46:13 UTC]; this interval is highlighted with green data points.
Details
Statistical uncertainties of 1 min FFT calibration line amplitudes in d_err in undisturbed interval (highlighted with green markers on the attached plot) are:
PCALY line in d_err(36.7 Hz) = 3.6189e-10 (+/- 0.14 %)
DARM line in d_err(37.3 Hz) = 4.4939e-10 (+/- 0.08 %)
PCALY line in d_err(331.9 Hz) = 3.0103e-10 (+/- 0.41 %)
PCALY line in d_err(1083.7 Hz) = 3.5701e-10 (+/- 1.30 %)
Statistical uncertainties of calculated kappa_A, kappa_C and f_c in undisturbed interval are:
from Xctrl(37.3) and PcalY(36.7):
std(kappa_A) = +/- 0.92 % (1 sigma)
from PcalY(331.9):
std(kappa_C) = +/- 0.73 % (1 sigma)
std(f_c) = +/- 3.40 Hz (1 sigma)
Statistical uncertainties of calculated kappa_A, kappa_C and f_c are:
A quick look at my monitors is not showing anything unusual for Saturday. The dolphin manager reports 5 connection errors spread evenly throughout Saturday (list shown below); my LSC, ASC, SUSAUXB123 CA-monitors only caught the 22:19 event. I'll do some more detailed analysis tomorrow using the EDCU DAQ channels.
08 01 01:29
08 01 12:39
08 01 16:17
08 01 17:27
08 01 22:19
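The spacing between these events can be tabulated with a few lines of Python. This sketch assumes the stamps are in "MM DD HH:MM" form; the year is not part of the list and is supplied as an assumed default.

```python
from datetime import datetime

stamps = ["08 01 01:29", "08 01 12:39", "08 01 16:17", "08 01 17:27", "08 01 22:19"]

def parse(stamp, year=2015):
    # Assumption: "MM DD HH:MM"; the year is not in the list and is supplied here.
    return datetime.strptime(f"{year} {stamp}", "%Y %m %d %H:%M")

times = [parse(s) for s in stamps]

# Gap between consecutive connection errors, in whole minutes.
gaps_min = [int((b - a).total_seconds() // 60) for a, b in zip(times, times[1:])]
print(gaps_min)
```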