I** have written a program to compare slow controls channels before and after the Wed 10 Sep 2025 power glitch to see if there are any which look like they may have been broken and need further investigation.
(** Full disclosure: it was actually written by AI (Claude Code) and runs on my GC laptop, safely contained in a virtual machine running Debian 13 (Trixie).)
As a first pass, the code is looking for dead channels. These are channels which were active before the glitch and are flat-line zero afterwards.
The code gets its channels to analyze from the slow controls INI files.
Using Jonathan's simple_frames python module, it reads two minute-trend GWF frame files. For "before" I'm using the 10:00-11:00 Wednesday file (an hour before the glitch) and for "after" I'm using the Thu 00:00-01:00 file. In both cases H1 was locked.
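For reference, here is a rough sketch of what the dead-channel check amounts to. It uses configparser for the INI channel list and numpy for the flat-line test; read_trend() below is just a placeholder standing in for the simple_frames reader, whose exact interface I'm not reproducing here.

    import configparser
    import numpy as np

    def channels_from_ini(ini_path):
        """Channel names are assumed to be the section headers of the INI file."""
        cfg = configparser.ConfigParser(interpolation=None)
        cfg.read(ini_path)
        return [s for s in cfg.sections() if s.startswith('H1:')]

    def read_trend(gwf_path, channel):
        """Placeholder: return the trend samples for `channel` from a GWF file.
        In the real script this is done with Jonathan's simple_frames module."""
        raise NotImplementedError

    def is_dead(before, after, eps=1e-12):
        """Active before the glitch, flat-line zero after."""
        return np.std(before) > eps and np.all(np.abs(after) < eps)

    def scan(ini_path, before_gwf, after_gwf):
        dead = []
        for chan in channels_from_ini(ini_path):
            try:
                before = read_trend(before_gwf, chan)
                after = read_trend(after_gwf, chan)
            except KeyError:
                continue  # channel not present in the trend frames
            if is_dead(before, after):
                dead.append(chan)
        return dead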
I'll post the results as comments to this alog.
TITLE: 09/15 Day Shift: 1430-2330 UTC (0730-1630 PST), all times posted in UTC
STATE of H1: Corrective Maintenance
OUTGOING OPERATOR: None
CURRENT ENVIRONMENT:
SEI_ENV state: CALM
Wind: 6mph Gusts, 3mph 3min avg
Primary useism: 0.01 μm/s
Secondary useism: 0.09 μm/s
QUICK SUMMARY: I'll start by taking H1 through an initial alignment then relock up through CARM_5_PICOMETERS, since it sounds like that's still the latest stable state H1 can get to.
Attempt 1:
Reached CARM_5_PICOMETERS by guardian. With the TR_CARM offset at -52, we could then set the TR_CARM gain to 2.1, then step the TR_CARM offset to -56. Then we could set the DHARD P gain to -30 and the DHARD Y gain to -40.
Then ran CARM_TO_TR with the guardian. We could step the TR_REFLAIR9 offset to -0.03 and things looked stable, but things started to ring up at 2 Hz when we stepped to -0.02.
It seems like the increased DHARD gain helped keep things more stable than last night.
Plot attached of the lockloss; Elenna pointed out we need to look at the faster channels. The oscillation started once the REFLAIR9 offset was -0.02 and higher. It's a 17-18 Hz wobble, also seen growing in all the LSC signals, which makes more sense for a frequency this fast.
This same 17 Hz LSC wobble was seen in the last lockloss last night too (plot attached).
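In case it's useful for whoever looks next, here is a hedged gwpy sketch of how to pull one of the fast LSC channels around a lockloss and look for the 17-18 Hz wobble; the GPS time and channel name below are placeholders, not the ones from these plots.

    from gwpy.timeseries import TimeSeries

    # Placeholder lockloss GPS time and an assumed fast LSC channel name --
    # substitute the real ones from the lockloss tool / plots above.
    t_lockloss = 1442000000
    chan = 'H1:LSC-REFL_SERVO_ERR_OUT_DQ'

    data = TimeSeries.get(chan, t_lockloss - 30, t_lockloss + 2)
    sg = data.spectrogram2(fftlength=1, overlap=0.5) ** (1 / 2.)
    plot = sg.plot(norm='log')
    ax = plot.gca()
    ax.set_ylim(5, 100)                      # the wobble sits around 17-18 Hz
    ax.axhline(17.5, color='w', linestyle='--')
    plot.show()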
TITLE: 09/15 Eve Shift: 2330-0500 UTC (1630-2200 PST), all times posted in UTC
STATE of H1: Corrective Maintenance
INCOMING OPERATOR: None
SHIFT SUMMARY:
IFO is in IDLE for CORRECTIVE MAINTENANCE
The mission today was to get to RESONANCE. We got really close!
The idea was that the calculated TR_CARM offset was likely incorrect, so we had to step the offset to maintain stability as we went toward resonance. So the attempts below all start from CARM_5_PICOMETERS.
While sitting in CARM_TO_REFL during handoff, took RF_DARM TF (attached).
I'm stealing Ryan S's format for his IFO troubleshooting today since it was so easy to read! (Thanks Ryan!)
Reference for healthy CARM to RESONANCE transition pre-outage: 2025/09/09 20:01 UTC, GPS: 1441483301
Relocked,
Relocked,
Ran Initial Alignment - fully auto
Relocked,
Relocked,
Plan now was to try to step the TR_CARM_GAIN up as we were stepping the TR_CARM_OFFSET down further from -52, to keep the UGF consistent and not cause a LL. Sheila planned to measure this each time we stepped it (a rough sketch of the co-stepping is at the end of this entry).
Relocked,
IFO in IDLE per Sheila's Instruction. We will try again tomorrow.
LOG:
None
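As flagged above, here is a minimal sketch of the gain/offset co-stepping plan, written with pyepics; the channel names, step sizes, and wait time are illustrative guesses rather than the values we actually used.

    import time
    from epics import caget, caput

    # Illustrative channel names -- check MEDM / the Guardian code for the real ones.
    OFFSET_CHAN = 'H1:LSC-TR_CARM_OFFSET'
    GAIN_CHAN = 'H1:LSC-TR_CARM_GAIN'

    def step_toward_resonance(offset_stop=-62, offset_step=-2, gain_step=0.2, wait=10):
        """Step the TR_CARM offset toward resonance while raising the gain,
        so the loop UGF stays roughly constant as the optical gain changes."""
        offset = caget(OFFSET_CHAN)
        gain = caget(GAIN_CHAN)
        while offset > offset_stop:
            offset += offset_step
            gain += gain_step
            caput(OFFSET_CHAN, offset)
            caput(GAIN_CHAN, gain)
            time.sleep(wait)  # pause so an OLG measurement can be made at each step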
Here are two screenshots to show our last attempt last night.
The second one shows the TR_REFL9 error signal as we stepped the TR_CARM offset closer to resonance. There is some noise in the TR_REFL9 signal due to alignment fluctuations coupling in when we are not on resonance. As we get closer to resonance and increase the TR_CARM gain (to compensate as the optical gain of the side-of-fringe signal drops), the refl signal gets a little quieter. The offset gets set to zero the error signal right before the handoff, and as you can see the handoff to the error signal works well and the refl 9 signal becomes quiet once it is in loop. Looking at the TR_CARM signal after the handoff, you can see that the arm powers fluctuate because they are now seeing the noise from refl 9, which is now in loop.
When the offset is removed, TR_CARM goes to about 58. All build-ups become wobbly when the offset is removed to go to resonance. This suggests that something is wrong with the TR_REFLAIR9 error signal, which could be due to an alignment problem or an issue with the sensor itself.
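As an aside, "the offset gets set to zero the error signal right before the handoff" amounts to something like the following cdsutils sketch; the channel names are guesses for illustration, and the sign convention depends on where the offset enters the filter bank.

    import cdsutils
    from epics import caput

    # Assumed channel names, for illustration only.
    ERR_CHAN = 'H1:LSC-REFLAIR_A_RF9_I_ERR_OUTPUT'
    OFFSET_CHAN = 'H1:LSC-REFLAIR_A_RF9_I_OFFSET'

    # Average the (still out-of-loop) error signal for a few seconds, then set
    # the offset so the averaged error reads zero at the handoff point.
    err_avg = cdsutils.avg(5, ERR_CHAN)
    caput(OFFSET_CHAN, -err_avg)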
TITLE: 09/14 Day Shift: 1430-2330 UTC (0730-1630 PST), all times posted in UTC
STATE of H1: Corrective Maintenance
INCOMING OPERATOR: Ibrahim
SHIFT SUMMARY: More troubleshooting today on H1 locking, primarily focused on getting through the CARM_TO_REFL transition and RESONANCE states. See my morning alog for details on the first half of the day; the rest I list here:
Generally locking has been very easy today up through CARM_5_PICOMETERS and alignment has been good.
TITLE: 09/14 Eve Shift: 2330-0500 UTC (1630-2200 PST), all times posted in UTC
STATE of H1: Corrective Maintenance
OUTGOING OPERATOR: Ryan S
CURRENT ENVIRONMENT:
SEI_ENV state: CALM
Wind: 10mph Gusts, 6mph 3min avg
Primary useism: 0.02 μm/s
Secondary useism: 0.12 μm/s
QUICK SUMMARY:
IFO is DOWN for CORRECTIVE MAINTENANCE
Ryan S has tried an exhaustive combination (I will link the alog when posted) of lock attempts to get beyond where we were yesterday, which was losing lock upon loading the matrix necessary to get from CARM_5_PICOMETERS to CARM_TO_REFL to RESONANCE.
I don't have any ideas of what to try but will be in communication with commissioners.
Below is my outline of locking attempts so far this morning, with any changes I made in Guardian or otherwise in bold:
Summary of what we've changed so far since the overnight lock where IMC visibility degraded: (starting some notes, plan to add links later).
Sun Sep 14 10:08:17 2025 INFO: Fill completed in 8min 14secs
TITLE: 09/14 Day Shift: 1430-2330 UTC (0730-1630 PST), all times posted in UTC
STATE of H1: Corrective Maintenance
OUTGOING OPERATOR: None
CURRENT ENVIRONMENT:
SEI_ENV state: CALM
Wind: 30mph Gusts, 24mph 3min avg
Primary useism: 0.04 μm/s
Secondary useism: 0.11 μm/s
QUICK SUMMARY: Windy and rainy morning on-site. I'll start by running H1 through a fresh initial alignment, then continue troubleshooting where Ibrahim left off last night.
TITLE: 09/14 Eve Shift: 2330-0500 UTC (1630-2200 PST), all times posted in UTC
STATE of H1: Corrective Maintenance
INCOMING OPERATOR: None
SHIFT SUMMARY:
IFO is in IDLE and CORRECTIVE MAINTENANCE
The "good" news:
DHARD P loop has been successfully automatic since closing it this afternoon. We can lock all the way to CARM_5_PICOMETERS quite quickly. DRMI has also been good to me this evening.
The bad news:
We thought we were losing lock at RESONANCE, but we were really losing lock at CARM_5_PICOMETERS.
Stepping through this state has shown that it's only at the end of it, where the matrix involving REFL_BIAS, TR_REFL9 and TR_CARM is loaded, that we lose lock roughly 10s later due to SRM becoming unstable quickly.
I investigated this via alogs 86909, 86910, 86911, 86912, which are comments to my OPS shift start (alog 86908). After being led down a rabbit hole of ramp times, based on 3 other times when that was the problem, I can confirm that this isn't it. Curiously, lock was lost faster with a longer ramp time.
With confirmation from TJ and Jenne, I'm leaving IFO in IDLE with the plan being to solve this problem in tomorrow's shift.
I do feel like we're close to solving this problem or at least figuring out where the issue lies. Specific details are in the alogs listed below.
LOG:
None
TITLE: 09/13 Eve Shift: 2330-0500 UTC (1630-2200 PST), all times posted in UTC
STATE of H1: Corrective Maintenance
OUTGOING OPERATOR: Ryan S
CURRENT ENVIRONMENT:
SEI_ENV state: CALM
Wind: 17mph Gusts, 11mph 3min avg
Primary useism: 0.02 μm/s
Secondary useism: 0.15 μm/s
QUICK SUMMARY:
IFO is DOWN in CORRECTIVE MAINTENANCE
There was a lot of good work done today, primarily rephasing REFLAIR (alog 86903) and closing the DHARD_P loop. This has allowed us to easily get past DRMI. It seems that we are still losing lock around resonance due to a fast 35 Hz ringup.
The Procedure for the EVE:
What has happened so far:
The same PR_G ringup caused a lockloss at RESONANCE and CARM_TO_REFL, so I'm going to the state before, which is CARM_5_PICOMETERS. I will follow the blueprint above. Thankfully, DHARD_P has worked twice now.
Writing this before I attempt to look at ISC_library but:
Our last stable state took us to CARM_5_PICOMETERS. I stayed there just to be sure it was stable.
I then followed the procedure and stepped through the next state, CARM_TO_REFL.
I will begin investigating what I can, but I think this is an issue with loading the matrix from ISC_library, implying something is up with REFL_BIAS or TR_REFL9? Somewhat out of my depth, but I will investigate alogs, the wiki, ISC_library and anything else I can.
Either way, it seems that we thought we were losing lock at RESONANCE but it was actually CARM_TO_REFL's last line, unless I got something wrong in the if statement I didn't run through, which is also something I will investigate.
In looking at the problem lines loaded by the matrix, I see REFL_BIAS, TR_REFL9 and TR_CARM.
According to ISC_library.py, these are REFL_BIAS=9, TR_REFL9 = 27 and TR_CARM = 26. I don't know what this means yet.
Then, I looked at alogs hoping to find the last time something like this happened, which from vague memory was during post-vent recovery this year (with Craig and Georgia):
Looking for REFL_BIAS, I found Sheila's 2018 alog 43623 talking about the TR_CARM transition and how SRM would saturate first, which was what I was experiencing. Specifically the line is: TR_CARM transition: This has been unreliable for the last several days, our most frequent reason for failure in locking. We found that the problem was a large transient when the REFL_BIAS (TR CARM) path is first engaged would saturate SRM M1. We looked at the old version of the guardian and found that this used to be ramped on over 2 seconds, but has been ramped over 0.1 seconds recently. This transition has been successful both of the times we tried it since changing the ramp time, in one try there was no transient and in the other there was a transient but we stayed locked.
Looking for TR_REFL9, I found Craig's alog 84679 from the aforementioned post-vent recovery. Same issue with ramp time too, specifically referencing the few lines of ISC_LOCK that I posted above. He moved the ramp time to 2s, which I confirmed is still there. He has a picture of the lockloss (attached below) and it looks very similar to the ones we have been having. I trended the same channels he did and found that after this 2s ramp time, PR_GAIN would become unstable after 10s (attached below). Also found Elenna's alog (where I was also on-shift) dealing with the same issue, but before Craig had increased the ramp time - alog 84569. The deja vu was real. But I'm unsure this is the same issue because the ramp time is indeed 2s.
Looking for TR_CARM, I found this recent alog from Elenna discussing CARM, REFL and DHARD WFs - unknown whether it is relevant. Alog 86469. I will read this more closely.
While I'm being led to just increase the ramp time further, I truly doubt this will change much, since 2s is already quite a large ramp time and doesn't really explain much. Given all that, I'm going to try it before looking further, since we've gotten back here in the time it took to write and investigate this.
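For the record, the change being tested is roughly the following (pyepics sketch; the channel names are guesses, and in practice this ramp time is set in ISC_LOCK / ISC_library rather than by hand).

    from epics import caput

    # Assumed ramping-matrix channels, for illustration only; the real names
    # live in ISC_LOCK / ISC_library.
    RAMP_CHAN = 'H1:LSC-PD_DOF_MTRX_TRAMP'
    ELEMENT_CHAN = 'H1:LSC-PD_DOF_MTRX_1_9'   # e.g. CARM row, REFL_BIAS column

    caput(RAMP_CHAN, 4.0)      # lengthen the ramp from 2 s to 4 s
    caput(ELEMENT_CHAN, 1.0)   # then engage the matrix element as usual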
As expected, that did not fix it; rather, it seemed as though the longer ramp time caused a weird offset in REFLAIR9, which didn't happen last time. I checked the log and confirmed that there was a 4s instead of 2s ramp time, but I'm kind of struggling to see it (attached).
There seems to be a 25 Hz signal on POP_A as soon as I increased the ramp time. Additionally, there wasn't really an instability upon turning on CARM_TO_REFL. It was stable for 9s until the lockloss, showing a kick 3s before rather than a gradual degradation. Also, the lockloss wasn't as violent.
I don't have many ideas right now, but I'm reading the alog, the wiki and then devising some plan based off an idea of how this part of lock works.
I found an alog from Craig about REFLAIR9 (alog 43784). I'm somewhat more convinced that something involving the REFLAIR9_OFFSET and POP_LF is going on?
The line from the alog: "we have made it through this transition every time now, but the PR_GAIN number on the striptool still spikes, meaning the POP_LF power changes rapidly through this transition, which is probably bad." sounds relevant?
I'm still confused why the REFLAIR9_OFFSET turning on causes an instability in PR_GAIN, which causes an instability in SRM which causes a lockloss.
I'm sufficiently confused at this stage so will attempt to go through the usual lockloss again to see if it's the same. Then I'll try stepping through again as a sanity check.
So far, I've only changed ramp times, immediately changing them back.
One thing that's confusing me is what "REFL_BIAS" means. The REFL_BIAS gain is 86, but the REFL_BIAS value in ISC_library is 9. Never mind, I think one refers to the location of the element within the matrix, whereas the other is the value of that element within the matrix.
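A toy illustration of that distinction, since it tripped me up (all numbers other than 9 and 86 are made up):

    import numpy as np

    # In ISC_library.py, REFL_BIAS = 9 is just a column index into the LSC
    # input matrix, i.e. a label for which sensor that column belongs to.
    REFL_BIAS = 9

    # The "REFL_BIAS gain" of 86 is the value stored in that matrix element
    # for whichever DOF row is being driven (the row index here is made up).
    input_matrix = np.zeros((40, 40))
    carm_row = 1                                # illustrative row index only
    input_matrix[carm_row, REFL_BIAS] = 86

    print(input_matrix[carm_row, REFL_BIAS])    # -> 86.0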
I posted the following message in the Detchar-LHO mattermost channel:
Hey detchar! We could use a hand with some analysis on the presence and character of the glitches we have been seeing since our power outage Wednesday. They were first reported here: https://alog.ligo-wa.caltech.edu/aLOG/index.php?callRep=86848 We think these glitches are related to some change in the input mode cleaner since the power outage, and we are doing various tests like changing alignment and power, engaging or disengaging various controls loops, etc. We would like to know if the glitches change from these tests.
We were in observing from roughly GPS time 1441604501 to 1441641835 after the power outage, with these glitches and broadband excess noise from jitter present. The previous observing period from roughly GPS 1441529876 to 1441566016 was before the power outage and these glitches and broadband noise were not present, so it should provide a good reference time if needed.
After the power outage, we turned off the intensity stabilization loop (ISS) to see if that was contributing to the glitches. From 1441642051 to 1441644851, the ISS was ON. Then, from 1441645025 to 1441647602 the ISS was OFF.
Starting from 1441658688, we decided to leave the input mode cleaner (IMC) locked with 2 W input power and no ISS loop engaged. Then, starting at 1441735428, we increased the power to the IMC from 2 W to 60 W and engaged the ISS. This is where we are sitting now. Since the interferometer has been unlocked since yesterday, I think the best witness channels out of lock will be the IMC channels themselves, like the IMC wavefront sensors (WFS), which Derek reports are a witness for the glitches in the alog I linked above.
To add to this investigation:
We attenuated the power on IMC refl, as reported in alog 86884. We have not gone back to 60 W since, but it would be interesting to know a) if there were glitches in the IMC channels at 2 W before the attenuation, and b) if there were glitches at 2 W after the attenuation. We can also take the input power to 60 W without locking to check if the glitches are still present.
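If it helps, here is a hedged gwpy sketch of the kind of check we're asking for: pull an IMC WFS channel during one of the segments above and Q-scan it for glitches. The channel name is my guess at a witness; Derek's alog (86848) has the channels he actually used.

    from gwpy.timeseries import TimeSeries

    # Assumed witness channel, for illustration; see alog 86848 for the
    # channels Derek identified as glitch witnesses.
    chan = 'H1:IMC-WFS_A_DC_PIT_OUT_DQ'

    # ISS OFF segment from above: 1441645025 - 1441647602
    start = 1441645025
    data = TimeSeries.get(chan, start, start + 600)   # first ten minutes

    # Q-transform a 30 s stretch to look for the glitches
    qspec = data.q_transform(outseg=(start + 100, start + 130))
    plot = qspec.plot()
    plot.show()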
ini_file: H1EPICS_ECATAUXCS.ini
num_chans: 23392
dead_chans: 10
H1:PSL-ENV_LASERRMTOANTERM_DPRESS
H1:SYS-ETHERCAT_AUXCORNER_INFO_CB_QUEUE_2_USED
H1:SYS-PROTECTION_AS_TESTNEEDED
H1:SYS-PROTECTION_AS_TESTOUTDATED
H1:SYS-TIMING_C_FO_A_PORT_13_CRCERRCOUNT
H1:SYS-TIMING_C_MA_A_PORT_14_NODE_UPLINKCRCERRCOUNT
H1:SYS-TIMING_C_MA_A_PORT_8_NODE_FANOUT_DELAYERR
H1:SYS-TIMING_X_FO_A_UPLINKCRCERRCOUNT
H1:SYS-TIMING_Y_FO_A_PORT_6_CRCERRCOUNT
H1:SYS-TIMING_Y_FO_A_PORT_6_NODE_PCIE_OCXOLOCKED
ini_file: H1EPICS_ECATAUXEX
num_chans: 1137
dead_chans: 0
ini_file: H1EPICS_ECATAUXEY
num_chans: 1137
dead_chans: 0
ini_file: H1EPICS_ECATISCCS
num_chans: 2618
dead_chans: 1
H1:ISC-RF_C_AMP24M1_POWEROK
ini_file: H1EPICS_ECATISCEX
num_chans: 917
dead_chans: 0
ini_file: H1EPICS_ECATISCEY
num_chans: 917
dead_chans: 0
ini_file: H1EPICS_ECATTCSCS
num_chans: 1729
dead_chans: 1
H1:TCS-ITMY_CO2_LASERPOWER_RS_ENC_INPUTA_STATUS
ini_file: H1EPICS_ECATTCSEX
num_chans: 353
dead_chans: 0
ini_file: H1EPICS_ECATTCSEY
num_chans: 353
dead_chans: 0
ini_file: H1EPICS_ECATSQZCS
num_chans: 3035
dead_chans: 0