TITLE: 09/17 Day Shift: 15:00-23:00 UTC (08:00-16:00 PST), all times posted in UTC
STATE of H1: Observing at 145Mpc
INCOMING OPERATOR: Corey
SHIFT SUMMARY: Still Observing and have now been Locked for 10 hours. Very uneventful day.
15:00UTC In Observing, have been Locked for 2 hours
LOG:
no log
There were two locklosses during the 09/17 OWL shift, 07:45UTC and 11:30UTC.
09/17 07:45UTC Lockloss
PI24 started ringing up and wasn't able to be damped, and we lost lock 15 minutes after the ringup started (attachment1). This was different from the last time we lost lock due to the PIs (72636): that time the phase was cycling through too fast to tell if anything was helping, and the code was updated to wait longer before changing phase. This time the phase stopped cycling through - it just stopped trying after a few attempts (SUS_PI_logtxt). It actually looks like if it had stayed at the first phase it tried, 340 degrees, it might have been able to successfully damp the mode instead of ringing it up further (attachment2). Does this mean that the checker time should be extended again? Or alternatively, could we make it so that if the slope of the mode increases when the phase is changed, it changes the phase in the opposite direction?
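As a rough illustration of that second idea, here is a minimal sketch of a phase-scan step that reverses direction whenever the previous step made the mode grow. The function name, arguments, and step logic are hypothetical, not the actual SUS_PI guardian code:

def next_phase(current_phase_deg, step_deg, rms_before, rms_after):
    # Hypothetical helper: if the mode RMS grew after the last phase step,
    # that step made damping worse, so reverse the scan direction.
    if rms_after > rms_before:
        step_deg = -step_deg
    new_phase = (current_phase_deg + step_deg) % 360
    return new_phase, step_deg

# Example: the mode grew after stepping from 340 to 350 degrees,
# so the next step walks the phase back to 340.
phase, step = next_phase(350, 10, rms_before=3.2, rms_after=4.1)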
Mode24 started ringing up almost immediately after the 'super_checker' timer completed (this timer turns off damping every half hour), and it surpassed a value of 3 only 15 seconds later, triggering the damping to turn back on.
It seems to be a relatively common occurrence that the ESDs need to turn back on at max damping within a minute of turning off in order to damp mode24 or mode31.
Timeline
07:30:00 PI24 starts trending up
07:30:01 'super_checker' timer times out, turns off damping
07:30:17 PI24 exceeds 3
- ETMY PI ESD turns on & reaches its max output in ~30s and continues at max until lockloss
- Phase stops changing once ESD driver is at max output
07:35 PI31 also starts ringing up
07:42 DCPD saturates
07:44 PI24 reached a max of 2212
07:45 Lockloss
09/17 11:30UTC Lockloss
Caused by a magnitude 5.6 earthquake off the coast of western Canada, only ~900 km away from us. Seismon labeled it as two separate earthquakes from Canada that arrived at the same time, but it was only one.
Since this earthquake was so close and so large, the detector wasn't given any warning, and we lost lock 4 seconds after earthquake mode was activated. We actually received the "Nearby earthquake from Canada" notification two minutes after having lost lock!
Timeline
11:30:23 Earthquake mode activated
11:30:27 Lockloss
Looked into the SUS_PI issue and couldn't see why the phase stopped stepping.
During this time, SUS-PI_PROC_COMPUTE_MODE24_RMSMON was > 3, and the new rms from cdu.avg would have been ~11, which is larger than the old saved value of 7.94. This should have caused SUS_PI to move forward with 'scanning mode 24 phase to damp', but it didn't. Could there have been an issue with cdu.avg()? There were no reported errors from the guardian code.
new_24_rms = cdu.avg(-5, 'SUS-PI_PROC_COMPUTE_MODE24_RMSMON')
if new_24_rms > self.old_mode24_rmsmon:
    # if True, SUS_PI would have gone ahead with stepping the phase
Vicky, Oli, Camilla. Commented out the super_timeout code from SUS_PI. PI damping will now remain on.
After talking with Oli and Vicky, it seems like the super_timeout timer isn't doing anything useful: as soon as damping is turned off, mode 24 rings up and damping is turned right back on. This just gives the PI damping guardian more opportunities to fail, as it did for this lockloss.
The super_timeout was added to turn off damping after 30 minutes because LLO saw noise from PI damping (LLO alog 67285), but we don't see that noise here (71737).
A common failure mode when using cdsutils.avg is that when it can't get data from NDS for whatever reason, it returns None. In Python you can still do comparison operators with NoneType (i.e. None > 7 evaluates to False). I'd guess that's what happened here, since it wouldn't get into the code under the conditional you posted.
A solution to this is to always check that there actually is data returned, and if not, try again. I'd also recommend using the timeout_utils.py call_with_timeout function to avoid cases where the data call hangs.
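As a rough sketch of that suggestion, assuming cdsutils.avg returns None when the NDS call fails as described above (the retry wrapper and retry count are illustrative, not the actual guardian or timeout_utils code):

import cdsutils as cdu

def avg_with_retry(seconds, channel, retries=3):
    # Retry the NDS average until real data comes back, instead of
    # silently comparing None against the saved RMS value.
    for _ in range(retries):
        value = cdu.avg(seconds, channel)
        if value is not None:
            return value
    return None  # caller should skip the phase-scan decision this cycle

# In the guardian this would replace the bare cdu.avg call above
# (old_mode24_rmsmon stands in for self.old_mode24_rmsmon):
old_mode24_rmsmon = 7.94
new_24_rms = avg_with_retry(-5, 'SUS-PI_PROC_COMPUTE_MODE24_RMSMON')
if new_24_rms is not None and new_24_rms > old_mode24_rmsmon:
    pass  # safe to proceed with scanning the mode 24 phase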
Observing at 144Mpc and Locked for 6hrs 25mins.
Quiet shift so far; the wind seems to have picked up a bit, but not too badly.
Sun Sep 17 10:08:15 2023 INFO: Fill completed in 8min 11secs
The web images from nuc26 have been frozen since 09:56 UTC (02:56 PDT) and I cannot ping this machine. If the local camera images in the control room are not updating, this computer will need to be rebooted; otherwise it can wait until tomorrow.
The nuc26 cameras in the control room were also frozen, showing that they had frozen at 09/17 06:12UTC. I could not get into the computer remotely either, but I restarted the nuc and the cameras are back up and live.
I have opened an FRS ticket (FRS29119) and closed it as "resolved by rebooting", so there is a record in case this happens again.
Apologies, my times were incorrect. The image froze up late Saturday night, 16 Sep 23:12 PDT (17 Sep 06:12 UTC).
TITLE: 09/17 Day Shift: 15:00-23:00 UTC (08:00-16:00 PST), all times posted in UTC
STATE of H1: Observing at 147Mpc
OUTGOING OPERATOR: Tony
CURRENT ENVIRONMENT:
SEI_ENV state: CALM
Wind: 4mph Gusts, 2mph 5min avg
Primary useism: 0.02 μm/s
Secondary useism: 0.28 μm/s
QUICK SUMMARY:
We are currently Observing and have been Locked for 2 hours. It looks like there were two earthquakes in the yellow earthquake zone during the OWL shift, causing us to lose lock twice. EDIT: the two earthquakes are listed in USGS as just one earthquake, which makes sense since they hit us at the same time, and they were only the cause of one of our locklosses. Everything will hopefully be calm now!
SDF diffs: changes were made today. I'm not sure if they were intentional or not.
But since we are in NOMINAL_LOW_NOISE I'm gonna accept them.
#edit: at 2AM when I first posted this ALOG, the picture of the SDF diffs was wrong. I have since fixed that.
Same issue with ALS-Y_REFL_SERVOIN1GAIN and ALS-Y_REFL_SERVO_COMBOOST; I just accepted the changes.
TITLE: 09/16 Eve Shift: 23:00-07:00 UTC (16:00-00:00 PST), all times posted in UTC
STATE of H1: Observing at 144Mpc
INCOMING OPERATOR: Tony
LOG:
H1 recently had a lock end after just over 8hrs. Currently waiting for H1 to get back to NLN.
Environmentally all is well.
TITLE: 09/16 Day Shift: 15:00-23:00 UTC (08:00-16:00 PST), all times posted in UTC
STATE of H1: Observing at 147Mpc
INCOMING OPERATOR: Corey
SHIFT SUMMARY: Relocking after the lockloss (72911) was a bit strange but seemed like a one-off. Rest of the day was quiet.
15:00UTC In Observing, have been Locked for 18 hours
16:38 Lockloss (72911)
16:51 LOCKLOSS_PRMI during third time going through ACQUIRE_PRMI, took itself to DOWN and restarted
17:02 I took it into INITIAL_ALIGNMENT
17:32 INITIAL_ALIGNMENT completed, heading to NOMINAL_LOW_NOISE
17:42 Lockloss at OFFLOAD_DRMI_ASC
18:24 Reached NOMINAL_LOW_NOISE
18:41 Into Observing
LOG:
| Start Time | System | Name | Location | Laser_Haz | Task | Time End |
|---|---|---|---|---|---|---|
| 20:42 | Austin+1 | LExC, OSB, Overpass | n | tour | 22:04 |
All looks nominal for the site HVAC fans (did catch a transition between fans at MY for this one). Closing FAMIS 26251.
TITLE: 09/16 Eve Shift: 23:00-07:00 UTC (16:00-00:00 PST), all times posted in UTC
STATE of H1: Observing at 147Mpc
OUTGOING OPERATOR: Oli
CURRENT ENVIRONMENT:
SEI_ENV state: CALM
Wind: 8mph Gusts, 6mph 5min avg
Primary useism: 0.02 μm/s
Secondary useism: 0.24 μm/s
QUICK SUMMARY:
H1's been locked just under 5hrs & all looks nominal from here (slight continued drift up in microseism from yesterday).
Sat Sep 16 10:10:29 2023 INFO: Fill completed in 10min 25secs
Lockloss @ 09/16 16:38UTC. EX saturation right before/as the lockloss occurred.
18:41 Observing
I was able to confirm that LSC-DARM_IN1 does see the lockloss before ASC-AS_A_DC_NSUM_OUT does (attachment1), which had been brought up by TJ (72890), but I want to clarify that this isn't a recent occurrence - these locklosses from July 27th (attachment2) and August 10th (attachment3) also show the lockloss in DARM_IN1 a few milliseconds before the light falls off the AS_A photodiode.
In the case of this lockloss, I could not find an obvious cause, but since there was an EX callout right before the lockloss I looked into the ETMX suspension channels that we had looked into a bit yesterday (72896).
The lockloss was seen in DARM and the DCPDs before being seen in ETMX_L3_MASTER_OUT_{UL,UR,LL,LR}_DQ channels (attachment4), followed by the AS_A channel later. However, SUS-ETMX_L3_MASTER_OUT... goes through a few filters/functions before arriving at this reading and so might have a higher latency than the DARM and DCPD channels.
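For reference, here is a minimal sketch of how these channels could be pulled and overlaid around the lockloss time with gwpy to compare when each one sees the drop. The H1: prefixes, _DQ suffixes, and the GPS time are assumptions for illustration, not the exact channels or tool used for the attachments:

from gwpy.timeseries import TimeSeriesDict

t0 = 1378917498  # hypothetical GPS time of the lockloss; substitute the real one
channels = [
    'H1:LSC-DARM_IN1_DQ',
    'H1:ASC-AS_A_DC_NSUM_OUT_DQ',
    'H1:SUS-ETMX_L3_MASTER_OUT_UL_DQ',
]

# Fetch one second on either side of the candidate lockloss time.
data = TimeSeriesDict.get(channels, t0 - 1, t0 + 1)

# Overlay the traces (note the very different scales) to see which drops first.
plot = data[channels[0]].plot(label=channels[0])
ax = plot.gca()
for chan in channels[1:]:
    ax.plot(data[chan], label=chan)
ax.axvline(t0, color='k', linestyle='--')
ax.legend()
plot.show()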
The EX glitch occurred ~700ms before DARM saw the lockloss, and similarly it was seen in DARM and the DCPDs before the ETMX L3 channels (attachment5). I'm not sure this glitch could cause a lockloss (at least not on its own), since we have had glitches this big (and much bigger) in ETMX that were also seen in DARM but did not cause a lockloss; see attachment6 for an example from almost a day earlier.
Thinking about what Camilla said in 72896, "ETMX moving would cause a DARM glitch (so the DARM BLRMs to increase) or vice versa, DARM changing would cause ETMX to try to follow": in this case, depending on the latency, either could still be true. But I don't see why we would lose lock from a (relatively) mid-size saturation yet hold on during a larger one, unless there was some other factor (which there probably usually is).
Relocking after this lockloss:
Lost lock after going through ACQUIRE_PRMI three times, so I decided to run an initial alignment. Initial alignment completed fine and we started locking and moving through states quickly, but then we lost lock at OFFLOAD_DRMI_ASC. I let it try locking again and everything was fine, so it seems like that OFFLOAD_DRMI_ASC lockloss was just a one-off.
Closes FAMIS#26209, last completed Sept 8th
Laser Status:
NPRO output power is 1.833W (nominal ~2W)
AMP1 output power is 67.19W (nominal ~70W)
AMP2 output power is 134.8W (nominal 135-140W)
NPRO watchdog is GREEN
AMP1 watchdog is GREEN
AMP2 watchdog is GREEN
PMC:
It has been locked 41 days, 0 hr 12 minutes
Reflected power = 16.6W
Transmitted power = 109.0W
PowerSum = 125.6W
FSS:
It has been locked for 0 days 19 hr and 29 min
TPD[V] = 0.8165V
ISS:
The diffracted power is around 2.4%
Last saturation event was 0 days 19 hours and 29 minutes ago
Possible Issues: None
TITLE: 09/16 Day Shift: 15:00-23:00 UTC (08:00-16:00 PST), all times posted in UTC
STATE of H1: Observing at 148Mpc
OUTGOING OPERATOR: Tony
CURRENT ENVIRONMENT:
SEI_ENV state: CALM
Wind: 4mph Gusts, 3mph 5min avg
Primary useism: 0.03 μm/s
Secondary useism: 0.24 μm/s
QUICK SUMMARY:
Everything is looking good this morning. We're Observing and have been Locked for 18hrs.
Earthquake mode was activated between 13:04-13:14UTC.