Observing at 145Mpc
We've now been Locked for 3 hours and are Observing. Microseism was high when I first came in, but has slowly been going back down. Wind is picking up though.
FAMIS 19994
No major events of note this week.
Mon Sep 18 10:06:13 2023 INFO: Fill completed in 6min 9secs
Jordan confirmed a good fill curbside
TITLE: 09/18 Day Shift: 15:00-23:00 UTC (08:00-16:00 PST), all times posted in UTC
STATE of H1: Lock Acquisition
OUTGOING OPERATOR: Tony
CURRENT ENVIRONMENT:
SEI_ENV state: CALM
Wind: 0mph Gusts, 0mph 5min avg
Primary useism: 0.05 μm/s
Secondary useism: 0.59 μm/s
QUICK SUMMARY:
Detector unlocked and running through INITIAL_ALIGNMENT when I came in. Looks like an earthquake unlocked it overnight.
Back to Observing at 16:21UTC
TITLE: 09/17 Eve Shift: 23:00-07:00 UTC (16:00-00:00 PST), all times posted in UTC
STATE of H1: Observing at 143Mpc
INCOMING OPERATOR: Tony
SHIFT SUMMARY:
Fairly uneventful shift with H1 locked for 18hrs. There was another M4.6 EQ from Canada, but nothing observable. Microseism continues its slow increase (w/ a step up in the last 8hrs).
LOG: n/a
H1 continues a lock approaching 14hrs with a range hovering at/just above 140Mpc. Still light breezes outside. I was a little late, but the sky was nice and I managed to snap a pic just after sunset.
Below is the summary of the LHO DQ shift for the week of 2023-09-04 to 2023-09-10
The full DQ shift report with day by day details is available at https://wiki.ligo.org/DetChar/DataQuality/DQShiftLHO20230904
TITLE: 09/17 Eve Shift: 23:00-07:00 UTC (16:00-00:00 PST), all times posted in UTC
STATE of H1: Observing at 146Mpc
OUTGOING OPERATOR: Oli
CURRENT ENVIRONMENT:
SEI_ENV state: CALM
Wind: 15mph Gusts, 9mph 5min avg
Primary useism: 0.04 μm/s
Secondary useism: 0.38 μm/s
QUICK SUMMARY:
Handed a 10+hr locked H1 from Oli (noting L1 lost lock 45min ago, but they are almost back up). Winds have recently been touching 20mph.
TITLE: 09/17 Day Shift: 15:00-23:00 UTC (08:00-16:00 PST), all times posted in UTC
STATE of H1: Observing at 145Mpc
INCOMING OPERATOR: Corey
SHIFT SUMMARY: Still Observing and have now been Locked for 10hours. Very uneventful day.
15:00UTC In Observing, have been Locked for 2hours
LOG:
no log
There were two locklosses during the 09/17 OWL shift, 07:45UTC and 11:30UTC.
09/17 07:45UTC Lockloss
PI24 started ringing up and couldn't be damped, and we lost lock 15 minutes after the ringup started (attachment1). This is different from the last time we lost lock due to the PIs (72636): that time the phase was cycling through too fast to tell whether anything was helping, and the guardian was updated to wait longer before changing phase. This time the phase stopped cycling through - it just gave up after a few tries (SUS_PI_logtxt). It actually looks like if it had stayed at the first phase it tried, 340 degrees, it might have been able to successfully damp the mode instead of ringing it up more (attachment2). Does this mean the checker time should be extended again? Or alternatively, could we make it so that if the slope of the mode increases when the phase is changed, the phase is stepped in the opposite direction?
Mode24 started ringing up almost immediately after the 'super_checker' timer completed (this timer turns off damping every half hour), and surpassed a value of 3 fifteen seconds later, triggering the damping to turn back on.
It seems to be a relatively common occurrence that the ESDs need to turn back on at max damping within a minute of turning off in order to damp mode24 or mode31.
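Purely as an illustration of that phase-direction idea above, here is a hypothetical sketch - the class, step size, and logic are made up for this note and are not the real SUS_PI guardian code; only the cdu.avg call and the RMSMON channel name come from the guardian investigation further down:

# Hypothetical sketch only -- NOT the actual SUS_PI damping logic.
import cdsutils as cdu

CHANNEL = 'SUS-PI_PROC_COMPUTE_MODE24_RMSMON'
PHASE_STEP = 20   # degrees per step -- placeholder value, not the real guardian setting

class PhaseScanner:
    """Step the damping phase, reversing direction if a step made the mode worse."""
    def __init__(self, start_phase=340):
        self.phase = start_phase
        self.direction = +1
        self.last_rms = float('inf')   # first step never counts as "got worse"

    def step(self):
        rms = cdu.avg(-5, CHANNEL)
        if rms is None:
            # No data back from NDS; skip this step rather than comparing against None.
            return self.phase
        if rms > self.last_rms:
            # The mode grew since the previous step: reverse the scan direction
            # instead of continuing to walk the phase the same way.
            self.direction *= -1
        self.last_rms = rms
        self.phase = (self.phase + self.direction * PHASE_STEP) % 360
        # Writing the new phase to the damping filter bank is omitted from this sketch.
        return self.phase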
Timeline
07:30:00 PI24 starts trending up
07:30:01 'super_checker' timer times out, turns off damping
07:30:17 PI24 exceeds 3
- ETMY PI ESD turns on & reaches its max output in ~30s and continues at max until lockloss
- Phase stops changing once ESD driver is at max output
07:35 PI31 also starts ringing up
07:42 DCPD saturates
07:44 PI24 reached a max of 2212
07:45 Lockloss
09/17 11:30UTC Lockloss
Caused by a magnitude 5.6 earthquake off the coast of western Canada, only ~900km away from us. Seismon labeled it as two separate earthquakes from Canada that arrived at the same time, but it was only one.
Since this earthquake was so close and big, the detector wasn't given any warning, and we lost lock 4 seconds after earthquake mode was activated. We actually received the "Nearby earthquake from Canada" notification two minutes after having lost lock!
Timeline
11:30:23 Earthquake mode activated
11:30:27 Lockloss
Looked into the SUS_PI issue and couldn't see why the phase stopped stepping.
During this time, SUS-PI_PROC_COMPUTE_MODE24_RMSMON was > 3 and the new rms from cdu.avg would have been ~11, which is larger than the old saved value of 7.94. This should have caused SUS_PI to move forward with 'scanning mode 24 phase to damp', but it didn't. Could there have been an issue with cdu.avg()? There were no reported errors from the guardian code.
new_24_rms = cdu.avg(-5, 'SUS-PI_PROC_COMPUTE_MODE24_RMSMON')
if new_24_rms > self.old_mode24_rmsmon:
    # if true, would have gone ahead with stepping the phase
Vicky, Oli, Camilla. Commented out the super_timeout code from SUS_PI. PI damping will now remain on.
After talking with Oli and Vicky, it seems like the super_timeout timer isn't accomplishing anything: as soon as damping is turned off, mode 24 rings up and damping is turned back on. This just gives the PI damping guardian more opportunities to fail, as it did for this lockloss.
The super_timeout was added to turn off damping after 30 minutes because LLO saw noise from PI damping (LLO alog 67285), but we don't see that noise here (71737).
A common failure mode when using cdsutils.avg is that when it can't get data from NDS for whatever reason, it returns None. In Python you can still do comparison operators with NoneTypes (i.e. None > 7 evaluates to False). I'd guess that's what happened here, since it wouldn't get into the code under the conditional you posted.
A solution to this is to always check that data was actually returned, and if not, try again. I'd also recommend using the call_with_timeout function from timeout_utils.py to avoid cases where the data call hangs.
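For what it's worth, a rough sketch of that suggestion - this is not existing guardian code; the retry structure is mine, and the import path and call_with_timeout signature are assumptions:

import cdsutils as cdu
# timeout_utils.py is mentioned above; assuming call_with_timeout takes the
# function followed by its arguments.
from timeout_utils import call_with_timeout

def robust_avg(span, channel, retries=3):
    """Average a channel over the last |span| seconds, retrying if NDS returns nothing."""
    for attempt in range(retries):
        value = call_with_timeout(cdu.avg, span, channel)
        if value is not None:
            return value
    # Still no data after several tries: fail loudly instead of letting a None
    # slip into a comparison like `None > self.old_mode24_rmsmon`.
    raise ValueError('no data for %s after %d attempts' % (channel, retries))

# The conditional from the log would then become something like:
# new_24_rms = robust_avg(-5, 'SUS-PI_PROC_COMPUTE_MODE24_RMSMON')
# if new_24_rms > self.old_mode24_rmsmon:
#     ...scan mode 24 phase to damp...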
Observing at 144Mpc and Locked for 6hrs 25mins.
Quiet shift so far; the wind seems to have picked up a bit, but nothing too bad.
Sun Sep 17 10:08:15 2023 INFO: Fill completed in 8min 11secs
The web images from nuc26 have been frozen since 09:56 UTC (02:56 PDT) and I cannot ping this machine. If the local camera images in the control room are not updating, this computer will need to be rebooted; otherwise it can wait until tomorrow.
The nuc26 cameras in the control room were also frozen and showing that they had frozen at 09/17 06:12UTC. I also could not get into the computer remotely but I restarted the nuc and the cameras are back up and live.
I have opened an FRS ticket, FRS29119, and closed it as "resolved by rebooting" in case this happens again.
Apologies, my times were incorrect. The image froze up late Sat night, 16sep 23:12 PDT (17sep 06:12 UTC).
R. Short, T. Shaffer
Our automation has called for assistance when earthquakes roll through and make locking H1 difficult, which typically just has an operator request H1 to 'DOWN' and wait until ground motion is low, then try locking again. In an attempt to further improve our automation and lower the need for intervention, I've added a 'WAITING' state to H1_MANAGER that holds ISC_LOCK in 'READY' and waits for the SEI_ENV guardian to leave its 'EARTHQUAKE' state before moving back to 'RELOCKING.' H1_MANAGER will jump from 'RELOCKING' to 'WAITING' if the SEI_ENV node is in 'EARTHQUAKE' or 'LARGE_EQ' and ISC_LOCK is not past 'READY' (the motivation for this being that if H1 is making progress in locking when an earthquake hits, we don't want it to stop if the earthquake is harmless enough).
These changes are committed to svn and H1_MANAGER has been loaded.
There were two cases over the weekend where an earthquake caused a lockloss and H1_MANAGER correctly identified that with SEI_ENV being in 'EARTHQUAKE' mode, it would be challenging to relock, so it kept ISC_LOCK from trying (one on 9/17 at 11:30 UTC and another on 9/18 at 13:44 UTC). However, after 15 minutes of waiting, IFO_NOTIFY called for assistance once it saw that ISC_LOCK had not made it to its 'READY' state; this was confusing behavior at first, since H1_MANAGER requests ISC_LOCK to 'READY' when it moves to the 'WAITING' state. When looking into this, I was reminded that ISC_LOCK's 'DOWN' state has a jump transition to 'PREP_FOR_LOCKING' when it finishes, meaning that ISC_LOCK will stall in 'PREP_FOR_LOCKING' unless it is revived by its manager or requested to go to another state. To fix this, I've added an "unstall" decorator to the run method of H1_MANAGER's 'WAITING' state, which will revive ISC_LOCK so that it can move past 'PREP_FOR_LOCKING' and all the way to 'READY' while waiting for the earthquake to pass.
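For readers less familiar with the guardian manager pattern, a minimal sketch of what the 'WAITING' state and the unstall decorator could look like. This is not the committed H1_MANAGER code; the imports, the NodeManager/ezca usage, and the revive() call follow common guardian conventions but are assumptions here:

# Sketch only -- not the committed H1_MANAGER code.
from guardian import GuardState, GuardStateDecorator, NodeManager

nodes = NodeManager(['ISC_LOCK'])

class unstall_isc_lock(GuardStateDecorator):
    """Revive ISC_LOCK so it can move past PREP_FOR_LOCKING up to READY."""
    def pre_exec(self):
        # revive() is the un-stalling mechanism described above; calling it on
        # every run cycle is a simplification for this sketch.
        nodes['ISC_LOCK'].revive()

class WAITING(GuardState):
    request = False   # not directly requestable; entered via jump from RELOCKING

    def main(self):
        # Hold ISC_LOCK at READY while the ground is still ringing.
        nodes['ISC_LOCK'] = 'READY'

    @unstall_isc_lock
    def run(self):
        # Jump back to RELOCKING once SEI_ENV has left its earthquake states.
        # (ezca is the EPICS interface guardian provides to node code; reading
        # the SEI_ENV guardian state string channel this way is an assumption
        # about how H1_MANAGER watches that node.)
        if ezca['GRD-SEI_ENV_STATE_S'] not in ['EARTHQUAKE', 'LARGE_EQ']:
            return 'RELOCKING'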