Lockloss due to unknown cause. Not PSL this time. Still investigating.
Sun Oct 20 10:13:53 2024 INFO: Fill completed in 13min 49secs
TITLE: 10/20 Day Shift: 1430-2330 UTC (0730-1630 PDT), all times posted in UTC
STATE of H1: Observing at 149Mpc
OUTGOING OPERATOR: Ryan C
CURRENT ENVIRONMENT:
SEI_ENV state: CALM
Wind: 6mph Gusts, 4mph 3min avg
Primary useism: 0.02 μm/s
Secondary useism: 0.15 μm/s
QUICK SUMMARY:
IFO is in NLN and OBSERVING as of 09:46 UTC (5 hr lock).
Range is a bit low, so I'll look into what that's about.
H1 called for assistance at 09:42 UTC when the NLN timer expired; by the time I logged in 4 minutes later, we were Observing at 09:46 UTC. There was a high-state lockloss at LASER_NOISE_SUPPRESSION.
TITLE: 10/20 Eve Shift: 2330-0500 UTC (1630-2200 PDT), all times posted in UTC
STATE of H1: Observing at 152Mpc
INCOMING OPERATOR: Ryan C
SHIFT SUMMARY: Fairly quiet shift with one lockloss followed by an automatic relock. H1 has now been observing for over 2 hours.
Lockloss @ 01:32 UTC - link to lockloss tool
No obvious cause; looks like a sizeable ETMX glitch about a half second before the lockloss.
H1 back to observing at 02:47 UTC. Fully automatic relock.
TITLE: 10/19 Day Shift: 1430-2330 UTC (0730-1630 PDT), all times posted in UTC
STATE of H1: Observing at 158Mpc
INCOMING OPERATOR: Ryan S
SHIFT SUMMARY: It's been a windy day with a few short locks. Relocking has been easy, and we've now been locked for just over 3 hours.
LOG: No log
TITLE: 10/19 Eve Shift: 2330-0500 UTC (1630-2200 PDT), all times posted in UTC
STATE of H1: Observing at 155Mpc
OUTGOING OPERATOR: Ryan C
CURRENT ENVIRONMENT:
SEI_ENV state: CALM
Wind: 14mph Gusts, 8mph 3min avg
Primary useism: 0.03 μm/s
Secondary useism: 0.21 μm/s
QUICK SUMMARY: H1 has been locked and observing for almost 3 hours.
Ryan C taking over for the last 4 hours but here's what happened in the first 4:
After issues getting the IMC locked during SRC_ALIGN, we managed to get to NLN automatically and were Observing at 16:04 UTC.
SQZ Tuning: Dropped out of OBSERVING 17:14-17:26 UTC
Calibration Sweep: Dropped out of OBSERVING 18:34 UTC - Lockloss 19:05 UTC
Lockloss (alog 80760): FSS caused a lockloss during simulines.
Timing glitch followup (Dave): While yesterday's EY timing glitch had no consequences, Dave said it has been glitching by quite erratic amounts lately and recommended another power supply swap (like the recent EX one).
Fast Shutter Guardian Glitch Followup: Dave found that yesterday's fast shutter guardian malfunction was caused by a 5 s delay in grabbing data, though why that delay happened is unknown. He will post an alog about this soon.
Ran the usual calibration sweep following the wiki. IFO was thermalized (locked for 2.5 hrs, but ~3 hrs since MAX_POWER). Monitor attached. Times are in GPS.
Broadband
Start: 1413398319
End: 1413398630
Simulines
Start: 1413398808
End (Lockloss): 1413399944
No files were written due to the abort from the lockloss. Here's the error message from the screen:
2024-10-19 19:05:22,079 | ERROR | IFO not in Low Noise state, Sending Interrupts to excitations and main thread.
2024-10-19 19:05:22,080 | ERROR | Ramping Down Excitation on channel H1:LSC-DARM1_EXC
2024-10-19 19:05:22,080 | ERROR | Ramping Down Excitation on channel H1:SUS-ETMX_L3_CAL_EXC
2024-10-19 19:05:22,080 | ERROR | Ramping Down Excitation on channel H1:CAL-PCALY_SWEPT_SINE_EXC
2024-10-19 19:05:22,080 | ERROR | Ramping Down Excitation on channel H1:SUS-ETMX_L2_CAL_EXC
2024-10-19 19:05:22,080 | ERROR | Ramping Down Excitation on channel H1:SUS-ETMX_L1_CAL_EXC
2024-10-19 19:05:22,080 | ERROR | Aborting main thread and Data recording, if any. Cleaning up temporary file structure.
ICE default IO error handler doing an exit(), pid = 3195655, errno = 32
PDT: 2024-10-19 12:05:26.374801 PDT
UTC: 2024-10-19 19:05:26.374801 UTC
GPS: 1413399944.374801
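The sweep start/end GPS times above can be converted the same way; here is a minimal sketch using gwpy's tconvert (assuming gwpy is available in the environment, not part of the original measurement):

from gwpy.time import tconvert

# Convert the sweep start/end GPS times above to UTC datetimes
for label, gps in [("Broadband start", 1413398319),
                   ("Broadband end", 1413398630),
                   ("Simulines start", 1413398808),
                   ("Simulines end (lockloss)", 1413399944)]:
    print(label, tconvert(gps))  # tconvert(integer GPS) returns a UTC datetime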
PSL caused a lockloss during the last part of the simulines calibration.
IMC and ASC lost lock within 5ms of one another. (plot attached).
20:21 UTC Observing
Looking at more channels around this lockloss, I'm not entirely sure if the FSS was at fault in this case. In a longer-term trend (starting 6 sec before lockloss, see screenshot), there were a handful of glitches in the FSS_FAST_MON channel, but no significant corresponding NPRO temp or output power changes at those times and the EOM drive had not reached high levels like we've seen in the past when this has caused locklosses. Zooming in (see other screenshot), it seems that the first of these channels to change is the FSS_FAST_MON, but instead of a glitch, it looks to actually be moving. Soon after, there's a drop in AS_A_DC_NSUM, I believe indicating the IFO was losing lock, then 5 ms later the IMC starts to lose lock. I suppose we're at the mercy of data sampling rates here, but this may add some more context to these kinds of locklosses.
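For anyone wanting to re-pull these trends, here is a minimal sketch using gwpy; the full _DQ channel names and the exact time window are my assumptions, not taken from the lockloss tool:

from gwpy.timeseries import TimeSeriesDict

t0 = 1413399944  # lockloss GPS time from above
channels = [
    'H1:PSL-FSS_FAST_MON_OUT_DQ',   # assumed full name of the FSS_FAST_MON channel
    'H1:ASC-AS_A_DC_NSUM_OUT_DQ',   # assumed full name of the AS_A_DC_NSUM channel
]
# Pull 6 s before to 1 s after the lockloss and plot the raw time series
data = TimeSeriesDict.get(channels, t0 - 6, t0 + 1)
plot = data.plot()
plot.show()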
Sat Oct 19 10:12:16 2024 INFO: Fill completed in 12min 12secs
TITLE: 10/19 Day Shift: 1430-2330 UTC (0730-1630 PDT), all times posted in UTC
STATE of H1: Aligning
OUTGOING OPERATOR: Ryan C
CURRENT ENVIRONMENT:
SEI_ENV state: CALM
Wind: 19mph Gusts, 11mph 3min avg
Primary useism: 0.03 μm/s
Secondary useism: 0.21 μm/s
QUICK SUMMARY:
IFO was sitting in SRC_ALIGN part of initial alignment unable to lock the IMC, with MC2 saturations every few seconds.
This issue has happened before, and it's because LASER_PWR was not able to revert to 2W, so it was sitting at 10W - this seems related to the FSS glitching from Ryan C's OWL alog 80754. I will set the power to 2W and attempt initial alignment again after the IMC locks.
What seems to be happening is that the FSS glitches and unlocks the IMC, but the power doesn't go back down to 2W - this happens during MICH_FRINGES and MICH_BRIGHT_ALIGNING.
The solution that seems to work:
I had the FSS glitch out on me twice during this process, meaning I had to repeat it, but it worked eventually. I think this is a temporary solution.
H1 called for assistance at 14:00 UTC due to initial alignment timing out. When I got on, it was in SR2 align waiting for ASC to converge, but the IMC kept unlocking. DIAG_MAIN was reporting that shutter_A is not open and IMC_LOCK kept trying to open it; the log also showed the FSS unlocking. The FSS is oscillating.
When Ibrahim relocked the IFO, the fast shutter guardian was stuck in the state "Check Shutter".
This was because it apparently got hung up getting the data using cdsutils getdata.
The first two attachments show a time when the shutter triggered and shows up in the HAM6 GS13s, and the guardian passes. The next time shows our most recent high power lockloss, where the shutter also triggered and shows up at a similar level in the GS13s, but the guardian doesn't move on.
The guardian log screenshot shows both of these times; it seems the guardian was still waiting for data, so the test neither passed nor failed.
To get around this and go to observing, we manually requested HIGH_ARM_POWER.
Vicky points out that TJ has solved this problem for other guardians using his timeout utils; this guardian may need that added.
Another thing to do would be to make sure we notice that the test hasn't passed before we power up.
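For reference, a rough sketch of the general idea behind such a timeout wrapper (only an illustration of the concept, not TJ's actual timeout utils; the 5 s value is assumed from Dave's description):

import concurrent.futures
import cdsutils as cdu

def getdata_with_timeout(channels, duration, start, timeout=5):
    # Run the blocking cdsutils.getdata call in a worker thread so the caller
    # can give up after `timeout` seconds instead of hanging indefinitely.
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(cdu.getdata, channels, duration, start)
    try:
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        return None  # caller decides whether to retry, notify, or move on
    finally:
        # Don't block on a stuck worker; let it finish in the background.
        pool.shutdown(wait=False)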
I looked at nds1's logs for this data request; the request appears to come in at the time it had already timed out on h1guardian1.
From Sheila's guardlog (paraphrasing somewhat):
2024-10-18_19:36:22Z timer
which is 2024-10-18T12:36:22 in "log PDT" format.
NDS1 logs show (h1guardian1 is 10.101.0.249):
2024-10-18T12:36:27-07:00 h1daqnds1.cds.ligo-wa.caltech.edu daqd[1267959]: [Fri Oct 18 12:36:27 2024] connection on port 38943 from 10.101.0.249; fd=75
2024-10-18T12:36:27-07:00 h1daqnds1.cds.ligo-wa.caltech.edu daqd[1267959]: [Fri Oct 18 12:36:27 2024] ->42: version
2024-10-18T12:36:27-07:00 h1daqnds1.cds.ligo-wa.caltech.edu daqd[1267959]: [Fri Oct 18 12:36:27 2024] ->42: revision
2024-10-18T12:36:27-07:00 h1daqnds1.cds.ligo-wa.caltech.edu daqd[1267959]: [Fri Oct 18 12:36:27 2024] ->42: status channels 3 {"H1:ISI-HAM6_BLND_GS13Z_IN1_DQ"}
2024-10-18T12:36:27-07:00 h1daqnds1.cds.ligo-wa.caltech.edu daqd[1267959]: [Fri Oct 18 12:36:27 2024] ->42: status channels 3 {"H1:SYS-MOTION_C_SHUTTER_G_TRIGGER_VOLTS"}
So on first look it appears nds1 didn't get the request until after the 5 second timeout had expired.
Here is the request line (line 53) of isc/h1/guardian/LOCKLOSS_SHUTTER_CHECK.py
gs13data = cdu.getdata(['H1:ISI-HAM6_BLND_GS13Z_IN1_DQ','H1:SYS-MOTION_C_SHUTTER_G_TRIGGER_VOLTS'],12,self.timenow-10)
This request is for 12 seconds of data with a start time 10 seconds in the past, meaning it cannot complete until +2 seconds have elapsed. Does this mean the 5 second timeout is effectively a 3 second timeout?
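Spelling out that arithmetic (just a sanity check; the 5 second timeout value is assumed from the description above):

duration      = 12                       # seconds of data requested
start_offset  = -10                      # start relative to "now" (self.timenow - 10)
timeout       = 5                        # guardian-side timeout on the request (assumed)

data_ready_in = start_offset + duration  # +2 s: the last 2 s of data don't exist yet
nds_margin    = timeout - data_ready_in  # 3 s left for NDS to actually serve the request
print(data_ready_in, nds_margin)         # -> 2 3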
From 07:32 - 07:37 Fri 18oct2024 PDT we had another glitch on the EY CNS-II GPS 1PPS as read by the timing system's EY comparator. This has happened twice recently: 05:50-06:40 Mon 30sep2024 (50 mins), and a pair the next day, Tue 01oct2024, at 07:14 (3 mins) and 07:42 (8 mins).
Attached are details of today's glitch and a month trend showing all three events.
On a related note, the EX CNS-II broke on the 8th Oct with a bad power supply (a wall wart). It might be prudent to schedule a replacement of EY's power supply next Tuesday.
It started again at 10:02. This time it is switching between -2200 ns and -800 ns.
Follow up: EY CNS-II started glitching again around 10am Friday; it continued for about an hour, then returned to normal. No further glitches since then over the past 23 hours.