Cause unknown; there were no saturations. I saw the PSL ISS go red on OPS_OVERVIEW at the same time as the lockloss.
https://ldas-jobs.ligo-wa.caltech.edu/~lockloss/index.cgi?event=1374856278
We briefly dropped out of Observing from 15:59 to 16:09 UTC due to the SQZ ISS losing lock. Following alog 70050, I recovered it by setting opo_grTrans_setpoint_uW from 65 to 50 in sqzparams.py (we'll want to set this back later), and I accepted the SDF diff from adjusting the H1:SQZ-OPO_TEC_SETTEMP value.
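For reference, a minimal sketch of what that change looks like in sqzparams.py (only the variable name and values come from this log; the comments and surrounding file contents are assumed):

# sqzparams.py (excerpt)
# OPO green transmission setpoint in microwatts.  Nominal value is 65 uW;
# dropped to 50 uW per alog 70050 to let the SQZ ISS relock.  Revert later.
opo_grTrans_setpoint_uW = 50  # was 65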
We dropped out again for 3 seconds, 16:18:32 to 16:18:35 UTC, due to 2 syscssqz SDF diffs; the same scenario as seen in alog 71652.
TITLE: 07/31 Day Shift: 15:00-23:00 UTC (08:00-16:00 PDT), all times posted in UTC
STATE of H1: Observing at 146Mpc
CURRENT ENVIRONMENT:
SEI_ENV state: CALM
Wind: 3mph Gusts, 2mph 5min avg
Primary useism: 0.01 μm/s
Secondary useism: 0.06 μm/s
QUICK SUMMARY:
TITLE: 07/31 Owl Shift: 07:00-15:00 UTC (00:00-08:00 PDT), all times posted in UTC
STATE of H1: Observing at 146Mpc
SHIFT SUMMARY:
Once we recovered from the computer crash, everything was quiet. We have now been Locked for 5.5 hours.
07:00 Detector in corrective maintenance due to Dolphin crash (alog 71829: https://alog.ligo-wa.caltech.edu/aLOG/index.php?callRep=71829)
9:36 Reached NOMINAL_LOW_NOISE
9:52 Back to Observing
LOG:
no log
Everything has been good since we got back into Observing two hours ago.
TITLE: 07/31 Eve Shift: 23:00-07:00 UTC (16:00-00:00 PDT), all times posted in UTC
STATE of H1: Corrective Maintenance
SHIFT SUMMARY: Quiet evening up until a Dolphin glitch caused a lockloss. Dave assisted in CDS recovery, then Oli and I recovered the IFO.
Handing off fully to Oli for the rest of the night; IFO is currently locking DRMI.
LOG:
No log for this shift.
Detector back to Observing as of 9:52
TITLE: 07/31 Owl Shift: 07:00-15:00 UTC (00:00-08:00 PDT), all times posted in UTC
STATE of H1: Corrective Maintenance
CURRENT ENVIRONMENT:
SEI_ENV state: CALM
Wind: 20mph Gusts, 16mph 5min avg
Primary useism: 0.04 μm/s
Secondary useism: 0.06 μm/s
QUICK SUMMARY:
Detector down because of Dolphin crash (see Ryan S.'s alog 71829). Working on getting everything aligned.
Detector back to Observing as of 9:52
Lockloss @ 05:50 UTC, caused by Dolphin glitch.
The first thing I noticed was the HEPI HAM1 watchdog tripping, then the IOP DACKILL tripped for iopsusb123, iopsush2a, iopsush34, and iopsush56, and the CDS overview went very red (attached). Called Dave and we're now working on recovery.
EDIT: Attached CDS overview screenshot at time of glitch.
At first glance Dave believes the glitch originated from the OMC model, causing everything else to trip. He's isolating it from the Dolphin network and restarting it now.
Ryan S, Dave:
After a DIAG_RESET, which cleared a bunch of cached IPC errors, we were left with:
1. Every receiver of IPCs originating from h1omc0 was continuously bad, including those at the end stations.
2. The IOP DACKILLs were permanently asserted on h1sus[b123, h2a, h34, h56]
So it looks like the IX Dolphin card on h1omc0 has gone offline and caused a glitch which took down the corner station SUS listed above.
Recovery:
Log into h1omc0, check that it can see its IO Chassis [it can], and see whether the dmesg logs show anything from this time [they don't].
Fence h1omc0 from Dolphin and reboot.
When h1omc0 came back and its models restarted, all the outstanding IPC receive errors cleared.
On to the SUS hosts. For each we safed the SUS, bypassed the SEI SWWD receivers, stopped all the models, then started all the models. When the IOP was running again, I unbypassed the corresponding SEI SWWDs.
This worked well for h1susb123, h1sush2a, and h1sush34. It did not work for h1sush56; its IOP model failed to restart.
I stopped the partially started h1sush56 models, checked the IO Chassis was visible [it was], fenced it from Dolphin and rebooted.
As h1sush56 came back from the reboot, we saw a lot of IPC flashes on various systems. During the earlier restarts I had seen one or two flashes, but h1sush56 flashed the IPCs for many seconds until its IOP model got going. From that point onwards the Dolphin network was good, with no new IPC errors.
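For the record, here is a rough Python sketch of that per-host stop/start sequence, wrapping the rtcds tool. The SWWD bypass channel name is a hypothetical placeholder rather than the real EPICS record, and the Guardian safing step is omitted:

#!/usr/bin/env python3
# Sketch of the SUS front-end recovery sequence described above.
import subprocess

def caput(channel, value):
    # EPICS command-line put
    subprocess.run(["caput", channel, str(value)], check=True)

def recover_sus_host(models, swwd_bypass):
    """models: IOP first, e.g. ['h1iopsush34', ...];
    swwd_bypass: placeholder SEI SWWD bypass channel for this host."""
    caput(swwd_bypass, 1)               # bypass the SEI SWWD receiver
    for model in reversed(models):      # stop user models first, IOP last
        subprocess.run(["rtcds", "stop", model], check=True)
    for model in models:                # start the IOP first, then the rest
        subprocess.run(["rtcds", "start", model], check=True)
    caput(swwd_bypass, 0)               # unbypass once the IOP is running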
STATE of H1: Observing at 147Mpc
Very quiet evening so far. H1 has been locked and observing for 6.5 hours.
TITLE: 07/30 Day Shift: 15:00-23:00 UTC (08:00-16:00 PDT), all times posted in UTC
STATE of H1: Observing at 137Mpc
CURRENT ENVIRONMENT:
SEI_ENV state: CALM
Wind: 16mph Gusts, 12mph 5min avg
Primary useism: 0.03 μm/s
Secondary useism: 0.07 μm/s
QUICK SUMMARY:
18:39 Lockloss from NLN
https://ldas-jobs.ligo-wa.caltech.edu/~lockloss/index.cgi?event=1374777588
18:55 UTC lockloss while relocking
https://ldas-jobs.ligo-wa.caltech.edu/~lockloss/index.cgi?event=1374778448
19:07 H1 Assistance required
19:08 Initial Alignment Started.
19:33 Initial Alignment Completed.
20:23 Nominal_Low_Noise reached
Observing reached at 20:33 UTC
TITLE: 07/30 Eve Shift: 23:00-07:00 UTC (16:00-00:00 PDT), all times posted in UTC
STATE of H1: Observing at 143Mpc
CURRENT ENVIRONMENT:
SEI_ENV state: CALM
Wind: 18mph Gusts, 14mph 5min avg
Primary useism: 0.02 μm/s
Secondary useism: 0.06 μm/s
QUICK SUMMARY: Taking over from Tony. H1 has been locked and observing for 2.5 hours.
18:39 UTC Lockloss from NLN
https://ldas-jobs.ligo-wa.caltech.edu/~lockloss/index.cgi?event=1374777588
No obvious earthquakes; wind was mildly gusty, up to 20+ mph, but not unusually high. The lockloss analysis failed, but the plots show some motion before the lockloss which is interesting.
18:55 UTC lockloss while relocking, certainly due to poor alignment.
https://ldas-jobs.ligo-wa.caltech.edu/~lockloss/index.cgi?event=1374778448
I looked at the LSC/SQZ channels plotted by the lockloss select tool and compared them to when H1:ASC-AS_A_DC_NSUM_DQ saw the lockloss (attachment 1; I replaced the top-left channels with H1:ASC-AS_A_DC_NSUM_DQ in yellow). To me, at least, it looks like they all changed after the lockloss. I'm basing this partially on what was said in alog 71659.
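For anyone who wants to repeat the comparison, a quick gwpy sketch pulling a couple of those channels around the lockloss GPS time (1374777588, from the lockloss link above; DARM here is just a stand-in for the full lockloss-select channel list):

# Fetch a few seconds of data around the 18:39 UTC lockloss and plot them.
from gwpy.timeseries import TimeSeriesDict

gps = 1374777588  # 2023-07-30 18:39 UTC lockloss
data = TimeSeriesDict.get(
    ["H1:ASC-AS_A_DC_NSUM_DQ", "H1:LSC-DARM_IN1_DQ"],
    gps - 5, gps + 2,
)
plot = data.plot()
plot.savefig("lockloss_channels.png")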
I went through the summary pages, and the corner station HAM6 magnetic noise spectrogram (attachments 2 and 3) looks off to me before and during the time of the lockloss. Typically the magnetic noise here will suddenly jump due to a lockloss (see 1:45 UTC on attachment 2 or 3), but for whatever reason the noise jumped ~24 minutes before the lockloss.
However, it does look like this jump has happened a couple of other times (July 5, and a smaller one on July 21) without causing us to lose lock, so it's possibly not related.
The spectrogram is created as the sum of squares, x² + y² + z², of the channels H1:PEM-CS_MAG_LVEA_OUTPUTOPTICS_X_DQ, H1:PEM-CS_MAG_LVEA_OUTPUTOPTICS_Y_DQ, and H1:PEM-CS_MAG_LVEA_OUTPUTOPTICS_Z_DQ. Unfortunately I am currently having issues getting data from LDVW, so I'm not able to look into it more closely.
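As a workaround, the same sum-of-squares spectrogram can be rebuilt with gwpy over NDS; a minimal sketch (the time window and FFT parameters are illustrative, not the summary-page settings):

# Sum-of-squares magnetometer spectrogram around the 18:39 UTC lockloss.
from gwpy.timeseries import TimeSeries

start, end = "2023-07-30 18:00", "2023-07-30 19:00"
specs = [
    TimeSeries.get(f"H1:PEM-CS_MAG_LVEA_OUTPUTOPTICS_{ax}_DQ", start, end)
    .spectrogram(10, fftlength=4, overlap=2)
    for ax in "XYZ"
]
total = specs[0] + specs[1] + specs[2]  # x^2 + y^2 + z^2, in power units
plot = total.plot(norm="log", yscale="log")
plot.savefig("ham6_mag_sum.png")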
Sun Jul 30 10:12:57 2023 INFO: Fill completed in 12min 52secs
TITLE: 07/30 Day Shift: 15:00-23:00 UTC (08:00-16:00 PDT), all times posted in UTC
STATE of H1: Observing at 147Mpc
CURRENT ENVIRONMENT:
SEI_ENV state: CALM
Wind: 3mph Gusts, 2mph 5min avg
Primary useism: 0.01 μm/s
Secondary useism: 0.07 μm/s
QUICK SUMMARY:
Mattermost site is inaccessible, tagging CDS.
This is the error message I get if I wait long enough:
The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request
Reason: Error reading from remote server
TeamSpeak is confirmed working; I spoke to Dr. Dripta B., today's RRT04 shifter, on TeamSpeak.
Looks like Mattermost is back; I logged back in and posted a test message.