22:55 UTC YAW out by approx. 0.75
19:11 UTC LLO Operator contacted for alert verification. 1 hr stand-down time initiated by Guardian INJ_TRANS node.
There is a write-up of the bug that is possibly causing our front-end issues here.
I confirmed that our 2.6.34 kernel source on h1boot does have the arch/x86/include/asm/timer.h file with the bug.
To summarize, there is a low-level counter in the kernel which has a wrap-around bug in kernel versions 2.6.32 - 2.6.38. The bug causes the counter to wrap around after 208.499 days, and many other kernel subsystems assume this never happens.
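For a quick sanity check of the 208.499-day number: my working assumption is that the cycles-to-nanoseconds conversion keeps 10 fractional bits, so the 64-bit result effectively wraps at 2^54 ns, which works out to roughly 208.5 days. A small Python back-of-the-envelope check (my assumption, not the kernel code itself):

# Rough sanity check of the ~208.5-day wrap-around period (not the kernel code).
# Assumption: the cycles-to-nanoseconds conversion keeps 10 fractional bits,
# so the usable range of the 64-bit result is 2**64 >> 10 = 2**54 ns.
wrap_ns = 2 ** 54                        # nanoseconds until the counter wraps
wrap_days = wrap_ns / 1e9 / 86400.0      # ns -> seconds -> days
print("wrap after %.3f days" % wrap_days)   # ~208.5 days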
At 06:10 Thursday morning h1seiex partially locked up. At 05:43 Friday morning h1susex partially locked up. As of late Wednesday night, all H1 front ends had been running for 208.5 days. The error messages on the h1seiex and h1susex consoles show timers which had been reset Wednesday night, within two hours of each other.
We are going on the assumption that the timer wrap-around has put the front-end computers in a fragile state where lock-ups may happen. We don't know why only two computers at EX have seen the issue, or why around 6 am, or why one day apart. Nothing happened around 6 am this morning.
I am filing a work permit to reboot all H1 front-end and DAQ computers which are running kernels with this bug.
15:03 UTC
15:14 Intention bit Undisturbed.
Also got the strange "Conflicting IFO Status" verbal just before the intention bit verbal.
TITLE: 04/29 Owl Shift: 07:00-15:00 UTC (00:00-08:00 PST), all times posted in UTC
STATE of H1: Observing at 64Mpc
INCOMING OPERATOR: Ed
SHIFT SUMMARY:
Nice shift with H1 locked for over 14.5 hrs and only a trio of glitches. Luckily no front-end issues (knock on wood). L1 continues to be down due to possible alignment issues post-EQ from yesterday. Chatted briefly with Doug L, who said Adam M was inbound.
LOG:
While looking into possible sources of glitches (I noticed one Evan mentioned earlier this week (alog #35800) about ITMx glitching every few seconds...is this acceptable?), I was using the Oplev Overview screen to grab the oplev sum channels.
While doing that, I noticed that the buttons on it for the SUS screens looked different from the main SUS screens one gets via the Sitemap. The problem button links on the oplev sum screen were for ITMx, BS, & ITMy. The windows they opened had INVALID/white areas and just looked subtly different (perhaps this MEDM screen is calling out old SUS screens).
Attached is a screenshot showing the difference for the BS. The Oplev screen opens a screen called SUS_CUST_HLTX_OVERVEW.adl (vs. the one from the Sitemap, which is SUS_CUST_BSFM_OVERVIEW.ADL).
Alaska 5.4-magnitude EQ, which originated there at 11:15 UTC
Fairly quiet. Two glitches on H1.
See the 0.03-0.1 Hz seismic BLRMS elevating.
Whoa. Actually looks like we'll be shaken more by a quake from Alaska (5.4 magnitude w/ 4.1 um/s) which should be inbound soon...watching & waiting.
TITLE: 04/29 Owl Shift: 07:00-15:00 UTC (00:00-08:00 PST), all times posted in UTC
STATE of H1: Observing at 65Mpc
OUTGOING OPERATOR: Nutsinee
CURRENT ENVIRONMENT:
Wind: 5 mph
Primary useism: 0.02 μm/s
Secondary useism: 0.08 μm/s (at the 50th percentile)
QUICK SUMMARY:
Nutsinee mentioned glitches in this current lock, but we haven't had one for almost an hour. H1 is going on 7.5 hrs of being locked. Let's see if our front ends give us grief tonight/this morning.
TITLE: 04/29 Eve Shift: 23:00-07:00 UTC (16:00-00:00 PST), all times posted in UTC
STATE of H1: Observing at 66Mpc
INCOMING OPERATOR: Corey
SHIFT SUMMARY: Some oplev-related tweaking at the beginning of the shift. No trouble relocking after the earthquake. The range seems a bit glitchy during this lock stretch. I looked at random channels but no luck so far. The Detchar Hveto page doesn't have data for today yet.
LOG:
23:40 Sheila+TJ to power cycle EX Oplev
23:55 Sheila+TJ out
00:21 Observe
00:54 Sheila going back to EX to adjust oplev power. Switching off BRSX (forgot to go out of Observe while I did this). Out of Observe shortly after.
01:05 Sheila out. BRS turned back on. Back to Observe
Oplev glitches seem okay so far since Sheila turned the oplev power back up. Not much else to report.
We noticed on today's Hveto page that ETMX oplev is glitching. TJ and I went out and I turned the power knob down by about 1/8th of a turn. This reduced the output power by about 1%, and so far we don't have glitches.
After an hour it became clear that that power setting was actually worse, so I went back to the end station and turned the power up again at about 1 UTC. The glitching seems better in the first half hour at this higher power, but we will have to wait longer to see if it is really better. The attached screenshot shows the interferometer build up (which can explain some of the changes in the oplev sum) as well as the oplev sum for the past 5 hours.
Hopefully this means that we will be OK with glitches over the weekend.
I've used my IMC beam center measurements to calculate the change of the beam on IM1 in yaw, and propagated that change to the IO Faraday Isolator input, CalciteWedge1. The change on IM1 is calculated by comparing an ideal IMC (centered beams) to the recent beam spot measurements from March 2017. Nominal IM alignments are from the vent, July 2014, when the IMC REFL beam was routed through the IMs and the beam was well centered on the input and output of the IO Faraday.
My calculations show that the beam on CalciteWedge1 has moved +8.1 mm, which is in the -X IFO direction, and the incident angle has changed by -1217 urad, reducing the incident angle from 6.49 deg to 6.42 deg.
Beam Changes on IM1, IM2, and the IO Faraday input, CalciteWedge1:
| quantity | change | units |
| im1 yaw, displacement | -6.8 | mm |
| im1 yaw, angle | 253 | urad |
| im2 yaw, displacement | -8.4 | mm |
| im2 yaw, angle | -1417 | urad |
| cw1 yaw, displacement | 8.1 | mm |
| cw1 yaw, angle | -1217 | urad |
The beam change on IM1 is well understood, since it comes from the IMC beam spot changes. The IM positions can be assumed to have some error; however, I've done the same calculations with IM positions from before and after the vent, and the change on CalciteWedge1 varies only by about 1 mm.
A change of 8 mm (±1 mm) on the IO FI input is significant.
The optics inside the Faraday Rotator are only 20 mm in diameter, and there is a small loss in aperture due to the optic mounts.
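For anyone who wants to play with the geometry, here is a toy small-angle propagation sketch in Python. It is not the calculation I used: the lever-arm distances and the IM2 tilt in it are placeholders rather than the actual HAM2 IO layout, and the numbers in the table above came from the measured IM alignments.

# Toy small-angle beam propagation: track (displacement [mm], angle [rad]).
# Free space over a distance L adds L * angle to the displacement; a flat
# steering mirror adds a change in angle.  All distances and the IM2 tilt
# below are PLACEHOLDERS for illustration, not the real IO layout.

def drift(x_mm, th_rad, L_mm):
    """Propagate the beam a distance L_mm down the path."""
    return x_mm + L_mm * th_rad, th_rad

def tilt(x_mm, th_rad, dth_rad):
    """Add an angle change at a flat steering mirror."""
    return x_mm, th_rad + dth_rad

# Start from the tabulated IM1 change: -6.8 mm and 253 urad in yaw.
x, th = -6.8, 253e-6
x, th = drift(x, th, 1500.0)     # placeholder IM1 -> IM2 distance [mm]
x, th = tilt(x, th, -1.7e-3)     # placeholder effective tilt change at IM2
x, th = drift(x, th, 1000.0)     # placeholder IM2 -> CalciteWedge1 distance [mm]
print("spot at CW1: %+.1f mm, angle change: %+.0f urad" % (x, th * 1e6))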
[JeffK, JimW, Jenne]
In hopes of making some debugging a bit easier, we have updated the safe.snap files in SDF just after a lockloss from NomLowNoise.
We knew that an earthquake was incoming (yay seismon!), so as soon as the IFO broke lock, we requested Down so that it wouldn't advance any farther. Then we accepted most of the differences so that everything except ISIBS, CS ECAT PLC2, EX ECAT PLC2, and EY ECAT PLC2 (which don't switch between Observe.snap and safe.snap) was green.
Jeff is looking at making it so that the ECAT models switch between safe and observe snap files like many of the other models, so that ISIBS will be the only model that has diffs (21 of them).
Note that if the IFO loses lock from any state other than NLN, we shouldn't expect SDF to all be green. But, since this is the state of things when we lose lock from NLN, it should be safe to revert to these values, in hopes of helping to debug.
After talking with Jeff and Sheila, I have made a few of the OBSERVE.snap files in the target directory links to the OBSERVE.snap files in userapps.
This list includes:
I have also updated the switch_SDF_source_files.py script that is called by ISC_LOCK on DOWN and on NOMINAL_LOW_NOISE. I changed the exclude list to only exclude the h1sysecatplc[1or3] "front ends". The SEI models will always stay in OBSERVE, just as before. This was tested in DOWN and in NLN, and has been loaded into the Guardian.
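For illustration only, a minimal Python sketch of the link-plus-exclude-list idea; the paths and model names are placeholders, and this is not the actual switch_SDF_source_files.py code.

# Sketch: point a model's target-directory OBSERVE.snap at the copy kept in
# userapps, skipping the models on an exclude list.  Paths and model names
# are placeholders, not the real H1 configuration.
import os

USERAPPS_DIR = "/tmp/sdf_demo/userapps"   # placeholder for the userapps snap area
TARGET_DIR = "/tmp/sdf_demo/target"       # placeholder for the target directory
EXCLUDE = ["h1sysecatplc1", "h1sysecatplc3"]   # ECAT "front ends" that never switch

def link_observe_snap(model):
    """Replace target/<model>/OBSERVE.snap with a symlink to the userapps copy."""
    if model in EXCLUDE:
        return
    src = os.path.join(USERAPPS_DIR, model + "_OBSERVE.snap")
    dst_dir = os.path.join(TARGET_DIR, model)
    os.makedirs(dst_dir, exist_ok=True)
    open(src, "a").close()                # demo only: make sure the source exists
    dst = os.path.join(dst_dir, "OBSERVE.snap")
    if os.path.lexists(dst):
        os.remove(dst)                    # drop the old plain file (or stale link)
    os.symlink(src, dst)

os.makedirs(USERAPPS_DIR, exist_ok=True)
for m in ["h1examplemodel", "h1sysecatplc1"]:   # placeholder model names
    link_observe_snap(m)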
TITLE: 04/28 Owl Shift: 07:00-15:00 UTC (00:00-08:00 PST), all times posted in UTC
STATE of H1: LOCKING by Jim, but still in CORRECTIVE MAINTENANCE (will write an FRS unless someone else beats me to it again!)
INCOMING OPERATOR: Jim
SHIFT SUMMARY:
Groundhog Day shift, with H1 fine for the first 6 hrs and then EX going down again (but this time SUS...see earlier alog). This time I kept away from even breathing on the TMSx dither scripts & simply restored ETMx & TMSx to their values before all of the front-end hubbub of this morning. I was able to get ALSx aligned, & this is where I'm handing off to Jim (he had to tweak ALSy, & I see that he is already tweaking up a locked PRMI). Much better outlook than yesterday at this time for sure!
LOG:
FRS Assigned & CLOSED/RESOLVED for h1susex frontend crash:
https://services.ligo-la.caltech.edu/FRS/show_bug.cgi?id=7995
23:03 Intention Bit Undisturbed