I have run the dither alignment twice for both TMSY and ITMY. This improved things to the point that I can now get flashes and lock at around 0.28. However, I am still struggling to improve on this.
Many thanks to Ryan for our short-term fix of reverting our DS service back to login5. After making this change (and increasing the sleeptime to 300 seconds), I restarted the ligods and apache2 services on cdswiki, and the CDS web pages are back. As Ryan said, the login screens will revert to the old-style form. As soon as the new DS is available, we will switch back.
This will allow the Vacuum, Facilities and Detector Engineering teams to monitor LHO remotely for the remainder of the long weekend.
Thanks again to Ryan.
TITLE: 01/15 Day Shift: 16:00-00:00 UTC (08:00-16:00 PST), all times posted in UTC
STATE of H1: Aligning
OUTGOING OPERATOR: Ed
CURRENT ENVIRONMENT:
Wind: 3mph Gusts, 1mph 5min avg
Primary useism: 0.03 μm/s
Secondary useism: 0.27 μm/s
QUICK SUMMARY:
I could not log in to the alog earlier. This seems to be resolved now (see Dave's alog).
Ed was having trouble with initial alignment when I arrived and was on the phone with Keita. Ed told me that Keita sent a message to Sheila, Jenne and Kiwamu to ask for assistance. Sheila called me and I told her about the diag main message I was seeing: 'IM_ALIGNED: IM3 P is out of its nominal range of 1961'. I sent her a plot of the IM 1-4 witness sensor positions over the last 7 days. She suggested that I move IM3 P back to 1961 and retry initial alignment. I did so and am now on the initial alignment of the arms on green. The X arm locked easily, but I am not getting any signal for the Y arm. There is a spot on the camera, but it does not seem to move with the optics. I trended back and moved the positions of the ITM, TMS and ETM to no avail. I have just started looking for the dither align script.
Moved IM3P alignment offset from 29542 to 29864 to move the WIT P from 1945 to 1961.
The auth team has reported server issues, meaning that access to services with LIGO.ORG authentication is sporadic. I can confirm that I can 2FA SSH into LHO CDS and, as is self-evident, I can make alogs. Patrick reports he cannot log into alog to make entries.
The auth team are actively working this problem.
I'm not sure if this is related to our ongoing SAML-DS error which has prevented LHO CDS web pages from being accessible since yesterday morning.
Very directly related.
https://login.ligo.org/idp/shibboleth needs to return a signed XML document, not a human-readable error about missing files. The LIGO-SAML-DS service on every SP running it trips over this. The very short-term fix is to modify the "master" property in your ligods.ini to point to the site IdP (login5. for LHO) rather than the master (login.) and restart the ligods service. This is not something you should do outside of emergencies, since you really do want the latest metadata. This just happens to be a failure mode that hadn't been thought of at the time it was written -- it does handle the "cannot talk to main IdP at all" case fine (e.g. Internet outage or similar).
I was just able to log in.
Ryan, thanks for the description of the problem. Should we go ahead with the short-term fix you mention or do we know if service restoration is imminent?
I would go ahead and make the change, with an eye towards reverting once folks are back at work next week. You can check if the central service has been fixed by hitting the first URL in my other comment (if you get a large blob of XML, you're good. If you get "missing file", not so good). No ETA since I have no management involvement with the central IdP server.
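The manual check described above (large blob of XML versus a "missing file" message) could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, it only tests that the response parses as XML with a SAML metadata root element (EntityDescriptor or EntitiesDescriptor), and it does not verify the signature.

```python
import urllib.request
import xml.etree.ElementTree as ET

# URL quoted earlier in this thread.
IDP_URL = "https://login.ligo.org/idp/shibboleth"


def looks_like_xml_metadata(body: bytes) -> bool:
    """Return True if body parses as XML with a SAML metadata root element.

    This only checks well-formedness and the root tag name; it does NOT
    verify the XML signature.
    """
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # A plain-text error such as "missing file" ends up here.
        return False
    # Strip any "{namespace}" prefix from the root tag.
    local_name = root.tag.rsplit("}", 1)[-1]
    return local_name in ("EntityDescriptor", "EntitiesDescriptor")


def central_idp_is_back(url: str = IDP_URL, timeout: float = 10.0) -> bool:
    """Fetch the URL and report whether it returns metadata-like XML."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read()
    except OSError:
        # Covers connection failures, timeouts and HTTP error responses.
        return False
    return looks_like_xml_metadata(body)
```

Calling central_idp_is_back() returning True would suggest it is safe to think about reverting the ligods.ini change; False means either the server is unreachable or it is still returning the error text.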
... and in bad form replying to myself, permanently changing sleeptime to something more like 300 seconds would not be amiss either, since the default 15 seconds induces a large load on all of the IdPs from purely "hello, are you there?" checks.
We are seeing glitches that come in sets and look like stacks of harmonics in the most recent lock. It looks like they can be explained as the non-linear upconversion of some 1 kHz violin modes, based on the spacing between glitches. We think that these glitches happen when the violin modes get high enough to run into some non-linearity in the sensing. I used the derivative of OMC-DCPD_SUM, since it seems like these glitches should not care about the DC value and would most likely be some kind of slew-rate limit. The 1 kHz violin modes dominate the RMS, in particular a pair at 1009.44 and 1009.487 Hz. The beat frequency between these gives a period of about 21 seconds, which is the spacing between bursts of glitches. The glitches occur when the amplitude of the DCPD derivative is highest. The amplitude has a period of 21 seconds because of the two 1 kHz violin modes. Comparing to the previous lock, when these glitches were not present, the amplitude of these two modes is no higher. But there are a number of other modes near 1 kHz and several of those are substantially higher. So they may have pushed the amplitude into the nonlinear region. Attached is a PDF showing the glitches as they appear on the summary page (they are most clearly seen at 2 kHz in Omicron), and the comparison of the 1 kHz spectrum with the previous lock which did not have these glitches. The second page shows the bursts of glitches compared to the amplitude of the DCPD derivative.
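As a quick cross-check of the 21 second figure, the beat period of the two modes quoted above is just the inverse of their frequency difference. A minimal sketch (the function name is only for illustration):

```python
def beat_period(f1_hz: float, f2_hz: float) -> float:
    """Period in seconds of the beat envelope between two nearby lines."""
    return 1.0 / abs(f1_hz - f2_hz)


# The pair of 1 kHz violin modes quoted in the entry above.
period_s = beat_period(1009.44, 1009.487)
print(f"beat period: {period_s:.1f} s")  # prints "beat period: 21.3 s"
```

A difference of 0.047 Hz gives about 21.3 s, consistent with the observed spacing between bursts of glitches.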
If the interferometer is up I will spend some time damping them tonight.
It is important to notice that this 2kHz glitch line has been appearing and disappearing quite irregularly in the past, but when it is present the associated Omicron glitches are of high SNR. In fact, the last time this line showed up was all the way back on 29-30th November:
* 2kHz glitch line started to show on the first 29th Nov lock
* 2kHz glitch line disappears on the 30th Nov
Originally I thought that the 2kHz glitch line could have been related to PCALX roaming calibration lines, based on Evan's alog on PCALX roaming calibration line frequency changes. The 2kHz glitch line seemed to start as soon as the detector locked after the PCALX calibration line at 1001.3Hz was activated on UTC 2016-11-30 17:16:00, and then the glitch line disappeared around the time the cal line at 2001.3Hz was moved to 2501Hz at 2016-11-30 22:07:00. The fact that the time coincidence was not precise made me believe that it may have been accidental. It can now be confirmed that it must be unrelated, because no such PCALX roaming line was set at 2001.3Hz during the current appearance of the 2kHz glitch line.
The obvious question is whether the November 2kHz high SNR glitch line shows a similar 21 second spacing between bursts of glitches. The answer is yes, as seen next during the disappearance of the 2kHz glitch line on the 30th Nov (attached are the original images from which this image was made):

A zoom around the beginning of the above spectrum shows the ~21secs periodicity of the features:

A closer look at the 2nd harmonic violin modes for 30 mins during the time of the 2kHz glitch line (in blue) and 30 mins after the glitch line disappears (in red) shows that only a few violin modes were higher during the time of the 2kHz glitch line:

There are two cases when the blue lines are higher than the red:
* At about 1003.7Hz:

* At about 1009.4Hz:

It is clear that only the pair at around 1009.45Hz would beat with a periodicity of about 20 seconds. In this case the lower frequency violin mode of the pair does not change much in amplitude, whereas the higher frequency violin mode of the pair increases by 30.
I have also compared the amplitudes of the 1009.45Hz pair of peaks for 4 different cases: two of them correspond to times when the 2kHz glitch line was present, and the other two (dashed lines) to cases when the glitch line was not present. It shows how the higher frequency line has to be high enough to cause the nonlinearity for the 2kHz glitch to be present:

Nutsinee has just now created a damping filter for this pair of violin modes, so hopefully this will be enough to keep these peaks from growing to the point of causing the 2kHz glitch line to appear, but only time will tell.
A small note for operations: Borja's 2kHz glitch thresholds on 1009.44Hz and 1009.49Hz correspond to
8e-13 and 6.5e-13 m/sqrt(Hz) in the (dtt calibrated) CAL-DELTAL_EXTERNAL_DQ channel in November
and
7e-13 and 4.4e-13 m/sqrt(Hz) in January.
Now that the guardian is turning the damping on, these two modes should be well controlled. But if something bad happens and the modes ring up, I would suggest that operators take some time to damp them when they get close to 1e-12 on the DARM FOM.
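For anyone who wants to turn the numbers above into a quick check, here is a rough sketch. The January thresholds and the 1e-12 level are the values quoted in this comment; the function name, the status strings and the 0.8 "close to" fraction are assumptions made purely for illustration, not an operational procedure.

```python
# January thresholds quoted above, in m/sqrt(Hz) (CAL-DELTAL_EXTERNAL_DQ),
# keyed by the violin mode frequency in Hz.
JANUARY_THRESHOLDS = {"1009.44": 7e-13, "1009.49": 4.4e-13}

# Level on the DARM FOM at which damping is suggested.
ACTION_LEVEL = 1e-12


def mode_status(mode: str, amplitude: float, close_fraction: float = 0.8) -> str:
    """Classify a mode amplitude against the quoted levels.

    close_fraction is a hypothetical definition of "close to" the
    action level; adjust it to taste.
    """
    if amplitude >= close_fraction * ACTION_LEVEL:
        return "damp now"
    if amplitude >= JANUARY_THRESHOLDS[mode]:
        return "above glitch threshold"
    return "ok"


print(mode_status("1009.49", 5e-13))  # prints "above glitch threshold"
```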
10:16:08 OMC DCPD Saturation
Nothing jumps out as apparent at this time.
Initiating re-lock. Had to re-align Beam Splitter/PRM and PR2. We'll see.
ITMY Roll Mode seems to be an ongoing obstacle to locking.
17 minutes for damping. Added 30 degrees to phase (60 total) and increased gain from 10 to 20.
Unsuccessful lock attempt @ LOWNOISE_ASC. Decrease in ASAIR_B_RF90.
11:08 UTC Begin Initial Alignment