Reports until 09:50, Sunday 15 January 2017
H1 CDS
david.barker@LIGO.ORG - posted 09:50, Sunday 15 January 2017 - last comment - 10:25, Sunday 15 January 2017(33298)
alog inaccessible due to auth server issues this morning

The auth team has reported server issues, meaning that access to services with LIGO.ORG authentication is sporadic. I can confirm that I can 2FA SSH into LHO CDS and, as is self-evident, I can make alogs. Patrick reports he cannot log into alog to make entries.

The auth team is actively working on this problem.

I'm not sure if this is related to our ongoing SAML-DS error, which has prevented LHO CDS web pages from being accessible since yesterday morning.

Comments related to this report
ryan.blair@LIGO.ORG - 10:01, Sunday 15 January 2017 (33299)

Very directly related.

https://login.ligo.org/idp/shibboleth needs to return a signed XML document, not a human-readable error about missing files. The LIGO-SAML-DS service on every SP running it trips over this. The very short-term fix is to modify the "master" property in your ligods.ini to point to the site IdP (login5. for LHO) rather than the master (login.), and then restart the ligods service. This is not something you should do outside of emergencies, since you really do want the latest metadata. This just happens to be a failure mode that hadn't been thought of when the service was written -- it does handle the "cannot talk to main IdP at all" case fine (e.g. Internet outage or similar).
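
For anyone scripting that edit, here is a minimal sketch of the property swap. It is not the LIGO-SAML-DS tooling itself. It assumes ligods.ini uses plain `key = value` lines; the helper name `set_ini_property`, the `[ligods]` section header, and the example URLs are hypothetical placeholders. The real site IdP URL has to be filled in by hand.

```python
import re


def set_ini_property(ini_text: str, key: str, value: str) -> str:
    """Return ini_text with every `key = ...` line rewritten to use `value`.

    Works line by line so comments and all other properties are preserved.
    Assumes simple `key = value` or `key: value` lines (an assumption about
    the ligods.ini layout, not something confirmed here).
    """
    # [ \t]* rather than \s* so the match can never spill onto the next line.
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*[=:][ \t]*).*$", re.MULTILINE
    )
    new_text, count = pattern.subn(lambda m: m.group(1) + value, ini_text)
    if count == 0:
        raise KeyError(f"no '{key}' property found")
    return new_text


# Hypothetical file contents, for illustration only.
sample = (
    "[ligods]\n"
    "# upstream IdP used for metadata\n"
    "master = https://central.example/idp/shibboleth\n"
    "sleeptime = 15\n"
)
patched = set_ini_property(sample, "master", "https://site.example/idp/shibboleth")
```

Keeping a copy of the original file makes the revert straightforward once the central IdP is healthy again. The service still needs a restart after the change.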

patrick.thomas@LIGO.ORG - 10:01, Sunday 15 January 2017 (33300)
I was just able to log in.
david.barker@LIGO.ORG - 10:13, Sunday 15 January 2017 (33302)

Ryan, thanks for the description of the problem. Should we go ahead with the short-term fix you mention or do we know if service restoration is imminent?

ryan.blair@LIGO.ORG - 10:17, Sunday 15 January 2017 (33303)

I would go ahead and make the change, with an eye towards reverting once folks are back at work next week. You can check if the central service has been fixed by hitting the first URL in my other comment (if you get a large blob of XML, you're good; if you get "missing file", not so good). No ETA, since I have no management involvement with the central IdP server.
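
That check can be scripted. The following is a rough sketch, assuming a healthy response is a SAML metadata document whose root element is `EntityDescriptor` or `EntitiesDescriptor`. It only looks at the shape of the response and does not verify the XML signature. The function names are illustrative.

```python
import urllib.request
import xml.etree.ElementTree as ET

IDP_METADATA_URL = "https://login.ligo.org/idp/shibboleth"


def looks_like_saml_metadata(body: bytes) -> bool:
    """True if body parses as XML with a SAML metadata root element."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Plain-text errors such as "missing file" are not XML at all.
        return False
    # Strip the "{namespace}" prefix ElementTree puts on qualified tags.
    local_name = root.tag.rsplit("}", 1)[-1]
    return local_name in {"EntityDescriptor", "EntitiesDescriptor"}


def central_idp_is_healthy(url: str = IDP_METADATA_URL, timeout: float = 10.0) -> bool:
    """Fetch the metadata URL and report whether it returned metadata-shaped XML."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return looks_like_saml_metadata(response.read())
    except OSError:
        # Covers URLError/HTTPError and socket timeouts.
        return False
```

Calling `central_idp_is_healthy()` from a host with outbound HTTPS access gives a quick yes/no before reverting the "master" change.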

ryan.blair@LIGO.ORG - 10:25, Sunday 15 January 2017 (33304)

... and in bad form replying to myself, permanently changing sleeptime to something more like 300 seconds would not be amiss either, since the default 15 seconds induces a large load on all of the IdPs from purely "hello, are you there?" checks.
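
To put rough numbers on that, here is a small back-of-the-envelope sketch. It assumes each SP makes one check per sleeptime interval, which is an assumption about how the polling loop behaves.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def checks_per_day(sleeptime_s: int) -> int:
    """Number of polling checks one SP makes per day at a given sleeptime."""
    return SECONDS_PER_DAY // sleeptime_s


default_checks = checks_per_day(15)    # 5760 checks per SP per day
relaxed_checks = checks_per_day(300)   # 288 checks per SP per day
reduction_factor = default_checks // relaxed_checks  # 20
```

Under that assumption, moving from 15 to 300 seconds cuts the polling traffic from each SP by a factor of 20.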