Reports until 19:22, Sunday 15 June 2025
H1 CDS
david.barker@LIGO.ORG - posted 19:22, Sunday 15 June 2025 - last comment - 07:18, Monday 16 June 2025(85064)
CDS DNS and GC WIFI issues

Ryan C, Jonathan, Dave:

Starting around 18:28 Sun 15jun2025 the control room reported name resolution issues within CDS. The GC WIFI also went offline.

The CDS alarm system froze up at 18:28, which agrees with the time the other services went offline.

Jonathan is reporting issues contacting GC DNS and management machines, indicating this could be a GC issue.

Comments related to this report
david.barker@LIGO.ORG - 19:32, Sunday 15 June 2025 (85065)

Jonathan is heading to the site to investigate. 

david.barker@LIGO.ORG - 19:44, Sunday 15 June 2025 (85066)

From the control room perspective:

teamspeak continues to run on the verbal machine.

phones continue to work

alog is accessible if the IP address is used, not the name.

scripts are failing if they need to resolve names; this is preventing squeezer work and H1's range is down to the 80s.

the alarm/alert system cannot resolve twilio's address, so no alarm texts/emails can be sent.
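
As an illustration of the alog-by-IP workaround above, here is a minimal Python sketch of a lookup that falls back to a static address table when DNS is down. This is not how the CDS scripts are written; the function name, the table, the hostname and the address are all hypothetical placeholders.

```python
import socket

# Hypothetical fallback table. The hostname and address are placeholders
# (documentation ranges), not real site values.
FALLBACK_ADDRESSES = {"alog.example.org": "192.0.2.10"}


def resolve_with_fallback(hostname, fallback=FALLBACK_ADDRESSES,
                          resolver=socket.gethostbyname):
    """Return (address, source), where source is 'dns' or 'fallback'.

    The resolver argument is injectable so the function can be
    exercised without a working DNS server.
    """
    try:
        return resolver(hostname), "dns"
    except OSError:
        # socket.gaierror (name resolution failure) is a subclass of OSError.
        if hostname in fallback:
            return fallback[hostname], "fallback"
        # No static entry known for this host: let the caller see the failure.
        raise
```

A script would call resolve_with_fallback("alog.example.org") and log when the source comes back as 'fallback', so that the DNS outage is still noticed. A static table goes stale if addresses change, so this only makes sense for a small number of critical services.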

jonathan.hanks@LIGO.ORG - 20:28, Sunday 15 June 2025 (85067)

The issue has been resolved by power cycling the sw-osb163-0 switch.  This is what DNS and a few other key services hang off of.

I restarted the switch around 8:14pm local time.  Ryan C. confirms that he has access to the alog.  I can get to the management machines and the DNS servers, both locally and via offsite routes.
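
For reference, a reachability check of this kind can be sketched in Python as below. This is only an illustration, not the procedure that was used here; the function name is hypothetical, and any host names or ports passed to it would need to be the site's real ones.

```python
import socket


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake;
        # the with-block closes the socket again straight away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, is_reachable(address, 53) against a DNS server's IP address tests the network path independently of name resolution, which helps separate a switch or routing fault from a fault in the DNS service itself.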

david.barker@LIGO.ORG - 21:02, Sunday 15 June 2025 (85068)

Alarms restarted itself at 20:20 and I restarted alerts at 20:54. Test messages confirmed these services are working correctly.

david.barker@LIGO.ORG - 07:18, Monday 16 June 2025 (85069)

Opened FRS34439 to cover this, specifically how it impacted control room operations.