WP11562 Reduce SUS EX SWWD countdown period
Dave, Erik, Jonathan, TJ:
As a follow-up to the EX HWWD trip early Saturday morning, we reduced the time the SUS SWWD takes to issue its DACKILL from 20 minutes to 15 minutes.
This will mean that the SUS SWWD will trip 5 minutes before the HWWD.
The SWWD timers are hard-coded (by design) in the IOP models. I created a new h1iopsusex model with the second timer changed from 900 seconds (15 mins) to 600 seconds (10 mins). Adding this to the first timer of 5 mins gives us the required 15 mins.
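The timer arithmetic can be checked with a short Python sketch. The constant and function names here are hypothetical and do not come from the IOP model; the 20 minute HWWD trip time is an assumption inferred from the statement that the SWWD now trips 5 minutes before the HWWD.

```python
# Sketch of the SWWD countdown arithmetic described above.
# All names are illustrative; they are not taken from the h1iopsusex model.

FIRST_TIMER_S = 5 * 60        # first SWWD timer: 5 minutes (from the entry)
OLD_SECOND_TIMER_S = 900      # second SWWD timer before the change
NEW_SECOND_TIMER_S = 600      # second SWWD timer after the change
HWWD_TRIP_S = 20 * 60         # ASSUMPTION: HWWD trips at 20 minutes


def total_minutes(first_s: int, second_s: int) -> float:
    """Return the total countdown to DACKILL in minutes."""
    return (first_s + second_s) / 60


old_total = total_minutes(FIRST_TIMER_S, OLD_SECOND_TIMER_S)
new_total = total_minutes(FIRST_TIMER_S, NEW_SECOND_TIMER_S)
margin = HWWD_TRIP_S / 60 - new_total  # minutes between SWWD and HWWD trips

print(f"old SWWD countdown: {old_total:.0f} min")
print(f"new SWWD countdown: {new_total:.0f} min")
print(f"SWWD trips {margin:.0f} min before the HWWD")
```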
In theory this just needed a restart of the models on h1susex, but it did not go well.
The models were stopped with 'rtcds stop --all'.
The h1iopsusex model was started, and I verified that the timer change was installed (it was).
I restarted the models h1susetmx, h1sustmsx, h1susetmxpi. So far so good.
I did a list check with 'rtcds status' and was just about to log out when h1susex completely locked up.
h1iopsusex started at 11:58:46 and the lockup was at 12:00:50 (2 min 4 sec later).
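The interval between the model start and the lockup can be confirmed with a small Python sketch. The helper name `elapsed_seconds` is hypothetical, and the sketch assumes both timestamps fall on the same day.

```python
from datetime import datetime

FMT = "%H:%M:%S"


def elapsed_seconds(start: str, end: str) -> int:
    """Seconds between two same-day HH:MM:SS timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())


# Timestamps quoted in the entry: model start and computer lockup.
gap = elapsed_seconds("11:58:46", "12:00:50")
minutes, seconds = divmod(gap, 60)
print(f"{minutes} min {seconds} sec")
```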
There was no recourse except to remotely power cycle h1susex via its IPMI port. It was fenced from the dolphin switch before power was cycled.
Tue05Dec2023
LOC TIME HOSTNAME MODEL/REBOOT
11:58:31 h1susex h1iopsusex
11:58:54 h1susex h1susetmx
11:59:13 h1susex h1sustmsx
11:59:35 h1susex h1susetmxpi
12:08:27 h1susex ***REBOOT*** <<< power cycle following computer lock up
12:10:39 h1susex h1iopsusex
12:10:52 h1susex h1susetmx
12:11:05 h1susex h1sustmsx
12:11:18 h1susex h1susetmxpi