H1SUSEX front end computer and IO chassis were rebooted this morning to deal with the issue posted by Corey. Richard / Peter
h1susex showed the same failure mode today as h1seiex did yesterday. We were therefore unable to take h1susex out of the Dolphin fabric, so all Dolphin-connected models glitched after the reboot of h1susex.
Richard power cycled h1susex and its IO chassis. I killed all models on h1seiex and h1iscex, and then restarted all models on those computers. There were no IRIG-B timing excursions. IPC and CRC errors were cleared, and Corey reset the SWWD. IFO recovery has started.
Here are the front end computer uptimes (time since last reboot), collected at 10:02 this morning. The longest any machine has run is 210 days, since the site power outage on 30 Sep 2016.
h1psl0 up 131 days, 18:38, 0 users, load average: 0.37, 0.13, 0.10
h1seih16 up 210 days, 3:00, 0 users, load average: 0.11, 0.14, 0.05
h1seih23 up 210 days, 3:00, 0 users, load average: 0.62, 1.59, 1.37
h1seih45 up 210 days, 3:00, 0 users, load average: 0.38, 1.31, 1.17
h1seib1 up 210 days, 3:00, 0 users, load average: 0.02, 0.04, 0.01
h1seib2 up 210 days, 3:00, 0 users, load average: 0.02, 0.08, 0.04
h1seib3 up 210 days, 3:00, 0 users, load average: 0.00, 0.05, 0.06
h1sush2a up 210 days, 3:00, 0 users, load average: 1.64, 0.59, 0.56
h1sush2b up 210 days, 3:00, 0 users, load average: 0.00, 0.00, 0.00
h1sush34 up 210 days, 3:00, 0 users, load average: 0.00, 0.03, 0.00
h1sush56 up 210 days, 3:00, 0 users, load average: 0.00, 0.00, 0.00
h1susb123 up 210 days, 3:00, 0 users, load average: 0.17, 1.07, 1.10
h1susauxh2 up 210 days, 3:00, 0 users, load average: 0.00, 0.00, 0.00
h1susauxh34 up 117 days, 17:07, 0 users, load average: 0.08, 0.02, 0.01
h1susauxh56 up 210 days, 3:00, 0 users, load average: 0.00, 0.00, 0.00
h1susauxb123 up 210 days, 2:07, 0 users, load average: 0.00, 0.00, 0.00
h1oaf0 up 164 days, 20:16, 0 users, load average: 0.10, 0.24, 0.23
h1lsc0 up 207 days, 41 min, 0 users, load average: 0.06, 0.57, 0.65
h1asc0 up 210 days, 3:00, 0 users, load average: 1.03, 1.82, 1.80
h1pemmx up 210 days, 3:53, 0 users, load average: 0.05, 0.02, 0.00
h1pemmy up 210 days, 3:53, 0 users, load average: 0.00, 0.00, 0.00
h1susauxey up 205 days, 23:41, 0 users, load average: 0.07, 0.02, 0.00
h1susey up 210 days, 3:06, 0 users, load average: 0.14, 0.04, 0.01
h1seiey up 206 days, 51 min, 0 users, load average: 0.00, 0.03, 0.00
h1iscey up 210 days, 3:07, 0 users, load average: 0.04, 0.21, 0.20
h1susauxex up 210 days, 3:16, 0 users, load average: 0.00, 0.00, 0.00
h1susex up 3:06, 0 users, load average: 0.00, 0.00, 0.00
h1seiex up 1 day, 2:57, 0 users, load average: 0.00, 0.00, 0.00
h1iscex up 177 days, 21:59, 0 users, load average: 0.08, 0.33, 0.24
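For anyone wanting to sort or compare the uptimes above, here is a minimal Python sketch that turns one line of this listing into a duration. It assumes the lines have exactly the "host up ..., N users, load average: ..." shape shown above; the function name parse_uptime is made up for illustration and is not part of any site tooling.

```python
import re
from datetime import timedelta

def parse_uptime(line):
    """Return (host, uptime) for one line of the listing above.

    Hypothetical helper: handles the three shapes seen in this log,
    'N days, H:MM', 'N days, M min' and a bare 'H:MM'.
    """
    host, rest = line.split(" up ", 1)
    # Drop everything from the user count onward (users, load average).
    rest = re.split(r",\s*\d+ users?", rest)[0]
    days = hours = minutes = 0
    m = re.search(r"(\d+) days?", rest)
    if m:
        days = int(m.group(1))
    m = re.search(r"(\d+):(\d+)", rest)
    if m:
        hours, minutes = int(m.group(1)), int(m.group(2))
    m = re.search(r"(\d+) min", rest)
    if m:
        minutes = int(m.group(1))
    return host.strip(), timedelta(days=days, hours=hours, minutes=minutes)
```

Applied to every line, the results can be sorted by the timedelta to pick out the recently rebooted machines (h1susex and h1seiex here).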
Here is the list of free RAM on the front end computers in kB:
h1psl0 4130900
h1seih16 4404868
h1seih23 4023316
h1seih45 4024452
h1seib1 4754280
h1seib2 4763256
h1seib3 4753960
h1sush2a 4009216
h1sush2b 5160476
h1sush34 4389108
h1sush56 4400720
h1susb123 4013144
h1susauxh2 5338804
h1susauxh34 5351172
h1susauxh56 5350144
h1susauxb123 5339900
h1oaf0 9102096*
h1lsc0 4065228
h1asc0 3988536
h1pemmx 5358464
h1pemmy 5358352
h1susauxey 5352196
h1susey 64277012~
h1seiey 4758336
h1iscey 4117788
h1susauxex 5349644
h1susex 64301796~
h1seiex 4769840
h1iscex 4138204
* h1oaf0 has 12 GB of RAM
~ the end station SUS front ends have 66 GB of RAM
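One way numbers like these can be read on a Linux machine is from the MemFree line of /proc/meminfo, which reports free memory in kB. The sketch below is an assumption about how such a list could be produced, not a record of how this one was; the function name free_kb is hypothetical.

```python
def free_kb(meminfo_text):
    """Return the MemFree value in kB from the text of /proc/meminfo.

    Hypothetical helper: expects a line of the form 'MemFree:  <number> kB'.
    """
    for line in meminfo_text.splitlines():
        if line.startswith("MemFree:"):
            # Fields are: label, value, unit.
            return int(line.split()[1])
    raise ValueError("no MemFree line found")
```

On a front end this would be called with the contents of /proc/meminfo, for example the text returned by open("/proc/meminfo").read().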