Jonathan, Erik, EJ, Dave:
A follow-up on the unexpected IPC errors we saw sitewide after the power cycle of h1seib1 at 17:47 PST on Tue 19th December 2023 (it was cycled because of an ADC issue).
Details can be found in FRS30025.
When h1seib1 rejoined the Dolphin network (i.e. when it re-enabled its IX Dolphin switch port during its boot cycle), most, but not all, front ends with Dolphin and/or long-range Dolphin IPC receivers saw Rx errors (see attachment).
This error condition persisted for 64 seconds.
For some front ends the error rate was small (e.g. a dozen single errors spread over the 64 seconds); for others it was larger: over 100 errors, up to 8 per second, including a continuous stretch of errors lasting 15 seconds.
Unfortunately the HAM ISI models were among the models which saw 15 seconds of continuous errors. This caused their Payload Watchdogs to trip, since they trip on 10 or more seconds of continuous IPC Rx errors (see attachment).
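To illustrate why the sparse errors were harmless while the 15 second stretch tripped the watchdogs, here is a minimal Python sketch of the trip condition. It is not the actual watchdog code: the function names, the once-per-second sampling and the two example traces are assumptions made up for illustration. Only the 10 second threshold, the 64 second window and the rough error counts come from the entry above.

```python
def longest_error_run(errors_per_second):
    """Return the longest run of consecutive seconds with at least one error."""
    longest = 0
    current = 0
    for count in errors_per_second:
        # A second with zero errors breaks the continuous stretch.
        current = current + 1 if count > 0 else 0
        longest = max(longest, current)
    return longest


def watchdog_trips(errors_per_second, threshold_s=10):
    """Assumed trip rule: threshold_s or more continuous seconds of Rx errors."""
    return longest_error_run(errors_per_second) >= threshold_s


# Hypothetical 64 second traces (IPC Rx error count in each second).
# "sparse": a dozen single errors spread over the window.
sparse = [1, 0, 0, 0, 0] * 12 + [0] * 4
# "burst": 8 errors per second for a continuous 15 second stretch.
burst = [0] * 20 + [8] * 15 + [0] * 29

print(longest_error_run(sparse), watchdog_trips(sparse))
print(longest_error_run(burst), watchdog_trips(burst))
```

Under these assumptions only the length of the continuous stretch matters, not the total error count, which is consistent with the watchdogs tripping only on models that saw the continuous run.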
So far, the only event we have found which coincides with the time the IPC errors stopped is the h1isiitmy model starting up on h1seib1, at which point it would have started writing to its Dolphin card.