Found many IPC errors on the CDS Overview screen. Used the universal Diag Reset button to clear the errors, and they have not returned. Attached is a screenshot of the nodes that were in error before the reset.
I did some investigation; this appears to be a new problem that has not been seen before.
Normally the h1lsc0 user models glitch only because the IOP model (h1ioplsc0) has glitched (this is a Gen 2 front end computer), and I do indeed see this happening over the weekend. My clearing code detects these IOP glitches and issues a diag-reset of the user models.
In this specific case, at 07:21 Mon 9th July 2018 UTC (00:21 PDT), the user models h1omc, h1lsc, h1lscaux, h1omcpi and h1sqz had a timing glitch (ran long), but the IOP model did not. Since there was no IOP problem, the clearing script did not issue a diag_reset.
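To illustrate why nothing was cleared, here is a minimal Python sketch of the decision rule as described above. It is not the actual clearing code: the function name `models_to_reset`, its arguments, and the choice to reset every user model on the front end once the IOP has glitched are all assumptions made for illustration.

```python
def models_to_reset(iop_model, user_models, glitched):
    """Return the user models to diag-reset (hypothetical rule).

    Assumption: a reset is issued only when the IOP model itself is in
    the glitched set, and it then covers every user model on the front end.
    """
    if iop_model not in glitched:
        # No IOP glitch seen, so the rule never fires,
        # even if user models have run long.
        return []
    return list(user_models)


# User models named in this entry.
USER_MODELS = ["h1omc", "h1lsc", "h1lscaux", "h1omcpi", "h1sqz"]

# Usual case: the IOP glitches and takes the user models with it.
usual = models_to_reset("h1ioplsc0", USER_MODELS,
                        {"h1ioplsc0", *USER_MODELS})

# This event: the user models ran long but the IOP did not glitch.
this_event = models_to_reset("h1ioplsc0", USER_MODELS, set(USER_MODELS))

print(usual)       # all five user models
print(this_event)  # empty list, so no reset is issued
```

Under this rule, a user-model timing glitch with a clean IOP is never cleared automatically, which matches what was seen here. Checking the user models directly, rather than only the IOP, would catch this case.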
The overview snapshot shows the LSC models with a timing error (they ran long) and many other user models (PSL, OAF, SUS) showing IPC errors as a direct result of these.
It is unclear how user models, which are timed from the IOP, can show timing errors while the IOP model reports no problems. The /proc and dmesg logs do not report anything at the time of the glitch.