There are frequently times when the buildups and ASC signals in DRMI become very noisy about 1 minute after it locks. The guardian does several things at the same time here: it engages the DRMI ASC and waits for it to converge before offloading, requests BS stage 2 isolation, and changes DRMI from 1F to 3F locking. Today we went through these steps slowly to try to understand which of them causes the big kick, and it seems that the BS ST2 isolation loops are the problem. The attached screenshot shows that the problem happened during the BS ST2 transition (nothing else was going on at the time), although nothing seems to show up in the GS13 signals.
We are often able to ride this out, but we sometimes lose lock because of it. Looking at the lockloss tool summary plot that Niko posted in the second attachment to 52514, this is probably responsible for the cluster of locklosses between states 105 (TURN_ON_BS_ST2) and 111 (offload DRMI ASC), which is about 100 locklosses in O3a.
Edit: In another locking attempt we waited again for the BS isolation loops, saw the big glitch but survived, then lost lock during the ASC offload, so there might be multiple problems in these states. For now I've set ISC_LOCK to wait for the BS to finish isolating ST2 before it moves on to the other DRMI things, so that it will be easier to tell why we are losing lock.
Also, watching the isolation loops, the glitch seems to happen at the time when FM1 is switched off and FM8 is switched on for the horizontal loops (second attached screenshot).
The filter being engaged when the glitch happens is a DC boost: a pair of poles at 0.05 Hz and a pair of zeros at 2 Hz. It's engaged with a ramp time of 5 seconds, but the filter has a step response of something like 10 seconds (first plot is the foton step response for this filter).
There are two things we could try to reduce the wiggle caused by engaging the boost: we could reduce the ramp time on the boost (maybe something like 1-2 seconds? That makes me a bit nervous), or we could push the poles down to something like 0.02 Hz to slow down the step response. This shouldn't affect the stability of the controls. The second attached plot is the step response of a boost with 0.025 Hz poles. This takes about twice as long to get to the same point.
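The "twice as long" statement can be checked with a rough standard-library sketch. This is not the foton design: it assumes an idealized boost H(s) = (s + wz)^2 / (s + wp)^2 with real repeated poles and zeros and unity gain at high frequency, and it assumes the zeros of the 0.025 Hz version sit at 1 Hz so that the DC gain matches. The function names are made up for illustration.

```python
import math

def boost_step_response(t, f_pole, f_zero):
    """Unit-step response at time t (s) of H(s) = (s + wz)^2 / (s + wp)^2.

    Assumed idealized boost: real double pole at f_pole, real double zero at
    f_zero (both in Hz), unity gain at high frequency. Closed form obtained
    from the partial-fraction expansion of H(s)/s.
    """
    wp = 2 * math.pi * f_pole
    wz = 2 * math.pi * f_zero
    dc_gain = (wz / wp) ** 2
    decay = math.exp(-wp * t)
    return dc_gain + (1 - dc_gain) * decay - ((wz - wp) ** 2 / wp) * t * decay

def time_to_fraction(fraction, f_pole, f_zero, t_max=200.0):
    """Bisect for the time at which the response reaches `fraction` of its DC value.

    Valid because the response rises monotonically when f_zero > f_pole.
    """
    target = fraction * (f_zero / f_pole) ** 2
    lo, hi = 0.0, t_max
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if boost_step_response(mid, f_pole, f_zero) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

old_filter = (0.05, 2.0)    # poles and zeros quoted in the entry
new_filter = (0.025, 1.0)   # ASSUMPTION: zeros moved to 1 Hz to keep the DC gain

for label, (f_pole, f_zero) in (("old", old_filter), ("new", new_filter)):
    dc_gain = (f_zero / f_pole) ** 2
    frac_at_5s = boost_step_response(5.0, f_pole, f_zero) / dc_gain
    t90 = time_to_fraction(0.9, f_pole, f_zero)
    print(f"{label}: fraction of DC value at 5 s = {frac_at_5s:.2f}, "
          f"time to 90% = {t90:.1f} s")
```

Under these assumptions the new filter is just the old one with every corner frequency halved, so its step response is the old one stretched by a factor of two in time. If the real filter has complex poles or a different gain normalization, the numbers will differ.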
The third plot is a Bode plot comparing the old filter (red) and the new one (blue). I adjusted the zeros so the DC gain was roughly the same. We lose gain between 0.1 and 1 Hz, but most of the ISI performance comes from St1 anyway, which is unaffected. The BS already has lower-gain St2 loops than the other ISIs, and we used to run with the St2 loops off, so maybe it's not a big deal.
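To put rough numbers on the gain lost between 0.1 and 1 Hz, here is a small sketch that evaluates the magnitude of both boosts at a few frequencies. It uses the same idealized form as assumed above (real double poles and zeros, unity high-frequency gain) and assumes the new zeros are at 1 Hz; the actual foton filters may be shaped differently, so treat the output as indicative only. The function name is hypothetical.

```python
import math

def boost_gain_db(f, f_pole, f_zero):
    """Magnitude in dB at frequency f (Hz) of ((s + wz) / (s + wp))^2.

    The common factor of 2*pi cancels in the ratio, so corner frequencies
    in Hz can be used directly with s = j*f.
    """
    s = 1j * f
    h = ((s + f_zero) / (s + f_pole)) ** 2
    return 20 * math.log10(abs(h))

old_filter = (0.05, 2.0)    # poles and zeros quoted in the entry
new_filter = (0.025, 1.0)   # ASSUMPTION: zeros at 1 Hz, matching the DC gain

print("f [Hz]   old [dB]   new [dB]   loss [dB]")
for f in (0.001, 0.01, 0.1, 0.3, 1.0, 10.0):
    old_db = boost_gain_db(f, *old_filter)
    new_db = boost_gain_db(f, *new_filter)
    print(f"{f:6.3f}   {old_db:8.1f}   {new_db:8.1f}   {old_db - new_db:9.1f}")
```

In this model the two filters agree at DC and at high frequency, and the difference between them stays below 20*log10(4), about 12 dB, in the band between the poles and the zeros.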
I tweaked the St2 boosts on the BS ISI more or less as proposed, and it seems better. I pushed the poles down to 0.025 Hz and decreased the ramp time to 3 seconds.
The first attached screenshot shows POPAIR_B_RF18 and the MICH_P ASC signal as the BS turned on its ST2 loops just a couple of minutes ago; the second plot shows the same trends for the glitch Sheila posted from the 26th. The glitch is about half as big on POP with the new filter settings. I'll check on this again after a couple more locks, but this seems like a good change.