We lost lock from the calibration, so we tried to lock ALS without the linearization (some background is in alog 83278). An active measurement of the transfer function from DRIVEALIGN_L to MASTER out was 1 without the linearization, and -0.757 with the linearization on. So I've changed the ALS_DIFF guardian to set the DRIVEALIGN gain to -1.3 when use_ESD_linearization is set to false.
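As an arithmetic check only: the entry does not say how -1.3 was derived, but it is close to the reciprocal of the measured -0.757. The sketch below assumes that reading; the function name is hypothetical and is not part of the guardian code.

```python
def reciprocal_gain(measured_tf: float) -> float:
    """Return the reciprocal of a measured (real, DC) transfer function value.

    Assumption: the -1.3 DRIVEALIGN gain was chosen as roughly 1 / (-0.757).
    The log entry does not state this explicitly.
    """
    if measured_tf == 0:
        raise ValueError("measured transfer function must be nonzero")
    return 1.0 / measured_tf


tf_with_linearization = -0.757  # measured DRIVEALIGN_L -> MASTER out, linearization on
print(f"{reciprocal_gain(tf_with_linearization):.3f}")  # -1.321, rounded to -1.3 in the guardian
```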
We tried this once, and it stayed locked for a DARM gain of 400, but unlocked as the UIM boosts were turning on. We tried this again but it also didn't lock DIFF, so it is now out of the guardian again.
I looked at a few more of the past ALS DIFF locks, both successful and unsuccessful attempts. We are saturating the ESD (either the DAC or the limiter in the linearization) in the first steps of locking DIFF. We do these steps quite slowly: stepping the DARM gain to 40, waiting for the DARM1 ramp time, stepping it to 400, waiting twice the ramp time, then engaging the boosts for offloading to L1. I reduced the ramp time from 5 seconds to 2 seconds to make this go faster. This worked on the first locking attempt, but that could be a coincidence.
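To put numbers on how much faster this makes the sequence, here is a minimal sketch of the wait budget described above (one ramp time after the step to 40, two ramp times after the step to 400). It only counts the explicit waits and is not the guardian code; the function name is hypothetical.

```python
def diff_gain_step_wait(ramp_time_s: float) -> float:
    """Total explicit wait, in seconds, before the L1 boosts are engaged.

    Sequence as described in the entry:
      1. step DARM gain to 40, wait 1 x ramp time
      2. step DARM gain to 400, wait 2 x ramp time
      3. engage boosts for offloading to L1
    """
    wait_after_gain_40 = 1 * ramp_time_s
    wait_after_gain_400 = 2 * ramp_time_s
    return wait_after_gain_40 + wait_after_gain_400


for ramp in (5, 2):
    print(f"ramp time {ramp} s -> {diff_gain_step_wait(ramp)} s of waiting")
```

With a 5 second ramp the waits add up to 15 seconds; with 2 seconds they add up to 6 seconds.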
We will leave this in for a while, so that we can compare how frequently we lose lock at LOCKING_ALS. In the last 7 days we've had 48 LOCKING_ALS locklosses, and 19 locklosses from NLN, so roughly 2.5 ALS locklosses per lock stretch.
Since the time of this alog, around 19 UTC on March 13th, we've had 68 LOCKING_ALS locklosses and 12 NLN locklosses, so about 6 locklosses per successful lock. It seems, though, that the change to 2 seconds was never in place, and the guardian code still said 5 seconds. So this issue seems to be getting worse without any change.
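For reference, the two ratios quoted in this entry can be reproduced as follows. This is a sketch that uses the number of NLN locklosses as a stand-in for the number of lock stretches, which is the same approximation made in the text; the function name is hypothetical.

```python
def als_locklosses_per_lock(als_locklosses: int, nln_locklosses: int) -> float:
    """LOCKING_ALS locklosses per lock stretch.

    Assumption: each NLN lockloss marks the end of one successful lock stretch.
    """
    if nln_locklosses <= 0:
        raise ValueError("need at least one NLN lockloss to form a ratio")
    return als_locklosses / nln_locklosses


before = als_locklosses_per_lock(48, 19)  # the 7 days before the original entry
after = als_locklosses_per_lock(68, 12)   # since around 19 UTC on March 13th
print(f"before: {before:.1f}, after: {after:.1f}")  # before: 2.5, after: 5.7
```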
Now I've loaded the change to 2 seconds, so this should be sped up after today's maintenance window.
I've looked at a bunch more of these locklosses. They mostly happen while the DARM gain is ramping, less often as the boosts are coming on in L1, and one that I saw happened while COMM was locking.
In all the cases the linearization seems to hit its limiter before anything else goes wrong.