The past few weeks have seen rocky performance out of the calibration pipeline and its IFO-tracking capabilities. Much, but not all, of this is due to [my] user error. Tuesday's bad calibration state was a result of my mishandling of the recent drivealign L2L gain changes for the ETMX TST stage (LHO:78403, LHO:78425, LHO:78555, LHO:79841).

The current practice adopted by LHO with respect to these gain changes is the following:

1. Identify that KAPPA_TST has drifted from 1 by some appreciable amount (1.5-3%), presumably due to ESD charging effects.
2. Calculate the DRIVEALIGN gain adjustment needed to cancel out the change in ESD actuation strength. This is done in the DRIVEALIGN bank because it is downstream enough to affect only the control signal being sent to the ESD; it is also downstream of the calibration TST excitation point.
3. Adjust the DRIVEALIGN gain by the calculated amount (if KAPPA_TST has drifted by +1%, this corresponds to a -1% change in the DRIVEALIGN gain; a short sketch of the arithmetic follows below).
3a. Do not propagate the new drivealign gain to CAL-CS.
3b. Do not propagate the new drivealign gain to the pyDARM ini model.

After step 3, the IFO should be back in the state it was in when the last calibration update took place, i.e. as if no ESD charging had taken place, since the charging is being canceled out by the DRIVEALIGN gain adjustment. It is also worth noting that after these adjustments the SUS-ETMX drivealign gain and the CAL-CS ETMX drivealign gain will no longer be copies of each other (see image below).

The reasoning behind 3a and 3b is that, by using these adjustments to counteract IFO changes (in this case ESD drift) since the last calibration, operators and commissioners in the control room can comfortably perform the changes without having to invoke the entire calibration pipeline. The other approach, adopted by LLO, is to propagate the gain changes to both CAL-CS and pyDARM each time and follow up with a fresh calibration push. That approach leaves less to 'be remembered', since CAL-CS, SUS, and pyDARM are always in sync, but it comes at the cost of having to turn a larger crank each time there is a change.
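To make the arithmetic in steps 2 and 3 concrete, here is a minimal sketch of the calculation (Python; the gain value is a made-up placeholder, not the actual SUS-ETMX setting). Since the TST actuation strength scales linearly with KAPPA_TST, the compensating gain is the nominal gain, i.e. the gain at the time of the last calibration update, divided by the measured KAPPA_TST:

# Sketch of the step-2/3 arithmetic. The nominal gain below is a placeholder;
# the real value lives in the SUS-ETMX L3 DRIVEALIGN L2L filter bank.

def compensated_gain(nominal_gain: float, kappa_tst: float) -> float:
    """Gain that cancels a measured KAPPA_TST drift.

    TST actuation strength scales linearly with KAPPA_TST, so dividing the
    nominal gain by KAPPA_TST restores the effective actuation strength to
    what it was at the last calibration update.
    """
    return nominal_gain / kappa_tst

nominal = 184.7   # hypothetical gain at the last calibration update
kappa = 1.010     # ESD actuation has gotten 1.0% stronger since then

new_gain = compensated_gain(nominal, kappa)
print(f"new gain: {new_gain:.3f}")                          # ~182.871
print(f"change:   {100 * (new_gain / nominal - 1):+.2f}%")  # -0.99%, i.e. ~ -1%

The exact correction is a division rather than a subtraction, but for drifts at the 1.5-3% level quoted in step 1 the two differ by less than a part in a thousand, which is why the '-1% for +1%' rule of thumb is safe.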
Somewhere along the way I updated the TST drivealign gain parameter in the pyDARM model even though I shouldn't have. At this point, I don't recall whether I was confused because the two sites operate differently, or whether I was running a test, left the parameter changed in the model template file by accident, and subsequently forgot about it. In any case, the drivealign gain parameter change made its way in alongside the actuation delay adjustments I made to compensate for both the new ETMX DACs and for residual phase delays that hadn't been properly compensated for recently (LHO:80270). This happened in commit 0e8fad of the H1 ifo repo. I should have caught it when inspecting the diff before pushing the commit, but I didn't. I have since reverted the change (H1 ifo commit 41c516).

During Tuesday's maintenance period, I took advantage of the fact that the IFO was down to update the calibration pipeline to account for all of the residual delays in the actuation path that we hadn't been properly compensating for (LHO:80270). This is something I've done several times before; the combination of how well the calibration pipeline has been working in O4 and how minor the phase delay changes were led me to expect that we would come back online to a better-calibrated instrument.

That expectation was wrong. What I had actually done was install a calibration configuration in which the CAL-CS drivealign gain and the pyDARM model's drivealign gain parameter were different. This is bad because pyDARM generates the FIR filters used by the downstream GDS pipeline, and those filters are embedded with knowledge of what is in CAL-CS by way of the parameters in the model file. In short, CAL-CS was doing one thing and GDS was correcting for another.

-- Where do we stand?

At the next available opportunity, we will take another calibration measurement suite and use it to reset the calibration once more, now that we know what went wrong and how to fix it. I've uploaded a comparison of a few broadband pcal measurements (image link). The blue curve is the current state of the calibration error. The red curve is the calibration state during the high-profile event earlier this week. The brown curve is from last Thursday's calibration measurement suite, taken as part of the regularly scheduled measurements.

-- Moving forward, I and others in the Cal group will need to adhere more strictly to the procedures we already have in place:

1. Double-check that any changes include only what we intend at each step (a sketch of one such check follows below).
2. Commit all changes to any report in place immediately, and include a useful log message (we also need to fix our internal tools to handle the report git repos properly).
3. Only update the calibration while there is a thermalized IFO that can be used to confirm that things come back properly; if the update is done while the IFO is down, require Cal group sign-off before going to observing.
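As one concrete form the double-check in item 1 could take: a small pre-push consistency test that compares the drivealign gain in the pyDARM model ini against what CAL-CS is actually running. The sketch below is hypothetical; the file, section, key, and channel names are illustrative placeholders rather than the real ones. (Under LHO practice it is CAL-CS and the model ini that must stay in sync, while the SUS gain is allowed to drift as described above.)

# Hypothetical pre-push sanity check: does the pyDARM model file agree with
# what CAL-CS is running? All names below are illustrative placeholders.
import configparser
from epics import caget  # pyepics; assumes EPICS channel access is available

MODEL_INI = "pydarm_H1.ini"                                      # placeholder path
INI_SECTION, INI_KEY = "actuation-x-arm", "tst-drivealign-gain"  # placeholder names
CALCS_CHANNEL = "H1:CAL-CS_DARM_FE_ETMX_L3_DRIVEALIGN_L2L_GAIN"  # placeholder channel

config = configparser.ConfigParser()
config.read(MODEL_INI)
model_gain = float(config[INI_SECTION][INI_KEY])
calcs_gain = float(caget(CALCS_CHANNEL))

# Anything beyond round-off means the GDS FIR filters generated from this
# model would correct for something other than what CAL-CS is doing.
if abs(model_gain - calcs_gain) > 1e-6 * abs(calcs_gain):
    raise SystemExit(
        f"MISMATCH: model ini has {model_gain}, CAL-CS has {calcs_gain}; "
        "do not push this calibration."
    )
print("drivealign gain consistent between model ini and CAL-CS")

A check along these lines could be wired into the report-generation tooling so that a model/front-end mismatch fails loudly before any new GDS filters are generated.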