TITLE: 09/15 Day Shift: 15:00-23:00 UTC (08:00-16:00 PST), all times posted in UTC
STATE of H1: Observing at 149Mpc
OUTGOING OPERATOR: Tony
CURRENT ENVIRONMENT:
SEI_ENV state: CALM
Wind: 6mph Gusts, 4mph 5min avg
Primary useism: 0.02 μm/s
Secondary useism: 0.12 μm/s
QUICK SUMMARY:
Detector is Observing and has been Locked for 16hrs 22mins. Everything looks good
TITLE: 09/15 Eve Shift: 23:00-07:00 UTC (16:00-00:00 PST), all times posted in UTC
STATE of H1: Observing at 143Mpc
INCOMING OPERATOR: Tony
SHIFT SUMMARY:
IFO is in NLN and OBSERVING as of 22:53 UTC
Continued Lockloss (alog 72875) Investigations (Continued from alog 72876)
The same type of lockloss happened today during TJ's Day shift. Instead of continuing exactly where I left off by inspecting the EX saturations, I will briefly trend the lockloss select (as was done yesterday and today with TJ for the separate locklosses of this type). From the LSC lockloss scopes (Screenshot 1), we can clearly see that H1:LSC-DARM_IN1_DQ saw the lockloss about 92 ms before any of the other LSC channels did. From speaking with TJ the day earlier, this is a channel that goes back to the OMC DCPDs (if I recall correctly).
Before hunting the actuator down, I zoomed in on the channel and saw that its bumpy behavior started building up at 04:45:47 (Screenshot 2), a second before that lockloss. This second picture is just a zoom on the tiny hump seen in the first screenshot.
Unfortunately, there was not enough time to continue investigating, but I will be picking this up next week. We essentially found that there is one particular channel, related to the OMC DCPDs, that has a build-up followed by a violent kick that shakes everything from time to time, causing locklosses. What I would like to know/ask:
Most of these questions hit on the same few pieces of evidence we have (EX saturation - potential red herring, OMC Channel kick - a new area to investigate) and the BLRMs glitch incidence (the evidence that it wasn’t external).
Other:
3 GRB-Short Alarms
Many glitches but 0 lockloss causing ones
LOG:
Start Time | System | Name | Location | Lazer_Haz | Task | Time End |
---|---|---|---|---|---|---|
20:22 | EPO | Oregon Public Broadcasting | Overpass | N | Setting up timelapse camera | 20:59 |
Tony, Oli, Camilla
Good lockloss investigations Ibrahim. The lockloss tool shows these ETMX glitches in the ~2 seconds before the lockloss in the "Saturations" and "Length-Pitch-Yaw" plots. I think ETMX moving would cause a DARM glitch (and so the DARM BLRMS to increase), or vice versa, a DARM change would cause ETMX to try to follow. Faster ETMX channels to look at would be H1:SUS-ETMX_L3_MASTER_OUT_UL_DQ (16384 Hz vs 16 Hz). You can see the frame rate of the channels using the command 'chndump | grep H1:SUS-ETMX_L3_MASTER_OUT_' or similar.
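To make the fast-vs-slow comparison concrete, here is a minimal sketch (not the lockloss tool itself) that pulls the 16384 Hz ETMX L3 drive and DARM_IN1 around a lockloss with gwpy; the timestamp below is a hypothetical placeholder to be replaced with the actual lockloss time.

    # Sketch only: compare glitch onset between a fast ESD drive channel and DARM.
    # Assumes gwpy with NDS2/frame access from an LHO workstation.
    from gwpy.time import to_gps
    from gwpy.timeseries import TimeSeriesDict
    from gwpy.plot import Plot

    t0 = to_gps('2023-09-15 04:45:48')   # placeholder: substitute the lockloss time (UTC)
    channels = [
        'H1:SUS-ETMX_L3_MASTER_OUT_UL_DQ',   # fast (16384 Hz) ESD drive, UL quadrant
        'H1:LSC-DARM_IN1_DQ',                # DARM error signal
    ]

    # Grab 4 seconds of data centered on the lockloss
    data = TimeSeriesDict.get(channels, t0 - 2, t0 + 2, verbose=True)

    # Stack the two channels in separate panels with a shared time axis
    plot = Plot(data[channels[0]], data[channels[1]], separate=True, sharex=True)
    plot.savefig('lockloss_compare.png')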
Plot attached shows L1, L2, L3 of ETMX all see these fast noisy glitches, but the OMC and DARM channels show a slower movement. Can this tell us anything about the cause?
See similar glitches in:
IFO is in NLN and OBSERVING as of 22:53 UTC
Nothing else to report.
TITLE: 09/14 Eve Shift: 23:00-07:00 UTC (16:00-00:00 PST), all times posted in UTC
STATE of H1: Observing at 145Mpc
OUTGOING OPERATOR: TJ
CURRENT ENVIRONMENT:
SEI_ENV state: SEISMON_ALERT
Wind: 10mph Gusts, 7mph 5min avg
Primary useism: 0.03 μm/s
Secondary useism: 0.12 μm/s
QUICK SUMMARY:
IFO is in NLN and OBSERVING as of 22:53 UTC
TITLE: 09/14 Day Shift: 15:00-23:00 UTC (08:00-16:00 PST), all times posted in UTC
STATE of H1: Observing at 148Mpc
INCOMING OPERATOR: Ibrahim
SHIFT SUMMARY: One lock loss with a full auto relock. Quiet shift otherwise.
LOG:
Start Time | System | Name | Location | Lazer_Haz | Task | Time End |
---|---|---|---|---|---|---|
14:59 | FAC | Tyler | EY | n | Check on chiller alarm | 15:15 |
15:22 | FAC | Tyler | EX | n | Check on chiller alarm | 15:37 |
18:06 | FAC | Randy | EY | n | Inspections outside of building | 19:02 |
20:14 | FAC | Tyler | EX | n | Checking on chiller | 20:22 |
20:31 | FAC | Tyler | EX | n | More chiller checks | 21:31 |
21:42 | VAC | Gerardo | EY | n | Check on water lines | 22:08 |
21:53 | FAC | Tyler | MY | n | Check on Fan1 noise | 22:25 |
22:13 | - | Jeff, OBC | EX, overpass | n | Film arm driving | 22:35 |
Looks like the same one as yesterday and similar to others recently where LSC-DARM_IN1 shows the first signs of movement. Need to look into this further. Ibrahim has dug into this a good amount in 72876.
I received an email from the control room (thank you TJ) about the Mid Y AHU Fan 1 showing some excessive vibration. Tyler went down to investigate and had me turn the fan off to listen to the sound as it was ramping down. It was terrible; one or both bearings are damaged or destroyed. We switched to Fan 2, which should be much smoother. We will investigate Fan 1 and replace the bearings.
An alarm was noted this morning on the Alerton system at End X chiller 2. The alarm information is not accessible remotely. At the controls, a "Low Refrigerant Temperature" alarm was present. Because this is a non-latched alarm, I took note and cleared it before returning to the corner station. Shortly after clearing, the alarm returned. I investigated the setup of the chiller via Compass and noticed that the chiller was set to "manually enable". It is my understanding that a manual enable of the chiller will hold the chiller on regardless of the demand for cooling. I suspect that the cooler temperatures combined with the manual enable kept the chiller running well past the building's demand for chilled water, possibly driving refrigerant temperatures lower than the chiller would like. When returning this afternoon to clear the second alarm, I observed, for the first time in recent memory, EX chiller 2 ramping down, indicating that the requirement for cooling had been satisfied. The control board's diagnostics agreed. I'm hopeful the chiller refrigerant will return to normal following this change, but I will continue to monitor it in the coming days. B. Gateley, T. Guidry
R. Short, T. Shaffer
Our automation has called for assistance when earthquakes roll through and make locking H1 difficult, which typically just has an operator request H1 to 'DOWN' and wait until ground motion is low, then try locking again. In an attempt to further improve our automation and lower the need for intervention, I've added a 'WAITING' state to H1_MANAGER that holds ISC_LOCK in 'READY' and waits for the SEI_ENV guardian to leave its 'EARTHQUAKE' state before moving back to 'RELOCKING.' H1_MANAGER will jump from 'RELOCKING' to 'WAITING' if the SEI_ENV node is in 'EARTHQUAKE' or 'LARGE_EQ' and ISC_LOCK is not past 'READY' (the motivation for this being that if H1 is making progress in locking when an earthquake hits, we don't want it to stop if the earthquake is harmless enough).
These changes are committed to svn and H1_MANAGER has been loaded.
There were two cases over the weekend where an earthquake caused a lockloss and H1_MANAGER correctly identified that with SEI_ENV being in 'EARTHQUAKE' mode, it would be challenging to relock, so it kept ISC_LOCK from trying (one on 9/17 at 11:30 UTC and another on 9/18 at 13:44 UTC). However, after 15 minutes of waiting, IFO_NOTIFY called for assistance once it saw that ISC_LOCK had not made it to its 'READY' state; confusing behavior at first, since H1_MANAGER requests ISC_LOCK to 'READY' when it moves to the 'WAITING' state. When looking into this, I was reminded that ISC_LOCK's 'DOWN' state has a jump transition to 'PREP_FOR_LOCKING' when it finishes, meaning that ISC_LOCK will stall in 'PREP_FOR_LOCKING' unless revived by its manager or is requested to go to another state. To fix this, I've added an "unstall" decorator to H1_MANAGER's 'WAITING' state's run method, which will revive ISC_LOCK so that it can move past 'PREP_FOR_LOCKING' and all the way to 'READY' while waiting for the earthquake to pass.
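For illustration only, here is a hand-written sketch of the transition logic described above; it is not the actual H1_MANAGER Guardian code, and the helper arguments are hypothetical stand-ins for the real node queries and the "unstall" revival of ISC_LOCK.

    # Sketch of the H1_MANAGER earthquake-handling logic described above
    # (hypothetical, plain Python, not Guardian code).
    EQ_STATES = {'EARTHQUAKE', 'LARGE_EQ'}

    def manager_step(manager_state, sei_env_state, isc_lock_past_ready, revive_isc_lock):
        """Return the next H1_MANAGER state given the current node states.

        revive_isc_lock is a hypothetical callable standing in for the 'unstall'
        decorator that keeps ISC_LOCK moving past PREP_FOR_LOCKING toward READY.
        """
        if manager_state == 'RELOCKING':
            # Jump to WAITING only if an earthquake is in progress and locking has not
            # yet progressed past READY (a harmless EQ shouldn't stop a good attempt).
            if sei_env_state in EQ_STATES and not isc_lock_past_ready:
                return 'WAITING'
            return 'RELOCKING'
        if manager_state == 'WAITING':
            # Hold ISC_LOCK at READY and revive it so it doesn't stall in PREP_FOR_LOCKING.
            revive_isc_lock()
            # Leave WAITING once SEI_ENV has left its earthquake states.
            if sei_env_state not in EQ_STATES:
                return 'RELOCKING'
            return 'WAITING'
        return manager_state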
I have put together an MEDM for the FCES (HAM8 shack) FMCS. It can be opened from the FMCS Overview MEDM.
On a related note, I have added a button on the Overview to run a script which restores the FMCS EPICS alarm settings.
We are now in day 113 of O4 and we have not had any spontaneous IPC receive errors on any model throughout this time.
During Tuesday maintenance this week I forgot to issue a DIAG_RESET on h1oaf after the pem models were restarted, and therefore it is showing latched IPC errors from this time which I just noticed today.
To elevate the visibility of latched transient IPC errors, I have added a new block on the CDS overview which will turn yellow if the model has a latched IPC error. This block does not differentiate between IPC type (shared-memory, local-dolphin, x-arm, y-arm). The new block is labeled lower case "i". Clicking on this block opens the model's IPC channel table.
The upper case "I" block remains as before; it turns red if there are any ongoing IPC errors (reported as a bit in the model's STATE_WORD).
To make space for this new block (located at the end by the CFC) I have reduced the width of the DAQ-STAT and DAQ-CFC triangles to the same width as the blocks (10 pixels).
I have added a legend to the CDS Overview, showing what all the model status bits mean.
Clicking on the Legend button opens DCC-T2300380 pdf using the zathura image viewer.
Thu Sep 14 10:07:22 2023 INFO: Fill completed in 7min 18secs
Gerardo confirmed a good fill curbside.
TITLE: 09/14 Day Shift: 15:00-23:00 UTC (08:00-16:00 PST), all times posted in UTC
STATE of H1: Observing at 145Mpc
OUTGOING OPERATOR: Ryan S
CURRENT ENVIRONMENT:
SEI_ENV state: SEISMON_ALERT
Wind: 1mph Gusts, 0mph 5min avg
Primary useism: 0.03 μm/s
Secondary useism: 0.12 μm/s
QUICK SUMMARY: Locked for 9 hours. A M5.2 earthquake from Colombia is starting to roll through and elevate the 30-100 mHz ground motion.
TITLE: 09/14 Eve Shift: 23:00-07:00 UTC (16:00-00:00 PST), all times posted in UTC
STATE of H1: Observing at 142Mpc
INCOMING OPERATOR: Ryan S
SHIFT SUMMARY:
IFO is in NLN and OBSERVING as of 06:04 UTC
Lock acquisition was fully automatic, but had to go through PRMI and MICH_FRINGES.
3 IY SDF diffs accepted as per Jenne's instruction (screenshotted)
Lockloss Investigation (alog 72875)
“EX” Saturation one second before the lockloss.
The lockloss happened at 04:45:48 UTC. In order to find the so-called “first cause”, I realized that there was an EX saturation at 04:45:47. I went and checked the L3 actuators since those were the highest (blinking yellow) on the saturation monitor.
Upon trending this lockloss, I found that there was indeed a very big actuation at 04:45:47, at around the same time for all 4 ESD quadrants. The fact that they were also in the same millisecond tells me that they were all caused by something else, common to all four. (Screenshot "EXSaturation")
More curious, however, is that there was a preliminary negative "kick" about half a second before (1.5 seconds pre-lockloss, at 04:45:46.5 UTC). This wasn't a saturation but contributes to instability in EX (I'm assuming). This "kick" did not happen regularly before the saturation, so I think it may be relevant to the whole mystery. These also all happened at the same time (to the nearest ms) and all had the same magnitude of ~-330,000 cts. I think that this kick half a second earlier was caused by something else, since it's the same magnitude and time for all 4.
It's worth noting that this preliminary kick was orders of magnitude larger (in the negative direction) than the EX activity leading up to it (Screenshot "EXSaturation2"), and the saturation was many orders of magnitude higher than that one, going up to 1×10^11 counts.
It is equally worth noting that EX has saturated 5 separate times since the beginning of the shift:
Now the question becomes: was it another stage of “EX” or was it something else?
First, we can trend the other stages and see if there is any before-before-before lockloss behavior. (This becomes relevant later so I’ll leave this investigation for now).
Second, we can look at other fun and suspicious factors:
This leads me to believe that whatever caused the glitch behavior may be related to the EX saturation, which begs the question: Have the recent EX saturations all prompted these BLRM all-bands glitches? Let’s find out.
Matching the EX saturation timestamps above:
Looking at the saturation 12 mins before, there was indeed a ring up, this time with 20-34 Hz band being first and highest in magnitude (Screenshot "EXSaturation4"). It’s worth saying that this rang up much less (I had to zoom in by many orders of magnitude to see the glitch behavior), which makes sense because it didn’t cause a lockloss but also tells us that these locklosses are particularly aggressive if:
Anyway, all EX saturations during this shift caused this behavior in the BLRMs screen. All of the non-lockloss-causing ones were about 1×10^8 times lower in magnitude than this one.
This all isn't saying much other than that an EX ring-up will show up in DARM.
But, now we have confirmed that these glitches seem to be happening due to EX saturations so let’s find out if this is actually the case. So far, we know that a very bad EX saturation happens, the BLRMs screen lights up, and then we lose lock. This splits up our question yet again:
Does something entirely different (OMC perhaps) cause EX to saturate or is the saturation happening and caused within another EX stage? (Our “first” from before) We can use our scope archeology to find out more.
But sadly, not today because it’s close to 1AM now - though the investigation would continue as such:
I will continue investigating this in tomorrow’s shift.
P.S. If my reasoning is circular or I’m missing something big, then burst my bubble as slowly as possible so I could maximize the learning (and minimize the embarrassment)
LOG:
None
Lockloss due to another of the same type of Glitch as yesterday's Lockloss Alog 72852 (and less than 20 minutes apart too). Attempting to re-lock automatically now while investigating the cause of the glitch.
Investigation in Ibrahim's summary alog72876
The Picket Fence client was updated. This new version points at a server with lower latency.
It also fixes some bugs, and reports the current time and start time of the service.
I merged this into the main code.
Thank you Erik!
Between 09/04 00:48 and 00:50 UTC the TCSX CO2 laser lost lock and pushed us out of Observing (attachment 1). There were two SDF diffs, for TCS-ITMX_CO2_CHILLER_SERVO_GAIN and TCS-ITMX_CO2_PZT_SERVO_GAIN (attachment 2), both of which resolved themselves as the laser locked back up. Currently these unlocks are happening every ~2.8ish days. Jeff and Camilla noted (72627) that the ITMX TCS laser head power has been declining over the course of the past 1.5 months, and the drops in the power output line up with every time that the TCSX laser loses lock (attachment 3).
08/15 22:00UTC - TCSX Chiller Swap to spare -> ..sn813 (72220)
08/16 08:30UTC TCSX unlock
08/18 13:30UTC ''
08/23 08:42UTC ''
08/26 09:46UTC ''
08/29 08:27UTC ''
09/01 01:44UTC ''
09/04 00:48UTC ''
TJ and I noticed that since the 72220 chiller swap, the CO2 laser temperature is 0.3 °C hotter, see attached. The guardian increased the set point to relock the laser when the chiller swapped, but maybe this is not a good temperature. On the next IFO unlock we can try adjusting the CHILLER_SET_POINT lower. I did this with the old chiller in alog 71685 but it didn't help.
While the IFO was relocking, at 22:13 UTC I reduced H1:TCS-ITMX_CO2_CHILLER_SET_POINT_OFFSET by 0.3 °C, from 21.16 to 20.86 °C. This changed the laser temperature by 0.03 °C.
This is a continuation of a discussion of mis-application of the calibration model raised in LHO alog 71787, which was fixed on August 8th (LHO alog 72043), and further issues with which time varying factors (kappas) were applied while the ETMX UIM calibration line coherence was bad (see LHO alog 71790, which was fixed on August 3rd). We need to update the calibration uncertainty estimates with the combination of these two problems where they overlap.

The appropriate thing is to use the full DARM model (1/C + (A_uim + A_pum + A_tst) * D), where C is the sensing function, A_{uim,pum,tst} are the individual ETMX stage actuation transfer functions, and D is the digital DARM filters. Although, it looks like we can just get away with an approximation, which will make implementation somewhat easier.

As a demonstration of this, first I confirm I can replicate the 71787 result purely with models (no fitting). I take the pydarm calibration model response, R, and correct it for the time dependent correction factors (kappas) at the same time I took the GDS/DARM_ERR data, and then take the ratio with the same model except the 3.2 kHz ETMX L3 HFPoles removed (the correction Louis and Jeff eventually implemented). This is the first attachment.

Next we calculate the expected error just from the wrong kappas being applied in the GDS pipeline due to poor UIM coherence. For this initial look, I choose GPS time 1374369018 (2023-07-26 01:10); you can see the LHO summary page here, with the upper left plot showing the kappa_C discrepancy between GDS and the front end. Just this issue produces the second attachment.

We can then look at the effect of the 3.2 kHz pole being missing for two possibilities, the front-end kappas and the bad GDS kappas, and see that the difference is pretty small compared to typical calibration uncertainties. Here it's on the scale of a tenth of a percent at around 90 Hz. I can also plot the model with the front-end kappas (more correct at this time) over the model with the wrong GDS kappas, for a comparison in scale as well. This is the 3rd plot.

This suggests to me the calibration group can just apply a single correction to the overall response function systematic error for the period where the 3.2 kHz HFPole filter was missing, and then in addition, for the period where the UIM uncertainty was preventing the kappa_C calculation from updating, apply an additional correction factor that is time dependent, just multiplying the two. As an example, the 4th attachment shows what this would look like for GPS time 1374369018.
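For reference, the quantities named above can be written out (a restatement in LaTeX of the full DARM model and the systematic-error ratio being computed, not new modeling):

    R(f) \;=\; \frac{1}{C(f)} \;+\; \bigl[ A_{\mathrm{uim}}(f) + A_{\mathrm{pum}}(f) + A_{\mathrm{tst}}(f) \bigr]\, D(f),
    \qquad
    \eta_R(f) \;=\; \frac{R_{\mathrm{corrected}}(f)}{R_{\mathrm{applied}}(f)}

where the correction applied to the data is the ratio of the (more) correct response function to the one that was actually applied in the calibration pipeline.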
For further explanation of the impact of Frozen GDS TDCFs vs. Live CAL-CS Computed TDCFs on the response function systematic error, i.e. what Joe's saying with "Next we calculate the expected error just from the wrong kappas being applied in the GDS pipeline due to poor UIM coherence. For this initial look, I choose GPS time 1374369018 (2023-07-26 01:10 UTC), you can see the LHO summary page here, with the upper left plot showing the kappa_C discrepancy between GDS and front end. So just this issue produces the second attachment." and what he shows in his second attachment, see LHO:72812.
I've made some more clarifying plots to help me better understand Joe's work above after getting a few more details from him and Vlad.

(1) GDS-CALIB_STRAIN is corrected for time dependence, via the relative gain changes, "\kappa," as well as for the new coupled-cavity pole frequency, "f_CC." In order to make a fair comparison between the *measured* response function, the GDS-CALIB_STRAIN / DARM_ERR live data stream, and the *modeled* response function, which is static in time, we need to update the response function with the time dependent correction factors (TDCFs) at the time of the *measured* response function.

How is the *modeled* response function updated for time dependence? Given the new pydarm system, it's actually quite straightforward given a DARM model parameter set, pydarm_H1.ini, and a good conda environment. Here's a bit of pseudo-code that captures what's happening conceptually:

    # Set up environment
    from gwpy.timeseries import TimeSeriesDict as tsd
    from copy import deepcopy
    import pydarm

    # Instantiate two copies of pydarm DARM loop model
    darmModel_obj = pydarm.darm.DARMModel('pydarm_H1.ini')
    darmModel_wTDCFs_obj = deepcopy(darmModel_obj)

    # Grab time series of TDCFs
    tdcfs = tsd.get(chanList, starttime, endtime, frametype='R', verbose=True)
    kappa_C = tdcfs[chanList[0]].value
    freq_CC = tdcfs[chanList[1]].value
    kappa_U = tdcfs[chanList[2]].value
    kappa_P = tdcfs[chanList[3]].value
    kappa_T = tdcfs[chanList[4]].value

    # Multiply in kappas, replace cavity pole, with a "hot swap" of the relevant
    # parameter in the DARM loop model
    darmModel_wTDCFs_obj.sensing.coupled_cavity_optical_gain *= kappa_C
    darmModel_wTDCFs_obj.sensing.coupled_cavity_pole_frequency = freq_CC
    darmModel_wTDCFs_obj.actuation.xarm.uim_npa *= kappa_U
    darmModel_wTDCFs_obj.actuation.xarm.pum_npa *= kappa_P
    darmModel_wTDCFs_obj.actuation.xarm.tst_npv2 *= kappa_T

    # Extract the response function transfer function on your favorite frequency vector
    R_ref = darmModel_obj.compute_response_function(freq)
    R_wTDCFs = darmModel_wTDCFs_obj.compute_response_function(freq)

    # Compare the two response functions to form a "systematic error" transfer function, \eta_R.
    eta_R_wTDCFs_over_ref = R_wTDCFs / R_ref

For all of this study, I started with the reference model parameter set that's relevant for these times in late July 2023 -- the pydarm_H1.ini from the 20230621T211522Z report directory, which I've copied over to a git repo as pydarm_H1_20230621T211522Z.ini.

(2) One layer deeper, some of what Joe's trying to explore in his plots above is the difference between low-latency, GDS-pipeline-computed TDCFs and real-time, CALCS-pipeline-computed TDCFs, which differ because of the issues with the GDS pipeline computation discussed in LHO:72812. So, in order to facilitate this study, we have to gather TDCFs from both pipelines. Here's the channel list for both:

    chanList = ['H1:GRD-ISC_LOCK_STATE_N',
                'H1:CAL-CS_TDEP_KAPPA_C_OUTPUT',
                'H1:CAL-CS_TDEP_F_C_OUTPUT',
                'H1:CAL-CS_TDEP_KAPPA_UIM_REAL_OUTPUT',
                'H1:CAL-CS_TDEP_KAPPA_PUM_REAL_OUTPUT',
                'H1:CAL-CS_TDEP_KAPPA_TST_REAL_OUTPUT',
                'H1:GDS-CALIB_KAPPA_C',
                'H1:GDS-CALIB_F_CC',
                'H1:GDS-CALIB_KAPPA_UIM_REAL',
                'H1:GDS-CALIB_KAPPA_PUM_REAL',
                'H1:GDS-CALIB_KAPPA_TST_REAL']

where the first channel in the list is the state of the detector lock acquisition guardian, for useful comparison.

(3) Indeed, for *most* of the above aLOG, Joe chooses an example of times when the GDS and CALCS TDCFs are *the most different* -- in his case, 2023-07-26 01:10 UTC (GPS 1374369018) -- when the H1 detector is still thermalizing after power up.
They're *different* because the GDS calculation was frozen at the values they had on the day that the calculation was spoiled by a bad MICH FF filter, 2023-08-04 -- and importantly when the detector *was* thermalized. An important distinction that's not made above is that the *measured* data in his first plot is from LHO:71787 -- a *different* time, when the detector WAS thermalized, a day later -- 2023-07-27 05:03:20 UTC (GPS 1374469418). Compare the TDCFs between the NOT THERMALIZED time, 2023-07-26 (first attachment here), and the 2023-07-27 THERMALIZED time (first attachment I recently added to Vlad's LHO:71787). One can see that in the 2023-07-27 THERMALIZED data, the Frozen GDS and Live CALCS TDCF answers agree quite well. For the NOT THERMALIZED time, 2023-07-26, \kappa_C, f_CC, and \kappa_U are quite different.

(4) So, let's compare the response function ratio, i.e. the systematic error transfer function ratio, between the response function updated with GDS TDCFs vs. CALCS TDCFs for the two different times -- thermalized vs. not thermalized. This will be an expanded version of Joe's second attachment:

- 2nd attachment here: this exactly replicates Joe's plot, but shows more ratios to better get a feel for what's happening. Using the variables from the pseudo-code above, I'm plotting
  :: BLUE = eta_R_wTDCFs_CALCS_over_ref = R_wTDCFs_CALCS / R_ref
  :: ORANGE = eta_R_wTDCFs_GDS_over_ref = R_wTDCFs_GDS / R_ref
  :: GREEN = eta_R_wTDCFs_CALCS_over_R_wTDCFs_GDS = R_wTDCFs_CALCS / R_wTDCFs_GDS
  where the GREEN trace is showing what Joe showed -- both as the unlabeled BLUE trace in his second attachment, and as the "FE kappa true R / applied bad kappa R" GREEN trace in his third attachment -- the ratio between response functions, one updated with CALCS TDCFs and the other updated with GDS TDCFs, for the NOT THERMALIZED time.
- 3rd attachment here: this replicates the same traces, but with the TDCFs from Vlad's THERMALIZED time.

For both Joe's and my plots, because we think that the CALCS TDCFs are more accurate, and it's tradition to put the more accurate response function in the numerator, we show it as such. Comparing the two GREEN traces from my plots, it's much more clear that the difference between GDS and CALCS TDCFs is negligible for THERMALIZED times, and substantial during NOT THERMALIZED times.

(4) Now we bring in the complexity of the missing 3.2 kHz ESD pole. Unlike the "hot swap" of TDCFs in the DARM loop model, it's a lot easier just to create an "offline" copy of the pydarm parameter file with the ESD poles removed. That parameter file lives in the same git repo location, but is called pydarm_H1_20230621T211522Z_no3p2k.ini. So, with that, we just instantiate the model in the same way, but calling the different parameter file:

    # Set up environment
    # Instantiate two copies of pydarm DARM loop model
    darmModel_obj = pydarm.darm.DARMModel('pydarm_H1_20230621T211522Z.ini')
    darmModel_no3p2k_obj = pydarm.darm.DARMModel('pydarm_H1_20230621T211522Z_no3p2k.ini')

    # Extract the response function transfer function on your favorite frequency vector
    R_ref = darmModel_obj.compute_response_function(freq)
    R_no3p2k = darmModel_no3p2k_obj.compute_response_function(freq)

    # Compare the two response functions to form a "systematic error" transfer function, \eta_R.
    eta_R_nom_over_no3p2k = R_ref / R_no3p2k

where here, the response function without the 3.2 kHz pole is less accurate, so R_no3p2k goes in the denominator. Without any TDCF correction, I show this eta_R_nom_over_no3p2k compared against Vlad's fit from LHO:71787 for starters.
(5) Now for the final layer of complexity: we need to fold in the TDCFs. This is where I think a few more traces and plots are needed, comparing the two times, THERMALIZED vs. NOT, plus some clear math, in order to explain what's going on. In the end, I reach the same conclusion as Joe, that the two effects -- fixing the Frozen GDS TDCFs and fixing the 3.2 kHz pole -- are "separable" to good approximation, but I'm slower than Joe is, and need things laid out more clearly.

So, on the pseudo-code side of things, we need another couple of copies of the darmModel_obj:
- with and without the 3.2 kHz pole,
- with TDCFs from CALCS and GDS,
- from the THERMALIZED (LHO:71787) and NOT THERMALIZED (LHO:72622) times:

    R_no3p2k_wTDCFs_CCS_LHO71787 = darmModel_no3p2k_wTDCFs_CCS_LHO71787_obj.compute_response_function(freq)
    R_no3p2k_wTDCFs_GDS_LHO71787 = darmModel_no3p2k_wTDCFs_GDS_LHO71787_obj.compute_response_function(freq)
    R_no3p2k_wTDCFs_CCS_LHO72622 = darmModel_no3p2k_wTDCFs_CCS_LHO72622_obj.compute_response_function(freq)
    R_no3p2k_wTDCFs_GDS_LHO72622 = darmModel_no3p2k_wTDCFs_GDS_LHO72622_obj.compute_response_function(freq)

    eta_R_wTDCFS_over_R_wTDCFs_no3p2k_CCS_LHO71787 = R_wTDCFs_CCS_LHO71787 / R_no3p2k_wTDCFs_CCS_LHO71787
    eta_R_wTDCFS_over_R_wTDCFs_no3p2k_GDS_LHO71787 = R_wTDCFs_GDS_LHO71787 / R_no3p2k_wTDCFs_GDS_LHO71787
    eta_R_wTDCFS_over_R_wTDCFs_no3p2k_CCS_LHO72622 = R_wTDCFs_CCS_LHO72622 / R_no3p2k_wTDCFs_CCS_LHO72622
    eta_R_wTDCFS_over_R_wTDCFs_no3p2k_GDS_LHO72622 = R_wTDCFs_GDS_LHO72622 / R_no3p2k_wTDCFs_GDS_LHO72622

Note, critically, that these ratios of with and without the 3.2 kHz pole -- both updated with the same TDCFs -- are NOT THE SAME THING as just the ratio of models updated with GDS vs. CALCS TDCFs, even though it might look like the "reference" and "no 3.2 kHz pole" terms might cancel "on paper," if one naively thinks that the operation is separable:

    [[ ( R_wTDCFs_CCS / R_ref )*( R_ref / R_no3p2k ) ]] / [[ ( R_wTDCFs_GDS / R_ref )*( R_ref / R_no3p2k ) ]]   # NAIVE

which one might naively cancel terms to get down to

    [[ R_wTDCFs_CCS ]] / [[ R_wTDCFs_GDS ]]   # NAIVE

So, let's look at the answer now, with all this context.

- NOT THERMALIZED: This is a replica of what Joe shows in the third attachment for the 2023-07-26 time:
  :: BLUE -- the systematic error incurred from excluding the 3.2 kHz pole on the reference response function, without any updates to TDCFs (eta_R_nom_over_no3p2k)
  :: ORANGE -- the systematic error incurred from excluding the 3.2 kHz pole on the CALCS-TDCF-updated, modeled response function (eta_R_wTDCFS_over_R_wTDCFs_no3p2k_CCS_LHO72622, Joe's "FE kappa true R / applied R (no pole)")
  :: GREEN -- the systematic error incurred from excluding the 3.2 kHz pole on the GDS-TDCF-updated, modeled response function (eta_R_wTDCFS_over_R_wTDCFs_no3p2k_GDS_LHO72622, Joe's "GDS kappa true R / applied (no pole)")
  :: RED -- compared against Vlad's *fit* of the ratio of the CALCS-TDCF-updated, modeled response function to the (GDS-CALIB_STRAIN / DARM_ERR) measured response function
  Here, because the GDS TDCFs are different than the CALCS TDCFs, you actually see a non-negligible difference between ORANGE and GREEN.

- THERMALIZED: (Same legend, but the TIME and TDCFs are different.) Here, because the GDS and CALCS TDCFs are the same-ish, you can't see that much of a difference between the two.
Also, note that even when we're using the same THERMALIZED time and corresponding TDCFs to be self-consistent with Vlad's fit of the measured response function, they still don't agree perfectly. So, there's likely still yet more systematic error going on in the thermalized time.

(6) Finally, I wanted to explicitly show the consequences of "just" correcting for the GDS TDCFs and of "just" correcting the missing 3.2 kHz pole, to be able to better *quantify* the statement that "the difference is pretty small compared to typical calibration uncertainties," as well as to show the difference between "just" the ratio of response functions updated with the different TDCFs (the incorrect model) and the "full" models. I show this in
- NOT THERMALIZED, and
- THERMALIZED.

For both of these plots, I show
:: GREEN -- the corrective transfer function we would be applying if we only update the Frozen GDS TDCFs to Live CALCS TDCFs, compared with
:: BLUE -- the ratio of corrective transfer functions,
   >> the "best we could do," updating the response with Live TDCFs from CALCS and fixing the missing 3.2 kHz pole, against
   >> only fixing the missing 3.2 kHz pole
:: ORANGE -- the ratio of corrective transfer functions,
   >> the "best we could do," updating the response with Live TDCFs from CALCS and fixing the missing 3.2 kHz pole, against
   >> the "second best thing to do," which is to leave the Frozen TDCFs alone and correct for the missing 3.2 kHz pole

Even for the NOT THERMALIZED time, BLUE never exceeds 1.002 / 0.1 deg in magnitude / phase, and it's small compared to the "TDCF only" simple correction of Frozen GDS TDCFs to Live CALCS TDCFs shown in GREEN. This helps quantify why Joe thinks we can separately apply the two corrections to the systematic error budget: because GREEN is much larger than BLUE. For the THERMALIZED time, in BLUE, that ratio of full models is even less, and, also as expected, the ratio of simple TDCF update models is also small.

%%%%%%%%%%

The code that produced this aLOG is create_no3p2kHz_syserror.py as of git hash 3d8dd5df.
Following up on this study just one step further, as I begin to actually correct data during the time period where both of these systematic errors are in play -- the frozen GDS TDCFs and the missing 3.2 kHz pole -- I craved one more set of plots to convey that "fixing the Frozen GDS TDCFs and fixing the 3.2 kHz pole are 'separable' to good approximation," showing the actual corrections one would apply in the different cases:

:: BLUE = eta_R_nom_over_no3p2k = R_ref / R_no3p2k
   >> the systematic error created by the missing 3.2 kHz pole in the ESD model alone
:: ORANGE = eta_R_wTDCFs_CALCS_over_R_wTDCFs_GDS = R_wTDCFs_CALCS / R_wTDCFs_GDS
   >> the systematic error created by the frozen GDS TDCFs alone
:: GREEN = eta_R_nom_over_no3p2k * eta_R_wTDCFs_CALCS_over_R_wTDCFs_GDS = the product of the two
   >> the approximation
:: RED = a previously unshown eta that we'd actually apply to the data that had both = R_ref (updated with CALCS TDCFs) / R_no3p2k (updated with GDS TDCFs)
   >> the right thing

As above, it's important to look at both a thermalized case and a non-thermalized case, so I attach those two: NOT THERMALIZED and THERMALIZED. The conclusions are the same as above:
- Joe is again right that the difference between the approximation (GREEN) and the right thing (RED) is small, even for the NOT THERMALIZED time.
But I think this version of the plots / traces better shows the breakdown of which effect contributes where on top of the approximation vs. "the right thing," and "the right thing" was never explicitly shown before. All the traces in my expanded aLOG, LHO:72879, had the reference model (or the no 3.2 kHz pole model) updated with either CALCS TDCFs or GDS TDCFs in both the numerator and denominator, rather than "the right thing," where you have CALCS TDCFs in the numerator and GDS TDCFs in the denominator.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

To create these extra plots, I added a few lines of "calculation" code and another 40-ish lines of plotting code to create_no3p2kHz_syserror.py. I've now updated it within the git repo, so it and the repo now have git hash 1c0a4126.
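In equation form, the comparison above is between the exact correction (RED) and the separable approximation (GREEN); restating the traces with the notation already used (R_ref is the reference model, R_no3p2k the model without the 3.2 kHz pole, and the argument denotes which TDCF set the model was updated with):

    \eta_{\mathrm{exact}}(f) \;=\; \frac{R_{\mathrm{ref}}\bigl(f;\,\kappa_{\mathrm{CALCS}}\bigr)}{R_{\mathrm{no3p2k}}\bigl(f;\,\kappa_{\mathrm{GDS}}\bigr)}
    \;\approx\;
    \underbrace{\frac{R_{\mathrm{ref}}(f)}{R_{\mathrm{no3p2k}}(f)}}_{\text{3.2 kHz pole fix}}
    \times
    \underbrace{\frac{R\bigl(f;\,\kappa_{\mathrm{CALCS}}\bigr)}{R\bigl(f;\,\kappa_{\mathrm{GDS}}\bigr)}}_{\text{TDCF fix}}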
Naoki and I unmonitored H1:SQZ-FIBR_SERVO_COMGAIN and H1:SQZ-FIBR_SERVO_FASTGAIN from syscssqz observe.snap. They have been regularly taking us out of observing (72171) by changing even when the TTFSS isn't really unlocking, see 71652. If the TTFSS really unlocks there will be other SDF diffs and the SQZ guardians will unlock.
We still plan to investigate this further tomorrow. We can monitor if it keeps happening using the channels.
Daniel, Sheila
We looked at one of these incidents to see what information we could get from the Beckhoff error checking. The attached screenshot shows that when this happened on August 12th at 12:35 UTC, the Beckhoff error code for the TTFSS was 2^20; counting down on the automated error screen (second attachment), the 20th error is "Beatnote out of range of frequency comparator". We looked at the beatnote error EPICS channel, which does seem to be well within the tolerances. Daniel thinks that the error is happening faster than it can be recorded by EPICS. He proposes that we go into the Beckhoff code and add a condition that the error condition has to be met for 0.1 s before throwing the error.
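As an illustration of that proposed persistence check, here is a minimal sketch in Python (the actual change would live in the Beckhoff/TwinCAT code; the class and names here are hypothetical, with the 0.1 s hold time taken from the proposal above):

    # Sketch only: latch an error bit only if the fault condition persists for the
    # full hold time, so transients faster than the EPICS record rate are ignored.
    HOLD_TIME = 0.1  # seconds the condition must persist before the error is raised

    class DebouncedError:
        def __init__(self, hold_time=HOLD_TIME):
            self.hold_time = hold_time
            self._since = None  # time the condition first became true, or None

        def update(self, condition_true, now):
            """Return True (raise the error) only if the condition has held for hold_time."""
            if not condition_true:
                self._since = None
                return False
            if self._since is None:
                self._since = now
            return (now - self._since) >= self.hold_time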
In the last 5 days these channels would have taken us out of observing 13 times if they were still monitored, plot attached. Worryingly, 9 of those were in the last 14 hours, see attached.
Maybe something has changed in SQZ to make the TTFSS more sensitive. The IFO has been locked for 35 hours, and sometimes we get close to the edges of our PZT ranges due to temperature drifts over long locks.
I wonder if the TTFSS 1611 PD is saturated as power from the PSL fiber has drifted. Trending RFMON and DC volts from the TTFSS PD, it looks like in the past 2-3 months, the green beatnote's demod RF MON has increased (its RF max is 7), while the bottom gray DC volts signal from the PD has flattened out around -2.3V. Also looks like the RF MON got noisier as the PD DC volts saturated.
This PD should see the 160 MHz beatnote between the PSL (via fiber) and the SQZ laser (free space). From LHO:44546, it looks like this PD "normally" would have about 360 uW on it, with 180 uW from each arm. If we trust the PD calibrations, then current PD values report ~600 uW total DC power on the 1611 PD (red), with 40 uW transmitted from the PSL fiber (green trend). Pick-offs for the remaining sqz laser free-space path (i.e. sqz laser seed/LO PDs) don't see power changes, so the saturations are unlikely to be coming from upstream sqz laser alignment. Not sure if there are some PD calibration issues going on here. In any case, all fiber PDs seem to be off from their nominal values, consistent with their drifts in the past few months.
I adjusted the TTFSS waveplates on the PSL fiber path to bring the FIBR PDs closer to their nominal values, and at least so we're not saturating the 1611. The TTFSS and squeezer locks seem to have come back fine. We can see if this helps the SDF issues at all.
These were re-monitored in 72679 after Daniel adjusted the SQZ Laser Diode Nominal Current, stopping this issue.