My detailed DQ shift report for August 27th - 30th can be found here:
https://wiki.ligo.org/DetChar/DataQuality/DQShiftLHO20150827
Highlights:
Main outstanding issues:
In addition, many glitches picked up by Omicron or similar tools could use further investigation that I did not have time to perform. I plan on following up on a few of them in the coming days.
The "noisemon pringle" actuator noise measurement from LLO alog 19853 has been calibrated to meters, and extended to the L1 and L2 stages of all test masses.
Total noise from these stages is estimated to be ~1e-20 m/rtHz around 30 Hz. This includes driver noise, DAC noise, and glitches (to the extent they were present during the measurement).
A hierarchical noise budget PDF is attached. Here's how things look for ITMX L2:
From this plot you can see the following:
The calibration of these signals is approximate, based on the coil driver and suspension models -- not the precise measurements people have been taking during the past week. There may be some discrepancies especially in the L1 (UIM) stage. The ETMY L1 plot definitely shouldn't be trusted, because its noisemons appear to be broken (see attached plot).
Scripts have been checked in to the NoiseBudget SVN.
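For reference, here is a minimal sketch of the kind of propagation those scripts perform, with made-up placeholder transfer functions and ASDs; the real numbers live in the NoiseBudget SVN and the coil driver / suspension models:

# Minimal sketch (placeholder numbers only): propagate a noisemon ASD through
# modeled driver and suspension transfer functions to test-mass displacement,
# then quadrature-sum the stages. Not the actual NoiseBudget SVN code.
import numpy as np

def stage_displacement(noisemon_asd_cts, driver_tf_a_per_ct, coil_n_per_a, sus_tf_m_per_n):
    """Noisemon ASD [cts/rtHz] -> displacement ASD [m/rtHz] for one stage."""
    return noisemon_asd_cts * np.abs(driver_tf_a_per_ct) * coil_n_per_a * np.abs(sus_tf_m_per_n)

freq = np.logspace(1, 3, 300)                    # 10 Hz - 1 kHz
asd_l1 = 1e-3 * np.ones_like(freq)               # placeholder noisemon ASDs [cts/rtHz]
asd_l2 = 5e-4 * np.ones_like(freq)
disp_l1 = stage_displacement(asd_l1, 1e-4, 2e-3, 1e-7 / freq**2)   # placeholder TFs
disp_l2 = stage_displacement(asd_l2, 1e-5, 2e-3, 1e-6 / freq**2)
total = np.sqrt(disp_l1**2 + disp_l2**2)         # incoherent (quadrature) sum across stages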
(All time in UTC)
15:00 Jim has been having trouble locking DRMI all night. So I work on PRMI.
15:32 Managed to get past DRMI_LOCKED; lockloss at REDUCE_CARM_OFFSET. Saturation alarm complained about PRM.
16:17 Hugh to end stations
16:29 Robert to CER and LVEA
16:40 SR3 is totally unresponsive to Guardian. See alog 21057.
DAC recalibration and DAC restart.
16:53 Hugh back
17:05 Robert back
17:45 Begin initial alignment after several locklosses at random places.
burtrestored the alignment to last night's lock
18:20 Karen/crew to change room and optics lab for cleaning.
19:36 Since there's an operator training going on, Evan insisted I switch the Operating Mode to TRAIN.
21:39 Jeff/Evan to restart PR3, PRM satellite amplifier.
22:00 Jeff back
22:26 Gerardo to LVEA hunting for parts.
22:28 Jeff K. brings IMC_LOCK Guardian to OFFLINE
22:39 Gerardo back
This afternoon, Richard and Fil tried various hardware diagnostics of the cabling and boxes out on the floor for the shared PRM/PR3 chain noise. While swapping sat boxes and reterminating cables a few times, the PRM RT TOP BOSEM flopped between flatlining and being super noisy multiple times after they made connections. There was additional confusion from the fact that the PRM misaligned state moves the PRM a huge amount in yaw, sending the RT BOSEM signal to near zero - we thought this meant it had "died" a few times during our troubleshooting. Long story short, the troubleshooting of the electronics this afternoon didn't seem to improve the noise much.
To be continued...
C. Cahillane, D. Tuyenbayev, E. Hall

I have discovered the difference between Plot 1 and Plot 7 in aLOG 20974. To recap, Plot 1 was strain uncertainty calculated from actual ER7 data, and Plot 7 was strain uncertainty calculated from sensing, digital, and actuation transfer functions only.

One issue was that the dewhitening filters were incorrect. Evan taught me to locate the correct filters based on the GPS times and channel names. (They can be found in /opt/rtcds/lho/h1/chans/filter_archive/h1calcs/; the shortcut terminal command is "chans".) Another issue was that I was using only the ASD of DARM_ERR and DARM_CTRL for my calculations and ignoring the phase information. I have corrected this in my code, but not yet in my uncertainty document T1400586.

Below I have replotted the carpet plots from aLOG 20974 in the same order. You can now see very good agreement between New Plot 1 and New Plot 7.

The next step is to discover why the data is so glitchy at high frequency. I believe it is due to interpolation of the data to fit our ER7 transfer function frequency vector. Darkhan has been working on the ER8 DARM model and has made every element an LTI object. This will make it very easy to evaluate the ER8 transfer functions on any frequency vector, so that is the next step for this project. The step after that is to begin calculating the error bars themselves. I will have to revisit my Mathematica notebooks to recalculate the uncertainty in strain magnitude σ_{|h|} and strain phase σ_{φ_h} with the phase information from DARM_ERR and DARM_CTRL included.
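As an illustration of the two fixes (keeping the phase of DARM_ERR/DARM_CTRL, and evaluating model elements on an arbitrary frequency vector via LTI objects), here is a minimal sketch with placeholder poles/zeros and spectra; it is not the actual ER8 DARM model:

# Hedged sketch: evaluate placeholder sensing/actuation LTI objects on an
# arbitrary frequency vector, then combine DARM_ERR and DARM_CTRL as complex
# quantities rather than magnitude-only ASDs. All numbers are placeholders.
import numpy as np
from scipy import signal

# Placeholder LTI objects standing in for the sensing and actuation paths
sensing   = signal.ZerosPolesGain([], [-2*np.pi*389.0], 1e6)
actuation = signal.ZerosPolesGain([], [-2*np.pi*1.0, -2*np.pi*1.0], 1e-3)

freq = np.logspace(0, 4, 1000)                   # any frequency vector we like
_, C = signal.freqresp(sensing,   w=2*np.pi*freq)
_, A = signal.freqresp(actuation, w=2*np.pi*freq)

# Placeholder complex spectra for DARM_ERR and DARM_CTRL
rng = np.random.default_rng(0)
derr  = rng.standard_normal(freq.size) + 1j*rng.standard_normal(freq.size)
dctrl = rng.standard_normal(freq.size) + 1j*rng.standard_normal(freq.size)

h_complex = derr / C + A * dctrl                  # keeps the relative phase
h_magonly = np.abs(derr / C) + np.abs(A * dctrl)  # magnitude-only, phase thrown away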
Noticed HVE-EX:75IPBSC9_511LOG red on the vacuum main screen, the ion pump appears to have railed around 12:51 pm today. I drove to End X and found the ion pump connected and powered on, controller shows all LEDs on.
Attached is 1 day trend data.
J. Kissel, Nutsinee
Dan found glitches in SR3 T2 last night, and it seems like we have been having trouble locking ever since, so we investigated. The SR3 T2 problem started right before the Aug 31 09:23 lockloss and never went away. The DAC recalibration this morning didn't solve the issue, and the power glitch didn't cause it. The noisy/glitchy feature looks just like what happened before the drive swap back on Aug 18.
We brought SR3 to SAFE for a few minutes. The width of the noise still looks the same, but the glitches were gone. This implies that the glitches come from the actuator. These glitches don't seem to appear in the MASTER_OUT or NOISEMON channels, but they do appear in the witness channels. The first plot shows SR3 T2 before we brought it to SAFE, the second plot is after SAFE, and the third plot is 15 minutes before the 09:23 lockloss.
The LHO SITEMAP MEDM was modified to add two new MAIN categories: generic seismic (SEI) and Observing Run related screens (O-1).
I have added Hugo's generic SEI_SAT_OVERVIEW to the SEI button. I have added my new H1CDS_O1_OVERVIEW_CUST to the O-1 list. The new CDS OVERVIEW MEDM must always be GREEN. Any deviation from the O1 configuration will show as RED. Hopefully nothing will be red for long on this screen during O1.
h1nds1 (the default NDS) daqd process died this afternoon. This was a more drastic failure: the computer locked up and had to be reset. The last messages in the log files differ between yesterday and today:
Sunday 30 Aug 15:29:
Retry expired; requesting again
....
Packet buffer is full
Monday 31 Aug 15:58:
[date-time] Ask for retransmission of 15 packets; port 7097
....
Have to skip 15 packets (packet buffer limit exceeded)
The times of the restarts of h1sush2a and h1sush56 are summarized below.
2015_08_31 09:13 h1iopsush2a
2015_08_31 09:15 h1susmc1
2015_08_31 09:15 h1susmc3
2015_08_31 09:15 h1suspr3
2015_08_31 09:15 h1susprm
2015_08_31 10:09 h1iopsush56
2015_08_31 10:09 h1susomc
2015_08_31 10:09 h1sussr3
2015_08_31 10:09 h1sussrm
For the record, the above-mentioned DAC recalibrations did NOT solve any of the problems that have reared up over the weekend. I can, however, report that the auto-calibration was successful for all DAC cards that were restarted.

The 3rd DAC card's calibration on h1sush2a succeeded slowly, as it has done previously both on Aug 03 2015 (LHO aLOG 20165) and the time prior, Jun 09 2015 (LHO aLOG 19030). As reported before, this DAC card controls PRM M1 RT and SD; the last six channels are PR3 M1 T1, T2, T3, LF, RT, SD. Also as reported before, we don't know what this means or whether it is significant. HOWEVER, according to the tests we have done, this DAC card being slow is merely coincidental with the problems we've been having with the PRM LF RT and PR3 T1 T2 noise found on those OSEM sensors. We've confirmed this by measuring the ASD of the OSEM sensors (as Evan has done in LHO aLOG 21056) with the suspension in SAFE (i.e. no requested drive) and found the noise as expected. We then switched the TEST/COIL enable switch to remove the DAC's ability to drive by removing the DAC input to the coil driver. The noise remained. The investigation continues...

For h1sush2a (which houses MC1, MC3, PRM, and PR3):
[8808794.054622] h1iopsush2a: DAC AUTOCAL SUCCESS in 5344 milliseconds
[8808799.414955] h1iopsush2a: DAC AUTOCAL SUCCESS in 5344 milliseconds
[8808806.438391] h1iopsush2a: DAC AUTOCAL SUCCESS in 6572 milliseconds
[8808811.798720] h1iopsush2a: DAC AUTOCAL SUCCESS in 5344 milliseconds
[8808817.586599] h1iopsush2a: DAC AUTOCAL SUCCESS in 5344 milliseconds
[8808822.947012] h1iopsush2a: DAC AUTOCAL SUCCESS in 5345 milliseconds
[8808828.307368] h1iopsush2a: DAC AUTOCAL SUCCESS in 5344 milliseconds
The last time these DACs were recalibrated was Aug 03 2015.

For h1sush56 (which houses SRM, SR3, and OMC):
[5345389.136545] h1iopsush56: DAC AUTOCAL SUCCESS in 5333 milliseconds
[5345394.496918] h1iopsush56: DAC AUTOCAL SUCCESS in 5344 milliseconds
[5345400.284896] h1iopsush56: DAC AUTOCAL SUCCESS in 5341 milliseconds
[5345405.645225] h1iopsush56: DAC AUTOCAL SUCCESS in 5344 milliseconds
[5345411.433249] h1iopsush56: DAC AUTOCAL SUCCESS in 5341 milliseconds
The last time these DACs were recalibrated was Aug 18 2015 (LHO aLOG 20631).
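For what it's worth, picking the slow card out of kernel-log lines of that form is easy to automate; a small sketch (the 10% threshold here is an arbitrary illustrative choice):

# Hedged sketch: flag DAC cards whose autocal ran notably slower than the rest,
# from lines like
#   [8808806.438391] h1iopsush2a: DAC AUTOCAL SUCCESS in 6572 milliseconds
import re

autocal_re = re.compile(r"DAC AUTOCAL SUCCESS in (\d+) milliseconds")

def slow_cards(log_lines, margin=1.10):
    times = [int(m.group(1)) for line in log_lines
             if (m := autocal_re.search(line))]
    if not times:
        return []
    fastest = min(times)
    # Cards are indexed in the order their autocal messages appear
    return [(i, t) for i, t in enumerate(times) if t > margin * fastest]

# e.g. slow_cards(open("/var/log/dmesg").readlines()) -> [(2, 6572)]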
As per alog 21036, Evan dropped the absurd precision in the WFS_offset_* scripts which set the dark offsets. I reran the script now since the IFO is down, confirmed that none of the 80 offsets changed by very much, and then accepted them in SDF. Now at only 4 significant digits, SDF was able to write them into the SAFE.snap when I hit ACCEPT ALL.
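For reference, the change amounts to trimming each offset to 4 significant figures before it is written; a minimal sketch (not the actual WFS_offset_* code):

# Hedged sketch: round a dark-offset value to 4 significant figures so SDF
# can store it. Values below are just examples.
import math

def round_sig(x, sig=4):
    """Round x to `sig` significant figures."""
    if x == 0:
        return 0.0
    return round(x, sig - 1 - int(math.floor(math.log10(abs(x)))))

print(round_sig(-0.012345678901234))   # -0.01235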
ER8 Day 14
model restarts logged for Sun 30/Aug/2015
2015_08_30 15:30 h1nds1
2015_08_30 15:31 h1nds1
both restarts unexpected.
J. Kissel, K. Izumi, S. Dwyer, N. Kijbunchoo

While trying to bring the h1sush56 SUS (SR3, SRM, and OMC) to safe for DAC calibration of all cards in that chassis (see LHO aLOG 21048), we had great trouble with the SUS_SR3 guardian node. This node is not managed, but it was entirely unresponsive to requests to change its state, to load, to pause, to stop, anything. Sadly, this was all true and the guardian node screen did *not* turn red to indicate it was in error. Looking at the guardian log, there was a message:
epicsMutex pthread_mutex_unlock failed: error Invalid argument
epicsMutexOsdUnlockThread _main_ (0x3e7a0f0) can't proceed, suspending.
This is after requesting the ISC_LOCK guardian to DOWN. Note, we also had the SR3_CAGE_SERVO guardian still running because, though it is managed by the ISC_LOCK guardian, ISC_LOCK does not turn it OFF in the DOWN state. Probably not the issue, but it gathered our attention because it had gone nuts and was driving the M2 stage into constant saturation. See the first attachment for a screenshot of the broken situation. For the record, this has happened on a smattering of guardian nodes in the past; see LHO aLOGs 17154 and 16967.
-----------------------
Here's what we did to solve the problem:
- Tried to restart the guardian node from a workstation, guardctrl restart SUS_SR3. No success.
- Tried to destroy the guardian node from a workstation, guardctrl destroy SUS_SR3. No success. Both report
stopping node SUS_SR3...
timeout: run: SUS_SR3: (pid 3042) 23232015s, want down, got TERM
(see second attachment).
- Logged into the guardian machine, h1guardian0, and began to kill the process IDs found on the machine related to SUS_SR3 (see third attachment).
- Curiously, as I killed the first two processes, runsv SUS_SR3 and svlogd -ttt /var/log/guardian/SUS_SR3, as soon as I killed the latter the guardian log came alive again and the SUS_SR3 node became responsive.
- At Kiwamu's recommendation, I killed all processes simultaneously, destroyed the node, and restarted the node, just to make sure that all the bad joojoo had been cleared up.
----------------
The problem is now solved, and we've moved on. Unsure what to do about this one in the future other than the same successively aggressive restarting techniques...
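For future reference, the escalation we followed could be scripted roughly as below. The guardctrl commands are the ones quoted above; the ssh/pkill step and its pattern match are assumptions for illustration, not a vetted procedure:

# Hedged sketch of the escalating restart sequence, wrapped in Python for
# illustration only.
import subprocess

NODE = "SUS_SR3"

def try_cmd(*cmd, timeout=30):
    """Run a command; return True if it exits cleanly within the timeout."""
    try:
        return subprocess.run(cmd, timeout=timeout).returncode == 0
    except subprocess.TimeoutExpired:
        return False

if not try_cmd("guardctrl", "restart", NODE):
    if not try_cmd("guardctrl", "destroy", NODE):
        # Last resort: kill the node's processes on the guardian machine
        # (assumed pattern match on the node name), then destroy and restart.
        try_cmd("ssh", "h1guardian0", f"pkill -f {NODE}")
        try_cmd("guardctrl", "destroy", NODE)
        try_cmd("guardctrl", "restart", NODE)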
This is a follow-up on alogs 20541 and 20717.
I took some time to study Robert's noise injections in more detail. The results are attached below. Quick conclusion: human jumps in the change room and sudden car braking near the 2k electronics building and high bay seem to couple into DARM.
At the request of Richard M and Nutsinee, I restarted the models on h1sush2a to force a recalibration of the 18-bit DAC cards. All cards reported successful calibration.
Nairwita Mazumder, Rich Abbott

A few days back Jim noticed (alog ) that the "bumbling line", which varies over a large frequency range, is back on the ETMX seismic channels. It was first noticed in March, disappeared before ER7, and was seen again from 4th August. One can see the lines in all the horizontal and vertical sensors on ETMX. I have attached a pdf containing some follow-up work done during Rich's recent visit to LHO.

The first plot in the pdf is the spectrogram of the ETMX GS13 on 26th August. It can be seen that there are multiple wandering lines having a fixed offset. We suspected that some magnetometers at End X might be the culprit (as we could not find any correlation between temperature fluctuations and the line). The second and third plots are the spectra of H1:PEM-EX_MAG_EBAY_SEIRACK_Z_DQ and H1:ISI-ETMX_ST2_BLND_Z_GS13_CUR_IN1_DQ for 2nd August and 26th August respectively. The red trace is for 2nd August, when the bumbling line could not be found, and the blue one is the recent data (26th August). It is clear that the peaks appearing on ISI-ETMX_ST2_BLND_Z_GS13 after 3rd August are correlated with the peaks in the spectrum of the SEIRACK magnetometer (which also appeared around the same time).

The plots on the second page show the coherence between the GS13 and the magnetometers in the VEA and SEIRACK. It looks like the magnetometer on the SEI rack has stronger coherence with the GS13 sensors than the magnetometer located in the VEA. I have marked two points (blue and red crosses) in the coherence plots to highlight two of the many peaks.
Adding to Nairwita's comments, the signal seen in the GS13 spectra is also present in the magnetometer data. This being the case, it's most likely that the harmonic series points to an electromagnetic artifact associated with the HEPI pump variable frequency drive. The fact that the same signature does not exist at the other end station (I assume this to be true, but have not verified) may point to an enhanced susceptibility in the X-end electronics for some reason. No reason to panic much yet, but duly noted.
I have attached the coherence plots computed between PEM-EX_MAG_SEIRACK and the GS13, ST1 CPS, and ST2 CPS over the frequency range 0.4 Hz - 900 Hz to check the following two points: (1) whether there exists any coherence between the CPS and the magnetometer at frequencies above 256 Hz, and (2) what the low-frequency behavior is. It can be seen that the coherence between the CPS and the magnetometer above ~25 Hz is pretty low compared to the GS13, but they have relatively high coherence with PEM-EX_MAG_SEIRACK near 20 Hz.
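For anyone who wants to reproduce these, a minimal sketch of the coherence calculation over 0.4 - 900 Hz; how the time series are fetched is site-specific, so placeholder arrays are used here:

# Hedged sketch: magnetometer-to-GS13 coherence over 0.4-900 Hz.
import numpy as np
from scipy import signal

fs = 4096.0                            # assumed common sample rate after resampling
mag  = np.random.randn(int(600*fs))    # placeholder for PEM-EX_MAG_EBAY_SEIRACK_Z_DQ
gs13 = np.random.randn(int(600*fs))    # placeholder for ISI-ETMX_ST2_BLND_Z_GS13_CUR_IN1_DQ

# ~0.25 Hz resolution so the 0.4 Hz lower edge is resolved
f, coh = signal.coherence(mag, gs13, fs=fs, nperseg=int(4*fs))
band = (f >= 0.4) & (f <= 900.0)
f, coh = f[band], coh[band]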
Kiwamu, Nutsinee
As we were trying to relock the IFO after several locklosses due to high wind (50 mph), we noticed the sideband signals wiggled a lot before another lockloss at DC_READOUT (wind speed ~35-40 mph). We found coherence between POP18, POP19, POP_A_LF, AS90 and PRM, SRM, BS, which indicates that the DRMI was unstable. The BS ISI Windy blends weren't turned on.
One of the two locklosses seemed to be associated with PRM saturation. We heard the saturation alarm voice pointing to the PRM DAC multiple times in full lock before the lockloss in NOMINAL_LOWNOISE. I am not sure if this is the direct cause, but as shown in the attachment, PRM had been experiencing a ~20 sec oscillation in the longitudinal direction, which used to be a big issue in the past (alog 19850). At that point the wind was around ~40 mph on average. Also, I attach the spectrum of each coil on the M3 stage. It is clear that the components below 0.1 Hz are using up the DAC range when the wind is high.
Just as a check, I remade Kiwamu's plot for PRM, SRM, and MC2, with all the stages that are used for actuation.
At this point, the wind in the corner station varied between 3 and 13 m/s. The 30 mHz – 100 mHz BLRMSs were about 0.02 µm/s in the CS Z (consistent with sensor noise), 250 µm/s for EX X, and 250 µm/s for EY Y.
Since this time, we have increased the offloading of PRM and SRM to M1 by a factor of 2, but we probably need an even higher crossover in order to avoid saturation during these times. It may have the added benefit of allowing us to stay locked during even windier times. Additionally, MC2 does not look like it needs any work on its crossovers in order to avoid saturation.
The above comment should say 0.25 µm/s for EX X and EY Y.
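For reference, a minimal sketch of how a 30 mHz – 100 mHz BLRMS like the ones quoted above can be computed; the data and sample rate below are placeholders, not the SEI BLRMS infrastructure itself:

# Hedged sketch: 30-100 mHz band-limited RMS (BLRMS) of a ground-motion channel.
import numpy as np
from scipy import signal

fs = 16.0                                    # assumed sample rate [Hz]
x = np.random.randn(int(3600*fs))            # placeholder for a ground-velocity channel

sos = signal.butter(4, [0.03, 0.1], btype="bandpass", fs=fs, output="sos")
xb = signal.sosfiltfilt(sos, x)              # zero-phase band-pass, 30-100 mHz

# RMS over 60 s stretches gives a BLRMS time series
n = int(60*fs)
blrms = np.sqrt(np.mean(xb[: len(xb)//n*n].reshape(-1, n)**2, axis=1))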
J. Kissel, K. Kawabe, S. Karki

We've broken observation mode so that we can enable the DAC DuoTone timing readbacks on the front ends that are responsible for DARM control, i.e. h1lsc0, h1susex, and h1susey. We needed to take the IFO down for this because the last channel on the first DAC cards for the end-station SUS is used for top-mass OSEMs for damping the suspensions. If the damping loops get two sine waves at 960 and 961 Hz instead of the requested control signal for one of the OSEMs, then we get bad news.

Here are the times when the DAC DuoTone switches were ON for the following front ends:
h1susex and h1susey --- 19:04 to 20:04 UTC (12:04 to 13:04 PDT)
h1lsc0 --- 19:16 to 20:06 UTC (12:16 to 13:04 PDT)

Though all relevant channels (ADC_0_30, ADC_0_31, DAC_0_15) are free on the h1lsc0 front end, we elected to turn the DAC DuoTone off so that we aren't in danger of an oscillatory analog voltage being sent around the IO chassis that's used to measure the OMC DCPDs. Data and analysis to come.

The IFO will be staying down for a few hours while we finish up some electronics-chain characterization of the OMC DCPD analog electronics (along with some other parasitic commissioning measurements).
I showed Sudarshan which signal to look at and how to analyze them. He will make an awesome drawing of how things are connected up in this alog.
The first and second attachments show the duotone timing of the signals pulled from the IOP channels (all 64 kHz). The results are summarized in the following table.
Measurement time (UTC)  | IOP            | ADC0 Ch31 (direct) (us) | ADC0 Ch30 (loopback) (us) | Round trip (us)
27/08/2015 19:16:11.0   | LSC0           | 7.34                    | 83.78                     | 76.44
                        | SUS_EX         | 7.25                    | 68.90                     | 61.65
                        | SUS_EY         | 7.26                    | 68.93                     | 61.67
27/08/2015 22:32:20.0   | ISC_EX (PCALX) | 7.32                    | 68.93                     | 61.61
                        | ISC_EY (PCALY) | 7.26                    | 68.90                     | 61.84
As per yesterday's alog, the duotone is delayed by about 7.3 us relative to the LSC ADC, and this turned out to be the case for all ADCs.
According to Zsuzsa Marka, the duotone was "delayed a bit above 6 microseconds compared to the GPS 1PPS" (report pending), so this probably means that the ADC timing (i.e. the time stamp of the ADC) is decent.
The duotone round-trip delay for all IOPs except IOP-LSC0 is about 61 us, or about four 64k clock cycles. For LSC0, it was about five 64k clock cycles.
I don't know where the difference comes from. This is entirely dependent on how the 64 kHz ADC input is taken, routed to the 64 kHz DAC when the "DT DAC" bypass switch is in the "ON" position (third attachment), and finally output by the DAC, but I don't think there should be a difference between LSC and everybody else. At least the LSC DAC timing doesn't enter the DARM timing.
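For clarity, the bookkeeping behind the table above is just the loop-back time minus the direct time, expressed in IOP clock cycles; a quick check (taking the 64k IOP rate as 65536 Hz, an assumption consistent with "about 4" and "about 5" cycles):

# Worked check of the round-trip numbers quoted above.
CLOCK_HZ = 2**16                       # assumed 64k IOP rate (65536 Hz)
CYCLE_US = 1e6 / CLOCK_HZ              # ~15.26 us per cycle

def round_trip_us(direct_us, loopback_us):
    return loopback_us - direct_us

for name, direct, loop in [("LSC0", 7.34, 83.78),
                           ("SUS_EX", 7.25, 68.90),
                           ("SUS_EY", 7.26, 68.93)]:
    rt = round_trip_us(direct, loop)
    print(f"{name}: {rt:.2f} us = {rt/CYCLE_US:.2f} clock cycles")
# -> LSC0 ~5 cycles, SUS_EX/EY ~4 cycles, matching the text above.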
The next table is for 16kHz pcal channels on the frame. The measurement results as well as the channel names are shown in the last attachment.
UTC                    | user model | ADC0 Ch31 (direct in) (raw, raw-decimation) (us) | ADC0 Ch30 loopback (without AI and AA) (raw, raw-decimation) (us) | Round trip (us) | ADC0 Ch28 loopback (with AI and AA) raw (us)
27/08/2015 22:07:23.0  | CAL-PCALX  | (63.30, 7.37)                                    |                                                                   | 61.62           | 377.72
                       | CAL_PCALY  | (63.24, 7.31)                                    | (124.89, 68.96)                                                   | 61.65           | 377.07
For Ch31 and Ch30, the routing bypasses the user model; the signals are merely imported into the user model and decimated.
Sudarshan found the 4x decimation filter delay to be 19.34 deg, or 55.93 us, at 960.5 Hz; the "raw-decimation" number is obtained by subtracting this from the raw number. This is consistent with the 64 kHz result, so from now on we can look at the 16 kHz signals as far as pcal is concerned.
I don't know anything about AA and AI, so I'll leave the analysis to Sudarshan.
Relevant scripts and dtt templates are in /ligo/home/keita.kawabe/Cal/Duotone.
Keita's alog explained the timing of the duotone-to-ADC and DAC-to-ADC loops as well. Additionally, in pcal, channel 28 is routed through the analog AI and AA chassis. The details of how the channels are connected can be found in the attached schematics.
From the schematics we can see there are three (3) 4x decimation filters (two downsampling and one upsampling) in this particular chain (channel 28). This amounts to 3*55.93 us = 167.79 us of delay (each of these filters produces a phase delay of 19.34 deg, or 55.93 us, at 960.5 Hz). The analog AA and AI chassis each produce a phase delay of 13.76 degrees, which amounts to about 39.82 us at 960.5 Hz per chassis, totaling 79.64 us of time delay.
Total delay = 3*55.93 + 2*39.82 = 247.43 us.
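A quick worked check of this budget (phase delay to time at 960.5 Hz, then the sum used in the table below):

# Worked check: convert phase delays at 960.5 Hz to time, total the three
# decimation stages and two analog AA/AI stages, and recover the round trip.
F_LINE = 960.5                               # Hz

def phase_deg_to_us(phase_deg, f=F_LINE):
    return phase_deg / 360.0 / f * 1e6

decim_us  = phase_deg_to_us(19.34)           # ~55.93 us per 4x decimation filter
analog_us = phase_deg_to_us(13.76)           # ~39.8 us per analog AA/AI chassis
total_us  = 3*decim_us + 2*analog_us         # ~247.4 us

# Round trip for PCALX Ch28: raw loop-back minus this budget minus the direct time
print(total_us, 377.72 - total_us - 7.37)    # ~247.4, ~122.9 us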
Column 3 contains the measured (raw) time delay and "raw - total delay".
Column 4 contains the round-trip time (raw - total delay - ~7 us direct) = ~122 us, or about eight 64 kHz clock cycles.
UTC                    | Channel   | ADC0 Ch28 loopback (FILT DUOTONE, with AI and AA) (raw, raw - (3*decimation + 2*analog AA/AI)) (us) | Round trip (us)
27/08/2015 22:07:23.0  | CAL_PCALX | (377.72, 130.29)                                                                                     | 122.92
                       | CAL_PCALY | (377.07, 129.64)                                                                                     | 122.33