Observing the many trips HAM2 has experienced recently while under guardian control, I've seen that after the first group of dofs is isolating, the error points of the remaining dofs often grow large, and when those dofs' gains are ramped up, the ISI trips with a rung-up GS13. I noticed that the guardian was loading the reference locations between the engagement of the dof groups, whereas the old seismic command scripts would only load the reference locations after all the dofs are isolating at the free-hanging position. Remember, the command script had no trouble turning the isolation back on.
Jamie modified the guardian to load the reference locations at the end of the isolation process, like the command scripts do. When I got my window to test this, I at first forgot to restart the guardian, and the first couple of times it was still able to bring the ISI up. When I realized it wasn't doing it the new way, I remembered the restart, but I kept trying with the guardian as it was; on the third or fourth attempt it tripped. I then restarted the guardian, and after that it was successful at full isolation at least 6 times before the commissioners returned to the control room.
So again, my theory is: when the isolation loops are servoing to a place away from the free-hanging position, the error points of the uncontrolled dofs can grow large. This depends on many things, and on luck both good and bad. When a loop is engaged at low gain, its error point may be swinging about zero, and if the loop gain is ramped to 1.0 (from 0.01) at an unlucky time, bam goes the GS13.
Currently, HAM2 ISI is successfully back under guardian control.
Keith T, Jim, Daniel, Dave
As Keith mentioned in yesterday's CDS meeting, the RFM IPC error rate at the SUS model is related to the CPU usage of the LSC sender. To show this, I've attached a minute trend plot of the past 7 months of IPC error rates at SUS-EX, along with the CPU-MAX of the LSC model. As can be seen, when the LSC CPU-MAX is regularly around 40uS, the receive error rate goes up from 2 to 10 errors per 16384 received packets.
At 10 errors per 16384 packets, the error rate is 0.06%. Can we determine if this is significant?
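A quick back-of-envelope check of that figure, just reusing the numbers quoted above:

# Quick sanity check of the quoted error rate (numbers from the trend above).
errors_per_block = 10
packets_per_block = 16384
print("error rate = {:.3%}".format(errors_per_block / packets_per_block))   # ~0.061%, i.e. ~0.06%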
Questions/ideas which have been mentioned when thinking about possible solutions:
- reduce LSC processing (filter modules that are unnecessary or unnecessarily complicated)
- split LSC into two cores (again)
- delay the RFM receiver by one whole cycle (adds 60uS latency to the control signal)
- replace the LSC computer with faster hardware (being investigated by Rolf, Keith)

The ISS first loop was all over the place, as was the diffracted power, at the beginning of my shift. A slight tweak in the less negative direction set it back to stable. Further adjustment got it back to ~7.5% diffracted power at a 2.04V refsignal. Perhaps more attention can be given to the ISS on a daily basis by the operator on shift until a better solution can be implemented.
https://alog.ligo-wa.caltech.edu/aLOG/index.php?callRep=13781
Based on the TCSX central heating calibration (62.3 micro-diopters single-pass per Watt) and the calculated static lens of -80213m, we require:
Edited 31-Oct-2014: this isn't correct because of an error in the laser power calibration
The calibration of defocus vs delivered power is incorrect as the delivered power channel, H1:TCS-ITMX_CO2_LSRPWR_MTR_OUTPUT, was not calibrated correctly.
I went back and reviewed the delivered power for this measurement:
Before thermal lens: H1:TCS-ITMX_CO2_LSRPWR_MTR_INMON = 172.7 counts
During thermal lens H1:TCS-ITMX_CO2_LSRPWR_MTR_INMON = 2113.4 counts
The new gain through the filter banks is 7.2087E-4 Watts per count.
This means 1.399 Watts was applied to ITMX during the thermal lens measurement.
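For reference, the delivered power follows directly from the INMON counts and the corrected gain; a quick check of the arithmetic:

# Delivered power from the INMON counts quoted above and the corrected gain.
counts_before = 172.7      # H1:TCS-ITMX_CO2_LSRPWR_MTR_INMON, before thermal lens
counts_during = 2113.4     # H1:TCS-ITMX_CO2_LSRPWR_MTR_INMON, during thermal lens
gain_W_per_count = 7.2087e-4
print((counts_during - counts_before) * gain_W_per_count)   # ~1.399 W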
Further analysis of the HWS measurements of the thermal lens shows:
Based on the TCSX central heating calibration and the calculated static lens of -80213m, we require:
Daniel, Dave
We have stopped the following CDS guardian nodes from running. They can be restarted at a future date if needed.
H1ECATC1PLC2 H1ECATX1PLC2 H1ECATY1PLC2 H1ISCEX H1ISCEY H1LSC H1LSCAUX

If we choose to restart these, they should be generically renamed (dropping the H1 in the name).
Also stopped the following nodes:

LSC LSC_PRMI-VAR_FINESSE

K. Venkateswara
I've attached ASD and coherence plots from 40k seconds of data from BRS/T240 and the super sensor from last night. Wind speeds were in the 5-10 mph range. This time, I showed the raw signals coming out from BRS and the T240 X instead of showing the output of the filters. I've also shown the ref. mirror and the output of the super-sensor.
BRS has much larger noise below 2-3 mHz than the T240, even though the sensor noise appears to be much smaller. It was interesting to me that it had a similar shape to the ref. mirror output, and there does appear to be some coherence between the two. Temperature is the usual suspect at these frequencies. A sensitive temperature sensor on the BRS platform would be useful in diagnosing this noise.
The current cdsutils installation (r361) has a new and improved NDS avg() function. The previous version was buggy, was calculating the standard deviation incorrectly, and had a clunky interface. This new version fixes those issues:
The standard deviation was previously being calculated incorrectly and the returned values were wrong (off by some unknown but kind of large factor). The new version uses the python numpy.var() function to calculate the variance, from which the standard deviation is calculated. It has been verified to be correct.
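As a rough illustration of the approach (not the actual cdsutils source), the standard deviation of a block of samples follows from the numpy variance like this:

import numpy as np

# Rough illustration of the approach (not the actual cdsutils source):
# mean and standard deviation of a block of samples.
data = np.array([75.3, 75.1, 75.4, 75.2])   # stand-in for fetched NDS samples
avg = data.mean()
stddev = np.sqrt(np.var(data))              # std dev = sqrt of the variance
print(avg, stddev)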
Previously, the avg() function required full channel names, with IFO prefixes. This new version works like the ezca interface whereby the IFO prefix can be left off of the channel name. This makes things much cleaner, and allows scripts to be portable. E.g., instead of:
cdsutils.avg(2, IFO+':LSC-DARM_OUT')
just do:
cdsutils.avg(2, 'LSC-DARM_OUT')
Probably the thing that's going to cause the most trouble is that the interface to the function has been fixed up. Previously, the function was returning a dictionary of channel:avg key pairs. This was very clunky to use.
The return value of the function now mirrors the input argument. So for instance, if a single channel is requested, a single avg value is returned. If a list of channels is requested, a list of average values is returned, in the same order as the input channel list. So calls like:
avg = cdsutils.avg(2, 'LSC-DARM_OUT').values()[0]
can now just be:
avg = cdsutils.avg(2, 'LSC-DARM_OUT')
If the stddev option is provided, the output will be an (avg, stddev) tuple, or if multiple channels is requested, a list of such tuples, e.g.:
avg, stddev = cdsutils.avg(2, 'LSC-DARM_OUT', True)
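and for the multi-channel form (the second channel name here is just a placeholder), a list in gives a list of tuples out, in the same order:

# Multi-channel form: a list in gives a list out, in the same order.
results = cdsutils.avg(2, ['LSC-DARM_OUT', 'LSC-MICH_OUT'], True)
(darm_avg, darm_std), (mich_avg, mich_std) = results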
I have fixed all calls to cdsutils.avg() that I could find in USERAPPS/release. I likely didn't catch everything, though, so be aware of the new interface and update your code appropriately.
controls@opsws5:sbin 0$ cdsutils -h
I pushed a new version of cdsutils (r366) that fixes the issue with the help. Commands were being intercepted by one of the gstreamer libraries:
jameson.rollins@operator1:~ 0$ cdsutils -h
usage: cdsutils
Advanced LIGO Control Room Utilites
Available commands:
read read EPICS channel value
write write EPICS channel value
switch read/switch buttons in filter module
sfm decode/encode filter module switch values
step step EPICS channels over specified range
servo servo EPICS channel with simple integrator (pole at zero)
trigservo servo EPICS channel with trigger
avg average one or more NDS channels
audio play NDS channel as audio stream
dv plot time series of NDS channels
water NOT SUPPORTED: No module named PyQt4
version print version info and exit
help this help
Add '-h' after individual commands for command help.
jameson.rollins@operator1:~ 0$
The "junk" messages, as you refer to them, in the avg output are actually just logging that are going to stderr, and therefore do not affect the stdout output:
jameson.rollins@operator1:~ 0$ foo=$(cdsutils avg -n 2 H1:LSC-MICH_IN1)
ignoring first online block...
received: 1098998811 + 1.0
received: 1098998812 + 1.0
jameson.rollins@operator1:~ 0$ echo $foo
75.2394218445
jameson.rollins@operator1:~ 0$
After the recent upgrades, a couple of small bugs were identified in guardian and cdsutils, and new versions were pushed out:
The new versions have been pushed, and some of the guardian nodes have been restarted, but not all. We should restart all nodes at the next opportunity.
I pushed out a new guardian node log viewing interface, so that you no longer need to type in a password when viewing logs. The new interface is accessed via the guardlog program, which is now installed everywhere. Just provide as arguments the names of the nodes whose logs you would like to view, e.g.:
guardlog SEI_ITMY HPI_ITMY ISI_ITMY_ST1 ISI_ITMY_ST2
The logs of all specified nodes will be tailed, interleaved as they are spit out, to the current terminal. The GUARD medm control interfaces now use this log viewer as well.
tl;dr: A very lightweight "guardlog" server is now running on the guardian machine (h1guardian0), listening on port 5555. Making a simple TCP connection to that port (with the guardlog client) will open up the interface and spit out the logs for the specified nodes.
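As a rough sketch of what the client is doing (the guardlog client is the supported interface; the exact request format below is an assumption, not documented here):

import socket

# Rough sketch only -- use the guardlog client in practice.
# ASSUMPTION: the request is a plain-text list of node names, after which
# the server streams log lines back until the connection is closed.
s = socket.create_connection(('h1guardian0', 5555))
s.sendall(b'SEI_ITMY HPI_ITMY\n')            # hypothetical request format
while True:
    chunk = s.recv(4096)
    if not chunk:
        break
    print(chunk.decode(errors='replace'), end='')
s.close()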
J. Kissel

Another data point on the issue of IPC timing causing errors between the LSC model and QUAD models -- Stuart and I added 1 (one) RFM IPC sender to the LSC and 1 (one) RFM IPC receiver to each of the ETM models for DARM-sensed violin mode damping on Tuesday morning (see LHO aLOG 14657). This appears to have increased each ETM model's CPU_METER by ~1 [us], and made no change to the random-time computation-time jumps of as much as ~5-6 [us]. Surprisingly, it *decreased* the LSC model's average compute time by ~1 [us], but again did not affect the timing jumps, which reach as high as an extra 7-9 [us]. I attach a 7 day, minute trend. The only discrete jump in the trend is when the models were restarted around 2014-10-28 14:15 UTC (or Tuesday 2014-10-28 07:15 PDT).
I noticed that the DARM Error damping display on the QUAD MEDM Overview screen was showing incorrect channel information, so I replaced it with the correct channels (H1:SUS-ITMY_L2_DAMP_MODE1_OUTPUT, H1:SUS-ITMY_L2_DAMP_MODE2_OUTPUT, etc.) and committed the fixed screen to the userapps svn, as follows:

/opt/rtcds/userapps/release/sus/common/medm/quad/
M SUS_CUST_QUAD_OVERVIEW.adl

This will need to be svn'd up at LLO too.
Between approximately 10:07 AM PT and 10:16 AM PT the network upstream of LHO had some routing changes or outage that caused a temporary loss of connectivity to the Internet past PNNL/ESnet.
Bubba, John,
In order to compensate for the cooler weather, we have turned on two LVEA heaters at ~09:40 local.
HC1A is set to 15 mA (two stages) - it appears that the first stage is open circuit.
HC2b is set to 9 mA (one stage).
There will be a minor impact on LVEA temperatures.
At Peter Fritschel's request, I have extracted lists of science and commissioning channels, as a follow-up to the frame rate studies ( see aLOG 14718 ). I modified my existing script to extract the names (see attached scichannels_script.txt). As a reminder 'acquire=1' means commissioning only, 'acquire=3' means science and commissioning frames. The list of science frame channels is 'h1_sci_channels.txt'. The list of commissioning frame-only channels is 'h1_comm_channels.txt'. Each list has a channel and its rate.
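For anyone wanting to reproduce the split without the attached script, here is a minimal sketch of the idea (the ini file name and exact key names are assumptions; the actual script is the attached scichannels_script.txt):

# Minimal sketch of the idea behind the attached script (not the script itself).
# ASSUMPTIONS: DAQ ini files with one [CHANNEL] section per channel and
# 'acquire' / 'datarate' keys; the file name below is a placeholder.
import configparser

ini = configparser.ConfigParser(strict=False, interpolation=None)
ini.read('H1EDCU_LSC.ini')   # placeholder file name

sci, comm = [], []
for chan in ini.sections():
    acquire = int(ini[chan].get('acquire', 0))
    rate = ini[chan].get('datarate', '?')
    if acquire == 3:          # science + commissioning frames
        sci.append((chan, rate))
    elif acquire == 1:        # commissioning frames only
        comm.append((chan, rate))

print(len(sci), 'science channels;', len(comm), 'commissioning-only channels')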
Kyle Ryan, R. Weiss

Running the ionizer on purge gas air solves the problem of unequal positive and negative ion currents in the gas being used to discharge the test masses. The oxygen, which has known negative ion states, provides the bulk of the negative ions. The ionizer was run for 2 hours with sampled + and - ion currents of 1.5e-9 Amps (a factor of 100 more in the ion stream). Optical HR coating test samples and some VITON pieces were placed in the ion stream to determine if the chemically reactive oxygen ions and ozone formed in the ionizer do damage. The optical samples will be sent to Caltech for absorption tests. The purge air system was turned off at 7:10PM last night.
The 4.5 hour heating test last night on ITMX was successful. The initial thermal lens transient is shown in the attached plot.
The thermal lens forms relatively rapidly in response to the CO2 heating. Then it decays at the same rate when the heating is turned off. Once again, Physics works.
When the offset in the spherical power at t = 0s is accounted for, the thermal lens magnitude is approximately 80 micro-diopters (double-passed) for 680mW of applied CO2 laser power.
Further analysis on the shape and centering of the lens is pending ...
(The noise in the HWS measurement in the last half hour is coming from SUS injections into ITMX).
Further analysis of the HWS measurements of the thermal lens shows:
Additionally, I reviewed the thermal lens location over time. The attached plot shows the location of the thermal lens in the ITMX coordinates. The HWS data is correctly positioned in this coordinate system.
The thermal lens center drifted left by 15 mm or so after the first hour or so. I've plotted the center of the lens at an earlier time.
The thermal lens center is measured at [-53.5, 4.7] mm. To center this lens we need to move the upper periscope pico (PICO_F_3) by [-8900, +800] counts.
I moved the mirror from [-16012, -16006] to [-24909, -15190].
The CO2X central heating should be centered now.
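For reference, the requested move and the recorded pico positions are roughly consistent, implying a pico calibration on the order of 170 counts per mm of lens-center motion. This is only a back-of-envelope inference from the numbers above, not a measured calibration:

# Back-of-envelope check of the numbers above (not a measured calibration).
offset_mm = (-53.5, 4.7)      # measured lens-center offset
move_counts = (-8900, 800)    # requested PICO_F_3 move
old = (-16012, -16006)        # pico position before
new = (-24909, -15190)        # pico position after

# implied counts-per-mm: ~166 and ~170 for the two axes
print([m / o for m, o in zip(move_counts, offset_mm)])

# old + requested move vs recorded new position: agrees to within ~20 counts
print([o + m for o, m in zip(old, move_counts)], new)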
As requested by the seismic team, I ran the script to configure the ETMX ISI to a test configuration at around 1:47:00 in UTC or 18:47:00 in PDT. I am leaving it in this configuration.
The configuration was set back to nominal by running Switch_To_Comm.py (see alog 14695) at around 9:00 am PDT.
Alexa, Kiwamu,
In response to Keita's alog about the PR2 baffle, we took a peek at the PR2 baffle by opening some of the viewports on HAM3 and HAM2 spool.
Conclusions:
(PR2 baffle check out)
We opened up a viewport on the East side of HAM3, the leftmost one with an illuminator attached. This was the only viewport available to open. We removed the illuminator and looked at the baffle through the viewport. We confirmed that there was no cross-bar structure on the side of the baffle unit as shown in the DCC document (D1000328). We could not see the top part of the baffle, where it is supposed to have a cross-bar structure. We took some pictures and I attach them to this entry.
(Peek at the aperture position)
Then we tried a different zoom setting on the PR2 digital camera in order to have a better view, so that we could determine whether the aperture hole is in the middle of the baffle unit or not. With the help of a flashlight illuminating the PR2 baffle from the HAM2 East side viewport, we could clearly see the edges on both the right and left sides of the unit as well as the edge of the aperture. It looked like the hole is centered with respect to the right and left edges of the unit. We took several pictures of it with the digital camera so that one can evaluate the position later if necessary.
(MC2 scraper baffle check out)
Then we opened up another viewport on the West side of HAM3 in order to see both the MC2 and PR2 baffles. However, the MC2 baffle was completely occulting the PR2 baffle, so we could not see the latter. We could still confirm that the MC2 baffle has a cross-bar structure on the side, as shown in the design document (D1000327).
The baffle hole seems to be centered in the baffle frame within 3mm from the picture. Good.
The distance from the left baffle hole edge to the left inner edge of the baffle frame is about 52 pixels, and it's 56 pixels for the right, i.e. about a 2-pixel offset to the left, which corresponds to 2 or 3 mm.
Keita
The hole diameter is also good (i.e. the ratio of baffle frame width to the hole diameter on the picture reasonably agrees with the spec).
Switch between the first attachment and the second to see if you agree with my assessment of the edges.
                | Nominal | Image                  |
Baffle diameter | 2.756"  | 53px                   |
Baffle height   | 8.34"   | 160px                  |
Baffle width    | 8.74"   | 165px                  |
Diam/Width      | 0.315   | 0.321                  |
Height/Width    | 0.954   | 0.970                  |
Center offset   | none    | 2px to the left ~ 3mm  |
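A quick check of the scale implied by the table, just reusing the numbers above:

# Quick check of the scale implied by the table above.
width_in, width_px = 8.74, 165.0
mm_per_px = width_in * 25.4 / width_px     # ~1.35 mm per pixel
print(mm_per_px)

print(2.756 / 8.74, 53 / width_px)         # Diam/Width: 0.315 vs 0.321
print(8.34 / 8.74, 160 / width_px)         # Height/Width: 0.954 vs 0.970
print(2 * mm_per_px)                       # 2px center offset ~ 2.7 mm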
[Jeff K, Stuart A]

Following work yesterday preparing QUAD model updates (see LHO aLOG entry 14645), we convened early this morning to rebuild, install and restart models. A detailed log of our activities follows:
- Bring down the IMC to DOWN state via Guardian (no change to LSC model)
- Bring down QUADs to SAFE state via Guardian
- Bring down SEI to OFFLINE state via Guardian
- Capture new safe BURT snapshots for h1lsc, h1susetmx, h1susetmy, h1susitmx and h1susitmy
- Made all QUADs
- Installed all QUADs
- Restarted all QUADs
- Svn-up updated QUAD MEDM screens
- Untripped all Watchdogs
- Restore QUAD alignments
- Restore SEIs to FULLY_ISOLATED via Guardian
- Restore IMC to LOCKED via Guardian
- DAQ process restart at 14:29 (UTC)
- Update other SUS MEDM screens (see LLO aLOG entry 15004)
- Cleaned errant IPC issues with diag reset on GDS TP screen
- Committed new safe BURT snapshots to svn

Summary of benefits: This now makes damping of violin, bounce and roll modes possible via the L2 (PUM) stage of all QUADs using DARM error. Other benefits include ESD linearization and infrastructure for remote ESD activation and deactivation. Also, the old Guardian infrastructure has been removed from the model and MEDM screen, with a new embedded Guardian mini control-panel replacing it on the QUAD Overview MEDM screen (as well as for the other Suspensions).

This closes out WP#4915.
We also updated the h1susauxex and h1susauxey models to provide analog and digital monitoring of the ESD Driver, as has been carried out at LLO (see LLO aLOG entry 12688).
The updated local top level SUS QUAD and SUS AUX models have been committed to the svn:

/opt/rtcds/userapps/release/sus/h1/models/
M h1susetmx.mdl
M h1susetmy.mdl
M h1susitmx.mdl
M h1susitmy.mdl
M h1susauxex.mdl
M h1susauxey.mdl
Here are trends of the HAM2 ISO INMONs, OUTPUTs and the location servo residuals. I expected something to jump out in the OUTPUTs or the residuals, but nothing really does at this resolution. Well, at least the INMONs correlate with the trip. You can see three successful isolations on these X & Y channels at the beginning of the plots, and you can see the OUTPUT quickly grow large after the gain has ramped. On the fourth attempt, which is the trip, the INMON spikes much larger than before, but I think it does so after the trip turns things off... Yeah, looking at the full data, the large spike comes after the OUTPUT drops to zero.
So well...I think my theory is sound but I don't think I've convinced anyone yet with data.