During the "make installWorld" part of the RCG3.0.2 install, the /opt/rtcds NFS server (h1fs0) crashed. We reset h1fs0, but the NFS services did not come back cleanly. We restarted the nfs-server daemon, after which the services restarted correctly and the NFS clients reconnected.
Looking at the h1fs0 logs, problems were being reported starting at 09:05 PDT this morning.
We are restarting the install process and monitoring the error logs and disk usage carefully.
Collected the temperature and RH data from the two 3IFO Dry Boxes in the VPW, and the 3IFO desiccant cabinet in the LVEA. Relative humidity data for all three containers are fine (mean range between -0.71 and 3.29%). Temperature data show a different story. There were several 20-plus-degree swings in the VPW temperature during the first part of the month. In the second half of the month the temperature swings were around 10 degrees.
Did a flow check and Zero Count test of all operating dust monitors (except in the PSL as those were checked at install). All DMs are performing normally.
Today, I finished tweaking the DAQ channel data rates for the SUS HAM FASTIMONS which are all in the commissioning frames - Kissel and Jenne discussed upping some of these rates, especially at the lowest stage of the globally controlled SUSes. Namely the rates per stage of SUS are as follows.
For MC2, PRM, SRM:
M1 256
M2 2048
M3 Full rate
For MC1, MC3, PR2, PR3, SR2, SR3:
M1 256
M2 2048
M3 2048
Pictures of examples of each type are attached.
I have compiled, with a "successful" message:
h1susauxh2
h1susauxh34
h1susaux56
I think this means we can close out ECR 1200. Will work on that with Kissel.
Note, this bug used to be ECR 1200; now on FRS it is apparently 4702.
14:30
15:24 The portable toilet people are on site
15:45 Joe into LVEA
15:49 Jeff B added 200ml to the Xtal chiller
15:50 Jeff B to End stations
15:50 No L4C watchdog counts for me to reset. Perhaps it was done by someone else this morning?
Supersedes alogs 26910 and 26924
Bad news: There is lots of MHzish pick-up on the cables to the ITM L2 coils: ~50mVpk
Good news: The ITM L2 coil noise at low frequency is very good: 1.5nV/rtHz at 25Hz, and we might not care about all that pickup.
Details:
The 10Hz harmonics reported in alog 26910 were a measurement problem, generated in Rai's preamplifier box (D060205). The cables pick up on the order of 50mVpk at around 1MHz, which was amplified by 100x, causing slew-rate down-conversion.
This was fixed (in the measurement setup) with a 270nF capacitor in parallel with the 23.8Ohm cable and coil resistance, resulting in a 24.7kHz pole to cut off the cable pick-up.
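As a rough check on why the 100x stage ran into slew limiting: a sinusoid of amplitude A at frequency f has a peak slope of 2*pi*f*A. The sketch below applies that to the numbers quoted above (about 50mVpk at about 1MHz, gain of 100). Treating the broadband pickup as a single 1MHz tone is a simplifying assumption, and the function name is illustrative only.

```python
import math

def peak_slew_rate_v_per_us(amplitude_v, freq_hz):
    """Peak slope of A*sin(2*pi*f*t), expressed in volts per microsecond."""
    return 2.0 * math.pi * freq_hz * amplitude_v / 1e6

pickup_vpk = 50e-3      # ~50 mVpk on the cable (from the measurement)
preamp_gain = 100.0     # gain of the D060205 preamp
pickup_freq_hz = 1e6    # ~1 MHz; single-tone simplification (assumption)

# Slew rate the preamp output would need to follow the amplified pickup.
required = peak_slew_rate_v_per_us(pickup_vpk * preamp_gain, pickup_freq_hz)
print(f"required slew rate: {required:.1f} V/us")
```

This comes out at roughly 31 V/us for an undistorted 5Vpk output, which gives a sense of scale for the down-conversion seen before the capacitor was added.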
Plot 1 and 2:
ITMX and ITMY coil noise.
Configuration:
- Everything (coil driver, cable, coil) was connected. The breakout box was inserted between coil driver and cable to the satellite amplifier.
- The L2 coil drivers were both in state 3: Acq Off, LP On, which is the run mode. They are never switched for the ITMs.
- The coil driver inputs were left connected to the DAC/AI. I also sent a 100ctpk, 3kHz signal into H1:SUS-ITM[XY]_L2_DRIVEALIGN_L2L_EXC, corresponding to a 2.6ctpk signal on the DAC. I did this to make sure the DAC is at least flipping bits, which raises its noise level.
- A 270nF capacitor was put in parallel to the coil using a pomona box to avoid saturating the D060205-preamp.
- The preamp has a gain of 100. After the preamp a 100Hz low pass was used (1595Ohm and 1uF) to allow the SR785 to run in the lowest noise mode.
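The two filter corners in this setup follow from the single-pole RC formula f = 1/(2*pi*R*C). A minimal sketch using only the component values quoted above (the helper name is made up for illustration):

```python
import math

def rc_corner_hz(resistance_ohm, capacitance_f):
    """Corner frequency of a single-pole RC network, in Hz."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_f)

# 270 nF across the 23.8 Ohm cable + coil resistance
shunt_corner = rc_corner_hz(23.8, 270e-9)

# 1595 Ohm and 1 uF low pass after the preamp
lowpass_corner = rc_corner_hz(1595.0, 1e-6)

print(f"shunt capacitor pole: {shunt_corner / 1e3:.1f} kHz")
print(f"post-preamp low pass: {lowpass_corner:.1f} Hz")
```

The first value lands near 24.8kHz, in line with the roughly 24.7kHz pole quoted given rounding of the component values, and the second at just under 100Hz.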
Plot 3:
Noise projection assuming incoherent noise and assuming the ETMs behave the same.
Plot 4:
High frequency noise pick-up on the coil cable (coil driver disconnected).
The dominant noise is at ~1MHz, broadband.
Shown are two traces: one in nominal configuration (green), and one with an additional choke on the cable to the coils.
Plot 5:
Scope trace of the high-frequency signal (only the x100 amplifier is used). The signal is made up of ~10msec bursts every ~100msec.
Plot 6:
All 4 coils (without amplifier) directly connected to the scope. Note that the grounded inputs of the scope slightly change the signal.
Plot 7:
Scope trace with only an antenna connected to the scope. The signal pickup was largest between the racks - I could not trace it to a source yet.
For reference, the data and matlab code are available at ~controls/sballmer/20160429/plotIt.m
Also, since we probably have to do similar noise checks when we have the IFO back, here is the equipment I used: Top: Ring antenna Bottom, from left to right: choke, breakout card, 270nF parallel capacitor box, Rai's preamp box, 100Hz LP filter, AC coupling for looking at RF on the spectrum analyzer.
Plot 4. With the cable disconnected from the coil driver, the shield is no longer terminated. This may contribute to the pickup.
The high frequency noise coupling in plot 4 is mostly common, and shows up because Rai's preamp has no differential sensing.
In the attached plot the noise seen on ITMX coil 1 is plotted, once sensed with Rai's single-ended preamp, once with an SR560 in differential mode.
Conclusion: This noise does not show up on the coil current.
However: The same common HF noise pickup seems to be present on all cables. This now makes me worry about the ESD: I suspect the ESD has much less common-mode rejection, because the +400V and -400V come in on different cables. Moreover, a broadband noise at 1MHz on the ESD will produce noise near DC due to the quadratic nature of the ESD coupling.
I also tried to use the antenna to locate the source. The field is strongest in a circle around the rack, suggesting the source might not be in the electronics room, but rather just brought in by all the cables.
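To illustrate the concern about quadratic coupling, here is a toy sketch (not a model of the actual ESD): two unit-amplitude tones near 1MHz, spaced 1kHz apart, are squared, and the content at the 1kHz difference frequency is read off with a single-bin DFT. The tone frequencies, sample rate, and function name are all assumptions chosen for the example.

```python
import cmath
import math

def tone_amplitude(samples, freq_hz, sample_rate_hz):
    """Amplitude of the component at freq_hz, from a single-bin DFT."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
              for k, x in enumerate(samples))
    return 2.0 * abs(acc) / n

sample_rate_hz = 8e6
f1_hz, f2_hz = 1.000e6, 1.001e6   # two tones near 1 MHz (illustrative)
n_samples = 16000                 # 2 ms: whole cycles of every tone involved

# Linear signal: only contains the two tones near 1 MHz.
linear = [math.cos(2 * math.pi * f1_hz * k / sample_rate_hz)
          + math.cos(2 * math.pi * f2_hz * k / sample_rate_hz)
          for k in range(n_samples)]

# Quadratic response (force proportional to voltage squared).
squared = [v * v for v in linear]

diff_hz = f2_hz - f1_hz
amp_linear = tone_amplitude(linear, diff_hz, sample_rate_hz)
amp_squared = tone_amplitude(squared, diff_hz, sample_rate_hz)
print(f"1 kHz content, linear:  {amp_linear:.2e}")
print(f"1 kHz content, squared: {amp_squared:.2e}")
```

The linear signal has essentially nothing at 1kHz, while the squared signal has a full-size component there, since 2*cos(a)*cos(b) contains cos(a-b). Broadband noise around 1MHz behaves like many such tone pairs, so its square contains difference frequencies all the way down to DC.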
Rich M., Patrick T. I noticed that the end Y ISI and SUS watchdogs had tripped. I tried untripping them but they immediately tripped again. Rich found that the BRS was reporting bad. He turned off the sensor correction for RY and turned off the BRS correction to the ground seismometer. He left on the sensor correction for X. I untripped the watchdogs again and they remained untripped.
Daniel, Patrick, Matt
We did a little more rotation stage science today. The objective was to understand the remaining acceleration mystery, and to confirm that the resistor was helping. The on-screen EPICS values are the ones being used for acceleration and deceleration, and they now have an upper limit of 65000 (or 65s to reach the maximum speed of 100 RPM). Note that the on-screen velocity is in units of 0.01% of the maximum, so a value of 10000 gives the maximum speed of 100 RPM, and a value of 100 gives 1 RPM. (These RPM values are presumably for the motor, not the waveplate.)
We found that with the current firmware settings (which Patrick will append), the 50 Ohm resistor was not necessary, so we removed it. This means that other waveplates in the field need no hardware modification to achieve the 0.01 deg accuracy we are seeing with this rotation stage.
The attached screenshot shows a move from 10W to 2W (velocity = 3000, acc = 6000, dec = 6000) and then from 2W to 10W (velocity = 300, acc = 60000, dec = 60000). Note that the higher values of acceleration and deceleration for lower velocities result in a smoother ride.
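The unit conventions above can be written down as a small conversion sketch. The velocity scaling (10000 counts = 100 RPM) is taken from the text. The ramp-time function rests on an assumption that is only suggested by the "65000 (or 65s to reach the maximum speed)" remark: that the acceleration setting is the time in ms to ramp linearly from rest to full speed. The function names are made up for illustration.

```python
MAX_RPM = 100.0               # maximum motor speed quoted above
FULL_SCALE_COUNTS = 10000.0   # on-screen velocity value for maximum speed

def velocity_to_rpm(velocity_counts):
    """Convert the on-screen velocity (0.01% of max per count) to motor RPM."""
    return velocity_counts * MAX_RPM / FULL_SCALE_COUNTS

def ramp_time_s(velocity_counts, acc_setting):
    """Time to reach the requested velocity.

    ASSUMPTION: acc_setting is the time in ms to go from rest to full
    speed, with a linear ramp. This is inferred, not confirmed.
    """
    return (acc_setting / 1000.0) * velocity_counts / FULL_SCALE_COUNTS

# The two moves from the screenshot: (velocity, acc)
for velocity, acc in [(3000, 6000), (300, 60000)]:
    print(velocity_to_rpm(velocity), "RPM, ramp", ramp_time_s(velocity, acc), "s")
```

If the assumption holds, both moves in the screenshot spend about 1.8s ramping, with the slower move accelerating ten times more gently.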
Current settings attached.
A couple of diagnostics features have been added to the code:
I reduced the calibration velocity from 3000 to 500. Driving too fast towards the home position seems to reduce its reproducibility. This test will have to be repeated by looking at the laser power.
The TCS rotation stages also got the new motor settings and can be tested. The TCS medm screens need to be updated as well. (Why are they different?)
J. Kissel Continuing the work Corey et al. have done cleaning up SDF files (see LHO aLOG 26917), I've gone one level deeper to ensure that all snap files used in the target areas are soft links to locations in the userapps repo. There *is* a safe.snap for every front-end model / epics db, of which there are 129. Unfortunately, because they're human constructions, there are fewer OBSERVE.snaps (112) and down.snaps (28). OBSERVE.snaps at least exist for every front-end model / epics db that existed during O1. However, weather station dbs, dust monitor dbs, and pi front-end models are new since O1, so OBSERVE.snaps don't exist for them. Further, down.snaps seem to have only been created for ISC models, the globally controlled SUS models, and the ISC-related beckhoff PLCs.
We know the safe.snaps are poorly maintained, and sadly we haven't been in a configuration we'd call OBSERVE.snap-worthy in a long time, so they're also out of date. On top of all this, each subsystem seems to have a different philosophy about safe vs. down.
Daniel, Sheila, Jamie, and I were discussing this on Friday, and we came to the conclusion that it is far too difficult to maintain three different SDF files. If the SDF mask is built correctly, then there should be no difference between the "down" and "safe" state. The inventors of the "safe" state are the SEI and SUS teams because they have actuators strong enough to damage hardware. As such, they've designed the front-end models such that all watchdogs come up tripped and user intervention is required to allow for excitations. So, as the model comes up, it's already "safe" regardless of its settings. Of course, even though the IFO is "down" at that starting point, we still want the platforms to be fully isolated. So, in that sense, for the ISIs "down" is the same as "OBSERVE." And again, if all settings that change via guardian are correctly masked out, then "safe" is the same as "down" is the same as "observe" and you only need one file.
So, eventually -- we should get back to having only one file per subsystem. But this will take a good bit of effort to make sure that what's controlled via guardian is masked out of every SDF, and vice versa, that what is masked out of SDF *is* controlled by guardian. The temporary band-aid idea will be at least to make sure that every model's down is the same as its safe. Because Corey et al. put a good bit of effort into reconciling the down and safe.snap files today, I've copied all of the down.snaps over to the safe.snaps and committed them to the repo. I've not yet gone as far as to change the safe.snap softlinks to point to the down.snaps, but that will be next. Anyways -- this aLOG's kinda rambling, because this activity has been disjointed, rushed, and sporadic, but I wanted to get these thoughts down and give an update on the progress. In summary, at least every safe, down, and OBSERVE.snap in the target area is a soft link to the userapps repo, and all of those files in the userapps repo are committed. More tomorrow, maybe.
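A check of the "every snap file in the target area is a soft link into the userapps repo" condition could be scripted along the following lines. This is a sketch only: the function name is hypothetical, no particular directory layout is assumed, and it does not check whether the linked files are actually committed.

```python
import os

def audit_snap_links(target_root, repo_root):
    """Return sorted paths of *.snap files under target_root that are NOT
    symlinks resolving to a location inside repo_root."""
    repo_real = os.path.realpath(repo_root)
    offenders = []
    for dirpath, _dirnames, filenames in os.walk(target_root):
        for name in filenames:
            if not name.endswith(".snap"):
                continue
            path = os.path.join(dirpath, name)
            resolved = os.path.realpath(path)
            # True only if the resolved file lives under the repo directory.
            inside_repo = os.path.commonpath([repo_real, resolved]) == repo_real
            if not (os.path.islink(path) and inside_repo):
                offenders.append(path)
    return sorted(offenders)
```

Calling it with the site's target and userapps directories returns an empty list when every snap file is a link into the repo, and otherwise lists the regular files or stray links that still need attention.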
Thanks for the write-up here! A couple of comments/notes:
1) Does every frontend really have a safe.snap? I thought I could not find some safe.snaps for some of the ECAT (i.e. slow control) frontends. Or is there a way for the SDF Overview medm to not display *all* SDF files?
2) If we manage to get to ONE SDF file, what will we name it? Will we stay with "safe" since that's what the RCG calls out, or will we change it to a more preferred name? (This was another subtle note I overheard you all discussing on Fri.)
~21:01 UTC I turned off the camera, frame grabber, then powercycled the computer (then turned the frame grabber and the camera back on). Only HWSX code is running at the moment. Things look good for now.
May 3 16:44 UTC Stopped HWSX code and ran HWSY code alone. HWSX code had been running fine since yesterday.
May 5th 18:20 UTC I noticed the HWSY code had stopped running. There have been many computer and front-end restarts since I left it running, so it is unclear what caused it to stop. I reran it and am going to leave it running for another day.
May 2 12:08:37 h1conlog1-master conlog: ../conlog.cpp: 301: process_cac_messages: MySQL Exception: Error: Data too long for column 'value' at row 1: Error code: 1406: SQLState: 22001: Exiting.
Coincident with Beckhoff restarts?
Restarted.
Restarted again. Same error.
This may be different, but we also ended up with a couple of corrupt autoburt files. There the problem seems to be that string values are not properly escaped. A carriage return character in a string will force a line break in the autoburt text file. A burt restore will then complain that the string is not terminated by double quotes.
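One way to spot this kind of corruption before attempting a restore is to scan the text file for lines with an odd number of double quotes. The sketch below is a generic text check, not a parser for the real autoburt format: the record layout in the example is invented, and escaped quotes inside strings are not handled.

```python
def find_unterminated_lines(text):
    """Return 1-based line numbers whose double quotes are unbalanced.

    splitlines() breaks on a bare carriage return as well as on newlines,
    so a CR embedded in a string value shows up as two broken lines.
    """
    bad = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if line.count('"') % 2 == 1:
            bad.append(lineno)
    return bad

# Hypothetical records; the second value contains a carriage return.
sample = 'CHAN_A 1 "ok"\nCHAN_B 1 "bad\rvalue"\n'
print(find_unterminated_lines(sample))
```

For the sample above, the opening half of the broken string and its continuation are both flagged, which matches the "string is not terminated by double quotes" complaint.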
I assumed that the same noise is present in all 16 coils, but is incoherent, i.e. the scaling factor was sqrt(16)=4. Note that the 10Hz and harmonics are due to pick-up, and thus likely coherent at least in the ITMs. Thanks to the alternating magnet polarity, this coupling likely cancels to some degree for length noise, i.e. the attached plot is an over-estimate.
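The scaling argument can be made explicit with a short sketch: equal incoherent sources add in quadrature, giving sqrt(N), while fully coherent sources add linearly, with signs. The unit amplitudes and the perfectly alternating sign pattern are idealizations for illustration, not measured values.

```python
import math

def incoherent_sum(amplitudes):
    """Quadrature sum, appropriate for uncorrelated noise sources."""
    return math.sqrt(sum(a * a for a in amplitudes))

def coherent_sum(amplitudes):
    """Linear sum, appropriate for fully correlated sources (signs matter)."""
    return sum(amplitudes)

n_coils = 16
equal = [1.0] * n_coils
# Idealized alternating polarity: neighbouring contributions flip sign.
alternating = [1.0 if i % 2 == 0 else -1.0 for i in range(n_coils)]

print(incoherent_sum(equal))      # sqrt(16) scaling used in the projection
print(coherent_sum(equal))        # worst case if all 16 were in phase
print(coherent_sum(alternating))  # perfectly matched alternating polarity
```

The real coherent pickup would sit somewhere between the last two cases, which is why the projection is described as an over-estimate for length noise.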
Superseded. See alog 26948.
(see also alog 26902)
I measured the PUM coil driver noise of the ITMs at the coil driver output.
Bottom line:
- All 8 channels (4 ITMX & 4 ITMY) show the same noise.
- There is significant pickup in the cable running from driver to coil. This corresponds to the 10msec bursts every 100msec mentioned in alog 26902.
- This pick-up leads to 10Hz and harmonics.
- There is some residual pickup of the same signal in the coil driver, but significantly less.
- Above 10kHz there is a huge amount of junk flying around (10uV/rtHz and more!)
The 4 traces in the plots correspond to
Blue: Coil driver output, terminated with 50Ohm.
Red: Coil driver output, with cable and coil attached.
Yellow: Only cable and coil attached. Coil driver disconnected.
Purple: Measurement noise. (Only terminated breakout box)
Attached are
plot 1: 0-200Hz (for example ITMY coil 3)
plot 2 & 3: 0-100kHz (for example ITMY coil 1 & 4) Note the 3kHz line - it corresponds to a 100cts 3kHz length drive, resulting in about +-2ct at the ADCs.
Stefan - can you add:
- what state the coil drivers were in for these measurements?
- state of the CD inputs: terminated; connected to AI; connected to DAC/AI; ... ?
Peter:
- The L2 coil drivers were both in state 3: Acq Off, LP On, which is the run mode. They are never switched for the ITMs. (Binary IO medm screens attached).
- The coil driver inputs were left connected to the DAC/AI. I also sent a 100ctpk, 3kHz signal into H1:SUS-ITM[XY]_L2_DRIVEALIGN_L2L_EXC, corresponding to a 2.6ctpk signal on the DAC. I did this to make sure the DAC is at least flipping bits, which raises its noise level. You can also see that signal in the plots at 3kHz. Note though that at 15Hz the total coil driver electronics + DAC noise is still less than what we pick up in the cable - blue is below yellow in plot 1.
Superseded. See alog 26948