(Filling in for Ed till Patrick comes on shift)
The HAM5 CPS issue seems to be fixed, so we are on our way back up.
16:29 Fil, J. Batch and John Worden out to LDAS.
16:42 Jim, John and Fil back from LDAS. Electrical panels are tripped. Electrician is on his way to investigate. Still not sure what order things went down.
16:46 Lockloss EQ
16:50 Switched ISI config for Large EQ in Northern California.
17:04 Switched ISI config back to Windy after quick decay of EQ
17:20 GRB alert. H1 not locked
17:34 H1 locked and observing. Monitoring f1 violin modes which are slightly rung up.
19:14 Fil reports that the LDAS power panel trips were due to the room overheating first. Queries into remote monitoring of LDAS room temps are being made.
19:17 Filed FRS ticket for noisy ops station computer fan.
20:12 Lockloss. HAM5 and HAM6 ISIs tripped.
20:20 HAM5 CPS glitching. Jim out to power cycle chassis in CER.
20:34 Jim and Fil out to LVEA to physically "exercise" CPS connections.
21:30 Received Hanford Release advisory. Decided that I should get my non-snow-ready vehicle home. TJ is there, as well as Patrick, who is the evening shift operator.
2:20 pm local
Took 30 sec to overfill CP4 at 70% open using Patrick's new automation code. See aLOG https://alog.ligo-wa.caltech.edu/aLOG/index.php?callRep=32573
More testing Friday.
Chandra lifted the lead for TE252A. Kyle put the CP4 LLCV into PID and disabled. The script to fill CP4 was run:

vacuum@vacuum1:/ligo/home/patrick.thomas/svn/scripts$ python cp4_fill.py 70 -60 3600
Starting CP4 fill.
TC A error.
LLCV enabled.
LLCV set to manual control.
LLCV set to 70% open.
Fill completed in 30 seconds.
LLCV set back to 35.0% open.

Chandra put back the lead for TE252A.

Chandra lifted the lead for TE202B. Kyle put the CP3 LLCV into PID and disabled. The script to fill CP3 was run:

vacuum@vacuum1:/ligo/home/patrick.thomas/svn/scripts$ python cp3_fill.py 50 -30 3600
Starting CP3 fill.
TC B error.
LLCV enabled.
LLCV set to manual control.
LLCV set to 50% open.
Fill completed in 110 seconds.
LLCV set back to 19.0% open.

Chandra put back the lead for TE202B.

Both scripts are in svn under /trunk/scripts in the projects repository. Revision 4023.
Done under WP 6402 and 6403.
Updated scripts: Added check for exceptions on returning LLCV to initial position. Changed loop period from 10 seconds to 1 second. Revision 4024.
Moved both scripts to trunk/cds/h1/scripts in the cds_user_apps repository.
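For illustration, here is a minimal sketch of the overfill logic described above (open the LLCV to the requested percentage, poll the thermocouples until one reads below the stop temperature or a timeout expires, then restore the initial valve position). This is not the actual cp3_fill.py / cp4_fill.py from the svn; the channel names and the pyepics helpers are assumptions for illustration only.

#!/usr/bin/env python
# Sketch of the CP overfill logic, NOT the actual cp3_fill.py / cp4_fill.py.
# Channel names and the epics.caget/caput helpers are assumed for illustration.
import sys
import time
import epics  # pyepics, assumed available on the vacuum workstation

def fill(llcv_pct, tc_stop_degC, timeout_s,
         llcv_chan='HVE-LY:CP4_LLCV',                       # hypothetical channel names
         tc_chans=('HVE-LY:CP4_TC_A', 'HVE-LY:CP4_TC_B')):
    """Open the LLCV to llcv_pct, wait for a thermocouple to read below
    tc_stop_degC (LN2 at the overflow outlet), then restore the initial position."""
    print('Starting fill.')
    initial_pct = epics.caget(llcv_chan + '_POSITION')
    epics.caput(llcv_chan + '_ENABLE', 1)         # LLCV enabled
    epics.caput(llcv_chan + '_MODE', 'MANUAL')    # LLCV set to manual control
    epics.caput(llcv_chan + '_POSITION', llcv_pct)
    t0 = time.time()
    try:
        while time.time() - t0 < timeout_s:
            temps = []
            for chan in tc_chans:
                value = epics.caget(chan)
                if value is None:
                    print('TC error on %s.' % chan)   # e.g. lead lifted for testing
                else:
                    temps.append(value)
            if temps and min(temps) < tc_stop_degC:
                print('Fill completed in %d seconds.' % (time.time() - t0))
                break
            time.sleep(1)   # loop period, 1 second per the updated scripts
        else:
            print('Timeout reached; aborting fill.')
    finally:
        # Per the update: always attempt to return the LLCV to its initial position,
        # and catch exceptions while doing so.
        try:
            epics.caput(llcv_chan + '_POSITION', initial_pct)
            print('LLCV set back to %.1f%% open.' % initial_pct)
        except Exception as exc:
            print('Failed to restore LLCV position: %s' % exc)

if __name__ == '__main__':
    fill(float(sys.argv[1]), float(sys.argv[2]), float(sys.argv[3]))

Under these assumptions the three command-line arguments match the invocations above (fill percentage, TC stop temperature in deg C, timeout in seconds).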
The h1fw0 frame writer was restarted about 12:30 PM PST. The cause of the shutdown was a power failure caused by a cooling system failure in the LDAS room where the file system for frames is located. The h1fw1 frame writer ran without interruption, so no data was lost.
J. Kissel, for the people actually doing the work: J. Warner, F. Clara We've lost lock because HAM5 ISI capacitive position sensors (CPS) have suddenly started going terribly glitchy and saturating. Jim and Fil are investigating, but this'll likely involve some electronics power cycling, if not swapping. Stay tuned for further details.
Some detail on finding and fixing the problem.
HAM5 suddenly tripped, presumably causing the lockloss. When I went to the overview, several of the HAM5 CPSs were saturating nearly constantly, but the GS-13s didn't seem very perturbed, so I didn't believe the CPSs were reading real motion. I couldn't reset the watchdog because of this, and when I bypassed the CPSs to force a reset, the ISI rang up and tripped immediately.
While Jeff did the paperwork, Fil and I went first to the CER to try power cycling the AA and CPS power. Several rounds of this showed no improvement (several CPS were still glitching over 20k counts), so we then went out to the chamber. There we reseated the power and data cables, disconnected and reconnected the probes and cards in the Corner 1 & 2 satellite rack, but saw no improvement.
We then went to the corner 3 rack, and as soon as Fil grabbed the power cable to try reseating it (the first thing he was going to try), the CPSs settled down. Doing some "tap" tests on the ground showed believable motion on the live CPS signals in dataviewer (i.e. stomping on the ground showed "real" motion, mostly in Z). We were then able to reset the watchdog and the ISI was able to isolate. We went out again to make sure all of the cables were tight, but while we were there the CPSs started glitching again. We then did another round of checking cables on the Corner 3 box, and this time when Fil unscrewed the retaining screws on the data cable, the CPSs started behaving again. He carefully reseated the screws and we gently walked away. It's not clear which tweak to what cable fixed the problem, and we didn't find anything obviously wrong before touching stuff. I did find one of the data cables at the CER rack loose, but securing it didn't fix anything.
This whole process was somewhat hampered by not being able to open the sitemap on the MacBook we were using. I could get dataviewer, but I could not get the ISI overview to see when the platform tripped or to reset the watchdogs. We were forced to use the workstation in the CER for MEDM.
We'll monitor this for the next couple of days. This does not seem to be exactly the same failure that Hugh found on HAM3 a few weeks ago (in alogs 31564, 32076, 32079). The attached watchdog plot from HAM5 doesn't seem to show the single glitches that Hugh saw on HAM3. Instead the noise on V2, H3 and V3 seems to just suddenly get worse. Kind of worrying.
The second CPS unit (slave) for HAM5 needs to be investigated further. Possible grounding or connector issue, as the glitching/saturation in channels comes and goes with movement of the cables for this unit.
Attached are the trends of the relative humidity of the desiccant spare-parts and 3IFO storage cabinets in the VPW. Looks like both have been near 0-1% since the last reading a month or two ago, other than the spikes at the beginning and at the end when I pulled the meter to my office (higher RH) for the reading. As well, there were some power issues this morning that may have briefly affected the RH of the cabinets.
Patrick, Dave, Kyle, Chandra
First test of Patrick's automation overfill code successful! Took 33 min. to overfill CP3 at 50% open on LLCV (from 18%). Raised nominal to 19% after fill.
Next test will be on CP4, by lifting the leads of one TC and setting the Beckhoff settings to "disable" and "PID" modes to see if the code corrects those.
Here is a StripTool plot of the autofill. The red line is the fill valve %-open, which goes to 50% during the overfill. The blue and green lines are the two thermocouples, which drop to -60 C when the LN2 reaches the overflow outlet (the code stops the overfill when TC < -30 C).
Work Permit | Date | Description | alog/status |
6401.html | 2016-12-13 15:00 | Send cell phone text alarms if any LN2 dewar level drops below a specified level (determined by vacuum group) | 32541 |
6400.html | 2016-12-13 13:52 | Add a second PCIe card to the h1hwsmsr computer. | |
6399.html | 2016-12-13 10:56 | The CDSFS0 controller is generating kernel errors. As part of troubleshooting this, the firmware on the card needs to be upgraded, which requires powering the server off and on to load the new firmware. All the control room workstations will be disabled while this job is in progress. The process will be performed on Tuesday 12/20/16, but we are opening the work permit today in case the severity of the problem increases before then. | |
6398.html | 2016-12-13 10:50 | CP4 pump level to be removed from cell phone alarm texting, similar to what was done for CP3. | 32541 |
6397.html | 2016-12-12 19:53 | Public tour for ~2-3 UW Bothell engineers. Requesting ~15-30 minutes around 11a. | |
6396.html | 2016-12-12 16:16 | Replace ion pump #12 power supply at EX. | 32514 |
6395.html | 2016-12-12 08:19 | Update gstlal-calibration on the DMT to the latest version and restart it during the next maintenance period: gstlal-calibration-1.1.0-v1: * A small bug fix to make primary and redundant pipelines produce identical output. gstlal-calibration-1.1.2-v1: * A bug fix that allows the pipeline to produce output when run offline. The plan is to install the latest version (gstlal-calibration-1.1.2-v1) including both bug fixes on the DMT, if it gets approved by the Software Change Control Board (SCCB). Otherwise, the gstlal-calibration-1.1.0-v1 version has already been approved by SCCB, and will be installed. | 32517 |
6394.html | 2016-12-09 12:18 | Swap yellow & red TC wires at Beckhoff rack: (TE252A & TE252B wires) red should be where yellow is and vice versa | |
6393.html | 2016-12-08 10:22 | Plug in a TCS RTD temperature probe to its unused 9-pin temp port on the Front Panel of the Flipper Breakout Box on the TCSY table. This will allow for another independent check of LVEA temperature. Will need laser hazard in LVEA to complete this work. | Closed. Installation deferred until temperature sensor locations are reviewed. |
6392.html | 2016-12-08 08:44 | Due to RAID failure I will swap cdsfs0 for cdsfs1 | |
6391.html | 2016-12-07 13:14 | Fix some bugs in the way that the ASC and LSC loops are handled in DOWN to prevent sending impulses to the optics which could ring up violin modes. (related to LLO alog 29111) | |
6390.html | 2016-12-07 11:04 | Install low pass filtering in the DAQ readbacks of the IMC servo board. | |
6389.html | 2016-12-07 10:31 | As part of testing/implementing the Beckhoff based safety system, chassis D1600379 will be installed. The unit will be placed in the TCS rack in the mechanical mezzanine. Chassis will be powered on, with none of its inputs/outputs connected. | 32526 |
6386.html | 2016-12-06 10:23 | Vent annulus volume between 2K input Mode Cleaner tubes A and B. Connect annulus ion pump hardware and pump with rotating shaft vacuum pumps until ion pump can maintain unassisted. Can be done limited to maintenance days until complete | 32254 & 32531 |
Previous WPs | | | |
6381.html | 2016-12-05 12:18 | Replace the TCSY Flow meter. Turn TCSY laser off, valve out piping volume, remove yellow paddle wheel flow meter, install new one (same version/model number). Check flow, get laser going again - will need to stabilize. | Defer to Dec 20 or Jan 3 Maintenance |
6368.html | 2016-12-02 12:01 | Continue with schedule of roaming high-frequency calibration line from PCALX to establish high-frequency calibration uncertainty. Switching frequencies will only occur in either Single IFO time or when IFO is down, otherwise we should be observation ready. Detchar will continue to be informed. We expect to complete the schedule in ~1 week, and then line will be turned off until further notice. | 32179 & 32515 |
I incremented the heat at both ENDX and ENDY.
The X control signal has been increased to 13.5 mA from 12.5.
The Y control signal has been increased to 12 mA from 11.
Tried to trend the LVEA and PSL dust monitors this morning to see if there were any changes to the PSL dust readings after swapping the dust monitor in the PSL enclosure (PSL 101). Found a gap in the data that coincides with the time when the controller was unplugged. Can trend before the outage and after, but not across it. The plots are of the past 21 hours, which is the longest period of data available right now. It is too soon to tell, but it looks like the PSL enclosure data is consistent between the two monitors. Will check again in a few days after more data has accumulated.
Interesting to note: after I noted all of the pertinent information for the California quake, USGS updated and inserted 3 more quakes between the one reported here and the previous one in California (2 that could have contributed to the detriment of the lock). See far-right attached image.
OOPS! USGS didn't update in the way that I thought it did. I was zoomed in on California, so those were the only ones showing in the margin. Sorry about the mis-post.
I'm attaching a few lockloss plots. The first two are the "stock" lockloss plots, one from about ten minutes before the earthquake and one at the lockloss. These plots show a number of useful angular signals from the IFO. Not much to say, but it looks like the arm angular controls are the ones that get the most upset (ASC C/DHARD/SOFT). They seem to go up by ~10x or more, while PRC/SRC etc. go up by a factor of 4 or so. The point where the arm ASC gets fuzzy at -10 seconds on the second plot looks to be of interest. Did we saturate some sensor here?
The next two plots show length drives for different suspensions, some ISI sensors, and some ground sensors. The third plot is for 10 minutes before, the fourth is for the lockloss. I also included the MICH freeze signal (LSC_CPSFF), which looks like it could be interesting to monitor during an earthquake (a rough sketch of such monitoring is given below). Most of the time this seems to be a couple hundred nanometers peak to peak, but during the earthquake this grows to a couple thousand. We didn't get much warning for this earthquake (it was in California), so I don't know what good monitoring local sensors would have done this time around.
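As a rough illustration of the kind of peak-to-peak monitoring suggested above for LSC_CPSFF, here is a minimal numpy sketch that assumes the time series has already been fetched into an array (e.g. via an NDS client); the sample rate, window length, and synthetic data are illustrative assumptions only, not the channel's real parameters.

# Sketch of rolling peak-to-peak monitoring of a signal such as LSC_CPSFF.
# The data source, sample rate, and window length are assumptions.
import numpy as np

def rolling_peak_to_peak(data, sample_rate, window_s=10.0):
    """Return the peak-to-peak value of `data` in consecutive windows
    of `window_s` seconds."""
    samples_per_window = int(window_s * sample_rate)
    n_windows = len(data) // samples_per_window
    trimmed = data[:n_windows * samples_per_window]
    windows = trimmed.reshape(n_windows, samples_per_window)
    return windows.max(axis=1) - windows.min(axis=1)

if __name__ == '__main__':
    # Synthetic example: quiet signal (~hundreds of nm p-p) with a simulated EQ burst.
    fs = 256.0
    t = np.arange(0, 600, 1.0 / fs)
    quiet = 100e-9 * np.random.randn(t.size)
    burst = 1e-6 * np.sin(2 * np.pi * 0.1 * t) * (t > 300) * (t < 360)
    ptp = rolling_peak_to_peak(quiet + burst, fs)
    print('max p-p: %.2e m, median p-p: %.2e m' % (ptp.max(), np.median(ptp)))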
Andy L. contacted us via TeamSpeak to let us know that we have a local issue with LDAS (& "cluster"?), as seen on the status page (https://monitor.ligo.org/ldg).
Also noticed that fw0 is INVALID on the CDS Overview (but fw1 appears to still be operational). I believe I remember Dave saying that as long as we have one of these writers still operational, we should be OK; when they BOTH go down, it's an emergency.
The main 200 amp breakers for the computer power are tripped, so there is no power other than lighting.
The h1fw0 frame writer daqd has been stopped until the disk system in LDAS can be brought up again.
Greg called me at 7:30 am PST. He asked me to head out to the LDAS room (at the VPW) because he feared we had an HVAC failure, since he could not ping any of his computers. He wanted me to see if the LDAS room was hot. I was not able to open the door with my card. I did hear a high-pitched tone from within. I was also able to reach Greg on his cell (it must have been off when I tried calling earlier).
I notified Bubba and John about the situation and they will check out the room.
Greg is going to contact Dan Moraru to have him check out their system.
I gave Ed (incoming operator) an update on the status.
Assuming Ed meant to mark this entry for 12/14.