Reports until 11:57, Thursday 08 March 2018
H1 SQZ
daniel.sigg@LIGO.ORG - posted 11:57, Thursday 08 March 2018 (40898)
Slow controls improvements

The PZT scan trigger now implements a dead time that ignores what happens at the beginning of a scan, while the PZT slews towards the start offset. At the same time, an elapsed-time readout was added, which reports the time between arming the trigger and the trigger event.
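The trigger behavior can be sketched as follows (a hypothetical Python model; the real implementation is slow-controls TwinCAT code, and all names and values below are made up for illustration):

```python
def scan_trigger(samples, threshold, dead_time, dt):
    """Return the elapsed time from arming to the trigger event,
    ignoring samples inside the dead time while the PZT slews
    towards the start offset."""
    t = 0.0
    for v in samples:
        t += dt
        if t > dead_time and v > threshold:
            return t  # elapsed time between arming and triggering
    return None  # never triggered

# The large samples during the initial slew are ignored by the dead time.
elapsed = scan_trigger([9, 9, 1, 1, 8, 1], threshold=5, dead_time=0.25, dt=0.1)
```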

A button to ignore the PSL condition was added to the TTFSS auto-locker. Since we are using a temporary reference laser, checking that the reference cavity is locked isn't useful.

The single-axis picomotor screens now show buttons to select the step size and speed. Also fixed the links from the squeezer overview screen.

The Trans OPO DC PD readout was changed to a baffle-PD type with larger transimpedance gain some time back. However, due to a TwinCAT code issue, the gain settings on the MEDM screen were backwards. This is now fixed.

LHO VE
chandra.romel@LIGO.ORG - posted 11:20, Thursday 08 March 2018 - last comment - 14:18, Thursday 08 March 2018(40897)
CP4 regen bake

GN2 flow survived the night (but did Kyle, who monitored the screen?). We now have an alarm enabled that will text the vacuum group almost immediately if the GN2 heater trips, in which case relatively cold GN2 would continue to flow through warm CP4, across bi-braze joints. Our bake setup is different from the original PSI design, which I don't think was intended to run hot GN2 for days on end.

I increased the regen setpoint to 105C. Gas temp was pretty steady at ~85C last night, with a setpoint of 95C and a proportional gain of 7. There seems to be an offset (~10C) between the setpoint and the temperature reading that I haven't been able to remove by changing the gain, so for now I increased the setpoint to make up for it.  The GN2 flow is ~20 scfhx100 and 1/4 of the vaporizer is frosted (see photo). Dewar consumption varies but doesn't look to be more than the typical consumption of normally operating pumps. We may need to open the vaporizer feed valve more to allow more flow.
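That steady offset is characteristic of a proportional-only loop: with a constant heat loss, a nonzero error is required just to hold the heater output on. A toy first-order thermal model (all parameter values hypothetical, chosen only to mimic the numbers above) shows the effect:

```python
def steady_state_temp(setpoint, kp, loss, ambient):
    """Iterate a toy first-order thermal model to equilibrium under
    proportional-only control: heater output = kp * (setpoint - T)."""
    T = ambient
    for _ in range(20000):
        heat = max(0.0, kp * (setpoint - T))       # P-only heater, no integral term
        T += 0.01 * (heat - loss * (T - ambient))  # simple lumped thermal model
    return T

# With a 95C setpoint and gain 7, this toy loop settles ~9C low,
# which is why bumping the setpoint up compensates for the offset.
T = steady_state_temp(setpoint=95.0, kp=7.0, loss=1.0, ambient=20.0)
offset = 95.0 - T  # steady-state error of the P-only loop
```

A higher proportional gain shrinks the offset but risks instability; an integral term would remove it entirely.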

Images attached to this report
Comments related to this report
chandra.romel@LIGO.ORG - 14:18, Thursday 08 March 2018 (40904)

Increased the flow of regen GN2 and also the temperature. The enclosure heater is outputting 100%; I suspect the GN2 is pulling heat out of the enclosure. Because we're measuring the temp of the GN2 outside, we should bump it up 10C higher than where we want it at CP4, to compensate for losses. The new setpoint is 115C, with the intention of raising the outside gas temp to 105C. Supply air at the bake enclosure is at 95C.

The vaporizer feed valve is fully open at 2-1/8 turns, with flow measuring around 40 scfhx100. Not sure we can achieve PSI's 55 scfhx100 spec.

H1 CAL (CAL)
richard.savage@LIGO.ORG - posted 21:58, Wednesday 07 March 2018 (40895)
Document explaining Pcal absolute displacement calibration procedure

SudarshanK, RickS

We recently generated a document that explains the procedure for calibrating the Pcal power sensor readback channels in absolute displacement of the ETM.

The document is LIGO-T1800046-v2, "Calculation of Pcal absolute force coefficients and (displacement) calibration of Pcal power sensor readback channels."

H1 CAL (CAL)
richard.savage@LIGO.ORG - posted 21:50, Wednesday 07 March 2018 (40894)
X-end Pcal in-chamber optical loss measurements

NikoL, DarkhanT, EvanG, RickS

This afternoon, we made in-chamber measurements using a Pcal integrating sphere-based power sensor (photos attached).

We made a complete set of measurements that should allow us to quantify the optical losses on the incident and reflected paths for both beams.

Darkhan plans to process the data using the scripts he developed for similar measurements made on Jan. 10, 2018 at Y-end and post the results soon.

Images attached to this report
H1 CAL (CAL)
richard.savage@LIGO.ORG - posted 21:38, Wednesday 07 March 2018 (40892)
X-end Pcal work this morning - Alignment and beam centering checks

TravisS, NikoL, JimW, RickS

This morning, after Jim locked the ICS, we aligned the ETM (to the OptLev) and installed the Pcal target on the ETM suspension frame.

We checked the gaps between Periscope upper flexures and the A7 adapter wall.  They were both well over  0.230", on both the front and back sides.   We rotated the upper flexure compression screws to decrease the gaps (increase the compression) to less than 0.220".

We then checked the centering of the Pcal beams on the periscope optics.  They were not too bad.

We then carefully aligned the Pcal beams to the input beam scribes on the Pcal ETM target (see attached photos) using the final relay mirrors on the input paths on the periscope structure and re-checked the beam centering on the periscope mirrors.  They were pretty good.

We removed the ETM alignment target.

We then carefully centered the beams on the input aperture of the Receiver Module integrating sphere using the turning mirrors in the Receiver Module.

Overall, things look pretty good, Pcal-wise, at Xend.

Images attached to this report
H1 SQZ
sheila.dwyer@LIGO.ORG - posted 18:48, Wednesday 07 March 2018 (40890)
squeezer progress this afternoon

Terry, Sheila, Nutsinee,

In addition to the work this morning on the DC diodes, we made some more alignment progress in HAM6 this afternoon. 

We are now finished with the viewpoint simulator, and ready to inject light into HAM5. 

LHO VE
kyle.ryan@LIGO.ORG - posted 17:39, Wednesday 07 March 2018 - last comment - 21:42, Wednesday 07 March 2018(40888)
Weird pressure trends (CP4 bake-out)
Attached are the pressure trends related to the CP4 from the past two days.  Keep in mind that much experimenting was taking place during this time frame as we sought to understand our bake-out system so PT245's erratic behavior isn't surprising.  Rather, the correlation to PT210 absent a correlation with PT246B is what has me scratching my head.  

I didn't make any changes to the regeneration parameters, but did notice, when standing outside by the flow meter and exhaust outlet, that there was some cyclical surging of flow between 25-40 SCFH, which could be heard in the exiting exhaust and which I hadn't noticed before.  Also, more of the radiator sections of the ambient-air vaporizer were frosted than had been earlier.  I am surprised that CP4's exhaust pressure indicates 0.0 PSIG with so much flow.

Note the increase in LN2 consumption (see attached graph) as the result of regenerating CP4.
Non-image files attached to this report
Comments related to this report
chandra.romel@LIGO.ORG - 21:42, Wednesday 07 March 2018 (40893)

Y2 side of beam tube is not affected by the bake because it is bone dry compared to Y1 side (spool that replaced BSC). PT-245 and PT-210 are sensitive to bake temperature which has been nonlinear the past couple of days with new variables - heated GN2 and heat transfer from bake enclosure to GN2 pipes.

Unfortunately we don't have the luxury of measuring the exhaust pressure because the pressure transducers are out of commission.

H1 SUS
filiberto.clara@LIGO.ORG - posted 17:25, Wednesday 07 March 2018 - last comment - 12:30, Thursday 08 March 2018(40889)
H1 ETMY - Ground Loops

Continued hunting ground loops at EY. After various attempts, all shorts for ETMY are now fixed. Some of the steps taken to remedy the shorts:

1. Betsy loosened and re-tightened some of the connections at the feedthru (in chamber).
2. Shorts on the UIM and PUM were no longer present. The short on ETMY M0 was still present. New short on M0/R0.
3. Disconnected cables at feedthru (air side) and verified pin 13 was not shorted to ground inside chamber.
4. Checked that the shield and backshell on connectors were not touching (feedthru air side).
5. Disconnected all cables coming from chamber at the SUS Amplifier boxes.
6. Checked that all pins and backshells were not shorted to chamber ground. All passed.
7. Reconnected all cables back to SUS Amplifier units.

Comments related to this report
betsy.weaver@LIGO.ORG - 12:30, Thursday 08 March 2018 (40899)

Further to the above, Fil reported that he had not secured the cables connected to the 7 SUS EY Sat Amp boxes with their mating screws.  

This morning, we ran the spectra again and see that all of the combs cleared up.  So that is good.

However, a new round of TFs (taken since Jim had the ISI unlocked and we were diagnosing something else) showed no coherence in the R0 Transverse DOF.  This DOF uses only the Side BOSEM.  This T TF was fine on Friday, before all of the cable-short remediation work.  (Not sure why it doesn't reveal itself in the spectra...)  So we think the actuation of the 1 OSEM is now broken.

Travis and I went down to EY this morning and secured all of the cables at the Sat Amp Boxes.  The R0 T DOF still looked bad.  So, we went ahead and did a quick swap of the R0 Side BOSEM inside on the QUAD.  Still no change.

Richard happened to be in the building - he power cycled the R0 Face1,2,3,Side Coil driver.  Still bad.

We all broke for lunch.

For the record, the old BOSEM that came off of R0 Side was s/n 083; the new one going in is s/n 291.

LHO VE
kyle.ryan@LIGO.ORG - posted 16:59, Wednesday 07 March 2018 (40887)
CP4 bake-out controller code "bug" demonstrated
Today, Chandra R. had the MCE representative on site, and we were able to demonstrate the steps from our normal operation that had resulted in CP4's bake-out controller violating our specified 1C/hour maximum rate of change.  Specifically, if the SETPOINT value is manually lowered below the real-time measured temperature (supplied by TC1) while the program is running and in the initial RAMP stage of the ramp/soak heating profile, the program responds by advancing to the SOAK segment of the profile.  This is logical and expected.  However, if the SETPOINT value is then raised above the real-time measured temperature, the program does not revert to the RAMP segment but stays in SOAK mode.  This is a problem: the output of the controller is not limited to the 1C/hour rate of change while in the SOAK segment, and is thus free to increase the heater output as high as needed to bring the real-time measured temperature up to the new SETPOINT value.  This situation happened to us and could have been damaging if left unnoticed for too long (see https://alog.ligo-wa.caltech.edu/aLOG/index.php?callRep=40867).
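The failure mode can be captured in a few lines of Python (a hypothetical sketch, not the MCE vendor code): the profile advances RAMP to SOAK when the setpoint drops below the measured temperature, but there is no transition back, so a subsequent setpoint increase is chased at full output with no 1C/hour limit.

```python
class RampSoak:
    """Hypothetical sketch of the controller's ramp/soak profile logic."""
    MAX_RATE = 1.0  # degC per hour, enforced only during RAMP

    def __init__(self):
        self.stage = "RAMP"

    def target(self, setpoint, measured, last_target, dt_hours):
        if self.stage == "RAMP":
            if setpoint < measured:
                # Lowering SETPOINT below the measured temperature
                # advances the profile to SOAK -- logical and expected.
                self.stage = "SOAK"
                return setpoint
            # Normal ramping, limited to MAX_RATE.
            return min(setpoint, last_target + self.MAX_RATE * dt_hours)
        # The bug: SOAK never reverts to RAMP, so a raised SETPOINT is
        # followed immediately, with no rate-of-change limit.
        return setpoint

ctl = RampSoak()
ctl.target(setpoint=50, measured=60, last_target=55, dt_hours=1.0)   # enters SOAK
jump = ctl.target(setpoint=100, measured=60, last_target=50, dt_hours=1.0)
# jump is 100: a 40C+ step commanded at once instead of 1C/hour
```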

Anyway, the MCE guy acknowledged this shortcoming of their in-house code and resolved to develop a patch and get back to us.  In the meantime, we now know the cause, effect, and workaround, so we have opted to continue with the in-progress bake of CP4.



H1 PSL
jason.oberling@LIGO.ORG - posted 16:50, Wednesday 07 March 2018 - last comment - 16:54, Thursday 15 March 2018(40886)
PSL 70W Amplifier Installation Update

J. Oberling, E. Merilh

The last few days have been spent taking caustic measurements and searching for mode matching solutions so we can identify what lenses we need for mode matching prior to placing elements on the table; as a result there is not a whole lot of installation activity to report.  We left off Friday with a lovely LG01 mode in the Wincam.  On Monday morning Bubba very kindly shaved some mounts down for us (for WP02 and PBS02) to solve our clipping problem.  These were installed and the LG01 mode was still observed on the Wincam.  To see if we were seeing something real or just an issue with the Wincam, we borrowed a Thorlabs rotating slit beam profiler from the Pcal folks (who had lent it to the SQZ folks).  Setting this profiler up, we saw a nice Gaussian beam on the profiler.  Apparently something is up with that Wincam, so we will continue to use the Thorlabs profiler.

Using the same 300mm focal length lens, we took a measurement of the FE beam caustic, attached as LHO_FE_Caustic1.txt.  The first column is scale position in cm (corrected for the fact we had the wrong location of the sensor in the profiler when taking the measurement), and the last 2 columns are horizontal and vertical beam radii in µm, respectively.  This was imported into JamMT, the lens added, and the resulting FE waist given as 77.2 µm in radius, positioned ~15mm outside of the FE box.  This is uploaded as FE_caustic_after_distance_check.png (z=0 is the end of the scale used to take the caustic measurement, all distances are relative to that); we double-checked all of our distance measurements this morning and made some small corrections, hence the filename.  While the location makes sense, the beam radius seems small, as the LIGO FE lasers were all measured to be between 150µm and 250µm; although we have swapped the NPRO in the FE, which can have an effect on the waist size and location.  JamMT yielded only one mode matching solution with this initial waist that fits within the constraints of the beam path; this is uploaded as 70W_MM_solution1-FE_caustic_with_lens.png.  Unfortunately, I didn't think to take a picture of the measurement setup used; I'll take one tomorrow morning and upload it as a comment to this alog.

As a double check, we measured the caustic with no lens in place, and followed the same procedure as above.  The measurement is uploaded as LHO_FE_Caustic_no_lens.txt (same units as the previous .txt file; this time the position dimension is referenced to the FE box, since there was no lens installed), and the resulting FE waist as FE_Caustic_no_lens.png.  As can be seen, something is not quite right with this measurement, as it claims the waist is 2mm in diameter and located some 4.9 meters (yes, meters) behind the FE box (this is somewhere way past the NPRO and not in the FE box at all...), so this isn't the double check we thought it would be.  We tried installing a 400mm focal length lens in place of the 300mm, but this put the resulting focus off the edge of the table; a 200mm focal length lens gets the spot too small for the profiler to give accurate measurements.  We will do some more investigation of this in the morning (maybe try the 200mm lens anyway and see what we get), but at this point I think our best check is to set up the lenses from the given solution, put the Thorlabs profiler at the location the 70W amplifier expects the beam waist to be, and see what our beam diameter is.  If we're close, then onward we go; if not, then more investigation is needed.
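For reference, the waist fit can be sketched directly: since w(z)^2 for a Gaussian beam is quadratic in z, an ordinary polynomial fit of the measured radii recovers the waist size and location. This is a minimal sketch, not JamMT's actual algorithm, and the data below are synthetic, generated from the 77.2 µm waist quoted above:

```python
import numpy as np

LAM = 1064e-9  # Nd:YAG wavelength in meters

def fit_waist(z, w):
    """Recover waist radius w0 and location z0 from caustic data.

    For a Gaussian beam, w(z)^2 = w0^2 + (LAM/(pi*w0))^2 * (z - z0)^2
    is quadratic in z, so a plain polynomial fit of w^2 vs z suffices.
    """
    a, b, _c = np.polyfit(np.asarray(z), np.asarray(w) ** 2, 2)
    z0 = -b / (2 * a)                # waist position
    w0 = LAM / (np.pi * np.sqrt(a))  # waist radius
    return w0, z0

# Synthetic caustic using the waist reported above (77.2 um radius);
# positions in meters along the measurement scale.
w0_true, z0_true = 77.2e-6, 0.25
zR = np.pi * w0_true ** 2 / LAM      # Rayleigh range
z = np.linspace(0.0, 0.5, 20)
w = w0_true * np.sqrt(1 + ((z - z0_true) / zR) ** 2)

w0_fit, z0_fit = fit_waist(z, w)
```

Fitting the real two-column caustic files this way (after converting cm to m) would give an independent check on the JamMT result.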

Images attached to this report
Non-image files attached to this report
Comments related to this report
jason.oberling@LIGO.ORG - 16:54, Thursday 15 March 2018 (41029)

Promised picture of the setup used to measure the FE beam caustic.  This is the same setup used for both measurements, with the 300mm lens and with no lens.  The optics are, from right to left: 95%R output coupler, pump light filter, OD = 4.0 ND filter.  The Thorlabs beam profiler was moved along the scale attached to the table to get the distance measurements.

Images attached to this comment
H1 General
jeffrey.bartlett@LIGO.ORG - posted 16:20, Wednesday 07 March 2018 (40883)
Ops Day Shift Summary
Ops Shift Log: 03/07/2018, Day Shift 16:00 – 00:00 (08:00 – 16:00) Time - UTC (PT)
State of H1: Unlocked during vent for upgrades
Intent Bit: Engineering  
Support: X
Incoming Operator: N/A
Shift Summary: Commissioning and upgrade work continues at both end stations. Vacuum work at Mid-Y. Alignment of the PSL 70W laser.  
 
Activity Log: Time - UTC (PT)
15:30 (07:30) Terry – In LVEA at ISCT6
16:00 (08:00) Start of shift
16:27 (08:27) Joel from RDO on site to deliver parts for tractor
16:43 (08:43) Marc – Going into the LVEA
16:44 (08:44) HFD – On site to run smoke tests at End-Y
16:46 (08:46) Sheila – Going to HAM6
16:57 (08:57) Marc – Out of the LVEA
17:11 (09:11) Travis – Going to LVEA to look for parts
17:17 (09:17) Richard – Going to LVEA to find headsets
17:30 (09:30) Travis – Out of the LVEA
17:31 (09:31) Jim – Going to End-X to lock the ISI
17:40 (09:40) Nutsinee – Going to ISCT6
17:58 (09:58) Rick, Niko, & Travis – Going to End-X for PCal work (WP #7404)
18:13 (10:13) Laser hazard at End-X
18:30 (10:30) TJ – Going to HAM6 for cabling work
18:31 (10:31) Jason & Ed – Going into the PSL enclosure for mode matching
18:31 (10:31) Marc – Going into LVEA to look a CRD
18:45 (10:45) Dave B. – Going into H2 building to work on 3IFO computers
18:50 (10:50) Betsy – Going to End-Y for chamber closeout work
18:54 (10:54) Chandra – Going to Mid-Y
18:56 (10:56) Corey – Going over the tube bridge to look for shipping container
19:04 (11:04) Elizabeth – Going to both end stations for marking off electrical exclusion areas
19:22 (11:22) Jason & Ed – Out of the PSL Enclosure
19:23 (11:23) Karen – Cleaning at Mid-Y
19:34 (11:34) Sheila – Out of the LVEA
19:38 (11:38) Nutsinee – Out of the LVEA
20:06 (12:06) Karen – Finished at Mid-Y
20:15 (12:15) Elizabeth – Back from end stations
20:21 (12:21) Rick, Niko, & Travis – Leaving End-X
20:21 (12:21) Laser Safe at End-X
20:54 (12:54) TJ – Out of the LVEA
21:02 (13:02) Terry – Out of the LVEA
21:03 (13:03) Contractor on site to see Chandra
21:14 (13:14) Karen – Cleaning in the H2 building
21:20 (13:20) Elizabeth – Into the LVEA for marking electrical exclusion zones
21:22 (13:22) Filiberto – Going to End-Y for ground loop checks
21:23 (13:23) Apollo crew – Taking one-ton to End-X to drop off pallet for SEI
21:27 (13:27) Jason & Ed – Into the PSL Enclosure for 70W amp work
21:31 (13:31) Karen – Finished in H2 building. – Going to Wood shop
21:46 (13:46) Sheila – Going to HAM6
21:44 (13:44) Marc – Going into LVEA to check the CRD
21:54 (13:54) Nutsinee – Going to HAM6
21:55 (13:55) Terry – Going to HAM6
22:15 (14:15) TJ – Out of the LVEA
22:41 (14:41) Rick, Evan, Darkhan, Niko – Going to End-Y for PCal
22:43 (14:43) Jason & Ed – Out of the PSL Enclosure
23:22 (15:22) Jim & TJ – Going to End-Y to work on relocking the ISI
23:38 (15:38) Marc – Going into the LVEA to check on the CRD
00:00 (16:00) End of Shift
00:03 (16:03) Marc – Out of the LVEA
 

 

H1 SUS
betsy.weaver@LIGO.ORG - posted 14:29, Wednesday 07 March 2018 - last comment - 13:34, Thursday 08 March 2018(40882)
ETMX SUS OPLEV trends - unexpected jump. Already.

This morning, Travis had to repoint the ETMX with some bias to get it back on the Oplev, which I found odd.  The Oplev had been zeroed to the ETMX SUS last ~Monday, reportedly by Jason.  However, besides that jump in the trend data, there is another jump on Tuesday.  I am guessing that the ISI EX model work/bootfest changed the pointing of the floating ISI/SUS.  Travis used the ETMX SUS to repoint back to zero.

This morning, Jim locked the ISI so hopefully there will be no more unexpected shifts due to anything ISIish.

Side note - try not to bump the OPLEV piers.  Ever. 

 

 

Images attached to this report
Comments related to this report
hugh.radkins@LIGO.ORG - 13:34, Thursday 08 March 2018 (40900)

Not that it isn't possible, but it does not look to me like the reboots of the ISI did anything to the OpLev readout.  Further, what can SEI do to pitch unless the optic is locked?  Based on everything I know and can see, I did not think the SEI could do this; it has only been DAMPED, READY, or TRIPPED.

The attached 3-hour plot shows the OpLev pitch, and the WD, Guardian, and CPS positions for the ETMX.  The step on the OpLev occurs at 19:40 on Tuesday.  It is a full 14 minutes later that the Guardian is manipulated, and 26 minutes later before the isietmx FE boot at 20:06.  Below is the day's boot start log -- nothing is causal to the OpLev shift.  Meanwhile, there is no hint of movement on the ISI (HEPI has been locked for more than a week) during the OpLev pitch shift (yaw is tiny), and none until the signals go to zero at 20:05 at the start of the ISI boot.  Just goes to show, sometimes you have to look closer.

hugh.radkins@opsws1:06 1$ more *.log
2018_03_06 11:13 h1asc
2018_03_06 11:13 h1omc
2018_03_06 11:13 h1sqzwfs
2018_03_06 11:15 h1susopo
2018_03_06 12:02 h1isiitmy
2018_03_06 12:04 h1oaf
2018_03_06 12:06 h1isietmx
2018_03_06 12:06 h1pemex
2018_03_06 12:13 h1alsex
2018_03_06 12:13 h1calex
2018_03_06 12:13 h1iopiscex
2018_03_06 12:13 h1iscex
2018_03_06 12:13 h1pemex
2018_03_06 12:24 h1oaf
2018_03_06 12:43 h1iopiscex
2018_03_06 12:43 h1pemex
2018_03_06 12:45 h1alsex
2018_03_06 12:45 h1calex
2018_03_06 12:45 h1iscex
2018_03_06 12:56 h1dc0
2018_03_06 12:57 h1dc0
2018_03_06 13:01 h1broadcast0
2018_03_06 13:01 h1dc0
2018_03_06 13:01 h1fw0
2018_03_06 13:01 h1fw1
2018_03_06 13:01 h1fw2
2018_03_06 13:01 h1nds0
2018_03_06 13:01 h1nds1
2018_03_06 13:01 h1tw1
2018_03_06 15:20 h1oaf
2018_03_06 15:20 h1pemcs
2018_03_06 22:03 h1fw0
2018_03_06 22:12 h1fw0
hugh.radkins@opsws1:06 0$ pwd
/opt/rtcds/lho/h1/data/startlog/2018/03/06
 

Images attached to this comment
H1 SQZ
daniel.sigg@LIGO.ORG - posted 13:17, Wednesday 07 March 2018 - last comment - 16:31, Wednesday 07 March 2018(40881)
VOPO Work

TJ Sheila Daniel

TJ fixed the short on DCPD2 between the anode and case, following Rich's suggestion of adding a piece of viton under the ceramic circuit board. The same was done for DCPD1.

However, we found that for DCPD1 the anode and cathode are swapped. We made it work with some clip leads (for now).

We tested the PZTs and both of them are working.

Comments related to this report
thomas.shaffer@LIGO.ORG - 16:31, Wednesday 07 March 2018 (40884)

Small correction, I added a piece of Kapton tubing that we normally use to help route/control cables. I rolled the tubing and then cut a small slit vertically so that I could stick the top pin through and hold it in place. Picture attached.

Images attached to this comment
H1 GRD
jameson.rollins@LIGO.ORG - posted 10:14, Wednesday 28 February 2018 - last comment - 09:38, Thursday 08 March 2018(40765)
starting process of moving guardian nodes to new guardian supervision host

We are setting up a new guardian host machine.  The new machine (currently "h1guardian1", but to be renamed "h1guardian0" after the transition is complete) is running Debian 9 "stretch", with all CDS software installed from pre-compiled packages from the new CDS debian software archives.  It has been configured with a completely new "guardctrl" system that will manage all the guardian nodes under the default systemd process manager.  A full description of the new setup will come in a future log, after the transition is complete.

The new system is basically ready to go, and I am now beginning the process of transferring guardian nodes over to the new host.  For each node to be transferred, I will stop the process on the old machine, and start it fresh on the new system.

I plan on starting with SUS and SEI in HAM1, and will move through the system ending with HAM6.
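For context, "managed under systemd" here means each node runs as an instance of a user-level template unit; a hypothetical sketch consistent with the unit names that appear in the coredump reports further down (guardian@SUS_MC3.service) might look like the following, though the actual guardctrl-generated unit may differ:

```ini
# Hypothetical ~/.config/systemd/user/guardian@.service sketch;
# %i expands to the node name (e.g. SUS_MC3).  The module path below
# is illustrative -- real node modules live in various userapps dirs.
[Unit]
Description=Guardian node %i

[Service]
Slice=guardian.slice
ExecStart=/usr/bin/guardian %i /opt/rtcds/userapps/release/sus/common/guardian/%i.py
Restart=on-failure

[Install]
WantedBy=default.target
```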

Comments related to this report
jameson.rollins@LIGO.ORG - 17:02, Saturday 03 March 2018 (40831)

There's been a bit of a hitch with the guardian upgrade.  The new machine (h1guardian1) has been setup and configured.  The new supervision system and control interface are fully in place, and all HAM1 and HAM2 SUS and SEI nodes have been moved to the new configuration.  Configuration is currently documented in the guardian gitlab wiki.

Unfortunately, node processes are occasionally spontaneously seg faulting for no apparent reason.  The failures are happening at a rate of roughly one every 6 hours or so.  I configured systemd to catch and log coredumps from segfaults for inspection (using the systemd-coredump utility).  After we caught our next segfault (which happened only a couple of hours later), Jonathan Hanks and I started digging into the core to see what we could ferret out.  It appears to be some sort of memory corruption error, but we have not yet determined where in the stack the problem is coming from.  I suspect that it's in the pcaspy EPICS portable channel access python bindings, but it could be in EPICS.  I think it's unlikely that it's in python2.7 itself, although we aren't ruling anything out.

We then set up the processes to be run under electric fence to try to catch any memory out-of-bounds errors.  This morning I found two processes that had been killed by efence, but I have not yet inspected the core files in depth.  Below are the coredump summaries from coredumpctl on h1guardian1.

This does not bode well for the upgrade.  Best case we figure out what we think is causing the segfaults early in the week, but there still won't be enough time to fix the issue, test, and deploy before the end of the week.  A de-scoped agenda would be to just do a basic guardian core upgrade in the existing configuration on h1guardian0 and delay the move to Debian 9 and systemd until we can fully resolve the segfault issue.

Here is the full list of nodes currently running under the new system:

HPI_HAM1        enabled    active    
HPI_HAM2        enabled    active    
ISI_HAM2        enabled    active    
ISI_HAM2_CONF   enabled    active    
SEI_HAM2        enabled    active    
SUS_IM1         enabled    active    
SUS_IM2         enabled    active    
SUS_IM3         enabled    active    
SUS_IM4         enabled    active    
SUS_MC1         enabled    active    
SUS_MC2         enabled    active    
SUS_MC3         enabled    active    
SUS_PR2         enabled    active    
SUS_PR3         enabled    active    
SUS_PRM         enabled    active    
SUS_RM1         enabled    active    
SUS_RM2         enabled    active    

If any of these nodes show up white on the guardian overview screen, it's likely because they have crashed.  Please let me know and I will deal with them asap.


guardian@h1guardian1:~$ coredumpctl info 11512
           PID: 11512 (guardian SUS_MC)
           UID: 1010 (guardian)
           GID: 1001 (controls)
        Signal: 11 (SEGV)
     Timestamp: Sat 2018-03-03 11:56:20 PST (4h 50min ago)
  Command Line: guardian SUS_MC3 /opt/rtcds/userapps/release/sus/common/guardian/SUS_MC3.py
    Executable: /usr/bin/python2.7
 Control Group: /user.slice/user-1010.slice/user@1010.service/guardian.slice/guardian@SUS_MC3.service
          Unit: user@1010.service
     User Unit: guardian@SUS_MC3.service
         Slice: user-1010.slice
     Owner UID: 1010 (guardian)
       Boot ID: 870fed33cb4446e298e142ae901c1830
    Machine ID: 699a2492538f4c09861889afeedf39ab
      Hostname: h1guardian1
       Storage: /var/lib/systemd/coredump/core.guardianx20SUS_MC.1010.870fed33cb4446e298e142ae901c1830.11512.1520106980000000000000.lz4
       Message: Process 11512 (guardian SUS_MC) of user 1010 dumped core.
                
                Stack trace of thread 11512:
                #0  0x00007f1255965646 strlen (libc.so.6)
                #1  0x00007f12567c86ab EF_Printv (libefence.so.0.0)
                #2  0x00007f12567c881d EF_Exitv (libefence.so.0.0)
                #3  0x00007f12567c88cc EF_Exit (libefence.so.0.0)
                #4  0x00007f12567c7837 n/a (libefence.so.0.0)
                #5  0x00007f12567c7f30 memalign (libefence.so.0.0)
                #6  0x00007f1241cba02d new_epicsTimeStamp (_cas.x86_64-linux-gnu.so)
                #7  0x0000556e57263b9a call_function (python2.7)
                #8  0x0000556e57261d45 PyEval_EvalCodeEx (python2.7)
                #9  0x0000556e5727ea7e function_call.lto_priv.296 (python2.7)
                #10 0x0000556e57250413 PyObject_Call (python2.7)
...

guardian@h1guardian1:~$ coredumpctl info 11475
           PID: 11475 (guardian SUS_MC)
           UID: 1010 (guardian)
           GID: 1001 (controls)
        Signal: 11 (SEGV)
     Timestamp: Sat 2018-03-03 01:33:51 PST (15h ago)
  Command Line: guardian SUS_MC1 /opt/rtcds/userapps/release/sus/common/guardian/SUS_MC1.py
    Executable: /usr/bin/python2.7
 Control Group: /user.slice/user-1010.slice/user@1010.service/guardian.slice/guardian@SUS_MC1.service
          Unit: user@1010.service
     User Unit: guardian@SUS_MC1.service
         Slice: user-1010.slice
     Owner UID: 1010 (guardian)
       Boot ID: 870fed33cb4446e298e142ae901c1830
    Machine ID: 699a2492538f4c09861889afeedf39ab
      Hostname: h1guardian1
       Storage: /var/lib/systemd/coredump/core.guardianx20SUS_MC.1010.870fed33cb4446e298e142ae901c1830.11475.1520069631000000000000.lz4
       Message: Process 11475 (guardian SUS_MC) of user 1010 dumped core.
                
                Stack trace of thread 11475:
                #0  0x00007fa7579b5646 strlen (libc.so.6)
                #1  0x00007fa7588186ab EF_Printv (libefence.so.0.0)
                #2  0x00007fa75881881d EF_Exitv (libefence.so.0.0)
                #3  0x00007fa7588188cc EF_Exit (libefence.so.0.0)
                #4  0x00007fa758817837 n/a (libefence.so.0.0)
                #5  0x00007fa758817f30 memalign (libefence.so.0.0)
                #6  0x00005595da26610f PyList_New (python2.7)
                #7  0x00005595da28cb8e PyEval_EvalFrameEx (python2.7)
                #8  0x00005595da29142f fast_function (python2.7)
                #9  0x00005595da29142f fast_function (python2.7)
                #10 0x00005595da289d45 PyEval_EvalCodeEx (python2.7)
...
jameson.rollins@LIGO.ORG - 20:09, Wednesday 07 March 2018 (40891)

After implementing the efence stuff above, we came in to find more coredumps the next day.  On a cursory inspection of the coredumps, we noted that they all showed completely different stack traces.  This is highly unusual and pathological, and prompted Jonathan to question the integrity of the physical RAM itself.  We swapped out the RAM with a new 16G ECC stick and let it run for another 24 hours.

When next we checked, we discovered only two efence core dumps, indicating an approximate factor of three increase in the mean time to failure (MTTF).  However, unlike the previous scatter shot of stack traces, these all showed identical "mprotect" failures, which seemed to point to a side effect of efence itself running into limits on per-process memory map areas.  We increased the "max_map_count" (/proc/sys/vm/max_map_count) by a factor of 4, again left it running overnight, and came back to no more coredumps.  We cautiously declared victory.
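The max_map_count check can be sketched as follows (Linux-only; writing the new value requires root, so only the read side is shown, and the factor of 4 is the one quoted above):

```python
# /proc/sys/vm/max_map_count limits how many memory map areas a single
# process may hold; efence consumes one per guarded allocation, so it
# can exhaust the default limit long before real memory runs out.
with open("/proc/sys/vm/max_map_count") as f:
    current = int(f.read())

desired = current * 4  # the factor applied above
# As root, one would write str(desired) back to the same /proc file,
# or equivalently run: sysctl -w vm.max_map_count=<desired>
```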

I then started moving the remaining guardian nodes over to the new machine.  I completed the new setup by removing the efence, and rebooting the new machine a couple of times to work out the kinks.  Everything seemed to be running ok...

Until more segfaults/coredumps appeared.  A couple of hours after the last reboot of the new h1guardian1 machine, there were three segfaults, all with completely different stack traces.  I'm now wondering if efence was somehow masking the problem.  My best guess there is that efence was slowing down the processes quite a bit (by increasing system call times), which increased the MTTF by a similar factor.  Or the slower processes were less likely to run into some memory corruption race condition.

I'm currently running memtest on h1guardian1 to see if anything shows up, but it's passed all tests so far...

jameson.rollins@LIGO.ORG - 09:38, Thursday 08 March 2018 (40896)

16 seg faults overnight, after rebooting the new guardian machine at about 9pm yesterday.  I'll be reverting guardian to the previous configuration today.

Interestingly, though, almost all of the stack traces are of the same type, which is different than what we were seeing before where they're all different.  Here's the trace we're seeing in 80% of the instances:

                #0  0x00007ffb9bfe4218 malloc_consolidate (libc.so.6)
                #1  0x00007ffb9bfe4ea8 _int_free (libc.so.6)
                #2  0x000055d2caca7bc5 list_dealloc.lto_priv.1797 (python2.7)
                #3  0x000055d2cacdb127 frame_dealloc.lto_priv.291 (python2.7)
                #4  0x000055d2caccb450 fast_function (python2.7)
                #5  0x000055d2caccb42f fast_function (python2.7)
                #6  0x000055d2caccb42f fast_function (python2.7)
                #7  0x000055d2caccb42f fast_function (python2.7)
                #8  0x000055d2cacc3d45 PyEval_EvalCodeEx (python2.7)
                #9  0x000055d2cace0a7e function_call.lto_priv.296 (python2.7)
                #10 0x000055d2cacb2413 PyObject_Call (python2.7)
                #11 0x000055d2cacf735e instancemethod_call.lto_priv.215 (python2.7)
                #12 0x000055d2cacb2413 PyObject_Call (python2.7)
                #13 0x000055d2cad69c7a call_method.lto_priv.2801 (python2.7)
                #14 0x000055d2cad69deb slot_mp_ass_subscript.lto_priv.1204 (python2.7)
                #15 0x000055d2cacc6c5b PyEval_EvalFrameEx (python2.7)
                #16 0x000055d2cacc3d45 PyEval_EvalCodeEx (python2.7)
                #17 0x000055d2cace0a7e function_call.lto_priv.296 (python2.7)
                #18 0x000055d2cacb2413 PyObject_Call (python2.7)
                #19 0x000055d2cacf735e instancemethod_call.lto_priv.215 (python2.7)
                #20 0x000055d2cacb2413 PyObject_Call (python2.7)
                #21 0x000055d2cad69c7a call_method.lto_priv.2801 (python2.7)
                #22 0x000055d2cad69deb slot_mp_ass_subscript.lto_priv.1204 (python2.7)
                #23 0x000055d2cacc6c5b PyEval_EvalFrameEx (python2.7)
                #24 0x000055d2cacc3d45 PyEval_EvalCodeEx (python2.7)
                #25 0x000055d2cace0a7e function_call.lto_priv.296 (python2.7)
                #26 0x000055d2cacb2413 PyObject_Call (python2.7)
                #27 0x000055d2cacf735e instancemethod_call.lto_priv.215 (python2.7)
                #28 0x000055d2cacb2413 PyObject_Call (python2.7)
                #29 0x000055d2cad69c7a call_method.lto_priv.2801 (python2.7)
                #30 0x000055d2cad69deb slot_mp_ass_subscript.lto_priv.1204 (python2.7)

Here's the second most common trace:

                #0  0x00007f7bf5c32218 malloc_consolidate (libc.so.6)
                #1  0x00007f7bf5c32ea8 _int_free (libc.so.6)
                #2  0x00007f7bf5c350e4 _int_realloc (libc.so.6)
                #3  0x00007f7bf5c366e9 __GI___libc_realloc (libc.so.6)
                #4  0x000055f7eaad766f list_resize.lto_priv.1795 (python2.7)
                #5  0x000055f7eaad6e55 app1 (python2.7)
                #6  0x000055f7eaafd48b PyEval_EvalFrameEx (python2.7)
                #7  0x000055f7eab0142f fast_function (python2.7)
                #8  0x000055f7eab0142f fast_function (python2.7)
                #9  0x000055f7eab0142f fast_function (python2.7)
                #10 0x000055f7eab0142f fast_function (python2.7)
                #11 0x000055f7eaaf9d45 PyEval_EvalCodeEx (python2.7)
                #12 0x000055f7eab16a7e function_call.lto_priv.296 (python2.7)
                #13 0x000055f7eaae8413 PyObject_Call (python2.7)
                #14 0x000055f7eab2d35e instancemethod_call.lto_priv.215 (python2.7)
                #15 0x000055f7eaae8413 PyObject_Call (python2.7)
                #16 0x000055f7eab9fc7a call_method.lto_priv.2801 (python2.7)
                #17 0x000055f7eab9fdeb slot_mp_ass_subscript.lto_priv.1204 (python2.7)
                #18 0x000055f7eaafcc5b PyEval_EvalFrameEx (python2.7)
                #19 0x000055f7eaaf9d45 PyEval_EvalCodeEx (python2.7)
                #20 0x000055f7eab16a7e function_call.lto_priv.296 (python2.7)
                #21 0x000055f7eaae8413 PyObject_Call (python2.7)
                #22 0x000055f7eab2d35e instancemethod_call.lto_priv.215 (python2.7)
                #23 0x000055f7eaae8413 PyObject_Call (python2.7)
                #24 0x000055f7eab9fc7a call_method.lto_priv.2801 (python2.7)
                #25 0x000055f7eab9fdeb slot_mp_ass_subscript.lto_priv.1204 (python2.7)
                #26 0x000055f7eaafcc5b PyEval_EvalFrameEx (python2.7)
                #27 0x000055f7eaaf9d45 PyEval_EvalCodeEx (python2.7)
                #28 0x000055f7eab16a7e function_call.lto_priv.296 (python2.7)
                #29 0x000055f7eaae8413 PyObject_Call (python2.7)
                #30 0x000055f7eab2d35e instancemethod_call.lto_priv.215 (python2.7)


H1 SUS (DetChar, ISC, SEI, SUS)
jeffrey.kissel@LIGO.ORG - posted 17:49, Thursday 22 February 2018 - last comment - 16:40, Wednesday 07 March 2018(40675)
Adding Viton Beneath Balance Masses of H1 OPO (an OPOS) Reduces Breadboard Bending Mode Qs by a Factor of 2-3
Á. Fernández, J. Kissel, T. Shaffer

Álvaro, T.J., and I B&K-hammer tested the optical parametric oscillator suspension's (OPOS) fully-payloaded breadboard today, both with and without viton beneath its balancing masses. The bending mode Qs were reduced by about a factor of 2 to 3. Not very impressive -- my feeling is that we're either (a) putting viton where there is little bending, and/or (b) the balance masses do not provide enough moving mass at these mode frequencies to really suck the mode energy out through the viton.

I've not been tied closely enough to this suspension to know if there is further action or improvement to take. Are there specific requirements we're trying to meet? Comment below if you know of any.
(Note, these results show that the traditional, yet often broken, requirement that "anything attached to the HAM-ISI must have its first bending mode resonances above 150 Hz" has been met. Also, the addition of viton does not and will not have any effect on the ~1 Hz ISI-OPOS plant interaction issues Arnaud saw in LLO aLOG 37394.)

I attach the transfer function results.

We struck the platform vertically down at the outer edges, in the +Transverse corner and in the +Longitudinal corner (see E1700390 and/or G1300086 for basis definitions), and measured the response with the usual three-axis accelerometer. I only show the vertical response to each vertical excitation. 
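For reference, the factor-of-2-3 Q reduction can be read off a measured transfer function with the half-power (-3 dB) bandwidth method. A minimal sketch, assuming an isolated, lightly damped mode (the function name and interface are mine, not taken from the analysis script):

```python
import numpy as np

def estimate_q(freq, mag):
    """Estimate modal Q from the half-power (-3 dB) bandwidth of an
    isolated resonance: Q ~= f0 / (f_hi - f_lo), where f_lo and f_hi
    are the frequencies at which |H| drops to peak/sqrt(2)."""
    i_pk = int(np.argmax(mag))
    f0 = freq[i_pk]
    half = mag[i_pk] / np.sqrt(2.0)
    # walk outwards from the peak to the half-power crossings
    lo = i_pk
    while lo > 0 and mag[lo] > half:
        lo -= 1
    hi = i_pk
    while hi < len(mag) - 1 and mag[hi] > half:
        hi += 1
    # linearly interpolate between samples for the crossing frequencies
    f_lo = np.interp(half, [mag[lo], mag[lo + 1]], [freq[lo], freq[lo + 1]])
    f_hi = np.interp(half, [mag[hi], mag[hi - 1]], [freq[hi], freq[hi - 1]])
    return f0 / (f_hi - f_lo)
```

Running this on the with-viton and without-viton transfer functions for the same mode would give the Q-reduction factor directly.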

Stay tuned for pictures. 
Non-image files attached to this report
Comments related to this report
thomas.shaffer@LIGO.ORG - 09:21, Friday 23 February 2018 (40680)

Adding some pictures of the accelerometer location, the hit locations, and the added viton. The viton we added was a mixture of 1/16"-thick pieces and 1/16"-thick tie-down strips; we used these mainly because they were what we had available. We added viton to all of the large masses and then tried adding it to some of the smaller masses to damp some of the higher frequencies. Since we were hitting a floating suspension, Alvaro had to tap the floating platform itself very lightly while I watched one of the stops to make sure it didn't hit.

Attachments are:

1 -- Hammer location one (+ transverse).

2 -- Hammer location two (+ longitudinal).

3 -- Accelerometer location.

4 -- Accelerometer location closer picture.

5-11 -- Various pictures of viton placement on the suspension.

Images attached to this comment
jeffrey.kissel@LIGO.ORG - 16:40, Wednesday 07 March 2018 (40885)SEI, SQZ
I attach the raw data (.pls files), the exported data (.txt files), and the script used to analyze the data.
Non-image files attached to this comment