LHO VE
chandra.romel@LIGO.ORG - posted 18:35, Wednesday 28 February 2018 (40779)
CP4 bake commissioning

Completed final tests on the bake enclosure heater today with the contractor. We have four thermocouple readouts, all of which provide temperature readback and act as over-heat protection: if a temperature rises above the set point by more than a user-defined margin (up to 20 degC), the heater trips off.
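A minimal sketch of that trip logic, purely for illustration (the setpoint, margin, and readings below are assumptions, not the controller's actual configuration):

SETPOINT_C = 120.0      # hypothetical bake setpoint
TRIP_MARGIN_C = 20.0    # user-defined over-temperature margin described above

def heater_permitted(thermocouple_readings_c):
    """Return False (trip the heater off) if any TC exceeds setpoint + margin."""
    return all(t <= SETPOINT_C + TRIP_MARGIN_C for t in thermocouple_readings_c)

print(heater_permitted([118.2, 121.5, 119.9, 120.4]))  # True  -> heater stays on
print(heater_permitted([118.2, 141.3, 119.9, 120.4]))  # False -> heater trips off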

We secured four type T thermocouples in the enclosure (one touches a metal flange at the bottom of CP4; the other three float in air), plus one type K that floats in air and is connected to CDS at TE202A - formerly the CP3 exhaust temperature readout - for remote monitoring and alarm messaging. It currently reads 29 degC, which is higher than the other four, which read around 21 degC.

We found a spill from the turbo's portable chiller this morning. The flex hose had slipped off a barbed connection, so Kyle replaced it with a metal hose and threaded fittings. The fluid is DI water mixed with some residual anti-corrosion slime.

We will set or adjust text/email alarms for the following channels: 

H1 PSL
jason.oberling@LIGO.ORG - posted 17:49, Wednesday 28 February 2018 (40778)
70W Amp Install Update

J. Oberling, E. Merilh

The last couple of days have been somewhat frustrating.  After installing the new FE pick-off last Friday, Ed and I attempted to re-acquire the PMC alignment, and failed.  While we were successfully re-aligning the beam to the PMC, we were also misaligning the beam path towards the future home of the 70W amplifier; the beam was being driven too low and was beginning to clip on PBS02.  We removed the FE pick-off and re-established our PMC alignment, and took the opportunity to install 2 irises; one between mirrors M04 and M05 and one between mirrors M06 and M07 (we would have preferred to have this iris between mirror M07 and the PMC, but there wasn't room to install the iris without blocking the reflected PMC beam).  This was completed by COB Monday.  This done, we re-installed the pick-off on Tuesday morning and, using primarily mirror M02 (but some adjustment of M01 was necessary), were able to recover the PMC alignment while mostly maintaining our beam path towards the 70W amp.  We then tweaked the pick-off alignment, which resulted in a necessary small re-tweak of the PMC alignment.  We took visibility measurements both before and after the pick-off installation:

One thing we noted was a loss of power incident on the PMC.  Before the pick-off installation we had ~25W incident on the PMC; after the installation and alignment we only had ~23W incident on the PMC.  We could find no obvious place where we are losing 2W of power; no obvious clipping or misalignments.  Perhaps some clipping in the new pick-off?  More on that below.
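For reference, the visibility figures mentioned above are normally derived from the PMC reflected power with the cavity unlocked versus locked; a minimal sketch of that calculation (the power values below are placeholders, not our measurements):

def pmc_visibility(p_refl_unlocked_w, p_refl_locked_w):
    """Fraction of the incident light coupled into the PMC, from reflected power."""
    return 1.0 - (p_refl_locked_w / p_refl_unlocked_w)

# Placeholder numbers only, to illustrate the arithmetic.
print(round(pmc_visibility(2.00, 0.40), 2))   # 0.8, i.e. 80% visibility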

To begin this morning, we installed the Wincam to take a quick beam profile measurement of the FE beam to check if there was any obvious clipping from the new pick-off.  There wasn't; the beam looked as it did after our first install attempt last Friday, and very close to when we finished the NPRO swap in September 2017.  We decided to move on to installation of AMP_WP01 and PBSC01, the first new on-table components for the 70W amplifier; PBSC01 replaces mirror M02, and together with AMP_WP01 gives us the ability to switch back and forth between 70W amplifier and FE-only operation.  We reduced the power from the FE to ~300mW using the HWP inside the FE (thereby reducing the NPRO power delivered to the MOPA) and installed AMP_WP01.  We then made some rough marks on the table to assist with installation of PBSC01, and then removed mirror M02.  PBSC01 was installed on the table in place of M02 and we began alternating translation of the mount and yaw of the optic to recover the PMC alignment.

During this process we noticed that the output power of the FE was changing.  Without touching any power control optics, the FE power had drifted from 300mW to ~6.9W.  We decided to lower the NPRO power by reducing the injection current from the NPRO power supply; we dropped it to ~1.26A from its operating point of 2.222A, which brought the FE output power back to ~300mW.  Continuing the alignment, the FE power continued to increase on its own, getting up to 1.5W.  At this point I noticed that I could adjust the HWP in the FE to reduce the power slightly, which indicates a possible shift in polarization from the NPRO.  At this point we broke for lunch and to consult with Matt regarding this.  Maybe they had seen similar behavior at LLO?  Turns out they had not.  We took some trends and, to the best we could tell, the FE power was following the NPRO power (I'll post the trends as a comment to this alog tomorrow).  At this time we also noticed that when the FE is running at full power, the new pick-off is saturating; we will adjust this after PBSC01 installation.  We decided to continue on with the alignment after lunch, this time putting the FE and NPRO output powers on a StripTool so we could monitor them in real time while we worked.  Continuing the alignment, the FE output power continued to slowly increase on its own.  Once it got to ~800mW, we decided to lower the NPRO injection current again.  I dropped it from 1.26A to 1.0A; this lowered the FE output power to ~300mW, where it stayed for the remainder of the afternoon.  I have never seen this behavior before and am unclear as to the cause.

Regardless, by slowly translating and yawing PBSC01, using progressively further away alignment references, we are able to recover the majority of the PMC alignment; we did not have to touch mirror M01 at all.  We fine-tuned the PMC alignment with mirrors M06 and M07 and took a visibility measurement:

One thing to note is we are once again down in power incident on the PMC.  Before PBSC01 installation there was ~23W incident on the PMC; after installation there is ~20.5W incident on the PMC.  While there is some leakage from PBSC01 towards the 70W amplifier beam path, it's not 2.5W worth.  Once again we could find no obvious clipping or misalignment downstream of PBSC01 that would cause this loss of power.  Looking upstream, however, we see a good deal of scatter.  We can't easily tell where it's coming from; I suspect scatter from the new pick-off.  I've attached a couple of pictures showing this scatter.

Tomorrow our plan is to get a picture of the beam profile post-PBSC01 installation, and then to begin investigating this scatter.  We know we need to adjust the alignment of the pick-off to prevent saturation of the PD, maybe that will help with the scatter as well.  Once that is taken care of we plan on moving on to measuring the beam caustics of the FE for mode matching modeling.

Attachments:

Images attached to this report
H1 CAL (CAL)
richard.savage@LIGO.ORG - posted 17:14, Wednesday 28 February 2018 - last comment - 22:20, Thursday 19 July 2018(40777)
Pcal ETMY baffles and shields installation work complete

StephenA, AlenaA, NikoL, MarekS, JimW, RickS

Jim and crew completed the installation of the shield panels today.  They also adjusted the compression of the upper-right (when viewed from the ETM) flexure gap to ~0.220".

Everything seems to be installed as designed.

NikoL, MarekS, TravisS, RickS

Began assessing the centering of the Pcal beams on the input and output apertures using targets mounted to the Pcal window flanges on the A1 adapter.  We plan to continue this work in the morning, going inside the vacuum envelope to assess centering on the Pcal periscope relay mirrors. 

We plan to install the Pcal target on the ETM suspension cage for this work.

Comments related to this report
alena.ananyeva@LIGO.ORG - 23:13, Wednesday 28 February 2018 (40781)SYS
Reflections of the beam tube surface in the baffle are not to be confused with the smooth finish of the baffle in this photo.
Images attached to this comment
stephen.appert@LIGO.ORG - 18:11, Friday 16 March 2018 (41040)
Here is a photo logging S/N of periscope components, collected during the above PCal Yend Baffle and Shields install effort. These articles conveniently do not appear to be assembled into any of the existing D1200174 assemblies. 

:(
Images attached to this comment
stephen.appert@LIGO.ORG - 22:20, Thursday 19 July 2018 (42985)

Previous work: LHO aLOG 40759

Build records: T1800172

Summary of flexure measurements documented in T1800172:

2018 LHO End Y Flexure Gap, before baffle install

Flexure Location (viewed from Front per D1200174-v8)   Flexure Gap (in)
Upper Left     0.210
Upper Right    0.230
Lower Right    0.150
Lower Left     0.110

2018 LHO End Y Flexure Gap, only pending outer baffle (D1700365) install

Flexure Location (viewed from Front per D1200174-v8)   Flexure Gap (in)
Upper Left     0.210
Upper Right    0.220
Lower Right    n/a
Lower Left     n/a

2018 LHO End Y Flexure Gap, all baffles installed

Flexure Location (viewed from Front per D1200174-v8)   Flexure Gap (in)
Upper Left     0.220
Upper Right    0.220
Lower Right    n/a
Lower Left     n/a
H1 SQZ
filiberto.clara@LIGO.ORG - posted 16:57, Wednesday 28 February 2018 - last comment - 08:13, Thursday 01 March 2018(40776)
SQZ6 Table Enclosure Cabling

SQZ6 enclosure was moved south of HAM6. Cabling for table in the SQZ bay was moved to new table location. SQZ team will let us know if we missed a cable. Power and E-Stop cables still need to be terminated.

Comments related to this report
daniel.sigg@LIGO.ORG - 08:13, Thursday 01 March 2018 (40784)

Nutsinee Daniel

All outside cables are in place and connected.

H1 General (PSL, SQZ, SUS, TCS)
cheryl.vorvick@LIGO.ORG - posted 16:15, Wednesday 28 February 2018 - last comment - 16:32, Wednesday 28 February 2018(40770)
Day Summary:

Activity Summary:

Still at outbuildings:

Activity Details (all times UTC):

28-02-18 15:35 ChrisS to MY to drop off insulation for bakeout
28-02-18 15:54 Rick at EY, Pcal
28-02-18 16:05 Terry to SQZ bay
28-02-18 16:42 Terry out of the SQZ bay
28-02-18 16:40 Hugh to LVEA
28-02-18 16:55 Rick called, EY is laser safe
28-02-18 16:58 Hugh out of the LVEA
28-02-18 17:07 Hugh de-isolating the BS HEPI, then re-isolating
28-02-18 17:08 Hanford Fire Department at EX testing sensors
28-02-18 17:08 Jason and Ed to the PSL
28-02-18 17:21 Alena and Michael to EY
28-02-18 17:22 Travis to EY
28-02-18 17:26 Corey to SQZ bay and then optics lab
28-02-18 17:27 Jim and Stephen to EY
28-02-18 17:33 TJ to HAM6
28-02-18 17:43 TJ back from optics lab
28-02-18 18:11 TJ to HAM6
28-02-18 18:18 Jaimie starting work on new Guardian machine, currently getting a work permit
28-02-18 18:18 Fil, Liz, and Deisy to EX to work on access system
28-02-18 18:24 Mike and visitor to LVEA
28-02-18 18:45 Mike and visitor out of the LVEA
28-02-18 17:40 Travis back from EY
28-02-18 19:15 Betsy and Travis to EX to stage for in-vacuum work
28-02-18 19:15 TJ out of the LVEA
28-02-18 19:40 Fil, Liz, and Deisy back from EX
28-02-18 19:48 Jason and Ed to PSL
28-02-18 19:56 Betsy and Travis back from EX
28-02-18 20:20 Terry and Sheila to SQZ bay
28-02-18 20:22 Karen done at MX
28-02-18 20:25 Fil to TCS to install new limiter hardware
28-02-18 20:50 Fil, Liz, to HAM6 to work on cabling
28-02-18 21:00 Corey to EY with 2" optics
28-02-18 21:08 Gerardo to MY to remove a cable for use at MX
28-02-18 21:10 Arm Crew: Mark and Mark and TJ, taking the arm off HAM6
28-02-18 21:11 Arm crew will be opening the rollup door
28-02-18 21:12 Arm crew will ensure the HAM6 soft cover is on while the rollup door is in use
28-02-18 21:18 TJ to HAM6
28-02-18 21:19 Karen done at MY
28-02-18 21:23 Nutsinee to HAM6
28-02-18 21:24 Chandra to LVEA
28-02-18 21:27 Dave changes oplog
28-02-18 21:50 Nutsinee back from LVEA
28-02-18 21:50 Corey back from EY
28-02-18 21:51 Travis to LVEA
28-02-18 21:51 Rick to LVEA
28-02-18 21:59 Rick back from the LVEA
28-02-18 22:02 Travis, Rick, and Niko, to EY for Pcal work
28-02-18 22:05 JeffB to LVEA to retrieve parts
28-02-18 22:36 MarkP to LVEA to deliver cables
28-02-18 22:37 MarkP back from LVEA
28-02-18 22:37 DaveB to Mezz to check on chillers
28-02-18 22:37 Chandra to MY
28-02-18 22:54 Richard to old PSL chiller closet and then HAM6
28-02-18 23:12 Betsy to LVEA for supplies
28-02-18 23:13 DaveB to CER to disconnect the TCS FE input to the newly installed summing box
28-02-18 23:17 Mark and Mark done at HAM6, arm is off, SQZ table is placed
28-02-18 23:20 DaveB to TCS chillers to read the setpoint from the chiller display
28-02-18 23:32 Rick, Travis, Niko, Marek, leaving EY
28-02-18 23:32 EY is laser safe
28-02-18 23:48 Patrick restarting Beckhoff Laser Safety Code
28-02-18 23:56 Gerardo to MY to assist Kyle, then on to EY
28-02-18 23:57 Rick, Travis, Niko, Marek, back from EY

Comments related to this report
cheryl.vorvick@LIGO.ORG - 16:10, Wednesday 28 February 2018 (40771)

Hey, look, there's an updated oplog format!  Original format for date and time was l-o-n-g.  Shortened the numerical date, dropped the seconds for time, and dropped the callout of UTC.

Feb 27 2018 15:25:45 UTC ChrisS to all out buildings for FAMIS/Maint.: fire ext. charging lifts

28-02-18 15:35 ChrisS to MY to drop off insulation for bakeout

cheryl.vorvick@LIGO.ORG - 16:32, Wednesday 28 February 2018 (40775)

Update, as of 00:32UTC, all times in UTC:

01-03-18 00:16 Patrick done restarting code
01-03-18 00:16 Liz back from LVEA, Fil still at HAM6
01-03-18 00:27 Gerardo is back from EY
01-03-18 00:28 Jason and Ed done in the PSL

H1 CDS
patrick.thomas@LIGO.ORG - posted 16:10, Wednesday 28 February 2018 - last comment - 16:12, Wednesday 28 February 2018(40772)
Restarted Beckhoff Laser Safety Interlock PLC code
Done to add a readback channel for the squeezer laser interlock and to add an interlock for a spare laser. This will have tripped the power supplies for the connected lasers off and back on.
Comments related to this report
patrick.thomas@LIGO.ORG - 16:12, Wednesday 28 February 2018 (40773)
Waited until work was complete at end Y.
H1 CDS
patrick.thomas@LIGO.ORG - posted 15:53, Wednesday 28 February 2018 (40768)
Conlog crashed, restarted
Feb 28 13:47:45 conlog-master conlogd[7513]: Unexpected problem with CA circuit to server "h1ecatc1.cds.ligo-wa.caltech.edu:5064" was "Connection reset by peer" - disconnecting
Feb 28 13:47:45 conlog-master conlogd[7513]: terminate called after throwing an instance of 'sql::SQLException'
Feb 28 13:47:45 conlog-master conlogd[7513]: what():  Invalid JSON text: "Invalid escape character in string." at position 44 in value for column 'events.data'.

2018-02-28T21:47:45.308479Z	    8 Execute	INSERT INTO events (pv_name, time_stamp, event_type, has_data, data) VALUES('H1:SQZ-CLF_FLIPPER_NAME', '1519854464944719307', 'update', 1, '{"type":"DBR_STS_STRING","count":1,"value":["?"],"alarm_status":"NO_ALARM","alarm_severity":"NO_ALARM"}')
2018-02-28T21:47:45.308688Z	    8 Query	rollback
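
For illustration, this is the class of failure you get when a raw Channel Access string value is spliced into a JSON document by hand instead of being run through a JSON serializer; a minimal Python sketch (the actual offending byte in the event above is unknown, so the value here is made up):

import json

# Hypothetical raw string from Channel Access containing a stray backslash.
raw_value = "\\?"

# Splicing it into JSON by hand reproduces the error class MySQL reported
# ("Invalid escape character in string"), since "\?" is not a legal JSON escape:
bad_json = '{"type":"DBR_STS_STRING","count":1,"value":["%s"]}' % raw_value

# Serializing through the json module escapes the backslash and stays valid:
good_json = json.dumps({"type": "DBR_STS_STRING", "count": 1, "value": [raw_value]})

print(bad_json)   # ...,"value":["\?"]}    <- invalid: "\?" is not a JSON escape
print(good_json)  # ..., "value": ["\\?"]}  <- valid: backslash properly escaped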
H1 CDS (TCS)
david.barker@LIGO.ORG - posted 15:35, Wednesday 28 February 2018 (40767)
TCS CO2 laser chiller summing box installed for ITMX and ITMY

WP7388 TCS CO2 laser summing chassis

Richard, Fil, Cheryl, Dave:

While both TCS CO2 lasers were OFF, we installed the new laser summing chassis in the CS CER.

At 14:00 PST I stopped h1tcscs from driving the ITM[X,Y] chiller setpoint control voltages. Soon after, Fil installed the summing box (D1500265) in the path between the TCS AI chassis and the ITMX and ITMY chiller units on the mech room mezzanine.

From 14:05 to 15:15 we ran the chillers in this mode. The chillers' LCD displays show the temperature setpoints for ITMX and ITMY to be 19.6C and 19.5C respectively.

The laser head temperatures reached an equilibrium value after about 45 minutes.

WHILE WE ARE TESTING THE SUMMING BOX AND CALIBRATING THE H1TCSCS MODEL, THE DAC OUTPUT SHOULD NOT BE TURNED ON.

H1 SUS
betsy.weaver@LIGO.ORG - posted 14:54, Wednesday 28 February 2018 (40556)
ETMY OSEM Open Light Voltages revisited

Updated Open Light Voltages (OSEMs sitting face down on surfaces so darkish):

M0 OSEM    NEW OLV   NEW OFFSET   NEW GAIN   OLD OFFSET   OLD GAIN
F1           27600        13800      1.087       -15109      0.993
F2           25600        12800      1.172       -14961      1.003
F3           27400        13700      1.095       -14740      1.018
LF           29300        14650      1.024       -15567      0.964
RT           26100        13050      1.149       -15396      0.974
SD           27500        13750      1.091       -15555      0.964

R0 OSEM    NEW OLV   NEW OFFSET   NEW GAIN   OLD OFFSET   OLD GAIN
F1           29000       -14500      1.034       -15529      0.966
F2           25000       -12500      1.200       -15029      0.998
F3           26100       -13050      1.149       -14977      1.002
LF           29100       -14550      1.031       -15397      0.974
RT           27000       -13500      1.111       -15525      0.966
SD           27100       -13550      1.107       -15141      0.991

L1 OSEM    NEW OLV   NEW OFFSET   NEW GAIN   OLD OFFSET   OLD GAIN
UL           27225       -13613      1.102       -14611      1.027
LL           28285       -14143      1.061       -15059      0.996
UR           24350       -12175      1.232       -12868      1.116
LR           20780       -10390      1.444       -11361      1.166

L2 OSEM           NEW OLV   NEW OFFSET   NEW GAIN   OLD OFFSET   OLD GAIN
UL (#572)           18500        -9250       1.62       -10667      1.41
LL (kept #428)      22700       -11350       1.32       -12291      1.220
UR (#522)           22000       -11000       1.36       -10612      1.413
LR (#526)           21400       -10700       1.40       -10163      1.476

 

Note, we swapped out 3 of the 4 L2 stage AOSEMs, which were showing lower OLVs than we were satisfied with, especially since Stuart at LLO sent me back the latest batch of 8 tested AOSEMs we had fabbed here, which had slightly higher OLVs.  We removed AOSEM D0901065 S/Ns 321, 332, and 473.  These AOSEMs could be used elsewhere if needed.  ICS has been updated.
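
The new offsets and gains in the table are consistent with normalizing each OSEM to a nominal 30000-count open-light range (OFFSET = -OLV/2, GAIN = 30000/OLV); a minimal sketch of that arithmetic, stated here as an assumption rather than a reference (the M0 rows above list positive offsets, so the sign convention may differ per stage):

def osem_calibration(olv_counts):
    """Return (offset, gain) for an OSEM input filter given its open-light value."""
    offset = -olv_counts / 2.0      # center the readback about zero
    gain = 30000.0 / olv_counts     # normalize the full range to 30000 counts
    return offset, gain

# Example using the L2 UL entry above (AOSEM #572, OLV = 18500):
offset, gain = osem_calibration(18500)
print(round(offset), round(gain, 2))   # -9250 1.62, matching the table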

H1 SEI
hugh.radkins@LIGO.ORG - posted 12:35, Wednesday 28 February 2018 (40766)
WBSC2 BS HEPI V4 Actuator recentered

WP 7372 FRS 9743

As reported in LHO aLog 40171 and 40201, the Corner 4 vertical actuator on HEPI appeared to be very close to the plus-side mechanical stop and exhibited some clipping at larger strokes.  To fix this, the following was done:

With HEPI Isolating, corners 4 and 2 were mechanically locked. Managed this pretty well, not tripping until well locked.

Partially installed the Actuator locking screws and positioned the 0.1" shims between the Tripod Base and the Top Foundation.

Pressured support jack to hold Actuator just enough to prevent it from dropping.

Loosened the horizontal 1/2-20 SHCS holding the Actuator to the Actuator Brackets.

With those bolts loose, tightened the Actuator locking screws, raising the actuator until the shims were clamped.  As I was able to start and turn the locking screws, I had confidence the Tripod assembly was not horizontally out of position and so did not need to loosen the vertical Actuator Bracket bolts.  Had that not been the case, this chore would have taken considerably more time: instead of just 6 bolts to loosen and then tighten, it would have been 18, with a couple of iterations of tightening one group, then the other, and loosening the first to relieve strain.

Now with the Actuator effectively back in the 'Installation' setup, I was ready to zero the IPS.  Looking at the actuator/stop gap, however, suggested that the ~3000ct reading was appropriate, so the IPS was not zeroed.  The Actuator Locking Screws were removed (almost forgot one!) and the platform was unlocked.

At this point before and after positions were viewed on trends to determine new Isolation targets for the vertical dofs: the computed Z, RX, & RY positions changed with the change of V4.  The shifts of these were applied to the target locations to arrive at new target locations.

Computed shifts of platform before & after V4 centering:

DOF   Unlocked Before   Unlocked After   V4 Shift (Change)   Target Before   Target After
Z              -10000           -85000              -75000          -47000   -122k or -92k
RX             -69000           -21000              +48000          -78000          -30000
RY             -93000           -47000              +56000          -96000          -40000
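
A minimal sketch of the bookkeeping in the table above (values copied from the table; this is just the arithmetic, not a tool we ran):

target_before = {'Z': -47000, 'RX': -78000, 'RY': -96000}
v4_shift      = {'Z': -75000, 'RX': +48000, 'RY': +56000}   # platform shift from recentering V4

# New isolation targets are simply the old targets offset by the measured shift.
new_targets = {dof: target_before[dof] + v4_shift[dof] for dof in target_before}
print(new_targets)   # {'Z': -122000, 'RX': -30000, 'RY': -40000}
# Z was subsequently raised by hand from -122k to -92k, as described below.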

After re-isolating with these new computed values, it was clear the actuators were pulling down to isolate the platform at the new Z location.  So, I 'raised' the target position from -122k to -92k.  This served to put the actuator drives roughly around zero rather than all negative; this in turn puts the free-hanging platform much closer to the isolated position.  It also put the IPSs in a better-balanced readout.  I can't imagine that 30um in vertical position will impact any beam centering, but of course this can easily be recovered.  The RX & RY newly computed targets were very close to the free-hanging tilts.  I cannot explain why the Z was off enough to warrant further action.

WP closed; FRS Pending

H1 GRD
jameson.rollins@LIGO.ORG - posted 10:14, Wednesday 28 February 2018 - last comment - 09:38, Thursday 08 March 2018(40765)
starting process of moving guardian nodes to new guardian supervision host

We are setting up a new guardian host machine.  The new machine (currently "h1guardian1", but to be renamed "h1guardian0" after the transition is complete) is running Debian 9 "stretch", with all CDS software installed from pre-compiled packages from the new CDS debian software archives.  It has been configured with a completely new "guardctrl" system that will manage all the guardian nodes under the default systemd process manager.  A full description of the new setup will come in a future log, after the transition is complete.

The new system is basically ready to go, and I am now beginning the process of transferring guardian nodes over to the new host.  For each node to be transferred, I will stop the process on the old machine, and start it fresh on the new system.

I plan on starting with SUS and SEI in HAM1, and will move through the system ending with HAM6.

Comments related to this report
jameson.rollins@LIGO.ORG - 17:02, Saturday 03 March 2018 (40831)

There's been a bit of a hitch with the guardian upgrade.  The new machine (h1guardian1) has been setup and configured.  The new supervision system and control interface are fully in place, and all HAM1 and HAM2 SUS and SEI nodes have been moved to the new configuration.  Configuration is currently documented in the guardian gitlab wiki.

Unfortunately, node processes are occasionally spontaneously seg faulting for no apparent reason.  The failures are happening at a rate of roughly one every 6 hours or so.  I configured systemd to catch and log coredumps from segfaults for inspection (using the systemd-coredump utility).  After we caught our next segfault (which happened only a couple of hours later), Jonathan Hanks and I started digging into the core to see what we could ferret out.  It appears to be some sort of memory corruption error, but we have not yet determined where in the stack the problem is coming from.  I suspect that it's in the pcaspy EPICS portable channel access python bindings, but it could be in EPICS.  I think it's unlikely that it's in python2.7 itself, although we aren't ruling anything out.

We then set up the processes to be run under electric fence to try to catch any memory out-of-bounds errors.  This morning I found two processes that had been killed by efence, but I have not yet inspected the core files in depth.  Below are the coredump summaries from coredumpctl on h1guardian1.

This does not bode well for the upgrade.  Best case we figure out what we think is causing the segfaults early in the week, but there still won't be enough time to fix the issue, test, and deploy before the end of the week.  A de-scoped agenda would be to just do a basic guardian core upgrade in the existing configuration on h1guardian0 and delay the move to Debian 9 and systemd until we can fully resolve the segfault issue.

Here is the full list of nodes currently running under the new system:

HPI_HAM1        enabled    active    
HPI_HAM2        enabled    active    
ISI_HAM2        enabled    active    
ISI_HAM2_CONF   enabled    active    
SEI_HAM2        enabled    active    
SUS_IM1         enabled    active    
SUS_IM2         enabled    active    
SUS_IM3         enabled    active    
SUS_IM4         enabled    active    
SUS_MC1         enabled    active    
SUS_MC2         enabled    active    
SUS_MC3         enabled    active    
SUS_PR2         enabled    active    
SUS_PR3         enabled    active    
SUS_PRM         enabled    active    
SUS_RM1         enabled    active    
SUS_RM2         enabled    active    

If any of these nodes show up white on the guardian overview screen, it's likely because they have crashed.  Please let me know and I will deal with them asap.


guardian@h1guardian1:~$ coredumpctl info 11512
           PID: 11512 (guardian SUS_MC)
           UID: 1010 (guardian)
           GID: 1001 (controls)
        Signal: 11 (SEGV)
     Timestamp: Sat 2018-03-03 11:56:20 PST (4h 50min ago)
  Command Line: guardian SUS_MC3 /opt/rtcds/userapps/release/sus/common/guardian/SUS_MC3.py
    Executable: /usr/bin/python2.7
 Control Group: /user.slice/user-1010.slice/user@1010.service/guardian.slice/guardian@SUS_MC3.service
          Unit: user@1010.service
     User Unit: guardian@SUS_MC3.service
         Slice: user-1010.slice
     Owner UID: 1010 (guardian)
       Boot ID: 870fed33cb4446e298e142ae901c1830
    Machine ID: 699a2492538f4c09861889afeedf39ab
      Hostname: h1guardian1
       Storage: /var/lib/systemd/coredump/core.guardianx20SUS_MC.1010.870fed33cb4446e298e142ae901c1830.11512.1520106980000000000000.lz4
       Message: Process 11512 (guardian SUS_MC) of user 1010 dumped core.
                
                Stack trace of thread 11512:
                #0  0x00007f1255965646 strlen (libc.so.6)
                #1  0x00007f12567c86ab EF_Printv (libefence.so.0.0)
                #2  0x00007f12567c881d EF_Exitv (libefence.so.0.0)
                #3  0x00007f12567c88cc EF_Exit (libefence.so.0.0)
                #4  0x00007f12567c7837 n/a (libefence.so.0.0)
                #5  0x00007f12567c7f30 memalign (libefence.so.0.0)
                #6  0x00007f1241cba02d new_epicsTimeStamp (_cas.x86_64-linux-gnu.so)
                #7  0x0000556e57263b9a call_function (python2.7)
                #8  0x0000556e57261d45 PyEval_EvalCodeEx (python2.7)
                #9  0x0000556e5727ea7e function_call.lto_priv.296 (python2.7)
                #10 0x0000556e57250413 PyObject_Call (python2.7)
...

guardian@h1guardian1:~$ coredumpctl info 11475
           PID: 11475 (guardian SUS_MC)
           UID: 1010 (guardian)
           GID: 1001 (controls)
        Signal: 11 (SEGV)
     Timestamp: Sat 2018-03-03 01:33:51 PST (15h ago)
  Command Line: guardian SUS_MC1 /opt/rtcds/userapps/release/sus/common/guardian/SUS_MC1.py
    Executable: /usr/bin/python2.7
 Control Group: /user.slice/user-1010.slice/user@1010.service/guardian.slice/guardian@SUS_MC1.service
          Unit: user@1010.service
     User Unit: guardian@SUS_MC1.service
         Slice: user-1010.slice
     Owner UID: 1010 (guardian)
       Boot ID: 870fed33cb4446e298e142ae901c1830
    Machine ID: 699a2492538f4c09861889afeedf39ab
      Hostname: h1guardian1
       Storage: /var/lib/systemd/coredump/core.guardianx20SUS_MC.1010.870fed33cb4446e298e142ae901c1830.11475.1520069631000000000000.lz4
       Message: Process 11475 (guardian SUS_MC) of user 1010 dumped core.
                
                Stack trace of thread 11475:
                #0  0x00007fa7579b5646 strlen (libc.so.6)
                #1  0x00007fa7588186ab EF_Printv (libefence.so.0.0)
                #2  0x00007fa75881881d EF_Exitv (libefence.so.0.0)
                #3  0x00007fa7588188cc EF_Exit (libefence.so.0.0)
                #4  0x00007fa758817837 n/a (libefence.so.0.0)
                #5  0x00007fa758817f30 memalign (libefence.so.0.0)
                #6  0x00005595da26610f PyList_New (python2.7)
                #7  0x00005595da28cb8e PyEval_EvalFrameEx (python2.7)
                #8  0x00005595da29142f fast_function (python2.7)
                #9  0x00005595da29142f fast_function (python2.7)
                #10 0x00005595da289d45 PyEval_EvalCodeEx (python2.7)
...
jameson.rollins@LIGO.ORG - 20:09, Wednesday 07 March 2018 (40891)

After implementing the efence stuff above, we came in to find more coredumps the next day.  On a cursory inspection of the coredumps, we noted that they all showed completely different stack traces.  This is highly unusual and pathological, and prompted Jonathan to question the integrity of the physical RAM itself.  We swapped out the RAM with a new 16G ECC stick and let it run for another 24 hours.

When next we checked, we discovered only two efence core dumps, indicating an approximate factor of three increase in the mean time to failure (MTTF).  However, unlike the previous scattershot of stack traces, these all showed identical "mprotect" failures, which seemed to point to a side effect of efence itself running into limits on per-process memory map areas.  We increased the "max_map_count" (/proc/sys/vm/max_map_count) by a factor of 4, again left it running overnight, and came back to no more coredumps.  We cautiously declared victory.

I then started moving the remaining guardian nodes over to the new machine.  I completed the new setup by removing the efence, and rebooting the new machine a couple of times to work out the kinks.  Everything seemed to be running ok...

Until more segfault/coredumps appeared.  A couple of hours after the last reboot of the new h1guardian1 machine, there were three segfaults, all with completely different stack traces.  I'm now wondering if efence was somehow masking the problem.  My best guess there is that efence was slowing down the processes quite a bit (by increasing system call times), which increased the MTTF by a similar factor.  Or the slower processes were less likely to run into some memory corruption race condition.

I'm currently running memtest on h1guardian1 to see if anything shows up, but it's passed all tests so far...

jameson.rollins@LIGO.ORG - 09:38, Thursday 08 March 2018 (40896)

16 seg faults overnight, after rebooting the new guardian machine at about 9pm yesterday.  I'll be reverting guardian to the previous configuration today.

Interestingly, though, almost all of the stack traces are of the same type, unlike before, when they were all different.  Here's the trace we're seeing in 80% of the instances:

                #0  0x00007ffb9bfe4218 malloc_consolidate (libc.so.6)
                #1  0x00007ffb9bfe4ea8 _int_free (libc.so.6)
                #2  0x000055d2caca7bc5 list_dealloc.lto_priv.1797 (python2.7)
                #3  0x000055d2cacdb127 frame_dealloc.lto_priv.291 (python2.7)
                #4  0x000055d2caccb450 fast_function (python2.7)
                #5  0x000055d2caccb42f fast_function (python2.7)
                #6  0x000055d2caccb42f fast_function (python2.7)
                #7  0x000055d2caccb42f fast_function (python2.7)
                #8  0x000055d2cacc3d45 PyEval_EvalCodeEx (python2.7)
                #9  0x000055d2cace0a7e function_call.lto_priv.296 (python2.7)
                #10 0x000055d2cacb2413 PyObject_Call (python2.7)
                #11 0x000055d2cacf735e instancemethod_call.lto_priv.215 (python2.7)
                #12 0x000055d2cacb2413 PyObject_Call (python2.7)
                #13 0x000055d2cad69c7a call_method.lto_priv.2801 (python2.7)
                #14 0x000055d2cad69deb slot_mp_ass_subscript.lto_priv.1204 (python2.7)
                #15 0x000055d2cacc6c5b PyEval_EvalFrameEx (python2.7)
                #16 0x000055d2cacc3d45 PyEval_EvalCodeEx (python2.7)
                #17 0x000055d2cace0a7e function_call.lto_priv.296 (python2.7)
                #18 0x000055d2cacb2413 PyObject_Call (python2.7)
                #19 0x000055d2cacf735e instancemethod_call.lto_priv.215 (python2.7)
                #20 0x000055d2cacb2413 PyObject_Call (python2.7)
                #21 0x000055d2cad69c7a call_method.lto_priv.2801 (python2.7)
                #22 0x000055d2cad69deb slot_mp_ass_subscript.lto_priv.1204 (python2.7)
                #23 0x000055d2cacc6c5b PyEval_EvalFrameEx (python2.7)
                #24 0x000055d2cacc3d45 PyEval_EvalCodeEx (python2.7)
                #25 0x000055d2cace0a7e function_call.lto_priv.296 (python2.7)
                #26 0x000055d2cacb2413 PyObject_Call (python2.7)
                #27 0x000055d2cacf735e instancemethod_call.lto_priv.215 (python2.7)
                #28 0x000055d2cacb2413 PyObject_Call (python2.7)
                #29 0x000055d2cad69c7a call_method.lto_priv.2801 (python2.7)
                #30 0x000055d2cad69deb slot_mp_ass_subscript.lto_priv.1204 (python2.7)

Here's the second most common trace:

                #0  0x00007f7bf5c32218 malloc_consolidate (libc.so.6)
                #1  0x00007f7bf5c32ea8 _int_free (libc.so.6)
                #2  0x00007f7bf5c350e4 _int_realloc (libc.so.6)
                #3  0x00007f7bf5c366e9 __GI___libc_realloc (libc.so.6)
                #4  0x000055f7eaad766f list_resize.lto_priv.1795 (python2.7)
                #5  0x000055f7eaad6e55 app1 (python2.7)
                #6  0x000055f7eaafd48b PyEval_EvalFrameEx (python2.7)
                #7  0x000055f7eab0142f fast_function (python2.7)
                #8  0x000055f7eab0142f fast_function (python2.7)
                #9  0x000055f7eab0142f fast_function (python2.7)
                #10 0x000055f7eab0142f fast_function (python2.7)
                #11 0x000055f7eaaf9d45 PyEval_EvalCodeEx (python2.7)
                #12 0x000055f7eab16a7e function_call.lto_priv.296 (python2.7)
                #13 0x000055f7eaae8413 PyObject_Call (python2.7)
                #14 0x000055f7eab2d35e instancemethod_call.lto_priv.215 (python2.7)
                #15 0x000055f7eaae8413 PyObject_Call (python2.7)
                #16 0x000055f7eab9fc7a call_method.lto_priv.2801 (python2.7)
                #17 0x000055f7eab9fdeb slot_mp_ass_subscript.lto_priv.1204 (python2.7)
                #18 0x000055f7eaafcc5b PyEval_EvalFrameEx (python2.7)
                #19 0x000055f7eaaf9d45 PyEval_EvalCodeEx (python2.7)
                #20 0x000055f7eab16a7e function_call.lto_priv.296 (python2.7)
                #21 0x000055f7eaae8413 PyObject_Call (python2.7)
                #22 0x000055f7eab2d35e instancemethod_call.lto_priv.215 (python2.7)
                #23 0x000055f7eaae8413 PyObject_Call (python2.7)
                #24 0x000055f7eab9fc7a call_method.lto_priv.2801 (python2.7)
                #25 0x000055f7eab9fdeb slot_mp_ass_subscript.lto_priv.1204 (python2.7)
                #26 0x000055f7eaafcc5b PyEval_EvalFrameEx (python2.7)
                #27 0x000055f7eaaf9d45 PyEval_EvalCodeEx (python2.7)
                #28 0x000055f7eab16a7e function_call.lto_priv.296 (python2.7)
                #29 0x000055f7eaae8413 PyObject_Call (python2.7)
                #30 0x000055f7eab2d35e instancemethod_call.lto_priv.215 (python2.7)

 

LHO VE
chandra.romel@LIGO.ORG - posted 08:48, Wednesday 28 February 2018 (40764)
Corner pump down progress

Pump down curve attached. XBM and YBM are isolated.

Images attached to this report
H1 SQZ (SQZ, VE)
sheila.dwyer@LIGO.ORG - posted 21:54, Tuesday 27 February 2018 (40762)
HAM6 fiber feedthroughs working

Hugh, Alvaro, TJ, Sheila

This afternoon Alvaro and TJ routed the fibers in HAM6 and Hugh installed the fiber feedthroughs.  Alvaro and I used the 532 nm (eye safe) fiber laser to check the transmission.  We measured 5.7 mW out of the fiber laser, 5mW out of the 532nm collimator and 3mW out of the CLF/seed collimator.  The fiber laser power might have been fluctuating during our measurements, but the fibers are working.
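
The implied throughput of each fiber path, from simple arithmetic on the numbers above:

# Throughput of each path relative to the fiber-laser output (numbers from above).
p_laser, p_532, p_clf = 5.7e-3, 5.0e-3, 3.0e-3   # W
print(round(100 * p_532 / p_laser))   # ~88% of fiber-laser output on the 532 nm pump path
print(round(100 * p_clf / p_laser))   # ~53% of fiber-laser output on the CLF/seed path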

The feedthrough labeled SN8 is on flange D4-2, connected to the fiber (SN..) which goes to the 532nm pump path. Feedthrough SN9 is on flange D4-3; inside the chamber it is connected to fiber SN _, which goes to the collimator for the seed/CLF path.

This alog will be updated in the morning with the fiber serial numbers, and after I double check that I have the flange numbers correct. 

H1 CDS
david.barker@LIGO.ORG - posted 14:35, Tuesday 27 February 2018 - last comment - 10:18, Thursday 01 March 2018(40746)
h1iscey glitched, we are negotiating recovery with EY install

h1iscey front end glitched at 14:15 PST. We are holding off on its restart until we contact EY group.

Comments related to this report
david.barker@LIGO.ORG - 14:48, Tuesday 27 February 2018 (40748)

Killed and started all models on h1iscey with EY permission.

keith.thorne@LIGO.ORG - 07:49, Wednesday 28 February 2018 (40763)CDS
I have seen glitches on my test stand H1-style ISCEX machine here at LLO (actually quite frequently).  The glitches persist even with the GE FANUC RFM removed.  I have not tried it in an L1-style model mix yet.
betsy.weaver@LIGO.ORG - 10:18, Thursday 01 March 2018 (40788)

We believe this was physically due to brushing equipment past the cables which loop out of the front of the rack at the end station.  Note, these racks are in the middle receiving bay, so they frequently see traffic traversing in and out of the VEA.

LHO VE (VE)
gerardo.moreno@LIGO.ORG - posted 13:56, Tuesday 27 February 2018 - last comment - 15:42, Wednesday 28 February 2018(40744)
X-End IP12 Removed

(Mark D, Mark L, Gerardo M)

Ion pump 12 was removed from the beam tube, and the vacuum port on the beam tube was covered with a blank.  The ion pump's port was also covered with a blank.  Varian SQ051 number 70095.

Comments related to this report
gerardo.moreno@LIGO.ORG - 15:42, Wednesday 28 February 2018 (40769)VE

Measured dew point at -43.7 degC after IP12 removal.

H1 SUS
betsy.weaver@LIGO.ORG - posted 11:05, Tuesday 27 February 2018 - last comment - 16:35, Wednesday 28 February 2018(40739)
500 day trends of Face BOSEMs in corner station

Attached is a 500 day trend of 16 face BOSEMs from a few different randomly sampled suspensions in the corner station.

All of the sampled BOSEMs see 200-600 counts of decay over ~365 days of the plotted data.
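
To put those counts in perspective, assuming typical face-BOSEM open-light values of roughly 25000-30000 counts (consistent with the OSEM table earlier on this page), the decay corresponds to on the order of a percent per year:

# Rough fractional LED-output decay implied by the trend (the OLV range is an assumption).
for olv, decay in [(25000, 200), (30000, 600)]:
    print(decay, "counts on a", olv, "count OLV is about", round(100.0 * decay / olv, 1), "% per year")
# 200 counts on a 25000 count OLV is about 0.8 % per year
# 600 counts on a 30000 count OLV is about 2.0 % per year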

Note, "face" facing BOSEMs on different types of suspensions have different reference names (T1 vs F1 vs RT BOSEMs are mounted in different locations on the different types of suspensions).  For reference, search "Controls Arrangement Poster" by J. Kissel to see all of the configurations (E1100109 is the HLTS one for example).  Or see the rendering on each medm screen.

Images attached to this report
Comments related to this report
betsy.weaver@LIGO.ORG - 11:08, Tuesday 27 February 2018 (40740)

Attached is the first plot I made of a few different randomly sampled suspensions, which included some vertically mounted BOSEMs.  These trends are plotted in brown and show other factors, such as temperature, in their shape over the last 500 days.  Of the remaining red "face"-mounted BOSEMs on the plot, all 11 show a downward trend of a couple hundred counts.

Images attached to this comment
stuart.aston@LIGO.ORG - 16:35, Wednesday 28 February 2018 (40774)
Using the same random selection of face OSEM channels as Betsy in the original aLOG entry above, but for LLO, 500 day trends are attached below.

OSEM open-light decay trends appear similar between sites, with, in general, 100-600 counts of decay over ~500 days of the plotted data.

However, it should be noted that the IM suspensions also included in the trends employ AOSEMs rather than BOSEMs; the decay trends for both types of OSEMs appear to be consistent.
Images attached to this comment