My previous entry about the PRMI branch in the ISC_LOCK guardian isn't as clear as it could be, so here is a newer version.
If you find that DRMI won't lock and seems to be misaligned (you see a lot of higher-order modes on the AS camera, and the POP90 and POP18 flashes are low), you may want to lock PRMI to adjust the alignment manually:
The transition from PRMI to DRMI only works sometimes, so for now this is probably best used only when the alignment is bad. (It's not a reliable way to speed up DRMI locking.)
As Patrick reported, the Guardians for the HAM2 & HAM3 ISIs were reporting an error; TJ is now looking into that more deeply. Regarding the Guardian update I did on 24 November: that update was only to the common files, and the code had been exercised on most platforms as of 1 Dec. According to my notes, it had not been exercised on HAM2, 3, 4, 6, the BS, or ITMX before this morning.
This morning I restarted the guardians for HAMs 2 & 3 and the error condition went away. We still don't know why the error occurred.
Next, both ISIs were tripping during the Isolation process. It looked like this was occurring as the ISI was switching the GS13s back to high gain. I disabled the Sensor Correction, thinking the high microseism might be the problem, but the tripping continued. So I disabled the guardians, SEI and ISI, and brought the ISI to Isolation using the command scripts. This way the GS13s could be kept in low gain. I was then able to switch the GS13s back to high gain after giving the platforms a few moments once Isolation had completed.
I tested the guardian on HAM5 this morning and it worked fine, so while the first reported error gives us reason for suspicion, the fundamental performance of the guardian seems sound. Again, while there are runtime parameter files unique to each ISI (although they are generally the same), the guardians are all running the same common code. Isolation works the same way on a HAM as on the two stages of a BSC.
My first theory was that the position offsets on HAMs 2 & 3 are larger than those on the other platforms. This might make the GS13s noisier at the completion of Isolation, just as the GS13s are switched, causing the trip. However, the HAM6 vertical offsets are about 5 to 10 times those of all the others, and the HAM2 & 3 horizontals are maybe only a little larger than the HAM5 drives. So that may be the issue, but if so it looks subtle. Does the payload, combined with the high microseism and the GS13 switch, trip the platform? Maybe, except that the HAM3 payload is much different from HAM5's, and much different from HAM2's. What about the alignment w.r.t. the IFO? The HAM2 & 3 platforms, both HEPI and ISI, are rotated compared to the other HAMs. What about feedforward? HAM2 & 3 get X & RY FF from HEPI, as does HAM6, but the HAM6 GS13s aren't switched like HAM2 & 3's. HAM4 & 5 get their FF signals from the ISI Stage 0 L4Cs.
So, could the alignment of HAM2 & 3 make them more vulnerable to the incoming high microseism? The BLRMS do not suggest this, and the arrangement of the arms (NW & SW) would indicate the coastal noise should be the same for both arms.
So the FF may be the issue. The guardian update also turns on the FF; this has proven to be a non-issue as far as turn-on transients go, but the microseism environment may be contributing.
A couple of things:
Remember that just because different systems are all running the same common code, that doesn't mean they all exercise the same code paths. The actual code run by the BSC ISI stages may be slightly different from what is run by the HAMs, and different functions may be executed. In some cases the different parameters given to the different chamber guardians actually specify which code paths should be executed.
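As a rough illustration of this point (the parameter dicts and the isolate() function below are invented for illustration; they are NOT the actual isiguardianlib code), common code can still take different paths per chamber depending on its parameter file:

```python
# Hypothetical sketch of parameter-driven code paths in common guardian code.
# The chamber names are real; the parameters and logic are invented.

CHAMBER_PARAMS = {
    'HAM2': {'stages': 1, 'switch_gs13_gain': True},
    'HAM6': {'stages': 1, 'switch_gs13_gain': False},
    'BS':   {'stages': 2, 'switch_gs13_gain': True},
}

def isolate(chamber):
    """Run the common isolation sequence; the parameters select the path."""
    params = CHAMBER_PARAMS[chamber]
    steps = []
    for stage in range(1, params['stages'] + 1):
        steps.append('isolate_stage_%d' % stage)
    if params['switch_gs13_gain']:
        # Only some chambers ever reach this branch, so exercising the
        # common code on HAM6 would not have exercised it.
        steps.append('switch_gs13_to_high_gain')
    return steps
```

In this sketch, HAM2 and HAM6 call the same isolate() function, but only HAM2 reaches the gain-switch branch — which is why "the common code ran fine elsewhere" doesn't rule out a bug.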
If a guardian node throws an error, a restart may clear the immediate issue by resetting the state of the system, but it doesn't actually fix the error. If no code has been changed, the error will necessarily occur again when the same conditions are met.
Study the exception traceback, since it usually points to exactly what the problem is. In this case, the error was:
2015-12-07T11:12:59.06689 ISI_HAM3 [INIT.main] determining how to get to defined state...
2015-12-07T11:12:59.08430 ISI_HAM3 W: Traceback (most recent call last):
2015-12-07T11:12:59.08432 File "/ligo/apps/linux-x86_64/guardian-1485/lib/python2.7/site-packages/guardian/worker.py", line 459, in run
2015-12-07T11:12:59.08433 retval = statefunc()
2015-12-07T11:12:59.08433 File "/opt/rtcds/userapps/release/isi/common/guardian/isiguardianlib/ISI_STAGE/states.py", line 52, in main
2015-12-07T11:12:59.08434 error_message = isolation_util.check_if_isolation_loops_on(control_level)
2015-12-07T11:12:59.08434 File "/opt/rtcds/userapps/release/isi/common/guardian/isiguardianlib/util.py", line 30, in wrapper
2015-12-07T11:12:59.08435 .format(nth=const.NTH[arg_number].lower(), func=func.__name__, allowed=allowed, passed_arg=args[arg_number]))
2015-12-07T11:12:59.08435 WrongArgument: The 1st argument of check_if_isolation_loops_on must be in {'HIGH': {'INDEX': 400, 'MAIN': ('FM4', 'FM5', 'FM6', 'FM7'), 'BOOST': ('FM8',)}}. Passed ROBUST.
This is telling you exactly what and where the error is: the check_if_isolation_loops_on function is getting an invalid input argument ('ROBUST' instead of 'HIGH') when called during the INIT state.
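The pattern behind that error message can be sketched as follows. This is a simplified stand-in for the wrapper in isiguardianlib/util.py; the decorator structure is inferred from the traceback, and the details are guessed for illustration:

```python
# Simplified sketch of an argument-checking decorator like the one that
# raised the WrongArgument error above. Illustrative only, not the actual
# isiguardianlib/util.py code.
import functools

class WrongArgument(Exception):
    pass

NTH = {0: '1st', 1: '2nd', 2: '3rd'}

def check_arg(arg_number, allowed):
    """Reject calls whose arg_number-th positional argument is not allowed."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if args[arg_number] not in allowed:
                raise WrongArgument(
                    'The {nth} argument of {func} must be in {allowed}. '
                    'Passed {passed}.'.format(
                        nth=NTH[arg_number], func=func.__name__,
                        allowed=sorted(allowed), passed=args[arg_number]))
            return func(*args, **kwargs)
        return wrapper
    return decorator

@check_arg(0, {'HIGH'})
def check_if_isolation_loops_on(control_level):
    # stand-in body; the real function inspects the isolation filter banks
    return '%s isolation loops OK' % control_level
```

With this in place, check_if_isolation_loops_on('ROBUST') raises WrongArgument — the same failure mode as in the traceback: the INIT state passed a control level that the chamber's runtime parameters don't define.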
Attached are the oplev pitch, yaw, and sum trends for the last 7 days for all active H1 optical levers.
Hannah Fair, Stefan Ballmer

Some of the ASC and LSC ODC bits that were not in the master bit mask were occasionally triggering mid-lock, without any obvious data degradation. I updated the thresholds listed below. Since the IFO is currently not locked, the SDF is currently very red. THE SDF PROBABLY NEEDS TO BE UPDATED WITH THESE THRESHOLDS.

# ASC
ezca['ASC-ODC_REFL_A_DC_LT_TH'] = -150
ezca['ASC-ODC_REFL_B_DC_LT_TH'] = -150
ezca['ASC-ODC_REFL_A_DC_GT_TH'] = -1000
ezca['ASC-ODC_REFL_B_DC_GT_TH'] = -1000
ezca['ASC-ODC_REFLA_DC_PIT_LT_TH'] = 0.3
ezca['ASC-ODC_REFLA_DC_YAW_LT_TH'] = 0.3
ezca['ASC-ODC_REFLB_DC_PIT_LT_TH'] = 0.3
ezca['ASC-ODC_REFLB_DC_YAW_LT_TH'] = 0.3
ezca['ASC-ODC_CHARD_PIT_LT_TH'] = 1000
ezca['ASC-ODC_CHARD_YAW_LT_TH'] = 1000
ezca['ASC-ODC_DHARD_PIT_LT_TH'] = 3500
ezca['ASC-ODC_DHARD_YAW_LT_TH'] = 3500
ezca['ASC-ODC_PRC1_PIT_LT_TH'] = 2000
ezca['ASC-ODC_PRC1_YAW_LT_TH'] = 2000
# LSC
ezca['LSC-ODC_MICHCTRL_LT_EQ_TH'] = 300000
ezca['LSC-ODC_MICHCTRL_GT_EQ_TH'] = -300000

Old values:
ASC-ODC_REFL_A_DC_LT_TH     -150.0
ASC-ODC_REFL_B_DC_LT_TH     -300.0
ASC-ODC_REFL_A_DC_GT_TH     -900.0
ASC-ODC_REFL_B_DC_GT_TH     -750.0
ASC-ODC_REFLA_DC_PIT_LT_TH  0.2
ASC-ODC_REFLA_DC_YAW_LT_TH  0.2
ASC-ODC_REFLB_DC_PIT_LT_TH  0.15
ASC-ODC_REFLB_DC_YAW_LT_TH  0.15
ASC-ODC_CHARD_PIT_LT_TH     750.0
ASC-ODC_CHARD_YAW_LT_TH     750.0
ASC-ODC_DHARD_PIT_LT_TH     2500.0
ASC-ODC_DHARD_YAW_LT_TH     2500.0
ASC-ODC_PRC1_PIT_LT_TH      1500.0
ASC-ODC_PRC1_YAW_LT_TH      1500.0
LSC-ODC_MICHCTRL_LT_EQ_TH   150000.0
LSC-ODC_MICHCTRL_GT_EQ_TH   -150000.0
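For a quick sanity check of how much each threshold was widened, the old and new values listed above can be compared offline (a convenience script only, with values copied from this entry; it is not part of the guardian or ODC code):

```python
# Compare a subset of the old and new ODC thresholds listed above and
# report the widening factor for each channel. Values copied from the log.
NEW = {
    'ASC-ODC_REFL_A_DC_GT_TH': -1000, 'ASC-ODC_REFL_B_DC_GT_TH': -1000,
    'ASC-ODC_CHARD_PIT_LT_TH': 1000,  'ASC-ODC_CHARD_YAW_LT_TH': 1000,
    'ASC-ODC_DHARD_PIT_LT_TH': 3500,  'ASC-ODC_DHARD_YAW_LT_TH': 3500,
    'LSC-ODC_MICHCTRL_LT_EQ_TH': 300000,
}
OLD = {
    'ASC-ODC_REFL_A_DC_GT_TH': -900.0, 'ASC-ODC_REFL_B_DC_GT_TH': -750.0,
    'ASC-ODC_CHARD_PIT_LT_TH': 750.0,  'ASC-ODC_CHARD_YAW_LT_TH': 750.0,
    'ASC-ODC_DHARD_PIT_LT_TH': 2500.0, 'ASC-ODC_DHARD_YAW_LT_TH': 2500.0,
    'LSC-ODC_MICHCTRL_LT_EQ_TH': 150000.0,
}

def widening_factors(new, old):
    """Return {channel: new/old ratio} for channels present in both dicts."""
    return {ch: new[ch] / old[ch] for ch in new if ch in old}
```

For example, the MICH control threshold was doubled (factor 2.0) while the DHARD thresholds were widened by 1.4x, which gives a feel for how conservative each change is relative to the triggers seen mid-lock.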
Overview Meeting Notes:
Tasks For Tomorrow's Maintenance
/* waffle, random thoughts & meanderings below */

Plots of the reference cavity and pre-modecleaner transmission for the past day:
slope of reference cavity transmission: 0.27 mW/day
slope of pre-modecleaner transmission: 110 mW/day
slope of QPDY: 0.002 microns/day
slope of QPDX: 0.003 microns/day

We should crunch some numbers to see whether this is consistent with the beam change out of the pre-modecleaner. The laser output power actually fell over the same time interval, and the diffraction average has decreased to match. A comparison of the cavity spots over the past week can be seen in AlignmentShift.gif.
TITLE: 12/7 DAY Shift: 16:00-00:00UTC (08:00-16:00PDT), all times posted in UTC
STATE of H1: DOWN; HAM2/3 ISI Guardian issues being looked at.
Outgoing Operator: Patrick
Quick Summary: useism still elevated (LVEA at 1 um/s, but might be trending down slightly if you squint at it). The earthquake band (0.03-0.1 Hz) looks like it finally returned to normal levels at about 14:00 UTC (6am PST). Hugh is working on HAM2/3 ISI recovery.
IFO has been down for 20+hrs.
O1 day 80
model restarts logged for Sun 06/Dec/2015 No restarts reported
TITLE: 12/7 [OWL Shift]: 08:00-16:00 UTC (00:00-08:00 PDT), all times posted in UTC
STATE of H1: Unlocked
SHIFT SUMMARY: Earthquakes tripped most of the watchdogs at the beginning of the shift. Brought everything back except the HAM2 ISI and HAM3 ISI, which had guardian user code error messages. Hugh is investigating.
SUPPORT: Hugh
INCOMING OPERATOR: Corey
Ran into guardian user code error messages for the HAM2 and HAM3 ISIs. Have done some investigating, but I don't dare change anything. I suspect it may be related to the changes reported in alog 23695. I haven't found a way to get around these errors.
Noticed SDF differences related to the BS ISI stage 2 GS13. Clicking on LO cleared them.
Guardian user code error after resetting HAM2 ISI watchdog. (see attached screenshot)
Hit LOAD, same error returned.
Same error on HAM3 ISI.
Hitting LOAD on HAM3 ISI does not work either.
No error on HAM6 ISI.
No error on HAM5 ISI.
No error on HAM4 ISI.
No error on ETMX ISI.
ETMY Stage 1 T240 watchdogs tripped in the process of going to isolated.
Everything seems back except for HAM2 ISI and HAM3 ISI. These both have guardian user code error messages. This is preventing the IMC from locking. The ISC_LOCK guardian is stuck in the DOWN state waiting for the IMC.
The above instructions have changed only in that the name of the state to request is now PRMI_LOCKED:
If you find that DRMI won't lock and seems to be misaligned (you see a lot of higher-order modes on the AS camera, and the POP90 and POP18 flashes are low), you may want to lock PRMI to adjust the alignment manually:
The transition from PRMI to DRMI only works sometimes, so for now this is probably best used only when the alignment is bad. (It's not a reliable way to speed up DRMI locking.)