Reseated 10GbE card on h2daqfw1 (slot 5) which seems to have cleared up the bus timeout messages from dmesg. That should be fixed, unless issues return. I forgot to add in the previous report that disabling the option ROMs in the BIOS for all slots except 0 (the RAID controller), and removing the network cards from the boot list considerably improves the boot time. Counting memory on startup is still slow, as is applying power for the first time (the iLOM controller must boot first in that case). Will see if there is some setting for the iLOM that can get around this. Have also finished installing the cards in the H1 DAQ x4270s, and installed Solaris 10 so they are also ready to go (minus user accounts, iLOM setup, etc.). Did not see any glaring hardware issues. These two machines are now re-boxed in the MSR.
This is simply a test of the add report page. The author field is frozen now (mostly).
Just a quick reminder: the logbook will be undergoing scheduled maintenance today from 10am to noon Pacific time. If you have reports that will span that window, please save them as a draft or publish them before 10am. You can finish them after noon (Pacific).
On Thursday July 1, I modified the DAQ .ini file located in '/opt/rtcds/geo/g1/chans/daq' named 'G1ISIHAM.ini'. This new file has added some needed channels to acquire data and write frame files on the test stand front end. I re-loaded the DAQ and re-booted the framebuilder. The complete list of available DAQ channels that are currently writing frame files is attached.
- We have completed the set of Local to Local measurements.
- We have made a full set of Modal to Modal measurements today.
- The attached document shows the transfer functions and summarizes all the data acquisition parameters.
- We have started analyzing the results. At least two things catch attention and need to be checked: the resonance at 24 Hz in the GS13, and the phase of the CPS at low frequencies.
- We are taking more data with a finer resolution and more averages.
- awgstream issues have been solved by the installation of the new version of awgtpman.
- About the get_data issue, Dave has freed up some disk space and installed the wiper script to prevent the disk from filling up again.
- We ran transfer functions this afternoon, and everything went well.
We now have a first complete set of transfer functions in the local basis. It is attached. A few comments about these transfer functions:
- Low frequency data is noisy. We will increase the number of averages.
- Transfer functions are going to be compared with HAM 6 measurements.
- There is an undesired resonance at 24 Hz.
We are preparing a set of measurements in the general coordinate basis for tonight.
The /frames file system was 100% full, which means no new frames
were being written by the daq.
I have written a primitive script to just delete the oldest full frame
and second trend frame directories if /frames gets bigger than 95%.
The script runs as a cron job, as user controls on seiteststand:
# Every hour run script to keep /frames from filling
# check /tmp/daq_wiper.log for its logs
0 * * * * /home/controls/crontabs/daq_wiper.pl
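For reference, the wiper logic described above can be sketched roughly as follows. This is not the actual daq_wiper.pl (which is a Perl script whose contents are not reproduced here), just a minimal Python illustration. The function names, the choice of "oldest" as the oldest modification time, and the loop that repeats until usage drops to the threshold are all assumptions made for the example.

```python
import os
import shutil


def usage_percent(path):
    """Return how full the file system holding `path` is, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def oldest_subdir(parent):
    """Return the sub-directory of `parent` with the oldest mtime, or None."""
    subdirs = [e.path for e in os.scandir(parent) if e.is_dir()]
    return min(subdirs, key=os.path.getmtime) if subdirs else None


def wipe_oldest(mount, frame_parents, threshold=95.0, get_usage=usage_percent):
    """Delete the oldest directory under each parent while usage > threshold.

    `frame_parents` would be the full-frame and second-trend directories
    (hypothetical paths; the real layout under /frames is not shown here).
    Returns the list of deleted directories, in deletion order.
    """
    deleted = []
    while get_usage(mount) > threshold:
        victims = [d for d in map(oldest_subdir, frame_parents) if d]
        if not victims:
            break  # nothing left to delete; avoid looping forever
        for victim in victims:
            shutil.rmtree(victim)
            deleted.append(victim)
    return deleted
```

Passing `get_usage` as a parameter makes it possible to try the logic on a scratch directory without actually filling a disk.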
This would apply to all probes, as in HAM, BSC, etc. We found that MicroSense did not report the standoffs, and when asked were surprised it was not 2mm. They do record them but don't list the number on the calibration sheet. I'm asking for them all, but to date only have a few that they told me. One was listed as 2.259mm (89mil) and four of the seven were greater than 2.16mm! They will 'mess' with them if they are too far below the nominal, and I don't know all the numbers yet. This was a source of much of our confusion when we were jigging these in to get noise spectra. Eric ended up using six 10mil shims on the test jig to get the outputs near zero.

Bottom line here is I wanted to post an email explanation given by MicroSense showing why our gaps are not exactly 2mm. Sounds like we could specify that if we think it is needed, but apparently we did not do so.

From Roy Mallory, 1 July 2010:

Hi Hugh,

We calibrate to achieve a particular scale factor and to minimize linearity error, but don't attempt to tweak the standoff. The nominal standoff is determined by design parameters, but a number of things, like component tolerances, can cause it to be slightly different from unit to unit. It's also tricky to measure and define. For example, is the standoff the shortest line from the center of the probe to the target? Is it the distance between the point on the target and the point on the probe that will first contact if the two are translated until they touch? If the probe and target aren't perfectly parallel, the standoff will be affected, but is that the standoff to report?

What we do--mostly because it's simple--is to run the probe and target together until they touch. The target, which is the back side of a mirror, is held to its mover by a light magnetic force. The front side of the mirror is monitored by an interferometer. We note the interferometer's reading with the mirror resting against the probe face, and then move the mirror until the cap gage is at the center of its range, noting the amount the interferometer reading has changed. This method will be affected by parallelism, by any nonflatness in the probe, and by any mote of dust on the probe or mirror.

Our gages do have a front-panel zero control that allows the standoff to be adjusted, however most of our customers want incremental accuracy and aren't overly concerned about absolute standoff. Does the LIGO application require some particular accuracy in standoff? If so, we can adjust the zero control at the time of calibration to achieve your desired standoff.

Roy
1) Alex installed the latest code last night to get the new version of awgtpman going. We hope this solves the drive issue we were having. I did several quick tests (1 min long). Everything seemed to be working fine.
2) I have started longer measurements (6 loops of 12 min each), and I have been having issues with get_data. I have increased the number of attempts to 25. It helps but doesn't solve the issue. The run ends up crashing. I copied the error message below.
------------------------------------------------
WARNING: attempt #24 to get drive failed wait 3 sec and try again
Getting data from nds0:8088... Data received!
WARNING: attempt #25 to get drive failed wait 3 sec and try again
ERROR: unable to get data for this drive segment after multiple tries! giving up the data FRD has the correct frequency vector, but The DATA and DATA QUALITY frds for this segment will all BE SET TO ZERO
It looks like we have been having two types of issues:
1- Several attempts are sometimes necessary to get the data. That's what I was seeing during the quick tests.
2- The disk got full during the long test. Dave is fixing it.
We should be back to system identification shortly.
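To make the retry behaviour in the error message concrete, here is a minimal Python sketch of a fixed-wait retry loop that falls back to zeros once the attempts are exhausted. It is not the actual get_data code; the function name, the `fetch` callable, and the use of a plain list of zeros are assumptions for illustration only. The 25 attempts and 3 second wait are taken from the log above.

```python
import time


def fetch_segment_or_zeros(fetch, n_samples, max_attempts=25, wait_s=3.0,
                           sleep=time.sleep):
    """Call `fetch()` until it succeeds or `max_attempts` is reached.

    `fetch` is a hypothetical zero-argument callable that returns the data
    for one drive segment or raises an exception on failure.
    Returns (data, ok); on total failure data is a list of zeros.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(), True
        except Exception as err:  # broad on purpose: this is only a sketch
            print(f"WARNING: attempt #{attempt} failed ({err})")
            if attempt < max_attempts:
                sleep(wait_s)  # fixed wait before the next try
    print("ERROR: giving up, segment set to zero")
    return [0.0] * n_samples, False
```

A retry loop like this only helps with the first kind of issue (intermittent fetch failures). If the disk is full, every attempt fails and the segment is zeroed regardless of the attempt count.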
Corey is still having issues with his name not being pre-populated in the author field of an entry. This does not happen all the time. He reports that it happened again last night/this morning. One of the details he gives is that it happened in a browser tab that had been open to the alog for a long period of time. This may be an issue of the Shibboleth session timing out, possibly a bad interaction between lazy binding (which allows us to do anonymous read-only access) and the alog. If this is the case, it would explain Corey's issue. There are ways to mitigate this: extending the session length for Shibboleth, or making the alog more Shibboleth aware. I will test this on gold, the test alog server. There I have the freedom to make the session timeout very small without interfering with others.
At 3:00pm Pacific time on 1 July 2010 I will restart the Shibboleth daemon on the alog. This is an attempt to address the issues that have been seen by Corey Gray regarding the author field not being filled out. Restarting the Shibboleth daemon may require you to hit the login button again. It should not damage any posts you are working on, but please save your posts to a draft just in case. I will be extending the Shibboleth session time to a large value. What is happening with Corey is that his Shibboleth session is expiring after 24 hrs. However, the alog still has enough information to identify him, so the posts are allowed, but the author name drops out. This change can be done very quickly, without requiring new code to be written, so we will try this first. Scott Koranda and I are curious about how this will work out, i.e., whether having an SP session that is much longer than the IdP session will cause any problems. Interesting times.
The Shibboleth daemon has been restarted. This is a test entry. I realized I had restarted it with the wrong config value; the shib daemon was restarted again around 3:30 with the proper settings. If you see a case where the author name is not filled out, please fill it in with your ligo.org name (including the @LIGO.ORG) and post (or save to draft) the entry you were working on. Then log out and log back in. This problem should not present itself again. Sorry for the hassle.
Tonight, I had hopes of starting one of Fabrice's transfer functions overnight. Unfortunately, I was not able to because I was never able to clear the Watchdog.

As for what seemed to be causing the trips: for the most part, the H1 & V1 Actuators would immediately rail to 32k. This would cause two things: an Actuator WD trip and an H1/V1 Geophone WD trip. Additionally, the Actuator would remain in RED in the "First Trig" state. I tried various tricks with gains, and turning off inputs/outputs, to no avail. I looked at the rack and nothing was amiss.

Reboot of seiteststand
At this point, I performed a reboot of the frontend. It was straightforward and there were no issues, but it didn't change the situation.

Matrices Re-filled
All of our filter bank paths are fairly simple (no filters engaged and gains of 1 all around). The oddities I found were in some of the matrices. In the CONT2ACT matrix there was a 1 in a place where there wasn't supposed to be a 1, and one of the matrix elements had a sign different from what it should be as well (see attached image of the cont2act matrix). Because of this I decided I might as well run the scripts to fill the matrices. One more note: the DispAlign matrix was also empty, so I put 1's down the diagonal of this one as well.

Matrix Set-Up Scripts
Jeff Kissel has some bash scripts which fill the HAM-ISI matrices (they are the ones used to fill the HAM6 ISI for both sites). Since Dave copied over the epics bin area to the seiteststand, we are now able to run bash scripts (as well as use commands such as "caput" & "caget"). Nic helped me edit the scripts to make them work (among some location-dependent changes, we commented out the --noprofile --norc values at the beginning of the script). I ran the scripts for the following matrices: Disp2Cen, Geo2Cen, & Cont2Act.

After all of this, I was still not able to clear the Watchdog (or run our measurement). I've attached a snapshot of the Watchdog. Notice the "First Trig" of the Actuators (and how the Actuator Monitors are all 0). As soon as I click RESET, these monitors show H1/V1 railing to 32k.
Just thought I'd add the location of these bash matrix scripts (Disp2Cen, Geo2Cen, & Cont2Act). They are located at: /opt/svncommon/seisvn/seismic/HAM-ISI/X1/Scripts/
We are having trouble with the drive generated by awg. For some periods of time it goes "weird". It can trip the watchdogs, as illustrated in the attached document. In this example:
- the drive is good at the beginning
- then it starts doing unwanted high frequency stuff
- then it does other unwanted "oscillations" at frequencies coinciding sufficiently with the HAM-ISI resonance to trip the watchdogs.
This is preventing us from completing our measurements.
Cut-paste from David's email, 03:07 30/6/2010:

1) Threads, displayed in a tree format so that you can follow threads, with the tree organized either alphabetically or by date (and with filters for author, entry age, etc.). One could follow what has been written in a recent thread about the system they are working on and then add another entry to that thread. No need for users to type a matching keyword, since the "Add Entry To This Thread" does that automatically. One would like to be able to title a thread -- this could just be the first entry title, but it would be useful to be able to edit that to cover the thread content and to make it easier to search for.
2) Searching across the installations (certainly LLO and LHO) is needed. I think it would be neat to be able to search the Virgo Cascina installation as well -- that is a question of international politics, I suspect. We would need a pull-down list to know which one(s) were being searched, and would want to be able to identify several or all.
3) Reduced Recent Author list for searches, and autocompletion on author names, ideally making reference to the recent author list to order options.
4) Can an individual's input and search settings be remembered for the next login from that individual? Hugh works on SEI at LHO -- would be a nice way to get things on the right foot. Could be a big thing software-wise, but would reduce a lot of the effort in getting going.
5) Notifications: ability to put in aliases here (maybe trivial, could be maintained elsewhere), but would allow subsystems and the team to be notified for certain entries.

SNS has these 'features'; are there some we want to plan to integrate?
1. Integrates with work orders -- if nothing else, having all work orders appear as logbook entries would be quite valuable, I think.
2. Group notification, daily orders and required readings
3. User bookmarked entries

Also, one can filter at the view page by ordered posting time, author, or priority, a nice feature.

Their initial data message entry keyword/category input looks like this. I think something like this would be very good for us; sounds like a database dimensionality question, though, so to be resolved soon if so. Breaking things down into more columns seems attractive.

Entry types: Progress (default), Problem, Repair, Maintenance, Status
Categories: Safety (might want to impose notification to all recent authors, or all authors, or something for this entry?), QA, SYS, IOO, PSL, SUS, SEI, COC, AOS, ISC, DAQ, DCS, FMP, INS
Location: L1, H1, H2, LHO, LLO, MIT, CIT, Other
Priority: Urgent, high, normal (default)