Reports until 18:59, Thursday 24 September 2015
H1 INJ (DetChar, INJ)
christopher.biwer@LIGO.ORG - posted 18:59, Thursday 24 September 2015 - last comment - 20:52, Thursday 24 September 2015(21924)
End of impromptu hardware injection testing and update to scheduling file
Adam, Alex U., Chris, Mike L., Eric

With a bit of effort we have successfully scheduled and injected a coherent CBC hardware injection. We have scheduled three coherent hardware injections for later tonight at 1127179817, 1127181617, and 1127183417. Those injections are the last of the night.

Here I will summarize what has happened.

All attempted/successful injections used the H1L1 coherent waveform from aLog 21838.

The are a couple differences between the sites. LLO uses monit to run tinj, whereas LHO does not. LLO uses the new hardware injection SVN, whereas LHO uses the DAC SVN still. The directory to run tinj at both sites is /home/hinj/Details/tinj.

Note that the times below are not the end time of the inspiral but rather this is the start time of injecting the ASCII file. The end time of the inspiral is listed on the gracedb page for the hardware injection and is ~6.986 seconds after the injection was scheduled.

Adam and Chris set out to do a H1L1 coherent hardware injection. See 21901.

We initially tried to scheduled a hardware injection at 1127175848. However the first step, i.e. to upload a GraceDB entry did not work. Chris was given a permissions warning and denied to upload to the CBC group.

The second attempt was scheduled for 1127168901. This was unsuccessful.
  * A gracedb page was generated for this hardware injection H1 and L1. No explanation for why Chris had permission this time.
  * At LLO the wrong schedule file was updated so there was no injection at LLO.
  * At LHO tinj had stopped before the time of the injection so there was no injection at LHO.

We then stopped the tests and went to the hardware injection telecon.

On the hardware injection telecon we had a group debug session.

We had to get tinj running at LHO again. To do this we did:
cd /home/hinj/Details/tinj
./run_tinj &

After running this command tinj was running again with process ID 11229. We tailed the log file (/home/hinj/Details/tinj/tinj.log) and saw that it was updating again.


Even though tinj is running in the background it still prints output to stdout. So Chris logged out of h1hwinj1 and logged back into h1hwnj1. tinj was still running.

The thought for the tinj failing was that when Chris edited the schedule file (/home/hinj/Details/tinj/schedule) the MCR environment that was used for testing monit received an error. Eric had checked the run_tinj log (/home/hinj/Details/tinj/run_tinj.log); note run_tinj is a wrapper to run tinj in monit. The log said:
ERROR: Unexpected error in tinj.
GPS=1127167412

So we did another test with tinj. The tests were logged in 21911.

The third attempt was scheduled for 1127172657. This injection was unsuccessful.
  * tinj stopped. Debugging by Eric showed that we need to change the filename of the waveform single-column ASCII files. Inspiral waveform files need have _H1 or _L1 at the end. So we renamed coherenttest1_*.out to coherenttest1_*_H1.out and coherenttest1_*_H1.out.
  * We also had to include a _ at the end of the filename in the schedule. So the line should have read "1127172657 1 0.5 coherenttest1_1126257410".

We made these corrections. But LLO went out of lock.

The fourth attempt was scheduled for 1127173421 and it was H1 only. It was successful.
  * The waveform was injected and observed going through the hardware injection filterbanks.
  * It had a scale factor of 0.5 applied.
  * A gracedb page was generated for this hardware injection H1 and L1.

Now that we had tinj running and a successful injection at LHO we waited for LLO to come back.

The fifth attempt was scheduled for 1127175848 and was a H1L1 coherent injection into both detectors. It was successful.
  * This was the first scheduled coherent CBC injection in aLIGO.
  * It was observed in both detectors. There was a CWB trigger for this event in gracedb and it was flagged with the INJ tag.
  * A gracedb page was generated for this hardware injection H1 and L1.

We then scheduled three more injections for the night at 1127179817, 1127181617, and 1127183417.

The schedule file now reads:
1127168901 1 0.5 coherenttest1_1126257410
1127172657 1 0.5 coherenttest1_1126257410
1127173421 1 0.5 coherenttest1_1126257410_
1127175848 1 1.0 coherenttest1_1126257410_
1127179817 1 1.0 coherenttest1_1126257410_
1127181617 1 1.0 coherenttest1_1126257410_
1127183417 1 1.0 coherenttest1_1126257410_

More plots and followup to come later.

An action item is to update the hardware injection documentation with the correct filename format and the commands to restart tinj on the command line.

And I will add as another note, to get to the monit web browser interface. On a controls machine you can type for the URL "http://h1hwinj1/" and it will take you to the monit web browser interface. At the moment I see no way to manage tinj this way but once we switch to monit this will be a good thing to remember.
Comments related to this report
christopher.biwer@LIGO.ORG - 19:37, Thursday 24 September 2015 (21927)
I stuck around to watch the other injections. It looks like tinj had cancelled them because of a GRB alert.

I see lines in tinj.log like:
The current time is GPS=1127183407
Total number of injections in schedule = 40.
Injections scheduled for the future = 1.
Injection imminent in ~10s...
  1127183417 1 1 coherenttest1_1126257410_
GRB alert at 1127179156: injection allowed.
Injections paused: injection canceled.
Rechecking schedule...

LigoDV web isn't cooperating right now so I don't have plots yet.
peter.shawhan@LIGO.ORG - 19:40, Thursday 24 September 2015 (21929)
The injections scheduled for 1127181617 and 1127183417 were skipped by tinj because we were in a "GRB alert" status -- though in this case, the "GRB" was the previous hardware injection that was treated like a real low-latency GW trigger!  That was the correct behavior of the external alert code, though a little unfortunate that we didn't get those last two injections.  I actually realized this was going to happen shortly before the last injection and tried to clear the GRB alert status, but it turns out that H1:CAL-INJ_TINJ_PAUSE was ALSO set and that made tinj skip the injection.
christopher.biwer@LIGO.ORG - 19:47, Thursday 24 September 2015 (21931)DetChar, INJ
I've attached time series of the three scheduled injections. You see only the first one at 1127179817 went in. The scheduled injections at 1127181617 and 1127183417 did not because of a GRB alert.
Images attached to this comment
peter.shawhan@LIGO.ORG - 20:52, Thursday 24 September 2015 (21932)INJ
Recap of the sequence of coherent injection attempts (successful or unsuccessful):

The first coherent injection was done with a file start time of 1127175848 = 00:23:51 UTC.  It was successful, and resulted in a CWB trigger at 1127175853.94 and an oLIB trigger at 1127175853.98.  Note that those are 5.94 or 5.98 seconds after the file start time, not "~6.986" as stated in the original post -- I think that was simply off by one second.  The external alert code (GRB/SNEWS/GW) sounded an audible alarm.  A well-meaning colleague did the advocate sign-off as "NOT OK" since it was a hardware injection, not realizing that we were doing a test and wanted to treat it like a real event.  That caused Approval Processor to issue a retraction, as it is programmed to do.

The second coherent injection was done with a file start time of 1127179817 = 01:30:00 UTC.  This too was successful, and resulted in a CWB trigger at 1127179822.96 and an oLIB trigger at 1127179822.99.  This event was labeled "OK" by the operators at both sites, and by an on-shift Advocate, Xingjiang Zhu.  That allowed an Initial alert to be sent to GCN (but NOT distributed to astronomers, during this testing phase).  A few minutes later, Leo Singer changed the advocate sign-off to "NOT OK", and we verified that a retraction alert was issued.

The second injection also had a side effect that we didn't anticipate: since a significant GW candidate generates a control-room alert equivalent to a GRB alert, it put us into a GRB Alert status with an automatic 1-hour pause of hardware injections.  That is the correct behavior under normal circumstances, but tonight had the effect of overriding the third and fourth scheduled injections in tonight's testing.  I actually realized that after the third one was skipped, and checked:
[hinj@h1hwinj1 tinj]$ ezcaread H1:CAL-INJ_EXTTRIG_ALERT_TIME -i
H1:CAL-INJ_EXTTRIG_ALERT_TIME = 1127179823 (1.12718e+09)
(the time of the CWB trigger!)  I then attempted to manually clear the GRB Alert status in order to allow the fourth injection to proceed:
[hinj@h1hwinj1 tinj]$ tconvert now; ezcawrite H1:CAL-INJ_EXTTRIG_ALERT_TIME 0
1127183084
H1:CAL-INJ_EXTTRIG_ALERT_TIME = 0
However, the tinj.log file showed that the injection was going to be skipped anyway:
...
Injection imminent in ~289s...
  1127183417 1 1 coherenttest1_1126257410_
GRB alerts are off.
Injections paused: injection canceled.
At this point I thought that setting H1:CAL-INJ_EXTTRIG_ALERT_TIME to 0 might be treated specially, so at GPS 1127183156 I executed the command ezcawrite H1:CAL-INJ_EXTTRIG_ALERT_TIME 1127179156 to set that channel equal to a time a little more than an hour before the fourth scheduled injection.  However, tinj still said "Injections paused".  Finally I realized why:
[hinj@h1hwinj1 tinj]$ ezcaread H1:CAL-INJ_TINJ_PAUSE -i
H1:CAL-INJ_TINJ_PAUSE = 1127183422 (1.12718e+09)
That GPS time is one hour after the CWB event candidate from the second coherent injection.

Evidently, the ext_alert.py program currently running at LHO is setting CAL-INJ_TINJ_PAUSE.  I'm surprised, since I thought we had resolved that it wouldn't (alog 21823), and the copy of the code running at LLO didn't, but maybe the LHO copy isn't the very latest version, or was started with different command-line options.