Displaying report 1-1 of 1.
Reports until 09:45, Friday 18 May 2018
H1 DAQ (DAQ)
jonathan.hanks@LIGO.ORG - posted 09:45, Friday 18 May 2018 - last comment - 14:02, Friday 18 May 2018(42066)
WP7558 - update of 17 May work - using cpuset to isolate the daqd process
Yesterday I tried to isolate the daqd process from other system processes by using the cpuset program to give it exclusive access to most of the cores on the system.  I was not able to move all processes off those cores, but the change reduced scheduling pressure on 11 of the cores, which gives daqd more CPU time.

We let it run overnight.  The attached plot has 3 runs:

 * left - The original code.
 * middle - The code with faster checksumming.
 * right - The code with faster checksumming and an isolated process.

The retransmit rate was reduced by more than 50%.  Today I will work on how to automate the setup I ran h1fw2 in yesterday.

With the reduction of retransmit requests that we are seeing on h1fw2, I will be filing a work permit to do this work on h1fw0 next week.
Images attached to this report
Comments related to this report
jonathan.hanks@LIGO.ORG - 14:02, Friday 18 May 2018 (42076)
I reworked how this was done and put it into the CDS puppet.  Instead of using the cpuset tools, I used systemd to set the CPU affinities.  Systemd was configured with "CPUAffinity=0" in the /etc/systemd/system.conf file.  Then the daqd_h1fw2 unit file was given "CPUAffinity=1 2 3 4 5 6 7 8 9 10 11".  The goal is to have systemd restrict itself and all its child processes (which is everything, as it is the init process) to CPU 0.  The daqd process is then explicitly started on all the other CPUs.
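
As a rough illustration of the split described above, the sketch below parses the two CPUAffinity values quoted in this comment and checks that they do not overlap.  The function name parse_cpu_affinity is hypothetical and not part of systemd or the CDS puppet; it only mimics the documented value syntax (CPU indices or dash-separated ranges, separated by whitespace or commas).

```python
def parse_cpu_affinity(value):
    """Parse a systemd-style CPUAffinity value into a set of CPU indices.

    Hypothetical helper for illustration: accepts indices and low-high
    ranges separated by whitespace or commas.
    """
    cpus = set()
    for token in value.replace(",", " ").split():
        if "-" in token:
            low, high = token.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(token))
    return cpus


# Values quoted in the comment above.
init_cpus = parse_cpu_affinity("0")                        # system.conf
daqd_cpus = parse_cpu_affinity("1 2 3 4 5 6 7 8 9 10 11")  # daqd_h1fw2 unit

# The two sets should be disjoint so that daqd has its CPUs to itself.
overlap = init_cpus & daqd_cpus
print("init cpus:", sorted(init_cpus))
print("daqd cpus:", sorted(daqd_cpus))
print("overlap:", sorted(overlap))
```

Under this syntax the range form "1-11" should parse to the same set as the explicit list used in the unit file.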

On a first pass this looks more successful than using cpuset, where some processes would not migrate.  A listing of processes after rebooting h1fw2 shows no other processes on the cores set aside for daqd.
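
One way such a listing could be produced is sketched below.  This is not the command that was actually run on h1fw2; it is a minimal Linux-only sketch that assumes the reserved CPUs are 1-11 and that the process name reported in /proc/<pid>/comm is "daqd" (both are assumptions for illustration).  It reports every process whose allowed-CPU mask includes a reserved CPU.

```python
import os

RESERVED = set(range(1, 12))  # assumed: CPUs 1-11 are set aside for daqd


def read_affinities():
    """Return {pid: (name, allowed_cpus)} for processes visible in /proc."""
    result = {}
    if not os.path.isdir("/proc") or not hasattr(os, "sched_getaffinity"):
        return result  # not a Linux system; nothing to report
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        pid = int(entry)
        try:
            with open(f"/proc/{pid}/comm") as f:
                name = f.read().strip()
            result[pid] = (name, set(os.sched_getaffinity(pid)))
        except OSError:
            continue  # process exited or is not readable; skip it
    return result


def intruders(affinities, reserved, expected_name):
    """List (pid, name, cpus) for processes allowed on reserved CPUs."""
    return sorted(
        (pid, name, sorted(cpus & reserved))
        for pid, (name, cpus) in affinities.items()
        if name != expected_name and cpus & reserved
    )


if __name__ == "__main__":
    for pid, name, cpus in intruders(read_affinities(), RESERVED, "daqd"):
        print(pid, name, cpus)
```

Note that this checks the affinity mask (where a process may run), not where it last ran, and that per-CPU kernel threads are pinned to their own cores, so they would still appear in a listing like this.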

For now we will watch how this behaves over the weekend.