Displaying report 1-1 of 1.
Reports until 15:38, Tuesday 27 February 2018
H1 CDS
david.barker@LIGO.ORG - posted 15:38, Tuesday 27 February 2018 (40750)
h1oaf processing time restored by filling gap in core assignment left by h1ngn

On Tue 16th Jan 2018 I stopped the h1ngn model and removed it from h1oaf0. From that point the h1oaf model started running long (left-hand plot in attachment). Note that h1oaf was not restarted at that time.

This morning I started with h1iopoaf0 and h1oaf as the only models running. In this configuration h1oaf ran at its pre-ngn-removal CPU usage with little deviation. After the other models on this computer were started, h1oaf ramped up into the 61 us range with large deviations, causing TIM errors.

Looking at the model/core distribution, the first physical CPU chip (6 cores, non-hyperthreaded) was fully utilized until h1ngn was stopped, which left a "hole" at core 4. I changed h1pemcs.mdl to move it from specific_cpu=7 to specific_cpu=4. After restarting all the models, this appears to have fixed h1oaf's issues (right-hand plot of attachment). It is not immediately clear why.

Here are the core layouts:

cpu0

core  model
0     General Linux
1     h1iopoaf0
2     h1calcs
3     h1oaf
4     was h1ngn, empty 1/16-2/27, now h1pemcs
5     h1susprocpi

cpu1

core  model
6     empty
7     was h1pemcs, now empty
8     h1tcscs
9     h1odcmaster
10    empty
11    empty
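
As a rough illustration of the "hole" check described above, here is a minimal Python sketch that lists unassigned cores per physical chip. The dictionaries are transcribed from the layouts in this entry. The function names (chip_of, empty_cores) are hypothetical and not part of any CDS tool, and the mapping of cores 0-5 to cpu0 and 6-11 to cpu1 is assumed from the tables.

```python
# Hypothetical helper, not a CDS/RCG tool: find unassigned cores per chip.
CORES_PER_CHIP = 6  # per this entry: 6 non-hyperthreaded cores per physical chip


def chip_of(core):
    """Return the physical chip index for a core (assumes cores 0-5 -> 0, 6-11 -> 1)."""
    return core // CORES_PER_CHIP


def empty_cores(layout, n_cores=12):
    """Map chip index -> list of cores that have no model assigned."""
    gaps = {}
    for core in range(n_cores):
        if layout.get(core) is None:
            gaps.setdefault(chip_of(core), []).append(core)
    return gaps


# Layout between 1/16 and 2/27 (h1ngn removed, h1pemcs still on core 7).
before_fix = {
    0: "General Linux", 1: "h1iopoaf0", 2: "h1calcs", 3: "h1oaf",
    4: None, 5: "h1susprocpi",
    6: None, 7: "h1pemcs", 8: "h1tcscs", 9: "h1odcmaster",
    10: None, 11: None,
}

# Layout after moving h1pemcs from specific_cpu=7 to specific_cpu=4.
after_fix = dict(before_fix)
after_fix[4] = "h1pemcs"
after_fix[7] = None

print(empty_cores(before_fix))  # {0: [4], 1: [6, 10, 11]}
print(empty_cores(after_fix))   # {1: [6, 7, 10, 11]}
```

With the new layout the first chip has no empty cores, which matches the configuration that was in place before h1ngn was stopped. The sketch only describes the layout; it does not explain why filling the gap restored h1oaf's processing time.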

 

Images attached to this report