Reports until 17:42, Monday 17 August 2015
H1 CDS
david.barker@LIGO.ORG - posted 17:42, Monday 17 August 2015 (20611)
conlog crashed on Saturday, again

Conlog stopped running at 10am PDT Saturday 15th:

Aug 15 10:01:21 h1conlog1-master conlog: ../conlog.cpp: 301: process_cac_messages: MySQL Exception: Error: Out of range value for column 'value' at row 1: Error code: 1264: SQLState: 22003: Exiting.

Aug 15 10:01:21 h1conlog1-master conlog: ../conlog.cpp: 331: process_cas: Exception: boost: mutex lock failed in pthread_mutex_lock: Invalid argument Exiting.
 
Here are the logs from the previous crash, logged in FRS3433, on Saturday August 8th:
 
Aug  8 12:25:48 h1conlog1-master conlog: ../conlog.cpp: 301: process_cac_messages: MySQL Exception: Error: Out of range value for column 'value' at row 1: Error code: 1264: SQLState: 22003: Exiting.
 
Aug  8 12:25:49 h1conlog1-master conlog: ../conlog.cpp: 331: process_cas: Exception: boost: mutex lock failed in pthread_mutex_lock: Invalid argument Exiting.
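To compare the two crashes mechanically, a short script can split each syslog line into its fields and check that both failures share the same source location and MySQL error code. This is only a sketch: the regular expression is inferred from the four lines quoted above, not from the conlog source, and the names `parse_conlog_line` and `signature` are made up for illustration.

```python
import re

# Layout inferred from the quoted lines (an assumption, not taken from conlog itself):
# "<Mon> <day> <HH:MM:SS> <host> conlog: <file>: <line>: <function>: <message>"
LINE_RE = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+conlog:\s+(?P<file>\S+):\s+(?P<line>\d+):\s+"
    r"(?P<function>\w+):\s+(?P<message>.*)$"
)
CODE_RE = re.compile(r"Error code:\s*(\d+)")


def parse_conlog_line(text):
    """Return a dict of fields for one conlog syslog line, or None if it does not match."""
    match = LINE_RE.match(text.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["day"] = int(record["day"])
    record["line"] = int(record["line"])
    code = CODE_RE.search(record["message"])
    # Lines without a MySQL error code (e.g. the boost mutex failure) get None.
    record["mysql_error_code"] = int(code.group(1)) if code else None
    return record


def signature(record):
    """Fields that identify where and why conlog exited, ignoring the timestamp."""
    return (record["file"], record["line"], record["function"], record["mysql_error_code"])


# The first log line from each crash, copied from this entry.
AUG15 = ("Aug 15 10:01:21 h1conlog1-master conlog: ../conlog.cpp: 301: "
         "process_cac_messages: MySQL Exception: Error: Out of range value for "
         "column 'value' at row 1: Error code: 1264: SQLState: 22003: Exiting.")
AUG08 = ("Aug  8 12:25:48 h1conlog1-master conlog: ../conlog.cpp: 301: "
         "process_cac_messages: MySQL Exception: Error: Out of range value for "
         "column 'value' at row 1: Error code: 1264: SQLState: 22003: Exiting.")

if __name__ == "__main__":
    a, b = parse_conlog_line(AUG15), parse_conlog_line(AUG08)
    print(signature(a))
    print("same failure signature:", signature(a) == signature(b))
```

Both crashes exit from the same line of process_cac_messages with MySQL error 1264, so comparing signatures rather than raw lines makes the repeat failure easy to spot in a longer syslog.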
 
I restarted conlog using a subset of the instructions from the wiki page:
 
1. Check the conlog master database:
 
> ssh cdsadmin@h1conlog1-master
cdsadmin@h1conlog1-master: mysqlcheck -u root -p --check --all-databases
Note: this takes 31 minutes to complete
 
2. Check the status of the replication processes on the master:
 
> ssh cdsadmin@h1conlog1-master
cdsadmin@h1conlog1-master: mysql -u root -p
mysql> SHOW PROCESSLIST\G
'state' should report 'Has sent all binlog to slave; waiting for binlog to be updated'.
mysql> quit
 
3. Start the Conlog process on the master:
 
cdsadmin@h1conlog1-master: sudo -b -E -u conlog /opt/conlog/bin/linux-x86_64/conlog
 
4. Set the process variable list:
 
cdsadmin@h1conlog1-master: sudo /opt/conlog/bin/linux-x86_64/conlog_admin use /ligo/lho/data/conlog/h1/input_pv_list/pv_list.txt
Note: it takes about 2 minutes for the queue size and the unmonitored channel count to both come down to zero
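The replication check in step 2 could be scripted rather than read by eye. The sketch below assumes the process list has already been fetched into a list of dicts keyed by the usual SHOW PROCESSLIST column names (`Command`, `State`, ...), and that the master's replication thread shows `Binlog Dump` in the `Command` column. The function name and the sample rows are hypothetical, and the expected state string is the one quoted in step 2.

```python
# State string quoted in step 2 of this entry.
EXPECTED_STATE = "Has sent all binlog to slave; waiting for binlog to be updated"


def replication_caught_up(rows):
    """Return True if at least one binlog dump thread exists and all of them are idle.

    rows: list of dicts, one per SHOW PROCESSLIST row (assumed input shape).
    """
    dump_threads = [row for row in rows if row.get("Command") == "Binlog Dump"]
    if not dump_threads:
        # No slave is connected at all, so replication is not running.
        return False
    return all(row.get("State") == EXPECTED_STATE for row in dump_threads)


if __name__ == "__main__":
    # Hypothetical rows for illustration only; not copied from h1conlog1-master.
    sample = [
        {"Id": 1, "User": "repl", "Command": "Binlog Dump", "State": EXPECTED_STATE},
        {"Id": 2, "User": "root", "Command": "Query", "State": None},
    ]
    print("replication caught up:", replication_caught_up(sample))
```

Returning False when no dump thread is present is a deliberate choice: an empty process list would otherwise pass the check even though nothing is replicating.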
 
At the end of the process, all 95444 channels are being acquired by conlog. The DAQ EDCU connected count incremented to 24159 as the 4 conlog channels were connected to the EDCU.
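The wait in step 4 (queue size and unmonitored count both reaching zero) could be wrapped in a small polling loop. This is a sketch under assumptions: `read_status` is a hypothetical callable returning `(queue_size, unmonitored)`, since this entry does not name the conlog status channels, and the timeout and poll interval are arbitrary defaults.

```python
import time


def wait_until_settled(read_status, timeout_s=300.0, poll_s=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll read_status() until both counts are zero or the timeout expires.

    read_status: hypothetical callable returning (queue_size, unmonitored).
    Returns True if both values reached zero, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        queue_size, unmonitored = read_status()
        if queue_size == 0 and unmonitored == 0:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


if __name__ == "__main__":
    # Made-up readings that drain to zero, standing in for the real status values.
    readings = iter([(5000, 900), (1200, 40), (0, 0)])
    settled = wait_until_settled(lambda: next(readings), sleep=lambda s: None)
    print("conlog settled:", settled)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without waiting in real time; in normal use the defaults apply.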