Displaying report 1-1 of 1.
Reports until 13:48, Monday 23 July 2012
X1 SUS
james.batch@LIGO.ORG - posted 13:48, Monday 23 July 2012 (3541)
X1 Tripleteststand running intermittently
The tripleteststand is malfunctioning. Symptoms: the GPS time for the user model and the IOP model freezes at some point (at the identical value for both); killing the user model and restarting the IOP then results in a GPS time of 0, and the IOP indicates that no ADC/DAC cards exist.  Running the dmesg command after the IOP restart shows the following kernel output:

[13290.003398] x1ioptriple: Allocated daq shmem; set at 0xffffc9001a10d000
[13290.003399] x1ioptriple: configured to use 5 cards
[13290.003400] x1ioptriple: Initializing PCI Modules
[13290.003408] x1ioptriple: ADC card on bus a; device 4 prim a
[13290.003409] x1ioptriple: adc card on bus a; device 4 prim a
[13290.003415] x1ioptriple: pci0 = 0xffffffff
[13290.003423] resource map sanity check conflict: 0xffffffff 0x1000001fe 0xffc00000 0xffffffff reserved
[13290.003426] ------------[ cut here ]------------
[13290.003431] WARNING: at arch/x86/mm/ioremap.c:98 __ioremap_caller+0xd5/0x301()
[13290.003432] Hardware name: X8DTU
[13290.003433] Info: mapping multiple BARs. Your kernel is fine.
[13290.003434] Modules linked in: x1ioptriplefe(+) mbuf [last unloaded: x1ioptriplefe]
[13290.003437] Pid: 5760, comm: insmod Not tainted 2.6.34.1 #7
[13290.003438] Call Trace:
[13290.003442]  [] warn_slowpath_common+0x77/0x8f
[13290.003444]  [] warn_slowpath_fmt+0x3c/0x3e
[13290.003446]  [] __ioremap_caller+0xd5/0x301
[13290.003449]  [] ? T.484+0x13/0x15
[13290.003451]  [] ? pci_bus_read_config_dword+0x66/0x74
[13290.003452]  [] ioremap_nocache+0x12/0x14
[13290.003457]  [] mapAdc+0x65/0x251 [x1ioptriplefe]
[13290.003461]  [] mapPciModules+0x6c1/0x824 [x1ioptriplefe]
[13290.003465]  [] init_module+0x242/0x97d [x1ioptriplefe]
[13290.003468]  [] ? init_module+0x0/0x97d [x1ioptriplefe]
[13290.003471]  [] do_one_initcall+0x59/0x149
[13290.003475]  [] sys_init_module+0xd1/0x231
[13290.003477]  [] system_call_fastpath+0x16/0x1b
[13290.003478] ---[ end trace 31c35fdb3a0a9ba9 ]---
[13290.003481] ioremap reserve_memtype failed -22
[13290.003483] x1ioptriple: pci2 = 0xffffffff
[13290.003485] resource map sanity check conflict: 0xffffffff 0x1000001fe 0xffc00000 0xffffffff reserved
[13290.003487] ioremap reserve_memtype failed -22
[13290.003488] x1ioptriple: ADC I/O address=0xffffffff  0x0
[13290.003491] BUG: unable to handle kernel NULL pointer dereference at (null)
[13290.003919] IP: [] mapAdc+0xd8/0x251 [x1ioptriplefe]
[13290.004137] PGD 1b7cbc067 PUD 1b7c84067 PMD 0 
[13290.004352] Oops: 0000 [#1] SMP 
[13290.004562] last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:27:01.0/class
[13290.004974] CPU 2 
[13290.004979] Modules linked in: x1ioptriplefe(+) mbuf [last unloaded: x1ioptriplefe]
[13290.005596] 
[13290.005801] Pid: 5760, comm: insmod Tainted: G        W  2.6.34.1 #7 X8DTU/X8DTU
[13290.006215] RIP: 0010:[]  [] mapAdc+0xd8/0x251 [x1ioptriplefe]
[13290.006637] RSP: 0018:ffff8801b7d49dc8  EFLAGS: 00010292
[13290.006846] RAX: 000000000000003f RBX: 0000000000000000 RCX: 000000000000003f
[13290.007059] RDX: 0000000000020cc5 RSI: ffffffff8179c135 RDI: 000000000000000a
[13290.007272] RBP: ffff8801b7d49df8 R08: 000000007ffffff2 R09: 000000000000000a
[13290.007487] R10: 0000000000000006 R11: 00000000ffffffff R12: ffffffffa000f130
[13290.007701] R13: ffff8801be8ad800 R14: 0000000000000000 R15: 0000000000000000
[13290.007914] FS:  00007f957e7836f0(0000) GS:ffff880001e40000(0000) knlGS:0000000000000000
[13290.008326] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[13290.008537] CR2: 0000000000000000 CR3: 00000001a165d000 CR4: 00000000000006e0
[13290.008746] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[13290.008955] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[13290.009165] Process insmod (pid: 5760, threadinfo ffff8801b7d48000, task ffff8801bdd0f200)
[13290.009571] Stack:
[13290.009771]  0000000000000000 ffffffffa000f130 0000000000000001 0000000000000000
[13290.009985] <0> 0000000000000000 0000000000000000 ffff8801b7d49e48 ffffffffa00087b7
[13290.010401] <0> ffff8801b7d49e18 00000000b7d49e58 ffff8801b7d49e48 ffff8801b7d49e58
[13290.011023] Call Trace:
[13290.011231]  [] mapPciModules+0x6c1/0x824 [x1ioptriplefe]
[13290.011445]  [] init_module+0x242/0x97d [x1ioptriplefe]
[13290.011659]  [] ? init_module+0x0/0x97d [x1ioptriplefe]
[13290.011872]  [] do_one_initcall+0x59/0x149
[13290.012081]  [] sys_init_module+0xd1/0x231
[13290.012291]  [] system_call_fastpath+0x16/0x1b
[13290.012501] Code: 00 02 00 00 e8 0c cc 01 e1 8b 35 91 e8 49 00 48 89 c2 49 89 c6 48 c7 c7 fc d3 00 a0 31 c0 e8 a9 57 00 00 4e 89 34 fd 20 5e 01 a0 <41> 8b 36 48 c7 c7 19 d4 00 a0 31 c0 e8 90 57 00 00 4a 8b 14 fd 
[13290.013247] RIP  [] mapAdc+0xd8/0x251 [x1ioptriplefe]
[13290.013468]  RSP 
[13290.018602] CR2: 0000000000000000
[13290.019122] ---[ end trace 31c35fdb3a0a9baa ]---

The only way to recover is to power off the computer, power off the I/O Chassis at its power supply, wait, then power up the I/O Chassis followed by the computer.  After that, the IOP model can be started normally, followed by the user model.  So far, the system has died three times since Wednesday, July 18.
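
For spotting this failure the next time it happens, here is a minimal sketch that scans dmesg text for the three signatures in the kernel output above: a PCI read reported as 0xffffffff, the "ioremap reserve_memtype failed" message, and the NULL pointer dereference. A read of all ones is what a PCI device that is not responding usually returns, so the check treats it as a hint that the I/O Chassis cards are unreachable; that is an interpretation, not a confirmed diagnosis. The function names scan_dmesg and looks_like_lost_io_chassis are hypothetical and not part of any front-end code.

```python
import re

# Patterns copied from the dmesg excerpt in this entry.
# Treating them as a "lost I/O chassis" signature is an assumption.
BAR_ALL_ONES = re.compile(r"pci\d+ = 0xffffffff")
IOREMAP_FAIL = re.compile(r"ioremap reserve_memtype failed")
NULL_DEREF = re.compile(r"BUG: unable to handle kernel NULL pointer dereference")


def scan_dmesg(text):
    """Count how many lines of a dmesg dump match each failure signature."""
    counts = {"bar_all_ones": 0, "ioremap_failed": 0, "null_deref": 0}
    for line in text.splitlines():
        if BAR_ALL_ONES.search(line):
            counts["bar_all_ones"] += 1
        if IOREMAP_FAIL.search(line):
            counts["ioremap_failed"] += 1
        if NULL_DEREF.search(line):
            counts["null_deref"] += 1
    return counts


def looks_like_lost_io_chassis(text):
    """Return True if both the all-ones PCI read and the ioremap failure appear."""
    counts = scan_dmesg(text)
    return counts["bar_all_ones"] > 0 and counts["ioremap_failed"] > 0
```

The text to scan could come from a saved log file or from subprocess.run(["dmesg"], capture_output=True, text=True).stdout on the front-end machine.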