Trying to find reson for kernel softlockup

Go To Last Post
8 posts / 0 new
Author
Message
#1
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I am seeing this error after 6666sec/~2hour ( not fixed ) not the error occurs always after 2-5 hours when I have ppp0 and usb0 actve. Both networking interfaces are activly used. It is either my application that locks up or dropbear.

......................................................................................................................................................................
Sorry for the ... but its there for the readabilety of the next section

[ 6666.864000] BUG: soft lockup - CPU#0 stuck for 61s! [CurvusRecorder.:636]
[ 6666.864000] Modules linked in: ipv6 ubifs crc16 ubi rtc_max6902 g_ether
[ 6666.864000] irq event stamp: 2490300
[ 6666.864000] hardirqs last  enabled at (2490299): [<9007d55c>] kmem_cache_alloc+0x64/0xb4
[ 6666.864000] hardirqs last disabled at (2490300): [<90018024>] do_IRQ+0x10/0x5c
[ 6666.864000] softirqs last  enabled at (2490292): [<90134556>] lock_sock_nested+0x9a/0xa4
[ 6666.864000] softirqs last disabled at (2490290): [<90182fe4>] _spin_lock_bh+0xc/0x48
[ 6666.864000] PC is at kmem_cache_alloc+0x7c/0xb4
[ 6666.864000] LR is at 0x6
[ 6666.864000] pc : [<9007d574>]    lr : [<00000006>]    Not tainted
[ 6666.864000] sp : 9138bd34  r12: 00000001  r11: 913d4af8
[ 6666.864000] r10: 913d4af8  r9 : 020a7192  r8 : 908a328c
[ 6666.864000] r7 : 9138bd38  r6 : 00000000  r5 : 91c58b00  r4 : 7f3ffe20
[ 6666.864000] r3 : 911b3a80  r2 : 00400020  r1 : 00000144  r0 : 901378ee
[ 6666.864000] Flags: qvnZc
[ 6666.864000] Mode bits: hjmde....g
[ 6666.864000] CPU Mode: Supervisor
[ 6666.864000] Process: CurvusRecorder. [636] (task: 913d4740 thread: 9138a000)
[ 6666.864000] Call trace:
[ 6666.864000]  [<901378ee>] __alloc_skb+0x22/0xc4
[ 6666.864000]  [<90158a3c>] sk_stream_alloc_skb+0x20/0xac
[ 6666.864000]  [<90158bfa>] tcp_sendmsg+0x132/0x898
[ 6666.864000]  [<90133054>] sock_sendmsg+0x84/0x98
[ 6666.864000]  [<90133ab2>] sys_sendto+0x8a/0xa4
[ 6666.864000]  [<90133ad8>] sys_send+0xc/0x10
[ 6666.864000]  [<90016132>] syscall_return+0x0/0x12
[ 6666.864000]

I'am statis to suspect if it could be a memory management problem, or a memory timin issue?

Life's to short for waiting on slow CPU's

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Here is a fresh crash log.

[60041.496000] Modules linked in: ipv6 ubifs crc16 ubi rtc_max6902 g_ether
[60041.496000] irq event stamp: 4126653
[60041.496000] hardirqs last  enabled at (4126651): [<900280aa>] local_bh_enable_ip+0x76/0x8c
[60041.496000] hardirqs last disabled at (4126653): [<90018024>] do_IRQ+0x10/0x5c
[60041.496000] softirqs last  enabled at (4126650): [<901344ba>] release_sock+0x82/0x84
[60041.496000] softirqs last disabled at (4126652): [<90182fe4>] _spin_lock_bh+0xc/0x48
[60041.496000] PC is at _spin_lock_bh+0x22/0x48
[60041.496000] LR is at 0x91296e98
[60041.496000] pc : [<90182ffa>]    lr : [<91296e98>]    Not tainted
[60041.496000] sp : 911d9d08  r12: 00000001  r11: 00000001
[60041.496000] r10: 000003b8  r9 : 908a327c  r8 : 91296ae0
[60041.496000] r7 : 911d9d10  r6 : 90f36a3c  r5 : 90f36a2c  r4 : 7f7ffe20
[60041.496000] r3 : 901344d2  r2 : 911d9dcc  r1 : 9013a646  r0 : 00000000
[60041.496000] Flags: qvnzc
[60041.496000] Mode bits: hjmde....g
[60041.496000] CPU Mode: Supervisor
[60041.496000] Process: CurvusRecorder. [598] (task: 91296ae0 thread: 911d8000)
[60041.496000] Call trace:
[60041.496000]  [<901344d2>] lock_sock_nested+0x16/0xa4
[60041.496000]  [<9013a646>] sk_stream_wait_memory+0x106/0x170
[60041.496000]  [<901591aa>] tcp_sendmsg+0x6e2/0x898
[60041.496000]  [<90133054>] sock_sendmsg+0x84/0x98
[60041.496000]  [<90133ab2>] sys_sendto+0x8a/0xa4
[60041.496000]  [<90133ad8>] sys_send+0xc/0x10
[60041.496000]  [<90016132>] syscall_return+0x0/0x12
[60041.496000]

I still suspect a mem handling or timing error? Anyon?
Thanx for any replay.

Life's to short for waiting on slow CPU's

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Looks like you need to turn on spinlock debugging, seems like something gets stuck.

Hans-Christian

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

In my config:

CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_DEBUG_SPINLOCK=y
# CONFIG_DEBUG_SPINLOCK_SLEEP is not set

Do I also need the debug sleep option?

Edit: It is either my application that are using the netwok or dropbear that locks every time..

Life's to short for waiting on slow CPU's

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Here are my interfave conncetions.

~ # ifconfig
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

ppp0      Link encap:Point-to-Point Protocol
          inet addr:77.17.31.nn  P-t-P:192.168.202.1  Mask:255.255.255.255
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1500  Metric:1
          RX packets:5745 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6207 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:3
          RX bytes:304760 (297.6 KiB)  TX bytes:5794401 (5.5 MiB)

usb0      Link encap:Ethernet  HWaddr 06:05:36:E1:AB:BC
          inet addr:192.168.205.1  Bcast:192.168.205.255  Mask:255.255.255.0
          inet6 addr: fe80::405:36ff:fee1:abbc/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:154639 errors:0 dropped:0 overruns:0 frame:0
          TX packets:230966 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:9824242 (9.3 MiB)  TX bytes:64764422 (61.7 MiB)

Where I removed the last part of the ppp0 IP with nn.

the usb0 interface uses the g_ether module while the ppp0 uses pppd app to bind with ttyS3

I turned on CONFIG_DEBUG_SPINLOCK_SLEEP and are waiting for the problem to occur again.

Life's to short for waiting on slow CPU's

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0
[15642.272000] BUG: soft lockup - CPU#0 stuck for 61s! [dropbear:3492]
[15642.272000] Modules linked in: ipv6 ubifs crc16 ubi rtc_max6902 g_ether
[15642.272000] irq event stamp: 3697436
[15642.272000] hardirqs last  enabled at (3697435): [<9007cea2>] kfree+0xca/0xe4
[15642.272000] hardirqs last disabled at (3697436): [<90018024>] do_IRQ+0x10/0x5c
[15642.272000] softirqs last  enabled at (3697424): [<901354b6>] lock_sock_nested+0xa6/0xb4
[15642.272000] softirqs last disabled at (3697422): [<9018414c>] _spin_lock_bh+0xc/0x48
[15642.272000] PC is at skb_release_data+0x74/0x80
[15642.272000] LR is at 0x6
[15642.272000] pc : [<90138274>]    lr : [<00000006>]    Not tainted
[15642.272000] sp : 91a41d9c  r12: 00000001  r11: 9129a758
[15642.272000] r10: 9129a758  r9 : 0907e415  r8 : 908a528c
[15642.272000] r7 : 91a41d9c  r6 : 912d7a80  r5 : 912d7a80  r4 : 00027ba8
[15642.272000] r3 : 000005b4  r2 : 91a41f38  r1 : 00000001  r0 : 91a40000
[15642.272000] Flags: qvnzc
[15642.272000] Mode bits: hjmde....g
[15642.272000] CPU Mode: Supervisor
[15642.272000] Process: dropbear [3492] (task: 9129a3a0 thread: 91a40000)
[15642.272000] Call trace:
[15642.272000]  [<9013828e>] skb_release_all+0xe/0x10
[15642.272000]  [<901382ac>] __kfree_skb+0x8/0x74
[15642.272000]  [<90159b74>] sk_stream_alloc_skb+0xa4/0xac
[15642.272000]  [<90159cae>] tcp_sendmsg+0x132/0x898
[15642.272000]  [<9013356e>] sock_aio_write+0x86/0x8c
[15642.272000]  [<900801cc>] do_sync_write+0x88/0xb4
[15642.272000]  [<900805ee>] vfs_write+0x6e/0xe8
[15642.272000]  [<900807a4>] sys_write+0x2c/0x4c
[15642.272000]  [<90016132>] syscall_return+0x0/0x12
[15642.272000]

Crach with CONFIG_DEBUG_SPINLOCK_SLEEP on.

Life's to short for waiting on slow CPU's

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Taken less than 2 sec before crash.

Every 2s: cat /proc/meminfo   2010-05-21 14:36:27
MemTotal:          22964 kB
MemFree:            3860 kB
Buffers:             104 kB
Cached:             3016 kB
SwapCached:            0 kB
Active:             2812 kB
Inactive:           3528 kB
Active(anon):       1320 kB
Inactive(anon):     2340 kB
Active(file):       1492 kB
Inactive(file):     1188 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:          3248 kB
Mapped:              812 kB
Shmem:               440 kB
Slab:               7108 kB
SReclaimable:       1804 kB
SUnreclaim:         5304 kB
KernelStack:         552 kB
PageTables:          424 kB
Quicklists:           52 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:       11480 kB
Committed_AS:      10540 kB
VmallocTotal:     507904 kB
VmallocUsed:        4264 kB
VmallocChunk:     492064 kB

Life's to short for waiting on slow CPU's

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

I tried increasing all the timing on the memory x2 but still feils/locks.

[ 7328.288000] BUG: soft lockup - CPU#0 stuck for 61s! [CurvusRecorder.:557]
[ 7328.288000] Modules linked in: ipv6 ubifs crc16 ubi rtc_max6902 g_ether
[ 7328.288000] irq event stamp: 2263713
[ 7328.288000] hardirqs last  enabled at (2263711): [<900281de>] local_bh_enable_ip+0x76/0x8c
[ 7328.288000] hardirqs last disabled at (2263713): [<90018024>] do_IRQ+0x10/0x5c
[ 7328.288000] softirqs last  enabled at (2263710): [<9013540a>] release_sock+0x92/0x98
[ 7328.288000] softirqs last disabled at (2263712): [<9018414c>] _spin_lock_bh+0xc/0x48
[ 7328.288000] PC is at _spin_lock_bh+0x22/0x48
[ 7328.288000] LR is at 0x91e3a758
[ 7328.288000] pc : [<90184162>]    lr : [<91e3a758>]    Not tainted
[ 7328.288000] sp : 91355d08  r12: 00000001  r11: 00000001
[ 7328.288000] r10: 000003b8  r9 : 908a527c  r8 : 91e3a3a0
[ 7328.288000] r7 : 91355d10  r6 : 91f2583c  r5 : 91f2582c  r4 : 7fa2cea4
[ 7328.288000] r3 : 90135432  r2 : 91355dcc  r1 : 9013b676  r0 : 00000000
[ 7328.288000] Flags: qvnzc
[ 7328.288000] Mode bits: hjmde....g
[ 7328.288000] CPU Mode: Supervisor
[ 7328.288000] Process: CurvusRecorder. [557] (task: 91e3a3a0 thread: 91354000)
[ 7328.288000] Call trace:
[ 7328.288000]  [<90135432>] lock_sock_nested+0x22/0xb4
[ 7328.288000]  [<9013b676>] sk_stream_wait_memory+0x106/0x170
[ 7328.288000]  [<9015a25e>] tcp_sendmsg+0x6e2/0x898
[ 7328.288000]  [<90133f94>] sock_sendmsg+0x84/0x98
[ 7328.288000]  [<901349f2>] sys_sendto+0x8a/0xa4
[ 7328.288000]  [<90134a18>] sys_send+0xc/0x10
[ 7328.288000]  [<90016132>] syscall_return+0x0/0x12
[ 7328.288000]

From my config:
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_SPINLOCK_SLEEP=y
kernel.printk = 4 1 1 7 <- Am I missing any output messages with this option?

Any help received with great thanx :)

Life's to short for waiting on slow CPU's