Hi,
I recently upgraded one of my Linodes from a Linode 512 to a 768 due to some memory limitations. After booting it back up, I noticed that the load while idle is still higher than it should be.
It's currently supposed to be idle, but the load averages are: 0.49, 0.62, 0.59
It stays around that number; I've seen it peak at 0.90 with nothing running that should cause it.
CPU looks idle, IO seems low and RAM is fine as well. Here's the summary from top:
Code:
top - 16:07:26 up 4:11, 1 user, load average: 0.60, 0.64, 0.60
Tasks: 106 total, 2 running, 104 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.2%sy, 0.0%ni, 99.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 768316k total, 709116k used, 59200k free, 32348k buffers
Swap: 524284k total, 180k used, 524104k free, 478828k cached
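Since the Linux load average counts tasks in uninterruptible (D-state) sleep as well as runnable ones, I also listed anything in those states to rule out a stuck task (just a quick ps one-liner, nothing fancy):

```shell
# Linux load average counts runnable (R) and uninterruptible (D) tasks,
# so with CPU idle a stuck D-state task can still push the load up.
# Skip the header line, then keep only R/D states.
ps -eo state,pid,comm | awk 'NR > 1 && $1 ~ /^[DR]/ { print }'
```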
I tried to troubleshoot a bit myself, but I don't have enough experience to say whether what I found could be the cause or not. Looking at the kernel statistics in nmon, I saw this:
Code:
Kernel Stats ---------------------------------------------------------------------------------------------------------------------------------------- |
| RunQueue 1 Load Average CPU use since boot time |
| ContextSwitch 1776.8 1 mins 0.88 Uptime Days= 0 Hours= 4 Mins=13 |
| Forks 3.0 5 mins 0.71 Idle Days= 1 Hours= 9 Mins=21 |
| Interrupts 1050.6 15 mins 0.62 Average CPU use=-687.96% |
I thought that maybe the interrupts/context switches are what's causing the high load? I don't know what these numbers should be, but from screenshots other people have posted, theirs weren't this high while the system was idle.
I looked at the interrupts and saw this:
Code:
Average: INTR intr/s
Average: 0 0.00
Average: 1 0.00
Average: 2 0.00
Average: 3 0.00
Average: 4 0.00
Average: 5 0.00
Average: 6 0.00
Average: 7 0.00
Average: 8 0.00
Average: 9 0.00
Average: 10 0.00
Average: 11 0.00
Average: 12 0.00
Average: 13 0.00
Average: 14 0.00
Average: 15 0.00
Average: 16 183.40
Average: 17 1.70
Average: 18 47.70
Average: 19 0.00
Average: 20 0.00
Average: 21 0.00
Average: 22 179.05
Average: 23 1.85
Average: 24 43.25
Average: 25 0.00
Average: 26 0.00
Average: 27 0.40
Average: 28 180.35
Average: 29 1.45
Average: 30 44.70
Average: 31 0.00
Average: 32 0.00
Average: 33 1.35
Average: 34 178.05
Average: 35 1.80
Average: 36 42.55
Average: 37 0.00
Average: 38 0.00
Average: 39 1.05
Average: 40 0.00
Average: 41 0.00
Average: 42 7.35
Average: 43 0.00
Average: 44 1.30
Average: 45 0.00
Average: 46 0.00
Average: 47 0.00
root@nuvini:~# cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3
16: 2972148 0 0 0 xen-percpu-virq timer0
17: 21870 0 0 0 xen-percpu-ipi spinlock0
18: 750503 0 0 0 xen-percpu-ipi resched0
19: 146 0 0 0 xen-percpu-ipi callfunc0
20: 0 0 0 0 xen-percpu-virq debug0
21: 8028 0 0 0 xen-percpu-ipi callfuncsingle0
22: 0 2915213 0 0 xen-percpu-virq timer1
23: 0 22061 0 0 xen-percpu-ipi spinlock1
24: 0 695840 0 0 xen-percpu-ipi resched1
25: 0 92 0 0 xen-percpu-ipi callfunc1
26: 0 0 0 0 xen-percpu-virq debug1
27: 0 8167 0 0 xen-percpu-ipi callfuncsingle1
28: 0 0 2824373 0 xen-percpu-virq timer2
29: 0 0 21485 0 xen-percpu-ipi spinlock2
30: 0 0 692198 0 xen-percpu-ipi resched2
31: 0 0 225 0 xen-percpu-ipi callfunc2
32: 0 0 0 0 xen-percpu-virq debug2
33: 0 0 8251 0 xen-percpu-ipi callfuncsingle2
34: 0 0 0 2895170 xen-percpu-virq timer3
35: 0 0 0 20237 xen-percpu-ipi spinlock3
36: 0 0 0 678003 xen-percpu-ipi resched3
37: 0 0 0 202 xen-percpu-ipi callfunc3
38: 0 0 0 0 xen-percpu-virq debug3
39: 0 0 0 8396 xen-percpu-ipi callfuncsingle3
40: 433 0 0 0 xen-dyn-event xenbus
41: 405 0 0 0 xen-dyn-event hvc_console
42: 66505 0 0 0 xen-dyn-event blkif
43: 67 0 0 0 xen-dyn-event blkif
44: 90115 0 0 0 xen-dyn-event eth0
NMI: 0 0 0 0 Non-maskable interrupts
LOC: 0 0 0 0 Local timer interrupts
SPU: 0 0 0 0 Spurious interrupts
PMI: 0 0 0 0 Performance monitoring interrupts
IWI: 0 0 0 0 IRQ work interrupts
RTR: 0 0 0 0 APIC ICR read retries
RES: 750503 695840 692198 678003 Rescheduling interrupts
CAL: 8174 8259 8476 8598 Function call interrupts
TLB: 0 0 0 0 TLB shootdowns
ERR: 0
MIS: 0
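Since the counters in /proc/interrupts are totals since boot, I also tried a rough sketch to see which sources are firing right now: sum each IRQ line's per-CPU counters, snapshot twice, and diff (the 5-second interval and /tmp paths are arbitrary):

```shell
# Sum each IRQ line's per-CPU counters; numeric fields stop at the first
# non-numeric token (the chip/handler name), so header and name columns
# are ignored. Snapshot twice, then print the busiest sources by delta.
snap() { awk '{ s=0; for (i=2; i<=NF && $i ~ /^[0-9]+$/; i++) s+=$i; print $1, s }' /proc/interrupts; }
snap > /tmp/irq.1
sleep 5
snap > /tmp/irq.2
paste /tmp/irq.1 /tmp/irq.2 | awk '{ d=$4-$2; if (d>0) print d, $1 }' | sort -rn | head
```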
As for context switches, I have no idea how to look up what's causing the high number, or even -if- these things can be the cause of the high load.
EDIT: I tried the following, but it doesn't seem to show what's causing it. Nothing in this list stands out as very high.
Code:
root@nuvini:~# pidstat -w
Linux 3.4.2-linode44 (nuvini) 07/28/2012 _i686_ (4 CPU)
04:17:56 PM PID cswch/s nvcswch/s Command
04:17:56 PM 1 0.33 0.00 init
04:17:56 PM 2 0.01 0.00 kthreadd
04:17:56 PM 3 0.37 0.00 ksoftirqd/0
04:17:56 PM 4 1.82 0.00 kworker/0:0
04:17:56 PM 5 0.11 0.00 kworker/u:0
04:17:56 PM 6 0.43 0.00 migration/0
04:17:56 PM 7 0.44 0.00 migration/1
04:17:56 PM 8 0.00 0.00 kworker/1:0
04:17:56 PM 9 0.34 0.00 ksoftirqd/1
04:17:56 PM 10 0.45 0.00 migration/2
04:17:56 PM 11 0.00 0.00 kworker/2:0
04:17:56 PM 12 0.34 0.00 ksoftirqd/2
04:17:56 PM 13 0.45 0.00 migration/3
04:17:56 PM 14 0.00 0.00 kworker/3:0
04:17:56 PM 15 0.38 0.00 ksoftirqd/3
04:17:56 PM 16 0.00 0.00 cpuset
04:17:56 PM 17 0.00 0.00 khelper
04:17:56 PM 18 0.04 0.00 kdevtmpfs
04:17:56 PM 19 0.00 0.00 kworker/u:1
04:17:56 PM 21 0.01 0.00 xenwatch
04:17:56 PM 22 0.03 0.00 xenbus
04:17:56 PM 155 0.17 0.00 sync_supers
04:17:56 PM 157 0.00 0.00 bdi-default
04:17:56 PM 159 0.00 0.00 kblockd
04:17:56 PM 168 0.96 0.00 kworker/1:1
04:17:56 PM 171 0.00 0.00 md
04:17:56 PM 255 0.00 0.00 rpciod
04:17:56 PM 257 1.40 0.00 kworker/0:1
04:17:56 PM 268 0.02 0.00 kswapd0
04:17:56 PM 269 0.00 0.00 ksmd
04:17:56 PM 270 0.00 0.00 fsnotify_mark
04:17:56 PM 274 0.00 0.00 ecryptfs-kthrea
04:17:56 PM 276 0.00 0.00 nfsiod
04:17:56 PM 277 0.00 0.00 cifsiod
04:17:56 PM 280 0.00 0.00 jfsIO
04:17:56 PM 281 0.00 0.00 jfsCommit
04:17:56 PM 282 0.00 0.00 jfsCommit
04:17:56 PM 283 0.00 0.00 jfsCommit
04:17:56 PM 284 0.00 0.00 jfsCommit
04:17:56 PM 285 0.00 0.00 jfsSync
04:17:56 PM 286 0.00 0.00 xfsalloc
04:17:56 PM 287 0.00 0.00 xfs_mru_cache
04:17:56 PM 288 0.00 0.00 xfslogd
04:17:56 PM 289 0.00 0.00 glock_workqueue
04:17:56 PM 290 0.00 0.00 delete_workqueu
04:17:56 PM 291 0.00 0.00 gfs_recovery
04:17:56 PM 292 0.00 0.00 crypto
04:17:56 PM 540 0.20 0.00 php5-fpm
04:17:56 PM 541 0.01 0.00 php5-fpm
04:17:56 PM 542 0.01 0.00 php5-fpm
04:17:56 PM 543 0.02 0.00 php5-fpm
04:17:56 PM 544 0.00 0.00 php5-fpm
04:17:56 PM 545 0.00 0.00 php5-fpm
04:17:56 PM 546 0.00 0.00 php5-fpm
04:17:56 PM 547 0.00 0.00 php5-fpm
04:17:56 PM 855 0.00 0.00 khvcd
04:17:56 PM 971 0.00 0.00 kpsmoused
04:17:56 PM 972 0.87 0.00 kworker/2:1
04:17:56 PM 1009 0.00 0.00 deferwq
04:17:56 PM 1011 0.89 0.00 kworker/3:1
04:17:56 PM 1013 0.37 0.00 kjournald
04:17:56 PM 1048 0.00 0.01 udevd
04:17:56 PM 1142 0.00 0.00 udevd
04:17:56 PM 1143 0.00 0.00 udevd
04:17:56 PM 1381 0.00 0.00 portmap
04:17:56 PM 1393 0.00 0.00 rpc.statd
04:17:56 PM 1618 0.00 0.00 rsyslogd
04:17:56 PM 1627 0.16 0.01 monit
04:17:56 PM 1687 0.01 0.00 mysqld_safe
04:17:56 PM 1831 0.11 0.00 mysqld
04:17:56 PM 1832 0.00 0.00 logger
04:17:56 PM 1887 1.00 0.01 apache2
04:17:56 PM 1891 0.00 0.00 apache2
04:17:56 PM 1892 0.00 0.00 apache2
04:17:56 PM 1898 0.33 0.00 apache2
04:17:56 PM 1900 0.00 0.00 atd
04:17:56 PM 1939 0.02 0.00 cron
04:17:56 PM 1948 0.20 0.00 flush-202:0
04:17:56 PM 1949 0.00 0.00 apache2
04:17:56 PM 1950 0.00 0.00 apache2
04:17:56 PM 2102 0.00 0.00 dhclient
04:17:56 PM 2158 0.03 0.00 sshd
04:17:56 PM 2166 0.09 0.00 master
04:17:56 PM 2169 0.01 0.00 qmgr
04:17:56 PM 2193 0.00 0.00 tlsmgr
04:17:56 PM 2550 1.02 0.00 ntpd
04:17:56 PM 2553 1.00 0.00 memcached
04:17:56 PM 2727 0.05 0.00 fail2ban-server
04:17:56 PM 2821 0.22 0.04 lfd
04:17:56 PM 2842 0.00 0.00 lfd
04:17:56 PM 3498 0.00 0.00 mingetty
04:17:56 PM 3655 0.00 0.00 sshd
04:17:56 PM 3657 0.42 0.01 sshd
04:17:56 PM 3660 0.01 0.00 bash
04:17:56 PM 3703 0.00 0.00 su
04:17:56 PM 3704 0.05 0.01 bash
04:17:56 PM 5250 0.00 0.00 sshd
04:17:56 PM 5252 0.01 0.00 sshd
04:17:56 PM 5253 0.00 0.00 sshd
04:17:56 PM 5710 0.00 0.00 teamspeak
04:17:56 PM 5711 1.47 0.14 ts3server_start
04:17:56 PM 5731 0.01 0.00 ts3server_linux
04:17:56 PM 11700 0.00 0.00 sleep
04:17:56 PM 11701 0.00 0.00 pidstat
04:17:56 PM 28375 0.01 0.00 smtpd
04:17:56 PM 29738 0.00 0.00 pickup
root@nuvini:~# sar -w 1 3
Linux 3.4.2-linode44 (nuvini) 07/28/2012 _i686_ (4 CPU)
04:18:19 PM proc/s cswch/s
04:18:20 PM 3.00 1899.00
04:18:21 PM 2.00 2031.00
04:18:22 PM 1.00 1959.00
Average: 2.00 1963.00
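One thing I realized afterwards: `pidstat -w` with no interval reports averages since boot, so sampling live rates might be more telling. Something like this (the threshold is arbitrary, and the field numbers assume the same sysstat column layout as my output above, with no UID column):

```shell
# Take five 1-second samples of per-process context switches, then keep
# only the Average lines whose voluntary switch rate (cswch/s, field 3)
# exceeds 1/s, sorted highest first.
pidstat -w 1 5 | awk '$1 == "Average:" && $3 + 0 > 1 { print $3, $NF }' | sort -rn
```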
The server itself does seem to be running fine, but I'd like the reported load to be realistic, and this only started after the upgrade. It should be idle right now, so it shouldn't be showing a load like that.
Hopefully someone has an idea of how to get it back to how it should be. Thanks!
