Hello,
We recently moved one of our dedicated servers to a Linode 4096 and the performance was great for the past week or two, with the site having a load average of less than 0.10 during the day and down to 0.05 in the evening.
This evening the load is up to 0.50 under the same level of user activity (we haven't had a traffic spike or anything like that).
So I'm wondering where to start investigating what's increasing the load.
This is what collectl -10 -sCDN reports:
Code:
# SINGLE CPU STATISTICS
# Cpu  User  Nice  Sys  Wait  IRQ  Soft  Steal  Idle
    0     3     0    1     0    0     0      0    94
    1     0     0    0     0    0     0      0    99
    2     0     0    0     0    0     0      0    98
    3     0     0    0     0    0     0      0    99
# DISK STATISTICS (/sec)
#      <---------reads--------->  <--------writes--------->  <--------averages-------->   Pct
#Name  KBytes  Merged  IOs  Size  KBytes  Merged  IOs  Size  RWSize  QLen  Wait  SvcTim  Util
xvda        1       0    0     4      67       5   12     6       5     0    79       3     4
xvdb        0       0    0     0       0       0    0     0       0     0     0       0     0
As you can see, the CPU is doing pretty much nothing, but there's wait on the (non-swap) disk, xvda. The site is serving perhaps 1-2 pages a second. By contrast, another, smaller Linode server delivering 20+ banner ads per second has zero wait.
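To put rough numbers on the xvda row, here is a minimal Python sketch that derives the average write size and the ratio of wait to service time. The figures are copied from the collectl output above; the dictionary keys and the summarize function are just illustrative names, and I'm assuming Wait and SvcTim are both reported in milliseconds.

```python
# Figures copied from the xvda row of the collectl output above.
xvda = {
    "write_kb_per_sec": 67,
    "write_ios_per_sec": 12,
    "wait_ms": 79,    # "Wait" column (assumed to be milliseconds)
    "svctim_ms": 3,   # "SvcTim" column (assumed to be milliseconds)
    "util_pct": 4,    # "Util" column
}

def summarize(disk):
    """Derive the average write size and the wait-to-service-time ratio."""
    avg_write_kb = disk["write_kb_per_sec"] / disk["write_ios_per_sec"]
    wait_ratio = disk["wait_ms"] / disk["svctim_ms"]
    return avg_write_kb, wait_ratio

avg_write_kb, wait_ratio = summarize(xvda)
print(f"average write size: {avg_write_kb:.1f} KB")
print(f"wait / service time: {wait_ratio:.1f}x")
```

So the writes are small (under 6 KB each) and the disk is only about 4% utilised, yet requests wait roughly 26 times longer than the device takes to service them.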
Here's the output of top:
Code:
top - 08:53:16 up 19:01, 4 users, load average: 0.63, 0.69, 0.99
Tasks: 113 total, 1 running, 112 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.3%us, 0.0%sy, 0.0%ni, 86.1%id, 13.4%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4194508k total, 1791364k used, 2403144k free, 149148k buffers
Swap: 262136k total, 104808k used, 157328k free, 430576k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 8803 mysql     15   0  717m 214m 4284 S  0.3  5.2  27:00.06 mysqld
16741 nobody    16   0  521m  94m 4752 S  0.3  2.3   0:01.24 httpd
15766 nobody    16   0  524m  97m 4896 S  0.2  2.4   0:01.59 httpd
16739 nobody    16   0  522m  94m 4756 S  0.2  2.3   0:00.56 httpd
16740 nobody    16   0  522m  94m 4768 S  0.2  2.3   0:00.41 httpd
15767 nobody    16   0  523m  95m 4888 S  0.1  2.3   0:01.82 httpd
16375 nobody    16   0  522m  94m 4792 S  0.1  2.3   0:00.93 httpd
17988 root      15   0 12744 1272  932 R  0.1  0.0   0:00.04 top
    1 root      15   0 10348  188  160 S  0.0  0.0   0:09.98 init
    2 root      RT   0     0    0    0 S  0.0  0.0   0:00.24 migration/0
    3 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
    4 root      RT   0     0    0    0 S  0.0  0.0   0:00.29 migration/1
    5 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/1
Again, the CPU isn't doing much, and there's plenty of spare memory.
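For reference, the memory claim can be checked with a small Python sketch that parses the Mem and Swap lines from the top output above. The function name parse_top_memory is hypothetical, and treating free + buffers + cached as "reclaimable" is a common rule of thumb rather than an exact figure.

```python
import re

# The two memory lines copied from the top output above.
TOP_LINES = """\
Mem: 4194508k total, 1791364k used, 2403144k free, 149148k buffers
Swap: 262136k total, 104808k used, 157328k free, 430576k cached
"""

def parse_top_memory(text):
    """Return {'Mem': {...}, 'Swap': {...}} with all values in kB."""
    result = {}
    for line in text.splitlines():
        label, _, rest = line.partition(":")
        # Each field looks like "4194508k total"; capture the number and its name.
        fields = {name: int(value) for value, name in re.findall(r"(\d+)k (\w+)", rest)}
        if fields:
            result[label.strip()] = fields
    return result

mem = parse_top_memory(TOP_LINES)
# Old versions of top print the page cache size at the end of the Swap line.
reclaimable_kb = mem["Mem"]["free"] + mem["Mem"]["buffers"] + mem["Swap"]["cached"]
print(f"free + buffers + cached: {reclaimable_kb / 1024:.0f} MB")
print(f"swap in use: {mem['Swap']['used'] / 1024:.0f} MB")
```

That works out to roughly 2.9 GB of the 4 GB either free or reclaimable, although about 100 MB of swap is in use at the same time.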
Do you know how I can find which process is associated with the disk wait?
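One approach I could try, sketched here under some loud assumptions: the kernel exposes per-process I/O counters in /proc/<pid>/io (this needs task I/O accounting, which a stock CentOS 5 kernel may not have), Python 3 is available on the box, and the script runs as root so it can read other users' processes. The function names are made up for illustration. It takes two snapshots a few seconds apart and ranks processes by bytes read and written in between.

```python
import os
import time

def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters

def parse_stat_name(text):
    """Extract the command name from /proc/<pid>/stat, e.g. '8803 (mysqld) S ...'."""
    return text[text.index("(") + 1 : text.rindex(")")]

def snapshot():
    """Map pid -> (name, read_bytes, write_bytes) for every readable process."""
    snap = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/io") as f:
                io = parse_proc_io(f.read())
            with open(f"/proc/{entry}/stat") as f:
                name = parse_stat_name(f.read())
        except (OSError, ValueError):
            continue  # process exited, no permission, or counters unavailable
        snap[int(entry)] = (name, io.get("read_bytes", 0), io.get("write_bytes", 0))
    return snap

def io_deltas(before, after):
    """Return [(total_bytes, pid, name)] for processes that did I/O, busiest first."""
    rows = []
    for pid, (name, rd, wr) in after.items():
        if pid in before:
            _, rd0, wr0 = before[pid]
            total = (rd - rd0) + (wr - wr0)
            if total > 0:
                rows.append((total, pid, name))
    return sorted(rows, reverse=True)

if __name__ == "__main__":
    first = snapshot()
    time.sleep(5)  # sampling interval; an arbitrary choice
    for total, pid, name in io_deltas(first, snapshot())[:10]:
        print(f"{total / 1024:10.1f} KB  {pid:6d}  {name}")
```

If /proc/<pid>/io doesn't exist on this kernel, the script simply prints nothing, which would at least confirm that per-process accounting isn't available.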
The server is running CentOS 5.5, MySQL 5.5, Apache 2.2 + mod_perl.
Here's the my.cnf (SHOW PROCESSLIST shows just 12 connections, one for each httpd process):
Code:
key_buffer = 192M
table_cache = 512
sort_buffer_size = 2M
read_buffer_size = 1M
join_buffer_size = 1M
read_rnd_buffer_size = 4M
myisam_sort_buffer_size = 64M
thread_cache_size = 32
table_open_cache = 512
query_cache_type = 1
query_cache_size = 24M
query_cache_limit = 2M
max_heap_table_size = 56M
tmp_table_size = 48M
thread_concurrency = 8
Here are the relevant httpd.conf settings. There are never more than 12-13 httpd processes, so it's not running out of those, IMO.
Code:
StartServers 10
MinSpareServers 5
MaxSpareServers 15
MaxClients 45
MaxRequestsPerChild 200
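As a rough sanity check on those settings against the top output, here is a small Python sketch of the httpd memory footprint. The 95 MB per-process figure is my approximation of the RES column above, and RES overcounts any pages shared between the mod_perl children, so treat the results as upper bounds rather than real usage.

```python
# Values taken from httpd.conf and the top output above.
max_clients = 45               # MaxClients
observed_processes = 13        # the most httpd processes seen running
httpd_res_mb = 95              # approximate RES per httpd (assumption: 94m-97m in top)
total_ram_mb = 4194508 / 1024  # "Mem: total" from top, converted from kB

def httpd_footprint_mb(processes, res_mb=httpd_res_mb):
    """Upper-bound memory use if every process held res_mb of private memory."""
    return processes * res_mb

observed_mb = httpd_footprint_mb(observed_processes)
worst_case_mb = httpd_footprint_mb(max_clients)

print(f"observed:   {observed_mb} MB")
print(f"MaxClients: {worst_case_mb} MB of {total_ram_mb:.0f} MB RAM")
```

At the observed 12-13 processes that's about 1.2 GB, which fits comfortably. If Apache ever did reach MaxClients, the same estimate comes to about 4.2 GB, slightly more than the RAM on the box.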