I have never had this much trouble with a server and I am at a loss...
I have a Linode 2048 w/ extra RAM (2678 MB total). I am running Ubuntu 10.04 LTS w/ Apache and MySQL. It hosts a single WordPress website, teleread.com, which gets an average of 5000 unique visitors a day.
Code:
cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=10.04
DISTRIB_CODENAME=lucid
DISTRIB_DESCRIPTION="Ubuntu 10.04.1 LTS"
uname -a
Linux teleread 2.6.32.16-linode28 #1 SMP Sun Jul 25 21:32:42 UTC 2010 i686 GNU/Linux
Code:
apache2ctl -V
Server version: Apache/2.2.14 (Ubuntu)
Server built: Apr 13 2010 19:28:27
Server's Module Magic Number: 20051115:23
Server loaded: APR 1.3.8, APR-Util 1.3.9
Compiled using: APR 1.3.8, APR-Util 1.3.9
Architecture: 32-bit
Server MPM: Prefork
threaded: no
forked: yes (variable process count)
Server compiled with....
-D APACHE_MPM_DIR="server/mpm/prefork"
-D APR_HAS_SENDFILE
-D APR_HAS_MMAP
-D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
-D APR_USE_SYSVSEM_SERIALIZE
-D APR_USE_PTHREAD_SERIALIZE
-D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
-D APR_HAS_OTHER_CHILD
-D AP_HAVE_RELIABLE_PIPED_LOGS
-D DYNAMIC_MODULE_LIMIT=128
-D HTTPD_ROOT=""
-D SUEXEC_BIN="/usr/lib/apache2/suexec"
-D DEFAULT_PIDLOG="/var/run/apache2.pid"
-D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
-D DEFAULT_LOCKFILE="/var/run/apache2/accept.lock"
-D DEFAULT_ERRORLOG="logs/error_log"
-D AP_TYPES_CONFIG_FILE="/etc/apache2/mime.types"
-D SERVER_CONFIG_FILE="/etc/apache2/apache2.conf"
Code:
mysql --version
mysql Ver 14.14 Distrib 5.1.41, for debian-linux-gnu (i486) using readline 6.1
A few times a day iowait climbs rapidly, swap starts to thrash, and the load average jumps. I have tuned and retuned Apache and MySQL, but no matter what I do, it keeps happening.
Apache2 is running with the prefork MPM:
Code:
<IfModule mpm_prefork_module>
StartServers 8
MinSpareServers 5
MaxSpareServers 20
ServerLimit 300
MaxClients 300
MaxRequestsPerChild 4000
</IfModule>
MySQL:
Code:
[mysqld]
user = mysql
port = 3306
socket = /var/run/mysqld/mysqld.sock
basedir = /usr
datadir = /var/lib/mysql
tmpdir = /tmp
skip-external-locking
skip-innodb
key_buffer_size = 64M
table_open_cache = 1048
sort_buffer_size = 1M
read_buffer_size = 1M
read_rnd_buffer_size = 8M
myisam_sort_buffer_size = 64M
thread_cache_size =16
query_cache_size = 32M
tmp_table_size=64M
max_heap_table_size=64M
back_log = 100
max_connections = 301
max_connect_errors = 5000
join_buffer_size=1M
open-files = 10000
interactive_timeout = 300
wait_timeout = 300
thread_concurrency = 8
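As a rough sanity check on the settings above, the commonly used worst-case estimate (global buffers plus max_connections times the per-connection buffers) can be sketched in Python. This is only an upper bound under the assumption that every connection allocates all of its buffers at once; it ignores thread stacks and in-memory temporary tables (tmp_table_size), and the variable names are just for illustration.

```python
# Rough worst-case memory estimate for the my.cnf values above (all in MB).
# Assumption: every connection allocates all per-connection buffers at the
# same time. Real usage is normally far lower; this is only an upper bound.
global_buffers_mb = {
    "key_buffer_size": 64,
    "query_cache_size": 32,
}
per_connection_buffers_mb = {
    "sort_buffer_size": 1,
    "read_buffer_size": 1,
    "read_rnd_buffer_size": 8,
    "join_buffer_size": 1,
}
max_connections = 301

global_total_mb = sum(global_buffers_mb.values())
per_connection_total_mb = sum(per_connection_buffers_mb.values())
worst_case_mb = global_total_mb + max_connections * per_connection_total_mb

print(f"global buffers:       {global_total_mb} MB")
print(f"per connection:       {per_connection_total_mb} MB")
print(f"worst case (approx.): {worst_case_mb} MB")
```

With max_connections tied to MaxClients, the per-connection term dominates the total, so that is the number to watch when either setting changes.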
I've tried php-cgi, fcgid, and regular mod_php-style PHP to see if anything made a difference, but none of them helped.
Eventually I start getting these in the Apache error log:
Code:
[Wed Sep 22 09:19:23 2010] [warn] child process 18258 still did not exit, sending a SIGTERM
[Wed Sep 22 09:19:23 2010] [warn] child process 18303 still did not exit, sending a SIGTERM
[Wed Sep 22 09:19:23 2010] [warn] child process 18304 still did not exit, sending a SIGTERM
But I think that's a symptom of the OOMing/thrashing, not the cause.
According to netstat, at any given moment I have about 120+ TCP connections to www, but occasionally it spikes. I have seen these in the Apache error log (when I lowered MaxClients to test):
Code:
[Mon Sep 20 11:51:13 2010] [error] server reached MaxClients setting, consider raising the MaxClients setting
The lowest I've set MaxClients is 150, and I just set it back to 300 after the latest issue.
From top, it looks like Apache is using about 23M (RES) per process. At this moment netstat says I have 174 connections to www and ps shows 41 apache2 processes, so that's roughly 943 MB of RAM (41 x 23M).
At this moment MySQL is at 42M (RES)
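The same arithmetic can be extended to the MaxClients values mentioned earlier. The sketch below assumes every child sits at the 23M RES figure and counts RES in full per child; shared pages make that an overestimate, so treat the results as rough ceilings rather than measurements. The function and variable names are illustrative only.

```python
# Figures taken from the post; all values in MB.
# Assumption: each apache2 child uses the full 23 MB RES on its own
# (shared pages are double-counted, so these numbers run high).
res_per_child_mb = 23
mysql_res_mb = 42
total_ram_mb = 2678

def apache_footprint_mb(children, res_mb=res_per_child_mb):
    """Estimated memory used by `children` apache2 processes."""
    return children * res_mb

observed_mb = apache_footprint_mb(41)
print(f"41 children now: {observed_mb} MB")

for max_clients in (150, 300):
    needed_mb = apache_footprint_mb(max_clients)
    print(f"MaxClients {max_clients}: {needed_mb} MB "
          f"({needed_mb / total_ram_mb:.1f}x total RAM)")

# Largest number of children that fits in RAM next to MySQL's current RES.
children_that_fit = (total_ram_mb - mysql_res_mb) // res_per_child_mb
print(f"children that fit beside MySQL: {children_that_fit}")
```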
These are the stats at this very moment:
Code:
teleread# netstat -t | grep -c www
163
teleread# ps auxww | grep -c www-data
27
teleread# top
top - 10:05:44 up 17:43, 3 users, load average: 0.69, 0.67, 3.93
Tasks: 133 total, 2 running, 131 sleeping, 0 stopped, 0 zombie
Cpu(s): 6.2%us, 1.3%sy, 0.0%ni, 92.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.1%st
Mem: 2708280k total, 978832k used, 1729448k free, 63564k buffers
Swap: 262136k total, 14024k used, 248112k free, 312024k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
18886 mysql 20 0 121m 45m 5460 S 0 1.7 0:41.06 mysqld
19435 www-data 20 0 52696 26m 4020 S 0 1.0 0:05.63 apache2
19474 www-data 20 0 52440 26m 4024 S 0 1.0 0:04.07 apache2
19451 www-data 20 0 49648 24m 4056 S 9 0.9 0:07.32 apache2
19429 www-data 20 0 49656 24m 4060 S 0 0.9 0:06.01 apache2
19496 www-data 20 0 49880 24m 3844 S 0 0.9 0:02.08 apache2
19492 www-data 20 0 49652 24m 4056 S 0 0.9 0:04.87 apache2
19331 www-data 20 0 49652 24m 4056 S 1 0.9 0:10.19 apache2
19469 www-data 20 0 49644 23m 4024 S 0 0.9 0:03.30 apache2
19473 www-data 20 0 49636 23m 4020 S 0 0.9 0:04.82 apache2
19479 www-data 20 0 49728 23m 3828 S 0 0.9 0:02.79 apache2
19507 www-data 20 0 49652 23m 3844 S 11 0.9 0:01.59 apache2
19495 www-data 20 0 49644 23m 3844 S 0 0.9 0:01.88 apache2
19508 www-data 20 0 49472 23m 4012 S 0 0.9 0:01.76 apache2
19501 www-data 20 0 49580 23m 3880 S 0 0.9 0:02.13 apache2
19433 www-data 20 0 49368 23m 4028 S 0 0.9 0:04.99 apache2
19487 www-data 20 0 49476 23m 3860 S 0 0.9 0:03.64 apache2
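For reading the top output above, a small Python sketch can pull out the memory actually held by processes (used minus buffers and cache) and the average Apache RES. It assumes the layout shown above, with RES as the sixth column and reported with an "m" suffix, and it uses only three of the apache2 rows as sample input; the helper name is made up for this example.

```python
import re

# Sample lines copied from the top output above (three apache2 rows only).
top_output = """\
Mem: 2708280k total, 978832k used, 1729448k free, 63564k buffers
Swap: 262136k total, 14024k used, 248112k free, 312024k cached
19435 www-data 20 0 52696 26m 4020 S 0 1.0 0:05.63 apache2
19451 www-data 20 0 49648 24m 4056 S 9 0.9 0:07.32 apache2
19469 www-data 20 0 49644 23m 4024 S 0 0.9 0:03.30 apache2
"""

def kb_fields(line):
    """Turn '978832k used, ...' into {'used': 978832, ...}."""
    return {name: int(value) for value, name in re.findall(r"(\d+)k (\w+)", line)}

lines = top_output.splitlines()
mem = kb_fields(next(l for l in lines if l.startswith("Mem:")))
swap = kb_fields(next(l for l in lines if l.startswith("Swap:")))

# Memory held by processes, excluding buffers and page cache.
app_used_kb = mem["used"] - mem["buffers"] - swap["cached"]

# RES column (index 5) of each apache2 row; assumes values like '26m'.
rows = [l.split() for l in lines]
apache_res_mb = [int(f[5].rstrip("m")) for f in rows if f and f[-1] == "apache2"]
avg_res_mb = sum(apache_res_mb) / len(apache_res_mb)

print(f"used by processes: {app_used_kb} kB")
print(f"average apache2 RES: {avg_res_mb:.1f} MB over {len(apache_res_mb)} rows")
```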
So... I am looking for some ideas. As I said, I am at a loss.
Thank you.
Lew