Disk IO Spikes, Crashes My Linode Server

This is the second time it's happened since I moved to Linode a couple of weeks ago. At first I thought the MySQL databases I brought over weren't optimized correctly, so I optimized them and everything worked well for a time.

Today I had a huge disk I/O spike: it normally averages 272.69, but it went up to 7539.17. I wasn't doing anything, installing anything, etc. I only noticed when none of my websites were loading.

I'm running a Linode 1080 with 2GB of swap, so it's pretty shocking for the server to come crashing down like this. I had to reboot my Linode through the Linode Manager to get my websites up again.

I run mainly 4 websites, none with crazy amounts of visitors, maybe 500 visitors a day each.

I have no clue what logs to look at to see what could have caused this. I'm running Webmin/Virtualmin as my panel.

I did find this in my /var/log/apache2/error.log for today and yesterday:

[Thu Jul 30 09:17:12 2009] [error] [client 91.209.196.70] Invalid method in request \x80\x8c\x01\x03\x01
[Thu Jul 30 12:21:37 2009] [notice] Graceful restart requested, doing restart
[Thu Jul 30 12:21:42 2009] [notice] Digest: generating secret for digest authentication ...
[Thu Jul 30 12:21:42 2009] [notice] Digest: done
[Thu Jul 30 12:21:43 2009] [notice] Apache/2.2.8 (Ubuntu) DAV/2 SVN/1.4.6 PHP/5.2.4-2ubuntu5.6 with Suhosin-Patch mod_ruby/1.2.6 Ruby/1.8.6(2007-09-24) mod_ssl/2.2.8 OpenSSL/0.9.8g configured -- resuming normal operations
[Thu Jul 30 12:39:15 2009] [error] [client 91.209.196.70] Invalid method in request \x80\x8c\x01\x03\x01

[Fri Jul 31 11:49:28 2009] [error] server reached MaxClients setting, consider raising the MaxClients setting
[Fri Jul 31 12:23:12 2009] [notice] suEXEC mechanism enabled (wrapper: /usr/lib/apache2/suexec)
[Fri Jul 31 12:23:12 2009] [notice] Digest: generating secret for digest authentication ...
[Fri Jul 31 12:23:12 2009] [notice] Digest: done
[Fri Jul 31 12:23:14 2009] [notice] Apache/2.2.8 (Ubuntu) DAV/2 SVN/1.4.6 PHP/5.2.4-2ubuntu5.6 with Suhosin-Patch mod_ruby/1.2.6 Ruby/1.8.6(2007-09-24) mod_ssl/2.2.8 OpenSSL/0.9.8g configured -- resuming normal operations

By the way that IP isn't my IP address and I found this online about client 91.209.196.70 here and here.

Any other logs I should look at?

Ubuntu 8.04 LTS

Virtualmin 3.71

Spamassassin + ClamAV

7 Replies

Sounds like you might be hitting swap (which will destroy your performance).

Are you running munin?

That much swap is pointless. If you're thrashing even 100M of swap the box will be unusable anyhow, so reduce your swap image back to 256M or something sane. I'd rather have a box go OOM and reboot itself than thrash and be unusable for eternity. Check out this post:

http://www.linode.com/wiki/index.php/RebootingonOOM
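
To check whether the box is actually dipping into swap, something along these lines might help. It's only a rough sketch: it assumes a Linux `/proc/meminfo` with the usual `SwapTotal` and `SwapFree` fields (reported in kB), and the function name `swap_usage_kb` is made up for illustration.

```python
def swap_usage_kb(meminfo_text):
    """Return (swap_total_kb, swap_used_kb) parsed from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    total = fields.get("SwapTotal", 0)
    free = fields.get("SwapFree", 0)
    return total, total - free


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            total, used = swap_usage_kb(f.read())
        print(f"swap used: {used} kB of {total} kB")
    except OSError:
        # /proc/meminfo only exists on Linux
        print("/proc/meminfo not available on this system")
```

If the "used" figure keeps climbing while the sites are under load, that would be consistent with the thrashing described above.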

Check your Apache setup: Identify which MPM you're running with: "apache2 -V | grep MPM". Find that section in apache2.conf. What are the values? If you see numbers three digits long, you're in trouble.

Linode.com's looks like this:

 <IfModule mpm_prefork_module>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients           75
    MaxRequestsPerChild 1000
 </IfModule>

I'd recommend setting MaxClients lower to start out with (say 20).
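
As a rough way to reason about a MaxClients value, here is a back-of-the-envelope sketch. It assumes the Linode 1080 has about 1080 MB of RAM; the 400 MB reserved for MySQL and everything else, and the 30 MB per Apache child, are guesses for illustration only, so measure your own processes before trusting the result.

```python
def estimate_max_clients(total_ram_mb, reserved_mb, avg_child_mb):
    """Estimate how many prefork children fit in RAM without swapping."""
    if avg_child_mb <= 0:
        raise ValueError("avg_child_mb must be positive")
    available_mb = total_ram_mb - reserved_mb
    # Always allow at least one child, even if the estimate is tiny.
    return max(1, available_mb // avg_child_mb)


if __name__ == "__main__":
    # Hypothetical inputs: 1080 MB RAM, 400 MB reserved, 30 MB per child.
    print(estimate_max_clients(1080, 400, 30))
    # Worst case if 150 children of that size were all running at once.
    print(150 * 30, "MB needed for 150 children")
```

With those guessed numbers the estimate comes out close to the suggested starting point of 20, while 150 children would need several times the available memory.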

Hope that helps,

-Chris

@JshWright:

Are you running munin?
No, I'm not, but it seems like a good idea to add a monitoring application like that. Thanks for the heads up; I'm looking into it.

@caker:

That much swap is pointless. If you're thrashing even 100M of swap the box will be unusable anyhow, so reduce your swap image back to 256M or something sane.
Ok, will do, I had originally decided on 2GB as both Ubuntu and CentOS documentation suggest 2x your RAM. I did find this though in the Ubuntu docs which is more like what you recommend.

@caker:

Check your Apache setup: Identify which MPM you're running with: "apache2 -V | grep MPM".

Server MPM:     Prefork
 -D APACHE_MPM_DIR="server/mpm/prefork"

@caker:

Find that section in apache2.conf. What are the values? If you see numbers three digits long, you're in trouble.
This is what I have:

 <IfModule mpm_prefork_module>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients          150
    MaxRequestsPerChild   0
 </IfModule>

MaxClients to 20 is a great start…

@MichaelE:

@caker:

That much swap is pointless. If you're thrashing even 100M of swap the box will be unusable anyhow, so reduce your swap image back to 256M or something sane.
Ok, will do, I had originally decided on 2GB as both Ubuntu and CentOS documentation suggest 2x your RAM. I did find this though in the Ubuntu docs which is more like what you recommend.
The general 2x recommendation isn't necessarily unreasonable on a dedicated machine: disk is significantly slower than memory, but the swapping is still only on behalf of the single environment running on the machine. It does depend on whether you're truly going to be overcommitted (and then thrashing, in which case even a little can be disastrous), or just have a larger allocation than working set. It's still probably reasonable to have some absolute maximums independent of machine size (e.g., 2GB is probably extreme for a 1GB machine in any case), unless you know a lot about how your working set behaves and you're OK with a hit when switching among activities.

Importantly, in a VPS scenario, disk I/O is generally more contended and has a bigger performance impact. Unlike memory (which on Xen-based systems is guaranteed to be available to a given VPS), disk is shared among all the other machines on that host. So disk I/O, which already compromises performance, is slowed even further by contention among VPSes for host resources.

The bottom line is that in a VPS environment, you want to do whatever you can to keep everything in memory, and there's little benefit to permitting significant swapping as your performance will likely tank horribly.

With respect to the original log, the binary method log would seem to indicate that the source address is connecting to you and just spewing binary data at the server. If you see some consistent addresses in the logs doing that, couldn't hurt to filter them out, though as long as you get your Apache configuration under control so having lots of hits won't overload the machine, having some sporadic screwy connections shouldn't hurt that much.
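
To see whether a few addresses account for most of those entries, a small tally like the following could be run over the error log. This is just a sketch: the regular expression is written against the two "Invalid method in request" lines quoted earlier in the thread, and the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the Apache 2.2 error-log lines quoted in the original post.
LINE_RE = re.compile(
    r"\[error\] \[client (?P<ip>[0-9.]+)\] Invalid method in request"
)


def count_invalid_method_clients(lines):
    """Count 'Invalid method in request' errors per client IP."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[match.group("ip")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        r"[Thu Jul 30 09:17:12 2009] [error] [client 91.209.196.70] Invalid method in request \x80\x8c\x01\x03\x01",
        r"[Thu Jul 30 12:21:37 2009] [notice] Graceful restart requested, doing restart",
        r"[Thu Jul 30 12:39:15 2009] [error] [client 91.209.196.70] Invalid method in request \x80\x8c\x01\x03\x01",
    ]
    for ip, hits in count_invalid_method_clients(sample).most_common():
        print(ip, hits)
```

To run it against the real log, pass an open file such as `/var/log/apache2/error.log` instead of the sample list. A couple of hits per day from one address, as in the excerpt, would look like sporadic probing and not heavy load.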

– David

Big thanks to caker and db3l - I've set MaxClients to 20 as well as reduced my swap.

There's just one part of my question that I'm still curious about. I know that my swap and MaxClients settings may have left me susceptible to the disk I/O rate going high.

But why did Disk I/O get that high in the first place?

Was that IP address hitting me with a DDoS attack?

@MichaelE:

> But why did Disk I/O get that high in the first place?
Hard to be absolutely certain after the fact, but given that you were permitting Apache to service up to 150 clients simultaneously, it's certainly a reasonable guess that you exceeded memory and did begin swapping, which could certainly spike the disk I/O if you started thrashing between the processes.
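
If it happens again, one way to confirm the swapping theory while it is going on is to watch the kernel's swap counters. The sketch below assumes a Linux `/proc/vmstat` exposing `pswpin` and `pswpout` (cumulative pages swapped in and out since boot); the helper names are made up for illustration.

```python
def parse_swap_counters(vmstat_text):
    """Return (pswpin, pswpout) page counts parsed from /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters.get("pswpin", 0), counters.get("pswpout", 0)


def swap_delta(before, after):
    """Pages swapped (in, out) between two parse_swap_counters() samples."""
    return after[0] - before[0], after[1] - before[1]


if __name__ == "__main__":
    try:
        with open("/proc/vmstat") as f:
            pages_in, pages_out = parse_swap_counters(f.read())
        print(f"since boot: {pages_in} pages swapped in, {pages_out} out")
    except OSError:
        # /proc/vmstat only exists on Linux
        print("/proc/vmstat not available on this system")
```

Taking two samples a few seconds apart and passing them to `swap_delta` gives the swap activity over that interval. Large, sustained numbers in both directions would point to thrashing.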

> Was that IP address hitting me with a DDoS attack?
Also tough to say for sure. If you have a gazillion similar-looking entries from one (or a few) IP addresses, then perhaps. But if you just have a random assortment of those sorts of hits, or sporadic as opposed to really heavy load, it's more likely just the routine probing that exists nowadays on the public net, where people scan for hosts with exposures.

Note that even if it wasn't a DDoS attack, just having enough of the hits in a short enough period of time could cause the swapping under your original configuration since Apache would try to service them all simultaneously rather than queueing them up with the smaller client limit.

– David
