Basic info:
I'm currently on a Linode 512 running 32-bit Ubuntu 10.04.3 LTS, hosting several sites, most of which are WordPress installations. I have gone through several tuning tutorials to ensure that my LAMP stack is configured appropriately. Additionally, the WordPress installations are cached with XCache, courtesy of W3 Total Cache.
Short version:
I believe Apache is being hammered by referrer spam bots and I can't seem to control it, even with fail2ban. What else can I do to determine the true cause of the issue, and how should I handle it?
Long version:
This setup has worked quite well for some time. However, one of the WordPress installations is being targeted by referrer spam, which shows up in Apache's access.log for that particular vhost. I've looked through access.log and see all the referrer spam, though the volume doesn't appear substantial enough to bring the server down (generally 250-700 requests per hour). I've dug a bit deeper into access.log, using awk to build a list of offending referrers and the IPs submitting them. I've banned several /16 (class B) ranges in iptables via ufw to prevent those connections from even reaching Apache. I've also installed fail2ban and enabled the default Apache protection configuration (along with SSH).
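For reference, the log digging boils down to something like the following. The log path, the combined log format, and the example network range are illustrative assumptions, not my exact commands:

```shell
# top_referrers LOGFILE — count hits per referrer, assuming Apache's
# combined log format (referrer is the 4th double-quoted field).
top_referrers() {
  awk -F'"' '{print $4}' "$1" | sort | uniq -c | sort -rn | head -20
}

# top_ips LOGFILE — count hits per client IP (first field).
top_ips() {
  awk '{print $1}' "$1" | sort | uniq -c | sort -rn | head -20
}

# Example usage against the vhost's log (path is a guess):
#   top_referrers /var/log/apache2/theproblematicsite-access.log
#   top_ips /var/log/apache2/theproblematicsite-access.log
#
# Then drop a whole /16 before it ever reaches Apache (example range):
#   sudo ufw insert 1 deny from 198.51.0.0/16
```

The `insert 1` puts the deny rule ahead of any existing allow rules, so the packets are dropped regardless of rule order.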
However, from time to time (almost daily), the site still gets hammered: Apache spawns lots of processes, chews up available memory, and the resulting swapping sends I/O requests through the roof (10K+ blocks/sec). This is reflected in the Linode dashboard graphs and brings the server to such a grinding halt that I can barely SSH in. When I have been able to get in, I've confirmed with `iotop` that apache2 is indeed thrashing.
Generally, my only course of action is to bounce the entire box, which resolves the issue until I/O shoots through the roof again. I can't pin down a particular time of day when this occurs, but when it happens it's generally in the morning. I've tried to validate that the server is running efficiently by running `ab -n 1000 -c 10 http://www.theproblematicsite.com/`, and the results look extremely promising (though potentially skewed, since ab is running on the server itself): 585 requests/sec with the longest request at 61 ms.
At this point, I have no recourse except to watch iotop and bounce either Apache (though I can't tell whether that's effective, since I usually can't even SSH in) or the whole server. I could script this and schedule it with cron, but it's a band-aid on a bullet wound.
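The band-aid I have in mind is roughly the following watchdog, run from cron. The swap threshold, script path, and restart command are assumptions to be tuned, not something I've battle-tested:

```shell
#!/bin/sh
# Crude watchdog: if swap usage crosses a threshold, bounce Apache
# before the box starts thrashing so hard I can't SSH in.
THRESHOLD_KB=262144   # ~256 MB of swap in use; a guess, tune for the box

# Report swap currently in use, in kB, from /proc/meminfo.
swap_used_kb() {
  awk '/^SwapTotal:/ {t=$2} /^SwapFree:/ {f=$2} END {print t - f}' /proc/meminfo
}

used=$(swap_used_kb)
if [ "$used" -gt "$THRESHOLD_KB" ]; then
  logger "apache-watchdog: ${used} kB swap in use, restarting apache2"
  /etc/init.d/apache2 restart
fi
```

Scheduled with something like `*/5 * * * * root /usr/local/sbin/apache-watchdog.sh` in /etc/crontab — but as I said, this treats the symptom, not the cause.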
All of this has led me here, to see if there's anything else I can do to determine the root cause of the issue and how to handle it.
Suggestions?