xfactor-updates wrote:
Switching MaxClients to 1 doesn't kill the server but it still gets extremely slow.
The benchmark testing gets slow, or the server gets slow? The former I'd expect, since a MaxClients of 1 essentially serializes all requests through the single process. Given these stats (which don't show much in the way of CPU or I/O overhead), I'd expect the Linode itself (say, via ssh) to be fairly responsive. You seem to have trimmed the memory stats from the top output, but I'm assuming you had some memory free.
An important fact to highlight is that at least you now have a configuration (poorly performing though it is) that doesn't ever completely keel over. That's real progress in terms of troubleshooting, and at least provides a stable starting point to work up from.
The fact that your single apache2 process is using 81MB of resident memory is a big deal. If that happens to more than a few of the possible processes under your older MaxClients of 15, you could easily explain the problems you were getting into. Unfortunately, the prior post doesn't actually seem sorted in memory order (still CPU), so I can't say for sure - but even there you had a much higher average resident usage (39-40MB) than the prior summary.
So, my takeaway from this is that your stack is actually using quite a bit of memory for the apache processes on average (say 30-40MB), with some extreme peaks (80+MB). That may or may not be something you can optimize, but it certainly cries out for starting with a very low MaxClients (just divide those resident sizes into the available memory). I'm not really a PHP guy, but I seem to recall comments somewhere about some of the cache solutions needing a lot of memory - since memory is your tightest resource at the moment, you might also experiment with disabling any caching for the heck of it.
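To make that division concrete, here's a tiny sketch. All the numbers (512MB total, roughly 100MB reserved for the OS, MySQL, and other daemons, 40-80MB resident per apache process) are just assumptions pulled from this thread - plug in your own top figures:

```python
# Back-of-the-envelope MaxClients estimate. The reserved_mb figure (OS +
# MySQL + everything that isn't apache) is a guess; check your own top/free
# output for what's actually left over.
def estimate_max_clients(total_mb, reserved_mb, per_process_mb):
    """Divide the memory left after system overhead by the per-process
    resident size, rounding down (but always allow at least 1)."""
    available = total_mb - reserved_mb
    return max(1, available // per_process_mb)

# Conservative: size for the observed peak (~80MB resident).
print(estimate_max_clients(512, 100, 80))   # -> 5
# Optimistic: size for the observed average (~40MB resident).
print(estimate_max_clients(512, 100, 40))   # -> 10
```

Note how sensitive the result is to that per-process size - which is exactly why peaks like your 81MB process matter so much.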
Sans optimization, a next step would be to slowly raise MaxClients, testing with each change, until you get tight on free memory (or see a spike in CPU or I/O, but I suspect memory will be the first resource you exhaust). Given these resident sizes I'm thinking you won't get much beyond 5-10. That should give you an inkling of the best performance available without any further tuning. Keep the process list sorted by memory (press M in top) so you can observe peaks in apache process sizes.
You might also consider making sure that MaxRequestsPerChild is low (but not 0), in case there's a memory leak in the application stack that lets the longer-lived apache processes grow. Such an issue could also explain why limiting things to a single process resulted in an even larger size. Setting it to 1 will hurt performance, but ensures the smallest footprint in such cases, since the process exits after each request, forcing a resource cleanup. Note that a setting of 1 will probably also make it harder to catch all the processes in top - dropping the refresh interval to 1s can help, but you'll still just be seeing snapshots at intervals that are large compared to the process creation/destruction interval.
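As a starting point, a hypothetical prefork config along these lines - the values are illustrative, not a recommendation, and need to be tuned against your own top output:

```apache
# Conservative baseline for a memory-tight box (e.g. a Linode 512).
# Raise MaxClients slowly while watching resident sizes in top; raise
# MaxRequestsPerChild once you've ruled out per-request memory growth.
<IfModule mpm_prefork_module>
    StartServers          1
    MinSpareServers       1
    MaxSpareServers       2
    MaxClients            5
    MaxRequestsPerChild   1
</IfModule>
```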
To put these two parameters in perspective, consider a load test pummeling your server with thousands of requests. Apache is going to let MaxClients copies of itself be running simultaneously, each handling a request. Thus, your memory footprint will be MaxClients times the process size, which in turn depends on what that process does, such as your PHP code. Apache leaves a process around for MaxRequestsPerChild, handling multiple requests. So if your processing causes the apache process to grow by some amount per request, your peak usage will be something like MaxClients * (initial_apache_size + (MaxRequestsPerChild * per_request_growth)).
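Plugging some made-up numbers into that formula (a 20MB initial apache+PHP process that grows 2MB per request handled - both figures are assumptions for illustration only):

```python
# Worst-case memory footprint per the formula above:
# MaxClients * (initial_apache_size + MaxRequestsPerChild * per_request_growth)
def peak_memory_mb(max_clients, initial_mb, max_requests_per_child, growth_mb):
    return max_clients * (initial_mb + max_requests_per_child * growth_mb)

print(peak_memory_mb(5, 20, 1, 2))    # -> 110   (5 * 22MB) - fits a 512 easily
print(peak_memory_mb(5, 20, 100, 2))  # -> 1100  (5 * 220MB) - would swap a 512 to death
```

Same MaxClients in both cases - only the child lifetime changed - which is why a leak makes MaxRequestsPerChild matter so much.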
Your goal here is to get these parameters high enough for maximum throughput, yet low enough not to exhaust memory in the worst case. Even if you do have some leak/growth, letting at least a few requests get handled by a single child can help a lot with the fixed overhead of starting up the process and PHP interpreter, while still protecting against unbounded growth over time. So for example, if you have a choice between MaxClients of 10 with MaxRequestsPerChild of 1, versus MaxClients of only 5 but MaxRequestsPerChild of 5, the latter might actually perform better due to a slower rate of process creation/destruction.
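Running those two hypothetical configurations through the same peak formula (again assuming a 20MB process growing 2MB per request - illustrative numbers only):

```python
# peak = MaxClients * (initial_size + MaxRequestsPerChild * per_request_growth)
def peak_memory_mb(max_clients, initial_mb, max_requests_per_child, growth_mb):
    return max_clients * (initial_mb + max_requests_per_child * growth_mb)

print(peak_memory_mb(10, 20, 1, 2))  # -> 220  MaxClients 10, MaxRequestsPerChild 1
print(peak_memory_mb(5, 20, 5, 2))   # -> 150  MaxClients 5,  MaxRequestsPerChild 5
# The second config has the LOWER worst-case footprint and forks 5x fewer
# processes per request served - less parallelism, but possibly faster overall.
```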
That should let you settle in on a rough working set for the current Linode. Let's say that you can only get MaxClients to 5, and due to a leak at higher MaxRequestsPerChild you're stuck with that at 1 to help bound individual apache process sizes. Now you'll know the rough req/s you can handle on a Linode 512 without ever dying, and can judge if that's fast enough. Remember also that even low req/s numbers can yield very large daily page visit counts. An average of 1/s is still 86,400 per day, though clearly you want instantaneous peak req/s rates to be higher to support that average. But your original target of 5K page views a day (let's say over an 8 hour period, with each page needing 10 individual requests, so 50,000 http requests in 8 hours) could be met with about 1.74 req/s on average. And that's probably a pretty conservative estimate, since the static parts of the page can be serviced much more rapidly (and with less memory) by apache, so they won't be anywhere near as slow as your PHP-backed content.
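The 5K-pages/day arithmetic above, spelled out (the 10-requests-per-page and 8-hour-window figures are the assumptions from this thread):

```python
# Translate a daily page-view target into an average request rate.
pages_per_day = 5_000
requests_per_page = 10           # assumed: 1 PHP page + ~9 static assets
window_seconds = 8 * 3600        # assume traffic arrives over an 8h window

total_requests = pages_per_day * requests_per_page   # 50,000
avg_req_per_s = total_requests / window_seconds
print(round(avg_req_per_s, 2))   # -> 1.74
```

So even a MaxClients-of-5 box has a shot at this target, provided instantaneous peaks don't wildly exceed the average.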
If you judge the rate insufficient, then yes, at that point increasing the Linode plan will let you continue to raise MaxClients (slowly) depending on how much more memory the plan has, and gain some parallelism - at least up until the point where CPU or non-swapping I/O overhead begins to dominate.
Or of course, at that point digging more deeply into your application stack to find out if there are bugs, bottlenecks or things that can be improved there becomes an option too. But at least you'll have a stable platform to attempt tuning on.
Geez, this got long again ... sorry about that.
-- David