Quote:
Keep in mind I'm using some extreme scenarios to test the server. But as you said, a properly configured server shouldn't run out of memory right?
Yes. Well, to be fair, unless the minimum requirements of the application stack truly cannot fit in the available resources. But a Linode 512 really ought to be able to handle this, even with a resource-hungry stack like Drupal — at worst with reduced performance, but without crashing.
Wow. The only conclusion I can come to is that you're saying a single request needs at least a quarter of your available memory (setting aside non-apache processes). Assuming that's in the 100MB range, I guess that's not impossible, but it does seem excessive.
Darn, this isn't getting any easier for you, is it? :-(
In your shoes, I suppose my last scenario would be MaxClients and MaxRequestsPerChild of 1 each. That lets a single request into your host at a time. If you can crash things that way, you know you literally don't have enough resources for your application stack, and barring fixing a problem there, you simply have to grow your Linode. It may be that a Linode 1024 would be fine, or it may be that the same URL will just eat through whatever you give it (if it's a bug). The only way to know would be to test.
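For concreteness, that test scenario would look something like this in Apache's prefork MPM settings (paths and surrounding directives are assumptions about your setup — adjust for your httpd.conf layout):

```apache
# Extreme test configuration: serialize ALL traffic through one worker.
<IfModule mpm_prefork_module>
    StartServers          1
    MinSpareServers       1
    MaxSpareServers       1
    MaxClients            1   # only one request in flight at a time
    MaxRequestsPerChild   1   # child exits after each request, releasing
                              # any memory it leaked or ballooned to
</IfModule>
```

The MaxRequestsPerChild of 1 is deliberately wasteful (a fork per request), but it guarantees each request starts from a clean process, so if memory still runs out, it's one request's footprint doing it.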
The fact that it worked for so long probably implies an existing behavior (potentially a bug) that is now being tickled by differences (or growth) in your data, or maybe some change (a module upgrade?).
I guess the silver lining, if there is any in this scenario, is that if a single request is capable of doing this, then figuring out which request (or requests) it is should narrow your focus a lot.
Is there any way for you to get information from whatever stress tool you've settled on as to what URLs it requests in what order? If you could find the first one that started failing (timeout or whatever), it might point you at somewhere to look, or give you something concrete to ask about in a Drupal group.
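Even if the stress tool doesn't log that, Apache's own access log records every request in order, so you could fish the first failures out of it after a run. A rough sketch, assuming the standard combined log format and a Debian-style log path (both assumptions — adjust for your system):

```shell
# List requests that returned a server error (5xx), in order.
# In the combined log format, field 7 is the URL and field 9 the status.
awk '$9 ~ /^5/ { print $4, $7, $9 }' /var/log/apache2/access.log | head -20
```

The first URL in that output is your prime suspect; replaying just that URL by hand (curl or a browser) while watching memory would confirm it.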
I think it would also be worth your time to clone your current Linode onto a larger size (like a 1024 or larger), and then run the same test against that. If it survives, at least you know you have the option of spending your way out of this short term pending any other analysis. If not, at least you know that as well.
-- David
PS: For images, I'd add something like nginx for static content on the front end, proxying dynamic requests back to apache, independent of system size. The latency is because even simple static requests have to wait for a free apache worker, and each worker has the full interpreter stack. Offloading that to nginx would minimize the resources necessary to deliver static content.
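As a sketch of what I mean — hostnames, ports, and the docroot path below are placeholders, and this assumes Apache has been moved to listen on a local port (8080 here):

```nginx
# nginx front end: serve static files directly, proxy the rest to Apache.
server {
    listen 80;
    server_name example.com;          # placeholder

    # Static assets straight from disk -- no Apache worker consumed.
    location ~* \.(css|js|png|jpe?g|gif|ico)$ {
        root /var/www/drupal;         # assumed docroot
        expires 7d;
    }

    # Dynamic requests go back to Apache on a local port.
    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

nginx serves a static file in a few KB of memory per connection, versus a full PHP-laden Apache worker, which is why this helps so much on a small Linode.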