wfox wrote:
I've been studying the various threads here on reported issues with high I/O, and have yet to diagnose what's causing the problem on one of my Linode servers.
Well, at least it seems fairly clear why you're getting the failures, if not necessarily the key culprit.
Quote:
If I had to guess, I'd say Apache is the culprit. Tailing syslog, I see this fairly frequently.
Code:
Apr 11 21:05:08 linodeapp kernel: Out of memory: kill process 6225 (apache2) score 62168 or a child
While with Django the odds favor Apache (I think mod_wsgi in its default embedded mode still runs the Python interpreter inside the Apache processes), being chosen by the OOM killer need not mean Apache is the cause. The killer targets the process with the highest score, which may just be the largest victim at that moment rather than the one that drove the box out of memory.
But at a very simple level, whatever you have running on the apps box is using too much memory in aggregate. It may be just Apache (25 simultaneous processes at about 80MB each would burn through 2GB, ignoring all other processes), or it could be something else on the box using up a chunk of memory, or a combination of the two, e.g. some other very large process significantly reducing what's left for the Apache processes.
So your next step (as is probably often referenced in similar threads) is to find out where your memory is going. Check your current free memory, divide it by the worst-case Apache process size you can find, and see whether your MaxClients setting puts you over. Then drop MaxClients back down so it can't, and start tuning from there.
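The arithmetic in that step can be sketched roughly like this; the numbers below are hypothetical placeholders, not measurements from your server:

```python
# Sketch of the MaxClients sizing check described above.
# Assumes every child could balloon to the worst-case size you observed.

def safe_max_clients(free_mem_mb, worst_child_mb):
    """Largest MaxClients that still fits in currently free memory."""
    return free_mem_mb // worst_child_mb

# Hypothetical example: ~1600MB free after other services,
# worst Apache child seen at ~80MB:
print(safe_max_clients(1600, 80))  # 20, so MaxClients 25 would overcommit
```

If the result is below your current MaxClients, the box can be driven into OOM under full load even with nothing else misbehaving.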
If there's a big discrepancy in Apache process sizes, they may be growing over time (perhaps a leak somewhere in the application stack), in which case you shouldn't let so many requests be handled by the same process (lower MaxRequestsPerChild so children are recycled sooner).
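To see why recycling children helps, here's a back-of-the-envelope model of a leaking child; the baseline size and per-request leak are made-up illustrative figures:

```python
# If each request leaks a little memory, a child's size grows with the
# number of requests it has served; MaxRequestsPerChild caps that growth.
# Hypothetical numbers: 30MB baseline, 50KB leaked per request.

def child_size_mb(requests_served, base_mb=30, leak_kb_per_req=50):
    return base_mb + requests_served * leak_kb_per_req / 1024

print(child_size_mb(1000))  # long-lived child, roughly 79MB
print(child_size_mb(100))   # recycled sooner, roughly 35MB
```

Multiply the long-lived figure by MaxClients and you get the worst-case footprint the box has to absorb, which is why capping requests per child bounds the damage from a slow leak.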
But if I were you, I'd consider my first job at this point to ensure the machine never entered the OOM state (even if that means drastically cutting back on MaxClients), and only then see how to tune to get highest throughput.
-- David