Cheek wrote:
The weird thing is, it was working fine for months without crashing. It just crashed again with CPU and I/O spiking through the roof.
That's not that unusual. Traffic load changes, database performance changes as the data grows, etc. You may have been very close to exceeding your resources for a while now and just not known it. Or something may have bumped the request load to your node up significantly (a link from some site?) without your knowing it.
Quote:
I'd rather have Lassie rebooting my node, than the CPU spiking. Because that could crash my server for hours when I'm not around.
You should at least get an eventual email about the CPU usage exceeding the notification limits (I do). The problem is that depending on kernel configuration, a panic doesn't actually halt the box (the kernel is still running, just in a tight loop, which leads to the CPU usage) so Lassie doesn't consider it down.
There's a kernel parameter (kernel.panic) that you can set to the number of seconds after which the box will reboot itself following a panic. As a general matter there is some risk in always restarting, depending on the cause of the panic, but in a scenario like this it's probably preferable to staying in the panicked state. You can make adjustments to that value persistent in /etc/sysctl.conf.
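For example, a minimal sketch (run as root; the 30-second delay is an arbitrary example value, pick whatever suits you):

```shell
# Reboot automatically 30 seconds after a kernel panic (takes effect immediately)
sysctl -w kernel.panic=30

# Make the setting survive reboots
echo "kernel.panic = 30" >> /etc/sysctl.conf
```

You can check the current value at any time with `sysctl kernel.panic` or by reading /proc/sys/kernel/panic; a value of 0 (the usual default) means the kernel stays in the panicked state forever.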
Quote:
But anyway, there's not much else running besides apache and mysql. And as I said, I've set the mysql to the values as in the library and MaxClients to just 10, which seems pretty low.
Really the only way to know is to test. You may have a stack that is using even more memory than you think, so even MaxClients of 10 may be too much. You need to actually monitor your resource usage under load to identify what you actually use. The other threads cover ways of doing that in far greater detail, but basically you want to observe how much actual memory each Apache process is using when handling requests.
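As a quick sketch of that kind of observation, something like the following shows the resident memory of each Apache worker while it's handling load (the process name may be "apache2" or "httpd" depending on your distro, so adjust accordingly):

```shell
# Resident memory (RSS) per Apache worker, plus a rough average.
# ps reports RSS in KB; divide by 1024 for MB.
ps -o pid,rss,cmd -C apache2

ps -o rss= -C apache2 | awk '{sum += $1; n++} END { if (n) printf "%d workers, avg %.1f MB each\n", n, sum/n/1024 }'
```

Run it a few times during a busy period; the average times MaxClients is roughly the worst-case memory Apache alone can demand.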
Quote:
Is there anything else that could cause this? I was really happy with my linode, but I'm just one guy building a website and can't spend most of the day fixing the server.
Something else causing you to go into an OOM condition? Nope - an OOM pretty much means exactly what it says: you're using too much memory.
Assuming it's the Apache configuration is an educated guess, but as it's probably the leading contender of this scenario in just about all cases brought to the forum, it's a good first bet.
Quote:
I've got 3 options: find a fix, double the Linode or go with something like a Mediatemple DV server. It'll more than double the costs but maybe it's the best option for me?
Only you can answer that for yourself. Certainly just throwing more resources at the problem (the bigger Linode - I know nothing about Mediatemple) is "simpler", but I can't say it's guaranteed to solve the problem without your first identifying the root cause. It might well just push off your having to deal with it until later.
For example, let's say that you were still at MaxClients of 50, and your request load used them all, but each Apache process was using 100MB (all extreme values). Just bumping your Linode to a 2048 wouldn't solve the problem, just let you get a few more simultaneous requests before keeling over the same way.
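To put numbers on that, a back-of-the-envelope ceiling for MaxClients is just the memory you can spare for Apache divided by the per-process size you measured. A sketch using the extreme example figures above (the 512MB reserve for MySQL and everything else is my assumption; substitute what you actually observe):

```shell
TOTAL_MB=2048        # total memory on the upgraded Linode
RESERVED_MB=512      # assumed allowance for MySQL, kernel, everything else
PER_PROC_MB=100      # measured size of one Apache process (extreme example)

# Rough upper bound on a safe MaxClients
echo $(( (TOTAL_MB - RESERVED_MB) / PER_PROC_MB ))   # → 15
```

So even at double the memory, those example numbers only buy you about 15 simultaneous requests - which is why measuring the per-process size matters more than the size of the box.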
-- David