untitled9 wrote:
Um... why's your load average 22.75?
A load average of 25 merely means "25 jobs attempting to run".
If the I/O throughput is too low then even normal jobs (e.g. cron checking to see if any work is to be done) will slow down. Because the I/O request is not satisfied, the job remains in the "attempting to run" state, and so adds 1 to the load average.
In this case, disk I/O was effectively frozen, so most jobs (e.g. console login process, ssh forking, cron, web server, postfix checking the queue, postfix accepting mail, etc.) froze as well, and each one added 1 to the load average.
A machine in this state is called "I/O bound".
Another reason for a high load average could be "CPU bound", where too many processes are trying to run and the CPU just can't satisfy all the requests.
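As a rough way to tell the two cases apart on a Linux box, here is a minimal Python sketch. It assumes a Linux-style /proc: it reads /proc/loadavg and counts processes in state "R" (runnable, wanting CPU) versus "D" (uninterruptible sleep, usually waiting on disk I/O). The function names are my own, not from any tool, and it counts processes rather than individual threads, so the totals only approximate what the kernel feeds into the load average.

```python
import os


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into (1min, 5min, 15min) floats."""
    fields = text.split()
    return float(fields[0]), float(fields[1]), float(fields[2])


def parse_state(stat_text):
    """Return the one-letter process state from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so take the first field after the LAST closing parenthesis.
    return stat_text[stat_text.rfind(")") + 1:].split()[0]


def count_states(proc_root="/proc"):
    """Count processes that are runnable (R) or in uninterruptible sleep (D)."""
    counts = {"R": 0, "D": 0}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                state = parse_state(f.read())
        except (OSError, IndexError):
            continue  # process exited while we were looking at it
        if state in counts:
            counts[state] += 1
    return counts


# Only attempt the live readout where /proc/loadavg actually exists.
if __name__ == "__main__" and os.path.exists("/proc/loadavg"):
    with open("/proc/loadavg") as f:
        one, five, fifteen = parse_loadavg(f.read())
    counts = count_states()
    print(f"load average: {one} {five} {fifteen}")
    print(f"runnable (R): {counts['R']}  waiting on I/O (D): {counts['D']}")
    if counts["D"] > counts["R"]:
        print("mostly D state: looks I/O bound")
    else:
        print("mostly R state: looks CPU bound (or idle)")
```

The comparison at the end is a crude rule of thumb from a single snapshot, so run it a few times before drawing conclusions.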
It _is_ possible to have a high load average and still good performance; e.g. a job that just forks a child and terminates, and the child does the same. Processes will be created and terminated very quickly, and so a large number will be in the run queue every second (so load average will look high) but the system remains perfectly responsive. (I did this a few years back on a Sun Sparc 20 and got a load average of over 30, just to prove a point to my manager... it was impossible to kill, so I had to reboot the machine!)
Unix systems are very dependent on a number of things, and bottlenecks can appear in unexpected places. I presume this is one reason why caker spent so much time on the I/O throttler: disk I/O is very important for the smooth running of a linode. Even if that linode isn't swapping (mine is nowhere near; 30MB RAM used!), disk I/O performance is high on the critical path for performance tuning.