edavis wrote:
Waited for the load to settle down. It got to 0.5-ish. I then looked at my io_status and ran an ls -l, with top running in the background. The load immediately shot up to 3+ and ls took 30+ seconds to return. Here was my io_status at the time:
io_count=4251010 io_rate=13 io_tokens=399982 token_refill=512 token_max=400000
In the bad old days, when the host I was on was suffering from other Linodes hogging all of the disk I/O, this is the behavior I would get. All of my processes that wanted to touch the disk would get stuck, and the load average would go up. I guess those processes were being counted toward the load average while they were stuck, instead of being treated as sleeping on I/O.
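For what it's worth, the Linux load average does count tasks in uninterruptible sleep (state D, which usually means waiting on disk) along with runnable ones, so a pile of processes stuck on I/O pushes the number up even when the CPU is idle. Here is a rough sketch for checking how many processes are in each state while the load is high. It assumes a Linux /proc filesystem, and the function names are my own, not from any particular tool:

```python
import glob


def state_from_stat(stat_line):
    """Return the one-letter process state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so take whatever follows the last ')'.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def count_states(stat_lines):
    """Count how many processes are in each state (R, S, D, ...)."""
    counts = {}
    for line in stat_lines:
        if ")" not in line:
            continue  # skip empty or malformed entries
        state = state_from_stat(line)
        counts[state] = counts.get(state, 0) + 1
    return counts


def read_stat_lines():
    """Read every readable /proc/<pid>/stat; processes may exit mid-scan."""
    lines = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                lines.append(f.read())
        except OSError:
            continue
    return lines


if __name__ == "__main__":
    counts = count_states(read_stat_lines())
    print("running (R):", counts.get("R", 0))
    print("uninterruptible, usually disk wait (D):", counts.get("D", 0))
```

If the D count climbs while ls is hanging, that points at disk wait rather than CPU contention.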
Anyway, it was the I/O tokens mechanism that solved most of this problem, by keeping other Linodes from thrashing the box so hard. The remainder of the problem was mostly solved by moving me to a quieter host. This kind of poor performance still happens on rare occasions, but it doesn't last very long.
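To illustrate the idea, here is a minimal token bucket sketch. I don't know how the host actually implements it, so this assumes one token per I/O operation, a refill of token_refill tokens per second, and a cap of token_max, going only by the field names in io_status. The class and function names are made up for the example:

```python
class TokenBucket:
    """Toy I/O limiter: each operation spends one token (an assumption)."""

    def __init__(self, refill_per_sec, max_tokens):
        self.refill_per_sec = refill_per_sec
        self.max_tokens = max_tokens
        self.tokens = max_tokens  # start with a full bucket

    def tick(self, seconds=1):
        """Add the refill for the elapsed time, never exceeding the cap."""
        refilled = self.tokens + self.refill_per_sec * seconds
        self.tokens = min(self.max_tokens, refilled)

    def request(self, ops):
        """Spend tokens for up to `ops` operations; return how many ran."""
        allowed = min(ops, self.tokens)
        self.tokens -= allowed
        return allowed


def simulate(bucket, demand_per_sec, seconds):
    """Return the operations served in each one-second step."""
    served = []
    for _ in range(seconds):
        served.append(bucket.request(demand_per_sec))
        bucket.tick()
    return served


if __name__ == "__main__":
    # Small numbers so the effect shows up quickly.
    print(simulate(TokenBucket(10, 100), 50, 5))
```

Under these assumptions a heavy user can burst until the bucket is empty and is then held to the refill rate, which is what keeps one Linode from thrashing the box indefinitely.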
My guess, and it's just a guess, is that other Linodes on your host are totally swamping the I/O. If 4 or 5 Linodes are all constantly doing disk I/O even at the maximum rate allowed by the token system, performance for everyone else on your host will suffer. Sounds like this is happening to you.
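Your io_status line fits that guess, as far as I can read it. Here is a quick sketch that parses it, assuming io_tokens is what is currently in your bucket and token_max is the cap (I'm inferring that from the field names, and parse_io_status is just a name I made up):

```python
def parse_io_status(line):
    """Turn 'key=value key=value ...' into a dict of ints."""
    fields = (field.split("=", 1) for field in line.split())
    return {key: int(value) for key, value in fields}


status = parse_io_status(
    "io_count=4251010 io_rate=13 io_tokens=399982 "
    "token_refill=512 token_max=400000"
)

# How far below the cap the bucket is right now.
tokens_below_max = status["token_max"] - status["io_tokens"]
print("tokens below max:", tokens_below_max)
print("current io_rate:", status["io_rate"])
```

Your bucket is essentially full and your io_rate is low, so your own Linode is not the one being throttled. That is consistent with the slowdown coming from contention elsewhere on the host.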