Linux 2.6 on the Hosts
Six out of 20 host servers are now running the 2.6 version of the Linux kernel, with the CFQ fair-queuing disk scheduler.
Now that we've been running it on a few boxes for a while, I have a pretty good feel for how it performs. I've noticed 2.6 is better at some workloads, and a little worse at others, compared to 2.4 (determined by comparing mrtg and vmstat output from before and after the move to 2.6). I'm optimistic that there are some additional gains to be had with some of the VM tuning options (/proc/sys/vm/*).
Overall, I think 2.6 is "a good thing", and we'll be moving the rest of the hosts over to 2.6 eventually.
Disk I/O Thrashing is no more!
The primary reason I wanted to move to 2.6 was for the I/O performance improvements over 2.4.
Linux is susceptible to what I would call a "hard-drive Denial Of Service attack" when there is a high rate of random read/write requests filling up the request queue(s). This causes latency issues for other requests, and essentially brings things to a crawl.
This is exactly the kind of workload that happens when a Linode is continually thrashing its swap devices (rapid reading and writing) and when the host is under pressure to write out those dirty pages (which it always will be, after some time). Unfortunately, the CFQ patch to 2.6 doesn't solve this issue, and neither do the default anticipatory or deadline schedulers.
CFQ does help a little with many threads doing random I/O (like during the cron job parties), but it doesn't eliminate the possibility of one Linode wedging the entire host. Read on for the solution...
UML I/O Request Token-Limiter patch
I've implemented a simple Token Bucket Filter/Limiter around the async UBD driver inside UML. The token-bucket method is pretty neat. Here's how it works: Every second, x tokens are added to the bucket, up to the bucket's maximum size. Every I/O request requires a token, so it has to wait until the bucket has some tokens before it's allowed to perform the I/O.
This method allows a burstable/unrestricted rate until the bucket is empty, and then it starts to throttle. Perfect!
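To make the mechanism concrete, here is a minimal user-space sketch of a token bucket in Python. It is not the code from the patch (that lives in the UBD driver inside UML); the class name, the parameter names `refill_rate` and `bucket_size`, the choice to start with a full bucket, and the continuous (rather than once-per-second) refill are all assumptions made for illustration.

```python
import time


class TokenBucket:
    """Illustrative token bucket: at most `bucket_size` tokens,
    refilled at `refill_rate` tokens per second."""

    def __init__(self, refill_rate, bucket_size, clock=time.monotonic):
        self.refill_rate = refill_rate
        self.bucket_size = bucket_size
        self.clock = clock                # injectable so the sketch is testable
        self.tokens = float(bucket_size)  # assumption: start full, so bursts pass
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last check, capped at
        # the bucket size so an idle period cannot bank unlimited credit.
        now = self.clock()
        earned = (now - self.last) * self.refill_rate
        self.tokens = min(self.bucket_size, self.tokens + earned)
        self.last = now

    def try_take(self):
        """Consume one token if available; return False if the bucket is empty."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def take(self):
        """Block until a token is available, then consume it."""
        while not self.try_take():
            # Sleep roughly until the next whole token has accumulated.
            time.sleep((1 - self.tokens) / self.refill_rate)
```

Each I/O request would call `take()` before being issued. While the bucket still holds tokens, requests go through at full speed; once it is empty, they are paced at `refill_rate` per second.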
Links:
token-limiter-v1.patch
token-limiter-v1.README
With this patch, a single Linode can no longer wedge the host!
This is a big deal, since until now the only way to correct this when it happened was for me to intervene and stop the offending Linode.
The limiter patch is in the 2.4.25-linode24-1um kernel (2.6 to follow shortly).
The defaults are set very high, and I doubt any of you will be impacted by them under normal use. I can change the refill and bucket size values at run time, so I'll be able to design a monitor for each host that dynamically changes the profiles depending on host load. This is a big deal!
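As a rough idea of what such a monitor might decide, here is a small Python sketch that maps host load to a refill/bucket-size profile. Everything in it is hypothetical: the thresholds, the numbers, and the function names are invented for illustration, and how the chosen values would actually be pushed to a running Linode is deliberately left out.

```python
import os

# Hypothetical profiles: (max load per CPU, refill tokens/sec, bucket size).
# These numbers are made up for illustration, not the patch's defaults.
PROFILES = [
    (1.0, 1000, 50000),          # quiet host: generous limits
    (4.0, 400, 20000),           # busy host: tighter limits
    (float("inf"), 100, 5000),   # overloaded host: strict limits
]


def choose_profile(load_per_cpu):
    """Return (refill_rate, bucket_size) for the first profile that fits."""
    for max_load, refill_rate, bucket_size in PROFILES:
        if load_per_cpu <= max_load:
            return refill_rate, bucket_size


def current_load_per_cpu():
    """One-minute load average divided by the number of CPUs."""
    return os.getloadavg()[0] / (os.cpu_count() or 1)
```

A monitor loop would call `choose_profile(current_load_per_cpu())` every so often and apply the result only when it differs from the profile already in effect.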
Linux 2.6 for the Linodes
I haven't officially announced the 2.6-um kernel yet. There are still a few bugs and performance issues to be worked out. I don't recommend running the 2.6-um kernel for production use yet, but a few adventurous users have been testing it and reporting some of the quirks involved in getting it running under each distro. I'll try to compile a guide on migrating to 2.6 and release it once the kernel is more stable.
What's new in the world of UML?
We're long overdue for new UML patches. I'd guess we'll have a new UML release (for both 2.4 and 2.6) within the next two weeks or so.
Besides the usual bug fixes, I know Jeff has been working on AIO support for the I/O driver inside UML. AIO is a new feature implemented in 2.6 (on the hosts). Some benefits are:
- The ability to submit multiple I/O requests with a single system call.
- The ability to submit an I/O request without waiting for its completion and to overlap the request with other processing.
- Optimization of disk activity by the kernel through combining or reordering the individual requests of a batched I/O.
- Better CPU utilization and system throughput by eliminating extra threads and reducing context switches.
More on AIO:
http://lse.sourceforge.net/io/aio.html
http://archive.linuxsymposium.org/ols20 ... LS2003.pdf
---
That is all!
-Chris