Linode Forum
Linode Community Forums
 Post subject: IO-Nice
PostPosted: Sat Sep 11, 2004 12:17 am 
Offline
Senior Member

Joined: Sat Dec 13, 2003 12:39 pm
Posts: 98
Would it be possible to have a way to limit IO token usage rates for specific processes?

On my Linode 64, which I use mainly for testing, I have a postgres database that is practically empty. A cron job (installed by apt-get, it looks like) runs every 5 hours to do routine database maintenance, and this cron job consumes over half of the IO tokens. I could imagine this being a problem if, for example, a web app was hit at the same time: together they could deplete the IO tokens and cause IO to become limited.

I expect with actual data in the database, this cron job would consume more IO tokens and it would be more likely to interfere with important stuff. Right now the cron job takes 2 minutes or so to run. What I would like to do is be able to restrict its token consumption rate to like 1/10 of what it consumes and have it take longer, say 20 minutes.

I think that would eliminate the risk of it interfering with more important IO uses and wouldn't hurt since routine maintenance on the DB is designed to run in the background and not interfere with other uses of the DB.

I can imagine it might be very tricky to isolate the maintenance-related IO requests from others. Maybe it wouldn't even be possible, I just don't know. Thinking about it, I imagine postgres itself probably originates them in response to the maintenance requests.
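
To make the 1/10-rate idea concrete, here is a minimal Python sketch of user-space throttling. It is only an illustration under assumptions: the names throttle_delay and throttled_copy are made up for this example, and it paces a plain file copy, not the I/O that postgres itself issues, which a wrapper like this cannot see. The point is just the arithmetic: the same number of bytes at one tenth of the rate takes ten times as long, so 2 minutes becomes about 20.

```python
import time


def throttle_delay(bytes_done, elapsed_s, max_bytes_per_s):
    """Return how long to sleep so the average rate stays at or under the cap."""
    # Time the work should have taken if it ran exactly at the capped rate.
    target_elapsed = bytes_done / max_bytes_per_s
    return max(0.0, target_elapsed - elapsed_s)


def throttled_copy(src, dst, max_bytes_per_s, chunk_size=64 * 1024):
    """Copy src to dst (file-like objects), pausing to respect the rate cap."""
    start = time.monotonic()
    done = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        done += len(chunk)
        # Sleep only if we are ahead of the schedule implied by the cap.
        time.sleep(throttle_delay(done, time.monotonic() - start, max_bytes_per_s))
    return done
```

For example, throttle_delay(1000, 0.5, 1000) asks for another half second of sleep, because 1000 bytes at 1000 bytes/sec should have taken a full second.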


 Post subject:
PostPosted: Sun Sep 12, 2004 2:41 am 
Offline
Senior Member

Joined: Fri Aug 06, 2004 5:49 pm
Posts: 158
Postgres has to go through that much work to keep up maintenance?!

Wow... I think I'll be sticking with MySQL.


 Post subject: mysql
PostPosted: Sun Sep 12, 2004 8:07 pm 
Offline
Senior Member

Joined: Sat Dec 13, 2003 12:39 pm
Posts: 98
I didn't notice how much IO the cron job was using until I started tracking IO usage with that mrtg script. Do you know how much IO mysql uses? Maybe I should switch.


 Post subject:
PostPosted: Mon Sep 13, 2004 12:17 am 
Offline
Junior Member

Joined: Tue Nov 18, 2003 2:02 am
Posts: 30
If you're talking about a routine VACUUM of the database, I think Postgres will give that query a lower priority than that of any other app you've got running.

So, if you've got the VACUUM running, and someone hits your web app, Postgres will preferentially use its disk I/O to service the web app's query.

I use Postgres at work with a medium-size database (~150M records) and running a VACUUM doesn't noticeably impact other queries.


 Post subject:
PostPosted: Mon Sep 13, 2004 12:52 pm 
Offline
Senior Member

Joined: Fri Aug 06, 2004 5:49 pm
Posts: 158
Unfortunately VACUUM doesn't know about the I/O Limiter system in place on Linodes. If it's already eaten all the I/O tokens, even if preference is given to other queries, they're still already going to be limited, and slowed down significantly.

Last I knew, MySQL doesn't need much maintenance, and typically it's put in the hands of the administrator to know a bit about the system and run maintenance when it's needed. For something like a 100M database, putting data-checking maintenance aside, a simple OPTIMIZE TABLE doesn't take 2 minutes and shouldn't need to be done more than maybe once a week... at least from my experience, which isn't a whole lot, but I've had 2GB databases I've worked with.


 Post subject: vacuum
PostPosted: Mon Sep 13, 2004 8:33 pm 
Offline
Senior Member

Joined: Sat Dec 13, 2003 12:39 pm
Posts: 98
The cron job runs vacuum and some other optimizations. In practical terms I might change the schedule and have it run once a month instead of once per day.

Like tierra said, the case I'm worried about is postgres maintenance maxing out the IO while another unrelated web app runs. The web app might do no database access, but it would still be limited in doing anything that uses IO.

I've been running tests, and when that happens, it is not good. It can take minutes to get a single simple web page back when IO is limited and there is competition for IO tokens.

What would really be nice, but perhaps unrealistic to implement, is to be able to limit any IO not related to a web app to something lower than the refill rate, or be able to set priorities some other way.


 Post subject:
PostPosted: Tue Sep 14, 2004 2:33 am 
Offline
Senior Member

Joined: Fri Aug 06, 2004 5:49 pm
Posts: 158
What would be really nice is more ram =), it's cool that I can run anything I want on a Linode, but I can only run so many at a time. Then again, I guess that's the issue companies had when they were splitting up mail servers, web servers, database servers, etc. Not saying Chris is doing anything wrong, cuz damn, that's one cheap server with one hell of an internet connection. I just wish I had more money to throw at a Linode 128.

That's the only problem here, the only time people keep running out of I/O tokens is when they're using more swap than they have ram, and swapping like no other between services on every new request. I think a lot of people don't understand what swap space is, and how it works. I was given the completely wrong impression because of Microsoft's naming convention - "virtual memory".

Quote:
What would really be nice, but perhaps unrealistic to implement, is to be able to limit any IO not related to a web app to something lower than the refill rate, or be able to set priorities some other way.


I'd be happy with even just a 2-level priority scale: the high-priority processes use as much as they want, and low-priority ones can only use I/O if the token bucket is at least 1/2 full. Or something to that effect. I'm sure it's possible, but that's asking a lot (unless I get backhanded again by someone who knows a lot more than me and gets it implemented with 5 lines of code, as I have no experience in that dept).
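
Roughly what I mean, as a Python sketch. This is not how the Linode limiter actually works; the class name, the per-tick refill and the token costs are all assumptions made up for illustration. High priority still can't spend tokens that aren't there, it just isn't held back by the half-full rule.

```python
class TwoLevelBucket:
    """Token bucket where low-priority requests need the bucket at least half full."""

    def __init__(self, max_tokens, refill_per_tick):
        self.max_tokens = max_tokens
        self.refill_per_tick = refill_per_tick
        self.tokens = max_tokens  # start with a full bucket

    def tick(self):
        # Add tokens once per time step, never exceeding the bucket size.
        self.tokens = min(self.max_tokens, self.tokens + self.refill_per_tick)

    def request(self, cost, high_priority):
        # Low-priority I/O is refused while the bucket is under half full.
        if not high_priority and self.tokens < self.max_tokens / 2:
            return False
        # Nobody can spend more tokens than the bucket currently holds.
        if self.tokens < cost:
            return False
        self.tokens -= cost
        return True
```

With a 100-token bucket, once a high-priority burst takes it down to 40 tokens, a low-priority request is refused until refills bring it back to 50 or more.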

Anyway, I'm not going to bother Chris about it, he's doing an excellent job, and I can manage with the vast amounts of system resources I've already been given at such a low price.


 Post subject: yep
PostPosted: Tue Sep 14, 2004 6:52 am 
Offline
Senior Member

Joined: Sat Dec 13, 2003 12:39 pm
Posts: 98
I hear you, don't get me wrong, this is a great deal, and the best features I have seen around. I have a 64 and a 128.

One thing -- the postgres maint stuff doesn't consume hardly any new memory, and uses very little CPU, so swap usage isn't a factor. It is really pure IO on the actual database files on the HD. Postgres is always running, it just responds to commands that the cron job initiates. I watched it do its thing with those MRTG graphs, also watched swap/ram usage, cpu sys/user usage, net traffic, etc. IO token usage shot up, the others were flat, except a slight increase in cpu.

I figure UML is open source, and there are various people writing changes for it, seeing how people use it, etc., so I am just posting my experience and things I think would improve it. Seems like IO is a key area of development. I too would be happy with just 2 levels of IO priority, but I'm thinking getting from 1 to 2 is the hard part; from 2 to like 20 would probably be nothing.

And the key thing, too, is that the code for IO could be better. Slicing up memory is fine, I have no complaints about UML in that regard. Same for CPU sharing, seems great to me. But the IO sharing code is just not at the same high level that CPU and RAM sharing are at. I feel like just in general we're not using most efficiently the IO we share on linode hosts. It's getting much better, and I'm sure in the not too distant future it will be there.


 Post subject: Re: yep
PostPosted: Tue Sep 14, 2004 1:46 pm 
Offline
Senior Member

Joined: Fri Aug 06, 2004 5:49 pm
Posts: 158
kiomava wrote:
One thing -- the postgres maint stuff doesn't consume hardly any new memory, and uses very little CPU, so swap usage isn't a factor. It is really pure IO on the actual database files on the HD. Postgres is always running, it just responds to commands that the cron job initiates. I watched it do its thing with those MRTG graphs, also watched swap/ram usage, cpu sys/user usage, net traffic, etc. IO token usage shot up, the others were flat, except a slight increase in cpu.


It's not really that at all, actually; maybe some, but you're using more I/O for another reason. Say you have your 64mb of ram and 256mb of swap. Let's also take my server as an example, since mine's in danger of doing this if I had more traffic (I'll have to kill some services or upgrade eventually). I have MySQL, Apache2, qmail, courier-imap, spamd, and proftpd, among other services, all loaded up at the same time. I'm using the full 64mb of ram and another 80mb of swap to do this when idle. Processes that have been idle the longest get pushed into swap space first when the system needs ram.

So let's just say that at some point MySQL had to be pushed into swap, since maybe I last checked my email and qmail and courier-imap are loaded up along with Apache. The next time I make a request for a PHP page that uses the MySQL database, the system recognizes it needs ram and copies qmail and courier-imap to the swap drive. Just for one page request, I've already done as much I/O as the memory that qmail and courier-imap take up while running. Now it's got to read the swap drive and copy MySQL back into ram, so I've also done more I/O: the size of the MySQL binary plus whatever it's loaded up. Now that MySQL is in ram, it can continue running and make the requests as necessary.

So even if my page request only needed a couple hundred bytes from a MySQL database, it's probably used up 40mb of I/O. Now say I'm getting requests for mysql dependent dynamic pages at the same time people are checking their email from my server, it's going to keep swapping out MySQL, qmail, and courier-imap... the swapping becomes a problem when the processes all being used in the past minute add up to more ram than you have.


 Post subject: Re: yep
PostPosted: Tue Sep 14, 2004 3:12 pm 
Offline
Senior Member

Joined: Thu Aug 28, 2003 12:57 am
Posts: 273
tierra wrote:
So even if my page request only needed a couple hundred bytes from a MySQL database, it's probably used up 40mb of I/O. Now say I'm getting requests for mysql dependent dynamic pages at the same time people are checking their email from my server, it's going to keep swapping out MySQL, qmail, and courier-imap... the swapping becomes a problem when the processes all being used in the past minute add up to more ram than you have.


This is not true under normal situations. Unless your system is running in a very, very high-load/low-memory situation and is truly swapping (which means writing out entire process memory spaces into swap at a time when there is very high contention for memory), you're not going to be writing entire processes to the disk. Linux and other modern operating systems will write just the least recently used pages of processes to swap to free up needed memory. I'm sure that it does a chunk of pages at a time to improve I/O efficiency, but it doesn't have to write or read the entire 40 MB MySQL process.

I'm not sure when the paging algorithm switches over to swapping in and out entire processes, but I'm sure that it's only when the system is under extreme duress. If your requests to the different processes can be satisfied just by paging out a few pages of one process to load pages of the other process into memory, and then unloading those and re-loading the others when the original process needs more memory, then you'll only be paging out a small number of pages at a time, not entire processes. So in your example I would expect the system to constantly be shuffling a couple of hundred K or so into and out of swap as requests to the various processes are handled.
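
To illustrate the difference, here is a toy Python simulation of least-recently-used page replacement. It is a sketch under assumptions, not the kernel's actual algorithm (Linux uses an approximation of LRU, and the function name simulate_lru is made up here). Each access is a (process, page) pair, and only single pages are ever evicted, never a whole process.

```python
from collections import OrderedDict


def simulate_lru(accesses, frames):
    """Replay (process, page) accesses through `frames` physical page frames.

    Returns the list of pages evicted, in eviction order.
    """
    resident = OrderedDict()  # keys ordered from least to most recently used
    evicted = []
    for page in accesses:
        if page in resident:
            resident.move_to_end(page)  # mark as most recently used
            continue
        if len(resident) >= frames:
            victim, _ = resident.popitem(last=False)  # least recently used page
            evicted.append(victim)
        resident[page] = True
    return evicted
```

With three frames and the accesses mysql page 0, mysql page 1, qmail page 0, mysql page 0, imap page 0, the only eviction is mysql page 1: one cold page goes out to make room, and the rest of mysql stays resident.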


 Post subject:
PostPosted: Tue Sep 14, 2004 3:18 pm 
Offline
Senior Member

Joined: Thu Aug 28, 2003 12:57 am
Posts: 273
By the way, if you run "ps" under Linux you can see that most processes will have some number of pages in swap at any given time, for example:

Code:
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
rpc        752  0.0  0.0  1524  464 ?        S    Jul20   0:00 portmap
rpcuser    780  0.0  0.0  1644  632 ?        S    Jul20   0:00 rpc.statd
ntp        899  0.0  0.0  1888 1880 ?        SL   Jul20   1:35 ntpd
root       952  0.0  0.0  2632  480 ?        S    Jul20   0:00 sshd
root       986  0.0  0.0  2144  568 ?        S    Jul20   0:00 xinetd
lp        1006  0.0  0.0  9396  684 ?        S    Jul20   0:10 lpd


VSZ is the virtual memory size of a process: the total address space it has mapped, whether those pages are in memory, in swap, or not yet loaded from disk. RSS is the resident set size, which is how much of the process is actually in physical memory. As you can see, all of the above processes have pages that are not resident. If I ssh into my box, we can see that suddenly more of the sshd pages become resident:

Code:
root       952  0.0  0.0  2632  644 ?        S    Jul20   0:00 sshd


Note that the virtual memory size of sshd did not change because it did not allocate more memory to handle my ssh request (or if it did, it got memory that it already owned but was being held by the gnu malloc algorithm), but the resident size went up as some pages were swapped in by the kernel. Most likely the sshd process wanted to run code from some parts of its text segment that had not been used in a while (since I haven't sshed into this box in days) but now needed to be run to handle the incoming ssh request.
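
If you want the same numbers without eyeballing ps output, a small Python sketch like the one below can pull them from /proc/<pid>/status. Assumptions: this is Linux only, the helper names are made up for this example, and the VmSwap line is only present on reasonably recent kernels, so the result may contain just VmSize and VmRSS.

```python
def parse_vm_fields(status_text):
    """Extract VmSize, VmRSS and VmSwap (in kB) from /proc/<pid>/status text."""
    wanted = {"VmSize", "VmRSS", "VmSwap"}
    fields = {}
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        if name in wanted:
            fields[name] = int(rest.split()[0])  # the number is followed by "kB"
    return fields


def read_vm_fields(pid="self"):
    # Reads the live status file; fields missing on this kernel are simply absent.
    with open(f"/proc/{pid}/status") as f:
        return parse_vm_fields(f.read())
```

VmSize corresponds to the VSZ column and VmRSS to the RSS column; VmSwap, where available, is the part that is really sitting in swap, which is usually less than VSZ minus RSS.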

I have been running Linux for a *long* time and I don't think I have ever seen actual swapping occur, where the kernel switches to its very heavy-handed swapping-entire-processes strategy to try to improve overall system efficiency under incredibly high paging loads.


 Post subject: Re: yep
PostPosted: Tue Sep 14, 2004 3:43 pm 
Offline
Senior Member

Joined: Fri Aug 06, 2004 5:49 pm
Posts: 158
bji wrote:
Unless your system is running in a very, very high-load/low-memory situation and is truly swapping (which means writing out entire process memory spaces into swap at a time when there is very high contention for memory), you're not going to be writing entire processes to the disk.


Did I mention the situation I was outlining was in fact a high-load/low-memory situation? Granted, I did exaggerate the figures significantly, but it's the same situation, and that is in fact the I/O that's presenting the problem on a lot of people's Linodes.


 Post subject: Re: yep
PostPosted: Tue Sep 14, 2004 8:58 pm 
Offline
Senior Member

Joined: Thu Aug 28, 2003 12:57 am
Posts: 273
tierra wrote:
bji wrote:
Unless your system is running in a very, very high-load/low-memory situation and is truly swapping (which means writing out entire process memory spaces into swap at a time when there is very high contention for memory), you're not going to be writing entire processes to the disk.


Did I mention the situation I was outlining was in fact a high-load/low-memory situation? Granted, I did exaggerate the figures significantly, but it's the same situation, and that is in fact the I/O that's presenting the problem on a lot of people's Linodes.


Your example stated that you have 64 MB of memory used (presumably with no file cache?) and 80 MB of swap. That in and of itself does not mean that the system will be in a state where it needs to invoke whole-process-swapping. You can comfortably use all of your memory and most of your swap if the total number of pages in active use is not a whole lot larger than physical memory. By active use I mean, being touched all of the time. Your example of hitting web pages and using mysql would have to see a server that is severely overloaded to be hitting that many pages of that many processes so frequently as to require whole-process-swapping.

If your Linode is reaching this state then your server is definitely beyond a Linode's capabilities. Probably you should be using a dedicated server with a LOT of memory. (and by the way I don't mean *you* specifically, I just mean anyone in general).

I guess my point is that I don't think that any evidence has really been given that would indicate that this person's Linode is under so much duress that the operating system is doing true "swapping". But I could very easily be wrong; certainly some Linodes do seem to go swap-crazy and Caker has to go shut them down; I've experienced the effect of such a situation on my Linode many times and it's no fun. But I always assumed that even in those cases, it was just that the offending Linode was *paging* a tremendous amount, not *swapping*. But once again, I could be wrong.


 Post subject: QoS baby!
PostPosted: Tue Sep 14, 2004 9:44 pm 
Offline
Linode Staff
User avatar

Joined: Tue Apr 15, 2003 6:24 pm
Posts: 3090
Website: http://www.linode.com/
Location: Galloway, NJ
Having a two-tiered limiter won't solve anything. It's a little more complicated considering that I/O throughput changes given the circumstances.

If a single-threaded read can sustain 50MB/sec, then two threads can only achieve 25MB/sec each. Considering that they are likely reading from different parts of the disk, you have to calculate in the access times, so suddenly two threads only achieve 20MB/sec each. Multiply that by many threads (read: Linodes) and suddenly throughput can be in the K/sec range. Yeah, RAID-1 can read from both disks independently, but you get the point.
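
A back-of-the-envelope Python sketch of that arithmetic follows. The model and its parameters are assumptions chosen only to reproduce the figures above: every 2 MB chunk read by a competing thread is charged one 10 ms seek, and the resulting aggregate is split evenly. Real disks and schedulers are messier than this.

```python
def per_thread_throughput(n_threads, seq_rate_mb_s=50.0, seek_s=0.010, chunk_mb=2.0):
    """Estimate MB/sec per thread when n_threads compete for one disk."""
    if n_threads == 1:
        return seq_rate_mb_s  # a lone sequential reader pays no extra seeks
    # Time per chunk = transfer time + one seek to get back to this thread's data.
    time_per_chunk = chunk_mb / seq_rate_mb_s + seek_s
    aggregate = chunk_mb / time_per_chunk
    return aggregate / n_threads
```

Under these assumptions two threads get about 20 MB/sec each and 100 threads about 0.4 MB/sec each; shrink the chunk size and the numbers drop much further.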

There's no point in limiting the I/O of a Linode if the host is sitting there idle, just because they've run out of tokens. What I really need to do is factor a few variables into an I/O watch-dog that dynamically changes the io_token_refill and io_token_max rates depending on a few factors. At that point, the refill rate and the bucket size values would be equal, turning the limiter into a rate-limiter rather than a token-bucket.

The ideal situation is that the limiter is set with sky-high values, and only buckles down when it's time to share or a Linode is behaving badly.
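
The token-bucket versus rate-limiter distinction can be seen in a few lines of Python. This is a simplified simulation, not the host's limiter code: io_token_refill and io_token_max are the names used above, but treating them as per-tick quantities and refilling once per tick are assumptions of this sketch.

```python
def simulate_limiter(demand, io_token_refill, io_token_max):
    """Return the I/O served per tick for a list of per-tick demands."""
    tokens = io_token_max  # bucket starts full
    served = []
    for want in demand:
        grant = min(want, tokens)  # serve as much as the bucket allows
        tokens -= grant
        served.append(grant)
        # Refill after each tick, capped at the bucket size.
        tokens = min(io_token_max, tokens + io_token_refill)
    return served
```

With demand [50, 50, 50, 0, 50], a refill of 10 and a bucket of 100 serves [50, 50, 20, 0, 20]: a burst is allowed, then throttling kicks in. With refill and bucket both 10 it serves [10, 10, 10, 0, 10]: no saved-up burst, just a flat cap per tick.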

The variables would be something like:

- The current load of the host server itself, measured by how many processes are waiting for I/O.

- The amount of swap usage for each Linode, which should factor in as a penalty. I have seen Linodes that have a lot in swap but don't thrash, but more often than not they're the ones causing high I/O.

- The I/O usage history of each Linode over a specified amount of time. This would prevent Linodes that occasionally use lots of I/O during normal operation, say while untarring a large file, from being I/O limited. This variable would also help negate the aforementioned swap penalty for those that have lots in swap but don't thrash.

My goals are for swap-thrashing Linodes to be given lower throughput values automatically, to eliminate the useless limiting of well-behaved Linodes, and to always have a small supply of I/O bandwidth available for the host itself.
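
One way the three variables could combine, as a purely hypothetical Python sketch. Nothing here is implemented anywhere: the function name, the formula and the 0-to-1 scaling of the swap and history inputs are all assumptions, chosen so that an idle host limits nobody and swap only counts against a Linode that also has a history of heavy I/O.

```python
def adjusted_refill(base_refill, host_io_waiters, swap_used_fraction,
                    recent_io_fraction, min_refill=1.0):
    """Scale a Linode's token refill rate from the three watch-dog inputs.

    host_io_waiters:    processes on the host currently waiting for I/O
    swap_used_fraction: share of the Linode's swap in use, 0.0 to 1.0
    recent_io_fraction: share of the history window spent doing heavy I/O, 0.0 to 1.0
    """
    if host_io_waiters == 0:
        return base_refill  # idle host: no reason to limit anyone
    # Swap usage only hurts when the I/O history shows sustained heavy use.
    penalty = swap_used_fraction * recent_io_fraction
    refill = base_refill / (1.0 + host_io_waiters * penalty)
    return max(min_refill, refill)
```

A Linode with half its swap in use but no heavy I/O in its history keeps the full rate even on a busy host, while one with half its swap in use and heavy I/O half the time gets cut to 500 from a base of 1000 when four processes are waiting on the host.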

-Chris


 Post subject: Re: yep
PostPosted: Tue Sep 14, 2004 10:45 pm 
Offline
Senior Member

Joined: Fri Aug 06, 2004 5:49 pm
Posts: 158
bji wrote:
I guess my point is that I don't think that any evidence has really been given that would indicate that this person's Linode is under so much duress that the operating system is doing true "swapping". But I could very easily be wrong; certainly some Linodes do seem to go swap-crazy and Caker has to go shut them down; I've experienced the effect of such a situation on my Linode many times and it's no fun. But I always assumed that even in those cases, it was just that the offending Linode was *paging* a tremendous amount, not *swapping*. But once again, I could be wrong.


Well, the whole reason it came up was because kiomava mentioned something about I/O jumps when running the postgres table maintenance script. And coming from the world of MySQL, I can't imagine Postgres needing to do that much I/O every day to keep up table maintenance, so I was taking guesses at what else it could have been, and my number one guess was swapping. I could be wrong, Postgres may actually just have to do that much work, but I'm just not used to those figures.

