Linode Community Forums
PostPosted: Wed Mar 23, 2011 3:52 pm
Senior Newbie

Joined: Mon Mar 07, 2011 6:49 am
Posts: 9
Excuse me OZ, I did not explain myself properly. I don't want to come across as a troll or someone throwing around accusations and demands; I know that will get me nowhere. I just would like to be sure about those two weeks, and to ask Linode to implement a process that makes I/O problems visible to support staff.

My server was swapping before March 5 and is swapping now at its "standard" rate. It is indeed short on memory and I need to fix that, hopefully with the help of the community. Maybe I will be able to help someone else in the future.

What I infer from my logs is this: on March 5, the data link from my host to storage got busy, and all disk operations slowed down. I could have fought forever with tuning and it would not have helped, because the disk was slow.

monthly graph of I/O wait:
http://cgp.multi.obin.org/detail.php?p= ... =800&y=350

monthly graph of swap:
http://cgp.multi.obin.org/detail.php?p= ... =800&y=350

monthly graph of memory consumption:
http://cgp.multi.obin.org/detail.php?p= ... =800&y=350

Looking at it from my (non-sysadmin) point of view: if the slow performance were caused by the VPS running out of memory, it would show up as higher memory consumption and/or higher swap usage. Neither occurred. Also, I did not change anything on my system on March 5.


PostPosted: Wed Mar 23, 2011 4:22 pm
Senior Member

Joined: Fri May 02, 2008 8:44 pm
Posts: 1121
Your graphs indeed seem to show more or less consistent memory and swap usage. However, swap usage (how much data you have in there) is different from swap activity (how much data you're moving in and out of there per second). It's okay to have high swap usage if there isn't much activity, but a combination of high swap usage and activity is bad.

To see how much swap activity you're having, you can run vmstat 5 for a minute or two, hit Ctrl+C to stop it, then look at the "si" and "so" columns. If everything is OK, those values should be really small. Run the test when your server is fast, and again when it's slow. Post the screenshots and I'm sure somebody will be able to figure out what's going on.
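(A hedged illustration, not part of the original post: the si/so columns from `vmstat 5` output can also be pulled out with a small script rather than eyeballed. The sample output below is fabricated for demonstration; real numbers will differ.)

```python
# Sketch: extract the "si" (swap-in) and "so" (swap-out) columns from
# `vmstat 5` output. The sample text below is made up for illustration.
sample = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0 204800  51200  10240  98304    0    2    15    40  120  210  5  2 90  3
 0  1 204800  49152  10240  97280  310  450    80   600  400  800  8  6 40 46
"""

def swap_activity(vmstat_output):
    lines = vmstat_output.splitlines()
    header = lines[1].split()                      # column names: r, b, swpd, ...
    si_col, so_col = header.index("si"), header.index("so")
    rows = [line.split() for line in lines[2:]]
    return [(int(r[si_col]), int(r[so_col])) for r in rows]

print(swap_activity(sample))  # [(0, 2), (310, 450)] -> second row is swapping hard
```

A row like `(310, 450)` means hundreds of kilobytes per second moving in and out of swap, which is exactly the "high activity" case described above.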

Here's another test that you can run. Go to the "extras" menu and buy another gig of RAM. Reboot and see if the problem goes away. If it does, you should probably upgrade to a bigger plan. Remember that although only 0.5GB is shown as "used" on the graph, those yellow/blue buffers and cache are also very important. Linux needs plenty of space for them. (If it doesn't work out, you can cancel the extra RAM after a day or two. A prorated refund will be immediately applied to your account.)


PostPosted: Wed Mar 23, 2011 4:51 pm
Senior Newbie

Joined: Mon Mar 07, 2011 6:49 am
Posts: 9
Thank you hybinet. Right now, after Linode moved me to another host, my performance problems are more or less gone. :D


PostPosted: Wed Mar 23, 2011 5:10 pm
Senior Member

Joined: Wed May 13, 2009 1:18 am
Posts: 681
szczym wrote:
Thank you hybinet. Right now, after Linode moved me to another host, my performance problems are more or less gone. :D

Just so you realize: presuming all other things remained similar in your own application stack, that was just luck.

Total I/O bandwidth to local storage is pretty much the same on any Linode host, so if you are doing the same I/O on the old and new hosts but getting better performance from the latter, it likely just means that the other guests on your new host are using less of it than those on your old host were. But nothing guarantees that will remain the case, especially since your current situation may have been helped, at least in part, by being moved to a newer host that may not be fully occupied yet.

The I/O bandwidth is shared fairly among those guests trying to use it, but is probably the most constrained resource. So if your current application stack requires a significant amount of I/O (which you would need to determine) you may just have bought a little time until you run into more contention, especially if the host you were moved to was newer and thus has fewer guests at the moment.

Now, it could also have been that some other guest on your old host was a heavy I/O user, which can adversely impact even modest users. I had a Linode, for example, that would consistently hit large I/O wait percentages even though it barely did any I/O itself and never swapped. But given Caker's comment about your Linode's typical usage compared to others on its host (both old and new), it seems that you are the heavy user in both places. I suspect the odds favor your performance degrading over time. Just realize that has nothing to do with slow storage per se; it's a shared resource that your setup needs a lot of, and it may not always be available when split among others on your host.

If I were in your shoes, I'd use the "reprieve" you have gotten by moving hosts to analyze and tune your application stack to reduce the I/O requirements as much as possible, making it more likely your performance will remain good over time. If the issue then happens again, you'll know that you're about as efficient as you can be and might need to consider a plan upgrade instead.

-- David


PostPosted: Wed Mar 30, 2011 7:27 am 
Newbie

Joined: Tue Mar 29, 2011 8:06 am
Posts: 2
I only found this forum because I'm always googling around to see who's using collectl and saw a recent post that mentioned it. I read this thread with interest, and all I can say is: beware of graphs built from coarse data. I'm not saying any of the analysis is wrong, but basing it on coarsely sampled data can be risky.

I've looked at some of the graphs in this topic and see a multi-month plot followed by a declaration that everything is fine. Personally, I'd never be willing to draw that conclusion from such minimal data. When a single blip on a graph represents multiple hours or more, how can you trust it to mean anything?

Consider those who run sar with a 10-minute sampling rate. I'd claim they too are fooling themselves. What if there is a 2-minute CPU spike of 100%? They'll never see it, and during those 2 minutes the system will be crawling. The same goes for networks, disks, etc.
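(A quick illustrative sketch, not from the original post, of how a coarse sampling rate averages away exactly the spike described above; all numbers are made up for the example.)

```python
# One hour of CPU utilisation at 10-second resolution: idle (~5%) except
# for a 2-minute burst at 100%.
samples = [5.0] * 360                 # 360 ten-second samples = 1 hour
for i in range(120, 132):             # 12 samples * 10 s = a 2-minute spike
    samples[i] = 100.0

# A 10-minute sampling rate effectively averages 60 of these samples.
ten_min_avgs = [sum(samples[i:i + 60]) / 60 for i in range(0, 360, 60)]

print(max(samples))       # 100.0 -> the spike is obvious at 10 s resolution
print(max(ten_min_avgs))  # 24.0  -> at 10-minute resolution it looks mild
```

The 100% burst shows up as a harmless-looking 24% bump in the 10-minute view, which is exactly why a coarse graph can declare everything fine while the system was crawling.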

That's the reason collectl's sampling rate is 10 seconds, and sometimes I even run it at 1 second. And before anyone gets excited and says that will generate too much load, let me say that collectl uses less than 0.1% of the CPU at the 10-second rate. Since all these tools have about the same level of overhead, use whichever tool you like at that level. All you need to do is run it as a daemon and forget it's there until you have a problem. Then you have enough detail to see what is really happening.

But now there's the problem of plotting the data. I also see all those 'pretty' plots rrd draws, BUT they are far from accurate if you throw a lot of data at them, because they 'normalize' (consolidate) the data and, as a result, information is lost.

I say forget pretty and use a tool like gnuplot. At the very least, if you have 8000 data points (that's one every 10 seconds) and one of them is a spike, you WILL see it, and for my money (and this stuff is all free) I'll take accurate over pretty every time.
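(A hedged sketch of the consolidation problem described above, not from the original post: when many points are squeezed into few pixels, averaging erases a spike, while taking the per-bucket maximum preserves it. The data is fabricated for illustration.)

```python
# 8000 ten-second samples, flat except for a single spike.
points = [2.0] * 8000
points[4321] = 95.0                    # one 10-second spike

def consolidate(data, buckets, reduce_fn):
    # Squeeze len(data) points into `buckets` values, one per bucket.
    size = len(data) // buckets
    return [reduce_fn(data[i * size:(i + 1) * size]) for i in range(buckets)]

avg_view = consolidate(points, 800, lambda b: sum(b) / len(b))
max_view = consolidate(points, 800, max)

print(max(avg_view))   # 11.3 -> averaging flattens the spike
print(max(max_view))   # 95.0 -> max-consolidation keeps it visible
```

Plotting every raw point, as gnuplot will happily do, is equivalent to the max view: the spike cannot be averaged away.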

-mark


Powered by phpBB® Forum Software © phpBB Group