Linode Community Forums
PostPosted: Wed Aug 26, 2009 12:51 am 
Offline
Newbie

Joined: Wed Aug 26, 2009 12:38 am
Posts: 4
can someone explain why i'm not seeing a (reasonable) performance gain between a 360 and 2880 node?

here is what i benchmarked:

i have 5 text files (mysql dumps), each being ~850MB. i wrote a script that does the following:

- gzip *.sql
- gunzip *.gz
- bzip2 *.sql
- bunzip2 *.bz2

i ran this on both a 360 node and a 2880 node. the 360 finished in ~36mins, the 2880 in ~33.5mins.
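The steps above amount to something like the following sketch (sample.sql here is a tiny stand-in for the five ~850MB dumps; the real test wrapped each pass in time(1)):

```shell
# Requires gzip and bzip2; bail out quietly if either is missing.
command -v gzip >/dev/null 2>&1 || exit 0
command -v bzip2 >/dev/null 2>&1 || exit 0

# Tiny stand-in for the real ~850MB MySQL dumps.
printf 'INSERT INTO t VALUES (1);\n' > sample.sql

gzip sample.sql          # sample.sql     -> sample.sql.gz
gunzip sample.sql.gz     # sample.sql.gz  -> sample.sql
bzip2 sample.sql         # sample.sql     -> sample.sql.bz2
bunzip2 sample.sql.bz2   # sample.sql.bz2 -> sample.sql
```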

with an 8X ratio of CPU power, i expected much more than a 7% savings in runtime. as far as i can tell, my nodes are practically identical (running on ubuntu 8.04LTS if that makes a diff).

can someone tell me what's going on? is my test invalid? is the 360 unfairly spiking or something? is there something i can do to make the test fair? or is this expected performance?

thanks in advance!


PostPosted: Wed Aug 26, 2009 1:24 am 
Offline
Senior Member

Joined: Wed Apr 11, 2007 8:23 pm
Posts: 76
On your Linode 360, does your CPU graph show a 350%+ spike? Linodes have burstable CPU: if no one else on the host needs the CPU at the moment, you're free to use as much of it as you want. As soon as the other guests need it, though, they get priority.

What you may be seeing is that your 360 was taking up 90% of the CPU/IO of the host because no one else was using it. Since it's called "burstable", relying on that for extended periods is frowned upon.

Perhaps someone else could explain it better.


PostPosted: Wed Aug 26, 2009 2:43 am 
Offline
Senior Member

Joined: Wed May 13, 2009 1:18 am
Posts: 681
Smark wrote:
What you may be seeing is that your 360 was taking up 90% of the CPU/IO of the host because no one else was using it. Since it's called "burstable", relying on that for extended periods is frowned upon.

I'm not sure there's any reason to avoid using the CPU if you can get it. It's shared equally among all the nodes on the host if there is contention, so if you can get 400%, nobody else wants it. If someone else is burning CPU at the same time, you'll share equally and you won't get to 400%.

To schmingle: for purely CPU-bound tasks, an otherwise idle 360 and an idle 2880 probably won't differ much, since, as Smark points out, Xen shares the CPU equally among all its nodes (well, up to 4 CPUs per node), so an individual node can get a significant amount of CPU if the host is otherwise idle.

What your test won't show is the average CPU or disk availability over time. Best case, the 360 and 2880 are similar; worst case is far worse for the 360 than the 2880, given roughly 40 nodes competing on a 360 host versus 5 on a 2880 host. So it's really a question of buying assurance of resource availability.

It's not clear whether your test was I/O- or CPU-bound, but on average you should see less contention for both resources on a 2880 host (again, fewer nodes competing). Disk contention in particular can be a performance killer for larger database tasks and other disk-heavy operations. The machines themselves are largely the same, though, so if other nodes aren't competing for the disk at a given moment, the raw performance of a 360 is likely similar to a 2880.

And of course, you have a lot more memory to work with (which is the only real guaranteed resource under the xen setup). That in and of itself could be reason enough to get the larger configuration.

In short, depending on your workload, a 2880 won't obviously boost peak performance over a 360. That's actually something I find very attractive about Linode, since it lets you stay economical while, on average, getting more bang for the buck in performance. I'm guessing it's rare for all the nodes on a given 360 host to saturate the CPU or disk simultaneously.

But worst case for a 360 will be much worse than worst case for a 2880, so it's a question of odds of hitting the worse case, and how much it will impact your application should it happen.

-- David


PostPosted: Wed Aug 26, 2009 3:46 am 
Offline
Senior Member

Joined: Sat Mar 28, 2009 4:23 pm
Posts: 415
Website: http://jedsmith.org/
Smark wrote:
Linodes have burstable CPU,

They do?

schmingle wrote:
with an 8X ratio of CPU power, i expected much more than a 7% savings in runtime.

Ratio between what? Linode plans are identical in CPU; the only provisioned difference is how much RAM is assigned to you. Since your test cases here are mostly I/O-bound (and I/O is fairly consistent across plans as well), I would expect these numbers.

We don't take the approach of some other providers and give you a slice of the CPU proportional to your plan size. You get the full computing power of the host server at all times. There are reserves built into the system to prevent you from pegging other Linodes to doom and back, and this also explains why CPU graphs go to 400%.

_________________
Disclaimer: I am no longer employed by Linode; opinions are my own alone.


PostPosted: Wed Aug 26, 2009 3:48 am 
Offline
Junior Member

Joined: Sun Nov 16, 2008 4:35 am
Posts: 38
db3l wrote:
I'm not sure there's any reason to avoid using the CPU if you can get it. It's shared equally among all the nodes on the host if there is contention, so if you can get 400%, nobody else wants it. If someone else is burning CPU at the same time, you'll share equally and you won't get to 400%.


If memory serves, this is mostly a leftover attitude from the UML days, when CPU sharing wasn't quite as solid.

That aside, in my opinion there is a theoretical downside to people doing heavy sustained workloads, albeit maybe not for the person doing them.

If most Linodes have a bursty/interactive workload, where how fast a job gets done (e.g. loading a web page) matters for the user experience, then on a host populated mainly by such "interactive" Linodes, a significant amount of host CPU will typically be available whenever a request or group of requests comes in, since the odds of any two Linodes needing to complete non-trivial requests at the exact same moment are usually fairly low. All Linodes on such a host benefit through faster average response times.

This would break down if you have a lot of heavy, sustained work going on in other linodes.

As I said, though, this is all theory, and it wouldn't make much difference if there were only a couple of heavy users on a host. I don't have any real-world data on what the typical Linode workload is, nor on whether there are hosts that chronically lack burstable CPU due to clusters of heavy sustained users.


PostPosted: Wed Aug 26, 2009 4:29 am 
Offline
Newbie

Joined: Wed Aug 26, 2009 12:38 am
Posts: 4
jed wrote:
Linode plans are identical in CPU; the only provisioned difference is how much RAM is assigned to you.


thanks for clarifying that, jed. shame on me for my ignorance, thinking that those little green boxes represented proportional CPU power. i'm coming from EC2 so i just assumed that's what it meant. it sounds like i may be downgrading my node then.

jed wrote:
I/O is fairly consistent across plans as well


jed, could you go into more detail about this? i actually opened a support ticket asking about this and got the following response:

"Linodes on larger plans have a lower contention ratio on the hosts, so better IO performance (especially when compared to a 360 host) is quite probable."

thanks for taking the time to answer my questions. i appreciate your help in evaluating hosting solutions for my company.


PostPosted: Wed Aug 26, 2009 10:49 am 
Offline
Senior Member

Joined: Wed Feb 13, 2008 2:40 pm
Posts: 126
Lower plans have more nodes per physical machine, so there is usually more competition for disk I/O, which is the real killer on a VPS.


PostPosted: Thu Aug 27, 2009 4:34 pm 
Offline
Senior Member
User avatar

Joined: Tue May 26, 2009 3:29 pm
Posts: 1691
Location: Montreal, QC
On a somewhat related topic, I love pigz and pbzip2. They're parallel (multithreaded) implementations of gzip and bzip2 that scale nearly linearly with core count. They're also drop-in replacements, accepting the same basic flags and producing compatible output.
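A sketch of swapping them in for the benchmark above (assumes pigz and pbzip2 are installed; file names are illustrative):

```shell
# Bail out quietly if the parallel tools aren't installed.
command -v pigz >/dev/null 2>&1 || exit 0
command -v pbzip2 >/dev/null 2>&1 || exit 0

printf 'SELECT 1;\n' > demo.sql
pigz demo.sql            # parallel gzip: demo.sql -> demo.sql.gz
unpigz demo.sql.gz       # and back again
pbzip2 demo.sql          # parallel bzip2: demo.sql -> demo.sql.bz2
pbzip2 -d demo.sql.bz2   # and back again
```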


PostPosted: Thu Aug 27, 2009 6:08 pm 
Offline
Senior Member

Joined: Fri May 02, 2008 8:44 pm
Posts: 1121
You will see a remarkable improvement in performance if you have, for example, a 1.5GB database. The 2880 can keep the entire database in memory and give you blazing fast responses to whatever query you throw at it, whereas the 360 will need to read from disk all the time.


PostPosted: Thu Aug 27, 2009 6:53 pm 
Offline
Newbie

Joined: Wed Aug 26, 2009 12:38 am
Posts: 4
hybinet wrote:
You will see a remarkable improvement in performance if you have, for example, a 1.5GB database. The 2880 can keep the entire database in memory and give you blazing fast responses to whatever query you throw at it, whereas the 360 will need to read from disk all the time.


hybinet, could you propose a good test for me to run? i tested some intensive sql scripts and the 360 outperformed the 2880 by quite some margin.

my DB dump is only ~850MB. unfortunately, i can't reveal my scripts. i can tell you however that they involve a lot of updates, inserts, and table swapping.

btw, i also have a small EC2 instance, as well as possibly a large one with EBS storage, that i can also run tests on for benchmarking. i'll be happy to publish the results.


PostPosted: Thu Aug 27, 2009 7:32 pm 
Offline
Senior Member

Joined: Sat Mar 28, 2009 4:23 pm
Posts: 415
Website: http://jedsmith.org/
schmingle wrote:
jed wrote:
I/O is fairly consistent across plans as well


jed, could you go into more detail about this?

Sorry, should have been clearer -- I/O capability does not vary per-plan. Every plan has access to the same hardware.

_________________
Disclaimer: I am no longer employed by Linode; opinions are my own alone.


PostPosted: Fri Aug 28, 2009 10:28 am 
Offline
Junior Member

Joined: Wed May 21, 2008 5:34 am
Posts: 46
Website: http://www.eve-razor.com/forum
Location: Austin, Tx
schmingle wrote:
hybinet wrote:
You will see a remarkable improvement in performance if you have, for example, a 1.5GB database. The 2880 can keep the entire database in memory and give you blazing fast responses to whatever query you throw at it, whereas the 360 will need to read from disk all the time.


hybinet, could you propose a good test for me to run? i tested some intensive sql scripts and the 360 outperformed the 2880 by quite some margin.

my DB dump is only ~850MB. unfortunately, i can't reveal my scripts. i can tell you however that they involve a lot of updates, inserts, and table swapping.

btw, i also have a small EC2 instance, as well as possibly a large one with EBS storage, that i can also run tests on for benchmarking. i'll be happy to publish the results.


I am assuming that when you ran your SQL tests, you had changed the configuration to make use of the extra resources.

Query speeds only improve greatly if MySQL is able to read the tables from memory rather than disk, and you need to have your my.cnf set up to make use of the extra memory.
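For example (values illustrative only, and the relevant knobs depend on your storage engine), a my.cnf on a 2880 might raise the cache sizes to something like:

```ini
[mysqld]
# Illustrative sizes for a 2880 (2880MB RAM); tune for your workload.
innodb_buffer_pool_size = 1536M   # InnoDB data/index cache
key_buffer_size         = 256M    # MyISAM index cache
```

On a 360 these would have to stay far smaller, so an ~850MB working set can't be fully cached.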


PostPosted: Tue Sep 01, 2009 4:17 pm 
Offline
Senior Member
User avatar

Joined: Tue May 26, 2009 3:29 pm
Posts: 1691
Location: Montreal, QC
gzip compresses as a stream (DEFLATE works on small blocks with a 32KB window), not as a solid archive, so it doesn't need or want to keep the entire file in memory at once. Apart from a bit of read-ahead, there should be no performance difference even if you were compressing a 10GB file with 256MB of RAM.


PostPosted: Tue Sep 01, 2009 4:47 pm 
Offline
Senior Member

Joined: Tue Jan 22, 2008 2:10 am
Posts: 103
If you're I/O-bound (you don't say either way), you may be able to get better performance by doing your work in a ramdisk:

Code:
mkdir -p /scratch/directory
mount -t tmpfs -o size=512m none /scratch/directory

Now put temporary files in /scratch/directory - they'll be kept in RAM, reducing disk I/O (the size= option caps how much RAM the ramdisk can consume; without it, tmpfs defaults to half of physical RAM). Of course, this takes away from RAM available to other processes.
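One way to use such a mount without an extra input copy is to stream the compressed output straight into it (paths hypothetical; /tmp/scratch stands in for the tmpfs mount point):

```shell
# Read the dump from disk once; the compressed output lands in RAM.
printf 'SELECT 1;\n' > dump.sql
mkdir -p /tmp/scratch                # stand-in for the tmpfs mount point
gzip -c dump.sql > /tmp/scratch/dump.sql.gz
gzip -t /tmp/scratch/dump.sql.gz     # sanity-check the resulting archive
```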


PostPosted: Tue Sep 01, 2009 5:08 pm 
Offline
Senior Member
User avatar

Joined: Tue May 26, 2009 3:29 pm
Posts: 1691
Location: Montreal, QC
Why would that produce faster performance? The time spent copying the file to the RAM disk before compressing it would seem to negate any possible benefits derived during the actual compression.


Powered by phpBB® Forum Software © phpBB Group