My workplace is in the process of switching from dedicated servers to Linode for hosting our website, and I'm trying to decide which plan to choose for each server. This has been straightforward for most of them, except for the one that will host the main part of our site, a fairly complex Rails application. That application gets a lot of traffic, so I want a good idea of how Linode will perform before migrating. To start, I ordered a 4096 Linode and a 2048 Linode (both in the Newark, NJ datacenter), provisioned them identically with Puppet, and benchmarked them by running ApacheBench from a third Linode against their private IPs. Both Linodes run CentOS 6.5 64-bit with the same configuration profile.
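For reference, the ApacheBench runs were along these lines (the private IP, path, and request counts below are illustrative placeholders, not my exact values):

```shell
# Illustrative ApacheBench run from the third Linode against a private IP.
# TARGET is a hypothetical private IP + path, and -n/-c are example values.
TARGET="http://192.168.134.5/some/page"

if command -v ab >/dev/null 2>&1; then
    # 1000 requests total, 10 concurrent, with keep-alive enabled
    ab -n 1000 -c 10 -k "$TARGET"
else
    # On CentOS, ab ships in the httpd-tools package
    echo "ab is not installed (yum install httpd-tools)" >&2
fi
```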
To my surprise, the 2048 performed significantly better in terms of latency: average response times were about 60% higher on the 4096, with much more variation. The 4096 did deliver better throughput, since it can run more Passenger workers, but latency is my main concern. The parts of the Rails application I'm benchmarking are mostly CPU-bound, so my guess is the noisy-neighbor problem (i.e. the host the 4096 sits on has more tenants competing for CPU than the 2048's host). To check, I ran "sysbench --test=cpu" on both Linodes, and the results seem to confirm my suspicion:
4096 Linode:
Code:
[root@web masonm]# sysbench --test=cpu --cpu-max-prime=100000 --num-threads=2 run
sysbench 0.4.12: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 2
Doing CPU performance benchmark
Threads started!
Done.
Maximum prime number checked in CPU test: 100000
Test execution summary:
    total time:                          266.7814s
    total number of events:              10000
    total time taken by event execution: 533.4837
    per-request statistics:
         min:                                  33.73ms
         avg:                                  53.35ms
         max:                                 202.86ms
         approx.  95 percentile:               89.52ms

Threads fairness:
    events (avg/stddev):           5000.0000/1.00
    execution time (avg/stddev):   266.7419/0.01
2048 Linode:
Code:
[root@web shared]# sysbench --test=cpu --cpu-max-prime=100000 --num-threads=2 run
sysbench 0.4.12: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 2
Doing CPU performance benchmark
Threads started!
Done.
Maximum prime number checked in CPU test: 100000
Test execution summary:
    total time:                          141.9363s
    total number of events:              10000
    total time taken by event execution: 283.8482
    per-request statistics:
         min:                                  27.67ms
         avg:                                  28.38ms
         max:                                  32.22ms
         approx.  95 percentile:               29.39ms

Threads fairness:
    events (avg/stddev):           5000.0000/0.00
    execution time (avg/stddev):   141.9241/0.00
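To put a number on the gap, here's a quick calculation using the average per-request times straight from the two sysbench summaries above:

```shell
# Percent difference in average per-request latency between the two runs.
avg_4096=53.35   # avg from the 4096 Linode's sysbench summary, in ms
avg_2048=28.38   # avg from the 2048 Linode's sysbench summary, in ms

awk -v a="$avg_4096" -v b="$avg_2048" \
    'BEGIN { printf "4096 avg latency is %.0f%% higher than the 2048\n", (a - b) / b * 100 }'
# prints: 4096 avg latency is 88% higher than the 2048
```

So the raw CPU benchmark shows an even bigger gap (~88%) than the ~60% I saw from ApacheBench against the Rails app.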
Is there something else that can explain this? The next thing I'm going to try is switching datacenters, but it feels like I'm missing something.
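One thing worth checking before switching datacenters is CPU steal time, which directly measures how often the hypervisor hands your vCPU's cycles to other guests; it shows up as %st in top and the st column in vmstat. A rough sketch of sampling it from /proc/stat (the steal counter is the 9th field of the aggregate "cpu" line on Linux 2.6.11+):

```shell
#!/bin/bash
# Sample the aggregate CPU counters twice and report %steal over the interval.
[ -r /proc/stat ] || { echo "no /proc/stat on this system" >&2; exit 0; }

read_cpu() {
    # Print: steal-ticks  total-ticks (user+nice+system+idle+iowait+irq+softirq+steal)
    awk '/^cpu /{print $9, $2+$3+$4+$5+$6+$7+$8+$9}' /proc/stat
}

read -r steal1 total1 <<< "$(read_cpu)"
sleep 1
read -r steal2 total2 <<< "$(read_cpu)"

awk -v s=$((steal2 - steal1)) -v t=$((total2 - total1)) \
    'BEGIN { if (t > 0) printf "steal over last interval: %.1f%%\n", s / t * 100 }'
```

If the 4096 consistently shows a few percent of steal while the 2048 sits near zero, that would pretty much confirm the noisy-neighbor theory without needing to move datacenters first.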
EDIT: I got the plans wrong in my original message: the server I said was a 8192 is actually a 4096, and the 4096 is actually a 2048. Sorry for any confusion!