Linode Forum
Linode Community Forums
Posted: Wed Nov 03, 2010 5:04 pm
Junior Member

Joined: Wed Nov 03, 2010 4:55 pm
Posts: 28
Location: 55
I'm interested in setting up a high-availability, cross-datacenter, load-balanced cluster of servers for a single website.

The goal is simple: I want to have one website (with a lot of visitors) hosted on a web server like Nginx, with multiple servers across several datacenters.

One of the issues is that all these servers need to keep the same database in sync.

Another issue is that load should be shared evenly across these datacenters, while preferring the closest server to each visitor if resource usage is fairly balanced.

So, how would one set up such load balancing while keeping one synced database for high availability? (I'm thinking about solutions like Redis, CouchDB, MongoDB.)

And how would one at the same time make sure that visitors are pointed to the server closest to them, unless another server is using significantly fewer resources?

Has this been done before, or maybe even documented?

Any suggestions are appreciated.

Thanks.


Posted: Wed Nov 03, 2010 5:08 pm
Senior Member

Joined: Sat Mar 28, 2009 4:23 pm
Posts: 415
Website: http://jedsmith.org/
Location: Out of his depth and job-hopping without a clue about network security fundamentals
tommedema wrote:
high availability cross-datacenter

Just to let you know, the IP-based high availability stuff only works within the same datacenter (since our IP addresses are routed to a single facility). To accomplish cross-datacenter HA, you'd be using DNS and your servers would have different IP addresses.

Just clearing that up.


Posted: Wed Nov 03, 2010 5:19 pm
Junior Member

Joined: Wed Nov 03, 2010 4:55 pm
Posts: 28
Location: 55
Alright, but it is possible to set up high availability across datacenters, right?


Posted: Wed Nov 03, 2010 5:47 pm
Senior Member

Joined: Tue May 26, 2009 3:29 pm
Posts: 1691
Location: Montreal, QC
On a Linode, I think all of your options when it comes to high availability are going to involve DNS. Round-robin can reduce the impact of a server going down (4 servers, one goes down, 75% of initial requests still make it through), and low DNS TTLs can reduce the amount of time that visitors keep being sent to the downed server.

As for load balancing, you have three options:

1) DNS round-robin. Not perfect load balancing since it doesn't take load into account, but it can help spread the load

2) GeoDNS. Not always accurate, since it directs people based on where their DNS resolver is, not where they are. For example, I use Google DNS from Montreal, and my ISP routes me through Toronto. I have no idea where a GeoDNS solution would think I live!

3) Load balancing redirects. The idea is that you have frontline server(s) that use some strategy to decide what application servers the customer is redirected to. If you've ever seen your browser going to www1, www2, www3, etc, then they're probably doing something like this.

Of course there is always the option of load-balancing different resources to different places. An application server, a database server, a content server, etc. Keep in mind, though, that all linodes in an account, no matter what datacenter, will share a common bandwidth pool; you don't need to load-balance for that.
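
To make the round-robin arithmetic above concrete, here is a minimal sketch. It assumes a strict rotation over four placeholder addresses (real resolvers and clients do not rotate this evenly) and simply counts how many first attempts land on a live server:

```python
from itertools import cycle


def round_robin_success_rate(servers, down, requests=1000):
    """Hand out servers in strict rotation and return the fraction of
    first attempts that land on a server that is still up."""
    rotation = cycle(servers)
    ok = sum(1 for _ in range(requests) if next(rotation) not in down)
    return ok / requests


# Placeholder addresses from the documentation range, not real hosts.
servers = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]
rate = round_robin_success_rate(servers, down={"192.0.2.3"})
print(rate)
```

With one of four servers down, three out of every four initial requests still reach a live server.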


Posted: Fri Nov 05, 2010 11:12 am
Junior Member

Joined: Wed Nov 03, 2010 4:55 pm
Posts: 28
Location: 55
Thanks. How exactly does Linode limit my options, though? I'm not sure why a cloud server would have any limitations compared to a colocated server.

When I visit google.com, I'm pretty sure it directs me to a nearby server, as my latency to it is always extremely low.

Do you have any idea how they accomplish this? GeoDNS?


Posted: Fri Nov 05, 2010 1:20 pm
Senior Member

Joined: Tue May 26, 2009 3:29 pm
Posts: 1691
Location: Montreal, QC
Limited in that you don't control your own routing, so you can't do anycast or geocast (probably what Google is doing). You're stuck with using DNS or software to do your balancing. One option that I didn't mention is that it's possible to have your DNS server direct people based on logic, so you could have a DNS server send people to the servers with the lowest load. I think that DNS caching would make that impractical, though.

To use GeoDNS, you'd need to sign up with a DNS provider that supports it.
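
A rough illustration of why caching gets in the way of load-based DNS answers, as a sketch only. The `CachingResolver` class, the server names and the load numbers are all made up for the example; it models a resolver that keeps an answer until its TTL runs out:

```python
def lowest_load(loads):
    """Return the name of the server reporting the lowest load."""
    return min(loads, key=loads.get)


class CachingResolver:
    """Hypothetical resolver that reuses an answer until the TTL expires."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.answer = None
        self.expires = 0

    def resolve(self, now, loads):
        # Only ask the load-aware DNS server again once the cache has expired.
        if self.answer is None or now >= self.expires:
            self.answer = lowest_load(loads)
            self.expires = now + self.ttl
        return self.answer


resolver = CachingResolver(ttl=300)
first = resolver.resolve(now=0, loads={"a": 0.2, "b": 0.9})
# A minute later the loads have flipped, but the cached answer is reused.
stale = resolver.resolve(now=60, loads={"a": 0.95, "b": 0.1})
# After the TTL expires the resolver finally picks up the new answer.
fresh = resolver.resolve(now=300, loads={"a": 0.95, "b": 0.1})
print(first, stale, fresh)
```

A lower TTL shortens the stale window, at the cost of more DNS queries.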


Posted: Fri Nov 05, 2010 1:25 pm
Senior Member

Joined: Tue May 26, 2009 3:29 pm
Posts: 1691
Location: Montreal, QC
Personally, I'd go for the following approach:

1) DNS round-robin for the initial load spreading. As mentioned, this also reduces the impact of a downed server
2) Low TTL on DNS so that you can update DNS to take a downed server out of the rotation quickly
3) Have the application servers themselves redirect a percentage of users if a server is overloaded, since DNS round-robin does not produce equal load balancing. Need custom code for this, so that each application server is aware of the load of the other servers, so that it can make the decision of where to redirect to.
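
Step 2 could look something like the sketch below. It assumes some monitoring check exists (the `is_healthy` callable is a stand-in, as are the addresses) and that the result is pushed to whatever DNS provider is in use, which is not shown. The worst-case figure is a simple estimate that assumes resolvers honour the TTL:

```python
def active_rotation(servers, is_healthy):
    """Return the servers that should stay in the DNS rotation."""
    healthy = [s for s in servers if is_healthy(s)]
    # If every check fails, keep the full list: the monitor may be the broken part.
    return healthy or list(servers)


def worst_case_stale_seconds(check_interval, ttl):
    """Rough upper bound on how long visitors may still be sent to a downed
    server: one missed check interval plus one full TTL of cached answers."""
    return check_interval + ttl


# Hypothetical health-check results keyed by placeholder address.
status = {
    "192.0.2.1": True,
    "192.0.2.2": True,
    "192.0.2.3": False,
    "192.0.2.4": True,
}
rotation = active_rotation(list(status), status.get)
print(rotation)
print(worst_case_stale_seconds(check_interval=60, ttl=300))
```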

I'm not an expert in HA systems, though.


Posted: Sun Nov 07, 2010 9:41 am
Junior Member

Joined: Wed Nov 03, 2010 4:55 pm
Posts: 28
Location: 55
Thanks.

Bit of a surprise though there's only one replier in this thread. :)

What I don't understand is that you need a server to appoint users to the least heavily used servers. What if this appointing server goes down?


Posted: Sun Nov 07, 2010 11:29 am
Senior Member

Joined: Sun Mar 07, 2010 7:47 pm
Posts: 1970
Website: http://www.rwky.net
Location: Earth
Have two.

Or even better, 4!

Two in datacenter A which have IP failover between each other, two in datacenter B with the same setup, and DNS round robin between the two datacenters.
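
As a sketch of that four-node layout (the node names and addresses are invented, and the failover itself would be handled by whatever IP failover tooling is in use, not by this code): each datacenter publishes one shared IP in DNS, and whichever local node is alive answers on it.

```python
# Hypothetical layout: one shared (failover) IP per datacenter, two nodes behind each.
DATACENTERS = {
    "A": {"shared_ip": "192.0.2.10", "nodes": ["a1", "a2"]},
    "B": {"shared_ip": "198.51.100.10", "nodes": ["b1", "b2"]},
}


def dns_records(datacenters):
    """The round-robin A records: one shared IP per datacenter."""
    return [dc["shared_ip"] for dc in datacenters.values()]


def serving_node(datacenter, alive):
    """The first live node holds the shared IP; None means the whole site is down."""
    for node in datacenter["nodes"]:
        if node in alive:
            return node
    return None


alive = {"a2", "b1", "b2"}  # a1 has failed
print(dns_records(DATACENTERS))
print(serving_node(DATACENTERS["A"], alive))
print(serving_node(DATACENTERS["B"], alive))
```

Losing a single node changes nothing in DNS; only losing both nodes in one datacenter leaves a dead address in the rotation.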



Posted: Mon Nov 08, 2010 9:54 am
Junior Member

Joined: Wed Nov 03, 2010 4:55 pm
Posts: 28
Location: 55
So how would one be appointed to the correct datacenter depending on 1) its load and, if nothing is critical, 2) the user's proximity to the server?


Posted: Mon Nov 08, 2010 11:07 am
Senior Member

Joined: Tue May 26, 2009 3:29 pm
Posts: 1691
Location: Montreal, QC
It doesn't have to be one server doing the appointing. All servers can do it.

DNS round-robin will go most of the way towards getting load spread. Each application server can then decide if a user should be served or passed on to a different server with a lower load.

To do this, you'll need to have the current load of each server known by each server so that each server (what a mouthful) can make the decision of "I'm overloaded, better pass off this user to another server. Hey, server C has the lowest load, I'll send him there."
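
The decision each server makes could be sketched like this. The 0.8 threshold, the server names and the load numbers are assumptions for illustration, and how the servers share their load figures with each other is left out:

```python
def choose_server(me, loads, threshold=0.8):
    """Decide which server should handle a request that arrived at `me`.

    `loads` maps every server name (including `me`) to its current load
    as a fraction between 0 and 1.
    """
    if loads[me] < threshold:
        return me  # not overloaded, serve the user locally
    # Overloaded: hand off to the least loaded server. If that is still
    # `me`, everyone else is worse off and the request stays here.
    return min(loads, key=loads.get)


loads = {"a": 0.9, "b": 0.5, "c": 0.2}
print(choose_server("a", loads))  # a is overloaded
print(choose_server("b", loads))  # b is fine
```

The server that gets a name other than its own back would then answer with an HTTP redirect to that server's hostname.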


Posted: Fri Dec 10, 2010 9:40 am
Senior Newbie

Joined: Fri Dec 10, 2010 9:10 am
Posts: 6
Quote:
So how would one be appointed to the correct datacenter depending on 1) its load and, if nothing is critical, 2) the user's proximity to the server?


I recently found the silver bullet to this question

http://www.ultradns.com/solutions/sitebacker.html

It's a commercial service and I've not used it yet but from what I've read it does look like it will do what you're looking for. You can pay more money and get users redirected to local DCs also.

DynDNS also do a similar product I believe, not looked into this one too much yet.


Posted: Fri Dec 10, 2010 11:27 am
Senior Member

Joined: Tue May 26, 2009 3:29 pm
Posts: 1691
Location: Montreal, QC
That automates it, but doesn't necessarily offer anything you can't get out of a monitoring solution and an on-the-ball admin.


Posted: Fri Dec 10, 2010 10:49 pm
Senior Member

Joined: Sat Aug 30, 2008 1:55 pm
Posts: 1739
Location: Rochester, New York
Monitoring solutions and on-the-ball admins cost money, and adminning reliable customized DNS is not the world's most pleasant and trivial activity.

If it's important enough to require high availability, it's important enough to hire professionals to do the dirty work. Unless you're a professional, in which case everyone else should hire you. :-)


