Linode Community Forums
 


Cross-datacenter load balancing and closest server pref.?


 
       Linode Forum Index -> Linux Networking
tommedema



Joined: 03 Nov 2010
Posts: 23
Location: 55

Posted: Wed Nov 03, 2010 4:04 pm    Post subject: Cross-datacenter load balancing and closest server pref.?  

I'm interested in setting up a high-availability, cross-datacenter, load-balanced cluster of servers for a single website.

The goal is simple: I want one website (with a lot of visitors) served by a web server like Nginx, running on multiple servers across several datacenters.

One of the issues is that all these servers need to keep the same database in sync.

Another issue is that load should be spread evenly across these datacenters, while preferring the server closest to each visitor when resource usage is fairly balanced.

So, how would one set up such load balancing while keeping one synced database for high availability? (I'm thinking about solutions like Redis, CouchDB, MongoDB.)

And how would one at the same time make sure that visitors are pointed to the server closest to them, unless another server is using significantly fewer resources?

Has this been done before, or maybe even documented?

Any suggestions are appreciated.

Thanks.
Back to top  
jed



Joined: 28 Mar 2009
Posts: 394
Location: New Jersey

Posted: Wed Nov 03, 2010 4:08 pm    Post subject: Re: Cross-datacenter load balancing and closest server pref.  

tommedema wrote: high availability cross-datacenter
Just to let you know, the IP-based high availability stuff only works within the same datacenter (since our IP addresses are routed to a single facility). To accomplish cross-datacenter HA, you'd be using DNS and your servers would have different IP addresses.

Just clearing that up.
Back to top  
tommedema



Joined: 03 Nov 2010
Posts: 23
Location: 55

Posted: Wed Nov 03, 2010 4:19 pm    Post subject: Re: Cross-datacenter load balancing and closest server pref.  

Alright, but it is possible to set up high availability across datacenters, right?
Back to top  
Guspaz



Joined: 26 May 2009
Posts: 1147
Location: Montreal, QC

Posted: Wed Nov 03, 2010 4:47 pm    Post subject:  

On a Linode, I think all of your options when it comes to high availability are going to involve DNS. Round-robin can reduce the impact of a server going down (4 servers, one goes down, 75% of initial requests still make it through), and low DNS TTLs can reduce the amount of time that clients keep being sent to the downed server once you pull it from the rotation.

As for load balancing, you have three options:

1) DNS round-robin. Not perfect load balancing since it doesn't take load into account, but it can help spread the load

2) GeoDNS. Not accurate, since it directs people based on where their DNS resolver is, not where they are. For example, I use Google DNS from Montreal, and my ISP routes me through Toronto. I have no idea where a GeoDNS solution would think I live!

3) Load balancing redirects. The idea is that you have frontline server(s) that use some strategy to decide what application servers the customer is redirected to. If you've ever seen your browser going to www1, www2, www3, etc, then they're probably doing something like this.

Of course, there is always the option of splitting different resources across different places: an application server, a database server, a content server, etc. Keep in mind, though, that all Linodes in an account, no matter what datacenter, share a common bandwidth pool; you don't need to load-balance for that.
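
To make the "4 servers, one goes down, 75%" figure concrete, here is a minimal sketch of the arithmetic behind plain DNS round-robin. It assumes every address is handed out equally often and that clients do not retry another address; the IP addresses are documentation placeholders and the function names are made up for this illustration.

```python
from itertools import cycle

def surviving_fraction(servers, down):
    """Fraction of first-attempt requests that land on a healthy server
    when every address in the rotation is handed out equally often."""
    healthy = [s for s in servers if s not in down]
    return len(healthy) / len(servers)

def simulate_round_robin(servers, down, requests):
    """Rotate through all addresses, as a round-robin answer would,
    and count how many requests reach a server that is still up."""
    rotation = cycle(servers)
    return sum(1 for _ in range(requests) if next(rotation) not in down)

# Placeholder addresses: two servers in each of two datacenters.
servers = ["192.0.2.10", "192.0.2.20", "198.51.100.10", "198.51.100.20"]
down = {"192.0.2.20"}

print(surviving_fraction(servers, down))          # 0.75
print(simulate_round_robin(servers, down, 1000))  # 750
```

Real resolvers and browsers complicate this picture (some retry other addresses, some cache one answer), so treat the number as a rough upper bound on the damage rather than a guarantee.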
Back to top  
tommedema



Joined: 03 Nov 2010
Posts: 23
Location: 55

Posted: Fri Nov 05, 2010 10:12 am    Post subject:  

Thanks. How exactly does Linode limit my options, though? I'm not sure why a cloud server would impose any limitations compared to a colocated server.

When I visit google.com, I'm pretty sure it directs me to a nearby server, as my latency to it is always extremely low.

Do you have any idea how they accomplish this? GeoDNS?
Back to top  
Guspaz



Joined: 26 May 2009
Posts: 1147
Location: Montreal, QC

Posted: Fri Nov 05, 2010 12:20 pm    Post subject:  

Limited in that you don't control your own routing: you can't do anycast or geocast (probably what Google is doing), so you're stuck with using DNS or software to do your balancing. One option that I didn't mention is that it's possible to have your DNS server direct people based on logic, so you could have a DNS server send people to the servers with the lowest load. I think that DNS caching would make that impractical, though.

To use GeoDNS, you'd need to sign up with a DNS provider that supports it.
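
The caching concern can be illustrated with a toy model. This is only a sketch under stated assumptions: `CachingResolver` and `least_loaded` are hypothetical names, the load figures are invented, and a real recursive resolver is far more involved. It shows how an answer chosen by load stays in use until its TTL expires, even after the load picture has changed.

```python
class CachingResolver:
    """Toy stand-in for a recursive resolver that caches one answer per name."""

    def __init__(self, authoritative):
        self.authoritative = authoritative  # callable: name -> (ip, ttl_seconds)
        self.cache = {}                     # name -> (ip, expires_at)

    def resolve(self, name, now):
        cached = self.cache.get(name)
        if cached and now < cached[1]:
            return cached[0]                # still within TTL: no fresh lookup
        ip, ttl = self.authoritative(name)
        self.cache[name] = (ip, now + ttl)
        return ip

# Hypothetical load figures (0.0 = idle, 1.0 = saturated).
loads = {"192.0.2.10": 0.2, "198.51.100.10": 0.6}

def least_loaded(name, ttl=300):
    """Hypothetical authoritative logic: answer with the least-loaded server."""
    return min(loads, key=loads.get), ttl

resolver = CachingResolver(least_loaded)
first = resolver.resolve("www.example.com", now=0)    # picks 192.0.2.10
loads["192.0.2.10"] = 0.95                             # that server becomes busy
stale = resolver.resolve("www.example.com", now=120)  # cached answer, unchanged
fresh = resolver.resolve("www.example.com", now=301)  # re-queried after expiry
print(first, stale, fresh)
```

Lowering the TTL shrinks the stale window but increases query volume, and some resolvers are known to hold answers longer than the TTL says, which is part of why load-aware DNS is hard to rely on.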
Back to top  
Guspaz



Joined: 26 May 2009
Posts: 1147
Location: Montreal, QC

Posted: Fri Nov 05, 2010 12:25 pm    Post subject:  

Personally, I'd go for the following approach:

1) DNS round-robin for the initial load spreading. As mentioned, this also reduces the impact of a downed server.
2) Low TTL on DNS so that you can update DNS to take a downed server out of the rotation quickly.
3) Have the application servers themselves redirect a percentage of users when a server is overloaded, since DNS round-robin does not produce equal load balancing. This needs custom code so that each application server knows the load of the other servers and can decide where to redirect to.

I'm not an expert in HA systems, though.
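
Step 3 above (redirecting a percentage of users when overloaded) might look roughly like the following. This is a sketch, not a tested production design: the 0.8 threshold, the linear ramp, and the function names are all assumptions made for illustration, and loads are assumed to be normalised to the range 0.0 to 1.0.

```python
import random

def shed_fraction(my_load, lowest_peer_load, threshold=0.8):
    """Share of new visitors this server should redirect elsewhere."""
    if my_load <= threshold or lowest_peer_load >= my_load:
        return 0.0  # not overloaded, or no peer is better off than we are
    # Shed more aggressively the further the load is past the threshold.
    return min(1.0, (my_load - threshold) / (1.0 - threshold))

def should_redirect(my_load, lowest_peer_load, rng=random.random):
    """Randomly decide, per request, whether to redirect this visitor."""
    return rng() < shed_fraction(my_load, lowest_peer_load)

print(shed_fraction(0.50, 0.30))    # 0.0, below the threshold
print(should_redirect(0.50, 0.30))  # False
print(should_redirect(1.00, 0.30))  # True, a saturated server sheds everyone
```

Redirecting only a fraction, rather than everyone, avoids the situation where an overloaded server dumps all of its traffic onto one peer and simply moves the overload somewhere else.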
Back to top  
tommedema



Joined: 03 Nov 2010
Posts: 23
Location: 55

Posted: Sun Nov 07, 2010 8:41 am    Post subject:  

Thanks.

Bit of a surprise, though, that there's only one replier in this thread. :)

What I don't understand is that you need a server to appoint users to the least heavily used servers. What if this appointing server goes down?
Back to top  
obs



Joined: 07 Mar 2010
Posts: 1400
Location: Earth

Posted: Sun Nov 07, 2010 10:29 am    Post subject:  

Have two.

Or even better, 4!

Two in datacenter A with IP failover between each other, two in datacenter B with the same setup, and DNS round-robin between the two datacenters.
Back to top  
tommedema



Joined: 03 Nov 2010
Posts: 23
Location: 55

Posted: Mon Nov 08, 2010 8:54 am    Post subject:  

So how would a user be appointed to the correct datacenter depending on 1) its load and, if nothing is critical, 2) the user's distance to the server?
Back to top  
Guspaz



Joined: 26 May 2009
Posts: 1147
Location: Montreal, QC

Posted: Mon Nov 08, 2010 10:07 am    Post subject:  

It doesn't have to be one server doing the appointing. All servers can do it.

DNS round-robin will go most of the way towards getting load spread. Each application server can then decide if a user should be served or passed on to a different server with a lower load.

To do this, each server needs to know the current load of every other server, so that each server (what a mouthful) can make the decision: "I'm overloaded, better pass this user off to another server. Hey, server C has the lowest load, I'll send him there."
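
A minimal sketch of that "every server knows every other server's load" idea follows. How the load reports travel between servers (heartbeats, a shared store, and so on) is not covered in this thread and is left out here; `LoadTable`, `choose_target`, the 10-second staleness limit, and the 0.8 overload mark are hypothetical choices for illustration only.

```python
import time

class LoadTable:
    """Each server keeps the last load figure reported by every peer."""

    def __init__(self, max_age=10.0):
        self.max_age = max_age   # seconds before a report counts as stale
        self.reports = {}        # hostname -> (load, reported_at)

    def update(self, host, load, now=None):
        self.reports[host] = (load, time.time() if now is None else now)

    def least_loaded(self, now=None):
        """Return the least-loaded peer with a fresh report, or None."""
        now = time.time() if now is None else now
        fresh = {h: load for h, (load, at) in self.reports.items()
                 if now - at <= self.max_age}
        return min(fresh, key=fresh.get) if fresh else None

def choose_target(me, my_load, table, overload=0.8, now=None):
    """Serve locally unless overloaded and a less-loaded peer is known."""
    if my_load < overload:
        return me
    best = table.least_loaded(now)
    if best is None or table.reports[best][0] >= my_load:
        return me
    return best

table = LoadTable()
table.update("server-b", 0.70, now=100)
table.update("server-c", 0.30, now=100)

print(choose_target("server-a", 0.95, table, now=105))  # server-c
print(choose_target("server-a", 0.40, table, now=105))  # server-a
print(choose_target("server-a", 0.95, table, now=200))  # server-a (reports stale)
```

Ignoring stale reports matters because a peer that has stopped reporting may itself be down; sending users there would make things worse, so the sketch falls back to serving locally.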
Back to top  
phy7tes



Joined: 10 Dec 2010
Posts: 6

Posted: Fri Dec 10, 2010 8:40 am    Post subject:  

Quote: So how would a user be appointed to the correct datacenter depending on 1) its load and, if nothing is critical, 2) the user's distance to the server?


I recently found the silver bullet to this question:

http://www.ultradns.com/solutions/sitebacker.html

It's a commercial service and I haven't used it yet, but from what I've read it looks like it will do what you're looking for. You can also pay more to have users directed to their local datacenters.

DynDNS also offers a similar product, I believe; I haven't looked into that one much yet.
Back to top  
Guspaz



Joined: 26 May 2009
Posts: 1147
Location: Montreal, QC

Posted: Fri Dec 10, 2010 10:27 am    Post subject:  

That automates it, but doesn't necessarily offer anything you can't get out of a monitoring solution and an on-the-ball admin.
Back to top  
hoopycat



Joined: 30 Aug 2008
Posts: 1294
Location: Rochester, New York

Posted: Fri Dec 10, 2010 9:49 pm    Post subject:  

Monitoring solutions and on-the-ball admins cost money, and adminning reliable customized DNS is not the world's most pleasant and trivial activity.

If it's important enough to require high availability, it's important enough to hire professionals to do the dirty work. Unless you're a professional, in which case everyone else should hire you. :-)
Back to top  
 