Linode Community Forums
Posted: Sat Nov 20, 2010 8:46 pm
Newbie | Joined: Sat Nov 20, 2010 7:45 pm | Posts: 2 | Location: Dayton, Ohio
I would like to set up a cross-datacenter HA setup for a Joomla website. I have read enough here to see that it is possible, but being a relative amateur when it comes to Linux management, I have some questions before I get started. I've read through the library tutorial at http://library.linode.com/linux-ha/ip-f ... untu-10.04 and it makes sense as far as setting up failover within the same datacenter, but it doesn't go so far as cross-datacenter setups.

For clarification's sake, I'm really only interested in redundancy in the (hopefully rare) case that the primary datacenter is unavailable. I don't care about load balancing, etc.; just a backup to reduce the chances of site downtime. The sites I host don't get a huge amount of traffic, but we're moving from Dreamhost, where frequent random outages have my clients all in a huff.

Basically, I'd simply like to know what additional/different steps are required for a cross-datacenter setup as compared to the tutorial. Is it as simple as using static IPs in place of private IPs in the "Configure Private Networking" section? My experience is that nothing is ever that simple, and I expect there's more to it than that.

Thanks


Posted: Sat Nov 20, 2010 9:22 pm
Senior Member | Joined: Fri Oct 24, 2003 3:51 pm | Posts: 965 | Location: Netherlands
Cross-datacenter failover is problematic because the IP address for a given Linode is fixed to the datacenter where that Linode resides. You have to use DNS to do the switchover, which takes time.
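One way to shorten that DNS switchover window is to keep a low TTL on the record you plan to flip. A minimal zone-file sketch, assuming BIND-style syntax; the hostnames and addresses here are entirely hypothetical:

```
; hypothetical zone fragment for example.com
; low TTL (300 s) so resolvers re-query soon after a failover
www    300    IN    A    203.0.113.10    ; primary Linode
; after a failover, repoint the record at the standby:
; www  300    IN    A    198.51.100.20   ; secondary Linode
```

Even with a low TTL, some resolvers cache records longer than they should, so the switchover is never truly instant.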

_________________
/ Peter


Posted: Sun Nov 21, 2010 9:00 am
Senior Member | Joined: Sat Aug 30, 2008 1:55 pm | Posts: 1739 | Location: Rochester, New York
Also, synchronizing data between two locations isn't quite as easy, due to the distance involved. Some methods work OK; others don't work out well at all. Transporting data over the Internet isn't free, either.

I'm not sure I'd use DRBD. I'd probably use whatever replication capabilities are available in the database engine, perhaps with rsync (or maybe DRBD, if it will do it) for relatively static files.
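For a Joomla site that usually means MySQL, whose built-in replication covers the database side. A hedged sketch of the my.cnf fragments involved, with hypothetical server IDs and database name:

```
# primary's my.cnf (hypothetical values)
[mysqld]
server-id     = 1
log-bin       = mysql-bin
binlog-do-db  = joomla

# standby's my.cnf
[mysqld]
server-id     = 2
```

The standby is then pointed at the primary with a CHANGE MASTER TO ... statement and started with START SLAVE; replication then streams changes continuously instead of shipping full copies.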

_________________
Code:
/* TODO: need to add signature to posts */


Posted: Sun Nov 21, 2010 5:36 pm
Senior Member | Joined: Wed May 13, 2009 1:18 am | Posts: 681
There's also the whole question of recovery: once you have a primary failure and cut over to the secondary, you need to be sure you've got all the steps in place for re-synchronizing the primary and then switching back the other way. And you have to hope the disruption wasn't partial (or just a network partition) such that part of the system thought it was still good when it wasn't, leaving you with split state. It's a lot of work, especially regarding state management.

For my own purposes, I've more or less concluded that I'm better off with a basic synchronization process to a secondary (rsync/unison for static filesystem content, the database's own replication support for the rest), but leaving the cut-over process under manual control.

hoopycat's bandwidth comment is well taken too. For example, in one of my node pairs, my main standby node is in the same DC as the primary, with a maximum sync latency of 60s between the two over the private network. Doing the same between DCs could eat a full node's bandwidth allotment over a month, so it might require allowing a slightly larger latency (say 5 minutes) or just allocating the bandwidth to that task.
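To put a rough number on the bandwidth cost (the per-sync figure below is a hypothetical assumption, not from the thread), even a modest delta shipped once a minute adds up fast:

```shell
# Back-of-the-envelope: assume ~5 MB transferred per 60 s sync cycle
# (hypothetical figure; your actual delta size will vary).
PER_SYNC_MB=5
SYNCS_PER_DAY=$((24 * 60))                          # one sync per minute
MB_PER_MONTH=$((PER_SYNC_MB * SYNCS_PER_DAY * 30))  # 30-day month
echo "${MB_PER_MONTH} MB/month"                     # 216000 MB, roughly 216 GB
```

At that rate a single minute-by-minute sync stream would dominate a typical node's monthly transfer allotment, which is why stretching the interval matters so much.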

Of course, this does impose a minimum latency on any eventual cut-over (mostly my deciding to take the step, and then DNS propagation), but to be honest, I'm more concerned with protecting against an unexpected multi-hour (or longer) outage due to a serious failure than against a few minutes of downtime here and there.

The incremental cost (time, configuration, expense) of achieving zero-latency HA is pretty extreme relative to the benefits, at least in my own scenarios; and without a mechanism to work around DNS propagation, there's always going to be a reasonable latency in enabling a standby anyway.

-- David


Posted: Sun Nov 28, 2010 8:56 pm
Newbie | Joined: Sat Nov 20, 2010 7:45 pm | Posts: 2 | Location: Dayton, Ohio
Thanks for all your responses and for setting me straight about the reality of accomplishing this.

Here's what I'm thinking of doing as a sort-of compromise. First off, the site doesn't change all that much on a regular basis aside from new user registrations (which are strictly database changes). I do all of the new article posting myself anyway, so I can do any image/file syncing manually when it's required. For the database, I think I can do pretty much the same: manually sync when I make changes to articles, etc. I'll also turn off the registered area on the backup site, so no database changes can happen there and no other synchronization is necessary; that way the public portion of the site can be available on the secondary node in case of primary failure.

Honestly I think any more than that is overkill for this particular client, and you all make very valid points about the difficulty involved in doing any more than that.

I was looking at DNS Made Easy to handle DNS failover (I've seen it mentioned on other posts here, and it looks like a good/easy solution). Any other good options for handling that part?


Posted: Mon Nov 29, 2010 1:33 pm
Senior Member | Joined: Tue May 26, 2009 3:29 pm | Posts: 1691 | Location: Montreal, QC
That sounds like a lot of work, doing it all manually. You're probably better off coming up with something simple but automated; for example, nightly syncing via rsync. Run a cron job on your primary that initiates a blocking (consistent, if your database access is transactional) database dump, rsyncs it and any website changes to the secondary box, and then, over SSH, initiates a database import on the secondary box.

As for more timely updates when you post an article: if you've got some sort of article-posting script, you can just have it execute the same script the nightly cron job runs (or, if you have no posting script, kick it off yourself). The sync should be pretty fast, since little will have changed, and while the import on the secondary box might take a while, that shouldn't matter, since your primary box doesn't need to wait on it.
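A minimal sketch of that nightly job, assuming MySQL, key-based SSH to the standby, and hypothetical hostnames and paths (standby.example.com, /var/www, /var/backups, a database named joomla):

```shell
#!/bin/sh
# Nightly dump-and-sync sketch (hypothetical hosts/paths; adjust to your setup).
set -e

STANDBY=standby.example.com
DUMP=/var/backups/joomla.sql

# --single-transaction gives a consistent dump for InnoDB without blocking reads.
mysqldump --single-transaction joomla > "$DUMP"

# Push the dump and the site's files to the standby.
rsync -az "$DUMP" "$STANDBY:$DUMP"
rsync -az --delete /var/www/ "$STANDBY:/var/www/"

# Import on the standby; the primary doesn't need to wait past this point.
ssh "$STANDBY" "mysql joomla < $DUMP"
```

Installed from cron on the primary, e.g. `0 3 * * * /usr/local/bin/sync-to-standby.sh`, and callable from an article-posting script for on-demand syncs.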


Powered by phpBB® Forum Software © phpBB Group