Linode Community Forums
Posted: Wed Aug 17, 2011 7:29 pm
Senior Member

Joined: Fri Oct 24, 2003 3:51 pm
Posts: 965
Location: Netherlands
thehousecat wrote:
Does anyone who has wider experience find the other datacenters more reliable?

Yes. I have Linodes at Newark and London, as well as Fremont, and they have far fewer problems. The Fremont power problems all seem to stem from a nearby lightning strike last year. They are getting replacement UPSes (that hopefully really will be uninterruptible this time), but this fix is months away due to the lead time on the equipment. In the meantime, customers are leaving HE's Fremont 1 datacenter like rats leaving a sinking ship.

_________________
/ Peter


Posted: Wed Aug 17, 2011 7:53 pm
Senior Member

Joined: Tue May 26, 2009 3:29 pm
Posts: 1691
Location: Montreal, QC
This is still not as bad as that time that my previous VPS provider decided to move all his servers to a new datacenter, physically damaging most of them in the process, and never told customers about it. And by this I mean the first sign of a problem was that all of our VPS instances went down without warning, and stayed down for several weeks with zero communication from the host about why they were down, and the host stopped answering e-mails and phone calls during this period. Only after weeks of downtime did the host finally update their website with an explanation. Up until then, most customers were convinced that he had cut and run.

Eventually, we got so fed up waiting for our VPS to return that we signed up with Linode and restored from not-as-recent-as-we'd-like backups... while in the back of a van driving between Montreal and Toronto (~560 km). It was an interesting experience, to say the least. Since that disaster, we've instituted a more regular backup procedure: on-site backups with Linode (via their backup service), and off-site nightly incremental backups to a file server in my apartment.


Posted: Wed Aug 17, 2011 7:57 pm
Senior Member

Joined: Tue Jun 21, 2011 4:25 pm
Posts: 118
Website: http://www.alohatone.com
Location: Hawaii
Guspaz wrote:
This is still not as bad as that time that my previous VPS provider decided to move all his servers to a new datacenter, physically damaging most of them in the process, and never told customers about it. And by this I mean the first sign of a problem was that all of our VPS instances went down without warning, and stayed down for several weeks with zero communication from the host about why they were down, and the host stopped answering e-mails and phone calls during this period. Only after weeks of downtime did the host finally update their website with an explanation. Up until then, most customers were convinced that he had cut and run.

Eventually, we got so fed up waiting for our VPS to return that we signed up with Linode and restored from not-as-recent-as-we'd-like backups... while in the back of a van driving between Montreal and Toronto (~560 km). It was an interesting experience, to say the least. Since that disaster, we've instituted a more regular backup procedure: on-site backups with Linode (via their backup service), and off-site nightly incremental backups to a file server in my apartment.


Weeks down? Dang, you are too patient.

After 4 hours we were rebuilding someplace else.


Posted: Wed Aug 17, 2011 8:17 pm
Senior Member

Joined: Sat Aug 30, 2008 1:55 pm
Posts: 1739
Location: Rochester, New York
thehousecat wrote:
Does anyone who has wider experience find the other datacenters more reliable?


I seem to recall a circuit breaker tripped in Newark a few years back, taking out a half-dozen servers (not affecting mine). Things were back up in about a half hour. That's about all I can remember for power-related issues outside of Fremont.

I've never deployed long-term stuff in Fremont, and I can't recall ever claiming an SLA credit. It's difficult to accumulate enough downtime anywhere else.

_________________
Code:
/* TODO: need to add signature to posts */


Posted: Wed Aug 17, 2011 9:37 pm
Senior Member

Joined: Tue May 26, 2009 3:29 pm
Posts: 1691
Location: Montreal, QC
I looked up the original timeline in my e-mail. It looks like our server went down Friday May 15th, 2009, and we rebuilt our server on Friday May 22nd, 2009. We were on an unreliable mobile phone connection over wifi in the back of a van; thank goodness for screen...

It looks like the original VPS did come up later on the 22nd, allowing us to pull in the latest data to supplement our newly deployed system (which had been based on an older backup). So my memory was faulty: the downtime wasn't weeks, but about one week. (EDIT: Other sources say we may have been down for two weeks, but got access to the data after one week?) The rest of the recollection would be accurate in that there was no news from the host for the first chunk of it, and only later on did we start getting updates from the host.

We probably should have migrated sooner, but a combination of factors meant we didn't:

1) While our event's pre-registration was live at the time (and in that sense we were losing hundreds or thousands of dollars in sales), our event is of the type where people were likely to wait until the server was back up to pre-register. It was also not a critical point in the pre-registration process (which would be closer to the end of it). In fact, our biggest concern with restoring from backup, and one reason we were hesitant to rebuild, was what we should do about registrations that had been paid and processed, but that we no longer had any data about. We obviously couldn't refuse an attendee who had paid and registered just because we had lost their data... We could have reviewed our PayPal records to rebuild the important parts of the information (the names of people who paid, if not their assigned registration numbers). We were lucky that we later got access to the original data and were able to merge it back in regardless.

2) After 2 days, I redirected our DNS to a server at the local university which we controlled and posted a downtime notice. A day later, with the server still down, I put our website's design around the message to at least make it look a bit more official, and redirected all 404s to the message. We also redirected our mail to a server we controlled so that we could bring mail services back up on a temporary server.

3) By this point we were actively researching a new host to switch to. Picking a hosting provider is a big decision, and despite the urgency, we still had to do our due diligence, and by this point we were pretty much settled on Linode.

4) After the situation went from "Our server is down" to "Our host is MIA", we started trying to gather up all the backups we could from various servers and sources. Database and site code/content were primarily pulled from these older backups, and we refreshed our content with the newer data from the Wayback Machine. We were still hesitant to restore from backup because of the difficulty of a later merge if we did get access to the original data.

5) Eventually, the situation became intolerable; we were leaving for Anime North, the largest convention in the country, and we needed our server up for promotion purposes. This is why at this point we took the plunge and started rebuilding from our gathered backups. From the back of a car, going down the highway.

It's always a difficult decision to make. How much downtime do you tolerate before you go from waiting for your server to come back online to rebuilding from backups? As a somewhat loosely organized non-profit company, we're also not the kind of organization that has policies or procedures for this sort of thing.

Since then, we've at least taken precautions. As I've said, we have nightly incrementals off-site, and on-site Linode backups. At this exact moment, since our event has ended for the year (literally just three days ago), our registration system is not active and downtime would be relatively unimportant; if we did need to take emergency action, our forums are probably all we'd care about. But nevertheless, we'd probably still take action sooner.

Of course, we've also gone from a fly-by-night operation to a first-class hosting provider (Linode), so I'm relatively confident that we'd be unlikely to have to restore from off-site backups. If our Linode's host should die, we can restore from Linode's on-site backups in a matter of minutes. If Linode's datacenter should go down, we can restore from the off-site backup with a little bit of rebuilding: it's an incomplete backup, so we'd need to deploy a full Linode, layer our backup on top of that, and do some checking after that, and we could get back up in about 3 hours. (I've got two bonded VDSL2 lines and some other connections to my apartment, so I can push 14 megs upstream on my fastest link, and probably 20 megs up total if I combine that with cable, 3G, and free wifi.) And if Linode went down entirely (nuclear bomb exploding at Linode headquarters?) then we have a much better picture of the VPS hosting industry, such that we could move to a new host and be up and running again probably in 6 to 12 hours. Of course, all these times are *after* we make the decision that the original machine is a write-off and we need to restore from backups...


Posted: Thu Aug 18, 2011 6:44 pm
Senior Member

Joined: Sat May 03, 2008 4:01 pm
Posts: 569
Website: http://www.mattnordhoff.com/
thehousecat wrote:
No need to be patronizing - your magic comment is stating the obvious

I didn't mean to be patronizing, just a jerk. :P Um, I'm sorry. I was grumpy about the people who blamed Linode for the outage rather than HE. (Yes, it is Linode's fault for using HE, and maybe Linode should take some sort of action, but it's not entirely Linode's fault.)

thehousecat wrote:
Does anyone who has wider experience find the other datacenters more reliable?

None of the other data centers have had luck this bad, at least in the last few years. On the other hand, neither had Fremont up until a year ago.

You might want to read the "Datacenter reliability comparison" thread, but it doesn't really add much to what HoopyCat and I have said, except for some more details.

_________________
Matt Nordhoff (aka Peng on IRC)


Posted: Thu Aug 18, 2011 10:05 pm
Newbie

Joined: Sat Jul 09, 2011 11:02 am
Posts: 3
Thanks everyone.

It just stings a bit when there are outages like this so close together! Hopefully they will get it sorted soon. I'm going to keep a VM in Fremont, but that datacenter is on its final warning :-)


mnordhoff - no worries.


Posted: Fri Aug 19, 2011 3:33 am
Senior Member

Joined: Sat Nov 13, 2010 3:05 am
Posts: 91
Website: http://www.graq.co.uk
Guspaz wrote:
..... and off-site nightly incremental backups to a file server in my apartment.

Now see, that right there, makes you a geek :)


Posted: Fri Aug 19, 2011 8:05 pm
Senior Member

Joined: Tue Apr 13, 2004 6:54 pm
Posts: 833
graq wrote:
Guspaz wrote:
..... and off-site nightly incremental backups to a file server in my apartment.

Now see, that right there, makes you a geek :)

Doesn't everyone do that?

_________________
Rgds
Stephen
(Linux user since kernel version 0.11)


Posted: Sat Aug 20, 2011 7:58 am
Senior Member

Joined: Tue Nov 24, 2009 1:59 pm
Posts: 362
Does an old laptop count as a file server for the purpose of this requirement? :)

_________________
rsk, providing useless advice on the Internet since 2005.

