Linode Forum
Linode Community Forums
Posted: Tue Oct 27, 2009 4:32 pm
Linode Staff

Joined: Tue Apr 15, 2003 6:24 pm
Posts: 3090
Website: http://www.linode.com/
Location: Galloway, NJ
During a shared library update distributed to our hosts, a number of hosts incorrectly marked Linodes as shut down. To recover, we may issue host reboots to upgrade their software to our latest stack and then bring the Linodes back to their last state. We're working on this now and expect to have additional updates shortly. We'll also be notifying those affected via our support ticket system. Please stand by.

-Chris


Posted: Wed Oct 28, 2009 12:58 am
Junior Member

Joined: Thu Apr 23, 2009 2:32 am
Posts: 41
Website: http://www.linode.com/
We believe we have addressed the core issues surrounding today's service outage, and we're working with all customers who are still facing problems to restore their Linodes to full service. We encourage anyone who may still be experiencing problems to contact us via our support ticket system or #linode on IRC.

We would like to take this opportunity to once again sincerely apologize for any issues this outage has caused, and to restate our commitment to handling problems of any kind in as expedient a fashion as possible. As both a company and individuals, we value our relationship with our customers, and we understand the frustration many of you have felt while the underlying causes and resulting effects of this outage have been investigated and handled.

As the situation evolves, we will be conducting an in-depth analysis of the factors that led up to the outage and developing a solid set of procedures designed to prevent such issues from recurring. Please expect additional information once we have thoroughly reviewed the situation. Again, thank you for your support and continued business.


Post subject: You guys are awesome
Posted: Wed Oct 28, 2009 1:38 am
Newbie

Joined: Tue Oct 27, 2009 8:06 pm
Posts: 3
Before any gripes can be said, let me just thank you guys for the great service you provide and working so hard to get things up and running again. I've been a customer here for a few years now and this is the first time I've had an issue.

Keep up the good work!


Posted: Wed Oct 28, 2009 1:45 am
Senior Newbie

Joined: Sat Jun 13, 2009 9:59 am
Posts: 13
Thanks for the update. I would ask that you add a communication plan to your processes and procedures to use in future incidents.

Thanks for the hard work.


Post subject: Re: You guys are awesome
Posted: Wed Oct 28, 2009 3:29 am
Senior Newbie

Joined: Fri Jul 31, 2009 6:47 am
Posts: 12
davew wrote:
Before any gripes can be said, let me just thank you guys for the great service you provide and working so hard to get things up and running again. I've been a customer here for a few years now and this is the first time I've had an issue.

Keep up the good work!


Exactly, because that's the other side of this story, and if you ask me, the most important one. Linode has always been great. Good uptime, great service and great VPS management! Unlucky as this outage has been, it's a very, very rare event. Kudos to Linode for resolving this so rapidly!


Posted: Wed Oct 28, 2009 4:53 am
Newbie

Joined: Tue May 26, 2009 3:10 am
Posts: 3
Thanks for the update. Number one request from me would be an alert email to say each of my linodes has been rebooted. *very* critical information that I need to know.


Posted: Wed Oct 28, 2009 5:02 am
Senior Newbie

Joined: Fri Jul 31, 2009 6:47 am
Posts: 12
mibbit wrote:
Thanks for the update. Number one request from me would be an alert email to say each of my linodes has been rebooted. *very* critical information that I need to know.


Get a Pingdom account to monitor your Linodes. It can email, text, and tweet you about the server status, once when it goes down and once when it comes back up.


Posted: Wed Oct 28, 2009 9:46 am

Joined: Wed Oct 28, 2009 9:24 am
Posts: 1
Thanks for keeping the lines of communication open on the forum and on the IRC channel! You guys really do a great job, and your handling of tonight's unfortunate series of events was really top notch.

When I first noticed that my node was down, I put in a support ticket and had a response explaining the situation and referring me to the forums / IRC channel within minutes. And then when my node still was having trouble after the problem had been identified and fixed for most people, I was able to get one-on-one help right away in #linode from "mikegrb". The problem wasn't easy to fix, but Mike stayed with me for over two hours as we worked out the solution together.

Having interactive one-on-one help like that isn't something many other hosting companies have offered me in the past, and those that have usually abandon me if the problem isn't solved in the first 5-10 minutes.

If problems like these were rampant, and I found myself needing technical support on a regular basis, I'd probably be singing a different tune right now, but with this as the first unexpected downtime in over 2 years of service, I have to say I'm really pleased by Linode's response. Thanks, Linode!


Posted: Wed Oct 28, 2009 10:02 am
Junior Member

Joined: Thu Feb 05, 2009 12:48 pm
Posts: 24
gnummep-martin wrote:
mibbit wrote:
Thanks for the update. Number one request from me would be an alert email to say each of my linodes has been rebooted. *very* critical information that I need to know.


Get a Pingdom account to monitor your Linodes. It can email, text, and tweet you about the server status, once when it goes down and once when it comes back up.

Yep.

Also set up your Linode to email you when it boots; then you're covered.
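
For anyone wanting a starting point, here is a minimal Python sketch of a boot notification. It assumes a mail transfer agent is listening on localhost port 25; the function names, the addresses, and the script path mentioned below are all placeholders, not anything Linode provides.

```python
import smtplib
import socket
from datetime import datetime, timezone
from email.message import EmailMessage


def build_boot_message(hostname, sender, recipient, booted_at):
    """Build the notification email; the addresses passed in are placeholders."""
    msg = EmailMessage()
    msg["Subject"] = f"{hostname} booted at {booted_at:%Y-%m-%d %H:%M} UTC"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"{hostname} finished booting at {booted_at.isoformat()}.")
    return msg


def send_boot_message(msg, smtp_host="localhost", smtp_port=25):
    """Hand the message to a local mail server (assumed to be running)."""
    with smtplib.SMTP(smtp_host, smtp_port) as smtp:
        smtp.send_message(msg)


def main():
    """Entry point to call from a boot-time hook."""
    msg = build_boot_message(
        socket.gethostname(),
        "root@example.com",   # hypothetical sender
        "you@example.com",    # hypothetical recipient
        datetime.now(timezone.utc),
    )
    send_boot_message(msg)
```

Saved as a script that calls `main()`, this could be run once per boot, for example from an `@reboot` crontab entry on systems whose cron supports that keyword (the script location, such as /usr/local/bin/boot_mail.py, is up to you). Note that it only tells you the Linode came back up, not that it went down.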


Posted: Wed Oct 28, 2009 10:04 am
Senior Member

Joined: Thu Apr 03, 2008 12:02 am
Posts: 103
AOL: derole
bd3521 wrote:
Also set up your Linode to email you when it boots; then you're covered.


But why rely on a third-party service when we could have the Linode Manager send these mails?


Posted: Wed Oct 28, 2009 10:05 am
Senior Member

Joined: Sat May 03, 2008 4:01 pm
Posts: 567
Website: http://www.mattnordhoff.com/
oliver wrote:
But why rely on a third-party service when we could have the Linode Manager send these mails?


The manager can die too, y'know.


Posted: Wed Oct 28, 2009 10:10 am
Senior Member

Joined: Thu Apr 03, 2008 12:02 am
Posts: 103
AOL: derole
mnordhoff wrote:
The manager can die too, y'know.


True, but at that point things are so fucked that you can't do anything useful with your linode anyway, email or not.

That said, the way to go (as usual) is probably redundancy: have an external service check your site, and also have the Linode Manager report if anything serious happens.
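
A rough Python sketch of the external-check half, meant to run from a machine outside the one being watched. It is only an illustration under stated assumptions: the helper names are made up, it is not how Pingdom or the Linode Manager works, and it alerts only when the state changes (once on down, once on up) rather than on every failed check.

```python
import http.client
import urllib.error
import urllib.request


def is_up(url, timeout=10):
    """Return True if the URL answers successfully; treat errors as down."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers HTTP error statuses, refused connections, and timeouts.
        return False


def transition(previous, current):
    """Return "down" or "up" when the state changes, otherwise None."""
    if previous == current:
        return None
    return "up" if current else "down"


def alerts_for(states, initial=True):
    """Turn a sequence of check results into the alerts worth sending."""
    events = []
    previous = initial  # assume the site was healthy before the first check
    for current in states:
        event = transition(previous, current)
        if event is not None:
            events.append(event)
        previous = current
    return events
```

In practice you would call `is_up()` on a schedule, feed each result through `transition()`, and send a mail or text whenever it returns something other than `None`.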


Posted: Wed Oct 28, 2009 10:33 am
Junior Member

Joined: Wed Mar 19, 2008 10:34 pm
Posts: 32
Website: http://www.claws-and-paws.com/
WLM: doug.muth@gmail.com
Yahoo Messenger: dmuthathome
AOL: Dmuth+At+Home
Location: Ardmore, PA
A suggestion for your communication during the next outage.

What Google did during their last outage was to post an update, followed by the text, "We will update this message again by 5:19 PM", where the time given was an hour in the future. That way, people wouldn't be refreshing the forums every 5 minutes, but could busy themselves with something else for the next hour.

_________________
Disclaimer: I am not a Linode staff member.


Posted: Wed Oct 28, 2009 10:59 am
Senior Newbie

Joined: Wed Dec 07, 2005 10:57 pm
Posts: 17
Location: Philadelphia, PA
Dear Linode staff,

See http://blog.bitbucket.org/2009/10/04/on ... ts-coming/ for a pretty well-written post-mortem.


Posted: Wed Oct 28, 2009 3:47 pm
Junior Member

Joined: Sat Mar 21, 2009 3:45 am
Posts: 48
This outage was frustrating, but thankfully 1) unusual and 2) relatively brief (for me, at least). I'm glad to hear that you'll be doing a post-mortem and sharing your conclusions. I look forward to what you find. I hope you will consider releasing interim findings and conclusions along the way.

In addition to the technical/operational issues that led to this outage, I hope you'll also tackle communications issues, both with proactive warnings of potential downtime and with multichannel communications with customers when Linode's own infrastructure is compromised. The first thing I did after trying to ping and ssh to our offline host was visit the Linode homepage and find that it wasn't available. I ended up thinking to look to Twitter at about the same time your pages started loading (slowly) again. I checked in on IRC once I could find the server information. While your site was offline, it would have been great to have gotten an email, or to be able to register to get SMS notifications.

