pparadis wrote:
Once again, we sincerely apologize for these issues. All system administrators are continuing to work to restore full service for all affected customers. Support tickets are being processed as rapidly as possible. Thank you for your patience.
What would you guess your ETA is, for solving this problem for all hosts?
How many hosts have you fixed and how many are left? It's been five hours downtime for one of my linodes now, and it's still down.
The guy who did this (or the boss who ordered him, if that was the case) should get punished by being forced to work 4 weekend-days without pay, and linodes with more than one hour downtime should get one month of service credited to their account.
That way the technician/boss would think twice before 1. skipping advance notice to customers about a potentially risky upgrade, and 2. pushing the update onto all (?) of your hosts at once without first testing it on a few.
And the company as a whole would get a clear financial incentive not to repeat such foolishness in the future.
That said, I will not move to a different provider just because of today's downtime. I just won't recommend Linode as enthusiastically any more. You're still better than any other alternative I've heard of.
For the future:
Let's say you have 1000 hosts you wish to update. You can never be sure nothing will break, so you should first test your update on a test server. If that works, try it on one live production server. If that works, try it on two additional production servers at once. If that works, try it on 4 more servers at once, then 8, 16, 32 and so on. After 11 rounds (the test server plus ten production batches) you would have upgraded 1024 servers. Checking that everything is ok before proceeding to update even more servers, eleven times, is a reasonable "waste of time" considering the impact one undiscovered error would have for so many people. If you do upgrades on just a few servers at a time (as suggested above in this message), any problems you miss, we customers will catch, because we are so many people.
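To make the doubling idea concrete, here is a rough Python sketch. The function names (`doubling_batches`, `staged_rollout`) and the `upgrade` and `healthy` callbacks are made up for illustration; I have no idea what your actual tooling looks like, so treat this as a sketch of the idea and not as a real deployment script.

```python
from typing import Callable, List, Sequence


def doubling_batches(hosts: Sequence[str]) -> List[List[str]]:
    """Split hosts into batches of size 1, 2, 4, 8, ...

    The last batch may be smaller if the hosts run out.
    """
    batches: List[List[str]] = []
    start, size = 0, 1
    while start < len(hosts):
        batches.append(list(hosts[start:start + size]))
        start += size
        size *= 2
    return batches


def staged_rollout(
    hosts: Sequence[str],
    upgrade: Callable[[str], None],   # hypothetical: performs the upgrade on one host
    healthy: Callable[[str], bool],   # hypothetical: checks a host after the upgrade
) -> List[str]:
    """Upgrade one batch at a time and stop after the first batch
    that contains an unhealthy host.

    Returns the hosts that were upgraded, including the failed batch.
    """
    upgraded: List[str] = []
    for batch in doubling_batches(hosts):
        for host in batch:
            upgrade(host)
        upgraded.extend(batch)
        if not all(healthy(host) for host in batch):
            break  # stop here; the remaining hosts are left untouched
    return upgraded
```

With 1000 production hosts this gives ten batches, and a breakage in, say, the third batch would touch at most 7 hosts instead of all 1000.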
And please create a mailing list anyone could join if we want updates on planned upgrades and potential problems, progress reports and so on. Maybe one list per datacenter? I only want to know about any updates you do in the Atlanta DC, for example.
Anyway, good luck fixing today's problems. It's 03:26 in Sweden now and I'm hitting the sack.