Linode Community Forums
Posted: Tue Apr 04, 2006 11:03 pm
Linode Staff

Joined: Tue Apr 15, 2003 6:24 pm
Posts: 3090
Website: http://www.linode.com/
Location: Galloway, NJ
The xen beta box is going to be rebooted in a few. Under heavy load (a few migrations, a deployment, and a resize), it looks like it triggered CONFIG_DETECT_SOFTLOCKUP and created a few zombie domains, preventing people from booting. I'm going to grab the latest Xen updates, turn that off, update the host kernel and reboot.

Those that are still up and running should see a graceful shutdown in a few minutes...

-Chris


Posted: Tue Apr 04, 2006 11:52 pm
Senior Member

Joined: Sun Feb 08, 2004 7:18 pm
Posts: 562
Location: Austin
Great to hear it was what amounts to an instrumentation problem and not a real failure.

Xen seems to be working out really well.

Any idea when the nodes will come back up? :-)


Posted: Tue Apr 04, 2006 11:58 pm
Senior Member

Joined: Sun Feb 08, 2004 7:18 pm
Posts: 562
Location: Austin
...oh. It's in the queue. Nevermind!


Posted: Tue Apr 04, 2006 11:58 pm
Linode Staff

Joined: Tue Apr 15, 2003 6:24 pm
Posts: 3090
Website: http://www.linode.com/
Location: Galloway, NJ
About half have been booted already, and it's working its way through the rest.

-Chris


Posted: Wed Apr 05, 2006 12:05 am
Senior Member

Joined: Sun Feb 08, 2004 7:18 pm
Posts: 562
Location: Austin
It doesn't seem to have worked...

Code:
xen_linode_boot: failed to get domid
xen_linode_boot: warning - li-network might not have ran


Posted: Wed Apr 05, 2006 12:09 am
Linode Staff

Joined: Tue Apr 15, 2003 6:24 pm
Posts: 3090
Website: http://www.linode.com/
Location: Galloway, NJ
Yeah, I've seen that one as well. Issue another reboot and it should work.

-Chris


Last edited by caker on Wed Apr 05, 2006 12:10 am, edited 1 time in total.

Posted: Wed Apr 05, 2006 12:10 am
Senior Member

Joined: Sun Feb 08, 2004 7:18 pm
Posts: 562
Location: Austin
Sure did.


Posted: Wed Apr 05, 2006 5:20 am
Senior Member

Joined: Fri Oct 24, 2003 3:51 pm
Posts: 965
Location: Netherlands
My Linode reported the same error after the host-initiated restart:
Code:
xen_linode_boot: failed to get domid
xen_linode_boot: warning - li-network might not have ran

LPM showed it as 'Powered off'. (I'm at work, where ssh is only allowed to predefined hosts - not including my Linode - so I guess it was off but I'm not totally sure).

I issued a boot command and got the same error message and Linode still shown as 'Powered off'.

A second boot command again gave the error messages but the Linode was shown by LPM as 'Running'.

A reboot command produced a successful shutdown followed by a boot with error messages and a 'Powered off' Linode.

Another boot command and it came up without error messages and LPM shows 'Running'.

The Linode is attempting to boot into a vanilla Debian 3.1 distro, so I don't think it's a problem with the system.

_________________
/ Peter


Posted: Wed Apr 05, 2006 1:56 pm
Junior Member

Joined: Sun Sep 19, 2004 7:42 pm
Posts: 27
Website: http://eric.gatenby.org/
Location: New York, NY
caker wrote:
The xen beta box is going to be rebooted in a few. Under heavy load (a few migrations, a deployment, and a resize), it looks like it triggered CONFIG_DETECT_SOFTLOCKUP and created a few zombie domains, preventing people from booting. I'm going to grab the latest Xen updates, turn that off, update the host kernel and reboot.


Is host56 experiencing more problems today, or has anyone else noticed anything wrong? For at least the past 2 hours, the performance has been absolutely horrible.


Posted: Wed Apr 05, 2006 1:59 pm
Senior Member

Joined: Sun Feb 08, 2004 7:18 pm
Posts: 562
Location: Austin
Yes, it's been struggling... Maybe Caker's migrating some folks to the new Xen box right now.


Posted: Wed Apr 05, 2006 2:37 pm
Linode Staff

Joined: Tue Apr 15, 2003 6:24 pm
Posts: 3090
Website: http://www.linode.com/
Location: Galloway, NJ
We found another bug in Xen. It looks related to what we hit last night. I've got an email thread going on the xen-devel mailing list.

If you want to be un-migrated, please open a support ticket and specify if we can just "reset" you back to the host you were previously on without moving the disks, or if you need your disk images moved.

-Chris


Posted: Wed Apr 05, 2006 2:41 pm
Senior Member

Joined: Sun Feb 08, 2004 7:18 pm
Posts: 562
Location: Austin
Is there a forecast for when things will be better? Are we having to wait for the Xen developers to fix something, or can we go back to the state where things were working fine?

And if we do choose to move our disk images, will that happen at a reasonable speed, or will it be subject to the same slowdown?

If there's a chance that rebooting the host will make things better, I'd say let's try it. It's not doing me a whole lot of good as is...


Posted: Wed Apr 05, 2006 2:52 pm
Linode Staff

Joined: Tue Apr 15, 2003 6:24 pm
Posts: 3090
Website: http://www.linode.com/
Location: Galloway, NJ
It'll take at least another reboot for those people that can't boot currently.

I've already suspended pending migrations to the box, so anyone with a migration pending, you'll need to hold off for now.

Things seemed to work fine until the number of Linodes on the machine hit a certain threshold. If we can get a few people off the machine, I think we'll be OK while this gets resolved.

To answer your question re speed of migrating off ... I honestly don't know at this point. The disk performance might be being masked by this bug in Xen, since a few of us were able to totally thrash the box without any other domains even noticing. I've also been able to get easily 60M/sec reads, so something weird is going on.

If you're just worried about performance, check back in about 10 minutes. One final migration was underway when this happened, and it's about to finish...

-Chris


Posted: Wed Apr 05, 2006 3:00 pm
Junior Member

Joined: Sun Sep 19, 2004 7:42 pm
Posts: 27
Website: http://eric.gatenby.org/
Location: New York, NY
caker wrote:
If you want to be un-migrated, please open a support ticket and specify if we can just "reset" you back to the host you were previously on without moving the disks, or if you need your disk images moved.


I moved from a Dallas host to the Xen host. Is there any availability in Fremont? (I'm trying to avoid an IP change.)

Thanks!


Posted: Wed Apr 05, 2006 3:04 pm
Linode Staff

Joined: Tue Apr 15, 2003 6:24 pm
Posts: 3090
Website: http://www.linode.com/
Location: Galloway, NJ
egatenby wrote:
I moved from a Dallas host to the Xen host. Is there any availability in Fremont? (I'm trying to avoid an IP change.)


Yes, that would be best -- it would involve migrating your disk images again (no big deal).

Send me a new ticket with this request for tracking...

-Chris

