Linode Forum
Linode Community Forums
Posted: Thu Dec 04, 2003 3:07 pm
Newbie

Joined: Thu Dec 04, 2003 3:02 pm
Posts: 3
I do a lot of interactive work on my linode (called 'shed' on host10) and I have found that it becomes (so far as I can tell) completely unresponsive every few minutes. It is usually only unresponsive for a couple of seconds, but sometimes it is unreachable for as long as 30 seconds. When I have checked, host10 has still been responding to pings in a timely fashion but my linode has not. Sometimes if I check the load average on my linode it shoots up as high as 4.0 without any processes taking CPU...

This sounds like it could be the same 2.4.22/2.4.23 IO scheduling problems mentioned in late October on host9. Any news?

Thanks,
- Greg


Posted: Fri Dec 05, 2003 12:02 am
Newbie

Joined: Thu Nov 20, 2003 11:12 pm
Posts: 2
I'm having exactly the same problem.

All IP traffic to my linode (on host10) goes nowhere, but I can still ssh to the console on host10. Then after "a while" (maybe 10-120 seconds later) IP traffic resumes.

This is very annoying. Any ideas?

=Matt


Posted: Fri Dec 05, 2003 1:45 am

Joined: Fri Dec 05, 2003 1:41 am
Posts: 1
This happens to me too.

I was able to "catch" it a week or so ago; the load was above 4, but I forget the exact number. Still, within no more than 5 minutes it was back to normal.

Actually, it just happened a few minutes ago. Pretty easy to tell when it happens cause Evolution complains that it can't get mail.


Posted: Fri Dec 05, 2003 1:59 am
Linode Staff

Joined: Tue Apr 15, 2003 6:24 pm
Posts: 3090
Website: http://www.linode.com/
Location: Galloway, NJ
sdh wrote:
Actually, it just happened a few minutes ago. Pretty easy to tell when it happens cause Evolution complains that it can't get mail.

Strange -- I've been connected via ssh to a Linode on host10 for a while without any interruption. So whatever it is, it's not affecting everyone on the host.

-Chris


Posted: Fri Dec 05, 2003 6:40 pm
Senior Member

Joined: Sun Nov 30, 2003 2:28 pm
Posts: 245
I just had the same thing happen on host12. I had two sessions open: one "console" via lish, and another direct ssh session. Both went unresponsive; the direct ssh one recovered first (after ~30 seconds, maybe) and the lish one ~10 seconds later. According to the control panel, load is "low".

_________________
The irony is that Bill Gates claims to be making a stable operating system and Linus Torvalds claims to be trying to take over the world.
-- seen on the net


Posted: Fri Dec 05, 2003 7:27 pm
Linode Staff

Joined: Tue Apr 15, 2003 6:24 pm
Posts: 3090
Website: http://www.linode.com/
Location: Galloway, NJ
I've been connected to host12 from here for a number of days without interruption -- going through ssh just like your Lish connection.

Chances are this is a network issue between you and your Linode, rather than a problem within the Linode network itself. These types of problems are so hard to pinpoint.

Load has been very low on host10 and host12, but I will review the logs and see if the loadavg peaked -- in case that does have something to do with it.

-Chris


Posted: Fri Dec 05, 2003 8:41 pm
Newbie

Joined: Thu Nov 20, 2003 11:12 pm
Posts: 2
caker wrote:
Chances are this is a network issue between you and your Linode, rather than a problem within the Linode network itself. These types of problems are so hard to pinpoint.


The one thing that makes me think that this is *not* the case is that while I could not ssh to my linode, I *could* ssh to the console on host10.

This makes me think that it is a "virtual network" problem between host10 and the UML nodes, but *shrug*, it could be anything.

I haven't observed the problem for the last 12 hours.

=Matt


Posted: Fri Dec 05, 2003 9:25 pm
Linode Staff

Joined: Tue Apr 15, 2003 6:24 pm
Posts: 3090
Website: http://www.linode.com/
Location: Galloway, NJ
spudbean wrote:
The one thing that makes me think that this is *not* the case is that while I could not ssh to my linode, I *could* ssh to the console on host10.

This makes me think that it is a "virtual network" problem between host10 and the UML nodes, but *shrug*, it could be anything.

I haven't observed the problem for the last 12 hours.

=Matt

I agree that for the host10 case you've brought up there is the possibility of a bridging bug. I'll keep on the lookout.

I think SteveG's problem was just his connection or a routing problem somewhere out there... If you can't reach your Linode either via its IPs or via the host console, then the problem is elsewhere on the internet...
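
To spell that rule of thumb out, here is a minimal sketch in C. The enum and function names are made up for illustration, and the two flags are assumed to come from whatever probe you prefer (a ping, an ssh attempt); none of this is Linode tooling.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum fault_location {
        FAULT_NONE,           /* both answer: nothing to diagnose */
        FAULT_HOST_TO_GUEST,  /* host console answers, Linode does not:
                                 suspect the host/UML side (e.g. bridging) */
        FAULT_OUTSIDE,        /* neither answers: suspect the path between
                                 you and the datacenter */
        FAULT_INCONCLUSIVE    /* Linode answers but the host console does not */
};

/* Encode the reasoning from this thread as a simple decision table. */
enum fault_location
locate_fault(bool host_console_reachable, bool linode_reachable)
{
        if (host_console_reachable && linode_reachable)
                return FAULT_NONE;
        if (host_console_reachable)
                return FAULT_HOST_TO_GUEST;
        if (!linode_reachable)
                return FAULT_OUTSIDE;
        return FAULT_INCONCLUSIVE;
}
```

In Matt's case the call would be locate_fault(true, false), which lands on the host-to-guest side.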

-Chris


Posted: Sun Dec 07, 2003 9:02 pm
Newbie

Joined: Thu Dec 04, 2003 3:02 pm
Posts: 3
OHEC:
OBSERVE linode seems unresponsive at times
HYPOTHESIZE my uml isn't getting CPU
EXPERIMENT I ran the following program on my linode under screen to prevent network blocking from being an issue:
Code:
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>

/* Convert a timeval to milliseconds. 64-bit arithmetic, so
 * tv_sec * 1000 cannot overflow the way a 32-bit long does. */
static long long
tv2ms(struct timeval tv)
{
        return ((long long)tv.tv_sec * 1000) + (tv.tv_usec / 1000);
}

int
main(void)
{
        long long ms, old_ms;
        struct timeval tv;

        gettimeofday(&tv, NULL);
        old_ms = tv2ms(tv);
        while (1) {
                gettimeofday(&tv, NULL);
                ms = tv2ms(tv);
                if (ms < old_ms) {
                        /* wall clock went backwards */
                        printf("burp: %lld\n", old_ms - ms);
                } else if (old_ms + 1000 < ms) {
                        /* over 1 s elapsed across a 0.5 s sleep: we stalled */
                        printf("timeout: %lld\n", ms - old_ms);
                        fflush(stdout); /* keep our line ahead of date's output */
                        system("date");
                }
                old_ms = ms;
                usleep(500000);
        }
        return 0;
}

After a few hours it caught little blips where I'd lose CPU for just a couple of seconds, which occasionally corresponded to load on our own server (apt-get, for example)... no big deal. But then I caught a period of about 10 minutes where I was losing CPU, and it peaked at almost 30 seconds:

timeout: 3590
Sun Dec 7 17:31:30 EST 2003
timeout: 3630
Sun Dec 7 17:32:01 EST 2003
timeout: 5780
Sun Dec 7 17:32:47 EST 2003
timeout: 6770
Sun Dec 7 17:34:30 EST 2003
timeout: 3390
Sun Dec 7 17:34:35 EST 2003
timeout: 27230
Sun Dec 7 17:35:02 EST 2003
timeout: 2600
Sun Dec 7 17:36:22 EST 2003
timeout: 1440
Sun Dec 7 17:38:25 EST 2003

So I'm pretty sure we're running into scheduling trouble... but I'm just posting this for discussion.

But I'm optimistic about this kernel upgrade. After the kernel upgrade I'll try this same test, except with realtime scheduling inside of our UML, and also integrate a ping of the 64.62.190.1 gateway to see if there is independent networking flakiness. Will update.
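
A rough sketch of what the realtime part of that follow-up test might look like, assuming a Linux guest where sched_setscheduler() and clock_gettime(CLOCK_MONOTONIC) are available. It is untested inside UML, the helper names (go_realtime, now_ms, overshoot_ms, measure_stall) are made up for this post, and the gateway ping is left out. It swaps the wall clock for the monotonic clock so clock steps don't show up as stalls:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <sched.h>
#include <time.h>

/* Try to move this process to SCHED_FIFO at the lowest realtime priority.
 * Returns 0 on success, -1 on failure (typically EPERM when not root). */
int
go_realtime(void)
{
        struct sched_param sp;
        int prio = sched_get_priority_min(SCHED_FIFO);

        if (prio < 0)
                return -1;
        sp.sched_priority = prio;
        return sched_setscheduler(0, SCHED_FIFO, &sp);
}

/* Current time on the monotonic clock, in milliseconds. */
long long
now_ms(void)
{
        struct timespec ts;

        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* How far elapsed_ms exceeds requested_ms; 0 if it does not. */
long long
overshoot_ms(long long elapsed_ms, long long requested_ms)
{
        return elapsed_ms > requested_ms ? elapsed_ms - requested_ms : 0;
}

/* Sleep for sleep_ms and report how many milliseconds we overslept. */
long long
measure_stall(long sleep_ms)
{
        struct timespec req;
        long long before;

        req.tv_sec = sleep_ms / 1000;
        req.tv_nsec = (sleep_ms % 1000) * 1000000L;
        before = now_ms();
        nanosleep(&req, NULL);
        return overshoot_ms(now_ms() - before, sleep_ms);
}
```

The idea would be to call go_realtime() once at startup, then loop on measure_stall(500) and log anything over some threshold, say 500 ms, along with the date.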

- Greg


Posted: Thu Dec 11, 2003 5:55 pm
Senior Member

Joined: Sat Aug 30, 2003 6:35 am
Posts: 57
I have exactly the same problem, but I'm on host11.

I have a Linode 96, and over the past few hours it's been dreadfully slow, and I can't fathom why. Now connections to it hit ping timeouts, and I've had a shutdown and a boot sitting in the host job queue for about 10 minutes and it still hasn't processed them. A friend told me earlier it might be a scheduling problem, and I'm beginning to suspect this too.

-Ashen


Posted: Fri Dec 12, 2003 12:08 am
Linode Staff

Joined: Tue Apr 15, 2003 6:24 pm
Posts: 3090
Website: http://www.linode.com/
Location: Galloway, NJ
galexand wrote:
So I'm pretty sure we're running into scheduling trouble...but i'm just posting this for discussion..


Using usleep() isn't going to give you accurate results inside UML, as UML has a bug which affects the delivery of SIGALRM. Jeff Dike has said the fix will be included in the next UML patch.

Interesting test, though. My overall feeling is that none of these issues are related to lack of CPU time -- more of an issue of blocking because of disk i/o.
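
One way to check that the next time the load spikes: on Linux the load average counts tasks in uninterruptible sleep (state D, usually waiting on disk) as well as runnable ones, so a load of 4 with an idle CPU points toward blocked I/O. A minimal sketch that reads /proc, with function names made up for illustration; it only looks at one stat file per process, so threads are not counted separately:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Pull the state letter out of the contents of /proc/<pid>/stat.
 * The format is "pid (comm) S ..."; comm may itself contain spaces or
 * parentheses, so search for the last ')'. Returns 0 if malformed. */
char
stat_state(const char *line)
{
        const char *p = strrchr(line, ')');

        if (p == NULL || p[1] != ' ' || p[2] == '\0')
                return 0;
        return p[2];
}

/* Count processes currently in state `want` ('R' runnable,
 * 'D' uninterruptible sleep). Returns -1 if /proc cannot be opened. */
int
count_in_state(char want)
{
        DIR *proc = opendir("/proc");
        struct dirent *de;
        int n = 0;

        if (proc == NULL)
                return -1;
        while ((de = readdir(proc)) != NULL) {
                char path[280], line[512];
                FILE *f;

                if (!isdigit((unsigned char)de->d_name[0]))
                        continue;       /* not a PID directory */
                snprintf(path, sizeof path, "/proc/%s/stat", de->d_name);
                f = fopen(path, "r");
                if (f == NULL)
                        continue;       /* process exited mid-scan */
                if (fgets(line, sizeof line, f) != NULL &&
                    stat_state(line) == want)
                        n++;
                fclose(f);
        }
        closedir(proc);
        return n;
}

/* Read the 1-minute load average from /proc/loadavg; -1.0 on failure. */
double
load1(void)
{
        double v = -1.0;
        FILE *f = fopen("/proc/loadavg", "r");

        if (f != NULL) {
                if (fscanf(f, "%lf", &v) != 1)
                        v = -1.0;
                fclose(f);
        }
        return v;
}
```

If load1() is high while count_in_state('D') accounts for most of it and count_in_state('R') stays near zero, the stall looks like I/O wait rather than CPU starvation.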

How have things been post reboot?

-Chris


Posted: Fri Dec 26, 2003 12:41 am
Newbie

Joined: Thu Dec 04, 2003 3:02 pm
Posts: 3
Hope you weren't holding your breath waiting for this report...but I figure I've given the system lots of chances to demonstrate infelicities and I've noticed more local connection problems (cablemodem at my home) than problems with the server. So I must say it has gotten much better subjectively since the kernel upgrade/reboot.

Thanks!!
- Greg

