Linode Forum
Linode Community Forums
PostPosted: Wed Sep 07, 2011 10:21 am 
Offline
Junior Member

Joined: Tue Jun 07, 2011 9:16 am
Posts: 31
Location: Spain, EU
Hi!

In the last few weeks, I've been experiencing random "crashes" of my Linode 1024. They happen every 9-13 days, always at about the same time (around 8:00 GMT), and they cause the CPU, network and HD graphs on my Linode Dashboard to drop to values close to zero. The server doesn't respond from "outside", and the only fix is to reboot it, which takes about 3 minutes.

It runs Ubuntu 10.04 with LAMP, Tomcat 6, Postfix and Dovecot. All of them are the latest versions available; the software is fully up to date. I've checked some "system" logs, like syslog, daemon.log, etc., but they don't show anything relevant; they just log normally up to the moment of the crash.

I've had this server since June, but the problems started in August. In addition, I haven't installed any new software since I purchased the server. Other than version updates, the server has the same software that was installed at the start.

Any ideas?

Thanks in advance!


Last edited by usr01 on Sun Sep 11, 2011 5:50 am, edited 1 time in total.

PostPosted: Wed Sep 07, 2011 10:34 am 
Offline
Linode Staff
User avatar

Joined: Tue Apr 15, 2003 6:24 pm
Posts: 3090
Website: http://www.linode.com/
Location: Galloway, NJ
Hello,

You need to do two things. First, make sure you're running the "Latest 3.0" kernel (Linode Manager -> Linode -> Config Profile -> Kernel drop-down, save, reboot). Second, when this happens, inspect your Lish console before you reboot to see whether there has been a kernel BUG or OOM or something, or whether your Linode is still 'alive' but something has taken networking down.

Hope that helps,
-Chris


PostPosted: Wed Sep 07, 2011 11:02 am 
Offline
Junior Member

Joined: Tue Jun 07, 2011 9:16 am
Posts: 31
Location: Spain, EU
Thanks for your reply. I've just changed the kernel to "Latest 3.0", and everything is running OK now. However, as for whether the Linode was still "alive" and only the network was down, I must say that ALL the "system" logs I checked were "empty" for the few hours the server was down. It wasn't just the network; it was the whole server.

Should I check one particular log?

And suppose this happens again. When I access my server using the Lish console in the Manager, what should I look for?

Finally, if I can't access my Linode from the Lish console as described, what could be the issue?


PostPosted: Thu Sep 08, 2011 8:00 pm 
Offline
Senior Member

Joined: Fri Jan 09, 2009 5:32 pm
Posts: 634
Check your MaxClients in Apache. The default is meant for systems with many gigabytes of RAM and is not appropriate for a Linode. That is the cause of the vast majority of threads like yours.
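
As a rough illustration of sizing MaxClients to the machine, here is a minimal Python sketch. The function name and every number in it (the memory reserved for other services, the average size of one Apache process) are assumptions for illustration, not measurements from this server, so measure your own per-process size before picking a value.

```python
def estimate_max_clients(total_ram_mb, reserved_mb, per_process_mb):
    """Rough upper bound for Apache's MaxClients.

    total_ram_mb   -- RAM available to the Linode
    reserved_mb    -- RAM set aside for everything that is not Apache
                      (MySQL, Tomcat, mail, the OS itself); an assumption
    per_process_mb -- average resident size of one Apache child; measure it
    """
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    available = total_ram_mb - reserved_mb
    # Never return less than 1, or Apache could not serve anything.
    return max(1, available // per_process_mb)


# Hypothetical figures for a 1024 MB Linode that also runs MySQL and Tomcat.
print(estimate_max_clients(1024, 600, 25))  # prints 16
```

The result is only a starting point: it assumes every child grows to the average size and that nothing else on the box grows under load.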


PostPosted: Thu Sep 08, 2011 9:08 pm 
Offline
Senior Member

Joined: Fri May 02, 2008 8:44 pm
Posts: 1121
The console shows a lot of things that wouldn't be in any log file. For example, a panicked kernel can't write anything to a log file, but it can still spit out a message to the console before going down. Firewall messages often show up on the console, too.

A reboot will probably erase anything that was written to the console, though.


PostPosted: Fri Sep 09, 2011 4:03 am 
Offline
Junior Member

Joined: Tue Jun 07, 2011 9:16 am
Posts: 31
Location: Spain, EU
Thanks to everyone!

Regarding Apache, I don't think it is a memory problem, because memory consumption stays at around 30% of the total all the time.
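
One way to measure that kind of figure from inside the server is to read /proc/meminfo and leave buffers and page cache out of the "used" total. The following is a minimal Python sketch; the function name and the sample text are made up for illustration, and on a real system you would pass in the contents of /proc/meminfo instead.

```python
def used_memory_percent(meminfo_text):
    """Percentage of RAM in use, not counting buffers and page cache."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])  # values are in kB
    total = fields["MemTotal"]
    reclaimable = (fields["MemFree"]
                   + fields.get("Buffers", 0)
                   + fields.get("Cached", 0))
    return 100.0 * (total - reclaimable) / total


# Invented sample in the /proc/meminfo format, not data from this server.
SAMPLE = """\
MemTotal:        1000000 kB
MemFree:          200000 kB
Buffers:          100000 kB
Cached:           400000 kB
"""
print(used_memory_percent(SAMPLE))  # prints 30.0
```

Note that this is a single snapshot; running it periodically (for example from cron) would be needed to see how the value changes over time.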

The next time this happens, if it does happen, I will access my server using Lish console and check for warning messages before rebooting, so that they aren't erased.


PostPosted: Fri Sep 09, 2011 12:58 pm 
Offline
Senior Member

Joined: Sat May 03, 2008 4:01 pm
Posts: 569
Website: http://www.mattnordhoff.com/
Lish's logview command (detach from the screen session showing your console itself) shows the most recent lines of both your current boot and the previous one, so even if you have rebooted, the information is still there. Unless you've rebooted twice, anyway.

_________________
Matt Nordhoff (aka Peng on IRC)


PostPosted: Fri Sep 09, 2011 2:03 pm 
Offline
Junior Member

Joined: Tue Jun 07, 2011 9:16 am
Posts: 31
Location: Spain, EU
OK.


PostPosted: Sun Sep 11, 2011 5:50 am 
Offline
Junior Member

Joined: Tue Jun 07, 2011 9:16 am
Posts: 31
Location: Spain, EU
OK, the server has just crashed, and here is the output from Lish console as you described:

REMOVED. SEE http://lists.xensource.com/archives/htm ... 01172.html FOR INFORMATION

I can't log in; just opening the console shows those messages, and I can't do anything. Note that I changed the MAC and IP addresses, because they aren't important.

In addition, I'm not the first one with this problem on the VPS:

viewtopic.php?t=7514

The graphs on my dashboard say that before the crash CPU load was 20%, network was about 400 kbps and IO was around 3.3 kblocks per second. And, as I mentioned, memory is always at about 30%.

Clearly, I'm not the only Linode customer who has this error. It would be nice if Linode, being a serious company, investigated and solved the problem.

Thanks in advance!


Last edited by usr01 on Sun Sep 11, 2011 10:51 am, edited 30 times in total.

PostPosted: Sun Sep 11, 2011 6:22 am 
Offline
Senior Member
User avatar

Joined: Tue Aug 17, 2004 11:37 pm
Posts: 262
Website: http://www.our-lan.com
WLM: nf@our-lan.com
Location: Brisbane, Australia
I don't think you quite understand what you're being shown.
This is a kernel crash, probably to do with your instance running out of memory. This isn't Linode's fault.

Scroll up in that logview and I'm sure there will be an out-of-memory error somewhere.

This means you need to tune your instance so it doesn't eat as much memory. Maybe run ab or another benchmarking utility against it to see if you can put enough pressure on it to make the error happen.
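
If the console output is saved to a file, the search for an out-of-memory error can be automated. Below is a minimal Python sketch, assuming the dump is plain text; the function name is hypothetical, the patterns are phrases the kernel typically prints for OOM kills, BUGs and panics, and the sample lines are invented, not taken from the removed output in this thread.

```python
import re

# Phrases the Linux kernel commonly prints on serious failures.
PATTERNS = {
    "oom": re.compile(r"out of memory|oom-killer", re.IGNORECASE),
    "bug": re.compile(r"\bBUG\b"),
    "panic": re.compile(r"kernel panic", re.IGNORECASE),
}


def classify_console(text):
    """Return {category: [matching lines]} for a console dump."""
    hits = {}
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.setdefault(name, []).append(line.strip())
    return hits


# Made-up sample lines, only to show the shape of the result.
sample = """\
apache2 invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
Out of memory: Kill process 1234 (apache2) score 50 or sacrifice child
eth0: link up
"""
print(sorted(classify_console(sample)))  # prints ['oom']
```

An empty result does not prove memory was fine; it only means none of these phrases appear in the part of the console that was captured.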

_________________
ServerAdmin - www.our-lan.com
"Diplomacy is the art of saying nice doggy whilst looking for a really big stick"
"In my experience, any attempt to make any system idiot proof will only challenge God to make a better idiot"


PostPosted: Sun Sep 11, 2011 6:31 am 
Offline
Senior Newbie

Joined: Wed Dec 10, 2008 10:35 am
Posts: 14
Also, a quick tip - to scroll up in the console where this is being shown, press Ctrl-A and release, then press Escape. You'll then be able to scroll up and down with the cursor keys and Page Up/Page Down. Press Escape again to go back to normal.


PostPosted: Sun Sep 11, 2011 6:31 am 
Offline
Junior Member

Joined: Tue Jun 07, 2011 9:16 am
Posts: 31
Location: Spain, EU
I'm sorry, I can't scroll up. How is that done? How can I use "logview"?

If I press Ctrl-A, it shows me a message saying "No other window". If I press Escape, then the message hides and I return to the normal console. I can't scroll with the mouse, or the Page Up/Down keys.


PostPosted: Sun Sep 11, 2011 6:46 am 
Offline
Junior Member

Joined: Tue Jun 07, 2011 9:16 am
Posts: 31
Location: Spain, EU
Well, I managed to show the contents of the Lish console:

REMOVED. SEE http://lists.xensource.com/archives/htm ... 01172.html FOR INFORMATION.

It seems to be an Apache issue... Could anyone translate the above output into plain English? Is it a memory error? IO? CPU? It looks like something related to memory or the stack, but I'm not 100% sure.

Thanks in advance!


Last edited by usr01 on Sun Sep 11, 2011 10:49 am, edited 1 time in total.

PostPosted: Sun Sep 11, 2011 7:06 am 
Offline
Junior Member

Joined: Tue Jun 07, 2011 9:16 am
Posts: 31
Location: Spain, EU
It appears it's NOT an Apache issue:

http://www.gossamer-threads.com/lists/xen/devel/216324

This user has exactly the SAME problem as me, and... HE/SHE USES LINODE! Apparently, they haven't reached a solution, as they are not able to reproduce the error.

And in most forums where users report this error, the problem appears to be related to hardware. However, this is strange, as Linode uses virtual environments...

Finally, it's quite remarkable that, as I described, I have used Linode since June, but the problems started in August and occur every few days. The forum post I linked is from August.

Hope we find a solution.


PostPosted: Sun Sep 11, 2011 8:16 am 
Offline
Senior Member
User avatar

Joined: Sat Aug 30, 2008 1:55 pm
Posts: 1739
Location: Rochester, New York
I believe Linode might be a couple steps ahead of you, given that the xen-devel thread was started by a Linode engineer. :-)

Might be a good idea to open a ticket, to the attention of psandin, especially if you're willing to be a test pilot.

_________________
Code:
/* TODO: need to add signature to posts */

