Linode Forum
Linode Community Forums
Posted: Sat Aug 31, 2013 6:36 am
Senior Member

Joined: Wed May 13, 2009 1:32 pm
Posts: 737
Location: Italy
Hi,
I have used Linode for years without a single problem, but since 2013 I've been experiencing some hangs on my Linode and I can't work out why.

When the server hangs I receive an email from Linode saying that it has averaged 20% CPU usage over the last two hours. This usually happens at night, and I find the server hung in the morning.
Apache no longer responds, email doesn't work, SVN doesn't work; no service works, not even SSH.
The only thing that still works is the Lish console.

Do you have any idea what could be causing this?

Thanks.


Top
   
Posted: Sat Aug 31, 2013 12:51 pm
Senior Member

Joined: Wed May 13, 2009 1:32 pm
Posts: 737
Location: Italy
I think fail2ban is causing this issue. What do you think?
Is there a way to catch the problem?


Top
   
Posted: Sat Aug 31, 2013 5:25 pm
Senior Member

Joined: Thu Nov 24, 2011 12:46 pm
Posts: 139
Location: Mesa AZ
sblantipodi wrote:
I think fail2ban is causing this issue. What do you think?
Is there a way to catch the problem?


Why do you think that's the cause?

Have you looked at the /var/log/fail2ban.log files to see what it is doing?

_________________
Kevin a.k.a. Dweeber


Top
   
Posted: Sat Aug 31, 2013 5:49 pm
Senior Member

Joined: Wed May 13, 2009 1:32 pm
Posts: 737
Location: Italy
Dweeber wrote:
sblantipodi wrote:
I think fail2ban is causing this issue. What do you think?
Is there a way to catch the problem?


Why do you think that's the cause?

Have you looked at the /var/log/fail2ban.log files to see what it is doing?


Because it is the most CPU-intensive software I have on that Linode.


Top
   
Posted: Sat Aug 31, 2013 5:52 pm
Senior Member

Joined: Wed May 13, 2009 1:32 pm
Posts: 737
Location: Italy
Is there something that can email me the processes that have been using
20% of the CPU for more than an hour?
That would help me understand what happens on my server before it hangs.


Top
   
Posted: Sat Aug 31, 2013 6:11 pm
Senior Member

Joined: Fri Feb 17, 2012 8:20 pm
Posts: 365
You could look into Longview.

Perhaps you have really huge logs that aren't being rotated? Just guessing..
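
A quick way to check the "huge logs" guess is to list large files under the log directory. The sketch below is only an illustration: `big_files` is a made-up helper name, `/var/log` and 100 MiB are assumed defaults, and the `k` size suffix relies on GNU or BSD `find`.

```shell
#!/bin/sh
# Sketch: print the paths of regular files larger than a given size.
# big_files is a hypothetical helper; the defaults below are assumptions.
big_files() {
  dir=${1:-/var/log}      # directory to scan
  min_mb=${2:-100}        # size limit in MiB
  # -size +Nk selects files bigger than N KiB, so convert MiB to KiB first.
  find "$dir" -type f -size +"$((min_mb * 1024))"k -print
}
```

For example, `big_files /var/log 100` would print any log file above 100 MiB, which is a hint that rotation is not keeping up.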


Top
   
Posted: Sat Aug 31, 2013 8:08 pm
Senior Member

Joined: Tue Apr 13, 2004 6:54 pm
Posts: 833
Run this program from cron every 5 minutes:
Code:
#!/bin/ksh -p

# Append one snapshot per run to a per-day log file.
LOG=/var/tmp/srvr_stat.$(date +%Y%m%d)

{
  date      # timestamp of this snapshot
  uptime    # load averages
  free      # memory and swap usage
  ps aux    # every process with its CPU and memory use
  echo
  echo
} >> "$LOG"

This'll let you see some basics of what your machine is doing; in particular free memory (are you swapping to death?) and processes using lots of CPU. After your machine crashes you can review the log files to see what happened.
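
When reviewing the logs afterwards, it can help to skim only the `uptime` lines first, since each one carries the clock time and the load averages of a snapshot. A minimal sketch, assuming the log format produced by the script above; `load_history` is a hypothetical helper name, and `grep -h` is the GNU/BSD option that suppresses file names.

```shell
#!/bin/sh
# Sketch: print only the uptime lines from one or more snapshot logs.
# Each uptime line contains the text "load average", so grep for that.
# load_history is a hypothetical helper name, not part of the original script.
load_history() {
  grep -h 'load average' "$@"
}
```

Something like `load_history /var/tmp/srvr_stat.20130831` would then show how the load changed through the day, and the full snapshot around the first bad timestamp can be read in detail.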

_________________
Rgds
Stephen
(Linux user since kernel version 0.11)


Top
   
Posted: Sun Sep 01, 2013 8:28 am
Senior Member

Joined: Wed May 13, 2009 1:32 pm
Posts: 737
Location: Italy
sweh wrote:
Run this program from cron every 5 minutes:
Code:
#!/bin/ksh -p

LOG=/var/tmp/srvr_stat.$(date +%Y%m%d)

{
  date
  uptime
  free
  ps aux
  echo
  echo
} >> $LOG

This'll let you see some basics of what your machine is doing; in particular free memory (are you swapping to death?) and processes using lots of CPU. After your machine crashes you can review the log files to see what happened.



Ok, I modified the program to write a new file every 5 minutes and to put these files in a new directory every day.
Code:
#!/bin/ksh -p

# One directory per day, one log file per run.
DIR=/root/log_for_crash_detect/day_$(date +%Y-%m-%d)
mkdir -p "$DIR"
LOG=$DIR/log_$(date +%Y-%m-%d-%H-%M)

{
  date
  uptime
  free
  ps aux
  echo
  echo
} >> "$LOG"


This way it will be easier to track down the problem.

I really suspect that fail2ban is the killer.
This particular Linode doesn't run anything very resource-intensive: it runs a mail server, an SVN server and a proxy server, and I use it for tunneling.
I think the problem is in fail2ban because I know it has trouble analyzing big files.
I rotate my maillog every week, but it can reach 300MB, and I think that may cause problems for fail2ban.

In any case, I will keep you posted if I discover anything more.

Thanks for helping me track down the problem.


Top
   
Posted: Sun Sep 01, 2013 9:02 am
Senior Member

Joined: Wed May 13, 2009 1:32 pm
Posts: 737
Location: Italy
Wait a minute. Is there a way to sort by CPU usage with the ps command?


Top
   
Posted: Sun Sep 01, 2013 9:02 am
Senior Member

Joined: Sat Aug 30, 2008 1:55 pm
Posts: 1739
Location: Rochester, New York
Disable fail2ban and see if the problem goes away?

_________________
Code:
/* TODO: need to add signature to posts */


Top
   
Posted: Sun Sep 01, 2013 9:05 am
Senior Member

Joined: Wed May 13, 2009 1:32 pm
Posts: 737
Location: Italy
hoopycat wrote:
Disable fail2ban and see if the problem goes away?


That is my second option, once I'm sure that fail2ban is the killer.


Top
   
Posted: Sun Sep 01, 2013 10:23 am
Senior Member

Joined: Sun Mar 07, 2010 7:47 pm
Posts: 1970
Website: http://www.rwky.net
Location: Earth
sblantipodi wrote:
Wait a minute. Is there a way to sort by CPU usage with the ps command?

Code:
ps aux --sort '-pcpu'
sorts all processes by CPU usage, highest first.
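
Building on the sorted output, here is a rough sketch of the "email me when something is above 20%" idea asked about earlier in the thread. The names `THRESHOLD` and `high_cpu`, the `--mail` switch and the `root` recipient are all illustrative assumptions, and the mail step assumes a working `mail` (mailx) command with local delivery set up. Note that, as far as I know, the %CPU column from procps `ps` is CPU time divided by the time the process has been running, so it is a lifetime average rather than a reading of the last few minutes.

```shell
#!/bin/sh
# Sketch: report processes whose %CPU is at or above a threshold.
# THRESHOLD, high_cpu, --mail and the recipient are illustrative assumptions.

THRESHOLD=${THRESHOLD:-20}

# Read "ps aux"-style lines on stdin; print the header line plus every row
# whose third column (%CPU) is greater than or equal to the limit in $1.
high_cpu() {
  awk -v limit="$1" 'NR == 1 { print; next } $3 + 0 >= limit + 0 { print }'
}

# Only collect and send a report when called with --mail (e.g. from cron).
if [ "${1:-}" = "--mail" ]; then
  report=$(ps aux --sort=-pcpu | high_cpu "$THRESHOLD")
  # Anything left after dropping the header means at least one match.
  if [ -n "$(printf '%s\n' "$report" | sed 1d)" ]; then
    printf '%s\n' "$report" | mail -s "High CPU on $(hostname)" root
  fi
fi
```

Run from cron with `--mail`, this only sends a message when at least one process crosses the limit; without the switch it just defines the filter, so `ps aux | high_cpu 20` can be tried by hand first.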

_________________
Paid support
How to ask for help
1. Give details of your problem
2. Post any errors
3. Post relevant logs.
4. Don't hide details i.e. your domain, it just makes things harder
5. Be polite or you'll be eaten by a grue


Top
   
Posted: Sun Sep 01, 2013 10:33 am
Senior Member

Joined: Wed May 13, 2009 1:32 pm
Posts: 737
Location: Italy
obs wrote:
sblantipodi wrote:
Wait a minute. Is there a way to sort by CPU usage with the ps command?

Code:
ps aux --sort '-pcpu'
sorts all processes by CPU usage, highest first.


Thanks.


Top
   
Posted: Tue Sep 03, 2013 3:27 pm
Senior Member

Joined: Wed May 13, 2009 1:32 pm
Posts: 737
Location: Italy
In any case, when my server "hangs" it doesn't really hang: it stops responding on its external IP, but I can still connect to it over Lish, so it isn't actually hung.

I suspect that Linode limits CPU usage and that some sort of "resource protection" is being triggered on my Linode.


Top
   
Posted: Tue Sep 03, 2013 4:17 pm
Senior Member

Joined: Fri Feb 17, 2012 8:20 pm
Posts: 365
I've never heard of Linode "limiting CPU". If you're using too much, they'll tell you, but then again, I've never heard of that either. Though if you do think the host is the cause you can open a ticket to be migrated to another host.


Top
   