Linode Forum
Linode Community Forums
PostPosted: Wed Apr 08, 2015 12:07 pm 
Junior Member

Joined: Sun Mar 21, 2010 11:19 pm
Posts: 45
I'm trying to pinpoint the source of the recent high I/O warnings and CPU usage we've seen since the forced security update in early March.

Our Linode actually failed to reboot during that maintenance window, and I wasn't aware of it. The Host Job Queue reported a failed job, which I only noticed 8 days ago when I updated cPanel and the system packages. I rebooted the Linode after those updates.

Since then, I've received a lot more I/O warnings, like: "has exceeded the notification threshold (1000) for disk io rate by averaging 1763.29 for the last 2 hours."

The only thing I can correlate with this issue is the "breaks" in my Linode's reporting graphs, which aren't happening on my other Linode. See this screenshot: https://www.evernote.com/shard/s9/sh/11 ... 66744a3687

January reports no breaks
February reports 1
March shows 15+
Last 24 hours shows 8

We haven't added any significant resource offenders over this time span. I'm wondering whether there is some incompatibility between our version of Linux (CentOS 5.11), the kernel, and the recent Linode upgrades.

Here is my load average, as reported by the sar -q command:

12:00:01 AM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15
12:10:01 AM 1 179 0.27 0.29 0.32
12:20:01 AM 2 180 0.14 0.30 0.33
12:30:01 AM 3 183 0.37 0.39 0.35
12:40:01 AM 1 181 0.31 0.32 0.32
12:50:01 AM 1 180 0.29 0.38 0.34
01:00:01 AM 3 189 0.20 0.26 0.31
01:10:01 AM 1 179 0.24 0.39 0.37
01:20:01 AM 2 181 0.34 0.42 0.42
01:30:01 AM 1 178 0.25 0.32 0.37
01:40:01 AM 1 228 0.54 0.46 0.42
01:50:01 AM 1 185 0.50 0.41 0.41
02:00:01 AM 4 185 0.21 0.61 0.55
02:10:01 AM 1 178 0.38 0.52 0.54
02:20:01 AM 2 179 0.12 0.36 0.47
02:30:01 AM 3 181 0.16 0.25 0.37
02:40:01 AM 2 182 0.16 0.24 0.31
02:50:01 AM 3 180 0.36 0.28 0.31
03:00:01 AM 3 197 0.42 0.31 0.31
03:10:01 AM 2 177 0.17 0.23 0.28
03:20:01 AM 2 180 0.14 0.36 0.38
03:30:01 AM 2 191 0.56 0.46 0.42
03:40:01 AM 1 178 0.34 0.48 0.47
03:50:01 AM 1 184 0.49 0.48 0.47
04:00:01 AM 3 184 0.26 0.32 0.39

04:00:01 AM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15
04:10:01 AM 1 186 0.97 0.53 0.43
04:20:01 AM 2 182 0.26 0.29 0.35
04:30:01 AM 3 187 0.27 0.36 0.39
04:40:01 AM 1 178 0.36 0.32 0.36
04:50:01 AM 1 181 0.44 0.59 0.48
05:00:01 AM 3 187 0.33 0.32 0.40
05:10:01 AM 1 204 0.46 0.63 0.54
05:20:01 AM 3 203 0.80 0.75 0.64
05:30:01 AM 3 190 1.19 1.11 0.88
05:40:01 AM 2 186 1.29 1.35 1.11
05:50:01 AM 2 195 1.14 1.11 1.09
06:00:01 AM 4 192 0.87 1.04 1.09
06:10:01 AM 1 179 0.21 0.69 0.95
06:20:01 AM 2 187 0.42 0.49 0.72
06:30:01 AM 2 187 0.91 0.95 0.87
06:40:01 AM 1 180 0.47 0.48 0.65
06:50:01 AM 1 178 0.41 0.62 0.66
07:00:01 AM 2 193 0.35 0.32 0.47
07:10:01 AM 3 181 0.89 0.59 0.51
07:20:01 AM 2 179 0.27 0.33 0.41
07:30:01 AM 1 183 0.22 0.33 0.38
07:40:01 AM 2 181 1.03 0.67 0.51
07:50:01 AM 3 192 1.11 1.07 0.80
08:00:01 AM 4 188 0.25 0.60 0.74

08:00:01 AM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15
08:10:01 AM 1 198 0.62 1.12 1.23
08:20:01 AM 2 210 0.69 0.51 0.82
08:30:01 AM 4 183 0.12 0.38 0.63
08:40:01 AM 1 187 0.66 0.46 0.55
08:50:01 AM 1 188 0.45 0.47 0.53
09:00:01 AM 4 193 0.73 0.47 0.48
09:10:01 AM 1 190 0.49 0.64 0.57
09:20:01 AM 3 198 0.49 0.51 0.52
09:30:01 AM 2 184 0.56 0.54 0.52
09:40:01 AM 1 190 0.54 0.49 0.50
09:50:01 AM 2 208 0.53 0.61 0.56
10:00:01 AM 3 198 0.66 0.82 0.75
10:10:01 AM 2 193 0.30 0.61 0.70
10:20:01 AM 3 203 0.54 0.51 0.59
10:30:01 AM 2 196 1.06 0.87 0.72
Average: 2 188 0.49 0.53 0.54
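
For anyone who wants to scan output like the above for busy periods rather than reading it row by row, here is a minimal Python sketch. It assumes the 12-hour `sar -q` layout shown in this post (time, AM/PM, then five numeric columns). The names `parse_sar_q` and `busy_intervals` and the 1.0 threshold are made up for illustration, not part of sysstat, and the sample is a short excerpt of the rows posted above.

```python
from typing import List, NamedTuple


class LoadSample(NamedTuple):
    time: str
    runq_sz: int
    plist_sz: int
    ldavg_1: float
    ldavg_5: float
    ldavg_15: float


def parse_sar_q(text: str) -> List[LoadSample]:
    """Parse data rows of `sar -q` output (12-hour clock layout assumed)."""
    samples = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows: HH:MM:SS AM|PM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15
        if len(parts) != 7 or parts[1] not in ("AM", "PM"):
            continue  # skips blank lines and the "Average:" row
        if parts[2] == "runq-sz":
            continue  # skips the repeated column header
        samples.append(
            LoadSample(
                f"{parts[0]} {parts[1]}",
                int(parts[2]),
                int(parts[3]),
                float(parts[4]),
                float(parts[5]),
                float(parts[6]),
            )
        )
    return samples


def busy_intervals(samples: List[LoadSample], threshold: float = 1.0) -> List[LoadSample]:
    """Return samples whose 5-minute load average meets the (arbitrary) threshold."""
    return [s for s in samples if s.ldavg_5 >= threshold]


# Excerpt of the output posted above.
SAMPLE = """\
04:00:01 AM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15
05:20:01 AM 3 203 0.80 0.75 0.64
05:30:01 AM 3 190 1.19 1.11 0.88
05:40:01 AM 2 186 1.29 1.35 1.11
Average: 2 188 0.49 0.53 0.54
"""

if __name__ == "__main__":
    for s in busy_intervals(parse_sar_q(SAMPLE)):
        print(s.time, s.ldavg_5)
```

Feeding it the full output instead of the excerpt would list every 10-minute interval where the 5-minute load average reached the threshold, which can then be lined up against the times of the I/O warnings.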


PostPosted: Wed Apr 08, 2015 7:27 pm 
Senior Member

Joined: Sun Mar 07, 2010 7:47 pm
Posts: 1970
Website: http://www.rwky.net
Location: Earth
The breaks in the graphs occur when the server that generates the graphs can't communicate with the host for some reason. They happen sometimes and have nothing to do with your node. As for the I/O warnings, 1763 isn't that high, but you could try using iotop to watch the I/O usage. Swap usage is also often a cause of increased I/O.
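
As a rough way to check the swap theory alongside iotop, here is a small Python 3 sketch. It works on the text of `/proc/meminfo` (the `SwapTotal` and `SwapFree` fields) and of `/proc/vmstat` (the `pswpin` and `pswpout` counters). The helper names and the sample numbers are invented for illustration, and it assumes a Python 3 interpreter is available, which a stock CentOS 5 install may not have.

```python
def parse_kv(text):
    """Parse 'Key: 123 kB' (meminfo) or 'key 123' (vmstat) lines into ints."""
    out = {}
    for line in text.splitlines():
        parts = line.replace(":", " ").split()
        if len(parts) >= 2 and parts[1].isdigit():
            out[parts[0]] = int(parts[1])
    return out


def swap_used_kb(meminfo_text):
    """Swap in use, in kB, from the contents of /proc/meminfo."""
    info = parse_kv(meminfo_text)
    return info["SwapTotal"] - info["SwapFree"]


def swap_activity(vmstat_before, vmstat_after):
    """Pages swapped in/out between two snapshots of /proc/vmstat."""
    before, after = parse_kv(vmstat_before), parse_kv(vmstat_after)
    return {key: after[key] - before[key] for key in ("pswpin", "pswpout")}


if __name__ == "__main__":
    # Made-up sample values; on a live box, read the real files instead, e.g.
    #   open("/proc/meminfo").read() and open("/proc/vmstat").read()
    meminfo = "MemTotal: 1048576 kB\nSwapTotal: 262144 kB\nSwapFree: 200000 kB\n"
    vmstat_t0 = "pswpin 1000\npswpout 5000\n"
    vmstat_t1 = "pswpin 1040\npswpout 5600\n"
    print("swap used (kB):", swap_used_kb(meminfo))
    print("swap pages moved:", swap_activity(vmstat_t0, vmstat_t1))
```

If the `pswpin`/`pswpout` deltas keep growing between snapshots taken a few seconds apart, the box is actively swapping, and that would be a plausible contributor to the I/O rate in the warnings.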

_________________
Paid support
How to ask for help
1. Give details of your problem
2. Post any errors
3. Post relevant logs.
4. Don't hide details i.e. your domain, it just makes things harder
5. Be polite or you'll be eaten by a grue


PostPosted: Fri Apr 10, 2015 10:08 am 
Junior Member

Joined: Sun Mar 21, 2010 11:19 pm
Posts: 45
That's what Linode support said as well. While I appreciate that, the number of breaks seems to have jumped suddenly. In past months this happened maybe once or twice; now we're well past 30.

