Bill Clinton wrote:
Mind if we ask how we can make such cool graphs?
Bill Clinton (aka Sunny Dubey)
Well, there are probably more elegant ways to do it with MRTG/RRDtool, but I use a couple of simple scripts that I wrote instead. First I run the following script on my Linode:
Code:
#!/bin/sh
# Append a timestamped uptime sample every 15 seconds.
while true; do
    DATE=`date`
    echo -n "$DATE "
    uptime
    sleep 15
done
I run it like this:
Code:
./uptime.sh > uptime.out 2>&1 &
I let that run for as long as I want to; generally I have it running all of the time, and whenever I get around to it I convert the results into a graph with this script:
Code:
#!/bin/sh
# uptime's field positions shift depending on how long the machine has been
# up, so detect where "average:" landed and pick the three load averages
# accordingly. Output columns: month day time year load1 load5 load15
awk '{ if ($16 == "average:") print $2,$3,$4,$6,$17,$18,$19; else print $2,$3,$4,$6,$16,$17,$18 }' uptime.out | tr ',' ' ' > raw_data
echo 'set terminal jpeg; set yrange [0:20]; set xdata time; set bmargin 3; set timefmt "%b %d %H:%M:%S %Y"; plot "raw_data" using 1:5 title "1 min load average" with lines' | gnuplot > plot.jpg
display plot.jpg
That's it. Pretty simple really.
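If you'd rather not count uptime's shifting fields at all, here's a variation I'd sketch it as (this assumes Linux, where /proc/loadavg exposes the three load averages as plain space-separated fields):

Code:
```shell
#!/bin/sh
# Hypothetical alternative to parsing uptime(1): on Linux, /proc/loadavg
# holds "load1 load5 load15 runnable/total last_pid", so the loads can be
# read directly with no awk gymnastics.
sample_load() {
    read ONE FIVE FIFTEEN REST < /proc/loadavg
    # Same column layout as raw_data: month day time year load1 load5 load15
    echo "$(date '+%b %d %H:%M:%S %Y') $ONE $FIVE $FIFTEEN"
}
sample_load
```
Run that in the same 15-second loop and the output should plot with the same gnuplot command, since the timestamp format and column positions match the raw_data file.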
It relies on 'uptime' to report the load average. My Linode is idle almost all of the time, so any significant load is due to other Linodes hogging the disk I/O: while processes wait for their I/O requests to complete they sit in uninterruptible sleep, and Linux counts those toward the load average just like runnable processes. So there's a pretty good correlation between disk activity on the host system and any particular Linode's load level.
For what it's worth, the spikes are pretty short-lived, even though they're annoying. Caker offered to move me to another host, but it's such a minor issue that I don't even want to bother. I'd like to see a UML kernel fix for this problem, though - disk I/O should be shared as fairly as CPU time! Maybe someday someone will enhance Linux to handle this better ...