Linode Community Forums
PostPosted: Mon Mar 11, 2013 1:19 pm 

Joined: Mon Apr 18, 2011 1:54 pm
Posts: 45
Website: http://www.rassoc.com/gregr/weblog
This is more of a PSA... while asking support some questions about the NodeBalancers and the new default 250Mbps outbound bandwidth cap (up from 50Mbps), I learned that bandwidth on the _internal_ network is also subject to the cap. Upon reflection this makes some sense, since internal and external traffic share the same network interface, but boy did it take me by surprise.

My application runs on several nodes with different roles, and there is sometimes around 1MB of data transferred between nodes for a single web request. With a 50Mbps cap, that transfer would take around 160ms - that's quite a while.

After rebooting to change to a 250Mbps cap, I saw a measurable improvement in overall request time for many of my customers...on the order of a 30% improvement for certain requests. Purely because of internal network speed.
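The math checks out (a quick sketch, taking 1 MB as 10^6 bytes):

```shell
# Transfer time for a 1 MB inter-node payload at each port speed cap:
# time_ms = bytes * 8 / (rate_mbps * 1e6) * 1000
awk 'BEGIN {
    bytes = 1000000
    split("50 250", rates)
    for (i = 1; i <= 2; i++)
        printf "%d Mbps: %.0f ms\n", rates[i], bytes * 8 / (rates[i] * 1e6) * 1000
}'
# -> 50 Mbps: 160 ms
# -> 250 Mbps: 32 ms
```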

So anyway, if you heavily use the internal network, just be aware the bandwidth caps apply to that!


PostPosted: Mon Mar 11, 2013 1:25 pm 

Joined: Fri Nov 02, 2012 4:20 pm
Posts: 60
Hadn't thought about that either. Thanks!


PostPosted: Mon Mar 11, 2013 1:27 pm 

Joined: Wed Mar 17, 2004 4:11 pm
Posts: 554
Website: http://www.unixtastic.com
Location: Europe
gregr wrote:
My application runs on several nodes with different roles, and there is sometimes around 1MB of data transferred between nodes for a single web request. With a 50Mbps cap, that transfer would take around 160ms - that's quite a while.


I don't know the details but 1MB internal traffic per web request sounds obscenely high.


PostPosted: Mon Mar 11, 2013 3:08 pm 

Joined: Wed Jul 04, 2012 11:08 am
Posts: 34
Note that staff would set that limiter higher upon request if it caused you any issues. It was mostly a protective measure, I suppose, to prevent a single VM from monopolizing the host's port or inadvertently causing damage to other networks.

HOWEVER. The shaper Linode is using on its hosts seems to be suffering from a, uh, somewhat not very nice case of bufferbloat, which can make performance quite bad in some cases, resulting in very poor intra-DC performance or even complete TCP failures between, for example, frontends and backends. I will be writing something up for the Feature Request/Bug Report subforum on this topic shortly - along with our current workaround; I just need to double-check using a pristine Linode kernel first. :)

In the meantime these might be interesting reads on the topic:
http://en.wikipedia.org/wiki/Bufferbloat
and
http://www.bufferbloat.net/
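To put rough numbers on why a bloated shaper buffer hurts: once it fills to B bytes, every packet behind it waits B*8/rate before it even leaves the host. The buffer sizes below are illustrative guesses, not measurements of Linode's shaper:

```shell
# Queuing delay added by a standing buffer drained at a 50 Mbps shaped rate:
# delay_ms = buf_bytes * 8 / (50 * 1e6) * 1000
awk 'BEGIN {
    split("64 256 1024", kb)
    for (i = 1; i <= 3; i++)
        printf "%4d KB buffer: %.1f ms\n", kb[i], kb[i] * 1024 * 8 / (50 * 1e6) * 1000
}'
# ->   64 KB buffer: 10.5 ms
# ->  256 KB buffer: 41.9 ms
# -> 1024 KB buffer: 167.8 ms
```

Compare that with typical intra-DC RTTs of well under a millisecond and it's easy to see how timeouts between frontends and backends happen.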


PostPosted: Mon Mar 11, 2013 3:45 pm 

Joined: Mon Apr 18, 2011 1:54 pm
Posts: 45
Website: http://www.rassoc.com/gregr/weblog
sednet wrote:
gregr wrote:
My application runs on several nodes with different roles, and there is sometimes around 1MB of data transferred between nodes for a single web request. With a 50Mbps cap, that transfer would take around 160ms - that's quite a while.


I don't know the details but 1MB internal traffic per web request sounds obscenely high.

It's a lot, but I definitely wouldn't call it "obscenely high". There are lots of applications that require a lot of data to be flowing around.

This particular case doesn't happen for every request - just some of them. It's retrieving up to 2 months of 1-minute interval financial pricing data from an internal server that stores all of this data, in order to generate a chart for the user. The results of this are cached - but every now and then it has to be generated from scratch, and it takes a fair amount of data to do it. That 1MB is highly compressed as well - it's quite a bit larger in its natural form.

And if you imagine 5 or 10 of these all happening in parallel, you can see how bandwidth matters a lot.
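For scale, a back-of-envelope sketch (the 40 bytes per bar is my assumption for illustration, not a number from the actual storage format):

```shell
# Rough raw size of 2 months of 1-minute interval pricing data:
awk 'BEGIN {
    bars = 60 * 1440       # ~60 days of 1-minute bars, round the clock
    bytes_per_bar = 40     # assumed size of one raw OHLCV record
    printf "%d bars, ~%.1f MB raw\n", bars, bars * bytes_per_bar / 1e6
}'
# -> 86400 bars, ~3.5 MB raw
```

So ~1 MB on the wire after good compression is entirely plausible.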


PostPosted: Wed Mar 13, 2013 11:52 am 

Joined: Tue Oct 27, 2009 9:26 pm
Posts: 15
gregr wrote:
So anyway, if you heavily use the internal network, just be aware the bandwidth caps apply to that!


Bandwidth caps definitely do not apply on the internal network. I do lots of transfers between a database server and 5 web server nodes, and if that were the case I'd be in deep trouble.

If it shows up as Priv. In / Priv. Out on the traffic graphs on the Linode Manager, it is not being counted towards your cap. I have a graph currently showing an average of 58 Mbit/sec private outbound traffic on the internal network (192.168.x.x address) and no other traffic generated from that host, and it says my combined traffic for the day is a whopping 3.5 MB.

So definitely internal traffic does not count towards your cap, if you use the internal network IPs.


PostPosted: Wed Mar 13, 2013 12:07 pm 

Joined: Tue May 03, 2011 11:55 am
Posts: 105
dataiv wrote:
gregr wrote:
So anyway, if you heavily use the internal network, just be aware the bandwidth caps apply to that!


Bandwidth caps definitely do not apply on the internal network. I do lots of transfers between a database server and 5 web server nodes, and if that were the case I'd be in deep trouble.

If it shows up as Priv. In / Priv. Out on the traffic graphs on the Linode Manager, it is not being counted towards your cap. I have a graph currently showing an average of 58 Mbit/sec private outbound traffic on the internal network (192.168.x.x address) and no other traffic generated from that host, and it says my combined traffic for the day is a whopping 3.5 MB.

So definitely internal traffic does not count towards your cap, if you use the internal network IPs.


He's not talking about transfer quota. He's talking about the port speed cap on Linodes, i.e. 250 Mbps.


PostPosted: Thu Mar 14, 2013 1:54 am 

Joined: Sat Sep 25, 2010 2:25 am
Posts: 75
Website: http://www.ruchirablog.com
Location: Sri Lanka
dataiv wrote:
my combined traffic for the day is a whopping 3.5 MB.


excuse me? :roll:



PostPosted: Sat Mar 16, 2013 10:30 pm 

Joined: Sat Mar 16, 2013 10:23 pm
Posts: 1
trippeh wrote:
Note that staff would set that limiter higher upon request if it caused you any issues. It was mostly a protective measure, I suppose, to prevent a single VM from monopolizing the host's port or inadvertently causing damage to other networks.

HOWEVER. The shaper Linode is using on its hosts seems to be suffering from a, uh, somewhat not very nice case of bufferbloat, which can make performance quite bad in some cases, resulting in very poor intra-DC performance or even complete TCP failures between, for example, frontends and backends. I will be writing something up for the Feature Request/Bug Report subforum on this topic shortly - along with our current workaround; I just need to double-check using a pristine Linode kernel first. :)

In the meantime these might be interesting reads on the topic:
http://en.wikipedia.org/wiki/Bufferbloat
and
http://www.bufferbloat.net/


As a Linode user, I would love to help them try out an fq_codel-enabled shaper on their servers (if that is what they are using), as well as on the underlying hardware under the VM with no shaper. The results we get all the way up to 10GigE have been remarkable.

The results at 4-100Mbit are even more remarkable.

See, for example, the CableLabs results on cable modems discussed in last week's IETF ICCRG meeting:

http://www.ietf.org/proceedings/86/slid ... ccrg-3.pdf

Plenty more data like that floating around. Some caveats apply particularly at lower speeds:

http://www.ietf.org/proceedings/86/slid ... ccrg-0.pdf

please have someone at linode contact me offline if you would like to try this stuff out.

dave taht


PostPosted: Sat Mar 16, 2013 10:39 pm 

Joined: Wed Jul 04, 2012 11:08 am
Posts: 34
I'm running fq_codel on my nodes, limiting outbound to slightly under Linode's limiter using HFSC so I never get backed up in their buffers. It worked wonders when they shaped you at 50Mbit/s: no more database timeouts connecting to other nodes.

It seems to be a little less of an issue now at 250Mbit/s, but it is still quite helpful for the responsiveness of the sites we run when under stress (either from one huge backup job or just a user with fast pipes).
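The shape of the setup, stripped down (a sketch - eth0 and the 240Mbit ceiling are placeholders, not my exact config; the full script follows in my next post):

```shell
# Cap egress just under Linode's 250 Mbps shaper so queuing happens here,
# where fq_codel manages it, instead of in the host's dumb buffer.
tc qdisc del dev eth0 root 2>/dev/null
tc qdisc add dev eth0 root handle 1: hfsc default 1
tc class add dev eth0 parent 1: classid 1:1 hfsc sc rate 240Mbit ul rate 240Mbit
tc qdisc add dev eth0 parent 1:1 handle 11: fq_codel target 5ms
```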

Yes I'm still meaning to do those measurements :)


PostPosted: Sun Mar 17, 2013 12:00 am 

Joined: Wed Jul 04, 2012 11:08 am
Posts: 34
This is what I currently use to keep buffering under control and ensure fairness. It integrates with ifupdown on Debian-derived distros (that is, /etc/network/interfaces).

(Edit) PS! I do not know if the vanilla Linode kernel has all the required bits compiled in. I'm using pv-grub to load our own kernel. 3.4 and newer ship with the needed parts, but they might not be enabled.

You can test it directly by running
Code:
IFACE=eth0 IF_EGRESS_RATE=240Mbit ./shaper


Code:
#!/bin/sh
#
# Add fq_codel to all networking devices without any config.
#
# Save as /etc/network/if-up.d/shaper
# chmod 755 /etc/network/if-up.d/shaper
#
# Needs a fairly recent iproute package (from sometime in 2012 IIRC),
# fq_codel and Byte Queue Limits kernel support (mainline since 3.4 I
# think), HFSC shaper kernel support if using egress-rate, and IFB
# device kernel support if using ingress-redir.
#
# Support setting a custom egress-rate to avoid excessive buffering when
# it happens upstream of us (say, shaper on a vm host). Otherwise we
# just add as root qdisc.
#
# Avoids wireless devices because they are currently incompatible.
#
# In /etc/network/interfaces
# if no excessive buffering upstream, no config needed. fq_codel is
# attached directly to device.
#
# If excessive buffering upstream - limit our egress by setting a egress-rate:
#
#  iface eth0 inet dhcp
#  egress-rate 240Mbit
#
# To shape ingress, redirect to a queueing device and set the egress
# limit on that (works only because TCP tries to be nice)
#
#  iface eth0 inet dhcp
#  ingress-redir ifb0
#
#  iface ifb0 inet manual
#  egress-rate 500Mbit
#
# Note that the bandwidths must be set low enough to avoid packets
# getting queued in the upstream buffer, typically a little under what
# you're sold.
#
# Other available parameters are:
# egress-target try to keep buffering below this value, default 5ms
# egress-flows  fq_codel flow "buckets", default 10240
# egress-ecn    do ECN marking when saturating, "yes" to turn on - default off
#
# It can be useful to also turn off TSO, GSO and GRO using ethtool -K
# to improve accuracy (at the cost of some more CPU usage).
#
# Debug (show commands)
set -x

IP=/sbin/ip
TC=/sbin/tc

[ -x "$TC" ] && [ -x "$IP" ] || exit 0
[ "$IFACE" = "lo" ] && exit 0

# Buggy logic to detect physical device and exclude wireless
[ ! -e "/sys/class/net/$IFACE/device" ] && [ -z "$IF_EGRESS_RATE" ] && exit 0
[ -e "/sys/class/net/$IFACE/wireless" ] && [ -z "$IF_EGRESS_RATE" ] && exit 0

[ -z "$IF_EGRESS_TARGET" ] && IF_EGRESS_TARGET=5ms
[ -z "$IF_EGRESS_FLOWS" ] && IF_EGRESS_FLOWS=10240

# Is ECN useful on egress when we are the source (not forwarding)?
# leave off by default for now.
ecn="noecn"
if [ "$IF_EGRESS_ECN" = "on" ] || [ "$IF_EGRESS_ECN" = "yes" ]; then
        ecn="ecn"
fi

# Reset
$TC qdisc del dev $IFACE root 2>/dev/null || true
$TC qdisc del dev $IFACE ingress 2>/dev/null || true

if [ ! -z "$IF_EGRESS_RATE" ]; then
        # Bandwidth limiting mode
        echo "Setting egress rate of $IFACE to $IF_EGRESS_RATE, target $IF_EGRESS_TARGET, flows $IF_EGRESS_FLOWS"
        $TC qdisc add dev $IFACE root handle 1 hfsc default 1
        $TC class add dev $IFACE parent 1:  classid 1:1  hfsc sc rate $IF_EGRESS_RATE ul rate $IF_EGRESS_RATE
        $TC qdisc add dev $IFACE parent 1:1 handle 11: fq_codel target $IF_EGRESS_TARGET flows $IF_EGRESS_FLOWS $ecn
else
        # Link-limited mode (default; don't fail interface bringup if it doesn't work)
        echo "Setting scheduler of $IFACE to fq_codel, target $IF_EGRESS_TARGET, flows $IF_EGRESS_FLOWS"
        $TC qdisc add dev $IFACE root fq_codel target $IF_EGRESS_TARGET flows $IF_EGRESS_FLOWS $ecn || true
fi

# Redir to queueing device
if [ ! -z "$IF_INGRESS_REDIR" ]; then
        $IP link set dev $IF_INGRESS_REDIR up
        $TC qdisc add dev $IFACE ingress
        # Redirect both IPv4 and IPv6..
        $TC filter add dev $IFACE parent ffff: protocol ip prio 1 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev $IF_INGRESS_REDIR
        $TC filter add dev $IFACE parent ffff: protocol ipv6 prio 2 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev $IF_INGRESS_REDIR

fi
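Once it's in place you can sanity-check the stack and watch fq_codel's counters; growing drop/ecn_mark counts under load mean the queue is being managed here rather than in the host's shaper (eth0 assumed):

```shell
# Show the installed qdiscs with statistics (typically needs root)
tc -s qdisc show dev eth0
```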


