Linode Community Forums
 Post subject: Slow SCP through network
Posted: Sat Dec 03, 2011 9:02 pm
Newbie

Joined: Sat Dec 03, 2011 6:42 pm
Posts: 2
I don't want to bug Linode's staff too much with support tickets, so I hope someone can help me out here.

The main issue is that "scp" performance varies wildly. One transfer takes 8 seconds, the next takes 4 minutes. I am using "scp" to copy data back and forth between a node in Germany and Linode.

Here's an example. These two commands were executed within a minute of each other:

Code:
/srv# time scp -p -r -P 1234 -i /root/a.pem file1 node.in.europe:/tmp/test
file1                                                                                                                                                                      100% 2184KB   2.1MB/s   00:01

real    0m6.368s
user    0m0.078s
sys     0m0.033s

/srv# time scp -p -r -P 1234 -i /root/a.pem file1 node.in.europe:/tmp/test
file1                                                                                                                                                                      100% 2184KB 546.1KB/s   00:04

real    1m5.746s
user    0m0.062s
sys     0m0.048s



On the latter one, SCP would go from 0% to 100% within seconds, and then hang at 100% for minutes. Running SCP with -vvv gives me:

Code:
file1                                                                                                                                                                      100% 2184KB 728.1KB/s   00:03
debug3: Wrote 28960 bytes for a total of 156975
debug3: Wrote 28960 bytes for a total of 185935
debug3: Wrote 27512 bytes for a total of 213447
debug3: Wrote 28960 bytes for a total of 242407
debug2: channel 0: rcvd adjust 114688
debug3: Wrote 27512 bytes for a total of 269919
debug3: Wrote 24616 bytes for a total of 294535
debug3: Wrote 24616 bytes for a total of 319151
[...]
Transferred: sent 2241120, received 2712 bytes, in 59.2 seconds
Bytes per second: sent 37832.0, received 45.8
debug1: Exit status 0

real    1m1.056s
user    0m0.084s
sys     0m0.046s


for an instance where it takes long, and

Code:
file1                                                                                                                                                                      100% 2184KB   2.1MB/s   00:01
debug3: Wrote 94120 bytes for a total of 400239
debug3: Wrote 77848 bytes for a total of 478087
debug2: channel 0: rcvd adjust 114688
debug3: Wrote 131264 bytes for a total of 609351
debug3: Wrote 32816 bytes for a total of 642167
debug3: Wrote 32816 bytes for a total of 674983
debug2: channel 0: rcvd adjust 131072
debug3: Wrote 65632 bytes for a total of 740615
debug3: Wrote 32816 bytes for a total of 773431
[...]
Transferred: sent 2241120, received 2712 bytes, in 3.8 seconds
Bytes per second: sent 592957.3, received 717.5
debug1: Exit status 0

real    0m5.235s
user    0m0.086s
sys     0m0.028s


for an instance where it went fast. From that, I conclude that SCP can get the data ready very quickly, and spends most of its time waiting to be able to push the data out of the network interface.
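As a rough cross-check of that conclusion, the byte counts and durations from the two "Transferred:" lines above can be turned into effective rates. This is only a minimal sketch: the function name is made up for illustration, and the durations are the rounded values ssh printed, so the results differ slightly from ssh's own "Bytes per second" figures.

```python
def effective_rate(bytes_sent: int, seconds: float) -> float:
    """Return the average send rate in bytes per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_sent / seconds


# Figures copied from the two "Transferred:" lines above.
slow = effective_rate(2241120, 59.2)
fast = effective_rate(2241120, 3.8)

print(f"slow run: {slow / 1024:.1f} KiB/s")
print(f"fast run: {fast / 1024:.1f} KiB/s")
print(f"fast/slow ratio: {fast / slow:.1f}x")
```

In the slow run the progress meter claimed 728.1KB/s while the session as a whole averaged under 40 KB/s, which is consistent with scp handing data over far faster than it was actually delivered.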

With that, I ran a couple of traceroutes, and I'm getting different routes every time:

Code:
/srv# traceroute node.in.europe
traceroute to node.in.europe (87.230.ooo.ooo), 30 hops max, 60 byte packets
 1  a1.7.1243.static.theplanet.com (67.18.7.161)  0.551 ms  0.658 ms  0.645 ms
 2  xe-2-0-0.car03.dllstx2.networklayer.com (67.18.7.89)  0.178 ms  0.206 ms  0.191 ms
 3  po101.dsr02.dllstx2.networklayer.com (70.87.254.77)  0.582 ms  0.661 ms  0.611 ms
 4  te4-3.dsr02.dllstx3.networklayer.com (70.87.255.129)  0.768 ms  0.760 ms te3-2.dsr02.dllstx3.networklayer.com (70.87.253.133)  0.812 ms
 5  ae17.bbr02.eq01.dal03.networklayer.com (173.192.18.230)  50.695 ms  50.724 ms ae17.bbr01.eq01.dal03.networklayer.com (173.192.18.226)  0.477 ms
 6  dls-bb1-link.telia.net (213.248.102.173)  0.490 ms  0.548 ms  0.534 ms
 7  ash-bb1-link.telia.net (213.155.133.178)  60.992 ms  60.413 ms ae2-20G.scr2.DAL1.gblx.net (67.16.141.237)  5.790 ms
 8  ldn-bb1-link.telia.net (80.91.246.69)  109.003 ms * po6.ar4.AMS2.gblx.net (67.17.107.174)  124.616 ms
 9  ldn-b5-link.telia.net (80.91.248.216)  109.141 ms  109.125 ms *
10  * * *
11  xe-0-0-1.dr-master.r1.cgn3.hosteurope.de (176.28.4.14)  130.149 ms xe-0-2-0.cr-merak.fra2.hosteurope.de (176.28.4.2)  123.437 ms xe-0-0-1.dr-master.r1.cgn3.hosteurope.de (176.28.4.14)  128.728 ms
12  xe-2-2-0.cr-pollux.cgn3.hosteurope.de (80.237.129.169)  128.345 ms  128.334 ms  128.287 ms


I can repeat the traceroute, and it'll be different hosts every time; however, the dropouts are usually close to the Germany node, in the telia.net network.

Now, a traceroute from the Germany node to Linode takes an entirely different route, avoiding telia.net altogether:

Code:
# traceroute node.in.the.us
traceroute to node.in.the.us (66.228.ooo.ooo), 30 hops max, 40 byte packets
 1  * * *
 2  xe-3-3-0.cr-pollux.cgn3.hosteurope.de (176.28.4.9)  0.232 ms  0.231 ms  0.213 ms
 3  xe-0-2-0.cr-antares.ams1.hosteurope.de (80.237.129.182)  4.509 ms  4.514 ms xe-0-3-0.cr-antares.ams1.hosteurope.de (80.237.129.118)  4.523 ms
 4  tengigabitethernet6-2.ar4.ams2.gblx.net (206.165.75.1)  4.854 ms  4.816 ms  4.809 ms
 5  ar4.scr4.AMS2.gblx.net (67.17.107.173)  4.656 ms  4.643 ms  4.634 ms
 6  ae13.scr4.NYC1.gblx.net (67.16.166.214)  83.645 ms  83.511 ms  83.483 ms
 7  e5-1-30G.ar9.NYC1.gblx.net (67.16.142.54)  82.889 ms  80.544 ms  89.375 ms
 8  softlayer-technologies-inc.ethernet11-3.ar9.nyc1.gblx.net (206.165.75.234)  79.368 ms  79.391 ms  79.359 ms
 9  ae7.bbr02.tl01.nyc01.networklayer.com (173.192.18.177)  86.988 ms  86.961 ms  86.947 ms
10  ae1.bbr01.eq01.chi01.networklayer.com (173.192.18.132)  106.135 ms  106.122 ms  106.080 ms
11  ae20.bbr01.eq01.dal03.networklayer.com (173.192.18.136)  125.737 ms  125.731 ms  125.687 ms
12  po31.dsr01.dllstx3.networklayer.com (173.192.18.225)  121.404 ms  118.953 ms  122.796 ms
13  te4-4.dsr02.dllstx2.networklayer.com (70.87.255.134)  125.490 ms * te2-1.dsr01.dllstx2.networklayer.com (70.87.255.66)  129.907 ms
14  po2.car01.dllstx2.networklayer.com (70.87.254.78)  126.373 ms po1.car01.dllstx2.networklayer.com (70.87.254.74)  125.014 ms po2.car01.dllstx2.networklayer.com (70.87.254.78)  168.917 ms
15  5a.7.1243.static.theplanet.com (67.18.7.90)  128.726 ms  125.728 ms  129.486 ms



I guess my question basically is - what can I do? Is routing into the telia.net network something any of these providers can influence? Or am I barking up the wrong tree altogether and this isn't the real reason I'm getting these differences in performance?


Posted: Sat Dec 03, 2011 9:26 pm
Senior Member

Joined: Fri Dec 10, 2010 6:21 am
Posts: 130
stw wrote:
With that, I ran a couple of traceroutes, and I'm getting different routes every time:

Code:
/srv# traceroute node.in.europe
traceroute to node.in.europe (87.230.ooo.ooo), 30 hops max, 60 byte packets
 1  a1.7.1243.static.theplanet.com (67.18.7.161)  0.551 ms  0.658 ms  0.645 ms
 2  xe-2-0-0.car03.dllstx2.networklayer.com (67.18.7.89)  0.178 ms  0.206 ms  0.191 ms
 3  po101.dsr02.dllstx2.networklayer.com (70.87.254.77)  0.582 ms  0.661 ms  0.611 ms
 4  te4-3.dsr02.dllstx3.networklayer.com (70.87.255.129)  0.768 ms  0.760 ms te3-2.dsr02.dllstx3.networklayer.com (70.87.253.133)  0.812 ms
 5  ae17.bbr02.eq01.dal03.networklayer.com (173.192.18.230)  50.695 ms  50.724 ms ae17.bbr01.eq01.dal03.networklayer.com (173.192.18.226)  0.477 ms
 6  dls-bb1-link.telia.net (213.248.102.173)  0.490 ms  0.548 ms  0.534 ms
 7  ash-bb1-link.telia.net (213.155.133.178)  60.992 ms  60.413 ms ae2-20G.scr2.DAL1.gblx.net (67.16.141.237)  5.790 ms
 8  ldn-bb1-link.telia.net (80.91.246.69)  109.003 ms * po6.ar4.AMS2.gblx.net (67.17.107.174)  124.616 ms
 9  ldn-b5-link.telia.net (80.91.248.216)  109.141 ms  109.125 ms *
10  * * *
11  xe-0-0-1.dr-master.r1.cgn3.hosteurope.de (176.28.4.14)  130.149 ms xe-0-2-0.cr-merak.fra2.hosteurope.de (176.28.4.2)  123.437 ms xe-0-0-1.dr-master.r1.cgn3.hosteurope.de (176.28.4.14)  128.728 ms
12  xe-2-2-0.cr-pollux.cgn3.hosteurope.de (80.237.129.169)  128.345 ms  128.334 ms  128.287 ms


I can repeat the traceroute, and it'll be different hosts every time; however, the dropouts are usually close to the Germany node, in the telia.net network.



Looks like the varying routes already start in the networklayer.com network, as the traffic sometimes seems to go out through telia.net and sometimes through gblx.net? (Based on the varying hosts at the same hop count in the trace above.)

Speculation follows:
Seems like it's networklayer.com that, for whatever reason, switches back and forth between these two... which quite possibly is because the route to whichever one they prefer is flapping, or something along those lines.

It's unclear whether your problem is related to which path is used, i.e. whether one is actually notably better than the other, or whether the problem is the switching back and forth itself.
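Spotting hops where more than one host answered can be done mechanically. The following is a minimal sketch, assuming the Linux traceroute output format shown in the trace above; the function names and the regular expression are illustrative, not part of any tool.

```python
import re

# Matches "hostname (a.b.c.d)" pairs as printed by traceroute.
HOST_RE = re.compile(r"(\S+) \((\d{1,3}(?:\.\d{1,3}){3})\)")


def hosts_per_hop(traceroute_output: str) -> dict[int, list[str]]:
    """Map each hop number to the distinct hostnames that answered at that hop."""
    hops: dict[int, list[str]] = {}
    for line in traceroute_output.splitlines():
        fields = line.split()
        # Hop lines start with the hop number; skip the header and blank lines.
        if not fields or not fields[0].isdigit():
            continue
        seen: list[str] = []
        for host, _ip in HOST_RE.findall(line):
            if host not in seen:
                seen.append(host)
        hops[int(fields[0])] = seen
    return hops


def multipath_hops(traceroute_output: str) -> dict[int, list[str]]:
    """Return only the hops where more than one host answered."""
    return {hop: hosts
            for hop, hosts in hosts_per_hop(traceroute_output).items()
            if len(hosts) > 1}


# Hops 6-8 copied from the trace quoted above.
sample = """\
 6  dls-bb1-link.telia.net (213.248.102.173)  0.490 ms  0.548 ms  0.534 ms
 7  ash-bb1-link.telia.net (213.155.133.178)  60.992 ms  60.413 ms ae2-20G.scr2.DAL1.gblx.net (67.16.141.237)  5.790 ms
 8  ldn-bb1-link.telia.net (80.91.246.69)  109.003 ms * po6.ar4.AMS2.gblx.net (67.17.107.174)  124.616 ms
"""

for hop, hosts in multipath_hops(sample).items():
    print(hop, hosts)
```

Running this over several consecutive traces would show whether the telia.net/gblx.net split always appears at the same hops. It says nothing about which of the two paths is losing packets.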


Posted: Sat Dec 03, 2011 10:01 pm
Senior Member

Joined: Sun Dec 27, 2009 11:12 pm
Posts: 879
Location: Colorado, USA
stw wrote:
I don't want to bug Linode's staff too much with support tickets

And yet Linode's Accounting Dept bugs me every month wanting payment.

You pay for service - don't be afraid to use it. The worst that can happen is they say it's not a hardware/infrastructure problem, so you're on your own.


Posted: Sun Dec 04, 2011 12:17 am
Senior Member

Joined: Sat Aug 30, 2008 1:55 pm
Posts: 1651
Location: Rochester, New York
I would lean towards a network thing, as well. scp's progress indicator is based on when the network stack accepts the data, not necessarily when it is actually delivered. (Try scping a file from home to somewhere else... it sits on 100% like it's a Windows OS installation.) This smells a lot like inconvenient packet loss.

Try laying down a ping while doing the scp, or maybe even mtr. Whatever is causing the routing to change is probably also dropping packets for a few seconds.

_________________
Code:
Warning (10631): VHDL Process Statement warning at signature.vhd(1): inferring latch(es) for signal or variable "disclaimer", which holds its previous value in one or more paths through the process


Posted: Sun Dec 04, 2011 1:46 pm
Newbie

Joined: Sat Dec 03, 2011 6:42 pm
Posts: 2
First of all, thanks everyone for your insights.


hawk7000 wrote:
Looks like the varying routes already start in the networklayer.com network, as the traffic sometimes seems to go out through telia.net and sometimes through gblx.net? (Based on the varying hosts at the same hop count in the trace above.)


That's true - I noticed some of the replies came from different hosts, but I didn't see a pattern in there. I agree with you, networklayer.com is probably switching routes between telia.net and gblx.net - which makes it hard to tell which network (telia.net or gblx.net) makes SCP take this long - or whether it's the switching itself that causes it.

It still puzzles me that sometimes SCP finishes within seconds, and sometimes only after minutes have passed - yet the routes change within a traceroute. I'd assume that at some point, the speed of SCP would pick up if the route changes, or at least that it remains consistently low if route flapping itself is the issue - but instead the speed is either consistently slow, or consistently fast during a transfer.


hoopycat wrote:
This smells a lot like inconvenient packet loss.

Try laying down a ping while doing the scp, or maybe even mtr. Whatever is causing the routing to change is probably also dropping packets for a few seconds.


That's a good idea - it's funny how I use traceroute and all, and then forget to use one of the most basic tools. I guess I neglected that because I assumed that commercial connections could not possibly have packet loss, and that it'd be a problem confined to poor ADSL lines.
Anyway, you were correct; I ran a ping during the scp, and it would have a packet loss of around 20%.

For a 25 second SCP, I got
Code:
--- node.in.europe ping statistics ---
26 packets transmitted, 20 received, 23% packet loss, time 25026ms
rtt min/avg/max/mdev = 127.911/129.334/131.692/0.947 ms

and for a 2 minute SCP I got
Code:
--- node.in.europe ping statistics ---
140 packets transmitted, 114 received, 18% packet loss, time 139113ms
rtt min/avg/max/mdev = 126.246/128.959/133.576/1.009 ms


None of the replies were delivered out of order. During the "fast" scp (8 seconds) I had no packet loss. I tried running SCP over a different port (1235) to see whether port 1234 was being throttled, but I got the same figures.
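The loss percentages in the two ping summaries above can be recomputed from the transmitted/received counts. This is a minimal sketch, assuming the Linux ping summary format shown above; the function name and regular expression are illustrative.

```python
import re

SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) received")


def packet_loss(summary_line: str) -> float:
    """Return the loss fraction (0.0 to 1.0) from a ping summary line."""
    match = SUMMARY_RE.search(summary_line)
    if match is None:
        raise ValueError("not a ping summary line")
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("no packets transmitted")
    return (sent - received) / sent


# Summary lines copied from the two ping runs above.
short_run = packet_loss(
    "26 packets transmitted, 20 received, 23% packet loss, time 25026ms")
long_run = packet_loss(
    "140 packets transmitted, 114 received, 18% packet loss, time 139113ms")

print(f"25 second scp: {short_run:.1%} loss")
print(f"2 minute scp:  {long_run:.1%} loss")
```

Logging this fraction per transfer, alongside the transfer time, would make it easier to hand concrete numbers to a support desk.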


So, thanks to you I realized that the problem is (pretty significant?) packet loss, even though I am not sure, and probably can't find out, whether it's caused by congestion or by the route switching. With that, is there anything I can do? Is this something Linode (or the German hoster) has any influence over? As in, could (and would) Linode choose a different route, or is it out of their hands anyway (in which case I wouldn't bother asking), because it's no longer in their network?

Telia.net is in Stockholm; networklayer.com's whois information is proxied (Domains By Proxy), which strikes me as a bit odd, since hiding contact information is something I would only expect individuals to do. Still, who would have more influence over the route - Linode or the provider in Germany?


Posted: Mon Dec 05, 2011 7:20 pm
Senior Member

Joined: Sat Aug 30, 2008 1:55 pm
Posts: 1651
Location: Rochester, New York
Oh wow. TCP performance tends to decay pretty hard beyond about 5% packet loss, so the fact that it is working at all ought to be a pleasant surprise. Keep in mind that a lot of this is hidden from you because of a very large transmit buffer (see the Send-Q column on netstat -nt)... what looks smooth and constant from scp's perspective is very likely fits-and-starts under the hood. (scp really crams a lot into the sendq.)
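To put a rough number on that decay, a common back-of-the-envelope estimate is the Mathis et al. approximation: throughput is at most about (MSS / RTT) * C / sqrt(p), with C around 1.22 and p the loss fraction. The sketch below plugs in figures from this thread. The MSS of 1448 bytes is an assumption (the debug3 write sizes such as 28960 happen to be multiples of it), and the formula was derived for small loss rates, so at around 20% loss it is only an order-of-magnitude guide.

```python
import math


def mathis_throughput(mss_bytes: float, rtt_seconds: float, loss: float,
                      c: float = math.sqrt(3 / 2)) -> float:
    """Approximate upper bound on steady-state TCP throughput, in bytes/second."""
    if not 0 < loss <= 1:
        raise ValueError("loss must be in (0, 1]")
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return (mss_bytes / rtt_seconds) * c / math.sqrt(loss)


MSS = 1448    # assumed: typical Ethernet MSS with TCP timestamps enabled
RTT = 0.129   # average RTT in seconds, from the ping output earlier in the thread

for loss in (0.001, 0.01, 0.05, 0.20):
    rate = mathis_throughput(MSS, RTT, loss)
    print(f"loss {loss:>5.1%}: about {rate / 1024:7.1f} KiB/s")
```

With 20% loss this gives roughly 30 KB/s, the same ballpark as the 37832 bytes per second ssh reported for the slow transfer.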

This is probably worth tickets from both ends. In general, contact the party/parties with whom you have a business relationship. Neither Softlayer (theplanet.com, networklayer.com) nor Global Crossing nor Telia will deal with you directly, so start from the ends. (This is also handy because, in all but the most trivial cases, the return path will be totally different than the forward path, and packet loss could occur on either with similar effect. Did the packet get lost on the way there, or did the acknowledgement get lost on the way here?)



Posted: Mon Dec 05, 2011 7:26 pm
Senior Member

Joined: Sat May 03, 2008 4:01 pm
Posts: 537
hoopycat wrote:
... (This is also handy because, in all but the most trivial cases, the return path will be totally different than the forward path, and packet loss could occur on either with similar effect. Did the packet get lost on the way there, or did the acknowledgement get lost on the way here?)

...Use 2ping and find out!

_________________
Matt Nordhoff (aka Peng on IRC)


Posted: Tue Dec 06, 2011 2:50 am
Senior Member

Joined: Sun Jan 18, 2009 2:41 pm
Posts: 642
hoopycat wrote:
(scp really crams a lot into the sendq.)

If you use the -l (lower-case L) option, it seems to prevent scp from doing that. I've found it to be useful for getting more honest status reports from scp when transferring small files (which scp would normally just report 100% completion on immediately, as it's dumped the entire file into the send queue).
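One thing to watch with -l is the unit: scp takes the limit in Kbit/s, not KB/s. The sketch below builds such a command line from a KB/s figure. The helper name is hypothetical, the 500 KB/s cap is an arbitrary example value, and the host, port, and paths are the placeholders used earlier in the thread.

```python
import shlex


def scp_command(source: str, destination: str, limit_kbytes_per_s: int,
                port: int = 22) -> list[str]:
    """Build an scp argument list with a bandwidth cap.

    scp's -l option takes Kbit/s, so the KB/s figure is multiplied by 8.
    """
    if limit_kbytes_per_s <= 0:
        raise ValueError("limit must be positive")
    return ["scp", "-P", str(port), "-l", str(limit_kbytes_per_s * 8),
            source, destination]


# Cap the transfer at an example value of 500 KB/s.
cmd = scp_command("file1", "node.in.europe:/tmp/test", 500, port=1234)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Picking a cap somewhat below the rate the link actually sustains should keep the progress meter closer to what is really being delivered.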

