Multi-Threaded SSH/SCP
neo writes "Chris Rapier has presented a paper describing how to dramatically increase the speed of SCP transfers. It appears that because SCP relies on a single thread in SSH, the crypto can sometimes be the bottleneck instead of the wire speed. Their new implementation (HPN-SSH) takes advantage of multi-thread-capable systems, dramatically increasing the speed of securely copying files. They are currently looking for potential users with very high bandwidth to test the upper limits of the system."
You can also use a cheaper cipher. From the ssh manpage:
-c blowfish|3des|des
Selects the cipher to use for encrypting the session. 3des is
used by default. It is believed to be secure. 3des (triple-des)
is an encrypt-decrypt-encrypt triple with three different keys.
blowfish is a fast block cipher, it appears very secure and is
much faster than 3des. des is only supported in the ssh client
for interoperability with legacy protocol 1 implementations that
do not support the 3des cipher. Its use is strongly discouraged
due to cryptographic weaknesses.
They claim that the first bottleneck is actually flow control of buffers, which prevents utilizing full network bandwidth in normal gigabit connections. The threads will help only after this first bottleneck has been cleared. They have patches to fix both problems. The slashdot summary was therefore a bit inaccurate, and reading TFA certainly helps.
Or just compile from source and enable the 'none' "cipher".
I surely missed having that option when copying files between hosts on my LAN. I don't need to hide data from myself. If someone else connects and encrypting data is a concern, I'll simply not use the 'none' "cipher".
By the way, does anybody else think "the ability to switch to a NONE cipher post authentication" is pretty dodgy?
Actually, it appears that (at least on Debian) AES is already the default. Selecting 3des gives a tremendous slowdown; blowfish is somewhat slower than AES.
Copying 100 MB of data over 100 Mbit Ethernet to a P2 350 MHz box (the slowest I've got) gives:
* 3des 1.9MB/s
* AES 4.8MB/s
* blowfish 4.4MB/s
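To put those numbers in perspective, here is a rough sketch that turns the three measured rates into copy times and compares them with the raw wire rate. The 12.5 MB/s ceiling is an assumption (100 Mbit/s divided by 8, ignoring all protocol overhead); the rates themselves are the ones quoted above.

```python
# Rates reported above for the 100 MB copy, in MB/s.
measured_mb_per_s = {"3des": 1.9, "AES": 4.8, "blowfish": 4.4}
file_size_mb = 100

# Assumption: 100 Mbit/s divided by 8 bits per byte, no protocol overhead.
wire_ceiling_mb_per_s = 100 / 8


def transfer_seconds(size_mb, rate_mb_per_s):
    """Time needed to move size_mb at a steady rate_mb_per_s."""
    return size_mb / rate_mb_per_s


for cipher, rate in measured_mb_per_s.items():
    secs = transfer_seconds(file_size_mb, rate)
    share = rate / wire_ceiling_mb_per_s
    print(f"{cipher:9s} {secs:5.1f} s  ({share:.0%} of the wire ceiling)")
```

Even the fastest cipher stays well under half of the assumed wire ceiling, which suggests the old CPU rather than the network is the limit on that box.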
This is the setup using nc:
and this is the setup that socketpipe arranges:
There were crypto acceleration cards, but I think the market was fairly small. They made sense for sites with lots of https traffic, but nowadays general purpose cpus are blazingly fast compared to back then.
So I guess they disappeared..
The transfer rate of scp is often limited by the round-trip time, because the sender spends time waiting for confirmation of received packets. This is a serious issue for transfers from Europe to the US West Coast (around 200 ms) or to Australia (around 400 ms). Running several parallel TCP streams can solve this problem, and that approach has been in use for many years for data transfer in High Energy Physics. An example of such a solution is GridFTP: http://www.globus.org/toolkit/docs/4.0/data/gridftp/
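A back-of-the-envelope sketch of why round-trip time matters and why parallel streams help: if a stream may have at most one window of unacknowledged data in flight per round trip, its rate cannot exceed window / RTT. The 64 KiB window and the stream count of 8 are hypothetical values chosen for illustration; the two RTTs are the ones mentioned above.

```python
def window_limited_throughput(window_bytes, rtt_seconds, streams=1):
    """Upper bound in bytes/s when each stream can have at most
    window_bytes of unacknowledged data in flight per round trip."""
    return streams * window_bytes / rtt_seconds


WINDOW = 64 * 1024  # hypothetical per-stream window of 64 KiB

routes = [("Europe -> US West Coast", 0.200), ("Europe -> Australia", 0.400)]
for label, rtt in routes:
    for streams in (1, 8):
        rate = window_limited_throughput(WINDOW, rtt, streams)
        print(f"{label}, {streams} stream(s): {rate / 1e6:.2f} MB/s at most")
```

Under these assumptions the bound scales linearly with the number of streams, which is the effect tools like GridFTP exploit.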
Use NX instead of plain old remote DISPLAY or ssh's X11 forwarding or even VNC! It's silly fast! You get a perfectly usable desktop even on slow, high latency connections. The free edition is free as in GPL.
Actually, it depends upon the SSH protocol. Both Debian and Cygwin have this to say:
-c cipher_spec
Selects the cipher specification for encrypting the session.
Protocol version 1 allows specification of a single cipher. The
supported values are "3des", "blowfish", and
"des". 3des (triple-des) is an encrypt-decrypt-encrypt
triple with three different keys. It is believed to be secure.
blowfish is a fast block cipher; it appears very secure and is
much faster than 3des. des is only supported in the ssh client
for interoperability with legacy protocol 1 implementations that
do not support the 3des cipher. Its use is strongly
discouraged due to cryptographic weaknesses. The default
is "3des".
For protocol version 2, cipher_spec is a comma-separated list of
ciphers listed in order of preference. The supported ciphers are:
3des-cbc, aes128-cbc, aes192-cbc, aes256-cbc, aes128-ctr,
aes192-ctr, aes256-ctr, arcfour128, arcfour256, arcfour,
blowfish-cbc, and cast128-cbc. The default is:
aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,arcfour128,
arcfour256,arcfour,aes192-cbc,aes256-cbc,aes128-ctr,
aes192-ctr,aes256-ctr
As a note, the changes are actually to SSH itself and not just SCP. So any application that uses SSH as a transport mechanism can see a performance boost. This isn't to say *every* user will. This is mainly geared towards high bandwidth delay product networks (greater than 1MB) or GigE LANs.
BDP is the bandwidth-delay product. BDP is one of the main things these patches address. Loopback has very, very little delay. You could, I suppose, add artificial delay over loopback, but now you're diverging further from the actual deployment scenario.
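For a feel of the numbers, here is a small sketch that computes the bandwidth-delay product for a few links. All link speeds and round-trip times below are illustrative assumptions, not measurements; the only figure taken from the thread is the rough 1 MB threshold mentioned earlier.

```python
def bdp_bytes(bandwidth_bits_per_s, rtt_seconds):
    """Bandwidth-delay product: bytes that must be in flight to keep the link full."""
    return bandwidth_bits_per_s * rtt_seconds / 8


# Hypothetical links: (description, bits per second, round-trip time in seconds)
links = [
    ("loopback, 1 Gbit/s, 0.1 ms", 1e9, 0.0001),
    ("GigE LAN, 1 Gbit/s, 1 ms", 1e9, 0.001),
    ("WAN, 1 Gbit/s, 100 ms", 1e9, 0.100),
]

THRESHOLD = 1e6  # the "greater than 1MB" figure quoted in the thread

for name, bandwidth, rtt in links:
    bdp = bdp_bytes(bandwidth, rtt)
    verdict = "above" if bdp > THRESHOLD else "below"
    print(f"{name}: BDP = {bdp / 1e6:.4f} MB ({verdict} 1 MB)")
```

With delay that small, a loopback test has a tiny BDP and so never exercises the window-size problem the patches are meant to fix.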
The other thing is that when sender and receiver are the same host, you don't engage the full network stack (no ethernet queuing, for example, no dropped packets, etc. etc.), so you don't find out all the curve balls that TCP/IP will throw you.
And yet another thing is that sender and receiver will compete for the same CPUs, and so whatever upper CPU bound you have with separate sender and receiver, you'll be at roughly half that (assuming send and receive are balanced) when both are on the same machine.
SSH can, of course, be configured to compress automatically.
> if the CPU is the bottleneck, how could adding more threads possibly help?
This is actually a great question. On single-core systems it's very unlikely that the multi-threading aspect of our patch will be of much use to you. The stock version of SSH is, because of its design, unable to use more than one core regardless of how many cores you actually happen to have. That means you could have one core that's pegged by SSH while the other cores sit essentially idle (the presentation goes into that after we address the window issues). What we've done is allow SSH to offload the heavy work (the encryption) onto other cores in order to make full use of the CPU resources available.
tar cfpz - . | ssh user@host '( cd /destination ; tar xfpvz - )'
I'd use a "." instead of *: it avoids shell line-length problems and will also copy hidden files, as someone who has learned this the hard way. Also, in my experience, on anything faster than 10MB, don't bother with compression (it's really a CPU-to-network speed ratio; on transfers I did regularly that was the rule of thumb with P4 2.2 GHz Xeons). I also removed the "v" from the source tar, as otherwise every file name is printed twice and the output gets hard to read. I can't remember whether ssh or tar had better compression; I know I tested both. It really just changed the tipping point of the CPU speed. I also used to use blowfish for the cipher, as it was easier on the CPU if you were running out of CPU instead of network. On a Gigabit network, I always ran out of CPU first.
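The CPU-to-network ratio can be sketched with a very simple model: a compressed pipe delivers original data no faster than the compressor can produce it, and no faster than the link rate multiplied by the compression ratio. The function name and every number below (20 MB/s compressor, 2:1 ratio, the two link speeds) are hypothetical, picked only to show the tipping point.

```python
def effective_rate(link_mb_s, compress_mb_s=None, ratio=1.0):
    """MB/s of original data delivered end to end.

    compress_mb_s: how fast the CPU can compress (None = no compression).
    ratio: original size divided by compressed size.
    """
    if compress_mb_s is None:
        return link_mb_s
    return min(compress_mb_s, link_mb_s * ratio)


# Hypothetical machine: compresses 20 MB/s and achieves a 2:1 ratio.
for link in (1.0, 100.0):
    plain = effective_rate(link)
    packed = effective_rate(link, compress_mb_s=20.0, ratio=2.0)
    better = "compress" if packed > plain else "don't compress"
    print(f"link {link:6.1f} MB/s: plain {plain:6.1f}, compressed {packed:6.1f} -> {better}")
```

In this model compression pays off on the slow link and hurts on the fast one, which matches the rule of thumb above; where the tipping point lies depends on the actual CPU and data.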
I normally use -C instead of a subshell, but that's merely a matter of taste. I also use the technique in reverse quite often so I can untar on the destination machine as root.
A couple of notes about the multi-threading: The main goal was to allow SSH to make use of multiple processing cores. The stock OpenSSH is, by design, limited to using one core. As a result, a user can encounter situations where they have more network capacity and more compute capacity but are unable to exploit them. The goal of this patch was to allow users to make full use of the resources available to them. The upshot is that it's best suited for high-performance network and compute environments (the HPN in HPN-SSH stands for High Performance Networking). This doesn't mean it won't be useful to home users, only that they might not see the dramatic performance gains someone in a higher-capacity environment might see. It's really going to depend on the specifics of their environment.
Based on our research we decided the most effective way to do this would be to make the AES-CTR mode cipher multi-threaded. CTR mode is well suited to threading because there is no inter-block dependency and, even better, the resulting cipher stream is indistinguishable from a single-threaded CTR mode cipher stream. As a result, we retain full compatibility with other implementations of SSH: you don't need to have HPN-SSH on both sides of the connection. Of course, you won't see the same improvements unless you do.
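The lack of inter-block dependency can be illustrated with a toy counter-mode construction. This is not the HPN-SSH code and not AES: SHA-256 over key plus counter stands in for the block cipher purely so the sketch runs with the standard library, and all function names are made up for illustration. It only demonstrates that computing blocks in a thread pool yields the same byte stream as computing them one after another; it says nothing about speed.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

BLOCK = 32  # SHA-256 digest size, standing in for a cipher block size


def keystream_block(key, counter):
    # Toy stand-in for encrypting the counter with a block cipher. NOT secure.
    return hashlib.sha256(key + counter.to_bytes(16, "big")).digest()


def xor_block(key, counter, chunk):
    # Each block depends only on (key, counter, chunk), never on its neighbours.
    stream = keystream_block(key, counter)
    return bytes(a ^ b for a, b in zip(chunk, stream))


def split(data):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ctr_serial(key, data):
    return b"".join(xor_block(key, i, c) for i, c in enumerate(split(data)))


def ctr_threaded(key, data, workers=4):
    chunks = split(data)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so the output stream is unchanged.
        parts = pool.map(lambda item: xor_block(key, item[0], item[1]), enumerate(chunks))
        return b"".join(parts)


key = b"example key"
message = b"no inter-block dependency in counter mode " * 20
assert ctr_threaded(key, message) == ctr_serial(key, message)
assert ctr_serial(key, ctr_serial(key, message)) == message  # XOR twice restores the data
print("threaded and serial outputs are identical")
```

A chained mode such as CBC could not be split this way for encryption, because each block's input includes the previous ciphertext block.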
We still see this as somewhat experimental because we've not yet implemented a way to allow users to choose between single-threaded and multi-threaded AES-CTR mode. As such, users on single-core machines may see a decrease in performance if they use AES-CTR. We suggest those users just make use of the AES-CBC mode instead (which is the default anyway). Also, your system needs to support POSIX threads.
Future work will involve pipelining the MAC routine and that should provide us with another 30% or so improvement in throughput.
Also, it's important to keep in mind that these improvements are *not* just for SCP but for SSH as a whole. People using HPN-SSH as a transport mechanism for rsync, tunnels, pipes, and so forth may also see considerable performance improvements. Additionally, the windowing patches don't necessarily require HPN-SSH to be installed on both ends of the connection. As long as the patch is installed on the receiving side (the data sink) you may (assuming you were previously window limited) see a performance gain.
We welcome any comments, suggestions, ideas, or problem reports you might have regarding the HPN-SSH patch. Go to the website mentioned above and use the email address there to get in touch with us. This is a work in progress and we are doing what we can to enable line-rate, easy-to-use, fully encrypted communications. We've a lot more to do but I hope what we've done so far is of use and value to the community.
This one perhaps? : Threads.pdf
ssh user@host.com tar -C /remote/path -cpzf - remotefile1 remotefile2 | tar -C /local/path -xvzpf -