Mosh: Modernizing SSH With IP Roaming, Instant Local Echo
An anonymous reader writes "Launched in 1995, SSH quickly became the king of network login tools, supplanting the old insecure mainstays TELNET and RLOGIN. But 17 years later, a group of MIT hackers have come out with "mosh", which claims to modernize the most annoying parts of SSH. Mosh keeps its connection alive when clients roam among WiFi networks or switch to 3G, and gives instant feedback on typing (and deleting). No more annoying network lag on typing, the MIT boffins say, citing Bufferbloat, which has been increasing latencies."
The folks involved have a pre-press research paper with the gritty details (to be presented at USENIX later this year). Mosh itself is not particularly exciting; the new State Synchronization Protocol it is based upon might be: "This is accomplished using a new protocol called the State Synchronization Protocol, for which Mosh is the first application. SSP runs over UDP, synchronizing the state of any object from one host to another. Datagrams are encrypted and authenticated using AES-128 in OCB mode. While SSP takes care of the networking protocol, it is the implementation of the object being synchronized that defines the ultimate semantics of the protocol."
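As a rough illustration of the state-synchronization idea quoted above, here is a minimal sketch. It assumes the synchronized object is append-only text and that the sender always diffs from the last state the receiver acknowledged. The names `Update`, `Sender`, and `Receiver` are hypothetical and are not taken from Mosh or the paper.

```python
from dataclasses import dataclass


@dataclass
class Update:
    base_num: int  # state number the diff must be applied to
    new_num: int   # state number reached after applying it
    diff: str      # here simply the text appended since base_num


class Sender:
    def __init__(self):
        self.states = {0: ""}  # state number -> snapshot of the text
        self.latest = 0
        self.acked = 0

    def set_state(self, text: str):
        """Record a new local state (assumed to extend the previous one)."""
        self.latest += 1
        self.states[self.latest] = text

    def make_update(self) -> Update:
        """Diff from the last acknowledged state straight to the newest one."""
        base = self.states[self.acked]
        new = self.states[self.latest]
        return Update(self.acked, self.latest, new[len(base):])

    def on_ack(self, num: int):
        if num > self.acked:
            self.acked = num
            # Snapshots older than the acked one are no longer needed as bases.
            self.states = {n: s for n, s in self.states.items() if n >= num}


class Receiver:
    def __init__(self):
        self.num = 0
        self.text = ""

    def on_update(self, u: Update) -> int:
        """Apply the update if it starts from our state; return the number to ack."""
        if u.base_num == self.num and u.new_num > self.num:
            self.text += u.diff
            self.num = u.new_num
        return self.num
```

In this sketch a lost datagram needs no retransmission of its own: the next update still diffs from the acknowledged state, so the receiver jumps directly to the newest state and never replays intermediate ones.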
And 15 years later, LOCAL_ECHO is back in mosh!
and gives instant feedback on typing (and deleting).
That sounds like a step backwards to me. Any utility in that is lost when something doesn’t sync up properly. When I hit a key, I want to know it has been sent and received and see the result.. not see the result as my shell predicts it. Maybe I’m just having local echo flashbacks from past telnet experiences.
Everything else sounds really neat though. I don’t jump wifi often enough for re-connecting and re-attaching to screen to be a big deal.. but I can see the utility for those who do.
So mosh has brought back the ages-old idea of local echo on the terminal. It disappeared as soon as terminal connections became faster than the old teletype links. I have often wished for such a feature in ssh, some kind of 'cooked mode'. However I usually run a 'screen' session on the other end of ssh, with emacs inside that, and finally a shell-mode under Emacs! Mosh would need to do something quite clever to enable local editing in that.
-- Ed Avis ed@membled.com
"To bootstrap an SSP connection, the user rst logs in to the remote host using conventional means, such as SSH or Kerberos."
It is a neat idea for those who are currently in areas with spotty wireless coverage, but for most users I don't think it's that much of an issue, even at the moment.
Fast forward five years and I just don't see this software being all that useful. Sure, there's always gonna be that handful of people who will scream that this is extremely useful because they're always hopping between wifi hotspots but most users are using 3G/4G when they're on the move and coverage for those is already "good enough" in most civilized places and steadily improving. I've taken 5+ hour train trips several times and only had ssh connections drop once or twice on those trips (due to spotty coverage in what would qualify as the middle of nowhere in northern Sweden).
This is like "solving" the IPv4 address exhaustion problem with NAT, it's a neat workaround but doesn't actually solve the problem.
Greylisting is to SMTP as NAT is to IPv4
and receive?
No thanks, I stopped reading when i saw that udp is used instead of tcp.
Let's see...
I get reconnectability (which I already have, either by using a VPN or by using screen on the server), but now it's built-in.
I get local echo so I have no clue whether my connection has been dropped -- but OTOH, this is great if you have the brain of a goldfish and so can't remember what you just typed for a couple seconds till it gets echoed back. I presume this is optional, so non-goldfish-brains can tell it to 'degrade' to be as useful as ssh.
I get better unicode support -- well, that one's cool, anyway.
And it needs ssh for login, but also needs a mosh server -- so I can ssh into every server, but only mosh into a few.
Am I missing some really great thing about it? It seems like a major hassle for a minor improvement.
Is there a mosh iOS app?
If they implement their own TCP-like layer over UDP, there's no reason it can't be just as reliable.
It's kind of hard to do things like roaming using TCP because endpoint IPs can change.
We tried to put OCB mode in 802.11i. So IBM sent a guy to explain the 'licensing terms' for their patents on OCB mode. The next vote in 802.11i after that presentation was to replace OCB mode with CCM.
Until the patents expire or are freely licensed, OCB mode should be considered off limits for free and open projects.
I should use this sig to advertise my book ISBN-13 : 978-1501515132.
IP roaming looks nice & ought to be secure with the right steps (no reply from old IP:port, correct crypto negotiation with new IP:port).
But LOCAL ECHO is a big problem -- applications have to be aware of it. On CLI, many keystrokes are commands, not text to be entered. On vi in command-mode, G goes to the last line.
Personally, a bigger thing is traffic reduction, particularly keystroke combining. Nagle's algorithm is a start, but I've modded ssh to delay and buffer likely-text keystrokes for a short time (400ms) while letting likely commands through immediately to retain responsiveness. The delays aren't irksome, and I reduce outbound traffic by ~80%.
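The buffering scheme described in that comment could look roughly like the sketch below. This is not the commenter's actual ssh patch: the `KeystrokeCombiner` class is hypothetical, and the rule that printable characters count as "likely text" while everything else counts as a "likely command" is an assumption made for illustration. Timestamps are passed in explicitly so the logic can be followed without real timers.

```python
FLUSH_DELAY = 0.4  # seconds; the 400 ms figure mentioned in the comment


def is_likely_text(key: str) -> bool:
    """Assumed heuristic: a single printable character is text, anything else a command."""
    return len(key) == 1 and key.isprintable()


class KeystrokeCombiner:
    def __init__(self, delay: float = FLUSH_DELAY):
        self.delay = delay
        self.buffer = []               # keystrokes waiting to be sent
        self.first_buffered_at = None  # time the oldest buffered key arrived
        self.sent = []                 # packets that would go out on the wire

    def _flush(self):
        if self.buffer:
            self.sent.append("".join(self.buffer))
            self.buffer = []
            self.first_buffered_at = None

    def tick(self, now: float):
        """Flush buffered text once the oldest buffered key has waited long enough."""
        if self.first_buffered_at is not None and now - self.first_buffered_at >= self.delay:
            self._flush()

    def key(self, key: str, now: float):
        self.tick(now)
        if is_likely_text(key):
            if self.first_buffered_at is None:
                self.first_buffered_at = now
            self.buffer.append(key)
        else:
            # Likely a command: send it at once, together with any pending text.
            self.buffer.append(key)
            self._flush()
```

Typing `echo hi` followed by Enter within 400 ms goes out as one packet here instead of eight. The heuristic is deliberately naive: in vi command mode, for instance, a printable `G` is a command, so a real classifier would need more context than this.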
mosh is not made to replace ssh but to work with its aid. from page 3 i'm not exactly sure how a user could log in remotely using Kerberos (in contrast to ssh), but it essentially is a user process started over ssh, wrapping a shell on the server, much like screen.
only problem i can see is the firewall/portforwarding bypassing; if all UDP packets are encrypted and from/to random ports there's no way your iptables is going to pick that one up.
i can't see any idle traffic or re-keying being specified, though i guess it's easy to add later on. i'll definitely try this out, once i can get it through NATs.
I see no rationale for not helping to improve SSH. This shit shouldn't be encouraged.
Then they discover there was usually a good reason for something being done the way it was in the past. Eg local echo was very useful for line buffered programs such as MUDs and chat servers or even talking to SENDMAIL or an FTP server directly. It was easier to write the server to cope with just line by line rather than character by character and it used up less network resources in the process.
You open an SSH connection (client->server:22). This port is allowed on the firewall, it lets you through. But then the server decides to listen on UDP:(random port) and tells the client, back through the (encrypted) initial connection, which UDP port to contact. So you initiate an SSP UDP session on that port. How does the firewall know it should let you through? Since the port number is communicated on an encrypted session, it doesn't have access to that information. So how does this work in a secure environment? The paper doesn't mention any means for the server to communicate with the network which port it's listening on.
As with TLS, I'd like to see any future revisions of these secure protocols trim more fat. Arcane ciphers, modes, etc. Crypto software is very difficult to secure - there are a lot of subtleties, and there's been a distinct lack of basic software engineering discipline. Lack of regard for domains, layers, interfaces, etc.
> It's kind of hard to do things like roaming using TCP because endpoint IPs can change.
Bullshit. With UDP you have to abstract the connection so that the source IP can change. With TCP you can do the exact same fucking thing. Close the old socket when you get a connection attempt from a new client with the right handshake.
Oh God! The flashbacks are killing me! Back in the mid-70's I worked for Tymshare (sister company/parent/?? of Tymnet) doing load testing on a project called OnTyme (commercial email). I was hip-deep in the Tymnet protocol trying to record and then re-create realistic pseudo-user-loads from different points in the country. Massive PITA.
I see the need for this all the time. It's a commonplace in large enterprises like hospitals, factories, and financial services corporations.
Example: I'm working on my hospital laptop. I get called urgently to do something elsewhere in the hospital so somebody won't die right now. I grab the lappie and run, then when I get to the theatre I plug into the malfing imager and fix it. Meanwhile all my SSH connections died because I crossed three wireless boundaries at high speed.
Example 2: I'm on the line debugging the tension loader robot while a human continuously manually corrects the tension downthread. I find the upstream data to the robot is bad and I have to backtrack all the way across the building to find the malfunctioning sensor, then come back and double-check the robot again. All my SSH connections into the DCS keep failing because the factory floor's high RFI means we have to have lots of small loud wireless zones, and I have to keep moving among them.
Example 3: I'm in a conference room lecturing junior banksters on how to fleece grandmothers and the CEO throws us out so his pet congressman can use the room to tongue-polish his shoes for him. The next conference room is two wireless zones away, so my secure SSH tunnel into Dr. Evil's antarctic lair fails and I have to sacrifice another day trader to get the blood I need for our in-house key transfer protocol.
OK, that last example was a bit contrived but I was starting to get bored.
No more annoying network lag on typing, the MIT boffins say
Many boffins died to bring us this information
Quoting from the technical info; "Every time the server receives an authentic packet from the client with a sequence number higher than any it has previously received, the IP source address of that packet becomes the server's new target for its outgoing packets"
What's to stop one from replaying a captured packet with a higher sequence number and a spoofed source address to redirect the connection?
Seeing as this is designed for hopping networks, this could easily happen on a public network. MITM, anyone?
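To make the quoted rule and the replay question concrete, here is a minimal sketch of a server that only moves its reply target on an authentic packet with a strictly higher sequence number. It uses HMAC-SHA256 as a stand-in for authentication (Mosh itself uses AES-128 in OCB mode, per the summary), and the key, packet layout, and `RoamingServer` class are all hypothetical.

```python
import hashlib
import hmac
import struct

KEY = b"shared-session-key"  # hypothetical key agreed during the initial login
TAG_LEN = 32                 # length of an HMAC-SHA256 tag


def seal(seq: int, payload: bytes, key: bytes = KEY) -> bytes:
    """Build a datagram: 8-byte sequence number, payload, then the HMAC tag."""
    body = struct.pack("!Q", seq) + payload
    return body + hmac.new(key, body, hashlib.sha256).digest()


class RoamingServer:
    def __init__(self, key: bytes = KEY):
        self.key = key
        self.highest_seq = -1
        self.target = None  # (ip, port) the server currently replies to

    def receive(self, datagram: bytes, source) -> bool:
        """Return True if the datagram is authentic and newer than anything seen."""
        if len(datagram) < 8 + TAG_LEN:
            return False
        body, tag = datagram[:-TAG_LEN], datagram[-TAG_LEN:]
        expected = hmac.new(self.key, body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return False  # forged or altered, e.g. a bumped sequence number
        (seq,) = struct.unpack("!Q", body[:8])
        if seq <= self.highest_seq:
            return False  # replay of an already-seen packet never moves the target
        self.highest_seq = seq
        self.target = source  # the client has roamed (or this is its first packet)
        return True
```

Under these assumptions, a captured packet replayed from a spoofed address is rejected because its sequence number is no longer the highest, and rewriting the number breaks the tag. What the sketch does not cover is an on-path attacker who holds back a fresh packet and delivers it first from another address; that could redirect the server's (still encrypted) replies until the client's next packet arrives.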
Does this do things like Port Forwarding? or is this not a replacement for SSH, but almost an extension of it?
I use SSH port forwarding (both directions) compression, and stuff all the time.
What are we going to do tonight Brain?
Heck, people using Objective-C remote proxies have basically been doing that as a form of RPC since the NeXT days. Not to mention any other usages of it.
No shoes, no shirt, no Windows client, no service.
> Bullshit. With UDP you have to abstract the connection so that the source IP can change. With TCP you can do the exact same fucking thing. Close the old socket when you get a connection attempt from a new client with the right handshake.
I'm a little out of my depth here, but I'd imagine it'd be much easier with UDP because UDP is connectionless. With this sort of roaming, the server isn't expected to change addresses, but the client is. So, have the client sign everything with a negotiated public key, and the server doesn't even have to care where each packet is coming from, or even open any new connections when the client moves across IPs.
Since this is an SSH replacement, I'd expect the key signing to be done already, so once you build an ordering and reliability protocol on top of UDP you essentially get the roaming for free.
As we're talking about things related to terminals, the one thing I'm still anxiously missing is a terminal emulator which implements smooth scrolling of new text, a feature that was also present in some hardware terminals a million years ago. I guess some smart guy could modify an existing terminal to support this. Heck, if I had a bit more skills, I'd roll up my sleeves and do it myself. It would be sweet.
Modern shells have completion, and mosh is not going to predict that.
It seems to me that for my typical usage it is going to have limited utility - I'm either in a shell where I'm leaning heavily on the tab for completion, or in vi where it would need to secondguess what vi is going to display.
I guess you never worked at Texas Instruments and used their bastardized TN3270 over XNS or UDP 211 to connect to their mainframe (System 390?). For DOS and WfW, the emulator was 914C/G using primarily XNS and later UDP, while in 95 and NT the emulator was TI-COMM and only used UDP. There were these systems (Compaq workstations) called TICCs, TI-COMM Controllers, running SCO Unix that were designed to translate TokenRing SNA to XNS and IP. Considering the state of the network, the emulators worked well. Until the switch to using IP and a network restructuring by AT&T, each TI campus was a massive bridged network running IP, XNS, IPX, NetBEUI, and AppleTalk. There was a reason why the network and servers tended to crash each day towards the end of each shift.
Be grateful you didn't have to use their email system called MSG, which initially could only send to other users on the mainframe and everyone was restricted to a 4 character alphanumeric email designation. It took a while before they released an SMTP gateway, MIMI, running on a SPARC 5 or 10 to interface between MSG and the Internet.
A0820707
Why can't they add local echo, predictive typing, and resumable sessions to ssh or another TCP-based protocol? Yes, TCP can *possibly* take longer to recover from network errors, but this isn't something where you can just drop some missing packets (like some audio streaming things) to keep things flowing.
It's implemented over UDP, which means you *still* have to basically do all the functions of TCP, but now you get to do them with code that hasn't been tried and tested over the past several decades. Plus a new crypto implementation on top of that. And for what? Slightly better response time during network loss, which you shouldn't notice anyhow because of the predictive typing?
UDP just seems to solve no real problems, yet *adds* a lot of problems -- the firewall problem, for one. Fresh new daemon with unknown security issues.
I see no reason why you couldn't just tunnel this over ssh and get the vast majority of the benefits -- or better yet, patch the predictive typing/session resume/etc into ssh directly. Then you get to take advantage of the decades of work and bugfixing that's already been done for the majority of your protocol stack.
(And I don't buy for a minute that it's significantly more difficult to handle session resume when it's a TCP connection...)
In the context of persistent logical connections, you have to consider that the TCP connection will get severed and must be re-established. It's not enough to have an open connection sitting around; if bits aren't coming from the other side, the open connection may have been physically severed by some dead networking gear or a backhoe. So you end up coding some application-level heartbeat-and-reconnect logic with logic to securely resume a session (forgive the hand-waving here).
At this point, if the "securely resume a session" bit is sufficiently compact, you are free to do the exact same logic per-packet, and you don't need to maintain a TCP connection. You lose TCP's reliable-ordering support, but the automatic-reconnect logic is typically sufficient to do without reliable-ordering. You also lose TCP's congestion control, so you'll probably need to add some application-level throttling mechanism to avoid DoSing yourself.
This is why TCP vs. UDP is irrelevant in the topic at hand. Neither is sufficient to maintain a persistent connection, and the extra logic required on top of one can typically be ported to the other.
..would you want local echo? The purpose of echo is so you know your character was received by the remote device!!!!!!
If your echo is taking too long the problem is not the echo it's the lag. Don't you WANT to know if there's lag???
Leave the roaming bit to the layer below it (TCP), or even better - below that (IP) ? It seems more appropriate to not just be able to roam ssh. Not everything is http, you know.
Religion is what happens when nature strikes and groupthink goes wrong.
Mosh looked like an interesting solution to my own roaming issues - and most of my "roaming" is in-town between home and work and occasionally a public library or cafe. That, and when AT&T DSL drops my PPPoE connection and renegotiates my home IP address at random times, sometimes several times an hour. (I live in the US - a kind of banana republic of residential networking.)
I've already got a reverse-roaming solution involving using OpenVPN to connect my laptop back to my home fileserver & printer and private IMAP server - but that's for my own convenience and my co-workers are amused.
I'm partially responsible for my own work environment - and it isn't a conventional Linux distribution (it's based on an early version of LFS). Which means that in some ways it's more secure because it doesn't include features we don't want or need (i.e.: reduced exposure surface.) But is a real pain to maintain. I've picked up the reins of other talented people - but the environment is old.
So I built Mosh. After upgrading the compiler. Patching /usr/include/endian.h. Upgrading Boost. Installing the Google Protocol thing. Installing libutempter. It took all afternoon! (I wanted to be able to share my work with other groups in our organization.)
And then I discovered Mosh wants a UTF-8 environment. Getting around that took the rest of the evening, and it depends on your shell. You can do it with "csh" - but "bash" is tricky. Then I had to patch libutempter. And patch libutempter. And patch libutempter. And patch libutempter. (My "ptsname()" call doesn't work - but neither does FreeBSD or MacOSX's.)
Anyway - mosh is SLOW. I can feel the buffer bloat it introduces.
It's interesting still. But I think a combination of OpenVPN, private networks, ssh, and even screen would work as well or even better.