Guaranteed Transmission Protocols For Windows?
Michael writes "Part of our business at my work involves transferring mission critical files across a 2 mbit microwave connection, into a government-run telecommunications center with a very dodgy internal network and then finally to our own server inside the center. The computers at both ends run Windows. What sort of protocols or tools are available to me that will guarantee to get the data transferred across better than a straight Windows file system copy? Since before I started working here, they've been using FTP to upload the files, but many times the copied files are a few kilobytes smaller than the originals."
The summary states that with FTP, the downloaded files were of the wrong size. Can anyone explain why TCP's efforts to deal with unreliable networks, such as the retransmission of unacknowledged packets and their reassembly in proper order, would not already deal with this? I am familiar with the concepts involved but I think I lack the low-level understanding of how you would get the kind of results the story is reporting.
It is a miracle that curiosity survives formal education. - Einstein
Wasn't TCP designed for just this? Guaranteed transmission?
Similes are like metaphors
You should look at the EDIINT AS2 protocol, AKA RFC 4130. This is a widely-used e-commerce protocol built over HTTP/S.
AS2 provides cryptographic signatures for authentication of the file at reception, non-repudiation and message delivery confirmation (if no confirmation is returned, the transfer is considered a failure), and is geared towards files. There is even an open-source implementation available.
More complex than FTP/SFTP but entirely worth it if your data is mission-critical and/or confidential. Plus, it passes through most networks because it is based on HTTP.
You're not old until regret takes the place of your dreams.
Even on reliable connections, using .complete files is a great idea.
It works this way: if you're pushing, upload the file over FTP. After the transfer completes, check the remote file size. If it matches the local file size, also upload a zero-size .complete file (or a $filename.complete file containing an md5 checksum, if you want to be extra paranoid).
Any app that reads that file will first check if .complete file is there.
If the remote file size is less, you resume the upload. If the remote file size is more than the local one, you wipe out the remote file and restart.
Same idea for the reverse side (if you're pulling the file, instead of pushing).
You can also set up scripts to run every 5 minutes, and only stop retrying once the .complete file is written (or read).
Note that the above would work even if the connection was interrupted and restarted a dozen times during the transmission. [we use this in $bigcorp to transfer hundreds of gigs of financial data per day... seems to work great; never had to care for maintenance windows, 'cause in the end, the file will get there anyway (scripts won't stop trying until data is there)].
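The push-side decision described above can be sketched roughly like this. It is only an illustration: the names next_action and marker_text are made up for this example, the marker layout (just the md5 in hex) is an assumption, and the actual FTP calls (size query, resume, delete, upload) are left out.

```python
import hashlib

def next_action(local_size, remote_size):
    """Decide what the pushing side should do next.

    remote_size is None when the remote file does not exist yet.
    """
    if remote_size is None:
        return "upload"          # nothing there yet: start from byte 0
    if remote_size == local_size:
        return "write_marker"    # sizes match: send the .complete file
    if remote_size < local_size:
        return "resume"          # partial upload: continue where it stopped
    return "restart"             # remote is larger: delete it and start over

def marker_text(path, chunk_size=1 << 20):
    """Contents for a hypothetical $filename.complete marker: the md5 in hex."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A script run every few minutes would call next_action with the two sizes, perform the matching FTP step, and exit for good only once the marker has been written.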
"If anything can go wrong, it will." - Murphy
Starting a comment off by explaining that you're not familiar enough with the subject matter to intelligently comment is a very handy flag, and I appreciate your warning the rest of us that what you were saying was going to be wrong ;)
BTW, checksum hasn't been considered a trustworthy means of ensuring data integrity for more than a decade. I invite you to have a discussion with Google regarding checksum collisions.
That depends... http://www.defensetech.org/archives/000085.html
Using FTP ASCII mode for binary files would be incredibly stupid, but yeah, it sounds like that could be it.
Calling ftp from a .BAT script or whatever it's called in DOS and *not* checking its exit code is another likely candidate.
Otherwise, I don't believe FTP has any checksums, so I'd expect bit errors here and there -- things the TCP and link layer checksums did not catch in 1/65536 of the cases.
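To see why a 16-bit checksum can miss real corruption, here is a small sketch of a ones' complement checksum in the style of RFC 1071. It is a simplification: the real TCP checksum also covers the header and a pseudo-header, and the function name internet_checksum is just a label chosen for this example. Because the sum does not depend on word order, two different payloads made of the same 16-bit words produce the same value.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                       # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:                         # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

original = b"\x12\x34\xab\xcd"
swapped = b"\xab\xcd\x12\x34"   # same 16-bit words, different order

print(original != swapped)                                        # True
print(internet_checksum(original) == internet_checksum(swapped))  # True
```

This is the kind of damage that slips past TCP, which is why comparing a stronger hash of the whole file at both ends is worth the extra step.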
Depends entirely on the CPU speed of the endpoints relative to the link speed. If you enable compression and the files aren't already compressed, it can be a lot faster.
Actually encryption doesn't guarantee *things add up* after transfer. And ssh does not guarantee things add up any more than tcp does. It does have other advantages, like compression.
And TCP is just not a good file transfer protocol over microwave links. Sure, you can fix the glaring issues by using huge windows, and you can even change registry settings to improve the situation: http://support.microsoft.com/kb/224829.
Making it work really well, though, you'll need
If you're worried about correctness of transfer you might want to use rsync for windows, which *does* check correctness. You might want to use an interface like http://www.aboutmyip.com/AboutMyXApp/DeltaCopy.jsp.
Now rsync is no wonder. It is not something that is constantly trying to reconnect. You start it once ... it tries once. If you want an opportunistic reliable file transfer utility ... you might want to try bittorrent, it's quite good at that.
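Since a tool that tries only once needs something around it to keep trying, a wrapper along these lines is one option. This is a minimal sketch, not a finished tool: retry_until_ok is a made-up name, and attempt stands for whatever hypothetical function runs one transfer (for example, launching rsync and checking its exit status) and returns True on success.

```python
import time

def retry_until_ok(attempt, max_attempts=10, delay_seconds=0.0):
    """Call attempt() until it returns True; return the number of tries used.

    Raises RuntimeError if every try fails.
    """
    for tries in range(1, max_attempts + 1):
        if attempt():
            return tries
        if tries < max_attempts:
            time.sleep(delay_seconds)   # wait before the next try
    raise RuntimeError(f"gave up after {max_attempts} attempts")

# Stand-in for a flaky link: fails twice, then succeeds.
outcomes = iter([False, False, True])
print(retry_until_ok(lambda: next(outcomes)))   # 3
```

In practice delay_seconds would be minutes rather than zero, and attempt should count as successful only after the copy has been verified at the far end.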