Slashdot Mirror


Guaranteed Transmission Protocols For Windows?

Michael writes "Part of our business at my work involves transferring mission critical files across a 2 mbit microwave connection, into a government-run telecommunications center with a very dodgy internal network and then finally to our own server inside the center. The computers at both ends run Windows. What sort of protocols or tools are available to me that will guarantee to get the data transferred across better than a straight Windows file system copy? Since before I started working here, they've been using FTP to upload the files, but many times the copied files are a few kilobytes smaller than the originals."

  1. UDP. by langelgjm · · Score: 5, Funny

    Clearly you're looking for UDP. Next question.

    --
    "Anyone who [rips a CD] is probably engaging in copyright infringement." - David O. Carson
    1. Re:UDP. by Underfoot · · Score: 3, Informative

      UDP is actually a great basis for accelerated file transfer. Several file transfer utilities / protocols have been built around it. I deal with really large files, but I have been using Aspera on several projects with great success. Worth a look.

      http://www.asperasoft.com/

      --
      I mentioned tinker-toys once in a post - now I'm modded down for life.
    2. Re:UDP. by El+Torico · · Score: 4, Funny

      Now I know the sound of packets being dropped. Thanks.

      --
      In the land of the blind, the one-eyed man is usually crucified.
    3. Re:UDP. by sofar · · Score: 5, Funny

      TCP is so horrible. I wish HTTP used UDP by default so I wouldn't have the pro

    4. Re:UDP. by Anonymous Coward · · Score: 2, Informative

      TCP is so horrible. I wish HTTP used UDP by default so I wouldn't have the pro

      Aspera is little better than Tsunami.

      As an exercise for the reader, guess which one is cheaper.

    5. Re:UDP. by Chees0rz · · Score: 2, Funny

      OYu need tdownoadl hte fxofire lugin

    6. Re:UDP. by TheTurtlesMoves · · Score: 2, Insightful

      Well I find it nice to have a tar.bz2 file in the same order, and all of it. So you need to add "sequence" numbers and some form of ACKs in there. All you are doing is moving this function to the application rather than leaving in the stack.

      TCP does this pretty well on 99% of the internet *and* the internet is aware of TCP. It's only on very "different" connections, such as the microwave link, that these things can make a real difference. Though we have wrapped the link with a specific hardware/software layer that lets IP work well over it in our cases.

      Also when people "roll" their own "superior" UDP transfer protocol, many don't bother to check why TCP does what it does. Flow control is *needed* along with ACKs and resends on pretty much any connection, because buffers are not infinitely big. There is the two generals problem, etc. It's not as straightforward as many think.

      --
      The Grey Goo disaster happened 3 billion years ago. This rock is covered in self replicating machines!
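
To make the point about sequence numbers concrete, here is a minimal sketch of the bookkeeping an application has to take on once it leaves TCP behind. It is not any particular product's protocol; the function names and the 100-byte datagram size are made up for illustration, and the "network" is just a shuffled list.

```python
import random


def packetize(data: bytes, size: int) -> list:
    """Split data into (sequence_number, payload) datagrams."""
    return [(seq, data[off:off + size])
            for seq, off in enumerate(range(0, len(data), size))]


def reassemble(received, total):
    """Return (data, missing): payloads joined in sequence order, plus the
    sequence numbers that never arrived and would have to be resent."""
    by_seq = dict(received)          # duplicate datagrams collapse to one copy
    missing = [seq for seq in range(total) if seq not in by_seq]
    if missing:
        return None, missing
    return b"".join(by_seq[seq] for seq in range(total)), []


payload = bytes(range(256)) * 4              # 1024 bytes, 11 datagrams
packets = packetize(payload, 100)
random.Random(0).shuffle(packets)            # simulate reordering
lossy = [p for p in packets if p[0] != 3]    # simulate one dropped datagram
```

The list of missing sequence numbers is what a receiver would send back as a negative acknowledgement. Flow control and congestion handling are deliberately left out, which is exactly the part the comment above warns people tend to forget.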
  2. Any encrypted transmission protocol actually by guruevi · · Score: 4, Informative

    SFTP should do since the communications are encrypted, if something changes along the way it should be rejected by the other end. HTTPS and any other protocol-over-SSL should do.

    FTP is a plain-text protocol, so if something changes along the way it won't raise any error.

    --
    Custom electronics and digital signage for your business: www.evcircuits.com
    1. Re:Any encrypted transmission protocol actually by jeffmeden · · Score: 4, Informative

      Using modern encryption like SSH does guarantee that things *have to add up* since keeping what you start with a secret is just as important (sometimes more so) as making sure you finish with exactly what you start with (meaning no one in the middle meddled with your data).

      So, in short, something like SSH or any other properly encrypted communication mechanism is a great way to both secure the data from snooping (in the case of a microwave link, a VERY real problem) as well as to safeguard the data from corruption (intentional or unintentional). I sincerely hope, for the asker's sake and possibly for the country's sake, that these files he works with are trivial.

    2. Re:Any encrypted transmission protocol actually by Anonymous Coward · · Score: 4, Funny

      I sincerely hope, for the asker's sake and possibly for the country's sake, that these files he works with are trivial.

      Well, let's see.

      transferring mission critical files across a 2 mbit microwave connection, into a government-run telecommunications center

      Pretty sure encryption isn't necessary.

    3. Re:Any encrypted transmission protocol actually by Kineticabstract · · Score: 2, Interesting
      Poster isn't concerned about whether the data has errors. That's a problem for the data creators. He's worried about it getting screwed up in transmission, either accidentally or maliciously, and encryption absolutely solves that issue. Yes, garbage in returns garbage out, but you're guaranteed (within collision space boundaries) that the garbage that comes out is exactly the same as the garbage that went in. And that's the point here.

      Starting a comment off by explaining that you're not familiar enough with the subject matter to intelligently comment is a very handy flag, and I appreciate your warning the rest of us that what you were saying was going to be wrong ;)

      BTW, checksum hasn't been considered a trustworthy means of ensuring data integrity for more than a decade. I invite you to have a discussion with Google regarding checksum collisions.

    4. Re:Any encrypted transmission protocol actually by fm6 · · Score: 2, Informative

      Poster isn't concerned about whether the data has errors. That's a problem for the data creators. He's worried about it getting screwed up in transmission, either accidentally or maliciously

      Sigh. You're welcome to nitpick my prose, but would you mind doing so in a way that makes sense. Data that got screwed up in transmission can be said to have errors. And that's what I meant.

      and encryption absolutely solves that issue.

      How? Not all encryption algorithms break if you mung the data after it was encrypted. Do all the algorithms break if this happens? Show me where it says this, and I'll admit that encryption is sufficient.

      BTW, checksum hasn't been considered a trustworthy means of ensuring data integrity for more than a decade.

      Dude, you really need to start listening to how people actually talk. For more than a decade, the word "checksum" has been used to apply to algorithms that don't simply add up bits, such as MD5. Not strictly logical, but language rarely is.
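
The question of whether encryption alone catches modified data can be illustrated with a toy example. The sketch below uses a deliberately simple XOR stream cipher (an illustration only, not something to deploy) to show that a flipped ciphertext bit decrypts without complaint, and that detection comes from a separate keyed hash (HMAC). SSH pairs its ciphers with a MAC for this reason. The key values and function names here are invented.

```python
import hashlib
import hmac


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with a SHA-256-derived keystream.
    For illustration only, not a cipher to use in practice."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def seal(enc_key, mac_key, plaintext):
    """Encrypt, then compute an HMAC tag over the ciphertext."""
    ciphertext = keystream_xor(enc_key, plaintext)
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return ciphertext, tag


def open_sealed(enc_key, mac_key, ciphertext, tag):
    """Verify the tag first; only decrypt data that passed the check."""
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("ciphertext was modified in transit")
    return keystream_xor(enc_key, ciphertext)


ct, tag = seal(b"enc-key", b"mac-key", b"PAY 100 TO BOB")
damaged = bytes([ct[0] ^ 0x01]) + ct[1:]             # one bit flipped on the wire
silently_wrong = keystream_xor(b"enc-key", damaged)  # decrypts "fine", but wrong
```

Decrypting `damaged` without the tag check yields plausible-looking but altered plaintext, while `open_sealed` refuses it.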

    5. Re:Any encrypted transmission protocol actually by link-error · · Score: 5, Insightful

      Wrong. FTP has a binary mode. This is probably the reason his files are missing several k at the destination. Sending a binary file in ascii mode is the ONLY TIME I've ever had a file not transfer entirely/correctly using FTP. Unless of course there is a network error/timeout, etc, but the FTP client always errored out in those cases. Using SFTP over an already secure network will only slow things down greatly.

      --
      -Unresolved symbol? Byte me!
    6. Re:Any encrypted transmission protocol actually by jgrahn · · Score: 2, Interesting

      Wrong. FTP has a binary mode. This is probably the reason his files are missing several k at the destination.

      Using FTP ASCII mode for binary files would be incredibly stupid, but yeah, it sounds like that could be it.

      Sending a binary file in ascii mode is the ONLY TIME I've ever had a file not transfer entirely/correctly using FTP. Unless of course there is a network error/timeout, etc, but the FTP client always errored out in those cases.

      Calling ftp from a .BAT script or whatever it's called in DOS and *not* checking its exit code is another likely candidate. Otherwise, I don't believe FTP has any checksums, so I'd expect bit errors here and there -- things the TCP and link layer checksums did not catch in 1/65536 of the cases.

      Using SFTP over an already secure network will only slow things down greatly.

      Depends entirely on the CPU speed of the endpoints relative to the link speed. If you enable compression and the files aren't already compressed, it can be a lot faster.

    7. Re:Any encrypted transmission protocol actually by OeLeWaPpErKe · · Score: 2, Interesting

      Actually encryption doesn't guarantee *things add up* after transfer. And ssh does not guarantee things add up any more than tcp does. It does have other advantages, like compression.

      And tcp is just not a good file transfer protocol over microwave links. Sure you can fix the glaring issues, using huge windows, you can even change registry settings to improve the situation : http://support.microsoft.com/kb/224829.

      Making it work really well, though, you'll need

      If you're worried about correctness of transfer you might want to use rsync for windows, which *does* check correctness. You might want to use an interface like http://www.aboutmyip.com/AboutMyXApp/DeltaCopy.jsp.

      Now rsync is no wonder. It is not something that is constantly trying to reconnect. You start it once ... it tries once. If you want an opportunistic reliable file transfer utility ... you might want to try bittorrent, it's quite good at that.

    8. Re:Any encrypted transmission protocol actually by Kazoo+the+Clown · · Score: 2, Informative

      FTP is however, more than an order of magnitude faster than SFTP or SCP. If the files are relatively small, SFTP is certainly the more secure solution, but if the files are huge and time is an issue, FTP has the clear performance advantage.

    9. Re:Any encrypted transmission protocol actually by Kazoo+the+Clown · · Score: 2, Informative

      We've done a lot of testing for our data warehouse products over a gigabit link between two quadcore server PCs comparing the transfer of several gigabytes worth of data, between ftp and sftp, and typical times for sftp have been taking about 3 and a half hours, when the same transfer via ftp is taking about 20 minutes. For the clients, we've been using psftp and windows command-line ftp, and for the servers, War-FTP and copssh. HP has a performance patch for OpenSSH (see here), but we have been unable to locate or develop a build for Windows that has this patch. While there may be better tuned SFTP software out there, the readily available open source tools do not compare well with FTP.

  3. TCP? by causality · · Score: 4, Interesting

    The summary states that with FTP, the downloaded files were of the wrong size. Can anyone explain why TCP's efforts to deal with unreliable networks, such as the retransmission of unacknowledged packets and their reassembly in proper order, would not already deal with this? I am familiar with the concepts involved but I think I lack the low-level understanding of how you would get the kind of results the story is reporting.

    --
    It is a miracle that curiosity survives formal education. - Einstein
    1. Re:TCP? by Anonymous Coward · · Score: 5, Insightful

      TCP has timeouts. The FTP client and server probably have timeouts. Eventually, some bit of the system will decide the operation is taking too long and give up. The FTP client is probably reporting an error, but if it's driven by a poor script no-one will know.

    2. Re:TCP? by Zocalo · · Score: 5, Informative

      The only times I've seen FTP report a successful file transfer and have a file discrepancy is when a binary file has been transferred in ASCII mode and the CR/LF sequences are being swapped for just CRs, or vice versa. Nothing wrong with the protocol, PEBKAC...

      --
      UNIX? They're not even circumcised! Savages!
    3. Re:TCP? by AvitarX · · Score: 4, Insightful

      I bet it is file systems with different block sizes rounding slightly differently, and an OP that does not understand.

      --
      Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
    4. Re:TCP? by mini+me · · Score: 2, Insightful

      FTP, while in ASCII mode, can try to translate line endings. If the carriage returns were removed, in order to be UNIX compatible, the file size would have been reduced.

      Most FTP clients allow the enabling of a binary mode which prevents the conversion from happening.
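
The size loss from line-ending translation is easy to reproduce offline. This is a rough simulation, assuming the receiving side rewrites every CR LF pair as a bare LF. The sample bytes are invented, and a real server's behaviour may differ in detail.

```python
def ascii_mode_to_unix(data: bytes) -> bytes:
    """Mimic an ASCII-mode transfer to a Unix-style server:
    every CR LF pair becomes a bare LF."""
    return data.replace(b"\r\n", b"\n")


# A fake "binary" file: arbitrary bytes that happen to contain CR LF pairs.
original = b"\x89PNG\r\n\x1a\n" + b"\x00\x01\r\n\xff" * 1000
received = ascii_mode_to_unix(original)
shrinkage = len(original) - len(received)   # one byte lost per CR LF pair
```

A binary file of a few megabytes will contain CR LF pairs purely by chance, which fits the "few kilobytes smaller" symptom in the question.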

    5. Re:TCP? by rhsanborn · · Score: 2, Informative

      And those people are wrong.

    6. Re:TCP? by amoeba1911 · · Score: 5, Informative

      I'm gonna learn you some English: First, I will download my photos to my Facebook page. Then I will borrow you my car but in collateral I demand you borrow me you're lawnmower for a week so I can mow my lawn. Your smart, so you will do good on your next test.

    7. Re:TCP? by bwcbwc · · Score: 5, Informative

      I used to get dropped characters and groups of characters in text files using FTP back in the 1990s and early 21st century. It seemed to be a bug in the FTP client, because it only happened when we used the Windows Explorer interface for the product. When we did command line or used the native GUI there was no problem. If you're seeing this type of a pattern where you can see that characters are missing, switch to a different FTP client or try the Windows command line FTP.

      Another possibility is that the target Windows system is mimicking a Unix system, so that an ASCII transfer is stripping the CR characters from CR/LF sequences.

      On the other hand, if you really want a "guaranteed delivery" with formal acknowledgment and validation, try using a secured protocol like SSH or SFTP or a messaging system like JMS with a handshaking architecture around it. There are plenty of Open Source architectures you can build around (xBus for example), but I don't know of any ready-built executables. Commercially, vendors like IBM (MQ) and Tibco have products that deal with the messaging at a similar level.

      --
      We are the 198 proof..
    8. Re:TCP? by HeronBlademaster · · Score: 2, Interesting

      Anyone who uses it that way is wrong. (For the lazy, every single definition of the verb "to borrow" involves receiving, not giving.)

      So your parent post should have said "Borrow should only be used to refer to the act of receiving something", but his (her?) statement is still essentially correct, if you go by the actual definition of the word rather than colloquial usage from one particular area.

      If I start using a word's opposite as if it were the word, and six hundred other people near me start doing it too, that makes it colloquial (in our area), but that doesn't make it right.

    9. Re:TCP? by ShieldW0lf · · Score: 3, Informative

      You could deal with a situation like this by zipping or rarring it into multiple small files and including parity files.

      http://en.wikipedia.org/wiki/Parchive

      --
      -1 Uncomfortable Truth
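
For a feel of how parity files let the receiver repair a damaged chunk without a resend, here is a much simpler stand-in than what Parchive actually does. PAR2 uses Reed-Solomon coding and can repair several blocks, whereas this sketch uses a single XOR parity block that can rebuild exactly one lost block. Names and block size are illustrative.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def split_with_parity(data: bytes, block_size: int):
    """Split data into zero-padded blocks and compute one parity block."""
    blocks = [data[i:i + block_size].ljust(block_size, b"\x00")
              for i in range(0, len(data), block_size)]
    return blocks, xor_blocks(blocks)


def recover(blocks, parity):
    """Rebuild the single block marked None from the others plus parity."""
    lost = blocks.index(None)
    survivors = [b for b in blocks if b is not None]
    repaired = list(blocks)
    repaired[lost] = xor_blocks(survivors + [parity])
    return repaired


data = b"mission critical payload " * 40       # 1000 bytes
blocks, parity = split_with_parity(data, 128)   # 8 blocks plus 1 parity block
damaged = list(blocks)
damaged[2] = None                               # pretend chunk 2 failed its check
restored = b"".join(recover(damaged, parity))[:len(data)]
```

The receiver still needs a per-block checksum to know which block is bad, and real parity archives store that alongside the recovery data.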
    10. Re:TCP? by rezalas · · Score: 2, Insightful

      With the network being as questionable as stated, I can only wonder what part of the network is causing it to be unreliable. Usually if the entire network has issues then you are probably talking about everything in the office coming back to a switch panel and a faulty switch. If only certain transfers from point to point are commonly failing then you probably have wiring issues. In either a hardware or medium case, you need to be fixing the network instead of finding workarounds. Working with the network admin for the facility to detect the source of the issue should be a two-to-three-hour task at most. Save yourself time in the future and spend the bulk time now to fix the real problem.

    11. Re:TCP? by samkass · · Score: 5, Informative

      While others point out, probably correctly, that the problem is probably a binary/ascii conversion, in actuality the error checking on TCP is simply not that good.

      TCP uses a 16-bit checksum, so you have a 1 in 65536 chance of a corrupted packet being incorrectly validated as correct. To make matters worse, it uses 1's complement instead of 2's complement, so a 16-bit word of 0x0000 is indistinguishable from 0xFFFF.

      Ethernet has a 32-bit CRC, so if you're transmitting over that link layer you're probably in good shape. But depending on that from a systems point of view seems risky.

      Much better to only transfer ZIPs and check them at the other end if you only have control over the endpoints. If you can control the transmission, use a better error-correcting high-level protocol or even a forward-error correction protocol on top of TCP.

      Or just use rsync.

      --
      E pluribus unum
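
The weakness of the 16-bit one's complement sum can be checked directly. Below is a small sketch of the Internet checksum as described in RFC 1071, applied to three made-up byte strings: the original, one where a 0x0000 word has turned into 0xFFFF, and one where two 16-bit words have swapped places. All three produce the same checksum.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement checksum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                            # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]      # add next big-endian word
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return ~total & 0xFFFF


good = b"\x12\x34\x00\x00\xab\xcd"
zero_flipped = b"\x12\x34\xff\xff\xab\xcd"   # 0x0000 word became 0xFFFF
swapped = b"\xab\xcd\x00\x00\x12\x34"        # two words changed places
```

A stronger end-to-end check (a CRC or a cryptographic hash over the whole file) catches both cases, which is the argument for verifying at the endpoints.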
    12. Re:TCP? by SnarfQuest · · Score: 2, Insightful

      Binary versus text mode?
      Lousy windows file system screwing up on one or the other end.
      Sparse files.
      Windows "fixing" the data during transmission.
      Loss of packets, and no error checking.
      Windows.

      --
      Who would win this election: Andrew Weiner vs Andrew Weiner's weiner.
    13. Re:TCP? by Obfuscant · · Score: 5, Informative
      Since ASCII files are also ultimately represented as particular sequences of binary data, why does FTP even have an ASCII transfer mode?

      Because of differences between systems like Unix and Windows, where line ends are a simple newline on Unix but a CR/LF pair on Windows. Also systems like VMS which have (had) about thirteen different file formats all inherent in the file structure itself.

      In other words, because all ASCII files are not represented the same way by all different operating systems.

      I know that Windows uses CR/LF for line termination and *nix uses just LF. That's a very minor inconvenience at worst,

      Not if you have an "ASCII" file you are trying to read on Windows that has Unix newline conventions. Try opening a newlined file with notepad, for example.

      ...and little standalone utilities to convert the formats are readily available and have been for some time now.

      "Little standalone utilities" are really handy for small files and small numbers of files. It's really handy when you know the format the file you have is in and what it needs to be. Please tell me how you will identify a VMS fixed record file that you have just ftp'd from a VMS FTP server when it gets to your Windows system. It has NO newlines or CR/LF pairs. You might dump the file somehow and notice that the lines are all 93 characters long and then write yourself a perl script to split it up -- or you could simply tell your FTP client that you are in ASCII mode and let the FTP server/client negotiate some resulting format that your system likes. Now try that with a VMS variable length record file, where the lines are variable length, still without line endings.

      FTP wasn't designed just for hobbyists who want a file or two and have the time to deal with file formats by hand. It was designed to move data, and anything that can be automated should be. "Little standalone utilities" are a pain in the ass when trying to automate something, especially when the critical information necessary to know what specific utility to use has been lost, or is completely unknown to the recipient's system. Like VMS fixed length records on Unix or Windows.

      It just seems like it's not the job of a file transfer protocol to concern itself with what an independent, unrelated application can or cannot do with the file after it's transferred.

      ASCII mode in FTP has nothing to do with anyone trying to tell anyone what they can or cannot do with a file after it's transferred. It's all about knowing how to deal with a hundred different ways of representing ASCII data on dozens of different operating systems and making life EASIER for people who have to do that on a daily basis.

      If YOU would rather operate in BIN mode and worry about which file formats you've just downloaded and how to convert them to an ASCII representation that your software knows how to deal with, more power to you. I got tired of dealing with this the first time I had to convert a VMS "ASCII" file to Unix and I'll let FTP do it silently for me. Yes, I've dealt with users who didn't know what ASCII mode was and downloaded a zipped file in ASCII mode and it didn't work, but the time I've saved just myself not having to deal with converting crap has more than made up for the time I've spent telling them to use BIN mode.
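
The "write yourself a script to split it up" chore for a fixed-record file looks roughly like this when done by hand. It is a sketch under stated assumptions: the records are a known length (93 here, as in the example above) and padded with trailing spaces. The padding convention is a guess for illustration, not something the original file format is known to guarantee.

```python
def fixed_records_to_lines(raw: bytes, record_length: int) -> bytes:
    """Turn a stream of fixed-length records into LF-terminated lines,
    stripping the trailing space padding from each record."""
    if len(raw) % record_length:
        raise ValueError("input is not a whole number of records")
    lines = [raw[i:i + record_length].rstrip(b" ")
             for i in range(0, len(raw), record_length)]
    return b"\n".join(lines) + b"\n"


# Three 93-byte records with no newlines anywhere in the stream.
records = b"".join(text.ljust(93) for text in (b"first line", b"second", b""))
converted = fixed_records_to_lines(records, 93)
```

This only works because the record length is known in advance, and that knowledge is what gets lost when the bytes are copied in binary mode to a system with no notion of record structure.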

    14. Re:TCP? by meerling · · Score: 2, Interesting

      Absolutely.

      If the drives are different sizes, different filesystems, or even just set up with different cluster sizes.
      (Yes, you can do that in Windows, just don't get stupid with the settings.)

      He may have corrupted files, he should really check, but if a different size on different drives is the only thing he's checked, they may be perfectly fine.

      Ancient History Perspective :)
      Back in the Dos days, people were always panicking about their memory not having the exact byte value they expected. Most people didn't understand that different bios versions/brands and different bios options, like shadowing, all affected that value.
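
One way to "really check" without being misled by size-on-disk figures is to compare the logical byte length and a content hash of both copies. A minimal sketch follows; the helper names are invented, and SHA-256 is simply one reasonable choice of hash.

```python
import hashlib
import os
import tempfile


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file's contents in chunks so large files need little memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files have identical length and identical hash."""
    return (os.path.getsize(path_a) == os.path.getsize(path_b)
            and file_digest(path_a) == file_digest(path_b))


# Demo with throwaway files: an exact copy and a copy that is 2 KB short.
with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "original.bin")
    dst = os.path.join(workdir, "copy.bin")
    bad = os.path.join(workdir, "truncated.bin")
    payload = os.urandom(300_000)
    for path, content in ((src, payload), (dst, payload), (bad, payload[:-2048])):
        with open(path, "wb") as handle:
            handle.write(content)
    copy_ok = same_content(src, dst)
    truncated_ok = same_content(src, bad)
```

`os.path.getsize` reports the file's byte length rather than the space its clusters occupy, so differing cluster sizes on the two drives do not affect the comparison.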

    15. Re:TCP? by AvitarX · · Score: 2, Interesting

      You appear to be correct.

      Yet I appear to be "insightful", interesting.

      --
      Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
    16. Re:TCP? by Jimmy_B · · Score: 3, Interesting

      Not if you have an "ASCII" file you are trying to read on Windows that has Unix newline conventions. Try opening a newlined file with notepad, for example.

      As far as I can tell, the problem is entirely unique to notepad. Every other text editor I've ever tried handles files with Unix-style newlines correctly. Since it would be trivial to fix Notepad, I can only assume that Microsoft either doesn't care at all about Notepad, or is deliberately leaving the incompatibility in place to discourage use of Unix.

    17. Re:TCP? by jmauro · · Score: 2, Informative

      Or they'd rather just have you use the already included Wordpad that does handle new lines correctly.

    18. Re:TCP? by Obfuscant · · Score: 2, Informative
      As far as I can tell, the problem is entirely unique to notepad.

      Who rated this "insightful"?

      I'm sorry, but I've worked in this area for years. I was responsible for moving data and source files to and from Unix to DOS to VMS to OSs that are even deader than VMS, and the problem is hardly unique to "notepad". YOU may see it only in notepad because YOU only use Windows, but there are a lot of other OSs out there. If you've never worked on an OS that has structured files inherent in the filesystem, well, lucky you. I have. The newlines in those kinds of files are completely lost when you copy the byte stream contents, because the newlines are implicit and defined in the file structure itself. A fixed-record file doesn't need newlines because every line is the same length.

      Every other text editor I've ever tried handles files with Unix-style newlines correctly.

      There is much more to the world than Windows and Unix-style newlines. If all you have seen is Windows and Unix newlines, I suppose you could think the problem was limited to that, but it really isn't. In fact, if you use FTP much at all, I suspect even you have been protected by ASCII mode, to the point that you never even knew that an FTP site you visited was VMS-based. I know I've been to VMS sites, and ASCII mode is critical if you are dealing with ASCII files.

    19. Re:TCP? by link-error · · Score: 3, Informative

      I replied similar to this above, but if your microwave connection is generating any binary data and you're transmitting using ascii mode, you'll get file size differences at the destination.

      --
      -Unresolved symbol? Byte me!
  4. Robocopy? by wafath · · Score: 5, Insightful
    1. Re:Robocopy? by Krneki · · Score: 2, Informative

      Robocopy works on top of Windows network layer, it's the same as using copy / paste with some extra functionality.

      --
      Love many, trust a few, do harm to none.
    2. Re:Robocopy? by Anonymous Coward · · Score: 5, Informative

      Yeah but that extra functionality contains things like the ability to resume a transfer, retry if things fail, and verify the files after copying.

    3. Re:Robocopy? by Saint+Stephen · · Score: 5, Informative

      MOD PARENT UP. Not to mention it's multithreaded, so it's not really the same as copy/paste - it's the same as a whole bunch of copy/pastes at the same time.

      Why do people keep fighting the Robocopy, I'll never know.

    4. Re:Robocopy? by Malc · · Score: 3, Insightful

      It might be using Windows copy protocols, but it definitely is not like copy/paste. It's restartable for instance. It's way more reliable.

      We have to copy large files to our office in China. FTP always fails. Windows copy via Explorer often fails, but it is also incredibly painful to do when latency is high and one is browsing over the network. Robocopy (depending on system setup) will motor through and is very persistent when there's a connection hiccup. You definitely want restartability if you copy large files at a couple of hundred MB an hour.

      I'd say make sure to break the files up in to chunks if they're large. Also, run 2-4 robocopies in parallel if the latency is high as this will give better throughput. It can do funny things to Windows though (maybe other things wait on some network handle and seem to freeze until one of the robocopy processes moves on to the next file).

      Also, consider doing it over a Cisco VPN. It seems to add some robustness if there is packet loss. I often had trouble accessing servers in the US when I was living in China due to packet loss, but no such problem over a VPN (zero packet loss, but very slow instead, which is better).

    5. Re:Robocopy? by Ritchie70 · · Score: 5, Informative

      Actually, you can specify a single file, it just has a silly syntax.

      robocopy source destination file

      So "robocopy c:\a c:\b myfile.txt" will copy c:\a\myfile.txt to c:\b\myfile.txt.

      --
      The preferred solution is to not have a problem.
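
If the copy is driven from a script, the single-file syntax above can be wrapped so that failures are not silently ignored. This is a sketch with invented function names: it builds the argument list with the restartable (/Z), retry count (/R:n) and retry wait (/W:n) switches, and interprets Robocopy's exit code, where values of 8 or more indicate that something failed to copy. The default retry and wait values are arbitrary.

```python
def build_robocopy_command(source_dir, dest_dir, filename,
                           retries=10, wait_seconds=30):
    """Argument list for copying one file in restartable mode."""
    return [
        "robocopy", source_dir, dest_dir, filename,
        "/Z",                      # restartable mode: resume a partial copy
        f"/R:{retries}",           # number of retries on a failed copy
        f"/W:{wait_seconds}",      # seconds to wait between retries
    ]


def robocopy_succeeded(exit_code: int) -> bool:
    """Robocopy's exit code is a bit field; values of 8 or more mean
    at least one file or directory failed to copy."""
    return 0 <= exit_code < 8
```

On a Windows machine the list could be passed to `subprocess.run(...)` and its `returncode` fed to `robocopy_succeeded`. Note that an exit code of 1 (files were copied) counts as success, so a plain "non-zero means failure" test would be wrong here.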
    6. Re:Robocopy? by oatworm · · Score: 2, Informative

      Active Directory tends to complicate things, though you can use NTBackup or Windows Backup (depending on your Server version) to kind of keep things somewhat under control there. Even then, though, restoring AD from backup using NTBackup is not a particularly fun or, in my experience, reliable proposition. Plus, this doesn't even dig into the rest of a server's system state (in theory, if it's backed up right via NTBackup, you might be able to restore the whole thing without reinstalling every piece of software - good luck!) or attempting a brick-level backup of Exchange.

      It really is phenomenal how much effort Microsoft forces you to go through just to back up their servers. These days, I just go with image-based software for server backups - they seem to do a far more reliable job of getting Windows servers back up in a hurry than file-level products (which Robocopy + NTBackup would qualify as). But, that's just me, and I primarily deal with smallish networks, so I'm not entirely sure how well that scales.

  5. Use BITS by Lothar · · Score: 5, Informative

    Background Intelligent Transfer Service (BITS) can be used to transfer files between Windows servers. It is the technology behind Windows Update. We use it in our company to transfer files across a low-bandwidth satellite connection. Great thing is that it can automatically resume transfer after rebooting both machines. SharpBits offers a nice .NET API. You can find it here: http://www.codeplex.com/sharpbits

  6. domyjobforme tag by EmagGeek · · Score: 2, Insightful

    I love it! Haha... that's probably one of the better tags I've seen.

    1. Re:domyjobforme tag by Anonymous Coward · · Score: 3, Insightful

      Too easily thrown around if you ask me. He's not looking for anyone to set it up, he just wants some options. Isn't that what community is about?

  7. BitTorrent by Inf0phreak · · Score: 4, Insightful

    I'd say BitTorrent -- with firewall rules or some other measure so random people can't see your microscopic swarm. It uses SHA-1 hashes of chunks, so if a torrent client says a file downloaded successfully it's pretty much guaranteed to be true.

    --
    ________
    Entranced by anime since late summer 2001 and loving it ^_^
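
The per-chunk hashing idea can be sketched without a torrent client. The code below is not the BitTorrent wire format or metainfo layout, just the underlying technique: hash fixed-size pieces with SHA-1 on the sending side, then report which pieces fail verification on the receiving side. The piece size and sample data are arbitrary.

```python
import hashlib


def piece_hashes(data: bytes, piece_size: int) -> list:
    """SHA-1 digest of every fixed-size piece, in order."""
    return [hashlib.sha1(data[i:i + piece_size]).digest()
            for i in range(0, len(data), piece_size)]


def bad_pieces(received: bytes, expected_hashes: list, piece_size: int) -> list:
    """Indexes of pieces whose hash does not match and need re-fetching."""
    actual = piece_hashes(received, piece_size)
    return [i for i, digest in enumerate(expected_hashes)
            if i >= len(actual) or actual[i] != digest]


PIECE = 16 * 1024
original = bytes(range(256)) * 400            # 102,400 bytes, 7 pieces
manifest = piece_hashes(original, PIECE)
corrupted = bytearray(original)
corrupted[40_000] ^= 0xFF                     # damage one byte inside piece 2
truncated = original[:-3000]                  # a few kilobytes short
```

Only the failing pieces have to be fetched again, which matters on a slow link: a file that arrives "a few kilobytes smaller" costs one piece, not a full resend.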
  8. rsync should do the trick by bacchu_anjan · · Score: 5, Insightful

    hi there,

        why don't you get cygwin on both the systems and then do a rsync ?

        between your own network, you might want to use robocopy(http://en.wikipedia.org/wiki/Robocopy).

    BR,
    ~A

    1. Re:rsync should do the trick by ericnils · · Score: 3, Informative

      We use Cygwin's rsync to back up Windows servers over a slow Internet connection at work. It works very well for us and using the -z compression option will probably result in much faster transmission over a 2Mbit pipe than FTP will provide. We run rsync as a service on the source and pull to the destination using the rsync command line tool, but you could easily reverse that. You should also consider Microsoft's built-in DFS replication which automates replication of data between two file servers over TCP.

    2. Re:rsync should do the trick by JoeRandomHacker · · Score: 2, Informative

      rsync is great, though on Cygwin there are some caveats. Last time I tried using it to sync a large amount of data I ran into a Cygwin pipe bug (for the pipe between rsync and the ssh process) which caused the transfer to hang. Using the "rsync:" protocol (with an rsync daemon), optionally over an ssh tunnel (port forwarding), worked fine, though it was a bit clunky.

  9. Correct me if I'm wrong... by not+already+in+use · · Score: 2, Interesting

    Wasn't TCP designed for just this? Guaranteed transmission?

    --
    Similes are like metaphors
    1. Re:Correct me if I'm wrong... by Dogun · · Score: 3, Informative

      Implementations of TCP in most operating systems fall a bit short of that, killing off stalled connections, etc. Also, some firewall suites, and some routers make a habit of killing off connections after a certain amount of time, sometimes without regard to whether or not they are 'active'.

      You might have some luck boosting reliability with the TcpMaxDataRetransmissions registry setting in Windows. But ultimately, the poster is going to need to find a file copy suite which retries when connections die.

  10. Line endings! by sys.stdout.write · · Score: 5, Insightful

    they've been using FTP to upload the files, but many times the copied files are a few kilobytes smaller than the originals

    Twenty bucks says you're converting from Windows line endings (/n/r) to Linux line endings (/n).

    Use binary mode and you'll be fine.

    1. Re:Line endings! by wick3t · · Score: 2, Informative

      Twenty bucks says you're converting from DOS line endings (\r\n) to Unix line endings (\n).

      There, fixed that for you.

  11. rsync by itsme1234 · · Score: 5, Informative

    ... is what you want. Yes, you can use it with Windows (with or without cygwin bloat). Use -c and a short --timeout and you're good to go. If you're using it over ssh you're looking at three layers of integrity (rsync checksums, ssh and TCP), two of them quite strong even against malicious attacks, not only against normal stuff. Put it in a script with a short --timeout; if anything is wrong with the link your ssh session will freeze completely, as soon as your --timeout is reached rsync will die and your script can respawn a new one (which will resume the transfer using whatever chunks with good checksum you have already transferred and will again checksum the whole file when it finishes).
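
The respawn script described above might look something like the following. Treat it as a sketch: the function name, attempt limit and pause are invented, and `--partial` is an addition not mentioned in the comment (it asks rsync to keep a partially transferred file so the next run has something to resume from). The `run` and `sleep` parameters exist only so the loop can be exercised without a real rsync.

```python
import subprocess
import time


def rsync_until_done(source, dest, timeout_s=30, max_attempts=20,
                     pause_s=5, run=subprocess.call, sleep=time.sleep):
    """Re-run rsync until it exits 0; return the number of attempts used."""
    command = [
        "rsync",
        "-c",                        # compare files by checksum
        "--partial",                 # keep partial files so a rerun can resume
        f"--timeout={timeout_s}",    # give up on a stalled link quickly
        "-e", "ssh",                 # run over ssh
        source, dest,
    ]
    for attempt in range(1, max_attempts + 1):
        if run(command) == 0:
            return attempt
        sleep(pause_s)
    raise RuntimeError(f"rsync still failing after {max_attempts} attempts")
```

Capping the number of attempts keeps a dead link from turning into an endless loop, and the raised error gives whatever schedules the job something to alert on.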

    1. Re:rsync by doug · · Score: 3, Informative

      Yep, that's what I'd do. The rsync --server means sending signatures instead of files to prevent pointless copies, and it does an excellent job of ensuring good copy or failure. It is certainly better than any ftp variant.

  12. RTFM - set binary mode in FTP by n4djs · · Score: 5, Informative

    Switch to binary mode (the 'binary' command in most command-line FTP clients) prior to moving the file. I bet the file you are moving isn't a text file with CR-LF line terminations as normally found in DOS, or one side is set to binary and the other isn't.

    Ritchie's Law - assume you have screwed something up *first*, before blaming the tool...
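
    For a scripted upload, binary mode can be forced explicitly. A minimal sketch with Python's ftplib, assuming the server supports the SIZE command; the host and credentials in the usage note are placeholders:

```python
import os
from ftplib import FTP


def upload_binary(ftp, local_path, remote_name):
    """Upload in binary (TYPE I) mode; return True if the sizes match."""
    with open(local_path, "rb") as fh:
        # storbinary transfers the bytes untouched, with no line-ending conversion.
        ftp.storbinary(f"STOR {remote_name}", fh)
    ftp.voidcmd("TYPE I")  # many servers only answer SIZE in binary mode
    return ftp.size(remote_name) == os.path.getsize(local_path)
```

    Usage would be along the lines of upload_binary(FTP("ftp.example.invalid", "user", "password"), "data.bin", "data.bin"). A size match only rules out truncation and line-ending conversion, not corruption inside the file.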

  13. Re:Well...duh by metamatic · · Score: 2, Informative

    You don't need to MD5 if you're using rsync. The rsync algorithm already uses checksums to ensure the files are bit-for-bit identical. In fact, rsync 3.x uses MD5.

    --
    GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
  14. That is to be expected by kseise · · Score: 2, Funny

    Think of this transfer model like a car: the further it goes, the more bytes are burned up. They just need to be added back in at a network filling station. I would look to Google for a government-approved provider.

  15. Re:Jesus protocol by Tablizer · · Score: 2, Funny

    Jesus is awesome.

    I've never heard of that product. Who makes it? Can it do binary transfers also? It must be open-source with such an odd name.

  16. Re:Sneakernet by nystire · · Score: 2, Funny

    Or a mine-field...

  17. AS2 FTW by just+fiddling+around · · Score: 2, Interesting

    You should look at the EDIINT AS2 protocol, AKA RFC 4130. This is a widely-used e-commerce protocol built over HTTP/S.

    AS2 provides cryptographic signatures for authentication of the file at reception, non-repudiation and message delivery confirmation (if no confirmation is returned, the transfer is considered a failure), and is geared towards files. There is even an open-source implementation available.

    More complex than FTP/SFTP but entirely worth it if your data is mission-critical and/or confidential. Plus, it passes through most networks because it is based on HTTP.

    --
    You're not old until regret takes the place of your dreams.
  18. Use .complete files. by Prof.Phreak · · Score: 3, Interesting

    Even on reliable connections, using .complete files is a great idea.

    It works this way: if you're pushing, open an FTP session; after the upload completes, check the remote file size. If it matches the local file size, also upload a zero-byte .complete file (or a $filename.complete file containing an md5 checksum, if you want to be extra paranoid).

    Any app that reads that file will first check whether the .complete file is there.

    If the remote file size is less, you resume the upload. If the remote file size is more than the local one, you wipe out the remote file and restart.

    Same idea for the reverse side (if you're pulling the file, instead of pushing).

    You can also set up scripts to run every 5 minutes, and only stop retrying once the .complete file is written (or read).

    Note that the above would work even if the connection was interrupted and restarted a dozen times during the transmission. [we use this in $bigcorp to transfer hundreds of gigs of financial data per day... seems to work great; never had to care for maintenance windows, 'cause in the end, the file will get there anyway (scripts won't stop trying until data is there)].
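
    The decision logic of the push side can be sketched as a small pure function. The action names are made up for illustration, and the actual FTP calls (size query, resume, delete) are left to whatever client you use:

```python
from typing import Optional


def next_action(local_size: int, remote_size: Optional[int],
                marker_exists: bool) -> str:
    """Decide what a retrying push script should do on this run."""
    if marker_exists:
        return "done"          # .complete is already there: stop retrying
    if remote_size is None:
        return "upload"        # nothing on the remote side yet
    if remote_size == local_size:
        return "write_marker"  # sizes match: push the .complete file
    if remote_size < local_size:
        return "resume"        # partial upload: continue where it stopped
    return "restart"           # remote is larger: wipe it and start over
```

    Run from a scheduler every few minutes, a script built around this keeps converging on "done" no matter how many times the link drops in between.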

    --

    "If anything can go wrong, it will." - Murphy

  19. Re:Guaranteed? by jeffmeden · · Score: 3, Funny

    You forgot a few:

    Windows at both ends... Used to use FTP... Considering windows file sharing...

    Is anyone else a little nervous? I hope by 'government' he means Department of Natural Resources or some equally uninteresting entity. I am picturing someone at the SEC going "You know, I swear this accounting data had a few more rows the last time I looked at it-- Oh well it's not like this Madoff guy is actually up to anything strange anyway"

  20. You're kidding, aren't you?? by ballyhoo · · Score: 5, Insightful

    You are kidding about this, aren't you?

    Let me get the facts straight:

    - you have "mission critical files", and the network you're transferring them over is so incredibly badly managed that it doesn't support reliable data transfer
    - you want a technical workaround for this brokenness.

    If this is the case, you don't have a technical problem on your hands; you have a political one.

    "Mission critical" has a meaning: it means critical to the success of the operation. I.e. without these files, your operation or someone else's operation will fail.

    If your management believes that your files are "mission critical", and you're facing a problem of this sort, you need to document the difficulties you're having, along with measurements to support your claims and then make a clear statement that as long as your network path is completely broken, you are absolving yourself of responsibility for the correct transmission of these files.

    If your management doesn't do anything about this, then the files are not "mission critical".

    1. Re:You're kidding, aren't you?? by Dravik · · Score: 3, Insightful

      Mission critical means that you need to get it done even if someone else isn't getting their job done. Standing around in a huff and stomping your feet means that the mission critical information isn't getting moved. What he needs to do is find a way to accomplish his mission despite the difficulties, and then document the problems so they can be addressed.

      --
      The purpose of language is communication, If the idea is clear the grammar ain't important
    2. Re:You're kidding, aren't you?? by eap · · Score: 2, Insightful

      I think you missed the part about the government being involved

  21. Re:Jesus protocol by Loko+Draucarn · · Score: 3, Funny

    Not to mention the three day latency on refreshing the entropy pool.

  22. Re:Well...duh by Alrescha · · Score: 2, Informative

    "You don't need to MD5 if you're using rsync. The rsync algorithm already uses checksums to ensure the files are bit-for-bit identical. In fact, rsync 3.x uses MD5."

    Rsync, by default, does not necessarily do this. I've seen situations where rsync would happily copy files from a remote host over ssh to a destination host and the resulting files failed an independent MD5 test. Rsync was not causing this trouble - but it did fail to detect it. Forcing a checksum of every file (using "-c") would let rsync detect the failure to copy properly (after the entire file was done) and it would retry.

    In the end, a router and one of the hosts were rebooted and the problem went away. The point is that just using rsync and ssh does not guarantee anything.

    A.

    --
    ...bringing you cynical quips since 1998
  23. The protocol needs to be a part of the discussion by raddan · · Score: 2, Informative

    On some level, there isn't much difference between an application and a protocol. In fact, if you ever take a networking theory course, you'll see that each protocol layer in the network stack is, in fact a "protocol machine" (i.e., an application), which does the little protocol dance that makes functions at that layer happen.

    But I digress. What the user is running into here is a fundamental problem with TCP over lossy networks. It really was not designed with really lossy networks in mind. E.g., the congestion control mechanism in TCP ("exponential backoff") makes the assumption that there is a wire sitting there and that certain parameters (like bandwidth) are not going to change. If you need certain QoS guarantees on a wireless link, TCP may be hard-pressed to deliver, because TCP's [limited] QoS mechanisms may make the problem worse. There is a HUGE amount of overhead on 802.11 networks to make sure that TCP doesn't suck.

    I don't know how this person's microwave link is configured, but they might be better served by thinking about the QoS guarantees in the various layers of their network stack. I know a previous poster was joking when they said UDP might be a good option, but look, part of the problem on wireless is TCP's retransmission mechanism. With UDP it is up to the user/application to ask for a retransmit. BitTorrent verifies data in much the same spirit: each small file chunk gets its own hash, those hashes are checked upon receipt, and a chunk that fails is requested again, so something like BitTorrent might not be a bad idea. I like rsync as well (because it has a rolling checksum feature), but again, you have TCP in the mix, and if I recall correctly, rsync will not retry automatically on failures, and automatic retry is what you want.
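
    The per-chunk hashing idea can be sketched independently of any transport. This is an illustration only, not BitTorrent's actual wire format; the chunk size and function names are assumptions:

```python
import hashlib
from typing import List

CHUNK_SIZE = 256 * 1024  # assumed chunk size; tune for the link


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[str]:
    """SHA-1 hex digest of each fixed-size chunk, in order."""
    return [hashlib.sha1(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)]


def chunks_to_resend(expected: List[str], received: bytes,
                     chunk_size: int = CHUNK_SIZE) -> List[int]:
    """Indexes of chunks that are missing or whose hash does not match."""
    got = chunk_hashes(received, chunk_size)
    return [i for i, digest in enumerate(expected)
            if i >= len(got) or got[i] != digest]
```

    The sender ships the hash list first; the receiver then only asks again for the chunk indexes this returns, instead of restarting the whole file.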

  24. Re:Sneakernet by nystire · · Score: 2, Interesting
  25. For everything else there's md5sum by Colin+Smith · · Score: 2, Funny

    The transmission system is irrelevant. All that matters is that you know you have received whatever was sent.

    Just make sure you send a checksum and that the received file matches.

    oh wait... Windows scripting...
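
    For what it's worth, the check itself is only a few lines in a scripting language that also runs on Windows. A minimal sketch in Python, assuming the sender ships the expected MD5 hex digest alongside the file:

```python
import hashlib


def md5_of_file(path, block_size=1 << 20):
    """MD5 hex digest of a file, read in blocks so large files fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the received file hashes to the checksum that was sent."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

    MD5 is fine for catching accidental corruption; if tampering is a concern, swap in hashlib.sha256.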

    --
    Deleted
  26. Re:Guaranteed? by gandhi_2 · · Score: 2, Funny

    I worked on a system for the Utah DNR once. Data about sensitive species, species of concern, and endangered species have security requirements. If someone finds out how many Woundfin we are down to...the terrorists win.

  27. Re:Jesus protocol by melikamp · · Score: 3, Funny

    It must be open-source with such an odd name.

    Close. It's open sores, especially around the wrists.

  28. Re:FTP is fairly reliable... by sexconker · · Score: 2, Informative

    Windows reports file sizes exactly, to the byte.

    It reports both the true file size and the file size on the disk, which is based on the block size and the number of blocks required to store the file. ..

  29. Cygwin + lpd by rlseaman · · Score: 2, Funny

    Set up a BSD lpd queue under Cygwin, something like:

    sendit:lp=/spool/null:sd=/spool:if=/spool/sendit.sh:sf:sh:mx#0:

    Have the sendit.sh script do whatever it is you want with the file. To send a file: lpr -Psendit filename

    Configuration of the network queue left as an exercise for the student. (Hint - queue pathnames locally.)

  30. SSHFS by cenc · · Score: 2, Insightful

    I use sshfs file mounts for all office document file sharing and such, not just one-time transfers. SSH encryption security, with the ability to open and edit files over the network. No goofing around with Samba or Windows file sharing. Regardless, use some sort of ssh or sftp at least.

    Not sure about getting it to work on Windows, but there should be some options.

  31. Z-Modem FTW! by Cytotoxic · · Score: 5, Insightful

    Crappy connection? Resumable transfers? Slow connections? Sounds like the good old BBS days!

    Z-modem is your answer.