Slashdot Mirror


BitTorrent For Enterprise File Distribution?

HotTuna writes "I'm responsible for a closed, private network of retail stores connected to our corporate office (and to each other) with IPsec over DSL, and no access to the public internet. We have about 4GB of disaster recovery files that need to be replicated at each site, and updated monthly. The challenge is that all the enterprise file replication tools out there seem to be client/server and not peer-to-peer. This crushes our bandwidth at the corporate office and leaves hundreds of 7Mb DSL connections (at the stores) virtually idle. I am dreaming of a tool which can 'seed' different parts of a file to different peers, and then have those peers exchange those parts, rapidly replicating the file across the entire network. Sounds like BitTorrent you say? Sure, except I would need to 'push' the files out, and not rely on users to click a torrent file at each site. I could imagine a homebrew tracker, with uTorrent and an RSS feed at each site, but that sounds a little too patchwork to fly by the CIO. What do you think? Is BitTorrent an appropriate protocol for file distribution in the business sector? If not, why not? If so, how would you implement it?"

18 of 291 comments

  1. Sneakernet by 91degrees · · Score: 5, Insightful

    The bandwidth of a DVD in the postal service isn't great but it's reasonable and quite cost effective.

  2. Different torrent client ? by drsmithy · · Score: 5, Informative

    No need to get fancy with an "RSS feed". rTorrent, at least, can be configured to monitor a directory for .torrent files and automatically start downloading when one appears. You could set this up, then simply push out your .torrent file to each site with something like scp or rsync.

    1. Re:Different torrent client ? by Anonymous Coward · · Score: 5, Interesting

      rTorrent watching a directory for .torrent files would be the way to go. Then use unison to keep the .torrent directory in sync.
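      A rough sketch of the watch-directory idea, in Python for illustration only. rTorrent handles this through its own configuration, so the function below (`find_new_torrents`, a hypothetical name) only shows the polling logic such a client or a wrapper script might follow.

```python
from pathlib import Path


def find_new_torrents(watch_dir, seen):
    """Return .torrent files in watch_dir that have not been seen yet.

    `seen` is a set of file names already handed to the client; it is
    updated in place so repeated polls only report new arrivals.
    """
    new_files = []
    for path in sorted(Path(watch_dir).glob("*.torrent")):
        if path.name not in seen:
            seen.add(path.name)
            new_files.append(path)
    return new_files
```

      Calling this on a timer at each store and passing every returned path to the torrent client would give the "push" behaviour the submitter asked for, with no one clicking anything on site.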

  3. Works great by Anonymous Coward · · Score: 5, Insightful

    BitTorrent is an excellent intranet content-distribution tool; we used it for years to push software and content releases to 600+ Solaris servers inside Microsoft (WebTV).

    -j

  4. Sure, why not? by sexybomber · · Score: 5, Insightful

    Is BitTorrent an appropriate protocol for file distribution in the business sector?

    Sure! BitTorrent, remember, is only a protocol; it has just become demonized because of the types of files shared with it. But if you're sharing perfectly legitimate data, then what's wrong with using a protocol that's already been extensively tested and developed?

    Just because it's been used to pirate everything under the sun doesn't make it inappropriate in other arenas.

  5. rsync by timeOday · · Score: 5, Informative

    How much do these disaster recovery files change every month? If they stay mostly the same, using rsync (or some other binary-diff capable tool) may let you keep your simple client/server model while bringing bandwidth under control.

    1. Re:rsync by Anonymous Coward · · Score: 5, Informative

      Yes, and there are ways you can use rsync from well-planned scripts that are very powerful beyond just file transfer.

      1. The basic case of "transfer or update existing files at destination to match source." It always takes advantage of existing destination data to reduce network transfers.

      2. The creation of a new destination tree that efficiently reuses existing destination data in another tree without modifying the old tree. See --copy-dest option.

      3. In addition to the previous, don't even create local disk traffic of copying existing files from the old tree to new, but just hard link them. This is useful for things like incremental backup snapshots. See --link-dest option.
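      The three cases above can be sketched as a small Python helper that builds the rsync argument list. `rsync_command` and the paths in the usage note are hypothetical; `-a`, `--copy-dest` and `--link-dest` are standard rsync options.

```python
def rsync_command(source, dest, copy_dest=None, link_dest=None):
    """Build an rsync argument list for the three cases listed above.

    With neither option set, this is the basic "update dest to match
    source" case. copy_dest and link_dest name an existing tree on the
    destination side whose unchanged files are copied or hard linked
    into the new tree instead of being transferred again.
    """
    cmd = ["rsync", "-a"]
    if copy_dest is not None:
        cmd.append(f"--copy-dest={copy_dest}")
    if link_dest is not None:
        cmd.append(f"--link-dest={link_dest}")
    cmd += [source, dest]
    return cmd
```

      For example, `rsync_command("dr/", "store01:/dr/2024-06/", link_dest="/dr/2024-05")` yields a command that could be handed to `subprocess.run(cmd, check=True)` to create a new monthly snapshot that hard links whatever did not change.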

      It may not be as sexy as p2p protocols, but you can implement your own "broadcast" network via a scattered set of rsync jobs that incrementally push their data between hops in your network. And a final rsync with the master as the source can guarantee that all data matches source checksums while having pre-fetched most of the bulk data from other locations.
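      One way to picture that scattered set of rsync jobs is a schedule in which every host that already has the data pushes to one host that does not. The sketch below is only an assumption about how such a schedule could be planned; `push_schedule` and the host names are made up.

```python
def push_schedule(master, sites):
    """Plan rounds of (source, destination) pushes.

    In each round every host that already holds the data pushes to one
    host that does not, so the number of holders roughly doubles.
    """
    holders = [master]
    pending = list(sites)
    rounds = []
    while pending:
        current = []
        # Iterate over a snapshot so hosts added this round wait a round.
        for source in list(holders):
            if not pending:
                break
            dest = pending.pop(0)
            current.append((source, dest))
            holders.append(dest)
        rounds.append(current)
    return rounds
```

      With one master and seven stores this plans three rounds instead of seven pushes from the corporate office. The final rsync from the master to every site, as described above, would then only need to confirm checksums.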

      I've been enjoying various rsync applications such as the following (to give you an idea of its power): Obtain any old or partial mirror of a Fedora repository and update it from an appropriate rsync-enabled mirror site, to fill in any missing packages. This is a file tree of packages and other metadata. Concatenate all of the tree's files into one large file. Then use rsync to "update" this file to match a corresponding DVD re-spin image on a distro website. Rsync will figure out when most of those file extents cooked into the ISO image are already in the destination file, and just go about repositioning them and filling in the ISO filesystem's metadata. An incredibly small amount of traffic is spent performing this amazing feat.

  6. Cisco already makes a product to do this - WAAS by colinmcnamara · · Score: 5, Informative

    It is like rsync on steroids. Cisco's WAN optimization and application acceleration product allows you to "seed" your remote locations with files. It also uses a technology called Data Redundancy Elimination (DRE) that replaces large data segments that would be sent over your WAN with small signatures.

    What this means in a functional sense is that you would push that 4 Gig file over the WAN one time. On any subsequent push you would sync only the bit-level changes, effectively transferring only the 10 megabytes that actually changed.
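    The general idea of replacing data segments with signatures can be illustrated with a toy Python sketch. This is not Cisco's algorithm: it assumes fixed-size chunks and SHA-256 digests purely for illustration, and the function names are hypothetical.

```python
import hashlib


def chunk_signatures(data, chunk_size=4096):
    """Return one SHA-256 digest per fixed-size chunk of `data`."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).digest()
        for i in range(0, len(data), chunk_size)
    ]


def bytes_to_send(old, new, chunk_size=4096):
    """Count bytes of `new` whose chunk signature the receiver lacks."""
    known = set(chunk_signatures(old, chunk_size))
    total = 0
    for i in range(0, len(new), chunk_size):
        chunk = new[i:i + chunk_size]
        if hashlib.sha256(chunk).digest() not in known:
            total += len(chunk)
    return total
```

    Chunks the remote side already holds cost only a signature on the wire, so a month-to-month update in which little changed sends little data. Fixed-size chunks are the simplest choice here and stop matching once bytes are inserted or deleted mid-file; a real product presumably handles that case better.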

    While it is nice to get the propeller spinning, there is no sense reinventing the wheel.

    Cisco WAAS - http://www.cisco.com/en/US/products/ps5680/Products_Sub_Category_Home.html

    --
    Colin McNamara - CCIE #18233 "The difficult we do immediately, the impossible just takes a little longer"

  7. If the CIO expects "official" support... by aktzin · · Score: 5, Informative

    Personally I like the portable media shipment suggestions. But if your CIO/company requires enterprise software from a large vendor with good support, have a look at IBM's Tivoli Provisioning Manager for Software:

    http://www-01.ibm.com/software/tivoli/products/prov-mgrproductline/

    Besides the usual software distribution, this package has a peer-to-peer function. It also senses bandwidth. If there's other traffic it slows down temporarily so it won't saturate the link. Once the other traffic is done (like during your off-hours or maintenance windows) it'll go as fast as it can to finish distributing files.

    --
    Quantum mechanics: the dreams that stuff is made of.

  8. Re:Snail-mail USB sticks by SirLurksAlot · · Score: 5, Insightful

    Why would they want to pay for those USB sticks (and any shipping fees that might be involved) when they have a perfectly good network already in place to send the data in a secure manner? There are too many variables involved in using USB sticks as a means of transferring back-up data. Sticks could get damaged, lost, stolen, etc, not to mention that the server at each store would need to allow USB access which could potentially open them up to other security risks. Just imagine if someone at a store decided to plug in their own USB stick and swipe a few files. Nice idea, but there are too many risks involved with a physical transfer of data.

    --
    God, schmod. I want my monkey man!

  9. No, you fool! by bistromath007 · · Score: 5, Funny

    Haven't you been reading the warnings around here about how bad it is for the Internet? If big business starts using BT we'll microwave the baby!

  10. Re:Bittorrent is not secure by jd142 · · Score: 5, Informative

    While security is always something to be considered, this from the question:

    "private network of retail stores connected to our corporate office (and to each other) with IPsec over DSL, and no access to the public internet"

    Private network? Check.
    No access to public internet? Check.

    So pretty much no way for the files to be seeded outside the company.

    And even if there were a way to seed on the internet when they don't have access to it, password protect the file so only a client with the password can download it. That's not unbreakable, but if a competitor wanted the information there are easier ways to get it.

  11. How I would do it... by LuckyStarr · · Score: 5, Interesting

    ...is quite straightforward, in fact.

    1. Create a "Master" GnuPG/PGP Key for yourself. This key is used to sign all your data as well as your RSS feed (see below).
    2. Set up an RSS feed to announce your new files. Sign every entry in it using your "Master-Key".
      • All the stores check the validity of your RSS feed via your public key.
      • All the stores have one (or the same) GnuPG/PGP key to decrypt your files. The beauty of GnuPG/PGP is that given many destinations you can encrypt your data so that every recipient (each with their own key) can decrypt them. Nice, eh?
    3. Set up a standard BitTorrent server to distribute your files.
    4. Announce all your new files via your RSS feed.

    This has many advantages:

    The beauty of this system is that it relies heavily on existing technology (BitTorrent, RSS, GnuPG, etc), so you can just throw together a bunch of libraries in your favourite programming language (I would use Python for myself), and you are done. Saves you time, money and a lot of work!

    Furthermore you do not need to have a VPN set up to every destination as your files are already encrypted and properly signed.

    Another advantage is: As this is a custom-built system for your use-case it should be easy to integrate it into your already existing one.
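    A minimal Python sketch of the store-side check described above, under stated assumptions: the `<signature>` element is an invented custom tag (RSS defines no such element), and `verify` is a caller-supplied function that in practice might wrap GnuPG. Neither comes from an existing tool.

```python
import xml.etree.ElementTree as ET


def verified_torrent_urls(feed_xml, verify):
    """Return enclosure URLs from RSS items whose signature checks out.

    `verify(url, signature)` is a caller-supplied callable (hypothetical).
    The <signature> element is an assumed custom tag, not part of RSS.
    """
    urls = []
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        signature = item.findtext("signature")
        if enclosure is None or signature is None:
            continue  # skip unsigned or malformed entries
        url = enclosure.get("url")
        if url and verify(url, signature):
            urls.append(url)
    return urls
```

    Each store would fetch the feed, keep only the entries that pass verification against the master public key, and hand those URLs to its BitTorrent client.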

    --
    Meme of the day: I browse "Disable Sigs: Checked". So should you.

  12. How is the VPN setup by eagle486 · · Score: 5, Informative

    If the VPN is set up in a standard hub-and-spoke configuration, then BitTorrent would not help, since all traffic between sites has to go via the central site.

    Your best bet is multicast; there are programs for software distribution that use multicast.
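    The hub-and-spoke point can be put into numbers with a small Python sketch. It assumes the central site seeds one full copy and every other copy is passed from store to store; the function name and the store count of 100 are illustrative, and the 4 GB figure comes from the question.

```python
def hub_traffic_gb(stores, file_gb, hub_and_spoke=True):
    """Return (upload_gb, download_gb) crossing the central site's link.

    Assumes the hub seeds one full copy and every other copy is passed
    from store to store.
    """
    seed = file_gb
    relayed = (stores - 1) * file_gb
    if hub_and_spoke:
        # Store-to-store traffic enters the hub and leaves it again.
        return (seed + relayed, relayed)
    # With direct store-to-store tunnels the hub only sends the seed copy.
    return (seed, 0)
```

    Under these assumptions, 100 stores and a 4 GB file put 400 GB of upload on the central link in a hub-and-spoke VPN, the same as plain client/server, plus 396 GB of download. With direct tunnels between stores the central site uploads only the 4 GB seed.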

  13. it's called dsync by slashdotmsiriv · · Score: 5, Interesting

    and you can find documentation for it here:
    http://www.cs.cmu.edu/~dga/papers/dsync-usenix2008-abstract.html

    It is rsync on steroids that uses a BitTorrent-like P2P protocol that is even more efficient because it exploits file similarity.

    You may have to contact the author of the paper to get the latest version of dsync, but I am sure they would be more than happy to help you with that.

  14. Use existing technology by Mostly+a+lurker · · Score: 5, Funny

    CIOs are notoriously conservative. Any solution you suggest that involves building a solution from scratch will scare them. The solution is to use existing proven technology. In the MS Windows world, at least, root kits have been distributing updates successfully for years. You should be looking at simply modifying an existing root kit to your requirements.

  15. Re:In a word, Yes by nabsltd · · Score: 5, Insightful

    For Blizzard, updates to World of Warcraft are very much a "business critical function".

  16. Windows DFS -- Don't use FRS by anexkahn · · Score: 5, Informative

    In Windows Server 2003 R2/Windows Server 2008 they really improved DFS. It lets you set up throttling in 15-minute increments, and with full mesh replication it decentralizes your replication, kind of like BitTorrent. However, you have to make sure you don't accidentally use FRS, because it sucks.

    Where I work we have 5 branches that pull data from our data center. I have DFS replication set up so I can have all our software distribution at the local site. I need to keep the install points at all the sites the same, so I use DFS to replicate all the data; then to get to it I type \\mydomain.com\DFSSharename. Active Directory determines what site I am in, then points me to the local share. If the local share is not available, it points me to the remote share, or to a secondary share in the same site, so it gives you failover for your file servers.

    If you don't have any Windows boxes, this won't work, and this really locks you into Microsoft, but it won't cost you anything more than what you have already paid. Below is a link to Microsoft's page with more information, including how to set it up: http://www.microsoft.com/windowsserver2003/technologies/storage/dfs/default.mspx

    --
    Curious about Storage and Virtualization? Check out