Slashdot Mirror


Cringely's P2P Backup Idea

gewg_ writes "If Napster and BitTorrent had a baby, would it be Baxter? As a follow-on to Cringely's last column, where he talked about having a backup strategy in the wake of Hurricane Frances, this week he proposes a distributed RAID notion as a solution."

62 of 205 comments

  1. Baxter is already taken! by Artifex · · Score: 4, Informative

    Baxter is, of course, the famous IRC client for BeOS. (Hi, Seth!)

    --
    Get off my launchpad!
  2. What an awesome idea by Faust7 · · Score: 4, Funny

    Depending on exactly what you have stored, millions of people may want to help you back up as soon as possible.

    1. Re:What an awesome idea by Mod+Me+God+Too · · Score: 5, Interesting

      I once encoded some data in a few MP3s... this was back in 2000. The MP3s were long speech files, about 30 MB per file at 160kbps, and were popular but took a long time to transfer. To propagate the 'new' files as quickly as possible, I reduced the bit rate from 160kbps to 32kbps and added in the 'extra' 'noise' as I did this. As it's speech, it didn't really matter.

      If I do a search now they're easy to find, much easier than the original 160kbps were.

      This was just a test, no special data used - but an amazing way to archive and distribute data.

      --
      --

      It is not the commies, the government, the nigger, nor the corporates. It is your paranoia.
  3. p2p backup by khrtt · · Score: 5, Funny

    I think this is old news. Some people have been backing up the source code for viruses that they wrote on Kazaa for months now.

    1. Re:p2p backup by aqua · · Score: 3, Interesting

      I made a related waggish proposal a couple of years ago:

      1. Make tarball of backup
      2. Encrypt if desired
      3. Encode tarball, 4-8 bytes at a time, in email addresses
      4. Put email addresses on web
      5. Wait for spam

      Presto -- spammers now pay for your backup; anytime you have a disk failure, just wait a while and watch your spamcan or smtp log, and reconstruct your backup at will.

      (Some assembly required, offer void where prohibited)
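
      A minimal sketch of step 3 above and of the later reconstruction, assuming hex encoding, a sequence number in the local part, and a hypothetical domain `backup.example.com`; none of these details come from the original proposal.

```python
# Sketch only: DOMAIN, CHUNK and the address layout are assumptions.
DOMAIN = "backup.example.com"  # hypothetical domain whose mail log you can read
CHUNK = 8                      # payload bytes per address (the proposal says 4-8)


def encode(data):
    """Turn bytes into addresses of the form <seq>.<hex payload>@DOMAIN."""
    addresses = []
    for offset in range(0, len(data), CHUNK):
        chunk = data[offset:offset + CHUNK]
        seq = offset // CHUNK
        addresses.append(f"{seq:08x}.{chunk.hex()}@{DOMAIN}")
    return addresses


def decode(addresses):
    """Rebuild the bytes from addresses seen in any order, duplicates allowed."""
    parts = {}
    for address in addresses:
        local = address.split("@", 1)[0]
        seq, payload = local.split(".", 1)
        parts[int(seq, 16)] = bytes.fromhex(payload)
    return b"".join(parts[seq] for seq in sorted(parts))


if __name__ == "__main__":
    sample = b"tarball bytes go here!"
    harvested = list(reversed(encode(sample)))  # spam arrives in arbitrary order
    print(decode(harvested) == sample)
```

      The sequence number is what lets you reassemble the chunks regardless of the order in which the spam shows up.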

  4. No thanks by Lord_Dweomer · · Score: 4, Insightful
    Maybe this would be good for some data, but I would never back up sensitive data on something like this. Nor would a lot of businesses.

    --
    Buy Steampunk Clothing Online!
    1. Re:No thanks by OrangeHairMan · · Score: 3, Insightful

      I would never back up sensitive data on something like this

      Encryption? Simply using GnuPG or any of the free AES encryptors out there will make it incredibly secure. If your data is sensitive enough, you should be doing this already...

      -orange

    2. Re:No thanks by proj_2501 · · Score: 3, Insightful

      the issue is not necessarily secrecy, but knowing you can get that data back exactly when you want to.

    3. Re:No thanks by DarkHelmet · · Score: 2, Interesting
      Encryption? Simply using GnuPG or any of the free AES encryptors out there will make it incredibly secure. If your data is sensitive enough, you should be doing this already...

      Or for that matter, why not build encryption into the system itself, so that you don't have to manually do it.

      --
      /^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i
    4. Re:No thanks by Ageless · · Score: 2, Funny

      Or better yet, read the article which says the system would do exactly that!

    5. Re:No thanks by slasher+guy · · Score: 2, Insightful

      And then lose your key with the rest of the data!

    6. Re:No thanks by pHDNgell · · Score: 2, Interesting

      Maybe this would be good for some data, but I would never back up sensitive data on something like this. Nor would a lot of businesses.

      I've been backing up sensitive data almost exactly like this for quite a while now. I've got an application that breaks a stream of data into chunks and encrypts them. It compares the md5 of the source block against the md5 of the same block from the previous backup. If they match, it hard links the block into the backup directory; if they don't match, it encrypts the block and stores a new copy.

      The net effect is that I have about 2.5GB of full database dumps nightly that take up about 3MB of storage space (thus takes about 3MB to transfer it with rsync -H). I do this without having to store the unencrypted stream.
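
      The block comparison described above can be sketched roughly as follows. This is not the poster's application: the block size and function names are assumptions, and the hard-linking and GPG steps are left out so that only the change detection is shown.

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # assumed; the comment does not state a block size


def block_digests(stream, block_size=BLOCK_SIZE):
    """Return the md5 hex digest of each fixed-size block of the stream."""
    return [
        hashlib.md5(stream[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(stream), block_size)
    ]


def plan_backup(previous, current):
    """Split block indexes into (reused, changed).

    'reused' blocks match the previous backup and could be hard linked;
    'changed' blocks would need to be encrypted and stored again.
    """
    reused, changed = [], []
    for index, digest in enumerate(current):
        if index < len(previous) and previous[index] == digest:
            reused.append(index)
        else:
            changed.append(index)
    return reused, changed


if __name__ == "__main__":
    old = b"a" * (3 * BLOCK_SIZE)
    new = old[:BLOCK_SIZE] + b"b" * BLOCK_SIZE + old[2 * BLOCK_SIZE:]
    print(plan_backup(block_digests(old), block_digests(new)))
```

      Only the changed blocks cost transfer time, which is why a nightly dump that barely changes can move in a few megabytes.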

      So yeah, it may be better to not send my GPG encrypted blocks to these other machines, but I trust GPG about as much as I trust my own network and applications keeping my data safe, so it's good for me.

      I do the same thing with my DB dumps and my mail. I only have a 144kbps internet connection, so being able to send my full dumps out nightly across it is a real win.

      --
      -- The world is watching America, and America is watching TV.
  5. over and over again ... by duplicatedAccount · · Score: 3, Interesting

    Well, we leave the data where it belongs: in the proxy network where the processes live too. Still a bit incomplete, but maturing WebDAV and mountable slices forthcoming...

  6. Freenet by John_Allen_Mohammed · · Score: 5, Interesting

    Just insert a bunch of data into the network.. record the keys and retrieve once a week then delete. That should keep the data retrievable from the network for a good while. Using two nodes would help. Plus everything is encrypted with some heavy shit.

    Or, just make a local-freenet on the company LAN.. everything is encrypted and unretrievable without the proper keys, so it's very secure and it's distributed.. + FEC encoding.

    That assumes freenet works, AFAIK it's still fucking broken. Ian Clarke is playing too much politics with the project and the only coder that really understands freenet (Matthew Toseland) is swamped with ideas, day after day.. it just gets worse and worse... The donations seemed like a good idea, but after watching the DEV list for the last 18 months, I realize it's a failed project :(

    --

    Skype Me! username: john_allen_mohammed
    1. Re:Freenet by Joe+Tie. · · Score: 2, Interesting

      Ah, so that explains it. I finally got enough RAM to keep freenet going 24/7, and was surprised to find it so unreliable. I wasn't expecting a speed demon, but I was expecting that links to files on freesites would work if the site itself did. That, so far, has seldom been the case. Are there any other similar projects going on?

      --
      Everything will be taken away from you.
    2. Re:Freenet by MrJay · · Score: 2, Interesting

      That assumes freenet works, AFAIK it's still fucking broken. Ian Clarke is playing too much politics with the project and the only coder that really understands freenet (Matthew Toseland) is swamped with ideas, day after day.. it just gets worse and worse... The donations seemed like a good idea, but after watching the DEV list for the last 18 months, I realize it's a failed project :(

      Check out other development lists on popular projects (if they're public). You'll find that heated debates, arguments, and blatant personal attacks are very common. The Linux kernel list has years of flames between Linus, Alan and other major contributors.. but development continues, doesn't it?

      Same in this case. There are two sets of debates on the Freenet lists. The first set includes those with a background in computer science, an understanding of programming, or an understanding of the theoretical concepts behind decentralized communications. The second (and much larger) set of people are primarily concerned with advancing their own positions within the project. Their ideas are based on false assumptions or a blatant ignorance of Freenet's goals. They will propose things that are completely opposite the goal of a "decentralized anonymous attack-resilient network" and then blame Ian's ego when their ideas are shot down (perhaps unapologetically by Ian himself, sometimes by others). No amount of hand-holding or compassion can convince this group that their ideas are simply incompatible.

      All this aside, the stable releases have been working excellently for me and I am running a stock Freenet node; no tweaks. The introduction of NGR also came with lots of bugs, which (imo) is the main cause for the lackluster performance of Freenet. Now that it's working better, perhaps we can judge NGR on its abilities (for better or worse) instead of the bugs which cause any analysis of Freenet itself to be inaccurate.

    3. Re:Freenet by MrJay · · Score: 2, Interesting

      Entropy is dead and has been for a few months at this point. They decided that Freenet is broken and claimed they could do better. Time did tell.

      The other point many people critical of Freenet make is that other P2P systems are much faster. One chap came into the Freenet channel and claimed Entropy is much better than Freenet because it's written in C++ and real fast. He didn't realize that Entropy development was dead and that the network consisted of about 20 peers. Nobody knows the precise number of Freenet nodes in the stable network, and any estimate I give is just a guess.

      Ever since March 31st/April 1st of 2003, it's just been downhill.

      Ah, the introduction of NGR. The idea is that classical routing was not scalable, so something else was needed if Freenet was to support more users. It didn't make sense when a splinter group broke off from the main project and used the infamous "build 692" to claim that new Freenet development was broken, since the old build works for the 50 or so people who were using it. We are now close to the point where a proper simulation of NGR and Freenet will tell us much about NGR itself, since I believe most of the terrible bugs have been fixed.

    4. Re:Freenet by MrJay · · Score: 2, Interesting

      I think you meant 'unfair' instead of 'inaccurate'. And that still sounds weird. Imagine a benchmark test that came out, showing Intel's newest processor to suck due to a bug. Would you buy Intel saying "yeah, but if you ignore the bug, it rocks"?

      Apples and Oranges. And I meant "inaccurate". Fairness isn't an issue. Your analogy is flawed for the following reasons:

      * Intel makes chips that are fundamentally still based on the x86 model, which has been around for over 20 years.
      * Freenet is around 5 years old, and still not at version 1.0 yet.
      * Intel doesn't benchmark chips that don't work; only chips that are working.

      You've brushed aside the fact that developing an anonymous, decentralized network that is resilient to attack is hard, and much of the theory is untested in the real world, even though mathematically the idea seems perfect. Since you pay hundreds of dollars for an Intel chip, and Freenet is itself Free, what do you expect given that Freenet isn't actually complete? A perfect product? Give us an R&D budget equal to Intel and Freenet will work much faster than the current pace.

      And where did I say benchmark? I said analysis, and what we wish to analyze is how Freenet is currently routing, for better and/or for worse. Once this is known decisions can be made to move Freenet towards a real release. What part of this is unhealthy? And can you even name one other P2P network that scales better than Freenet?

  7. Re:Queue Linus Quote in..... by Anonymous Coward · · Score: 5, Informative

    In case they missed it.

    "Backups are for wimps. Real men upload their data to an FTP site and have everyone else mirror it."

  8. Interesting idea by scoser · · Score: 5, Insightful
    Now the world's porn will be safe forever!

    But on the serious side, the claim of using encryption to store data on someone's hard drive worries me. Let's say the encryption gets broken. Now you might get Aunt Nedda's cookie recipes, but then again, you might get BobCo's strategic investment plan for the next 6 months as well. I can see people signing up just for the chance to hunt through people's data.

    1. Re:Interesting idea by legirons · · Score: 2, Insightful

      "But on the serious side, the claim of using encryption to store data on someone's hard drive worries me. Let's say the encryption gets broken. Now you might get Aunt Nedda's cookie recipes, but then again, you might get BobCo's strategic investment plan for the next 6 months as well."

      Worse, if this Aunt Nedda lives in the UK, she could go to jail for 2 years for not being able to decrypt the files on her hard-drive at the request of the police.

    2. Re:Interesting idea by bloo9298 · · Score: 2, Interesting

      On the contrary, I'd say Auntie has a really strong case that she never had the key to someone else's encrypted data stored on her drive, so the RIP Act would not apply to her.

  9. Not The First w/ The Idea by william_lorenz · · Score: 4, Informative

    Cringely's not the first with this kind of idea. In fact, the Freenet Project already implements something to this effect. Although not specifically designed for reliable backups, the distributed caching algorithms essentially replicate data towards where it's most often needed, helping to improve network performance and creating copies of important data along the way so that it won't be destroyed if a central server fails. Obviously not a commercial solution, but very interesting.

    1. Re:Not The First w/ The Idea by Fnkmaster · · Score: 2, Interesting

      And oddly, Simson Garfinkel, another well-known technopundit, submitted a very similar idea (P2P backup service) as a business plan to the MIT 50k competition back in 2002. See here for the entry summary (search in the page for Garfinkel). Anyway, I somehow dredged that up from the back of my brain when I saw this Cringely piece because I recalled that Garfinkel was interested in actually doing something like this several years back.

  10. makes my heart a little warmer... by sgant · · Score: 2, Interesting

    From the article:

    It sounds from every description like the solution is Linux-specific, but I'm sure it can be made to work with other UNIX variants, especially since Gmail, itself, runs on Apple xServe 1u boxes. Windows compatibility is unknown, but I'm sure someone will solve that soon.

    I know, it's a little childish, but I get a good feeling when I see something small...even this little thing here...that thinks of other OS's first and Windows compatibility will be "real soon now" or something like that.

    --

    "Leo Fender was in a 'state of grace' when he designed the Stratocaster." -- Paul Reed Smith
  11. I lived the storm by jsm008us · · Score: 2, Interesting

    Well, I lived through this storm, checking my PC upstairs to make sure nothing was going to damage it. If the storm was risking the roof flying off and my room becoming flooded, I would have taken out my hdd. This sounds like a brilliant idea.

    Hey, it beats trying to store data to gmail accounts! ;)

    --

    mysql>SELECT * FROM users WHERE clue > 0
    0 Rows Returned
  12. Save Betamax by chatooya · · Score: 4, Interesting

    Ideas like Cringely's will be impossible if the INDUCE Act passes.

    Save Betamax is a national Congress call-in day this Tuesday to oppose the INDUCE Act. It might be our last chance to stop this bill.

  13. Re:damn.. by dotwaffle · · Score: 5, Interesting

    I had this idea in about '97 or '98. I looked around to see if anyone else had done anything like this (remember, this is kinda pre-mass-P2P) and found that someone had done so, but as a business-scale solution. I think it was called Mango, and is still in production today. It essentially made a portion of your drive available as a drive letter, then whatever was copied onto it could be seen by all. The data was stored in at least 2 places, so if one went down, there was still one copy, and the remaining copy would duplicate, so that there were always at least 2 copies. In the end, I think nobody went for it because it was too expensive... But this is EXACTLY what a lot of Small-Medium businesses need atm. Bring on the Mangos!

  14. Much faster by interiot · · Score: 3, Informative
    Alternatively, you can spend $100-200 on an iPod-sized laptop drive enclosure and drive, and have a MUCH faster incremental backup system that's easy to store away from the original data (eg. store your home backup drive at work).

    As a bonus, you can use it to transport data (eg. your mp3 collection) between places, or even use it to boot linux anywhere with much more space and document storage capability than Knoppix.

    1. Re:Much faster by interiot · · Score: 2, Insightful

      And compared to a mini-iPod, you get something that 1) you don't have to worry about power since it has no batteries and doesn't require external power, 2) you get the same amount of disk capacity for something like 1/3rd the cost, 3) THEY SUPPORT USB MASS-STORAGE drivers so any modern OS can talk to it without extra drivers or funky software. Yes, it's not a portable music player, but this solution may be more appropriate for geeks who spend all their time next to a computer in one form or another, or are a little more interested in the data-transport capabilities than the convenient music playing.

    2. Re:Much faster by interiot · · Score: 2, Insightful

      The P2P solution either requires users to have their cable modem pegged and nearly unusable for 60 days (80 GB is the current best laptop drive size, most cable modems max out at 128kbps up), or that they back up only a fraction of their hard drive. I can't quite figure out how carrying a laptop drive around, full of your MP3's, which you can play on any computer you sit at, is any less convenient than either of those options.
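
      The 60-day figure can be checked with a quick calculation. This sketch assumes decimal units (80 GB = 80 x 10^9 bytes, 128 kbps = 128,000 bits/s), a fully saturated uplink, and no protocol overhead, none of which the comment spells out.

```python
SECONDS_PER_DAY = 86400


def upload_days(size_bytes, uplink_bits_per_sec):
    """Days needed to push size_bytes through a saturated uplink."""
    return size_bytes * 8 / uplink_bits_per_sec / SECONDS_PER_DAY


if __name__ == "__main__":
    days = upload_days(80 * 10**9, 128 * 10**3)
    print(f"{days:.1f} days")  # prints "57.9 days"
```

      Real-world overhead and a shared line would only push that number up, so "60 days" is, if anything, optimistic.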

  15. Nice idea, but by moonbender · · Score: 4, Interesting

    It's a neat idea. In a nutshell, he suggests a Peer to Peer encrypted storage network. You get exactly as much storage room as you are willing to offer yourself for others to use. When you store anything, it's encrypted and automatically spread to other systems.

    It doesn't make for a very safe backup, though: What happens if somebody decides to stop the service and just deletes his local storage? You've got no more backup at least for a while, and you might not even know it. And of course, other people have head crashes, too, which would also obliterate your backup at least for the time it takes to recreate it from your own data. Of course, by that time, you might have deleted it yourself, either by accident or knowingly, since you have a backup after all. A viable solution would be to store every file multiple times on different remote servers, although that'd lower the storage capacity you get. It's still the right step, though.

    The crucial problem is that the service provider can't really give any guarantees that you will be able to regain your lost data. With three or more independent copies in different locations, it's very unlikely that the backup won't work for some reason, but a backup that's not 100% is not a very useful one, especially in those situations where backups are really crucial.
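
    To put rough numbers on "very unlikely", here is a sketch under a strong assumption: every copy is unavailable independently with the same probability. The probabilities used are purely illustrative, not measurements of any real network.

```python
def loss_probability(p_copy_unavailable, copies):
    """Chance that every copy is gone at once, assuming independent failures."""
    return p_copy_unavailable ** copies


def copies_needed(p_copy_unavailable, target):
    """Smallest number of copies whose joint loss probability is <= target."""
    if not 0 < p_copy_unavailable < 1 or target <= 0:
        raise ValueError("need 0 < p < 1 and target > 0")
    copies = 1
    while loss_probability(p_copy_unavailable, copies) > target:
        copies += 1
    return copies


if __name__ == "__main__":
    # Illustrative only: each peer holding a copy is unreachable 20% of the time.
    for copies in (1, 2, 3, 5):
        print(copies, loss_probability(0.2, copies))
    print(copies_needed(0.2, 0.001))
```

    More copies shrink the risk quickly but never to zero, and correlated failures (a regional outage, a popular client bug) would break the independence assumption.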

    It's still a neat idea, and to my knowledge has not been done to that degree of sophistication. Of course, as others suggest, nobody is stopping you from inserting encrypted data into Freenet, but that's nowhere near as fast and secure as this could be. And while it's not a true backup, it's better than no backup at all, and most likely enough security for many persons.

    --
    Switch back to Slashdot's D1 system.
  16. similar already exists by ei4anb · · Score: 2, Informative

    http://www.csua.berkeley.edu/~emin/source_code/dibs/ which is open source and also http://www.hivecache.com/ which will be commercial 'real soon now'

  17. I still say Gmail... by plasticmillion · · Score: 4, Interesting
    Not to beat a dead horse, but Cringely seems like he was in a bit of a hurry to reject the Gmail solution. Wouldn't simple encryption solve the privacy problem? The Gmail text analysis is based on the assumption that the data is some kind of natural language text, so it would be baffled by anything else. Huffman encoding (or some other compression) would do the trick and save space besides.

  18. If Diablo 1 was in P2P by CrazyJim1 · · Score: 4, Interesting

    If your character data was stored on everyone else's computer, it would act like a virtual server, where if a few data sets get hacked, they'd be corrected by the whole.
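
    One way to read "corrected by the whole" is a simple majority vote across replicas. The sketch below assumes that reading; the function name and the sample records are hypothetical.

```python
from collections import Counter


def consensus(replicas):
    """Return the value held by a strict majority of replicas.

    Raises ValueError when no value has more than half of the votes.
    """
    value, votes = Counter(replicas).most_common(1)[0]
    if votes * 2 <= len(replicas):
        raise ValueError("no strict majority among replicas")
    return value


if __name__ == "__main__":
    # Four honest peers and one tampered copy of a character record.
    copies = ["level=30;gold=1200"] * 4 + ["level=99;gold=999999"]
    print(consensus(copies))  # prints "level=30;gold=1200"
```

    This only holds up while honest copies outnumber tampered ones.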

    P2P can work in wild ways we haven't even tapped.

    Too bad Orrin Hatch is trying to outlaw P2P:
    www.geocities.com/James_Sager_PA

  19. FolderShare - Pretty similar by hng_rval · · Score: 2, Informative

    Foldershare
    We use foldershare for peer-to-peer backup, but the catch is that you invite people that you trust to your libraries.

    For backup purposes, I only invite myself and just connect another computer to the account.

    --
    Thank you Mario! But our princess is in another castle!
  20. Sounds unfeasible. by dj245 · · Score: 2, Insightful

    How many times would you have to duplicate the data to ensure that no corruption (both intentional and unintentional) occurred? You would have to compare copies of the data to each other to make sure it matched. I wouldn't want my backup corrupt because some joker wrote Goatse.cx pictures to it a few thousand times. You would also have to store additional data in the event that people ran the program and then quit, taking your backup along with them. So maybe you would have 1 GB backed up over the network, and 10 GB of other people's crap on your computer. And that's assuming it ran on some sort of credit system where you only got to back up a percentage of what you allowed people to store. Otherwise hoarders would run rampant and take over the system.

    --
    Even those who arrange and design shrubberies are under considerable economic stress at this period in history.
  21. What BS. by Critical_ · · Score: 4, Informative

    I just went through Hurricane Ivan in Grenada. If you have been watching the coverage you should know that our island was completely destroyed. There is no water, no electricity, and no security. The university I attend (St. George's) lied to the students' parents about our situation. There were looters with guns and machetes threatening students. The first two nights we fended for ourselves with a large bonfire and homemade weapons, knives, pipes, etc. The third night we had 10 minutes to pack up and leave since we could see the looters lighting fires to apartment buildings on the road we were on. I quickly took the hard drives out of my two laptops (and the external drive I have), picked up a GSM roaming phone, any cash I had, a passport and two pairs of clothes. We ran to campus. Campus had about 200 male students lighting bonfires and running security teams to monitor the area. We chartered our own jet out of Grenada yesterday to Barbados which is where I am writing this from. My point is this: no one cares about data in this situation. No one wants to know about RAID or tape backups. If it came down to it, I would have run with only a passport, a phone, and cash. We were worried for our lives and whether we had water or not, data was not our concern. People need a reality check. How many of you can claim that you went through a Category III or IV hurricane on an isolated island fending for your lives? Not many, so quite frankly Cringely can go to hell.

    1. Re:What BS. by BlackHawk-666 · · Score: 3, Insightful

      Maybe you don't care about that data today, since the terror experience is still fresh, but you might care about it later. For example, assume that data was full of photographs of friends, deceased relatives, and other impossible to replace stuff. This backup scheme would've suited you even better than grabbing the hard drive, because you wouldn't even have had to do that.

      --
      All those moments will be lost in time, like tears in rain.
    2. Re:What BS. by Loualbano2 · · Score: 3, Insightful

      I am not trying to minimize your experience with Ivan, so please don't take this comment as such. The story you posted sounds crazy as hell and I wouldn't wish such an episode on anyone except my worst enemies.

      I do believe you reacted a little emotionally, which is understandable given your current situation. I think that if you look at the article again, you will find the only reason he mentions hurricanes is because Frances news reports before the fact got him thinking about it.

      That being said, I don't think Cringely was trying to insinuate that someone in a situation such as yours should or could worry about data. The point I took away from the article is that a person wouldn't need to worry about data at all under any disaster circumstance if you implement a system such as the one he proposes.

      I think that if you look at it like that, you will agree that he is not trying to discount the gravity of your experience.

      -ft

    3. Re:What BS. by kwerle · · Score: 2, Insightful

      OK, your life sucks right now, and I'm sorry.

      You're a student. You've accumulated less than a decade? worth of useful data. Depending on what that data is (scientific data hard to reproduce, personal writing (books/plays), scientific data easy to reproduce, or highscores on minesweeper), that data may have a $ value from 0 to maybe 10s of thousands of dollars (which means time). Small companies that have been in business for just a few years can have data that is worth millions of dollars. Ask your nearest bio prof. how much their personal genetic testing database is worth in time, effort, and money.

      Second,
      I quickly took the hard drives out of my two laptops (and the external drive I have), picked up a GSM roaming phone, any cash I had, a passport and two pairs of clothes... We were worried for our lives and whether we had water or not, data was not our concern.

      It seems that data was your concern.

  22. Re:idea by rusty0101 · · Score: 3, Informative

    That depends upon what you consider 'better'.

    Large businesses have a scheduling process and hire people to swap tapes, move tapes in and out of the various facilities, rotate tapes, and replace tapes that are no longer reliable. This process is done on a 24x7x365 (plus leap days) basis. Most of the data is actually being backed up via tape silos and 'robots' to handle the actual tapes while the various backups are happening, but it is still a significant investment in people.

    A small business may be able to get away with burning a CD-R or CD-RW every night with that day's transactions, and a small stack of CD-R (or RW) every weekend which they take home and store in a CD spindle in their freezer, or something. Though I think you would be hard pressed to find a small business that actually does that. (I am sure there are some that do.) Monthly or quarterly they should be taking a spindle of archived data to a remote relative's place to provide further archival of data.

    Mid-sized businesses are in a bit of a quandary. The number of tapes needed for a good backup is more than anyone really wants to haul around, handle and store at home, but they are not sure it is worth the expense of using a commercial off-site backup service either.

    A project like this may be just what they are looking for. No tapes or disks to try to keep track of. Everything compressed and encrypted, so it is reasonably secure. Retrieval can start as soon as the replacement system is ready to start retrieving it.

    I personally think it should be trialed only as a supplement to some other backup strategy, but even then, someone would decide it was either too much of a hassle, or not reliable enough.

    There are even people here who think it is 'reasonable' to haul around 160 or 250 Gig hard drives to back up their critical data.

    -Rusty

    --
    You never know...
  23. Poorly Thought Out by Naeleros · · Score: 4, Informative

    This idea is poorly thought out. It has a couple of *major* flaws, imo.

    #1) It doesn't recognize the reality of the complexity of backup software. Kinda easy to gloss over 'automated' backups without ever describing it. Pretty hard to imagine some piece of software that can universally back stuff up on everyone's hard drive and at the same time be very easy to use. Imagining mom/dad trying to use software with similar capabilities to Veritas BackupExec isn't easy. And.. imagine the wide variety of live files and databases that it would have to handle.

    #2) Data integrity. He suggests a 1:1 ratio for backup space. Not hardly. How is he going to have any kind of redundancy with that? Crashes and people unsubscribing will happen all the time. The data would have to have a *lot* of tolerance to that.

    A parity solution wouldn't be nearly enough. That assumes that only 1 failure at a time happens (using RAID 5 as my basis here). It would be easy to imagine that one person unsubscribed with part of your data and another had a crash or corruption problem.
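
    The single-failure limit of parity can be shown with a small XOR sketch. It assumes three equal-length data blocks held by three peers plus one parity block, in the spirit of RAID 5; the block contents and layout are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    data = [b"AAAA", b"BBBB", b"CCCC"]  # blocks stored on three different peers
    parity = xor_blocks(data)           # parity block stored on a fourth peer

    # One peer disappears: XOR of the survivors and the parity restores its block.
    recovered = xor_blocks([data[0], data[2], parity])
    print(recovered == data[1])  # prints "True"

    # Two peers disappear: what remains yields only the XOR of the two lost blocks.
    leftover = xor_blocks([data[0], parity])
    print(leftover == xor_blocks([data[1], data[2]]))  # prints "True"
```

    With two blocks missing there is one equation and two unknowns, which is the scenario the comment is worried about.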

    So.. complete mirroring would be necessary. Again, it's easy to imagine 2 people's systems going offline at the same time.. so, you'd probably need more than 2x Mirror. At this point... how much is enough to ensure reliability? 3x 4x 5x ? ? ? How much do you trust your average netizen?

    So.. pick your number and then divide your backup space by it. Like 5x? Add 10GB and you have 2GB usable storage. Not very good.

    I'll just skip over the 'auto backup' of people's 40GB storage over a 128K up line for now.. already typed too much...

  24. Already a commercial product by YetAnotherName · · Score: 3, Interesting

    A company called 312, Inc. already has a commercial product for P2P backups called Lean On Me.

    I don't work for them, etc.

  25. DIBS by wan-fu · · Score: 3, Informative

    Cringely is adding nothing new here. We've all already seen this on Slashdot. Hell, the website even mentions how it's like P2P but not.

  26. Fud? by broothal · · Score: 2, Insightful

    I lost interest in what this guy has to say when I read this:

    "But while it might be easy to use Gmail for offsite backup, I couldn't bring myself to do that just because of the intrusive nature of Gmail. Remember this is a system that is by invitation only, which means that Google can quickly map a social network establishing who knows who. And since Gmail actually analyzes the content of your e-mail and can automatically group it by subject (how creepy is that?), Google not only knows who your friends are, but what do you talk about with those friends."

    I nominate this to the prestigious "Fud of the week" award.

  27. Solution looking for a problem by FullMetalAlchemist · · Score: 2

    I did some research into this on my B.Sc. thesis, in essence it's a solution looking for a problem.

    The thing is, you want backups because you want to be able to get it back, with this (and my idea) you have little control over the backup; in short words, it's not a backup.

    FreeNet may at a first ignorant glance be a solution to this dilemma, however, you still have the same terror of doubt. Because you're not in control!

    To summarize, there is a difference between not wanting to lose something, and wanting something.
    If you don't get something you want, it hurts, if you lose something you need, it kills.

    Control is everything, even if you have a 50% success rate and you know it you'll be quite happy. You will not like a 60% +/-40% success rate.

  28. Cringely's been reading my posts! by Anonymous+Cowdog · · Score: 2, Interesting

    I've also been suggesting this for years. I'm too lazy to search for the older posts, but here is one from July:

    http://slashdot.org/comments.pl?sid=115027&cid=9743518

    Of course what matters, though, is not talking about ideas, but *doing* them.

  29. Pricing by duvel · · Score: 2, Interesting
    Cringely writes: Apple, for example, will let you mount up to a 100 megabyte iDrive as part of its .mac Internet service, but that costs $99 per year. Eight dollars per month for 100 megabytes of storage is too darned much.

    The company I work for (banking) sells storage for 120 euro per gigabyte per year to our internal clients. That's storage on RAID-disks (think StorageTek and the like), including backup (on tape) and all necessary services (people doing maintenance, restoring backups, etc). 120 euro / gigabyte / year comes to 1,22 dollar / month / 100 megabytes (compare to 8 $ per month with Apple). Considering our 1,22 $ plus some network costs, plus maintaining a billing system for a couple of million clients, and a bit of profit margin, maybe 8 $ per month is not a rip-off.
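
    For anyone who wants to check the conversion, here is a rough sketch of the arithmetic. Note the assumptions: 1 gigabyte is taken as 1000 megabytes, and the exchange rate of 1.22 dollars per euro is only what the quoted figures imply, not something stated above.

```python
# Assumptions (not stated in the post): 1 GB = 1000 MB, and an exchange
# rate of 1.22 dollars per euro, which is what the quoted figures imply.
EUR_PER_GB_YEAR = 120.0
USD_PER_EUR = 1.22
MB_PER_GB = 1000


def usd_per_100mb_month(eur_per_gb_year, usd_per_eur=USD_PER_EUR):
    """Convert a price in euro/GB/year to dollars per 100 MB per month."""
    eur_per_100mb_year = eur_per_gb_year * 100 / MB_PER_GB
    return eur_per_100mb_year / 12 * usd_per_eur


internal = usd_per_100mb_month(EUR_PER_GB_YEAR)
apple = 99 / 12  # $99 per year for a 100 MB iDrive

print(f"internal: ${internal:.2f}/month, Apple: ${apple:.2f}/month")
```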

    --

    I have a photographic memory for numbers. I know almost a hundred of them.

    1. Re:Pricing by legirons · · Score: 2, Insightful

      "Cringely writes: Apple, for example, will let you mount up to a 100 megabyte iDrive as part of its .mac Internet service, but that costs $99 per year. Eight dollars per month for 100 megabytes of storage is too darned much."

      For that $100 per year, you could buy 3 128"MB" USB-keys that give you more storage space, have faster copy-times from your computer, and have 3 times the redundancy as the network-storage option. They're small enough to post to a friend in a different location if you want (cheap, and all backups are encrypted?) or even to hide them around your city if you need security against raids (no pun intended).

      But remember, you can't swallow them if you get caught...

  30. Is it really necessary? by Ars-Fartsica · · Score: 3, Insightful
    How much data do you *really* want backed up? I have lots of MP3s ripped, but I have "backups" on CD. The OS and programs I can always reload. That leaves me with about five megs of my own data I do not want to ever lose. There are dozens of free repositories that will handle this.

    For larger, business-driven uses, you probably want something like DataSafe. They will keep media for you in a very safe place. Or better yet, keep your whole business disaster protected: have more than one live site for IT operations.

  31. I'm sorry chuck... by Anonymous Coward · · Score: 3, Informative

    I would have moderated you into oblivion given the chance.

    I genuinely feel for you and your struggle for safety given the recent events, and you have my deepest sincere sympathy...

    But that is not what this article is about. And how about this, given the chance to either leave my data behind or fend for myself given those circumstances...I'd stay with my data.

    Perhaps your data isn't a life or death matter to you, but my stacks of CD's, DVD's and harddrives with the past 15 years of my writing, graphics, and (most importantly) my recording sessions....over 500gb by now probably...it is indeed worth it for me to ensure it is safe. Even under such circumstances. The very thought of that data no longer existing is sickening to me...

    Not to undervalue your experiences at all. I mean that genuinely. But this article was about data backup--a form of backup that would have saved you even more time in your race to protect your neck.

    I fail to see how this is informative to the topic at hand when all I see is someone poo-pooing a genuine concern with a slightly related story.

    I'm willing to bet far more slashdotters than just myself value their data as much, if not more...risk life and limb for it? I probably would...it is just that important to me....which is why I would want to back it up in the first place.

  32. Peer-Redundant File System by cwolfsheep · · Score: 2, Informative

    When I was working at a factory last year, I was part of an IT team supporting 1000+ PCs. An idea I thought of, but haven't had much time or chance to flesh out, was a "peer-redundant file system," in which all those computers would run background hosts serving up a specified amount of space for use by anyone on the same network. The space would be treated like a block of sectors on a network-based drive, allocated by a master server, and made redundant across a desired number of hosts (any time data gets posted, it should go to at least one random host, plus any more needed for redundancy). As people leave systems on or turn them off, their shares could be updated by peers or the master server, and the system could sustain the desired space with as few as 1/3 of the hosts online. Using the space would be easy: all client systems would have the same mount or drive letter, with the background software managing the behavior of the drive.

    This solves two problems: one, having a network file share run out of space; two, the need for redundant backup. I suspect it could be done using existing peer-sharing software as a core.

    --

    Life is irony, and nothing ever goes as planned.
  33. Baxter for Video Activists by lo_fye · · Score: 2, Interesting

    I was thinking about something like this for video activists who frequently have their tapes/discs confiscated by the cops. It'd be great if they had PocketPCs with webcams that were operating in a baxterian sort of way such that the video they were taking was simultaneously being recorded to the storage of other activists/media within wifi range. You could have wifi NAS (network storage) in vehicles and apartments surrounding the demonstration area, as well as on ipod-level storage in future wifi enabled pocketpcs. 3G cameraphones with hard drives might provide another simpler option, if they could be networked together in a p2p fashion. The cops might be able to confiscate my webcam and pocketpc, but my recordings (and proof) would be elsewhere in the aether.

    --
    geeks are cats who dig a certain kind of cool
  34. Already done by Afty · · Score: 3, Insightful

    There are several research groups doing work on distributed P2P backup systems. I know there's a group at MS doing this, as well as a group at MIT (http://catfish.csail.mit.edu/~kbarr/pstore/), and several others that don't come to mind offhand. I did a project on this in grad school, so I'm familiar with the research.

    There are a lot of issues here, mostly centering around the fact that you can't trust people in an open P2P network.
    1) They might look at your data.
    2) They might not be online when you want your data.
    3) They might delete your data, or do other malicious things to it (insert viruses, etc.).
    4) They might freeload by using space on other hosts and then deleting all the data they receive.
    5) If a host leaves the system permanently, you need to detect that and replicate its data somewhere else. Also, how do you know whether it's leaving permanently or just logging off for a while?

    #1 is easy, just encrypt the data. #2, #3, #4, and #5 are hard because data integrity is really important in a backup solution. You end up having to replicate the data all over the place to "ensure" that it'll be available when you need it, but then you've got the problem of having to donate more space than you receive to use the system. Plus, it's still not certain that your data will be available when you need it.

    Basically what I'm trying to say is that it's a hard problem. :)
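
    To put rough numbers on the replication trade-off in #2, here is a small sketch. It assumes each peer is online independently with the same probability, which is a simplification of mine rather than anything measured; `p_online` and `target` are hypothetical parameters.

```python
def availability(p_online, copies):
    """Chance that at least one of `copies` independent replicas is reachable."""
    return 1 - (1 - p_online) ** copies


def copies_needed(p_online, target):
    """Smallest number of replicas whose combined availability meets `target`."""
    if not 0 < p_online <= 1 or not 0 <= target < 1:
        raise ValueError("need 0 < p_online <= 1 and 0 <= target < 1")
    copies = 1
    while availability(p_online, copies) < target:
        copies += 1
    return copies


# Hypothetical example: peers that are online half the time.
print(availability(0.5, 3))
print(copies_needed(0.5, 0.99))
```

    Even under these generous assumptions the replica count climbs quickly, which is the "donate more space than you receive" problem described above.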

    1. Re:Already done by burns210 · · Score: 2, Insightful

      So your data on the DRAID (distributed RAID) is encrypted with your public key, so that only you can decrypt it.

      The system should have redundant locations, similar to GFS (Google's filesystem), which keeps 3 copies of every piece of data (on different computers) for just that reason.

      The system should require that you host 1-3 times as much of others' data on your system as you store on other people's computers.

      The system should not have a user's data stored on a single computer; rather, each file or group of files should live on an array of computers, such that a large portion of your data is available even if a given node is offline.

      This would mean that every meg you uploaded would take up 2-3 megs on the DRAID. For 100 megs of documents, there would be 300 megs' worth of copies of those documents, distributed across a dozen+ systems. Because of this, you would need to share (host others' files) more than you upload.
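
      The storage arithmetic above can be sketched as follows. The even spread across hosts and the function names are my own assumptions for illustration; the 3 copies, 100 megs and dozen hosts come from the figures above.

```python
def draid_footprint(upload_mb, copies=3, hosts=12):
    """Total space used on the DRAID and the average share per host.

    Assumes copies are spread evenly over the hosts (a simplification).
    """
    total_mb = upload_mb * copies
    per_host_mb = total_mb / hosts
    return total_mb, per_host_mb


def must_host_mb(upload_mb, share_ratio):
    """Space you would have to donate for a given upload and share ratio."""
    return upload_mb * share_ratio


total, per_host = draid_footprint(100)
print(f"100 MB uploaded -> {total} MB on the DRAID, about {per_host} MB per host")
print(f"at a 3x share ratio you host {must_host_mb(100, 3)} MB for others")
```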

  35. Lots of other projects by mrm677 · · Score: 3, Informative

    Nothing new here. Check out Berkeley's OceanStore project for an idea of a global storage solution impervious to local disasters.

  36. Pastiche by bloo9298 · · Score: 2, Informative

    Looks like he might like Pastiche.

  37. Prior art by isomeme · · Score: 2, Interesting

    An equivalent idea was proposed in about 1982, at the dawn of the internet. Simply tar your filesystem, then email the tar to yourself along a lengthy old-style routing chain. If you need your data back, just wait for the email to arrive and untar it. You could tune the recovery latency by adjusting the routing chain. Of course, over dialup uucp, even a one-node-out-and-back path could result in a two day latency.

    Man, those were the days.

    --
    When all you have is a hammer, everything looks like a skull.
  38. How about 1.2x? by roystgnr · · Score: 2, Informative

    Error correction gets a lot more sophisticated than checksums, you know. You can make a Reed-Solomon codec for 8-bit code words with 255 byte encoded blocks having any even number of parity bytes, and the way optimal RS codes work is that you can recover the original data as long as the number of missing code words plus twice the number of corrupted code words does not exceed the number of parity code words you chose.

    So, you divide your data into chunks 225 bytes long. Each byte in a chunk goes to a different peer, and each of the 30 parity bytes also goes to a different peer. Then, even if a dozen peers have simultaneously unsubscribed or crashed and their shares haven't been replicated on new peers yet, you can still recover all your data from the shares that remain.
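
    Here is a quick sketch of the bookkeeping for that layout. It is not a Reed-Solomon codec, only the recovery condition and the storage overhead, using the standard bound for these codes (missing shares plus twice the corrupted shares must be no greater than the parity count). The function and constant names are made up for illustration.

```python
N = 255          # encoded block length in bytes, one byte per peer
PARITY = 30      # parity bytes per block
K = N - PARITY   # data bytes per block


def recoverable(missing, corrupted, parity=PARITY):
    """True if a block can still be decoded.

    `missing` counts peers known to be gone (erasures); `corrupted` counts
    peers silently returning bad bytes (errors), which cost twice as much.
    """
    return missing + 2 * corrupted <= parity


overhead = N / K  # stored bytes per byte of original data

print(f"data bytes per block: {K}, overhead: {overhead:.3f}x")
print("a dozen peers offline:", recoverable(missing=12, corrupted=0))
print("a dozen offline plus ten corrupting data:", recoverable(12, 10))
```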

  39. Re:Queue Linus Quote in..... by SirTalon42 · · Score: 2, Funny

    What if the martians invade the moon? WHERES YOUR DATA NOW! PUNK!

  40. Maybe he should check out DIBS by obi · · Score: 2, Informative

    ... the Distributed Internet Backup System http://www.csua.berkeley.edu/~emin/source_code/dibs/