Cringely's P2P Backup Idea
gewg_ writes "If Napster and Bit Torrent had a baby, would it Baxter?
As a follow-on to Cringely's
last column where he talked about having a backup strategy in the
wake of Hurricane Frances, this week he proposes a distributed RAID notion as a solution."
Cue Linus quote in 3....2....1....
:-)
Ah, there it is
-Chris
--an unbreakable toy is useful for breaking other toys--
Baxter is, of course, the famous IRC client for BeOS. (Hi, Seth!)
Get off my launchpad!
Depending on exactly what you have stored, millions of people may want to help you backup as soon as possible.
The coolest voice ever.
I think this is old news. Some people have been backing up the source code for viruses that they wrote on Kazaa for months now.
Buy Steampunk Clothing Online!
i had this idea a few months ago, wish i could have done something about it then, oh well.
Well, we leave the data where it belongs: in the proxy network where the processes live too. Still a bit incomplete, but maturing WebDAV and mountable slices forthcoming...
Just insert a bunch of data into the network.. record the keys and retrieve once a week then delete. That should keep the data retrievable from the network for a good while. Using two nodes would help. Plus everything is encrypted with some heavy shit.
:(
Or, just make a local-freenet on the company lan.. everything is encrypted and unretrievable without the proper keys, so it's very secure and it's distributed.. + FEC encoding.
That assumes freenet works, AFAIK it's still fucking broken. Ian Clarke is playing too much politics with the project and the only coder that really understands freenet (Matthew Toseland) is swamped with ideas, day after day.. it just gets worse and worse... The donations seemed like a good idea, but after watching the DEV list for the last 18 months, I realize it's a failed project
Skype Me! username: john_allen_mohammed
But on the serious side, the claim of using encryption to store data on someone's hard drive worries me. Let's say the encryption gets broken. Now you might get Aunt Nedda's cookie recipes, but then again, you might get BobCo's strategic investment plan for the next 6 months as well. I can see people signing up just for the chance to hunt through people's data.
is it a better idea than just backing up onto tape and putting in a safe place, like a vault, offsite, under the ground?
Cringely's not the first with this kind of idea. In fact, the Freenet Project already implements something to this effect. Although not specifically designed for reliable backups, the distributed caching algorithms essentially replicate data towards where it's most often needed, helping to improve network performance and creating copies of important data along the way so that it won't be destroyed if a central server fails. Obviously not a commercial solution, but very interesting.
From the article:
It sounds from every description like the solution is Linux-specific, but I'm sure it can be made to work with other UNIX variants, especially since Gmail, itself, runs on Apple xServe 1u boxes. Windows compatibility is unknown, but I'm sure someone will solve that soon.
I know, it's a little childish, but I get a good feeling when I see something small...even this little thing here...that thinks of other OS's first and Windows compatibility will be "real soon now" or something like that.
"Leo Fender was in a 'state of grace' when he designed the Stratocaster." -- Paul Reed Smith
Well, I lived through this storm, checking my PC upstairs to make sure nothing was going to damage it. If the storm was risking the roof flying off and my room becoming flooded, I would have taken out my hdd. This sounds like a brilliant idea.
;)
Hey, it beats trying to store data to gmail accounts!
mysql>SELECT * FROM users WHERE clue > 0
0 Rows Returned
Ideas like Cringely's will be impossible if the INDUCE Act passes.
Save Betamax is a national Congress call-in day this tuesday to oppose the INDUCE Act. It might be our last chance to stop this bill.
As a bonus, you can use it to transport data (eg. your mp3 collection) between places, or even use it to boot linux anywhere with much more space and document storage capability than Knoppix.
How about a RAID-0 system using the entire Internet?
Oohh yeah, baby!
Beware: In C++, your friends can see your privates!
It's a neat idea. In a nutshell, he suggests a Peer to Peer encrypted storage network. You get exactly as much storage room as you are willing to offer yourself for others to use. When you store anything, it's encrypted and automatically spread to other systems.
It doesn't make for a very safe backup, though: What happens if somebody decides to stop the service and just deletes his local storage? You've got no more backup at least for a while, and you might not even know it. And of course, other people have head crashes, too, which would also obliterate your backup at least for the time it takes to recreate it from your own data. Of course, by that time, you might have deleted it yourself, either by accident or knowingly, since you have a backup after all. A viable solution would be to store every file multiple times on different remote servers, although that'd lower the storage capacity you get. It's still the right step, though.
The crucial problem is that the service provider can't really give any guarantees that you will be able to regain your lost data. With three or more independent copies in different locations, it's very unlikely that the backup won't work for some reason, but a backup that's not 100% is not a very useful one, especially in those situations where backups are really crucial.
It's still a neat idea, and to my knowledge has not been done to that degree of sophistication. Of course, as others suggest, nobody is stopping you from inserting encrypted data into Freenet, but that's nowhere near as fast and secure as this could be. And while it's not a true backup, it's better than no backup at all, and most likely enough security for many persons.
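A back-of-the-envelope sketch of the "three or more independent copies" point above. It assumes each remote copy is lost independently with the same probability, which real peers may not satisfy (they can fail together), and the 10% figure is purely illustrative, not a measurement.

```python
def loss_probability(p_single: float, copies: int) -> float:
    """Chance that every one of `copies` replicas is lost.

    Assumes each replica fails independently with probability `p_single`.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return p_single ** copies


if __name__ == "__main__":
    # 0.10 is an illustrative per-copy loss probability (an assumption).
    for n in (1, 2, 3, 5):
        print(f"{n} copies: {loss_probability(0.10, n):.5f}")
```

Under these assumptions the risk shrinks quickly with each extra copy but never reaches zero, which is the poster's objection to calling it a guaranteed backup.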
Switch back to Slashdot's D1 system.
http://www.csua.berkeley.edu/~emin/source_code/dibs/
which is open source and also
http://www.hivecache.com/ which will be commercial 'real soon now'
Peer Pressure
If your character data was stored on everyone else's computer, it would act like a virtual server, where if a few data sets get hacked, they'd be corrected by the whole.
P2P can work in wild ways we haven't even tapped.
too bad orrin hatch is trying to outlaw p2p:
www.geocities.com/James_Sager_PA
God spoke to me.
I am going to release the first beta "real soon now" (sorry - my time is limited since I'm getting married next month), and there is currently an alpha version out for Windows and Linux at: http://www.pensamos.com/mmb/ The alpha version is a little rough around the edges, but I plan to smooth things out over time if there is enough interest. I welcome all feedback. Thank you.
-----
Free P2P Backup, Windows & Linux
Foldershare
We use foldershare for peer-to-peer backup, but the catch is that you invite people that you trust to your libraries.
For backup purposes, I only invite myself and just connect another computer to the account.
Thank you Mario! But our princess is in another castle!
How many times would you have to duplicate the data to ensure that no corruption (both intentional and unintentional) occurred? You would have to compare copies of the data to each other to make sure it matched. I wouldn't want my backup corrupt because some joker wrote Goatse.cx pictures to it a few thousand times. You would also have to store additional data in the event that people ran the program and then quit, taking your backup along with them. So maybe you would have 1GB backed up over the network, and 10GB of other people's crap on your computer. And that's assuming it ran on some sort of credit system where you only got to back up a percentage of what you allowed people to store. Otherwise hoarders would run rampant and take over the system.
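One possible alternative to comparing copies against each other is to keep a small cryptographic digest of each file locally and check whatever a peer hands back against it. This is only a sketch: SHA-256 is my choice rather than anything the article proposes, and the peer names are hypothetical.

```python
from __future__ import annotations

import hashlib


def digest(data: bytes) -> str:
    """SHA-256 hex digest, recorded locally before the data is sent out."""
    return hashlib.sha256(data).hexdigest()


def intact_copies(expected_digest: str, copies: dict[str, bytes]) -> list[str]:
    """Return the peers whose returned copy still matches the recorded digest."""
    return [peer for peer, blob in copies.items()
            if digest(blob) == expected_digest]


if __name__ == "__main__":
    original = b"ten years of recording sessions"
    recorded = digest(original)
    returned = {
        "peer-a": original,                 # untouched copy
        "peer-b": b"something else here",   # overwritten or corrupted copy
    }
    print(intact_copies(recorded, returned))
```

This only detects a bad copy; it does nothing to bring back a copy that a peer has deleted, so the replication question in the comment still stands.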
Even those who arrange and design shrubberies are under considerable economic stress at this period in history.
would it Baxter?
I don't know, do babies Baxter these days? I mean they puke and shit and cry but when you talk about Baxtering I'm not too sure.
Oh you mean would the baby be Baxter?
Sorry, my fault.
Mother, do you think they'll like this sig?
I just went through Hurricane Ivan in Grenada. If you have been watching the coverage you should know that our island was completely destroyed. There is no water, no electricity, and no security. The university I attend (St. George's) lied to the students' parents about our situation. There were looters with guns and machetes threatening students. The first two nights we fended for ourselves with a large bonfire and homemade weapons, knives, pipes, etc. The third night we had 10 minutes to pack up and leave since we could see the looters lighting fires to apartment buildings on the road we were on. I quickly took the hard drives out of my two laptops (and the external drive I have), picked up a GSM roaming phone, any cash I had, a passport and two pairs of clothes. We ran to campus. Campus had about 200 male students lighting bonfires and running security teams to monitor the area. We chartered our own jet out of Grenada yesterday to Barbados which is where I am writing this from. My point is this: no one cares about data in this situation. No one wants to know about RAID or tape backups. If it came down to it, I would have run with only a passport, a phone, and cash. We were worried for our lives and whether we had water or not, data was not our concern. People need a reality check. How many of you can claim that you went through a Category III or IV hurricane on an isolated island fending for your lives? Not many, so quite frankly Cringely can go to hell.
This idea is poorly thought out. It has a couple of *major* flaws, imo.
#1) It doesn't recognize the reality of the complexity of backup software. Kinda easy to gloss over 'automated' backups without ever describing it. Pretty hard to imagine some piece of software that can universally back stuff up on everyone's hard drive and at the same time be very easy to use. Imagining mom/dad trying to use software with capabilities similar to Veritas BackupExec isn't easy. And.. imagine the wide variety of live files and databases that it would have to handle.
#2) Data integrity. He suggests a 1:1 ratio for backup space. Not hardly. How is he going to have any kind of redundancy with that? Crashes and people unsubscribing will happen all the time. The data would have to have a *lot* of tolerance to that.
A parity solution wouldn't be nearly enough. That assumes that only 1 failure at a time happens (using RAID 5 as my basis here). It would be easy to imagine that one person unsubscribed with part of your data and another had a crash or corruption problem.
So.. complete mirroring would be necessary. Again, it's easy to imagine 2 people's systems going offline at the same time.. so, you'd probably need more than 2x Mirror. At this point... how much is enough to ensure reliability? 3x 4x 5x ? ? ? How much do you trust your average netizen?
So.. pick your number and then divide your backup space by it. Like 5x? Add 10GB and you have 2GB usable storage. Not very good.
I'll just skip over the 'auto backup' of people's 40GB storage over a 128K up line for now.. already typed too much...
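The two numbers in this comment can be worked through as a rough sketch. It assumes decimal gigabytes, a perfectly sustained 128 kbit/s uplink, and no protocol overhead, none of which the poster specifies.

```python
def usable_storage_gb(donated_gb: float, replication: int) -> float:
    """Space you can actually back up if every file is mirrored `replication` times."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return donated_gb / replication


def upload_days(size_gb: float, uplink_kbit_s: float) -> float:
    """Days needed to push `size_gb` through an uplink of `uplink_kbit_s` kbit/s."""
    bits = size_gb * 1e9 * 8              # decimal GB -> bits (assumption)
    seconds = bits / (uplink_kbit_s * 1e3)
    return seconds / 86_400


if __name__ == "__main__":
    print(f"{usable_storage_gb(10, 5):.1f} GB usable")    # 2.0 GB usable
    print(f"{upload_days(40, 128):.1f} days to upload")   # 28.9 days to upload
```

So the initial 40GB backup alone would tie up the uplink for roughly a month under these idealized assumptions, before any replication traffic for other people's data.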
FreeNet was sold on a bunch of users for just that but quite simply no one is willing to dump hard drive space to random users out there.
However, I would use this sort of thing on an internal network because I directly control how much space is available and I'd be able to, with adoption, access video from one of my three computers from a set-top-box in the living room and manage it as a single library. That's the sort of thing we need to be looking at, but unfortunately very few companies are officially designing network-aware set-top-boxes with DivX decoding and hackers are left to design such things themselves. I had a similar tool for the PS2, but it was unusually flaky when it came to sending the decoded video over the network (since the PS2 could never decode the video itself, the application on my computer had to do it beforehand).
A company called 312, Inc. already has a commercial product for P2P backups called Lean On Me.
I don't work for them, etc.
Cringely is adding nothing new here. We've all already seen this on Slashdot. Hell, the website even mentions how it's like P2P but not.
I lost interest in what this guy has to say when I read this:
"But while it might be easy to use Gmail for offsite backup, I couldn't bring myself to do that just because of the intrusive nature of Gmail. Remember this is a system that is by invitation only, which means that Google can quickly map a social network establishing who knows who. And since Gmail actually analyzes the content of your e-mail and can automatically group it by subject (how creepy is that?), Google not only knows who your friends are, but what do you talk about with those friends."
I nominate this to the prestigious "Fud of the week" award.
I did some research into this on my B.Sc. thesis, in essence it's a solution looking for a problem.
The thing is, you want backups because you want to be able to get it back, with this (and my idea) you have little control over the backup; in short words, it's not a backup.
FreeNet may at a first ignorant glance be a solution to this dilemma, however, you still have the same terror of doubt. Because you're not in control!
To summarize, there is a difference between not wanting to lose something, and wanting something.
If you don't get something you want, it hurts, if you lose something you need, it kills.
Control is everything, even if you have a 50% success rate and you know it you'll be quite happy. You will not like a 60% +/-40% success rate.
Still in startup stages, but growing bigger, and with a solid technology core. Portable to any platform which can run Python.
Disclaimer: No, I am not an employee. Yes, I probably would not know this if I didn't know a couple of their employees.
I've also been suggesting this for years. I'm too lazy to search for the older posts, but here is one from July:
http://slashdot.org/comments.pl?sid=115027&cid=9743518
Of course what matters, though, is not talking about ideas, but *doing* them.
...sounds like an idea a friend of mine had to back up data using Kazaa. He said he was going to archive and encrypt the data he wanted to back up, name it twohotlesbiansdoingeachother.avi, and share it on Kazaa with the knowledge that just about every pimply faced teen boy would be downloading it. When he needed his backup, he'd just search for the file.
He never actually tried it, so I don't know if it would have worked.
The company I work for (banking) sells storage for 120 euro per gigabyte per year to our internal clients. That's storage on RAID-disks (think StorageTek and the like), including backup (on tape) and all necessary services (people doing maintenance, restoring backups, etc). 120 euro / gigabyte / year comes to 1,22 dollar / month / 100 megabytes (compare to 8 $ per month with Apple). Considering our 1,22 $ plus some network costs, plus maintaining a billing system for a couple of million clients, and a bit of profit margin, maybe 8 $ per month is not a rip-off.
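The conversion in this comment can be reproduced as a small sketch. The exchange rate of 1.22 dollars per euro is inferred from the poster's own figures rather than stated, and 1 GB is taken as 1000 MB.

```python
def usd_per_100mb_month(eur_per_gb_year: float, usd_per_eur: float) -> float:
    """Convert a yearly per-gigabyte price in euro to dollars per 100 MB per month."""
    eur_per_gb_month = eur_per_gb_year / 12
    eur_per_100mb_month = eur_per_gb_month / 10   # 1 GB taken as 1000 MB
    return eur_per_100mb_month * usd_per_eur


if __name__ == "__main__":
    # 1.22 USD/EUR is an assumption inferred from the post.
    print(f"{usd_per_100mb_month(120, 1.22):.2f} $ per 100 MB per month")
```

That leaves a gap of a bit under 7 dollars per month between the internal cost and the 8 dollar retail price, which has to cover the network, billing, and margin the poster lists.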
I have a photographic memory for numbers. I know almost a hundred of them.
For larger, business-driven uses, you probably want something like DataSafe. They will keep media for you in a very safe place. Or better yet, keep your whole business disaster protected -have more than one live site for IT operations.
I would have moderated you into oblivion given the chance.
I genuinely feel for you and your struggle for safety given the recent events, and you have my deepest sincere sympathy...
But that is not what this article is about. And how about this, given the chance to either leave my data behind or fend for myself given those circumstances...I'd stay with my data.
Perhaps your data isn't a life or death matter to you, but my stacks of CD's, DVD's and harddrives with the past 15 years of my writing, graphics, and (most importantly) my recording sessions....over 500gb by now probably...it is indeed worth it for me to ensure it is safe. Even under such circumstances. The very thought of that data no longer existing is sickening to me...
Not to undervalue your experiences at all. I mean that genuinely. But this article was about data backup--a form of backup that would have saved you even more time in your race to protect your neck.
I fail to see how this is informative to the topic at hand when all I see is someone poo-pooing a genuine concern with a slightly related story.
I'm willing to bet far more slashdotters than just myself value their data as much, if not more...risk life and limb for it? I probably would...it is just that important to me....which is why I would want to back it up in the first place.
First of all, I believe it was in reference to hurricane Frances, but I guess that's trivial information.
Secondly, who says that you are going to be with your data?
Many large scale companies no doubt had massive amounts of important data stored down in areas hit by the hurricanes. Though I'm sure that people that worked and lived in these areas could leave, I don't think the systems used to store data were easily transportable.
If they are such large scale companies they should have the ability to backup their data on their own.
What about medium sized businesses though? A business that isn't small enough to take all of its data out of harm's way, but isn't big enough to provide a solid backup system for its data. I think this would be a good alternative (cost-wise, security, guarantees you'll be able to restore data, and other concerns aside)
--Information Belongs To The World--
I've been using Gmail as a repository for backed-up source files. Simply rar, self-email and forget. 1GB offsite storage is perfect for a smalltime developer in search of a cheap backup solution.
Encrypted, distributed, obscured as there is no good way to find data unless you know its key..
Too bad it's still too slow..
---- Booth was a patriot ----
Is it just me, or did this poster sound like some 1930's colonialist complaining about how 'the natives' got out of control?
You want to go to play-school and take advantage of incredibly low living costs due to enormous depravity between what you hold in your wallet and what the average local makes- you'd better not complain when law breaks down and you suddenly find yourself more wanted than a sugar cookie next to an ant mount.
Funny thing- when one of your two laptops is worth several times more than what the average Grenadan makes in a year, and law breaks down- all those people who smiled at you every day suddenly want to beat you for your money and any food you might have. Huh. Interesting.
We chartered our own jet out of Grenada yesterday to Barbados
Wow. Too bad you took all that medical knowledge that could have been used to help people, and skipped the fuck outta town. So you can afford to charter your own jet- and you want sympathy from people? Did you happen to notice the thousands of tin huts smashed flat (or missing entirely) through your plane window? They're going to have massive problems with disease and famine- and you all just left, despite having medical training that could have saved lives.
Please help metamoderate.
Cringely is just looking for an excuse to be clever, a fluff-piece space-filler.
..... It's a RAID system using donated disk space on a wide area network."
1. He starts by saying he can't use gmail because of privacy. Duh, can you say "encryption"?
2. He also gives a privacy complaint because gmail knows who you associate with, through the chain of invitations.
Bullshit. There are lots of people on the web offering anonymous invitation URLs.
3. Savor this contradiction:
"First, it is for BACKUP, so recovery has to be slow enough so people won't think of it as another hard drive
Let's see: a rai*D* system that MUST be slow and NOT treated as a Disk.
This idea is simple redundancy, nothing more than a variation on Publius or Freenet.
Please, can we all agree to quash this nascent trend of calling everything RAID?
Plan 9's "Venti" works similarly, and is freely available.
There are also several commercial systems that do this sort of thing. It's only out of your control if you store your data blocks on someone else's machines. Doing this across several widely-distributed SANs in a large enterprise is a reasonable backup strategy.
When I was working at a factory last year, I was part of an IT team supporting 1000+ PCs. An idea I thought of, but haven't had much time or chance to flesh out, was a "peer-redundant file system," in which all those computers could have background hosts serving up a specified amount of space for use by anyone on the same network. The space would be treated like a block of sectors on a network-based drive, allocated by a master server, and made redundant through a desired number of hosts (anytime data gets posted, it should go to at least one random host, plus any more needed for redundancy). As people leave systems on, or turn them off, their shares could be updated by peers or the master server, and be able to sustain the desired space with as few as 1/3 hosts. Using the space would be easy: all client systems would have the same mount or drive letter, with the background software managing the behavior of the drive.
This situation solves two problems: one, having a network file share run out of space; two, a need for redundant backup. I suspect it could be done using existing peer-sharing software as a core.
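A minimal sketch of the master-server side of such a "peer-redundant file system", under my own assumptions: the `Master` class, its method names, and the host names are all hypothetical, and only the placement table is modeled (no networking, no re-replication when hosts disappear).

```python
from __future__ import annotations

import random


class Master:
    """Tracks which hosts hold a copy of each block (placement only)."""

    def __init__(self, hosts: list[str], replicas: int, seed: int | None = None):
        if replicas < 1 or replicas > len(hosts):
            raise ValueError("replicas must be between 1 and the number of hosts")
        self.hosts = list(hosts)
        self.replicas = replicas
        self.rng = random.Random(seed)
        self.table: dict[int, list[str]] = {}

    def allocate(self, block_id: int) -> list[str]:
        """Pick `replicas` distinct random hosts for a newly written block."""
        placement = self.rng.sample(self.hosts, self.replicas)
        self.table[block_id] = placement
        return placement

    def is_readable(self, block_id: int, online: set[str]) -> bool:
        """A block can be read if at least one host holding it is switched on."""
        return any(host in online for host in self.table[block_id])


if __name__ == "__main__":
    master = Master(["pc01", "pc02", "pc03", "pc04", "pc05", "pc06"], replicas=3, seed=1)
    holders = master.allocate(block_id=0)
    print(holders, master.is_readable(0, online={holders[0]}))
```

A real design would also need the share-updating step the comment mentions, so that blocks are copied to new hosts when their current holders are turned off for good.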
Life is irony, and nothing ever goes as planned.
Baxter? I barely even know her!
I was thinking about something like this for video activists who frequently have their tapes/discs confiscated by the cops. It'd be great if they had PocketPCs with webcams that were operating in a baxterian sort of way such that the video they were taking was simultaneously being recorded to the storage of other activists/media within wifi range. You could have wifi NAS (network storage) in vehicles and apartments surrounding the demonstration area, as well as on ipod-level storage in future wifi enabled pocketpcs. 3G cameraphones with hard drives might provide another simpler option, if they could be networked together in a p2p fashion. The cops might be able to confiscate my webcam and pocketpc, but my recordings (and proof) would be elsewhere in the aether.
geeks are cats who dig a certain kind of cool
Overview from the homepage:
OceanStore is a global persistent data store designed to scale to billions of users. It provides a consistent, highly-available, and durable storage utility atop an infrastructure comprised of untrusted servers.
Any computer can join the infrastructure, contributing storage or providing local user access in exchange for economic compensation. Users need only subscribe to a single OceanStore service provider, although they may consume storage and bandwidth from many different providers. The providers automatically buy and sell capacity and coverage among themselves, transparently to the users. The utility model thus combines the resources from federated systems to provide a quality of service higher than that achievable by any single company.
OceanStore caches data promiscuously; any server may create a local replica of any data object. These local replicas provide faster access and robustness to network partitions. They also reduce network congestion by localizing access traffic.
We must assume that any server in the infrastructure may crash, leak information, or become compromised. Promiscuous caching therefore requires redundancy and cryptographic techniques to protect the data from the servers upon which it resides.
OceanStore employs a Byzantine-fault tolerant commit protocol to provide strong consistency across replicas. The OceanStore API also allows applications to weaken their consistency restrictions in exchange for higher performance and availability.
A version-based archival storage system provides durability which exceeds today's best by orders of magnitude. OceanStore stores each version of a data object in a permanent, read-only form, which is encoded with an erasure code and spread over hundreds or thousands of servers. A small subset of the encoded fragments are sufficient to reconstruct the archived object; only a global-scale disaster could disable enough machines to destroy the archived object.
The OceanStore introspection layer adapts the system to improve performance and fault tolerance. Internal event monitors collect and analyze information such as usage patterns, network activity, and resource availability. OceanStore can then adapt to regional outages and denial of service attacks, pro-actively migrate data towards areas of use and maintain sufficiently high levels of data redundancy.
Many components of OceanStore are already functioning in isolation. A complete prototype is currently under development.
There are several research groups doing work on distributed P2P backup systems. I know there's a group at MS doing this, as well as a group at MIT (http://catfish.csail.mit.edu/~kbarr/pstore/), and several others that don't come to mind offhand. I did a project on this in grad school, so I'm familiar with the research.
:)
There are a lot of issues here, mostly centering around the fact that you can't trust people in an open P2P network.
1) They might look at your data.
2) They might not be online when you want your data.
3) They might delete your data, or do other malicious things to it (insert viruses, etc.).
4) They might freeload by using space on other hosts and then deleting all the data they receive.
5) If a host leaves the system permanently, you need to detect that and replicate its data somewhere else. Also, how do you know whether it's leaving permanently or just logging off for a while?
#1 is easy, just encrypt the data. #2, #3, #4, and #5 are hard because data integrity is really important in a backup solution. You end up having to replicate the data all over the place to "ensure" that it'll be available when you need it, but then you've got the problem of having to donate more space than you receive to use the system. Plus, it's still not certain that your data will be available when you need it.
Basically what I'm trying to say is that it's a hard problem.
Christ.
I just can't believe that. I guess there are even Slashdotters who can't turn down an unrelated fluff piece.
Seriously, is this an episode of "Bart's People"???
This outfit http://www.permabit.com/ is selling a commercial solution similar to what is being proposed.
PAST is a large-scale, peer-to-peer archival storage facility very similar to Baxter. Content replication and distribution, fault tolerance and other major issues are discussed in the publications on the PAST web site. And guess what: PAST has been around since 2001. And if you don't like PAST because it came off (ahem) Microsoft research labs, other /.rs mentioned a bunch of other similar systems (Mango, Pensamos, Freenet).
So, there's nothing really new and exciting in this post of Cringely's. It's fine to post this in his blog (it's HIS blog after all) but not in /.. Morons that cheer whatever their idol spews out of his head should be kept away from /. if possible.
...the more I am struck by how stupid he is.
Backing up data is easy and cheap, as cheap as anti-virus measures for windows boxen... the fact is people cannot get either to work because they are LAZY and STUPID and most of all DETERMINED TO STAY THAT WAY.
There is an old saying about fools and their money being soon parted, I think there is also a modern corollary, "Fools and their data are soon parted."
With my work hat on, when asked to help a user with PC problems, I have long evolved a simple tactic, I ask one question.
"Is the data on this computer very important?"
the answer is usually "Oh yes!"
to which I reply "Good!"
(this usually puzzles the user in question)
I then proceed with a full format and virgin reinstall, this being usually the quickest way to fix a windows pc full of cruft, never defragmented, with a corrupt registry full of badly uninstalled software and magazine cover disk trialware, and of course littered with spyware malware and viruses...
Some time later when the user realises ALL their data has gone forever and queries it, I remind them that they were asked if their data was very important, and they answered affirmatively, and therefore they obviously had complete backups of this important data so I was free to fix the problem rapidly.....
I never have the same problem twice...
http://slashdot.org/~GuyFawkes/journal
Nothing new here. Check out Berkeley's OceanStore project for an idea of a global storage solution impervious to local disasters.
you'd better not complain when law breaks down... all those people who smiled at you every day suddenly want to beat you for your money and any food you might have. ... They're going to have massive problems with disease and famine- and you all just left, despite having medical training that could have saved lives.
Sooo... you want the locals to carve the students up with machetes before the students save the locals' lives, or after? Just curious.
Meanwhile, the medical school has been pumping foreign exchange into the local economy for years (not as much as you'd like maybe, but certainly more than zero), and from your attitude I get the impression that you'd be happier if it were located in Cleveland instead. In Cleveland, the students wouldn't be spending money in Grenada, the students couldn't possibly have the slightest chance of helping the locals with their medical training, and -- most tragic of all! -- the locals couldn't kick the hell of out of the students to punish them for putting money into the local economy.
The school's not responsible for Grenada's poverty. All it's doing is alleviating that poverty to some small degree. If some idiots with machetes chose to chase all the medical students off the island during a medical emergency, blame the idiots with the machetes.
"Disparity". "Depravity" means something else entirely.
Looks like he might like Pastiche.
invite and I will be happy to test it out with a gig of baby photos
Si vis pacem, para bellum! For evil to succeed good men need only do nothing!
An equivalent idea was proposed in about 1982, at the dawn of the internet. Simply tar your filesystem, then email the tar to yourself along a lengthy old-style routing chain. If you need your data back, just wait for the email to arrive and untar it. You could tune the recovery latency by adjusting the routing chain. Of course, over dialup uucp, even one-node-out-and-back path could result in a two day latency.
Man, those were the days.
When all you have is a hammer, everything looks like a skull.
Glad you made it out OK. You got to think though the poor people left on the island with no anything of value mixed in with the looters and rioters.
People just don't grasp how thin the veneer of civilisation is. I've been through three riots, and I don't mean watching it on Tv from 30 miles away either. It goes from normal to MAN 0 MAN THIS REALLY SUCKS in a few minutes. People you might have been sitting next to in a restaurant the day before are now rampaging animals.
Anyway, again, glad you made it out and now you got a tremendous life lesson that most westerners never get, I hope the idea of survivalism and security and backups for everything-besides your data-EVERYTHING-has made an impression. Any area is one cataclysmic event away from normal to sheesh, and 99% of the people out there are as ill prepared as children. My friend who is a survivalist as I am just went through Charley then Frances, he had the only generator in his neighborhood, the only stored water, the only stored food, the only functional equipment to deal with big trees down, the only fuel stash, etc. He's doing OK, his neighbors, hat in hand have to go be refugees. Your big screen home entertainment center is worth diddly squat in any emergency, whereas the same amount of money would leave you with something of everything to deal with morphing reality. He even went through the looter scenario in Florida-something not reported a lot on mainstream TV, but the scum come out of the woodwork when they smell opportunity. Luckily he is very well armed and trained, he definitely had to use what he had, and the laws still allow a minimum of self defense. The local cops even thanked him, as he was able to help keep his neighborhood going and looter-free and they made him a local distribution point for ice, water and food, as he had his stuff together enough they used him so they could go deal with areas where no one was prepared at all.
Farsite. HiveCache. I even worked on a commercial offering: Mangomind (called Medley at the time). Some of these weren't positioned as backup solutions but, structurally, they're just like what Cringely describes. There have been many others, but I'll let people Google for themselves.
Slashdot - News for Herds. Stuff that Splatters.
Error correction gets a lot more sophisticated than checksums, you know. You can make a Reed-Solomon codec for 8-bit code words with 255 byte encoded blocks having any even number of parity bytes, and the way optimal RS codes work is that you can recover the original data as long as the number of missing code words plus twice the number of corrupted code words is no more than the number of parity code words you chose.
So, you divide your data into chunks 225 bytes long. Each byte in a chunk goes to a different peer, and each of the 30 parity bytes also goes to a different peer. Then, even if a dozen peers have simultaneously unsubscribed or crashed and their shares haven't been replicated on new peers yet, you can still recover all your data from the shares that remain.
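For anyone who wants to check the arithmetic, here is a rough sketch of that recovery budget. It is not a codec, just the counting rule for a Reed-Solomon block: missing shares plus twice the corrupted shares must not exceed the parity count. The names `BLOCK`, `PARITY`, `DATA` and `recoverable` are made up for illustration.

```python
# Counting rule only, not an actual Reed-Solomon codec.
# All names here are illustrative, not from any real library.

BLOCK = 255            # encoded block length with 8-bit code words
PARITY = 30            # parity bytes per block (the example above)
DATA = BLOCK - PARITY  # 225 data bytes per block


def recoverable(missing, corrupted, parity=PARITY):
    """True if a block can still be decoded.

    missing   -- shares that are simply gone (peer offline or unsubscribed)
    corrupted -- shares that came back but contain wrong bytes
    A corrupted share costs twice as much as a missing one, because the
    decoder has to locate the error as well as fix it.
    """
    return missing + 2 * corrupted <= parity


def storage_overhead(block=BLOCK, data=DATA):
    """Bytes stored across all peers per byte of original data."""
    return block / data


if __name__ == "__main__":
    print(DATA)                 # 225
    print(recoverable(12, 0))   # a dozen peers gone: still fine
    print(recoverable(12, 10))  # a dozen gone plus ten lying: too many
    print(round(storage_overhead(), 3))
```

The point of the exercise is the overhead: 30 extra bytes per 225 is about 13% more storage, compared with 100% or more for keeping full copies.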
How in the fuck is this flamebait? Fuck you, mods.
During the early 2000s an idea like this had already surfaced during the much-hyped Storage Service Provider (SSP) rush. While most companies, like the now-defunct StorageNetworks (NASDAQ:STOR), were just building massive terabyte clusters into CoLos around the country, one provider, Digital Knox, was creating a system very similar to the OceanStore concepts from Berkeley. The idea was not to use P2P, however, since that required users to volunteer space. Simply put, take the idea of a RAID array with parity and, instead of drives, think CoLo. Now that the data is spread across multiple centers, having just one go down will not effectively kill it. The only drawback, of course, is the time to recover the data, which would be slower, but the setup is far more resilient to natural disasters (hurricanes, terrorist attacks, etc).
These ideas were written up in a book by the former CTO of DigitalKnox, "Fundamentals of Secure SAN", although the book isn't available yet. The biggest problem, of course, is the fact that most clients do not like sending their sensitive data to others. For this reason an additional layer of obscurity was added in the form of EFS. This would allow for non RAND type storage to remain secret even from the storage provider. More importantly, it eased concerns that *other clients* of the storage service could somehow sneak a peek at their data.
The problems only multiply at this point, since now key escrow and remote searching become an issue. The speed tradeoff seemed acceptable to many, but only for long term storage. The problem hasn't gone away, obviously, but the market dropped off the face of the planet. One of the only major survivors was Iron Mountain, who not only stores your data online but will keep backup tapes in secure vault locations around the country.
Put one hard drive power supply in the Pelican case, use the other one with the hard drives to back your systems up. Even with my MP3 collection, I can still use one of these drives to back up my Macintosh and quite a bit of other stuff. Use the other drive to back up Windows and UNIX boxes, nothing fancy, mount the drive and drag entire filesystems over or tar them up and copy them over. Unmount drives from system, put into Pelican case, put Pelican case in gun safe. Backup systems as needed. I figure that if shit goes down and I need to bail on my house that I'm going to make a stop at the gun safe for a few items, so it's the natural place to put the hard drive case.
In case of bad things happening, go to the gun safe, retrieve weapons, passport, emergency cash and hard drives. Head out to car and head to safety. No fuss, no muss. Much easier than the idiocy that Cringely describes.
cheap labor conservatives - they want to keep you hungry enough to be thankful for minimum wage.
of my 1st published submission. 8-(
gewg_
>Wouldn't simple encryption solve the privacy problem?
Other folks have mentioned Baxter's 1:1 make-available/use ratio as inadequate and how, without redundancy, unsubscribers would be a weak point.
I think you're closer to right.
gewg_
Poorly-thought out and Cringely in the same sentence?
Say it ain't so!
but, unlikely to work in practise:
How much redundancy should there be? Two full copies sound like far too few. If 'R' is the number of redundant copies, then understand that every participant has to be sharing R*D bytes, where D is the average backup size. Plus, of course, their own personal data, so with R = 2 everyone's hard drive has to be at least three times the size of the average data set. For realistic backup strategies, ensuring that a full copy was online at any point in time, R would probably have to be much bigger than 2.
I don't know about everyone else, but on the machines that I use a lot, regardless of the attached storage, the storage is always about 80% full (of my data).
Finally, this just doesn't cut it for businesses. Firstly, the altruism, or lack of it: you mean I have to back up everyone else's data and make it permanently available online? Secondly, while I suspect the privacy aspect is easily surmountable (no-one need have a full data set, and only the 'owner' need know how to assemble individual chunks), businesses would not want to entrust even their encrypted data to the unwashed masses.
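To put rough numbers on that, here is a back-of-envelope sketch. It assumes every peer is online independently with the same probability, which is my simplification and not something the column specifies; the function names and the 50% uptime figure are made up for illustration.

```python
# Back-of-envelope only. Assumes peers are online independently with the
# same probability, which real home machines certainly are not.

def disk_needed(avg_backup_gb, copies):
    """Own data plus `copies` backups held for other people, in GB."""
    return (copies + 1) * avg_backup_gb


def chance_some_copy_online(peer_uptime, copies):
    """Probability that at least one of `copies` full copies is reachable."""
    return 1 - (1 - peer_uptime) ** copies


def copies_for(target, peer_uptime):
    """Smallest number of full copies that reaches the target availability."""
    if not 0 < peer_uptime <= 1 or not 0 < target < 1:
        raise ValueError("peer_uptime must be in (0, 1], target in (0, 1)")
    copies = 1
    while chance_some_copy_online(peer_uptime, copies) < target:
        copies += 1
    return copies


if __name__ == "__main__":
    print(disk_needed(10, 2))                # 30 GB of disk for 10 GB of data
    print(chance_some_copy_online(0.5, 2))   # 0.75
    print(copies_for(0.99, 0.5))             # 7
```

If the typical peer is only up half the time, two copies leave you a one-in-four chance of finding nothing online, and getting to 99% takes seven copies, i.e. eight times the average data set on everyone's disk.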
-S
duplicity already allows trading disk space for backups with friends, or even people you don't know. It's safe (all data encrypted by gpg), it's low bandwidth (deltas sent using the rsync algorithm), and it's not a business.
The hardest thing about duplicity right now is probably finding a similarly interested party to trade disk space with.
I trade duplicity space with someone I've never met who has a machine in the same colo, for a backup close to my coloed machine. I also use duplicity to send backups of the server home. I'd be happy to trade duplicity space from 1 to 20 gb with most any interested and competent party.
see shy jo
What if they forget to back up the index that stores the location of each "chunk"?
If their database breaks, everybody loses their backups.
Sooo... you want the locals to carve the students up with machetes before the students save the locals' lives, or after?
Nice conflation of local(machete-wielding looting minority) with local(all local residents), there.
If some idiots with machetes chose to chase all the medical students off the island during a medical emergency, blame the idiots with the machetes.
I guess it's in the interpretation? Because in the story I read, there wasn't any chasing - just panicking, paranoid weapons-manufacturing, panicking, the sight of flames in the distance, panicking, rampant xenophobic assumptions as to the intentions of 'idiots with the machetes', more panicking and then a nice chartered jet ride out.
Who was wounded? Who had to flee a specific pursuer? Nobody. This conflict never happened, and its potential is even questionable.
A company called 321 Inc. has a product called LeanOnMe which does this. Based on JXTA. http://news.zdnet.com/2110-3513_22-5319920.html
It's pretty neat. I'm not worried about hurricanes per se. I really just wanted to back up some stuff on multiple computers and thought - why do I need yet *another* device to handle backups when I can just copy this information across all my computers? So, that's what I did. The app is pretty nice. I'm still in the trial period.
A professor friend of mine developed just such an algorithm that he believes will allow robust splitting, privacy, redundancy, and resiliency to malicious attack. He just never connected the idea with P2P for distribution. I'm a moron for not having made the association for him.
The software is called Pryvit and the site is at www.privit.net. A detailed description sits at their site.
Schneier might call it snake oil. I don't think so. If there are holes, it's because they haven't been vetted by cryptographers. The professor is open to formal critiques and papers. He's opened his source for those who would use it for free (as in beer) or analysis. He's a business professor with a keen interest in software and is leaning towards open source, if only he can find the right license.
For those in the know, this is Dr. Doug Lowry at Franciscan University in Steubenville, OH. Tell him he got his name on Slashdot. -ct
So, then, how is this better than using Baxter (or, worse, your competitor's buy-once-use-everywhere product) on my own private network, not paying you a monthly fee, and using the money I save to buy more hard drives so that the backup and restore isn't artificially slow?
small list of projects which are aiming to do the same:
http://research.microsoft.com/sn/Farsite/
http://oceanstore.cs.berkeley.edu/info/overview.html
http://research.microsoft.com/sn/Farsite/related_work.aspx
As has been mentioned, this idea is quite old: Freenet, OceanStore. The biggest problem of your approach is the economics: convincing someone to give up storage in order to receive storage. "Samsara" is perhaps the closest project to this idea. In this system, one client gives up a block for storage by another client and gets a "rain-check" to use a block on the former's machine. The project was not well received by the research community b/c the economics were not convincing. Security is an issue, but can be solved with appropriate algorithms. Storage is cheap; perhaps the gmail approach with some privacy modifications is the best current public backup storage option.
Bram Cohen, creator of BitTorrent, got the idea for BitTorrent because he was working on a p2p backup system. It's called mojonation:
http://mojonation.sf.net/
Or how about this article from slashdot in 2002 about mojonation:
http://slashdot.org/articles/02/07/18/0244256.shtml?tid=156
;-)
Excuse me if my response is off topic
-- -- --
Help my mini cause: My journal
also at
http://www.linux.org/apps/AppId_719.html
and downloadable from
http://mvb.saic.com/freeware/vmslt00a/net/shsecre
shsecret takes a file and splits it into N parts of equal size such that any M parts can be used to reconstruct the secret, but fewer than M will give absolutely no information about the secret. This program is written in strict ANSI C, so it should be completely portable. It is also hopefully simpler and more efficient than other implementations of the same algorithm.
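For the curious, the "any M of N" property can be had with Shamir-style threshold sharing. The sketch below is my own illustration and not shsecret's code: it shares one integer rather than a file, and the prime and the function names `split` and `combine` are arbitrary choices for the example.

```python
# Shamir-style M-of-N sharing of a single integer. Illustration only;
# not taken from shsecret, and not hardened for real use.
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this


def split(secret, m, n):
    """Return n shares (x, y); any m of them are enough to rebuild secret."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    if not 1 <= m <= n < PRIME:
        raise ValueError("need 1 <= m <= n")
    # Random polynomial of degree m-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, highest degree first
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def combine(shares):
    """Rebuild the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        inv = pow(den, PRIME - 2, PRIME)  # modular inverse via Fermat
        secret = (secret + yi * num * inv) % PRIME
    return secret


if __name__ == "__main__":
    parts = split(123456789, m=3, n=5)
    print(combine(parts[:3]))                        # 123456789
    print(combine([parts[0], parts[2], parts[4]]))   # 123456789
```

Each share is the same size as the secret, which matches the "N parts of equal size" description, and it means the total stored is N times the original. That is the price of the privacy guarantee; Reed-Solomon style splitting is far cheaper but gives no secrecy on its own.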
Sam
blog.sam.liddicott.com
Yeah napster ruled ! Bit torrent rules !
Chris ,
Php Programmers.
I don't see why I should use a system where I'd have to deal with all the world's assholes mutilating data, running netlimiter (so getting your actual data is hopeless, even if they can provide checksums on it), deleting the program/back-up and whatnot.
;)
What I would like is a simple distributed system of friends, that could give me a self-synchronizing encrypted folder. I got lots of friends with broadband connections who'd be willing to lend me a gb or so for all the important stuff (documents, pictures etc.) All it takes is an easy-to-use program that'll do this in the background.
I'd say three friends should do fine, even assuming one is crashed, one is offline when I need the back-up. Since they're your friends, they probably won't fuck with you like in an open system. They give me a gig, I give them 3x1 = 3gig. With 500gb+ HDD space, I think I can afford that
Kjella
Live today, because you never know what tomorrow brings
I had something of a braindump along these lines a little while ago here - mainly the techie bits of how one might go about writing something like this without any of the users falling foul of UK legislation if any of the *other* users store material on their hard drives via the system. It kinda petered out due to lack of interest, though.
See http://parchive.sourceforge.net/ for an implementation.
Every month your P2P app reseeds offline blocks. The bittorrent style of scattering blocks among peers is perfect for this system. And you get an alert if parity is getting dangerously low.
This would be very successful I believe.
If you need text styles to communicate then you don't have a message.
But the code works within a drive to try and prevent bad blocks or noisy reads from corrupting your data, it doesn't work between drives. There are companies that do RS error correction between drives (one calls their scheme "RAID X") but I don't know if that's very widespread. You need a lot of independent places to put your data before RS makes sense; for a hard drive or CD where you can put data in hundreds of different physical locations on the disk, or for a P2P system where you can backup data on hundreds of different peers, this makes sense, but even people using RAID arrays are usually doing it with a dozen disks rather than a hundred.
Sounds like LeanOnMe, a JXTA-powered backup system with encryption.