Cringely's P2P Backup Idea
gewg_ writes "If Napster and Bit Torrent had a baby, would it be Baxter? As a follow-on to Cringely's last column, where he talked about having a backup strategy in the wake of Hurricane Frances, this week he proposes a distributed RAID notion as a solution."
Well, we leave the data where it belongs: in the proxy network where the processes live too. Still a bit incomplete, but maturing WebDAV and mountable slices forthcoming...
Just insert a bunch of data into the network, record the keys, retrieve it once a week, then delete. That should keep the data retrievable from the network for a good while. Using two nodes would help. Plus everything is encrypted with some heavy shit.
:(
Or just run a local Freenet on the company LAN. Everything is encrypted and unretrievable without the proper keys, so it's very secure, and it's distributed, plus you get FEC encoding.
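For anyone wondering what the FEC part buys you, here is a minimal sketch of the simplest possible erasure code: a single XOR parity block, RAID-style. This is not Freenet's actual codec, and the function names are made up for illustration. It only shows that with one extra block you can rebuild any one lost block.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Build one parity block covering all data blocks (hypothetical helper)."""
    return xor_blocks(data_blocks)


def recover_missing(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors plus parity."""
    return xor_blocks(surviving_blocks + [parity])


# Three data blocks stored on three peers, parity stored on a fourth.
blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(blocks)

# The peer holding blocks[1] disappears; rebuild it from the rest.
recovered = recover_missing([blocks[0], blocks[2]], parity)
```

Single parity only survives one lost block per group; real FEC schemes tolerate more losses at the cost of more redundancy.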
That assumes Freenet works; AFAIK it's still fucking broken. Ian Clarke is playing too much politics with the project, and the only coder who really understands Freenet (Matthew Toseland) is swamped with ideas, day after day. It just gets worse and worse. The donations seemed like a good idea, but after watching the dev list for the last 18 months, I realize it's a failed project.
Skype Me! username: john_allen_mohammed
I once encoded some data in a few MP3s... this was back in 2000. The MP3s were long speech files, about 30 MB per file at 160 kbps, and were popular but took a long time to transfer. So, to propagate the 'new' files as quickly as possible, I reduced the bit rate from 160 kbps to 32 kbps and added in the 'extra' 'noise' as I did this; as it's speech, it didn't really matter.
If I do a search now they're easy to find, much easier than the original 160kbps were.
This was just a test, no special data used - but an amazing way to archive and distribute data.
--
It is not the commies, the government, the nigger, nor the corporates. It is your paranoia.
From the article:
It sounds from every description like the solution is Linux-specific, but I'm sure it can be made to work with other UNIX variants, especially since Gmail, itself, runs on Apple xServe 1u boxes. Windows compatibility is unknown, but I'm sure someone will solve that soon.
I know, it's a little childish, but I get a good feeling when I see something small...even this little thing here...that thinks of other OSes first, with Windows compatibility coming "real soon now" or something like that.
"Leo Fender was in a 'state of grace' when he designed the Stratocaster." -- Paul Reed Smith
Well, I lived through this storm, checking my PC upstairs to make sure nothing was going to damage it. If the storm was risking the roof flying off and my room becoming flooded, I would have taken out my hdd. This sounds like a brilliant idea.
;)
Hey, it beats trying to store data to gmail accounts!
mysql>SELECT * FROM users WHERE clue > 0
0 Rows Returned
Ideas like Cringely's will be impossible if the INDUCE Act passes.
Save Betamax is a national Congress call-in day this Tuesday to oppose the INDUCE Act. It might be our last chance to stop this bill.
I had this idea in about '97 or '98. I looked around to see if anyone else had done anything like this (remember, this is kinda pre-mass-P2P) and found that someone had, but as a business-scale solution. I think it was called Mango, and it is still in production today. It essentially made a portion of your drive available as a drive letter, and whatever was copied onto it could be seen by all. The data was stored in at least 2 places, so if one went down there was still one copy, and the remaining copy would duplicate itself so that there were always at least 2 copies. In the end, I think nobody went for it because it was too expensive... But this is EXACTLY what a lot of small-to-medium businesses need at the moment. Bring on the Mangos!
It's a neat idea. In a nutshell, he suggests a Peer to Peer encrypted storage network. You get exactly as much storage room as you are willing to offer yourself for others to use. When you store anything, it's encrypted and automatically spread to other systems.
It doesn't make for a very safe backup, though: what happens if somebody decides to stop the service and just deletes his local storage? You've got no backup, at least for a while, and you might not even know it. And of course other people have head crashes, too, which would also obliterate your backup, at least for the time it takes to recreate it from your own data. By that time, you might have deleted the original yourself, either by accident or knowingly, since you have a backup after all. A viable solution would be to store every file multiple times on different remote servers, although that would lower the storage capacity you get. It's still the right step, though.
The crucial problem is that the service provider can't really give any guarantees that you will be able to regain your lost data. With three or more independent copies in different locations, it's very unlikely that the backup won't work for some reason, but a backup that's not 100% is not a very useful one, especially in those situations where backups are really crucial.
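To put rough numbers on that trade-off, here's a back-of-the-envelope sketch. The assumptions are mine, not Cringely's: you get 1:1 credit for the space you offer, every extra copy costs you that much quota, and each remote peer fails independently with the same probability. The 30 GB and 5% figures are made up for illustration.

```python
def usable_capacity_gb(offered_gb, copies):
    """Backup space you actually get if every file is stored `copies` times."""
    return offered_gb / copies


def prob_all_copies_lost(p_single, copies):
    """Chance that every copy is gone at once, assuming independent peers."""
    return p_single ** copies


OFFERED_GB = 30       # hypothetical: space you donate to the network
P_PEER_GONE = 0.05    # hypothetical: chance a given peer is unavailable

for copies in (1, 2, 3):
    print(
        copies,
        usable_capacity_gb(OFFERED_GB, copies),
        prob_all_copies_lost(P_PEER_GONE, copies),
    )
```

The loss probability drops geometrically with each copy but never reaches zero, which is the point above: very unlikely to fail is not the same as guaranteed.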
It's still a neat idea, and to my knowledge has not been done to that degree of sophistication. Of course, as others suggest, nobody is stopping you from inserting encrypted data into Freenet, but that's nowhere near as fast and secure as this could be. And while it's not a true backup, it's better than no backup at all, and most likely enough security for many persons.
Switch back to Slashdot's D1 system.
Peer Pressure
If your character data was stored on everyone else's computer, it would act like a virtual server, where if a few data sets get hacked, they'd be corrected by the whole.
P2P can work in wild ways we haven't even tapped.
Too bad Orrin Hatch is trying to outlaw P2P:
www.geocities.com/James_Sager_PA
God spoke to me.
Or for that matter, why not build encryption into the system itself, so that you don't have to do it manually?
/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i
A company called 312, Inc. already has a commercial product for P2P backups called Lean On Me.
I don't work for them, etc.
I've also been suggesting this for years. I'm too lazy to search for the older posts, but here is one from July:
4 3518
http://slashdot.org/comments.pl?sid=115027&cid=97
Of course what matters, though, is not talking about ideas, but *doing* them.
The company I work for (banking) sells storage for 120 euro per gigabyte per year to our internal clients. That's storage on RAID-disks (think StorageTek and the like), including backup (on tape) and all necessary services (people doing maintenance, restoring backups, etc). 120 euro / gigabyte / year comes to 1,22 dollar / month / 100 megabytes (compare to 8 $ per month with Apple). Considering our 1,22 $ plus some network costs, plus maintaining a billing system for a couple of million clients, and a bit of profit margin, maybe 8 $ per month is not a rip-off.
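For anyone checking the arithmetic, here is the conversion spelled out. The exchange rate of 1.22 dollars per euro is inferred from the figures in the post rather than stated in it, and 1 GB is taken as 1000 MB; both are assumptions.

```python
EUR_PER_GB_YEAR = 120.0     # internal price quoted above
USD_PER_EUR = 1.22          # assumption: rate implied by the quoted figures
MB_PER_GB = 1000            # assumption: decimal gigabytes
APPLE_USD_PER_100MB_MONTH = 8.0  # Apple price quoted above

# Per year -> per month, then per GB -> per 100 MB.
eur_per_100mb_month = EUR_PER_GB_YEAR / 12 / (MB_PER_GB / 100)
usd_per_100mb_month = eur_per_100mb_month * USD_PER_EUR

# How many times the internal cost Apple charges.
markup = APPLE_USD_PER_100MB_MONTH / usd_per_100mb_month

print(round(usd_per_100mb_month, 2), round(markup, 1))
```

So the internal cost works out to about 1 euro per 100 MB per month, and the 8 dollar price is roughly six and a half times that before network, billing, and margin.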
I have a photographic memory for numbers. I know almost a hundred of them.
I was thinking about something like this for video activists who frequently have their tapes/discs confiscated by the cops. It'd be great if they had PocketPCs with webcams that were operating in a baxterian sort of way such that the video they were taking was simultaneously being recorded to the storage of other activists/media within wifi range. You could have wifi NAS (network storage) in vehicles and apartments surrounding the demonstration area, as well as on ipod-level storage in future wifi enabled pocketpcs. 3G cameraphones with hard drives might provide another simpler option, if they could be networked together in a p2p fashion. The cops might be able to confiscate my webcam and pocketpc, but my recordings (and proof) would be elsewhere in the aether.
geeks are cats who dig a certain kind of cool
And oddly, Simson Garfinkel, another well-known technopundit, submitted a very similar idea (a P2P backup service) as a business plan to the MIT $50K competition back in 2002. See here for the entry summary (search in the page for Garfinkel). Anyway, I somehow dredged that up from the back of my brain when I saw this Cringely piece, because I recalled that Garfinkel was interested in actually doing something like this several years back.
On the contrary, I'd say Auntie has a really strong case that she never had the key to someone else's encrypted data stored on her drive, so the RIP act would not apply to her.
I made a related waggish proposal a couple of years ago:
1. Make tarball of backup
2. Encrypt if desired
3. Encode tarball, 4-8 bytes at a time, in email addresses
4. Put email addresses on web
5. Wait for spam
Presto -- spammers now pay for your backup; anytime you have a disk failure, just wait a while and watch your spamcan or smtp log, and reconstruct your backup at will.
(Some assembly required, offer void where prohibited)
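In the same waggish spirit, step 3 could look something like the sketch below. The address layout (a "b" prefix, a sequence number, an "x" separator, then the chunk in hex) and the example.com domain are invented for illustration; the sequence number is there so the pieces can be put back in order no matter when the spam shows up.

```python
def encode_to_addresses(data, chunk_size=8, domain="example.com"):
    """Split data into chunks and hide each one in an email local part."""
    addresses = []
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        chunk = data[offset:offset + chunk_size]
        # Hypothetical layout: b<sequence number>x<hex of chunk>@domain
        addresses.append(f"b{index:08d}x{chunk.hex()}@{domain}")
    return addresses


def decode_from_addresses(addresses):
    """Rebuild the data from addresses seen in any order, duplicates allowed."""
    chunks = {}
    for address in addresses:
        local_part = address.split("@", 1)[0]
        # Drop the leading "b", then split sequence number from hex payload.
        index_text, hex_text = local_part[1:].split("x", 1)
        chunks[int(index_text)] = bytes.fromhex(hex_text)
    return b"".join(chunks[i] for i in sorted(chunks))


payload = b"tarball bytes go here"
addrs = encode_to_addresses(payload)

# Spam arrives out of order and more than once; reconstruction still works.
restored = decode_from_addresses(list(reversed(addrs)) + addrs[:1])
```

With 8-byte chunks, each local part is 26 characters, comfortably inside the 64-character limit for local parts.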
An equivalent idea was proposed in about 1982, at the dawn of the internet. Simply tar your filesystem, then email the tar to yourself along a lengthy old-style routing chain. If you need your data back, just wait for the email to arrive and untar it. You could tune the recovery latency by adjusting the routing chain. Of course, over dialup uucp, even a one-node-out-and-back path could result in a two-day latency.
Man, those were the days.
When all you have is a hammer, everything looks like a skull.
Maybe this would be good for some data, but I would never back up sensitive data on something like this. Nor would a lot of businesses.
I've been backing up sensitive data almost exactly like this for quite a while now. I've got an application that breaks a stream of data into chunks and encrypts them. It compares the md5 of the source block against the md5 of the same block from the previous backup. If they match, it hard links the block into the backup directory; if they don't match, it encrypts the block and stores a new copy.
The net effect is that I have about 2.5GB of full database dumps nightly that take up about 3MB of storage space (and thus take about 3MB to transfer with rsync -H). I do this without having to store the unencrypted stream.
So yeah, it may be better to not send my GPG encrypted blocks to these other machines, but I trust GPG about as much as I trust my own network and applications keeping my data safe, so it's good for me.
I do the same thing with my DB dumps and my mail. I only have a 144kbps internet connection, so being able to send my full dumps out nightly across it is a real win.
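A minimal sketch of the chunk, compare, and hard-link scheme described above, for anyone who wants to try it. This is a guess at the shape, not the poster's actual application: the block size, the file naming, the sidecar .md5 files, and the `encrypt` callable are all assumptions. The stand-in `fake_encrypt` is NOT encryption; in practice you would call out to something like GPG there.

```python
import hashlib
import io
import os
import tempfile


def read_digest(path):
    """Return the stored md5 for a block, or None if there is none."""
    try:
        with open(path) as f:
            return f.read().strip()
    except FileNotFoundError:
        return None


def backup_stream(stream, prev_dir, new_dir, encrypt, block_size=1024 * 1024):
    """Back up a stream block by block; returns (reused, stored) counts."""
    os.makedirs(new_dir, exist_ok=True)
    reused = stored = index = 0
    while True:
        block = stream.read(block_size)
        if not block:
            break
        name = f"block{index:08d}"
        digest = hashlib.md5(block).hexdigest()
        new_block = os.path.join(new_dir, name)
        prev_block = os.path.join(prev_dir, name) if prev_dir else None
        if (prev_block and os.path.exists(prev_block)
                and read_digest(prev_block + ".md5") == digest):
            # Unchanged since last backup: hard link, no new storage used.
            os.link(prev_block, new_block)
            os.link(prev_block + ".md5", new_block + ".md5")
            reused += 1
        else:
            # New or changed block: encrypt it and store a fresh copy.
            with open(new_block, "wb") as f:
                f.write(encrypt(block))
            with open(new_block + ".md5", "w") as f:
                f.write(digest)
            stored += 1
        index += 1
    return reused, stored


def fake_encrypt(block):
    """Placeholder only; reversing bytes is NOT encryption."""
    return block[::-1]


with tempfile.TemporaryDirectory() as root:
    night1 = os.path.join(root, "night1")
    night2 = os.path.join(root, "night2")
    dump1 = b"a" * 10 + b"b" * 10 + b"c" * 10
    dump2 = b"a" * 10 + b"X" * 10 + b"c" * 10  # only the middle block changed
    first = backup_stream(io.BytesIO(dump1), None, night1, fake_encrypt, block_size=10)
    second = backup_stream(io.BytesIO(dump2), night1, night2, fake_encrypt, block_size=10)
```

Because unchanged blocks are hard links, copying the backup tree with rsync -H only ships the blocks that actually changed. One caveat of this particular sketch: keeping a plaintext md5 next to each encrypted block tells anyone holding the backup which blocks changed between nights.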
-- The world is watching America, and America is watching TV.