MojoNation ... Corporate Backup Tool?
zebziggle writes "I've been watching the Mojo Nation project off and on over the last couple of years. Very cool concept. While taking a look at the site recently, I saw that they've morphed into Hive Cache, a P2P corporate backup solution. Actually, it sounds like a great way to use those spare gigs on the hd."
Take 50 PCs on your intranet, each with a 20 GByte disk. Your backup need is then a cool 1000 GByte, if the disks are all completely full and fully backed up...
For this concept to work, you can see that you need to exclude every copy of Dos95/Office from being backed up. The basis of P2P is that the service users are also the service providers, thus every participating node needs free HD space. Depending on the crypto overhead and your non-backup portion, you still need a lot of free space for this concept. What is the added value over a redundant RAID server? Is the total cost of ownership really lower?
MojoNation proposed an awesome concept with their virtual P2P credits. However, this idea seems to suggest that P2P technology increases your HD size. It does not!
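To put rough numbers on the free-space point, here is a back-of-the-envelope sketch in Python. The 5 GByte of excluded OS/Office files and the redundancy factor of 3 are assumptions picked purely for illustration, and the function name is hypothetical. The only idea it encodes is that each disk must hold its own backed-up data plus its fair part of everybody else's shares.

```python
def backup_capacity_gb(disk_gb, excluded_gb, redundancy):
    """Data per PC that can be backed up when every PC also stores
    `redundancy` times that amount in shares for its peers."""
    # own data + redundancy * own data + excluded files <= disk size
    return (disk_gb - excluded_gb) / (1 + redundancy)

# Assumed figures: 20 GByte disks, 5 GByte not backed up, redundancy factor 3.
per_pc = backup_capacity_gb(disk_gb=20, excluded_gb=5, redundancy=3)
print(f"{per_pc:.2f} GByte per PC, {50 * per_pc:.1f} GByte for 50 PCs")
```

Under those assumed figures, only 187.5 of the 1000 GByte in the building could hold data that is actually being backed up.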
Just my 5 EuroCents,
J.
For security reasons we absolutely want to encrypt and sign everything stored on the other computers. There is nothing tricky about this part; the usual cryptography can be used without modification. This is not going to waste any significant amount of storage space or network bandwidth, but it will require some CPU cycles.
The other, less trivial part of such a system is the redundancy. Reed-Solomon would be one type of redundant coding suitable for the purpose. Parchive also uses this coding.
I know some implementations are limited to at most 255 shares, but for performance reasons it is probably not feasible to use a lot more than that anyway. I expect the Reed-Solomon code to be the most CPU-hungry part of such a system.
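To make the "any k out of n shares" idea concrete, here is a toy sketch in Python. It is not how Parchive or any real implementation does it: real codes usually work over GF(2^8), which is where the 255-share limit comes from, while this sketch assumes the prime field GF(257) to keep the arithmetic simple, so a share value can be 256 and would not fit in one byte. All function names are made up for the example. The data bytes are treated as points on a polynomial and the extra shares are further points on the same polynomial.

```python
P = 257  # smallest prime above 255, so every byte value fits in the field

def _interpolate(points, x):
    """Value at x of the unique polynomial through `points`, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data, n):
    """Expand k data bytes into n shares (x, y); any k of them suffice."""
    k = len(data)
    if not 0 < k <= n < P:
        raise ValueError("need 0 < len(data) <= n < 257")
    points = [(x, byte) for x, byte in enumerate(data, start=1)]
    return [(x, _interpolate(points, x)) for x in range(1, n + 1)]

def decode(shares, k):
    """Rebuild the k data bytes from any k distinct shares."""
    if len(shares) < k:
        raise ValueError("not enough shares")
    points = shares[:k]
    return bytes(_interpolate(points, x) for x in range(1, k + 1))

data = b"backup"
shares = encode(data, 18)   # 6 data bytes -> 18 shares, a factor of three
survivors = shares[12:]     # pretend the first 12 shares were lost
assert decode(survivors, len(data)) == data
```

The double loop in the interpolation is also a hint at why the coding is the CPU-hungry part: the naive cost grows quadratically with the threshold.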
We need to choose a threshold for the system, but I see no reason why the individual users cannot choose their own threshold. If one user wants to be able to reconstruct data from any 85 of the 255 shares, there needs to be three times as much backing storage as data being backed up.
The first approach to storage space would obviously be that each user can consume as much as he himself makes available to the system. I'd happily spend the 10GB of hard disk space needed for two backups of my 1.5GB of important data with a redundancy factor of three. Done correctly, this would give a lot better security than most other backup solutions.
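The storage arithmetic can be written out as a small Python sketch. The function name and parameters are made up for the example; the numbers are the ones from the text (255 shares, a threshold of 85, 1.5GB of data, two backups).

```python
def backing_storage_gb(data_gb, total_shares, threshold, copies=1):
    """Space the network must provide for `copies` backups of `data_gb`
    when any `threshold` out of `total_shares` shares can rebuild the data."""
    expansion = total_shares / threshold  # 255 / 85 = 3.0
    return data_gb * expansion * copies

needed = backing_storage_gb(data_gb=1.5, total_shares=255, threshold=85, copies=2)
print(f"{needed:.1f} GB of backing storage")  # 9.0 GB, i.e. roughly the 10GB quoted
```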
One important aspect you must never forget in such a system is the ability to verify the integrity of backups. I guess this is the trickiest part of the design. Verifying with 100% security that my backup is still intact would require downloading enough data to reconstruct my backup. However, verifying with 99.9999999% security could require significantly fewer samples. Unfortunately, here the 255 shares can be a major limitation: the larger the number of shares gets, the smaller the percentage of data we need to sample. I don't wanna do the exact computations right now, but if 18 randomly picked shares out of the 255 are all intact, we have approximately nine nines of security that there are indeed 85 intact shares among the 255. So we have limited the network usage by almost a factor of five.
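For anyone who does want the exact computation, here is a sketch in Python. It assumes the worst unrecoverable case, namely that exactly 84 of the 255 shares are intact, and asks how likely it is that a random sample of shares all happen to come from those 84. The function names are made up for the example.

```python
from math import comb

def false_pass_probability(total, threshold, sampled):
    """Chance that `sampled` randomly chosen shares are all intact even though
    only threshold - 1 of the `total` shares are (the worst unrecoverable case)."""
    intact = threshold - 1
    return comb(intact, sampled) / comb(total, sampled)

def min_samples(total, threshold, max_risk):
    """Smallest sample size whose false-pass probability is below max_risk."""
    for sampled in range(1, total + 1):
        if false_pass_probability(total, threshold, sampled) < max_risk:
            return sampled

p = false_pass_probability(total=255, threshold=85, sampled=18)
print(f"false-pass probability with 18 samples: {p:.1e}")
print("samples needed for nine nines:", min_samples(255, 85, 1e-9))
```

Under that worst-case model, 18 samples put the false-pass probability below one in a billion, which agrees with the estimate above, and 85/18 is about 4.7, hence the "almost a factor of five".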
If we want:
- Higher security
- Less network usage for verifications
- Good network performance even in case of a few percent of lost shares
we need more than 255 shares of data. There is no theoretical limit to the number of shares, but the CPU usage increases.
What the system also needs is migration of data as users join and leave the system, and a reliable way to detect users responsible for large amounts of lost shares. Creating public key pairs for each user is probably necessary for this. I think this can be done without the need for a PKI; a user can just create his key pair and then start building a reputation.
Do you care about the security of your wireless mouse?
No, you missed it. Click on the 'MojoNation' Hive-Hex tab and you will find a link to the LGPL sites of both the EGTProtocol and the MNET version of MojoNation.
Here's how ADSM backs up:
Clients are installed on the hosts enterprise-wide. These can be mixed platforms: AIX, HPUX, Linux, Windows NT (Cough), Mainframe S/390 (VM) and so on. The host running the ADSM server has a ton of disk space... a snapshot is taken across the network from the client filesystems to be backed up onto the ADSM server's disk. That snapshot gets backed up to tape while the snapshot of the next host is being taken.. and so on..
This works great across a fast network.
But.......
I use Amanda to back up my Linux servers at home.
It works in almost the same manner.
Mojo Nation was conceived by Jim McCoy and Doug Barnes in the 90's. At the end of the 90's they hired hackers and lawyers and started implementing.
Their company, Evil Geniuses For A Better Tomorrow, Inc., opened the source code for the basic Mojo Nation node (called a "Mojo Nation Broker") under the LGPL.
During the long economic winter of 2001, Evil Geniuses ran short of money and laid off the hackers (the lawyers had already served their purpose and were gone).
One of the hackers, me, Zooko, and a bunch of open source hackers from around the world who had never been Evil Geniuses employees, forked the LGPL code base and produced Mnet.
Now there is a new commercial company, HiveCache. HiveCache has been founded by Jim McCoy.
BTW, if you try to use Mnet, be prepared for it not to work. Actually the CVS version works a lot better than the old packaged versions. We would really appreciate some people compiling and testing the CVS version (it is very easy to do, at least on Unix).
It would be really good if someone would compile the win32 build. We do have one hacker who builds on win32, but we need more.