The Secure Public Data Repository?
jducoeur writes "So Hailstorm has died an unlamented death. But the demand for the idea of an information repository isn't going to go away -- users demand convenience, and this would be convenient. So here's a timely question looking for wild speculation: how would a truly secure, public data repository work? How would your data be stored? Would it be centralized or distributed? How would you grant access to specific elements within it? What would the business case for running such an archive be? Maybe if we can come up with a good design now, we can head off the next inevitable bad one..."
Why does the repository need to be public? In an era of very powerful client machines, why must we have a centralized database to make this work? Systems like Napster and Gnutella have already demonstrated the ability of end-user machines to distribute data effectively (though not always efficiently).
I believe the safest route would be to avoid the publicly accessible, centralized data store and focus on what has worked so well for the Internet in the past: standard communications protocols. By leaving the data on individual systems, we minimize the risk of exposing vast quantities of personal information, since an attacker would need to go after millions of machines in turn. It's possible, but it wouldn't be easy.
Cryptonomicon, anyone? How about Sealand? Seems this has been tried before. People like to hang on to their own data, but most aren't qualified to keep it secure (run a secure server, etc). The problem is that no one trusts any big organization to keep their data for them. Especially Microsoft. Perhaps what we need is an open source, distributed, encrypted system: multiple mirrors on regular PCs, all sharing the collective data set, and all encrypted.
this sig has been rated E for Everyone.
Opposition to Hailstorm isn't an anti-Microsoft thing. As a matter of fact, most businesses want to keep the information their customers provide in their own domain, without a middleman.
So, people (like me) and businesses (like mine) don't WANT a single repository, thank you very much. Forget this issue.
-- @rjamestaylor on Ello
In fact, Hailstorm was designed well enough. It's not perfect, but that's not the point. The problem was not on the technical side, but on the business side. How do you persuade online businesses to use a third-party repository? That's the problem.
Seriously, though, the Net is a public data repository. Each node is as secure as its sysadmins, and information can be public or private. It's publicly accessible, and you can protect whatever you want to protect from the public.
Best of all, it's a network, not a centralized, attackable, censorable entity.
Wheel, re-invent, why?
I don't want to have to trust some company to store all my information for me. I also don't want to trust some open source project with that information. In fact, I *especially* don't want to trust an open source project with it. The only person I trust with my personal information is me.
20 miles from anywhere, and it doesn't respect any court of law in the world... So that's what I call secure (even from the DMCA).
Except that they're not responsible to you for what they do with your data. They can look at it, parse it, copy it, distribute it. You store your neato new plans for a next generation personal mobility device on their servers and suddenly you find a company called SLMovers that's beaten you to the market with exactly your product.
Hey! You can't do that! Oh wait. No. You can do whatever you want.
Sweat
It breaks my pluginses, my precious!
I'm not sure how I feel about having a public repository for private information, at least not until cryptography/system design has reached a level where hacking into the data becomes impossible without destruction of the data (i.e. quantum crypto). There are already a lot of "Online Harddrive Space" websites out there, and for users who don't care about who sees what's on there, that's fine.
I think it would be in the earth's best interest to create a distributed but moderated and indexed galactic encyclopaedia where information from astrophysics, zoology, political structures, history, the whole shebang, was to be found in one place. I know Google is close, but structure would be nice.
We've secretly replaced the Enterprise's dilithium crystals with Folgers crystals. Let's see if they notice.
What we need is not for someone to run a public data store, because whoever runs it isn't going to be trusted by some people. What we need is a protocol for getting data from such a store with the identity information in email address form. Then the users can put their data on a machine they trust, either one provided by an ISP or something or one of their own.
For example, web sites should be able to authenticate users with a client certificate that the client provides when creating the web site account. This client certificate can be essentially anything, so long as it is how the client wishes to be identified. Of course, the client will want to be able to use a different certificate later (if the first one expires), so what the client really is identified by is the certificate chain, which has to have the same name up as far as the self-signed root certificate, and have the same root certificate.
With a scheme like this, users need only find a certificate authority (or create one), and have a way to "log in" with the CA in order to get a client certificate (probably one which expires rapidly).
The server that acts as a CA can also act as a store for other data. Ideally, the browser would be able to fetch form entries from the CA automatically, in response to the user requesting it after logging in. So you could move to the "credit card number" field, hit the "fetch identity value" button, type "CCN" (or whatever you've called it), and the browser would do an HTTPS request with your client cert to get that value and fill in the field with it.
For most people, the CA and data store may be AOL or something, but there's no reason that the CA couldn't be your own machine. While you're at it, you could set it up to recognize other certificates than your own and provide the information you want to make available to these people. If you have a suitable field available to the right set of people, this solves the instant messaging location problem.
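To make the scheme above a bit more concrete, here is a rough Python sketch of the two ideas: identifying a user by the names in the certificate chain plus the root certificate, and a store that hands out named values only to identities the owner has approved. Everything in it is an assumption for illustration: the `Identity`, `identity_of`, and `IdentityStore` names are made up, certificates are stand-in byte strings, and no signature checking is done, which a real system would obviously need.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Identity:
    """Hypothetical identity: names along the chain plus the root's fingerprint."""
    names: tuple
    root_fingerprint: str


def identity_of(chain):
    """Derive an identity from a leaf-first chain of (name, cert_bytes) pairs.

    The leaf certificate's bytes are ignored on purpose, so a renewed
    certificate under the same names and root maps to the same identity.
    Signature verification is omitted; this is only a sketch.
    """
    names = tuple(name for name, _ in chain)
    _, root_bytes = chain[-1]
    return Identity(names, hashlib.sha256(root_bytes).hexdigest())


@dataclass
class IdentityStore:
    """Hypothetical data store run alongside the CA."""
    owner: Identity
    values: dict = field(default_factory=dict)
    # field name -> identities (other than the owner) allowed to read it
    readers: dict = field(default_factory=dict)

    def grant(self, field_name, who):
        self.readers.setdefault(field_name, set()).add(who)

    def fetch(self, field_name, requester):
        allowed = self.readers.get(field_name, set())
        if requester != self.owner and requester not in allowed:
            raise PermissionError(f"{field_name!r} is not shared with this identity")
        return self.values[field_name]


# Example: the owner can read everything, a friend only the shared field.
root = ("Home CA", b"root-cert-bytes")
me = identity_of([("alice@example.org", b"leaf-v1"), root])
friend = identity_of([("bob@example.net", b"bob-leaf"), ("Bob CA", b"bob-root")])

store = IdentityStore(owner=me, values={"CCN": "0000-0000", "location": "at home"})
store.grant("location", friend)
```

Here `store.fetch("CCN", me)` succeeds, `store.fetch("location", friend)` succeeds, and `store.fetch("CCN", friend)` raises `PermissionError`. In practice the fetch would sit behind an HTTPS endpoint that takes the requester's identity from the validated client certificate rather than from a function argument.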
Microsoft tried this and it didn't work because no-one wanted it. Why is there an Ask Slashdot story asking people to come up with ideas for a product that has been unilaterally rejected?
Here's my design idea: How would a truly secure public data repository store data? By not storing data! The whole point of a public data repository is to gather, track and sell marketing information. User convenience is a cover.
"It's Dot Com!"
And why does it have to be XML? I think an SQL solution would be much more efficient. And how exactly are you going to encrypt all this biometric data? And if it's stolen, what do you use for authentication?
Methinks this is a troll, or someone reading one too many Slashdotter posts that read "XML r00olz cuz IT F1x3s 7h3 int3rnet!"
Photos.
I think there are several different levels of personal data, which it makes sense to have different levels of security against.
The lowest level of security would be unauthenticated attribution. i.e. someone quoting something I have written. You don't know if the quote is accurate, or even what the context is, so it would make as much sense for you to rely upon it as it would for me to encapsulate it in a gpg signature. One example would be a blog. While it is reasonable to assume that what you find in a blog is from the person attributed, it is rare indeed to find one gpg signed.
Next up would be "for the record" personal data. This is data such as public keys, and personal data that I want publicly known. In this case the data should be stored in a manner that self-corrects. gpg signing is only part of the solution. Distributed storage could also be appropriate: something similar to RAID 5, with the data spread across many dispersed web servers such that removing one server does not remove any data, and removing up to a fifth or potentially more of the servers would not prevent accurate data reconstruction.
From here we move into data that we do not want generally available, but may want to make available to specific people or groups of people. Examples include a wife making a grocery list available to her husband, or my employer needing my home address, SSN, and bank account number (to ensure that I am insurable, collect taxes, and pay me by direct deposit/debit, respectively).
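As a rough illustration of the RAID 5 style idea, here is a minimal Python sketch that splits a record into data chunks plus one XOR parity chunk, one chunk per server. The function names (`split_with_parity`, `recover`) are hypothetical, and the sketch assumes simple XOR parity, which can rebuild only one missing chunk. With four data chunks and one parity chunk that is one server in five; surviving the loss of more servers would need a stronger erasure code such as Reed-Solomon. Signing is not shown.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data: bytes, n_data: int) -> List[bytes]:
    """Split data into n_data equal-size chunks and append one XOR parity chunk."""
    chunk_len = -(-len(data) // n_data)  # ceiling division
    padded = data.ljust(chunk_len * n_data, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(n_data)]
    parity = reduce(xor_bytes, chunks)
    return chunks + [parity]


def recover(shards: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the record; at most one shard (data or parity) may be None."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can rebuild only one missing shard")
    shards = list(shards)
    if missing:
        present = [s for s in shards if s is not None]
        # XOR of all surviving shards equals the missing one.
        shards[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity shard and the zero padding.
    return b"".join(shards[:-1])[:original_len]


# Example: five "servers", then one of them disappears.
record = b"my public key material"
shards = split_with_parity(record, 4)
shards[2] = None
restored = recover(shards, len(record))
```

The original length is passed to `recover` because the sketch pads the record with zero bytes; a real format would store that length alongside the shards.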
Next up is data that I may want to maintain so that I can work with it as part of work, hobbies, or other things, that I do not think needs to be generally available, but would not be bothered if it were public knowledge. Raw un-filtered data, parts lists, etc.
Then comes things like rough drafts of works I would like to publish, or incremental evaluations of results that are not complete. I don't know of an author around that wants to discover the second draft of their most recent book out on the internet. It could even cause them to be in violation of a publishing contract. Likewise research materials, general e-mail, personal diaries (not blogs) or journals. At this level you might find people questioning whether it is necessary to back up this data.
The last level is for information that would be more expensive to be public than destroyed. Bank card PINs, passwords, private keys, love notes. At this level it may make sense to keep the specific data on a USB storage fob chained to your wrist, or secured by a program that maintains its encryption key on such a device.
I am aware of some people who would maintain that all data that you do not want to be publicly available should be encrypted. For a lot of people, maintaining an encryption infrastructure is beyond them. You or I might think it trivial to set up an encrypted file storage area using gpg, rsa, or mandrake, but then I doubt that my dad would be able to do so.
Worse, the best known examples of private/secure local storage are easily broken into. For example, you can encrypt documents, Outlook .pst folders, and the like, only to discover that for $19.99 you can break into any of these files. (Even less if you can find and compile the code to break into these files yourself.)
Until real security is made easily usable, and businesses and people begin to understand that just because they want to know something does not mean that they should be given or be able to purchase that piece of information, I think we are ultimately going to see more companies desiring to archive, and make public or available for purchase, addresses for stars and embarrassing gaffes of politicians, and more people being fired for actions they unwittingly participated in before the rules saying that those actions are cause for termination were created.
-Rusty
You never know...
I hate it when questionable statements are presented as undisputed facts:
"But the demand for the idea of an information repository isn't going to go away -- users demand convenience, and this would be convenient."
I can't see anybody other than advertising agencies or aspiring dictators demanding a central information repository.
And yet the news story suggests that consumers are demanding it. I really, really doubt that. Any customer convenience can be achieved if the customer's data is stored on his/her computer and is completely under his/her control.
This may be an interesting issue but is worded in a way that loads the question. Slashdot editors should be more careful.
Check it out -- AFS is good for corporations/etc, but Oceanstore is somewhat viable for _everything_.
there is no thing
what else could you want?