Slashdot Mirror


The Secure Public Data Repository?

jducoeur writes "So Hailstorm has died an unlamented death. But the demand for the idea of an information repository isn't going to go away -- users demand convenience, and this would be convenient. So here's a timely question looking for wild speculation: how would a truly secure, public data repository work? How would your data be stored? Would it be centralized or distributed? How would you grant access to specific elements within it? What would the business case for running such an archive be? Maybe if we can come up with a good design now, we can head off the next inevitable bad one..."

15 of 175 comments

  1. Let me ask one question... by kjz · · Score: 4, Insightful

    Why does the repository need to be public? In an era of very powerful client machines, why must we have a centralized database to make this work? Systems like Napster and Gnutella have already demonstrated the ability of end-user machines to distribute data effectively (though not always efficiently.)

    I believe the safest route would be to avoid the publicly accessible, centralized data store and focus on what has worked so well for the Internet in the past: standard communications protocols. By leaving the data on individual systems, we minimize the risk of exposing vast quantities of personal information, as an attacker would need to go after millions of machines in turn. It's possible, but it wouldn't be easy.

    1. Re:Let me ask one question... by crimoid · · Score: 4, Insightful

      Once mobile phones, computers, watches, toasters and everything else under the sun become net-enabled, the "powerful client" gets thrown out the window. The need then becomes one of availability. Keeping many of these gadgets "in sync" with one another (and with your personal information) becomes hard. The easiest solution is some form of central repository, hence the "need".

      Now one might argue that in the future (present?) broadband will allow everyone to "serve" their own information from their home PC (a.k.a. home server), but building the infrastructure to do this in some sort of secure, standardized, highly available way is more than "wouldn't be easy".

      For 99% of the population I'd imagine that their personal info would be safer in the hands of trusted professionals rather than residing on grandma's 486. The question will eventually come down to which professional do you trust the most.

  2. Data haven by Jaiden · · Score: 1, Insightful

    Cryptonomicon, anyone? How about Sealand? Seems this has been tried before. People like to hang on to their own data, but most aren't qualified to keep it secure (run a secure server, etc.). The problem is that no one trusts any big organization to keep their data for them. Especially Microsoft. Perhaps what we need is an open source, distributed, encrypted system: multiple mirrors on regular PCs, all sharing the collective data set, and all encrypted.

    --
    this sig has been rated E for Everyone.
  3. No Way by rjamestaylor · · Score: 3, Insightful
    I will not have a single repository storing my information -- all my accounts and what not -- unless that repository is my brain. Period.

    Opposition to Hailstorm isn't an anti-Microsoft thing. As a matter of fact, most businesses want to have in their own domain the information provided by their customers, without a middle man.

    So, people (like me) and businesses (like mine) don't WANT a single repository, thank you very much. Forget this issue.

    --
    -- @rjamestaylor on Ello
  4. Hailstorm by igrek · · Score: 3, Insightful

    In fact, Hailstorm was designed well enough. It's not perfect, but that's not the point. The problem was not on the technical side but on the business side. How do you persuade online businesses to use a third-party repository? That's the problem.

  5. Public Repositories by Moonshadow · · Score: 4, Insightful
    Well, there's some newfangled thing like that today. It's called the "Internet" or something like that. Supposedly, anyone can put anything they want on there! Imagine that!

    Seriously, though, the Net is a public data repository. Each node is as secure as its sysadmins, and information can be public or private. It's publicly accessible, and you can protect whatever you want to protect from the public.

    Best of all, it's a network, not a centralized, attackable, censorable entity.

    Wheel, re-invent, why?

  6. Why don't you ask the users? by Wonko42 · · Score: 5, Insightful
    Who demands convenience? I don't demand convenience. I *prefer* not having all my eggs in one basket. I like being able to choose which companies get to know which details about me. If I have a hard time keeping track of all my different passwords or user accounts, I'll write my passwords down and store them in a text file that's PGP-encrypted with a 4096-bit key and a passphrase that I know I'll never forget.

    I don't want to have to trust some company to store all my information for me. I also don't want to trust some open source project with that information. In fact, I *especially* don't want to trust an open source project with it. The only person I trust with my personal information is me.

  7. Re:Why Public... by sweatyboatman · · Score: 3, Insightful

    20 Miles from anywhere and it doesn't respect any court of law in the world... So thats what I call secure (Even from the DMCA).

    Except that they're not responsible to you for what they do with your data. They can look at it, parse it, copy it, distribute it. You store your neato new plans for a next generation personal mobility device on their servers and suddenly you find a company called SLMovers that's beaten you to the market with exactly your product.

    Hey! You can't do that! Oh wait. No. You can do whatever you want.

    Sweat

    --
    It breaks my pluginses, my precious!
  8. Earth Encyclopaedia by Caltheos · · Score: 3, Insightful

    I'm not sure how I feel about having a public repository for private information, at least not until cryptography/system design has reached a level where hacking into the data becomes impossible without destruction of the data (i.e. quantum crypto). There are already a lot of "Online Harddrive Space" websites out there, and for users who don't care about who sees what's on there, that's fine.

    I think it would be in the earth's best interest to create a distributed but moderated and indexed galactic encyclopaedia where information on astrophysics, zoology, political structures, history, the whole shebang could be found in one place. I know Google is close, but structure would be nice.

    --
    We've secretly replaced the Enterprise's dilithium crystals with Folgers crystals. Let's see if they notice.
  9. Software that users can run themselves by iabervon · · Score: 3, Insightful

    What we need is not for someone to run a public data store, because whoever runs it isn't going to be trusted by some people. What we need is a protocol for getting data from such a store with the identity information in email address form. Then the users can put their data on a machine they trust, either one provided by an ISP or something or one of their own.

    For example, web sites should be able to authenticate users with a client certificate that the client provides when creating the web site account. This client certificate can be essentially anything, so long as it is how the client wishes to be identified. Of course, the client will want to be able to use a different certificate later (if the first one expires), so what the client really is identified by is the certificate chain, which has to have the same name up as far as the self-signed root certificate, and have the same root certificate.

    With a scheme like this, users need only find a certificate authority (or create one), and have a way to "log in" with the CA in order to get a client certificate (probably one which expires rapidly).

    The server that acts as a CA can also act as a store for other data. Ideally, the browser would be able to fetch form entries from the CA automatically, in response to the user requesting it after logging in. So you could move to the "credit card number" field, hit the "fetch identity value" button, type "CCN" (or whatever you've called it), and the browser would do an HTTPS request with your client cert to get that value and fill in the field with it.

    For most people, the CA and data store may be AOL or something, but there's no reason that the CA couldn't be your own machine. While you're at it, you could set it up to recognize other certificates than your own and provide the information you want to make available to these people. If you have a suitable field available to the right set of people, this solves the instant messaging location problem.
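
A minimal sketch of the scheme iabervon describes in comment 9, under loud assumptions: no real X.509 parsing or TLS is done here, a "certificate" is just placeholder bytes, and `Identity`, `IdentityStore`, `put` and `fetch` are hypothetical names invented for illustration. The point it tries to show is that a client is identified by its name plus the root certificate it chains to, so a renewed leaf certificate under the same root is still the same client, and that each stored field carries its own list of identities allowed to read it.

```python
import hashlib
from dataclasses import dataclass, field


def fingerprint(cert_bytes: bytes) -> str:
    """SHA-256 fingerprint of a certificate's raw bytes (placeholder bytes here)."""
    return hashlib.sha256(cert_bytes).hexdigest()


@dataclass(frozen=True)
class Identity:
    """Hypothetical identity: the name on the certificate plus the root it chains to."""
    name: str
    root_fingerprint: str


@dataclass
class IdentityStore:
    """Hypothetical per-user data store running alongside the user's CA."""
    owner: Identity
    values: dict = field(default_factory=dict)   # field name -> stored value
    readers: dict = field(default_factory=dict)  # field name -> set of Identity

    def put(self, key: str, value: str, shared_with=()) -> None:
        """Store a value and record which other identities may read it."""
        self.values[key] = value
        self.readers[key] = set(shared_with)

    def fetch(self, requester: Identity, key: str) -> str:
        """Return a value to the owner, or to anyone it was shared with."""
        allowed = self.readers.get(key, set())
        if requester != self.owner and requester not in allowed:
            raise PermissionError(f"{requester.name} may not read {key!r}")
        return self.values[key]


# Usage: the owner keeps "CCN" private and shares an IM location with one friend.
alice = Identity("alice@example.org", fingerprint(b"alice root cert (placeholder)"))
bob = Identity("bob@example.org", fingerprint(b"bob root cert (placeholder)"))
store = IdentityStore(owner=alice)
store.put("CCN", "0000-0000-0000-0000")
store.put("IM", "laptop.example.org", shared_with=[bob])
```

Keying on the root fingerprint rather than the leaf certificate is what lets short-lived client certificates expire and be reissued without the web site losing track of the account.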

  10. Solving problems that don't exist by version5 · · Score: 2, Insightful
    after nine months of intense effort the company [Microsoft] was unable to find any partner willing to commit itself to the program.

    Microsoft tried this and it didn't work because no one wanted it. Why is there an Ask Slashdot story asking people to come up with ideas for a product that has been universally rejected?

    Here's my design idea: How would a truly secure public data repository store data? By not storing data! The whole point of a public data repository is to gather, track and sell marketing information. User convenience is a cover.

    --

    "It's Dot Com!"

  11. XML? Biometrics? why? by metalhed77 · · Score: 2, Insightful

    And why does it have to be XML? I think an SQL solution would be much more efficient. And how exactly are you going to encrypt all this biometric data? And if it's stolen, what do you use for authentication?

    Methinks this is a troll, or someone reading one too many Slashdotter posts that read (XML r00olz cuz IT F1x3s 7h3 int3rnet!)

    --
    Photos.
  12. Types of personal data... by rusty0101 · · Score: 2, Insightful

    I think there are several different levels of personal data, which it makes sense to have different levels of security against.

    The lowest level of security would be unauthenticated attribution. i.e. someone quoting something I have written. You don't know if the quote is accurate, or even what the context is, so it would make as much sense for you to rely upon it as it would for me to encapsulate it in a gpg signature. One example would be a blog. While it is reasonable to assume that what you find in a blog is from the person attributed, it is rare indeed to find one gpg signed.

    Next up would be "for the record" personal data. This is data such as public keys, and personal data that I want publicly known. In this case the data should be stored in a manner that self-corrects. GPG signing is only part of the solution; distributed storage similar to RAID 5, with data spread across many dispersed web servers such that removing one server does not remove any data, and removing up to a fifth or potentially more of the servers would not prevent accurate data reconstruction, could be appropriate.

    From here we move into data that we do not want generally available, but may want to make available to specific people or groups of people. Examples include a wife making a grocery list available to her husband, or my employer needing my home address, SSN, and bank account number (to ensure that I am insurable, collect taxes, and pay me by direct deposit/debit, respectively).

    Next up is data that I may want to maintain so that I can work with it as part of work, hobbies, or other things, that I do not think needs to be generally available, but would not be bothered if it were public knowledge. Raw un-filtered data, parts lists, etc.

    Then comes things like rough drafts of works I would like to publish, or incremental evaluations of results that are not complete. I don't know of an author around that wants to discover the second draft of their most recent book out on the internet. It could even cause them to be in violation of a publishing contract. Likewise research materials, general e-mail, personal diaries (not blogs) or journals. At this level you might find people questioning whether it is necessary to back up this data.

    The last level is for information that would be more expensive to be public than destroyed. Bank card PINs, passwords, private keys, love notes. At this level it may make sense to keep the specific data on a USB storage fob chained to your wrist, or secured by a program that maintains its encryption key on such a device.

    I am aware of some people who would maintain that all data that you do not want to be publicly available should be encrypted. For a lot of people maintaining an encrypting infrastructure is beyond them. You or I might think it trivial to set up an encrypted file storage area using gpg, rsa, or mandrake, but then I doubt that my dad would be able to do so.

    Worse, the best known examples of private/secure local storage are easily broken into. For example you can encrypt documents, outlook.pst folders, and the like, only to discover that for $19.99 you can break into any of these files. (Even less if you can find and compile the code to break into these files yourself.)

    Until real security is made easily usable, and businesses and people begin to understand that just because they want to know something does not mean that they should be given or be able to purchase that piece of information, I think we are going to ultimately see more companies desiring to archive, and make public or available for purchase, addresses for stars and embarrassing gaffes of politicians, and more people being fired for actions they unwittingly participated in before the rules saying that those actions are cause for termination were created.

    -Rusty

    --
    You never know...
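
The RAID 5 style storage that rusty0101 suggests in comment 12 for "for the record" data can be sketched with a single XOR parity share. This is an illustration under stated assumptions, not a design from the thread: `make_shares` and `recover` are hypothetical helper names, each share stands in for one server, and GPG signing is left out entirely. Note that one parity share only survives the loss of one server; tolerating the loss of a fifth or more of the servers, as the comment hopes, would need a stronger erasure code such as Reed-Solomon.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_shares(data: bytes, n_data: int) -> list:
    """Split data into n_data equal chunks plus one XOR parity chunk."""
    chunk_len = -(-len(data) // n_data)  # ceiling division
    padded = data.ljust(chunk_len * n_data, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(n_data)]
    parity = reduce(xor_bytes, chunks)
    return chunks + [parity]


def recover(shares: list, original_len: int) -> bytes:
    """Rebuild the data when at most one share (marked None) is missing."""
    missing = [i for i, share in enumerate(shares) if share is None]
    if len(missing) > 1:
        raise ValueError("single parity can only repair one missing share")
    shares = list(shares)
    if missing:
        present = [share for share in shares if share is not None]
        # XOR of all surviving shares equals the one that was lost.
        shares[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity share and the zero padding added by make_shares.
    return b"".join(shares[:-1])[:original_len]


# Usage: four data servers plus one parity server; server 2 disappears.
record = b"public key and contact details, for the record"
shares = make_shares(record, 4)
shares[2] = None
restored = recover(shares, len(record))
```

The original length is passed to `recover` because the padding bytes are otherwise indistinguishable from real trailing zero bytes in the data.
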
  13. I don't like this news post by Edmund+Blackadder · · Score: 4, Insightful

    I hate it when questionable statements are presented as undisputed facts:

    "But the demand for the idea of an information repository isn't going to go away -- users demand convenience, and this would be convenient."

    I can't see anybody other than advertising agencies or aspiring dictators demanding a central information repository.

    And yet the news story suggests that consumers are demanding it. I really, really doubt that. Any customer convenience can be achieved if the customer data is stored on his/her computer and is completely under his/her control.

    This may be an interesting issue but is worded in a way that loads the question. Slashdot editors should be more careful.

  14. Re:Ocean Store by willis · · Score: 2, Insightful
    OceanStore is much more than what you suggest. It's self-routing/self-healing/self-caching/self-everything -- it's designed to make things as low maintenance as possible. There are processes to defend against compromise (a small but sig. number of corrupted/hacked hosts can't bring it down). There are OceanStore processes that look into the OceanStore and make optimization decisions (introspection, I believe).


    Check it out -- AFS is good for corporations/etc, but Oceanstore is somewhat viable for _everything_.

    --

    there is no thing
    what else could you want?