Slashdot Mirror


Ask Slashdot: Best Practices For Collecting and Storing User Information?

New submitter isaaccs writes "I'm a mobile developer at a startup. My experience is in building user-facing applications, but in this case, a component of an app I'm building involves observing and collecting certain pieces of user information and then storing them in a web service. This is for purposes of analysis and ultimately functionality, not persistence. This would include some obvious items like names and e-mail addresses, and some less obvious items involving user behavior. We aim to be completely transparent and honest about what it is we're collecting by way of our privacy disclosure. I'm an experienced developer, and I'm aware of a handful of considerations (e.g., the need to hash personal identifiers stored remotely), but I've seen quite a few startups caught with their pants down on security/privacy of what they've collected — and I'd like to avoid it to the degree reasonably possible given we can't afford to hire an expert on the topic. I'm seeking input from the community on best-practices for data collection and the remote storage of personal (not social security numbers, but names and birthdays) information. How would you like information collected about you to be stored? If you could write your own privacy policy, what would it contain? To be clear, I'm not requesting stack or infrastructural recommendations."

120 comments

  1. Just don't do it by sublayer · · Score: 5, Insightful

    Best practice from my perspective: do not collect the data at all.

    1. Re:Just don't do it by puterguy · · Score: 5, Insightful

      If you really feel the need to collect personal data and you *truly* care about the privacy concerns and needs of your customers, then don't go burying such disclosures in a privacy statement that the average user is unlikely to ever see let alone read.

      If you truly care about privacy, then either require the user to *opt-in* to such sharing or prominently display the lack of such privacy on the initial splash screen.

      Burying the collection of personal data in the middle of some lawyerly gobbledygook privacy statement is like mortgage lenders burying key terms in the middle of hundreds of pages of documentation. Yeah, it's legally there, but no one is actually going to read or understand it.

    2. Re:Just don't do it by Anonymous Coward · · Score: 0

      That's good stuff. I think there's a lot to be said for being honest and normal with people. You don't have to be obnoxious about it, just tell me what you're doing, up front, in plain English. It'll earn you a little trust and appreciation from your users.

    3. Re:Just don't do it by hutsell · · Score: 1, Insightful

      Best practice from my perspective: do not collect the data at all.

      Exactly: "Put the Database down now, and step away from the Internet."

      Sorry, but my interest in giving the benefit of the doubt to the question's possible sincerity was lost when reading the part about the unoriginal solution for ensuring honesty and transparency -- the solution being hidden in (the lawyer make-work terms of) "our privacy disclosure".

      --
      Yesterday's Weirdness is Tomorrow's Reason Why
    4. Re:Just don't do it by fm6 · · Score: 2

      So, Slashdot made a mistake in allowing you to create an account?

    5. Re:Just don't do it by c0lo · · Score: 2

      Best practice from my perspective: do not collect the data at all.

      More detailed:
      Rule 1. don't do it
      Rule 2. if for some reasons, rule 1 cannot be followed, collect them but discard them immediately
      Rule 3. if for some reasons, the prev 2 rules cannot be obeyed, after collection put them on a WORN storage (that is: "Write Only, Read Never" media)

      --
      Questions raise, answers kill. Raise questions to stay alive.
    6. Re:Just don't do it by philip.paradis · · Score: 2, Insightful

      Alternately, people could simply take responsibility for themselves and choose to avoid services which require agreement to miles of terms. Given your attitude on the topic, you probably haven't even bothered to read the terms of service for anything you're using right now. It seems you're trying to divert responsibility for yourself onto the backs of the service organizations you choose to deal with. Again, note the word "choose."

      You've also managed to miss the opportunity to discuss where data goes and how it's protected after it's submitted. Oddly enough, this is the essential question posed by the submitter, and regardless of what any given set of terms says, it is actually the most important piece, one that very few people think about at all. In other words, you can trust an organization to high heaven based on what they say they will or won't do with your data, but if their infrastructure is a gaping mess of channels by which your information could get compromised, all of a sudden those terms don't mean much. I applaud the submitter for asking the right questions, and remind you to think more about your responses in terms of real world data acquisition and retention mechanisms before posting again.

      --
      Write failed: Broken pipe
    7. Re:Just don't do it by davester666 · · Score: 5, Funny

      Yes, just store the data in plaintext, in a mysql database connected directly to the internet.

      Bonus points if you create mysql users for each unique user and use their username/password to authenticate connections to the database.

      --
      Sleep your way to a whiter smile...date a dentist!
    8. Re:Just don't do it by CodeBuster · · Score: 3, Interesting

      Whenever I'm signing up for a new site or using a service for the first time, I always do a recon of their sign-up procedures using a fake name / email address so I can see what sort of information they "require" before I even get started, and even then I only give up what I absolutely have to. If I can get away with using the fake information permanently, then I do that. I keep track of all my fake identities in an encrypted file container by site name so that I can be consistent with my aliases. This strategy works well for me and I'm sure that I can't be the only person out there who does this. As Robert De Niro's character, Jack Byrnes, said in Meet the Fockers (paraphrased), "If you're outside the circle of trust, you're on a need to know basis and right now you don't need to know."

    9. Re:Just don't do it by Anonymous Coward · · Score: 0

      Use the NULL storage engine.

    10. Re:Just don't do it by Anonymous Coward · · Score: 1

      I hate to IANAL here but here goes:
      In your country of origin there is legislation you must prove compliance with should the relevant government body find out that you are collecting user information. Personally identifiable stuff (name, address, phone number, e-mail) is considered sensitive. Personally identifiable sensitive stuff (social insurance numbers, health records, employment history, criminal records, etc., ad nauseam) comes with hefty legislation, like HIPAA, for each type of stuff. Again, the parent here is correct: by "not collecting, saving or distributing anything" you save yourself the headache of having to tell your respective government what you are collecting and how long you have to store it, never mind the fact that you need the storage and audit trail on whatever you do collect (e.g., SOX). Europe and Commonwealth countries take this VERY seriously; the governments have armies of lawyers whose sole joy in life is to SUE some poor non-compliant (potentially unscrupulous) company into oblivion. Canada's Privacy Commission sued Google over its data collection practices more than once; the EU has filed numerous complaints, and these things are not cheap. AGAIN, IANAL, but YMMV and you should hire one to CYA.

    11. Re:Just don't do it by rwise2112 · · Score: 1

      I was going to say "email it to Anonymous - they'll back it up for you too", but your method will be just as effective!

      --

      "For every expert, there is an equal and opposite expert"
    12. Re:Just don't do it by isaaccs · · Score: 2

      There were little to no details given as to how the privacy disclosure would be phrased or provided to users. As it were, your assumption is wrong. There is no desire to squirrel away anything in legalese. Indeed, the question asks: "If you could write your own privacy policy, what would it contain?". You describe the "hidden" (which you've assumed) solution as unoriginal, but provide no alternative suggestions (which was the point of submitting the question to the community in the first place).

    13. Re:Just don't do it by Eadwacer · · Score: 1

      I think it was Robert X. Cringely who compared personal user data to toxic waste. You don't ever want to produce it. If you do produce it, it's your responsibility forever because you don't know where an undiscovered drum of it is hiding. If it touches something, that something becomes toxic also. Finally, the legal implications of it getting out into public are capable of destroying your company.

    14. Re:Just don't do it by jittles · · Score: 4, Interesting

      >Burying the collection of personal data in the middle of some lawyerly gobbledygook privacy statement is like mortgage lenders burying key terms in the middle of hundreds of pages of documentation. Yeah, it's legally there, but no one is actually going to read or understand it.

      When I bought my house, I spent about 3 hours at the title company reading and signing the mountain of paperwork. I would never commit myself to 30 years of anything without knowing and understanding the details. I will say that the notary was pissed. After 30 minutes she said "Are you really going to read the entire thing?" And later "I have an appointment, you're going to make me late." My responses were "Yes, I'd be stupid not to." and "You scheduled this entire block with me, it's not my fault you double booked yourself, you'll have to cancel your other appointment."

    15. Re:Just don't do it by AwesomeMcgee · · Score: 1

      Hah +1, yeah, I've come to take joy in the pain of those who present me legal documents for signing. They never expect you to read the bloody thing and always get all cranky about how long you're taking. Apt leases, car loans, new banking accounts, etc.: every single person who's handed me one of these looks so dejected after about 30 minutes of me reading it, plus all the questions I ask as I go, and sometimes I even demand addendums. They shouldn't be handing out legal paperwork for signature if they don't want to be party to a contractual negotiation.

    16. Re:Just don't do it by Anonymous Coward · · Score: 0

      "Best practice from my perspective: do not collect the data at all."

      Who rated this post "Insightful"? Why don't I see any rating buttons? I don't see how this post is "insightful" since it doesn't provide any reasoning.

    17. Re:Just don't do it by maxwell+demon · · Score: 1

      Alternately, people could simply take responsibility for themselves and choose to avoid services which require agreement to miles of terms.

      Unfortunately that would mean having no internet access (good luck finding an internet provider without a big list of terms and requirements).

      --
      The Tao of math: The numbers you can count are not the real numbers.
    18. Re:Just don't do it by nullchar · · Score: 1

      That's great and all, but what happens when you read page 200 and say "uh, I don't agree with this"?

      Response: "Sorry, no house for you!"

    19. Re:Just don't do it by maxwell+demon · · Score: 1

      Who rated this post "Insightful"?

      Someone with mod points.

      Why don't I see any rating buttons?

      Because you only get the option to moderate if you (a) are logged in (you cannot moderate as Anonymous Coward), (b) have enough Karma (which basically means your posts have been moderated up often enough, and certainly more often than down), and (c) happen to have some mod points (even if your Karma is high enough, you'll only get mod points every now and then, and if you don't use them, they'll expire in a few days).

      --
      The Tao of math: The numbers you can count are not the real numbers.
    20. Re:Just don't do it by jittles · · Score: 1

      That's great and all, but what happens when you read page 200 and say "uh, I don't agree with this"?

      Response: "Sorry, no house for you!"

      Well, if the document does not match the preview doc they sent you, or match the terms and rates that they promised you (you get that in writing before you get the contract), then they have to update the contract. There are some crooks out there that will tell you one interest rate and slip another into the docs. You really need to trust your mortgage broker. I used a friend's dad, thankfully. He was very helpful, and I knew him to be honest. He even used his commission on my loan to buy me some points as a house warming gift. Great guy. You just better hope you find someone else like that. If not, you're better to walk away from an unfavorable mortgage than to get screwed.

      Even though I trusted my mortgage broker, I double checked all the rates, fees, penalties, etc. 1) Because banks make mistakes, and also have crooks and 2) Because I planned to prepay the mortgage and did not want any fees for early payment, and I wanted my 5% interest rate as well. 1% makes a HUGE difference over 30 years.

    21. Re:Just don't do it by sapgau · · Score: 1

      Mod Up +1
      No matter what you want to do with PI, you must first check that it is legal, first in your jurisdiction (state or province), then in the country (or countries) where you expect your customers to reside.
      It doesn't matter what good intentions you have, it might not be enough to keep you out of trouble.
      For example, if your jurisdiction forbids you from keeping DOB then make sure you are clean.

  2. risk vs. investment tradeoffs by noh8rz10 · · Score: 4, Informative

    I think your mind is on the right track in identifying your resource limits (i.e. no tip-of-the-spear experts) and the sensitivity of the data (i.e., it's not all nuclear bomb codes). That is the first step. Next, think on the exact types of data that you're collecting, and try to group like data together, for example, all text data, screen caps, keylogging, audio or webcam video if you have it, and find a way to store them in an efficient structure while everything stays linked together. Finally, if possible, associate all data collection events with time (timestamp) and location (gps). this will allow a more complete analysis on the back end.

    1. Re:risk vs. investment tradeoffs by SomePgmr · · Score: 3, Insightful

      Finally, if possible, associate all data collection events with time (timestamp) and location (gps).

      It started getting a little creepy there at the end, bud. ;)

    2. Re:risk vs. investment tradeoffs by Anonymous Coward · · Score: 0

      associate all data collection events with time (timestamp) and location (gps)

      Why location? Why is everybody obsessed with location data these days?

    3. Re:risk vs. investment tradeoffs by Anonymous Coward · · Score: 0

      While you're at it, add some remote desktop feature. You never know when you may need to assist your users in using your app properly. You'll have to get them to install some sort of app exploiting an OS vulnerability for best effect though.

    4. Re:risk vs. investment tradeoffs by Anonymous Coward · · Score: 0

      What, you missed the screen caps and keylogging part?

    5. Re:risk vs. investment tradeoffs by Anonymous Coward · · Score: 0

      "Microsoft: Where are you going today? (never mind, don't tell us...)"

    6. Re:risk vs. investment tradeoffs by Anonymous Coward · · Score: 0

      If they know where you are:
      Lead them to a restaurant down the street.
      Steer them past a store front and entice them.
      Figure out their normal routine and put stuff up in front of them.

    7. Re:risk vs. investment tradeoffs by noh8rz10 · · Score: 1

      Good point. White hat root kits!

    8. Re:risk vs. investment tradeoffs by asylumx · · Score: 1

      If you think that's bad, keep reading:

      this will allow a more complete analysis on the back end.

      He wants to analyze users "back ends"!!!

  3. Don't store the data. by micheas · · Score: 0

    Just don't.

    When you get the expertise to store the data securely then consider it.

    Once you get into the habit of justifying everything that you store you will be less prone to the woops! plain text password/username/real-name/creditcard table being found by intruders.

    1. Re:Don't store the data. by isaaccs · · Score: 1

      To start, I do appreciate the spirit of the comment - as a professional in a field, it's an argument I make often. But I don't totally agree in this context. It would prove extremely difficult, for example, to build a search engine such as Google without collecting or correlating user information. To build Instagram without collecting pictures (which I'd very much consider private user data/personal identifiers) might also prove vexing. The question wasn't "Should I collect user information?" but "How can I do something that I must do - popular opinion of the widespread practice notwithstanding - responsibly?" You suggest that it should only be done when done by an expert: I am admittedly not an expert in securing data. I am an expert in software development, and this is now an area I need to begin to explore. To simply suggest that an ambitious tech startup "shouldn't" innovate in a space because they don't have the material resources to hire an established specialist on one of the myriad topics that go into building a software product is, to me, quite close-minded and defies the spirit of do-it-yourselfedness and indeed innovation that makes the startup space and tech sector so exciting to begin with.

    2. Re:Don't store the data. by sapgau · · Score: 1

      I didn't see gp post as condescending, I think he is trying to make the point of how serious private information storage is.

      I did cringe at his reference to putting password/username/cc in one table, even encrypted. I suggest using hash values to replace those real values, and masking CC numbers. That way, even if the encryption is broken, a hacker would not be able to identify the person.

      Doing this doesn't limit your innovation in any way. It's actually a burden we all have to deal with to avoid a legal bomb landing on your lap.
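
      A minimal sketch of the masking half of that suggestion, assuming the application only ever needs to show the last four digits. The function name and the 12-digit minimum are illustrative choices, not anything specified above, and masking for display is a separate matter from how (or whether) the full number is stored.

```python
def mask_card_number(card_number: str, visible: int = 4) -> str:
    """Return the card number with all but the last `visible` digits starred out.

    Spaces and dashes in the input are ignored. This is for display and logs;
    it says nothing about how the full number is stored, if it is stored at all.
    """
    digits = "".join(ch for ch in card_number if ch.isdigit())
    if len(digits) < 12:
        # Illustrative sanity check only, not real card validation.
        raise ValueError("expected at least 12 digits")
    return "*" * (len(digits) - visible) + digits[-visible:]
```

      Anything downstream of the payment step (receipts, support tools, logs) would then only ever see the masked form.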

       

  4. Let me have a login? by aliquis · · Score: 1

    Let me have a login for the benefit of having my data saved?

    If I don't log in then don't store my details.

    As for the rest whatever. Hash + salt or whatever?

    If no-one can reach / use the data for anything then maybe say just e-mail address or something such as identifier.

  5. I'm an experienced developer by Osgeld · · Score: 0, Flamebait

    I am not an experienced developer, but if I were I sure as shit would not be asking about it on slashdot, to be honest as I stand here now, if I needed some serious advice about any situation, I would not be asking on slashdot, I would be asking on a forum where people live and breathe the topic at hand like their lives depend on it, cause they are professionals, and not the peanut gallery of random trolls, tards, fanbois and neckbeards.

    1. Re:I'm an experienced developer by ThatsMyNick · · Score: 1

      Well, if you are looking for developer/legal opinions there are better forums, but if you want legal, developer and user opinion (and a discussion based on them), slashdot is not bad. Besides, you don't really know that OP has not also posted in a better developer/legal oriented forum (and I find it strange that you mention that you wouldn't post on slashdot, but fail to mention the forum that is appropriate for this question (unless you yourself were just trolling)).

    2. Re:I'm an experienced developer by Anonymous Coward · · Score: 2, Interesting

      Agreed. People mistake this for a technical forum.

    3. Re:I'm an experienced developer by Osgeld · · Score: 0

      I don't know the appropriate forum, as I am not an experienced web developer, nor would I expect any serious answer from slashdot when I do need it. I develop electronics; I don't post which FET has the best ESD damage resistance on slashdot, nor would I expect anything but random opinion from it.

      When you're serious, you get the data from people who have been down that road, and test it yourself, not post to some news recycler and hope for the best.

    4. Re:I'm an experienced developer by noh8rz10 · · Score: 1

      thank you. I'm updating my sig with your quote.

    5. Re:I'm an experienced developer by SomePgmr · · Score: 3, Insightful

      I'd give him the benefit of the doubt, and assume this isn't the only place he's looking for best practices.

      Meanwhile, "I'm an experienced developer, I'm familiar with all the general rules for securing customer data, but I'd like to hear of any 'gotchas' that you know about"? That seems like a reasonable thing to ask.

      Again, assuming this isn't the one-and-only source. So instead of grabbing our pitchforks, maybe someone has some examples of what he asked about?

    6. Re:I'm an experienced developer by Anonymous Coward · · Score: 0

      Welcome to the club!

    7. Re:I'm an experienced developer by Anonymous Coward · · Score: 1

      There's the blatantly obvious stuff: keep the data heavily encrypted on a back-end d/b or file store, on a server nowhere near a public-facing interface (or DMZ); obfuscate and/or consolidate the individual, personal data as soon as you gather it, assuming you don't need specific per user info to be retained. Needless to say, keep all your OS/software/services/apps/etc patched with latest security on a weekly, if not daily basis, FFS!

      Also, invite some wannabe hack-meisters you can kind-of-trust to try & break into your environment, just to see if they can... ;)

    8. Re:I'm an experienced developer by Anonymous Coward · · Score: 0

      Here's the rub - professionals want to get paid.

    9. Re:I'm an experienced developer by Anonymous Coward · · Score: 0

      .... not the peanut gallery of random trolls, tards, fanbois and neckbeards.

      So which category are you in? /. has good comments but you have to wade through the dross.

    10. Re:I'm an experienced developer by Electricity+Likes+Me · · Score: 1

      Isn't your first bit of advice right there a classic gotcha?

      Encryption doesn't mean anything unless the access routes to that encrypted data are well defined and understood - since at some point it has to be unencrypted to be used. So who's doing the unencrypting, who holds the keys etc.

    11. Re:I'm an experienced developer by bloodhawk · · Score: 1

      Not sure why you got marked flamebait. Even as a developer I find your comments spot on. If you are not experienced enough to know the answer to this topic then /. is not the place to be asking, as you won't have the knowledge to sort the garbage from the good advice. Incidentally, I would love to know the name of this new site, as I think it is one I would avoid for my own safety.

    12. Re:I'm an experienced developer by Anonymous Coward · · Score: 1

      the problem is not that /. has both good and bad, it is if you don't know the answer then how the hell do you think they will know enough to sort the good from the bad. reading through he has already gotten a mix of both for this topic.

    13. Re:I'm an experienced developer by isaaccs · · Score: 1

      The question specifically says "I'm not seeking stack or infrastructural recommendations." This is not a technical question. The question is posed to the community as it bears on *social* issues.

    14. Re:I'm an experienced developer by isaaccs · · Score: 1

      In this forum, I submitted to seek the opinions of a community of technically minded individuals on a question that hinges on broader social concern. I did/do not expect a uniform or comprehensive answer. I expected to hear the voices of different people who have thought about, dealt with, or otherwise concern themselves with data collection. I am much aware that this is not a legal or technical venue - and I appreciate your acknowledgement that this may not be the only avenue I've pursued to inform myself.

  6. Don't by SmartyPants · · Score: 4, Informative

    honestly... try not to store it.

    You need to examine why you actually need the data, and if you can't think of a good reason (except it might be valuable in the future), then don't store it.
    If you do need it for analysis, machine learning apps, etc, try to anonymize it as early as possible, and not to keep raw data longer than you need it. (say raw data for 3 months, then just store aggregate info).

    also.. for behavior.. you don't need years of information, studies have shown people change, so make sure the things people do recently are more important, and the old stuff gradually decays.
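
    A rough sketch of those two ideas (a raw-data retention window and a gradual decay of old behavior), assuming events are simple (timestamp, action) pairs. The 90-day window follows the "3 months" example above; the 30-day half-life and all function names are made-up illustrations.

```python
from datetime import datetime, timedelta

RAW_RETENTION = timedelta(days=90)  # "raw data for 3 months"
HALF_LIFE_DAYS = 30.0               # hypothetical: an event counts half as much every 30 days


def decay_weight(event_time: datetime, now: datetime,
                 half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Exponential decay: weight 1.0 for 'just now', 0.5 after one half-life."""
    age_days = max((now - event_time).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def split_by_retention(events, now: datetime):
    """Split events into (still inside the raw window, due for aggregation and deletion)."""
    cutoff = now - RAW_RETENTION
    keep = [e for e in events if e[0] >= cutoff]
    expired = [e for e in events if e[0] < cutoff]
    return keep, expired


def weighted_counts(events, now: datetime) -> dict:
    """Per-action totals in which recent events count for more than old ones."""
    totals: dict = {}
    for when, action in events:
        totals[action] = totals.get(action, 0.0) + decay_weight(when, now)
    return totals
```

    A nightly job could call split_by_retention, fold the expired rows into aggregate counters, and then delete the raw rows.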

    1. Re:Don't by mcrbids · · Score: 1

      As a counterpoint, Don't process all that data.

      At my company, we store everything. Every click, every bit of data, nightly snapshots of all data, etc. Forever. This results in stupid amounts of data about our users and we pretty much don't bother to try to correlate the data, we just provide it upon request of the customer.

      Why try to correlate it, when our customers are eager to pay us to do other things with it? Just because you have the data, doesn't mean you have to be devious with it. Save everything relevant, and then look after the interests of your customers. You'd be surprised at just how far a policy like this works for you!

      Our customers are almost unusually loyal; they almost never leave after 1 year of using our services, and the trust is universally present even when the inevitable problems appear as we upgrade/enhance/update our software.

      The truth is that operating in the best interests of your clients is actually a rather effective business strategy!

      --
      I have no problem with your religion until you decide it's reason to deprive others of the truth.
    2. Re:Don't by stephanruby · · Score: 1

      If your purpose is really just for "analysis and ultimately functionality, not persistence" then there is really no reason to keep an email or a name. Just assign a unique identifier, and then you're done.

      So if for some reason, the user wants to get in touch with you to file a bug report, or what not, then assign a unique identifier for the device to the bug report (in case you get other bug reports coming from the same source), but don't ask for his/her contact information unless the user ticks a box asking specifically for a reply from you. Basically, if you tie your requests for information directly to your user actions, then it will become obvious to your users why you need the information you need.

      And if you need their emails for marketing reasons, basically the marketing department wants it -- then be upfront about that too. If you're upfront and honest with the way you're going to handle or mishandle information, and not try to bury it under vague language and pages and pages of terms and conditions, then I think most users will still be willing to share their information with you. That being said, don't forget to research and comply with local laws too, in the regions where your application(s) will be made available.

  7. Start reading about PII by Anonymous Coward · · Score: 3, Informative

    Wikipedia (http://en.wikipedia.org/wiki/Personally_identifiable_information) is a good start.

  8. Break the association by cheros · · Score: 4, Insightful

    If at all possible, stay away from personally identifiable data. If your aim is to use identity as an index, work out a way in which you can translate an identity into an index or hash value (i.e. one way). This is not going to be perfect (there will be about a million "John Smith"s out there), but if you have a consistent pair such as name and phone number, turn that into a hash and use it as data index.

    That means you can still do correlations, but a leak will not result in exposure of personal data.
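
    One way this could look, as a sketch only. It swaps the plain hash for a keyed HMAC-SHA256, which is an assumption on top of the suggestion above: a bare hash of a name plus phone number can be reversed by simply hashing every plausible phone number, whereas a keyed hash also requires the secret. The key value, the normalisation rules and the function names are all hypothetical.

```python
import hashlib
import hmac
import unicodedata

# Hypothetical secret; in practice it would live in a secrets store kept apart
# from the database, never hard-coded like this.
INDEX_KEY = b"replace-with-a-random-32-byte-secret"


def normalize(name: str, phone: str) -> bytes:
    """Canonicalise the pair so trivial variations map to the same index."""
    clean_name = " ".join(unicodedata.normalize("NFKC", name).casefold().split())
    clean_phone = "".join(ch for ch in phone if ch.isdigit())
    # The NUL separator keeps ("ab", "c") distinct from ("a", "bc").
    return clean_name.encode("utf-8") + b"\x00" + clean_phone.encode("utf-8")


def pseudonymous_index(name: str, phone: str, key: bytes = INDEX_KEY) -> str:
    """Return a keyed, one-way index for a (name, phone) pair."""
    return hmac.new(key, normalize(name, phone), hashlib.sha256).hexdigest()
```

    The same person always maps to the same index, so correlations still work, but the stored value alone does not reveal the name or number.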

    However, first of all, look at what you're holding on personal data and simply assume you got hacked and it's "out there" - plan for that crisis first because there is one question you need to answer:

    If you cannot afford to pay for security advice, can you afford to pay for the inevitable consequences?

    --
    Insert .sig here. Send no money now. Owner may sue, contents will settle. Batteries not included.
    1. Re:Break the association by noh8rz10 · · Score: 1

      If you cannot afford to pay for security advice, can you afford to pay for the inevitable consequences?

      put another way: if you can't afford to do it right, how will you afford to do it again?

    2. Re:Break the association by dgatwood · · Score: 1

      Or keep personally identifiable information separate from everything else. Ensure that you cannot get to one data set from the other and vice versa. Use login information as a hash into the identity database and the behaviors database. If you must store any time stamps on database records, make sure you do so in a way that prevents using them to easily correlate the two data sets (e.g. update the time stamp on the personal info record only when the user changes his/her password, address, or whatever, rather than at every login).

      To the extent possible, store the information locally on the client side, or if you must store it on the server (for synchronizing between multiple computers), encrypt it client-side and send the encrypted blob to the server. Sure, that wouldn't prevent you from getting the information (because you control the site's code), but it does make it unlikely that somebody who compromises your database server can get the information.
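
      A sketch of the "login information as a hash into both databases" idea, under the assumption that the lookup keys are derived from the username and password at login time and never stored together. The labels, salt format and iteration count are illustrative, and a weak password would still be guessable, so this is a starting point rather than a guarantee.

```python
import hashlib
import hmac


def derive_record_keys(username: str, password: str,
                       iterations: int = 200_000) -> dict[str, str]:
    """Derive one lookup key per database from the login credentials.

    The two keys are unrelated without the credentials, so a dump of the
    identity store cannot be joined to a dump of the behavior store.
    """
    salt = b"record-keys:" + username.casefold().encode("utf-8")
    master = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return {
        label: hmac.new(master, label.encode("ascii"), hashlib.sha256).hexdigest()
        for label in ("identity", "behavior")
    }
```

      One consequence of this design is that a password change means re-keying both records, which has to happen while the old credentials are still available.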

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    3. Re:Break the association by Lorens · · Score: 1

      If your aim is to use identity as an index, work out a way in which you can translate an identity into an index or hash value (i.e. one way). This is not going to be perfect (there will be about a million "John Smith"s out there), but if you have a consistent pair such as name and phone number, turn that into a hash and use it as data index.

      Bad idea when you get a hash collision. Account numbers do not have to be seen by the user, but there aren't (m)any useful ways of avoiding their use internally.

      If OP is storing data for analysis and not for immediate reuse, there are some often overlooked but stupidly easy things to do like making sure that the user-facing machines collecting the data only have append/insert access to the data (no read, no modify). Analysing the data would be done from another machine/subnet/database account whatever.
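
      To illustrate the insert-only collector, here is a small sketch using Python's sqlite3 module and its set_authorizer hook. On a server database the same effect would normally come from a database account that has been granted INSERT and nothing else; the SQLite authorizer is used here only because it makes the idea runnable in one file. Table and function names are made up.

```python
import os
import sqlite3
import tempfile

# Actions the collector connection may perform; everything else is denied.
ALLOWED_ACTIONS = {sqlite3.SQLITE_INSERT, sqlite3.SQLITE_TRANSACTION}


def insert_only(action, arg1, arg2, db_name, source):
    """Authorizer callback: permit inserts and transaction control, deny the rest."""
    return sqlite3.SQLITE_OK if action in ALLOWED_ACTIONS else sqlite3.SQLITE_DENY


def open_collector(path: str) -> sqlite3.Connection:
    """Connection for the user-facing collector: it can append rows, nothing else."""
    conn = sqlite3.connect(path)
    conn.set_authorizer(insert_only)
    return conn


def demo() -> tuple[int, bool]:
    path = os.path.join(tempfile.mkdtemp(), "events.db")

    # The analysis side owns the schema and keeps full access.
    analyst = sqlite3.connect(path)
    analyst.execute("CREATE TABLE events (session_id TEXT, event TEXT)")
    analyst.commit()

    collector = open_collector(path)
    collector.execute(
        "INSERT INTO events (session_id, event) VALUES (?, ?)", ("s1", "tap")
    )
    collector.commit()

    try:
        collector.execute("SELECT * FROM events")
        read_blocked = False
    except sqlite3.DatabaseError:
        read_blocked = True  # the collector is not allowed to read back

    rows = analyst.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    collector.close()
    analyst.close()
    return rows, read_blocked
```

      If the collector machine is compromised, the attacker can add junk rows but cannot read or alter what other users have already sent.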

    4. Re:Break the association by cheros · · Score: 2

      He said he had little money available, so I figured I gave him something that was easy vs. perfect. The key question is if the delta introduced by the odd hash collision is actually significant in the volume of data he is planning to process. If it isn't, I would not try to develop perfection - he can use his little funding better elsewhere..

      In other words, in theory you're absolutely right, in practice I suspect there is little difference. But my favourite way of avoiding issues with personal data is simply not collecting them in the first place. Unless you are Google and get away with a pathetic fine, of course..

      --
      Insert .sig here. Send no money now. Owner may sue, contents will settle. Batteries not included.
    5. Re:Break the association by fa2k · · Score: 1

      Great idea for some cases. If you need "telemetry" data to understand how people are using your application, assign each session a unique ID and don't store which user did it. It also works for some other statistical data. The argument against is that you may need the correlation between sessions later.

      Depending on the application, you could have a hierarchical system of databases where the lowest level contains session information, the next contains persistent user information but not personally identifiable info, and the highest level contains username, password, name, etc. You could have just a few components that have access to the top level, including the login component. The latter could load the information into the session state, so you could display the username on every page, for example. It's just something I thought of, I'm not an expert (I wrote a quiz and a page for a small event 8 years ago;).

    6. Re:Break the association by fa2k · · Score: 1

      Regarding my second paragraph, an important part was not obvious: Each session in the session database has a unique ID, and each anonymised user in the middle database has a list of sessions, and each user in the top database points to an anonymised user.
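      The three levels described above could be sketched roughly like this. It is an in-memory illustration only; the class and method names are invented, and a real system would put each level in a separate store with separate access rights.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class TieredStore:
    # Lowest level: session_id -> list of events, no user reference.
    sessions: dict = field(default_factory=dict)
    # Middle level: anonymised user id -> list of session ids.
    anon_users: dict = field(default_factory=dict)
    # Top level: username -> anonymised user id (the only identifying link).
    identities: dict = field(default_factory=dict)

    def register(self, username):
        """Create a random anonymised id and link the username to it."""
        anon_id = secrets.token_hex(16)
        self.anon_users[anon_id] = []
        self.identities[username] = anon_id
        return anon_id

    def start_session(self, anon_id):
        """Open a session that the anonymised user points to."""
        session_id = secrets.token_hex(16)
        self.sessions[session_id] = []
        self.anon_users[anon_id].append(session_id)
        return session_id

    def log(self, session_id, event):
        self.sessions[session_id].append(event)

    def analytics_view(self):
        """What an analysis component would see: no usernames at all."""
        return {
            anon_id: [self.sessions[s] for s in session_ids]
            for anon_id, session_ids in self.anon_users.items()
        }
```

      Only the login component would be allowed to touch `identities`; everything else works from the lower two levels.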

    7. Re:Break the association by isaaccs · · Score: 1

      There is validity to this point, but followed to its conclusion, many of the great boot-strapped startups of our time wouldn't exist. As your exposure and user base grows, so does your ability to consult with specialists and experts - but everyone must start somewhere.

  9. Collect as little as possible, throw it away... by IBitOBear · · Score: 4, Interesting

    I have been toying with a site idea. Your account name is your public key fingerprint. Your public nickname is whatever you use in the message. Your login is validated because everything you send is signed with the key that matches the fingerprint (and encrypted with my public key for transmission). Input to user form is constrained and validated within those constraints (to prevent padding attacks).

    I would then have a database "key x","paid through date y".

    Sure, I couldn't sell any farmed data a-la facebook, but subpoena requests would be a breeze... "here's your hex dump..."

    --
    Innocent people shouldn't be forced to pay for inferior software development.
    --"Code Complete" Microsoft Press
    1. Re:Collect as little as possible, throw it away... by Rob+Kaper · · Score: 1

      I have been toying with a site idea. Your account name is your public key fingerprint. Your public nickname is whatever you use in the message. Your login is validated because everything you send is signed with the key that matches the fingerprint (and encrypted with my public key for transmission). Input to user form is constrained and validated within those constraints (to prevent padding attacks).

      I would then have a database "key x","paid through date y".

      Sure, I couldn't sell any farmed data a-la facebook, but subpoena requests would be a breeze... "here's your hex dump..."

      If you accept payments, wouldn't those keys still be linked to contact information and/or payment transactions?

  10. Give me control and earn my trust by johnnick · · Score: 3, Insightful

    The short requirements:

    1) Explain what you're collecting in real-time at the moment when you give me the option whether or not to permit you to collect it. Tell me what you will use it for, when you will delete it and the consequences if I don't give it to you. People don't read privacy disclosures. Give notice and ask permission at the moment of proposed collection. Make it opt-in, not opt-out.

    2) Only request the information required to perform the service I've requested. Use the information I provide only to provide the service I've requested. Only share the information I provide with third parties to the limited extent necessary to provide the services I've requested. Obtain contractual commitments from those third parties that cause them to protect my information and delete it as soon as they've done what's required to provide the service I've requested. Keep information only as long as necessary to provide the service I've requested and delete it after you've done what's required to provide the service I've requested.

    3) Protect my information. Encrypt in transit and at rest. Delete thoroughly and don't give in to the urge to collect and keep information just because it might be useful some time in the future. You can't lose what you don't have.

    You say the collection "... is for purposes of analysis and ultimately functionality, not persistence." That seems inconsistent with the collection of name and email address. I can't think of too many use cases where you're collecting my name and email address and don't plan to keep it (and use it for marketing or otherwise share it in some way). If you need to contact me or I need to create a user-id that is my email address, you don't need my name.

    Your privacy policy is your contract with your user. It is an operational document that must be consistent with your practices. The privacy policy should be consistent with your policies and procedures. If the information you collect, or the way you handle it changes, you must change your privacy policy.

    --
    "The plural of anecdote is not data."
    1. Re:Give me control and earn my trust by TheDarkMaster · · Score: 1

      I think your answer is the best I've seen for the issue.

      --
      Religion: The greatest weapon of mass destruction of all time
    2. Re:Give me control and earn my trust by Anonymous Coward · · Score: 0

      Only share the information I provide with third parties to the limited extent necessary to provide the services I've requested.

      To most people, this means that they can write the entire application using some Google JavaScript framework because that's easier.

      Any information transmitted to third parties should go via your own server and be anonymized before it even leaves your server, regardless of the agreements made with the third party. Just because I trust you and you trust them doesn't mean that I trust them (see "Web of Trust").

      Just in case, you should also avoid having third parties whose business model is based on analyzing data and filling in the blanks. They might be able to de-anonymize the data, and in that case you saying "but I tried to anonymize it!!" won't matter.

  11. P.S. by IBitOBear · · Score: 1

    Return email will be sent, if necessary, to whatever address(es) are registered in the public key database for that fingerprint, encrypted with that key.

    Obviously I have no control over your passphrase and can do nothing to help you "recover your password" or whatever. Please see your GPG or PGP documentation for a better explanation.

    Your account will not be "renewed" past the key expiration date.

    --
    Innocent people shouldn't be forced to pay for inferior software development.
    --"Code Complete" Microsoft Press
  12. Support OpenID by interval1066 · · Score: 1

    ...and let your users, investors, and you sleep easier at night. Don't store anything at all except a few prefs.

    --
    Python: 'And then suddenly you have a language which says "we're all stuck with whatever the whiniest coder wants".'
  13. Store nothing. by Anonymous Coward · · Score: 0

    "This is for purposes of analysis and ultimately functionality, not persistence."

    Store nothing.
    Ask people what they want from your service.
    Listen to them.

  14. You can't afford it, by your own admission. by VendettaMF · · Score: 3, Insightful

    If you can't afford the expert then you can't afford to collect such data. Move away from this project to something you have the ability to do.

    --
    kartune85 : Incapable of reason, observation or learning. A kind of dim, drab, flightless parrot.
    1. Re:You can't afford it, by your own admission. by Mike610544 · · Score: 2

      If you can't afford the expert then you can't afford to collect such data. Move away from this project to something you have the ability to do.

      I'm surprised it took this long for someone to say that. The people who will exploit your system and extract something valuable from it can afford those experts.

      --
      ... also, I can kill you with my brain.
    2. Re:You can't afford it, by your own admission. by Anonymous Coward · · Score: 0

      Bingo, finally we have a winner. If you don't already have the knowledge to handle this and you can't afford the experts to do it for you or teach you to do it, then you should stay the hell away from this area. Every dev thinks they can write good secure code that protects privacy; the reality is most devs have no concept of the methods and skills that experts in this field can utilise to compromise your system or, more importantly, your users' privacy. Move on to something you are more comfortable with.

  15. OWASP by FormOfActionBanana · · Score: 5, Informative

    OWASP has guidance; for instance, here: https://www.owasp.org/index.php/IOS_Developer_Cheat_Sheet#Insecure_Data_Storage_.28M1.29

    From https://www.owasp.org/images/5/5e/Mobile_Security_-_Android_and_iOS_-_OWASP_NY_-_Final.pdf
    2. Insecure data storage
    Solution
      Avoid local storage inside the device for sensitive information
      If local storage is “required” encrypt data securely and then store
      Use the Crypto APIs provided by Apple and Google
      Avoid writing custom crypto code – prone to vulnerability

    --
    Take off every 'sig' !!
    1. Re:OWASP by fa2k · · Score: 1

        Avoid local storage inside the device for sensitive information

      That does make sense, but it still feels like I've fallen into opposite land.

        Avoid writing custom crypto code – prone to vulnerability

      Yes! I'll repeat it a couple of times

        Avoid writing custom crypto code – prone to vulnerability

        Avoid writing custom crypto code – prone to vulnerability

  16. Book of best practices by Okian+Warrior · · Score: 5, Insightful

    In the US, we have the National Electrical Code which explains in clear detail how house wiring is constructed.

    Following the code is a legal requirement in many (most?) states, but from the point of view of an electrician it's a "book of best practices". Use this gauge wire for this current, staple the wire within 6" of the box, and so on. The code gets revised and added to over time as questions crop up and new technologies get added and people get more experience.

    There's a reason for everything. For example, the light in a bathroom should be on a separate breaker from the outlet next to the sink. It makes sense in retrospect, but this is not something that is obvious beforehand.

    It's very detailed, but also very clear. Homeowners routinely understand the instructions and are able to make simple repairs and modifications to their home wiring which conform to the code.

    We throw a lot of "best practices" around here as if they were simple and obvious at the outset, but maybe they're not. Hash your passwords, salt the hash, sanitize the form inputs, don't keep CC info... lots of best practices which in hindsight make sense but which aren't necessarily obvious beforehand.
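    As one example of what an entry in such a book might look like, here is a minimal sketch of "hash your passwords, salt the hash" using only Python's standard library. PBKDF2 is my choice for illustration, not something the comment specifies, and the iteration count is an assumed value that would need tuning.

```python
import hashlib
import hmac
import secrets

# Assumed work factor for illustration; tune it for your own hardware.
ITERATIONS = 200_000


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

    Store the salt and digest per user and never the password itself; the per-user salt is what keeps two users with the same password from having the same stored value.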

    Most web apps have common requirements for login, identity management, privacy, various forms of functionality, and so on.

    Should we have a "book of best practices"?

    1. Re:Book of best practices by fuzzyfuzzyfungus · · Score: 1

      I suspect that the big problem with that analogy is that data collection(unlike electrical wiring) is a substantially adversarial field.

      There is a certain amount of tension, (fast, cheap, good, pick any two, and the usual buyer/seller desire to not leave money on the table); but the buyer and the seller both share roughly the same ideal, though they may deviate from it out of laziness, cheapness, or incompetence.

      With data collection, the purely security/architectural aspects are somewhat similar; but there is the more fundamental problem that data collection is frequently not for the good of the collected. There is only the merest pretence of aligned interests, and it mostly is a matter of what the collector can get away with.

    2. Re:Book of best practices by khallow · · Score: 1

      The same tension exists in electrical wiring. But one can physically inspect the entire work. With data collection, it's pretty easy to hide what you are doing from the target of your collecting.

    3. Re:Book of best practices by Anonymous Coward · · Score: 0

      Too many software engineers keep reinventing the wheel and making it slightly more trapezoid in shape each time. A "best practices" compendium is basically a book that would never be read in the software industry. Unless there was a huge consulting scam behind it, like Agile/scrum/etc. Even then, it would be poorly implemented.

  17. Aggregate Data by Archangel+Michael · · Score: 1

    Aggregate the data as quickly as possible to anonymize it.

    Collect "Mary did X, Y but not Z", but aggregate it to Three people did X, Two Y and TWELVE Z and drop Mary from the data. You don't need to know Mary did anything.
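    A small sketch of that aggregation step, with made-up event data. The user name is read only long enough to be thrown away.

```python
from collections import Counter


def aggregate(events):
    """Collapse (user, action) pairs into per-action counts, discarding the user."""
    return Counter(action for _user, action in events)


# Hypothetical raw events as they might arrive from clients.
raw = [("mary", "X"), ("mary", "Y"), ("bob", "X"), ("eve", "Z")]
totals = aggregate(raw)
```

    Only `totals` would be stored; `raw` is dropped once the counts exist.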

    --
    Agent K: A *person* is smart. People are dumb, stupid, panicky animals, and you know it.
    1. Re:Aggregate Data by AwesomeMcgee · · Score: 1

      +1 this is exactly what I was going to say and what I have done in the past when presented with these situations. Best bet if you *must* have non-aggregated data is to simply identify each user by a guid that gets embedded in each client, with no identifying information.

      Also there are a lot of laws around the world regarding things like this which can and cannot be tracked *at all* that no amount of legal disclosure will make lawful in some places. Seriously, just avoid any form of identifying data (preferably both remotely *or* locally on the user's device).
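      A sketch of the per-client guid idea, assuming the client can keep a small local file. The file layout and function names are hypothetical.

```python
import json
import uuid
from pathlib import Path


def load_install_id(path):
    """Return this install's random id, creating and saving it on first run."""
    path = Path(path)
    if path.exists():
        return json.loads(path.read_text())["install_id"]
    # uuid4 is random, so it carries no name, email, or hardware information.
    install_id = str(uuid.uuid4())
    path.write_text(json.dumps({"install_id": install_id}))
    return install_id


def make_event(install_id, action):
    """Build the record sent to the server: the guid and the action, nothing else."""
    return {"install_id": install_id, "action": action}
```

      The server can still tell two installs apart, but it has nothing that names the person behind either one.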

  18. What is it for? by Silvanis · · Score: 1

    You say you aren't interested in persistence, so I don't see any reason why the data needs to be personally identifiable. Whether your index is John Smith in Albany,NY or User #71829382 doesn't matter for usage analytics. Even demographic information can at least be stripped of things like name and phone number.

    If you REALLY need to tie this information to a particular instance, then use a hardware key from the mobile device and not a user's information. A hacked phone is easier to deal with than identity theft.

    As someone else mentioned, work from the assumption that anything you save will end up being hacked and used for nefarious purposes. Make the data as useless as possible to a hacker and THEN design the systems and storage to be as hackproof as you can.

  19. Also consider TLDR-TOS by Krishnoid · · Score: 2

    This site provides summaries of the terms-of-service policies for various companies covering privacy, retention, and use of user information. You can use it to compare your plans with those of major companies and identify privacy or TOS concerns you may have overlooked.

  20. "We aim to be completely transparent and honest" by stiebing.ja · · Score: 2

    +5 Funny

    --
    I lag
  21. License under AGPLv3 by Anonymous Coward · · Score: 0

    Doing this means that you will really respect the privacy of anyone using your software since they would have the source code to do as they wish.

    https://en.wikipedia.org/wiki/AGPLv3

    Give your users Freedom and they'll respect you.

  22. Meaning: by Anonymous Coward · · Score: 0

    Be very careful indeed what you really need, and collect only that. The less data you collect the less you have to worry about.

    Note that the easy cop-out is to stick someone else with the trouble, like "supporting" facebook logins or something, but that's actually worse. The why is left as an exercise, but rest assured that if you do that I'm certainly never going to sign up with you, just like I won't be signing up with facebook, or google accounts, or any of the others you might be "supporting".

  23. FBI Laptop by Anonymous Coward · · Score: 0

    Cut out the middle man. Just put it straight onto an FBI laptop.

  24. On a need to know basis only by istartedi · · Score: 1

    My car insurance company needs to be able to pull my DMV records, perhaps even periodically. They could retain *none* of that information and ask me to visit a web site periodically where the info gets entered so they can do the query (and then forget the information required to perform the query). Most customers wouldn't mind them holding that information; but if I'm *that* security minded and they make it clear to me that I'll have to hit their site once a month to maintain my insurance... well... There are always trade-offs, aren't there?

    Just ask yourself, "what do I need in order to serve my customers?". Yes. REAL customer service. People doing good things for other people, and getting paid for it as opposed to just herding people like cattle and exploiting them. Yeah, I know. Strange concept.

    That can be a very successful business model though. Zappos is said to be very customer focused, and AFAIK they are very successful. I can't say I care much for their work environment; but I believe that's a separate issue from customer service. I mean, do you really need to have conga lines and party with your co-workers after hours to be spot-on with a customer? I don't think so; but I just can't think of a good counter-example off the top of my head.

    --
    For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
  25. Best Practices For Collecting and Storing User Inf by Anonymous Coward · · Score: 0

    Consider if your service is intended for only one country or several (a global service).
    Regulations on user information/user data is VERY DIFFERENT in different parts of the world.
    There are quite a few countries where parts of the population have first-hand experience of the severe implications of wrongfully used user information like
    -Friends
    -Location
    -Behaviour
    -Etc
    In some of these countries data collection is outright forbidden or the user data can never leave the country.

  26. Payment Receipts by IBitOBear · · Score: 1

    Not for any longer than necessary. Likely I would make that opt-in.

    I would have a payment history (bob paid x dollars for y time) as an atomic event. Bob could check a box to say "remember this for me", or not at the time of payment.

    At the time of payment I would also send Bob a receipt. That receipt would say "Bob paid for a service". The receipt would also contain a dot-splash (e.g. a QR code or another linear or 2D barcode, depending on how much info space I turn out to need) that was the "proper join record for the database" (e.g. the key tuple that proved that payment X was for service Y on date Z). That tuple would be encrypted with _my_ secret key. Bob could use this receipt by sending it back to me, but I would only have that record until the payment cleared and was essentially irreversible, or when Bob sent it back via email or phone scan etc.

    The actual membership information that Key X was paid-up-and-valid until Date Y would be a separate entry.

    Think double entry accounting but where the account holder and not the institution had the journal that collated things.

    With no start date, and if a person could buy any amount of time, which would be necessary because the key the customer made is only valid till it expires and that expiration date is chosen to the second by the key creator.

    There is some ability to back-figure the expiration dates to the purchases and so the purchasers while both sets of data are present, so the user would have the option to "randomize the duration", e.g. for gambling a little of the funds paid they would gain or be shorted a random amount of time within a reasonable percentage of the purchase duration.

    The idea is that, at every chance, you give the user the magic cookie, to join the information, but you keep the results. As long as the cookie is cryptographically secured it doesn't matter that they are holding it.

    It wouldn't be that hard to figure out who paid what when, when the user base is first started out, but as the base and transactions mounted the anonymity of payments would increase.

    So imagine you want to buy a year, and your public key is good for at least a year, you could buy a year as one transaction, or cut it up into several transactions (like 2 and 3 and 7 months each) to get the year, or you could buy eleven months and bet a month hoping to go long not short. Without the record that you get into your exclusive custody, there is no good way to ask the site how 12 months ended up on that key from which purchaser.

    If you invalidate your key, you get no money back. If you lose your receipt you have nobody to blame but yourself. That's the risk you take for your privacy. It's basically using an information system hole to make things same-as-cash.

    I haven't figured out how to deal with credit card "charge-backs" or fraudulent disputes. I'd rather take the gift-card route for payment if it came to that kind of problem.

    You could, I suppose, put people who paid via revocable means (like credit cards) in "risk pools" and if someone games, you penalize the pool but let people out of the pool using their receipts as proof that they are not the scammer. As each person used their receipts to change pools, the pool would get smaller but each member would lose more, until only the scammed account and people who didn't care or lost their receipt would lose anything.

    The idea started out as more of a social media/blog/rant site idea than a profit oriented thing, but I could make it work pretty easily. The "business rules" for an anonymized service seem totally workable, but the anonymous people would have to accept some of the risk for the privacy.

    People who opt in to having _me_ keep the payment records are, of course, buying the surety of service for the loss of anonymity, at least in part.

    And the un-paid people are much less work (e.g. none) to track.

    And a spammer would lose all their content for spamming, as the account and all its content would be forfeit as a single "hide/delete where" equivalent action. So rather than make fake accounts on my system they would be "paying" CPU to make keys and encrypt and sign their transmissions to me. Not impossible to script but pretty hard to javascript.

    --
    Innocent people shouldn't be forced to pay for inferior software development.
    --"Code Complete" Microsoft Press
  27. The little niceties by IBitOBear · · Score: 1

    There would be other little niceties.

    Aggressive use of POST instead of GET messages on all forms, so that pen-trap requirements, if levied, would be largely moot, as in: user XXXXXX did POST to "/" at this site on these dates and times. [POST data is not legal to collect in pen traps in the USA as I understand the law.]

    Services a site could sell? POST the URL you want as part of the encrypted blob you send to this site, we will retrieve it, scrub it and send its content back to you encrypted with your key.

    Pay for encrypted, advertisement free page delivery with/without the unpaid people's noise at your leisure. 8-)

    Encrypted mail box where the records in the mailbox are encrypted to your public key the instant we get it if the "From" matches particular criteria you specify. (This burns time off your subscription key expiration date etc., so you might not want to encrypt "from *".)

    Note that this is not a bar to law enforcement if they show up with a court order to "tap" a particular key going forward. It is a barrier to having law enforcement fish into your past. I am not a lawyer so I don't know if this last bit is legal, it's just the noise floating in my head.

    Of course such a site would have no way of knowing whether the "identity" information in the key, if any, was real so just as I could make a key that said I was both Mittens and B.H.O. today, anybody would be foolish to assume that an unsigned and unverified key was anybody it claimed to represent.

    In short the site design is not to confound the law, but to make the entire issue of identity "Somebody Else's Problem" since I want to be in the business of passing messages for fun and profit, not being the arbiter of who is whom.

    (you should see my thoughts on replacing DNS... 8-)

    --
    Innocent people shouldn't be forced to pay for inferior software development.
    --"Code Complete" Microsoft Press
  28. Don't by Anonymous Coward · · Score: 0

    Only collect what is absolutely necessary at that particular junction; afterwards move that data to a different datastore where only legally required information is stored (e.g. for tax purposes) and everything else is omitted. Encrypt it. Don't store IPs longer than 30 days. Don't set permanent cookies.

  29. Use by MrKaos · · Score: 1

    /dev/null

    --
    My ism, it's full of beliefs.
  30. Functionality Only, Eh? by Anonymous Coward · · Score: 0

    From the article,

    I'm a mobile developer at a startup. My experience is in building user-facing applications, but in this case, a component of an app I'm building involves observing and collecting certain pieces of user information and then storing them in a web service. This is for purposes of analysis and ultimately functionality, not persistence. This would include some obvious items like names and e-mail addresses, and some less obvious items involving user behavior.

    If the intended reason is as stated, then why store the names and email addresses at all? Analysis of user behaviour in the aggregate does not require individually-identifiable information be collected, much less stored.

  31. Read "Translucent Databases" by Peter Wayner by cornicefire · · Score: 1

    It explains how to store personal information so it can be used correctly. http://wayner.org/node/46

  32. So it sounds like... by Anonymous Coward · · Score: 0

    Either you're truly concerned about what parts of the interface/product/service they use the most and how, or you're collecting sales and marketing data so you can trim the fat and rape people. If the first is the case, just put a counter and/or timer on everything so every time it is accessed, clicked, etc, it is counted and you can see the amounts of time they are spending doing what without ever collecting any personal data at all. That would be completely anonymous and give you all the data you need to build a better interface. If the latter is the case go jump in a river from a very high bridge.

  33. Collecting Personally Identifiable Information by Rozzin · · Score: 2

    On passwords, I liked Jeff Atwood's article, `You're Probably Storing Passwords Incorrectly'.

    For Personally Identifiable Information (PII), I liked Brian Danger Graham's article, `What's in a name database?'.

    --
    -rozzin.
  34. Policies, Procedures, Standards, Trust all Useless by anorlunda · · Score: 1

    If your company goes bankrupt, or is sold to another, all its assets become the property of someone else. That someone cannot be constrained to respect anything you have promised. You may not even have the opportunity to wipe disks or change passwords.

    For example, a hospital failed to pay the rent on a warehouse storing patient records. The landlord seized and sold those records as scrap. None of the hospital's patient privacy obligations transfer to the landlord, or to the scrap dealer.

      Heed the advice of others who told you don't do it.

  35. Ugh by Anonymous Coward · · Score: 0

    "How would you like information collected about you to be stored?" ::Must resist unhelpful comment::

  36. Keep it on the user's computer, not in the cloud by jbrohan · · Score: 1

    Obviously not the solution for everybody. We write apps for Android Tablets (for old people actually). All the data like Name, email, pictures, and messages are stored in the Android tablet and kept on the Cloud only until they are downloaded. They are encrypted, even the pictures, while waiting on the Cloud database. In the registration part of the app the user does type in his email, but we do not keep it. How to contact the user? We put a record in a table which is checked periodically by an active user's android and they can get the message. Payment is tricky since the PayPal record contains the email of the payer and the AndroidId of the user's tablet. It's just a matter of throwing away data that does not belong to me!

  37. Google Mobile Analytics by monkeyhybrid · · Score: 1

    Although you state you're not looking for stack or infrastructure recommendations, I'd still recommend having a look at Google Mobile Analytics. They have an SDK for Android and iOS that makes it very easy to integrate in your apps.

  38. Please by Anonymous Coward · · Score: 0

    Let us know the name of this startup, so that we may avoid it like the plague.

  39. It Is a Matter of How to Encrypt by trydk · · Score: 1

    I think everybody would agree that the data should be encrypted, but often the problem with encryption is access to the data. If the server-side application stores the encryption key, this key could potentially be found (maybe through a vulnerability) and thus give access to the entire database.

    Best practice is to encrypt each record with a unique key. This key could be generated by some unique identifiers per user like Visible User ID (maybe E-mail address) and Password and Hidden User ID (different from the visible and generated independently from it) and Android ID.

    To create the database entry:
    1. Collect the information to store
    2. Key = Hash(Visible User ID, Password, Hidden User ID, Android ID)
    3. Send Visible User ID and Key to a receive-only system with as little an Internet Surface as possible (i.e. one that is next to impossible to hack into, if done correctly) -- This information is used to retrieve the user data for analysis and such
    4. Store the information in the more accessible database encrypted with the key

    To retrieve the data from a user's application:
    1. Collect Visible User ID, Password, Hidden User ID and Android ID
    2. Key = Hash(Visible User ID, Password, Hidden User ID, Android ID)
    3. Use the key to retrieve the necessary data

    To retrieve the data from the inside:
    1. Use the User ID to Key data to decrypt the data
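    Step 2 of the scheme above, Key = Hash(Visible User ID, Password, Hidden User ID, Android ID), could be sketched as follows. SHA-256 and the length-prefixing of each field are my assumptions; the comment does not name a hash or say how the fields are combined. The encryption step itself is not shown and should come from a vetted library.

```python
import hashlib


def derive_record_key(visible_id, password, hidden_id, android_id):
    """Hash the four identifiers into a 32-byte per-user key."""
    h = hashlib.sha256()
    for part in (visible_id, password, hidden_id, android_id):
        data = part.encode("utf-8")
        # A length prefix keeps ("ab", "c") from hashing the same as ("a", "bc").
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.digest()
```

    Because a password is one of the inputs, a deliberately slow derivation such as `hashlib.pbkdf2_hmac` would probably be a more defensible choice than a single fast hash.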

  40. What's so hard? by Anonymous Coward · · Score: 0

    Don't do anything special, just store it in a database.

    Try to not get it compromised, but if it gets compromised, well, who cares?

  41. Don't. by BVis · · Score: 1

    Analyze data on a nightly basis. Store the results. Scrub database after results are stored. The asshole MBA that your startup hires because it isn't making enough money then has nothing to turn around and sell for a quick buck.
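    A rough sketch of that nightly job using SQLite. The table and column names are invented for the example.

```python
import sqlite3


def make_db():
    """Set up a hypothetical raw-events table and a results table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_events (user_email TEXT, action TEXT)")
    conn.execute("CREATE TABLE daily_totals (action TEXT, total INTEGER)")
    return conn


def nightly_rollup(conn):
    """Fold raw events into per-action totals, then delete the raw rows."""
    with conn:  # one transaction: both steps are committed together or rolled back
        conn.execute(
            """
            INSERT INTO daily_totals (action, total)
            SELECT action, COUNT(*) FROM raw_events GROUP BY action
            """
        )
        conn.execute("DELETE FROM raw_events")
```

    Note that a plain DELETE does not by itself guarantee the bytes are gone from disk or from backups, so the scrub needs to cover those too.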

    If you have to store *anything at all*, hire the expert. Can't hire the expert? Your startup is inadequately funded.

    --
    Never underestimate the power of stupid people in large groups.
  42. Simple by Anonymous Coward · · Score: 0

    Just store it on an FTP site somewhere with a secret IP address no one will find it, I promise.

  43. Some advice by Minupla · · Score: 1

    Disclaimer: I work in the field, but do not have nearly enough information on your particular situation, jurisdiction, etc to provide detailed recommendations. What follows is basic best practice stuff based on my jurisdiction and market sector.

    * First, for any sensitive information you are collecting, ask if you really REALLY REALLY need it. This stuff is toxic waste. Your first and best defense is not to store it if you don't need it.

    * A hash of something like an SSN, telephone number, etc. is worthless in terms of protecting you. Hashes are only useful if the search space is large enough to make a full search of it computationally infeasible. Computing 1 billion SHAs is not computationally infeasible. Also, hashes are typically only useful if what you want to do is compare two values, e.g. passwords. If you're trying to anonymize, hashing a PII (personally identifiable information) element doesn't anonymize the data, as it doesn't break the PII link.

    * DON'T WRITE YOUR OWN ENCRYPTION. EVER. Unless you have a deep deep background in crypto and submit your alg for peer review for years before using it, just don't.

    * Consult a good lawyer. There can be pits in here that you might not think of, particularly if you don't have a security dept with someone who spends their time dealing with privacy issues. A good lawyer won't say "You can't do that"; a good lawyer will outline the risks that you will be running and let you accept them, just like a good risk management dept will.

    * Use the security controls in your database. If your client doesn't need to access the hashes because they're being computed by a stored procedure, then the user your client uses to access the database shouldn't have access to the hashes. Same goes for salts, only more so. I've seen too many apps written using one user for everything. Don't do this.
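
    To make the hashing point above concrete, here is a small Python sketch of a full-space search against an unsalted SHA-256 of a short numeric identifier. The function names are hypothetical, and the width is kept small so it runs in a moment; a 9-digit SSN is the same loop over 10**9 candidates, which is well within reach of ordinary hardware.

```python
import hashlib
from typing import Optional


def sha256_hex(value: str) -> str:
    """Unsalted SHA-256 of a digit string, as a hex digest."""
    return hashlib.sha256(value.encode("ascii")).hexdigest()


def recover_digits(target_hash: str, width: int) -> Optional[str]:
    """Try every zero-padded number of the given width until one matches."""
    for n in range(10 ** width):
        candidate = f"{n:0{width}d}"
        if sha256_hex(candidate) == target_hash:
            return candidate  # the "anonymized" value is recovered
    return None  # not a digit string of this width
```

    Calling recover_digits(sha256_hex("04217"), 5) walks at most 100,000 candidates and hands back the original value, which is why the hash alone does not break the PII link.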

    Hope some of that helps you.

    Min

    --
    On the whole, I find that I prefer Slashdot posts to twitter ones because I don't get limited to 140 chars before
  44. Lost cause by Anonymous Coward · · Score: 0

    Just go ahead and store it in an unencrypted Excel file on Amazon S3, and save the bad guys the 5 minutes it's going to take to break through whatever well-meaning safeguards you put in place.

  45. Reading it. by hendrikboom · · Score: 2

    Here in Quebec, the notary actually reads the entire document to you and asks you enough questions that he is sure you've understood it.

    1. Re:Reading it. by jittles · · Score: 1

      That is probably how it should be. Most people have a hard time understanding the kind of language they use in contracts anyway. I had the advantage of working for lawyers for most of my college career, first doing help desk for the Attorney General and later doing transcription as a pseudo-legal secretary for a large bankruptcy firm. Otherwise, the contract might have seemed like Latin...