Slashdot Mirror


Databases and Privacy

A couple of stories made an interesting juxtaposition today. First read this story about information marketers scouring public records to compile personal information. Note the emphasis on cross-linking data from various sources to provide more information than any one source did - databases are synergistic. Now read this column about David Nelson, and its follow-up.

12 of 173 comments

  1. Access public records online by Anonymous Coward · · Score: 3, Informative

    Yes, you can access public records online for free:

    Public records online

    Free directory Info from AT&T

  2. Good thing databases are perfect! by plopez · · Score: 5, Informative

    Seriously, I spend a large amount of my time working with gov't. and private databases and info sources. Reconciling different views of the universe is nearly impossible. When I read about people cross-referencing databases, I think of the amount of checking, QA and scrubbing required to have any confidence in the results, and it is horrendous.

    Example: person A gives you a download from their database into a spreadsheet, person B (who may actually work for the same agency or company) supposedly gives you the same information, but the 2 versions do not match.

    And this is assuming that there are other areas where they may or may not be in alignment (e.g. abbreviations, type of info gathered, spelling variations etc.).
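A minimal sketch of the "two exports that should match" check described above, assuming each export has already been loaded as a list of dicts sharing a key column. The function name `diff_exports` and the `id` key are hypothetical, chosen only for illustration.

```python
def diff_exports(export_a, export_b, key="id"):
    """Compare two record lists that are supposed to hold the same data."""
    a = {row[key]: row for row in export_a}
    b = {row[key]: row for row in export_b}
    report = {
        "only_in_a": sorted(a.keys() - b.keys()),   # records B never sent
        "only_in_b": sorted(b.keys() - a.keys()),   # records A never sent
        "field_mismatches": [],                     # same key, different values
    }
    for k in sorted(a.keys() & b.keys()):
        for field in sorted(a[k].keys() | b[k].keys()):
            if field == key:
                continue
            va, vb = a[k].get(field), b[k].get(field)
            if va != vb:
                report["field_mismatches"].append((k, field, va, vb))
    return report


# Made-up sample data: the same person, but the two sources abbreviate differently.
export_a = [{"id": 1, "name": "David Nelson", "state": "IL"},
            {"id": 2, "name": "Ann Lee", "state": "TX"}]
export_b = [{"id": 1, "name": "David Nelson", "state": "Ill."},
            {"id": 3, "name": "Bo Chan", "state": "CA"}]
report = diff_exports(export_a, export_b)
```

Even this toy case shows the noise problem: the "IL" vs "Ill." mismatch is not a real disagreement, just an abbreviation difference that someone has to scrub by hand or by rule.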

    Now take the combinatorics of tens of thousands of gov't and private DB's, and you will understand that:
    1) A good clean DB is horrendously expensive.
    2) Driven by the profit motive, most companies are unwilling to take the time and spend the money to properly QA and scrub their data.
    3) Much of the cross matching is therefore useless due to noise.
    4) TIA is totally bogus. See above.
    5) Having some anonymous DB of information tracking your life is very scary.

    --
    putting the 'B' in LGBTQ+
    1. Re:Good thing databases are perfect! by stanwirth · · Score: 4, Informative

      Actually, governments and corporations are very willing to spend tremendous amounts of money on:

      • data cleansing and QA
      • data warehousing
      • surrogate key generation
      • data correlation
      • data mining
      • geocoding (linking an address to a lat/lon, identifying the lat/lon with a neighborhood, municipality, county, state, country; linking a lat/lon to an address)
      • database integration
      • data migration
      • legacy systems
      • data audit trail generation
      • dataset purchases
      It's not "impossible" to reconcile different data on the same subjects, it's just a whole lot of work, much of it analysis and data discovery, and being able to do the work typically requires that you be familiar with a variety of RDBMS's, billing engines, debt engines, file formats and platforms. The combinations are almost endless.

      Take heart. You'll start seeing the same kinds of problems over and over: middle initial vs. middle name, spacing and capitalisation issues, address data entered as a small number of big long strings that needs to be parsed out into attributes, date/time format inconsistencies, record doubling, data integrity issues (1 supposedly unique key identifying multiple distinct records), data accuracy issues (data way out of range, data incorrect), null values with meaning, attributes used to identify a range of different things, "smart keys" that are not so smart being used to code everything about a customer in 8 characters, and so on and so forth. And you'll know to look for these "usual suspects" first, and develop some standard ways of dealing with them.
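Two of the "usual suspects" above (middle initial vs. middle name with spacing and capitalisation issues, and one supposedly unique key identifying multiple distinct records) can be sketched roughly as follows. This is an illustration under assumptions, not a standard method: the helper names and the `cust_id` / `name` fields are hypothetical, and the capitalisation rule is deliberately crude (it would mangle names like "McDonald").

```python
import re
from collections import defaultdict


def normalize_name(raw):
    """Collapse whitespace, fix capitalisation, reduce middle names to initials."""
    parts = re.sub(r"\s+", " ", raw.strip()).split(" ")
    parts = [p.strip(".").capitalize() for p in parts if p.strip(".")]
    if len(parts) >= 3:
        middle = [p[0] for p in parts[1:-1]]   # "Alan" and "A." both become "A"
        parts = [parts[0], *middle, parts[-1]]
    return " ".join(parts)


def find_key_collisions(records, key="cust_id"):
    """Return keys that still map to more than one distinct normalised name."""
    seen = defaultdict(set)
    for rec in records:
        seen[rec[key]].add(normalize_name(rec["name"]))
    return {k: sorted(names) for k, names in seen.items() if len(names) > 1}


# Made-up sample records.
records = [
    {"cust_id": "A1", "name": "David A. Nelson"},
    {"cust_id": "A1", "name": "david  alan   NELSON"},  # same person, messy entry
    {"cust_id": "A1", "name": "Ozzie Nelson"},          # integrity problem
    {"cust_id": "B2", "name": "Ann Lee"},
]
collisions = find_key_collisions(records)
```

Normalising first means the two spellings of the same David Nelson stop looking like a conflict, so what remains in `collisions` is the genuine integrity problem.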

      Metadata management and ETL tools make the job easier, but as you say, data are imperfect. There are plenty of legitimate applications--every merger, acquisition and JV is yet another opportunity for some more mind-numbing, back-breaking, soul-destroying, spirit-crushing DB work. Oh goody. That's why they call it "work," I suppose. I'm surprised the work Neo was doing in The Matrix -- before he found his "calling" so to speak--was something as creative and interesting as software development. The real grind is the big databases. As you so aptly point out.

      Many industries have, as their primary asset, data and data only. Banking and insurance are the classic examples. Companies in these industries are certainly willing to invest in their most important asset, because just about all the money in the world is in databases.

      A database is like a gun. It can protect you, it can kill you. You can shoot yourself in the foot, somebody else can take you out in a 'hunting accident.'

      The difference between a database and a gun is that a gun needs someone behind it pulling the trigger. A database, OTOH, has triggers that can fire based on whatever criteria have been set--like when a 'David Nelson' tries to fly to Peoria. Yah, it's scary, all right.
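To make the trigger point concrete, here is a small sketch using Python's built-in `sqlite3` module and an in-memory database. The schema (`watchlist`, `bookings`, `alerts`) and the trigger name are invented for illustration and say nothing about how any real screening system is built.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE watchlist (name TEXT PRIMARY KEY);
    CREATE TABLE bookings  (id INTEGER PRIMARY KEY, passenger TEXT, destination TEXT);
    CREATE TABLE alerts    (booking_id INTEGER, reason TEXT);

    -- Runs after every booking insert; writes an alert only when the
    -- passenger name exactly matches a watchlist entry.
    CREATE TRIGGER flag_watchlisted AFTER INSERT ON bookings
    BEGIN
        INSERT INTO alerts
        SELECT NEW.id, 'name match: ' || w.name
        FROM watchlist w
        WHERE w.name = NEW.passenger;
    END;
""")

conn.execute("INSERT INTO watchlist VALUES ('David Nelson')")
conn.executemany(
    "INSERT INTO bookings (passenger, destination) VALUES (?, ?)",
    [("David Nelson", "Peoria"), ("Ann Lee", "Peoria")],
)
alerts = conn.execute("SELECT booking_id, reason FROM alerts").fetchall()
```

Nobody "pulls" this trigger: it fires on the string alone, so every David Nelson who books a flight gets flagged, whichever David Nelson the list was meant for.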

  3. Re:The New Government Blacklist by Anonymous Coward · · Score: 2, Informative

    Actually, John Ashcroft flew commercial airlines until July, 2001. cbs news

  4. Marketing In Texas by Anonymous Coward · · Score: 1, Informative

    Down San Antonio way, they'd say of you "He NEEDED killin'"

  5. I work for a "Risk Management" company.. by booms · · Score: 5, Informative

    And honestly, you'd be surprised how many privacy laws we have to follow (which is a good thing). For instance, we only sell accounts to people who have a legitimate purpose for searching information (such as insurance companies when you apply for insurance, law enforcement agencies to track down criminals, collection agencies who are trying to track down people who skip payments, etc.). If I were to search for information about someone besides myself or others in the development team who have agreed to let me search their names, even when testing, I'd be fired within the hour. We have a compliance department who keeps track of all searches, has to report them to various authorities, etc. If someone searches for someone marked as a celebrity, their account is shut down within minutes and one of our compliance people is on the phone getting documentation about why they searched for that name. In fact, the applications to get to the data we sell are quite nasty, and we only have a very narrow scope of people that we can sell data to.

    I think in general, personal data is protected more than you would think (at least public records, credit agency data, etc)-- I really have no idea how these 'unscrupulous' companies get by with public data without having anyone come down on them. I'm a privacy & security advocate, and I don't feel what I do crosses my moral boundaries (at least at this point).

    1. Re:I work for a "Risk Management" company.. by booms · · Score: 4, Informative

      Like I said, I don't know how other companies get around all of the various laws. He also violated FCRA by getting information about you which was used in a decision to "allow or deny credit" without being a place which is certified for that, and that carries a pretty nasty penalty as I understand it. I don't know the specifics, as IANAL.

      I can see why the local police would probably not do much about it to be honest, but they are lazy for not pointing you in the right direction. If you want, I can ask around to see who the proper authorities would be to report this occurrence to.

  6. Re:Random Lies by Cygnusx12 · · Score: 5, Informative

    Anyone else? I Lie. Sometimes I'm a yak herder with a yearly income of ~$6000, other times I'm a "Decision Maker" with a yearly income of $800k+.

    As someone who used to work in database aggregation with this sort of data, I can tell you that we correlated income as a function of your home value (which is freely available right down at your local county courthouse in most states).

    You typically don't have 800k/yr decision makers living in 12k/yr apartments. There's a process in compilation here, they don't just enter this into a database and sell it.
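A rough sketch of that kind of sanity check, assuming income is estimated as a fixed fraction of home value and self-reported figures far outside that estimate get flagged. The `ratio` and `tolerance` values are invented placeholders, not figures from any real aggregator, and the function names are hypothetical.

```python
def estimate_income(home_value, ratio=0.3):
    """Guess yearly income as a fraction of home value (ratio is an assumption)."""
    return home_value * ratio


def flag_implausible(claimed_income, home_value, tolerance=3.0):
    """True if the claimed income is more than `tolerance` times above or
    below the estimate derived from home value."""
    estimate = estimate_income(home_value)
    return claimed_income > estimate * tolerance or claimed_income < estimate / tolerance


# The two personas from the parent post, both living in the same $150k house.
yak_herder_flagged = flag_implausible(6_000, 150_000)
decision_maker_flagged = flag_implausible(800_000, 150_000)
honest_answer_flagged = flag_implausible(50_000, 150_000)
```

With these placeholder numbers both made-up personas get flagged and the ordinary answer passes, which is the point: a lie on a survey form is easy to discard once it is compared against a public record.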

  7. Re:Some comfort by hswerdfe · · Score: 3, Informative

    There need to be some simple rules on databases and collection of information.

    One I am partial to is:
    Any person should have the right to request a copy of any and all information a company or government agency stores about them.

    I find it strange when I can't even look at data that is specifically about me.

    That's the only one I have seen so far that doesn't have much of a downside... anybody have any more?

    --
    --meh--
  8. Gilmore v. Ashcroft by tsvk · · Score: 3, Informative

    From the second "David Nelson" article:

    Dennis Radke finds it ominous. "Given sufficient time, is it unreasonable to expect we Americans will be required to carry travel papers inside the U.S., just as residents of Nazi Germany and Stalin's Soviet Union" did?

    As previously reported on Slashdot, the issue of requiring ID when traveling within the US has already been challenged as unconstitutional. EFF co-founder John Gilmore sued the government and two airlines for not letting him board aircraft without ID.

    See his site for history and court documents.

  9. Re:Google by scrod · · Score: 2, Informative
    The threat exists today that one may end up on a terrorist watch list simply because of their searching habits.

    Fortunately there are always public proxy servers, and of course this google search proxy available on google-watch as well:
    http://www.google-watch.org/cgi-bin/proxy.htm
  10. Actually in the EU it functions like that by aepervius · · Score: 2, Informative

    I can't cite the exact paragraph but a piece of the law says that "everybody has a right of checking and rectification for every database he is written in. Be it COMMERCIAL or GOVERNMENTAL".

    AFAIK, this is exactly why the EU protested against the APIS/CAPS program: it would violate this fundamental law (data would go to the US govt without right of rectification in case of error and would stay there for an unknown time).

    --
    C. Sagan : A demon haunted world:
    http://www.amazon.com/gp/product/0345409469/
    visit randi.org