Slashdot Mirror


Checking Web Content for Sensitive Data?

NetFiber asks: "I work as a security analyst for a large university. We have recently been tasked to scour our network in the hopes of finding and removing sensitive information such as credit card numbers, social security numbers, and such on all publicly available web servers. Our current method of analysis is to archive all the content (which often grows over 100GB) and later parse the data with various utilities and regexes that search for patterns and other pertinent information. So far, this process has proven to be rather cumbersome and time consuming. Does anyone have any experience collecting and sanitizing large amounts of web content? If so, what procedures/utilities do you use to accomplish this?"
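As a rough illustration of the regex-based parsing described in the question, here is a minimal Python sketch. It is not the poster's actual tooling: the names `CARD_RE`, `SSN_RE`, `luhn_ok`, and `scan_text` are hypothetical, and the Luhn checksum filter is an added assumption meant to cut false positives from arbitrary digit runs.

```python
import re

# Candidate card numbers: 13 to 16 digits, optionally separated by single
# spaces or hyphens. (Assumed pattern; tune for your own data.)
CARD_RE = re.compile(r"(?<!\d)\d(?:[ -]?\d){12,15}(?!\d)")
# Candidate SSNs in the common dashed form, e.g. 123-45-6789.
SSN_RE = re.compile(r"(?<![\d-])\d{3}-\d{2}-\d{4}(?![\d-])")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scan_text(text: str):
    """Yield (kind, matched_text) for each suspicious string in text."""
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if luhn_ok(digits):
            yield ("card", m.group())
    for m in SSN_RE.finditer(text):
        yield ("ssn", m.group())
```

A function like this could be run over each file as it is crawled rather than after the whole archive is collected. Dashed nine-digit strings that are not SSNs will still match, so hits need human review.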

5 of 44 comments

  1. Re:Visa PCI CISP is a good set of practices by scdeimos · Score: 2, Insightful

    PCI/CISP does have software process recommendations for securing credit card data, but it consists largely of recommendations for people and facility processes.

    I believe the original requestor is asking about software to help automate/speed up monitoring and scanning of content that's being put up on web sites by staff and/or students.

  2. Re:Visa PCI CISP is a good set of practices by plover · Score: 4, Insightful
    I believe the original requestor is asking about software to help automate/speed up monitoring and scanning of content that's being put up on web sites by staff and/or students.

    I know what he's asking for, and I answered with what it takes to make it happen for real. The answer is that the various teams storing the data need to be held accountable for storing it securely. Just grepping for and deleting a database holding SSNs isn't enough -- his university has to make sure that all the teams are educated neither to ask for nor to store SSNs. They'll also benefit from a cohesive policy that gives specifics, such as "replace SSNs with student ID numbers."

    If this is just some security manager saying "go find SSNs and wipe 'em out" then they're up the creek. For every database they clean up, someone else will have created a new one. They'll be ignored and stonewalled by teams who have neither the time nor the budget to comply. This sort of thing has to come down from the board of regents, and they have to put the responsibility on everyone, otherwise they're just pissing in the wind.

    --
    John
  3. johnny i hack stuff by wwest4 · Score: 2, Insightful

    JIHS comes to mind.

  4. Re:The answer is simple by SlowDancing · Score: 2, Insightful
    Do nothing. Given enough time, some industrious hacker will find all the data for you.

    I think the OP may be hoping for that, since they're posting on Slashdot and have disclosed the identity of the university just as cleverly as any redacted PDF would.

  5. One tip by Anonymous Coward · Score: 1, Insightful

    is to stop using Social Security numbers. Another is to stop using Social Security numbers. Yet another is to stop using Social Security numbers. And yet another is to stop using Social Security numbers.

    Is your university contributing to the students' Social Security accounts for some unknown reason? If not, there's no legitimate reason for the school to continue to use students' Social Security information.

    Same with birth dates. In grade school, along with permanent records, we were assigned a student ID number. Thirty years later, I still remember mine. If maintaining Social Security information is absolutely necessary, which it most likely isn't, there's absolutely no issue with manually maintaining, in a notebook (remember those?), with a pencil or pen (remember them?), a two-column chart that correlates Social Security numbers with arbitrarily assigned student ID numbers issued by the college or university for identification purposes.

    For every reason the university gives for maintaining Social Security numbers, birth dates, or other sensitive information, there's a reason and a method that shows that it isn't necessary. A couple of universities that made news in the recent past because of sensitive data breaches announced that they'll no longer use Social Security numbers for identifying students. If they can do it, so can your institution. No excuses. Stop using Social Security numbers.

    Ask your institution this: if Congress enacted a law that said that a university could be held financially liable for the consequential damages of a data breach involving Social Security numbers, and the liability could extend to all of the endowments given to the university by all past alumni, would the university continue to use Social Security numbers as identifiers, or would it find and implement a different identification system rather than risk losing its entire endowment?

    The simple answer is to stop using Social Security numbers. And to stop using Social Security numbers.

    As for the other part of your post, wtf is the university doing storing credit card numbers on its computers?