Slashdot Mirror


To Purge Or Not To Purge Your Data

Lucas123 writes "The average company pays from $1 million to $3 million per terabyte of data during legal e-discovery. The average employee generates 10GB of data per year at a cost of $5 per gigabyte to back it up — so a 5,000-worker company will pay out $1.25 million for five years of storage. So while you need to pay attention to retaining data for business and legal requirements, experts say you also need to be keeping less, according to a story on Computerworld. The problem is, most organizations hang on to more data than they need, for much longer than they should. 'Many people would prefer to throw technology at the problem than address it at a business level by making changes in policies and processes.'"
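
The storage figure in the summary can be reproduced with a quick back-of-the-envelope check. This is only a sketch: the function names are made up for illustration, and it assumes each year's 10GB per worker is paid for once at $5 per gigabyte. The summary does not say whether older data is backed up again in later years, so the second function shows that alternative reading as a separate, labeled assumption.

```python
def backup_cost_flat(workers, gb_per_worker_per_year, dollars_per_gb, years):
    """Cost if each year's new data is paid for exactly once."""
    return workers * gb_per_worker_per_year * dollars_per_gb * years


def backup_cost_cumulative(workers, gb_per_worker_per_year, dollars_per_gb, years):
    """Assumption (not in the summary): everything retained so far is
    backed up again every year, so year k pays for k years of data."""
    total_gb_years = sum(range(1, years + 1))
    return workers * gb_per_worker_per_year * dollars_per_gb * total_gb_years


flat = backup_cost_flat(5000, 10, 5, 5)
cumulative = backup_cost_cumulative(5000, 10, 5, 5)
print(f"flat:       ${flat:,}")        # flat:       $1,250,000
print(f"cumulative: ${cumulative:,}")  # cumulative: $3,750,000
```

The flat reading matches the $1.25 million quoted in the summary.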

6 of 190 comments

  1. Easier to keep by Geoffrey.landis · · Score: 5, Insightful

    The problem is that it's easier to just archive the cruft than it is to go through it all and figure out what's worth keeping.

    --
    http://www.geoffreylandis.com
    1. Re:Easier to keep by Daimanta · · Score: 5, Insightful

      True, proper archiving takes huge amounts of time since it adds overhead to your operation.

      In an ideal world, everything that you store is automatically labeled and old data is automagically purged. But storing all kinds of shit is just that much easier. It also doesn't help that data storage is so dirt cheap. 1TB can be bought for around $100 if I am not mistaken. It doesn't pay to kill old useless stuff you have floating around on your hard disk.

      --
      Knowledge is power. Knowledge shared is power lost.
    2. Re:Easier to keep by daeg · · Score: 5, Insightful

      The bigger problem is that you will fight different battles. If you're fighting a sales rep who sold your clients to a competitor, you want as much ammunition as possible. If a client is suing you for incorrect information relayed 8 years ago and you're probably guilty, you want as little information as possible.

    3. Re:Easier to keep by cmause · · Score: 5, Interesting
      There used to be a sort of gentlemen's agreement between attorneys to not dig in to electronically stored information (ESI). That was back when everything important ended up on paper anyway, which was discoverable.

      As time went on, fewer things ended up on paper, but the rules of discovery didn't evolve. That was the time of backing up a U-Haul full of printed-out copies of every file, e-mail, etc. that a company had. The opposition then had to dig through mounds of trash in the hope that they would find that one incriminating document.

      Then attorneys got more savvy, and in the so-called Rule 26 conference (named for Rule 26 of the Federal Rules of Civil Procedure), the attorneys would agree on the format of ESI to be exchanged. In December 2006, the Federal Rules of Civil Procedure were amended to directly address ESI and electronic discovery.

      Now, in litigation, parties may still get obnoxious amounts of data, but it's electronic. Once it's processed and converted (usually to TIFFs with extracted text, but sometimes PDF), attorneys can do what amounts to a Google search through the files and find what they want pretty quickly. In fact, paper documents are usually scanned and OCRed so they can be handled and searched in the same manner.

      > Actually, I thought it was a fairly common legal tactic to make the data as difficult to actually find as possible, without revealing too much to the other side.
      >
      > "They want records from three years ago? Send a truck with printouts of all the files we have, that'll keep them busy..."
      >
      > Does anyone know if this is no longer the case?

      So no, it's no longer the case. But the first guy who did it must have thought he was pretty funny.

  2. 10 GB user data? Not likely by arth1 · · Score: 5, Insightful

    10 GB of data per user, sure.
    10 GB of user data, no way.
    Assuming 300 eight-hour work days per year, that would mean that the average employee creates about 1.2 kB of data per second.

    The only way this could be true is if they count data that isn't user-generated, taking the total data storage for the company and dividing it by the number of employees.
    If so, users deleting their e-mails won't have much of an effect.
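
The per-second figure above can be checked with a few lines. A minimal sketch, assuming decimal gigabytes (10^9 bytes) and an 8-hour work day; the 8-hour day is an assumption, since the comment only gives the 300 work days, but it is the value that reproduces the quoted rate.

```python
BYTES_PER_GB = 10**9  # decimal gigabyte, an assumption


def bytes_per_second(gb_per_year, work_days, hours_per_day):
    """Average data rate over working time only."""
    working_seconds = work_days * hours_per_day * 3600
    return gb_per_year * BYTES_PER_GB / working_seconds


rate = bytes_per_second(gb_per_year=10, work_days=300, hours_per_day=8)
print(f"{rate / 1000:.1f} kB/s")  # 1.2 kB/s
```

Spread over all 24 hours of each of those days instead, the same function gives roughly a third of that rate, which is still a lot of typing.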

  3. Re:hmm by MrMr · · Score: 5, Interesting

    The top 500 company I worked for did just the opposite: destroy all data in case a legal issue came up.
    They called it 'desk cleanout day', and unless you were an official dedicated contact on a particular subject you were to wipe all correspondence more than a year old.
    (There were also other grades of information, but erase after a year was the default).
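
A rule like the one described above could be sketched roughly as follows. Only the one-year default and the exemption for dedicated contacts come from the comment. The `contract` grade, its seven-year period, and all function and parameter names are hypothetical placeholders, since the other grades are not specified.

```python
from datetime import date, timedelta

# Retention period per information grade, in days.
RETENTION_DAYS = {
    "default": 365,       # "erase after a year" default from the comment
    "contract": 7 * 365,  # hypothetical example of another grade
}


def should_wipe(created, today, grade="default", is_dedicated_contact=False):
    """Return True if an item is past its retention period.

    Dedicated contacts on a subject are exempt and keep their correspondence.
    Unknown grades fall back to the default period.
    """
    if is_dedicated_contact:
        return False
    days = RETENTION_DAYS.get(grade, RETENTION_DAYS["default"])
    return today - created > timedelta(days=days)


today = date(2008, 6, 1)
print(should_wipe(date(2007, 1, 15), today))                             # True
print(should_wipe(date(2008, 1, 15), today))                             # False
print(should_wipe(date(2007, 1, 15), today, is_dedicated_contact=True))  # False
print(should_wipe(date(2007, 1, 15), today, grade="contract"))           # False
```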