Software Sorts Electronic Evidence
securitas writes: "The New York Times has a very interesting article about the legal industry using new search software to sort through electronic evidence such as e-mail, documents and recovered files, and the process that they go through to make the evidence usable. It has spawned an industry."
This is why it is critical to have a retention schedule so old emails don't come back to bite you. I haven't seen employees take this issue seriously even when there is a clear policy.
I thought the most interesting part was: "Responding to a request for documents during a merger review by the Federal Trade Commission took one company nearly a year because the electronic documents were kept in offices all over the world and in all manner of different formats". But the twist there is: would you like to have everything easy to find (and if so, how do you do that, what with all the fragmentation between office 97, office 2000, not to mention different file servers and such)? Or are you better off making it inconvenient for other litigants to get at the data?
I knew a guy that used to write code for the defense (electronic) industry. He told me that it could cost up to about $75 million to trace someone's email or telnet sessions if they did it smart enough. The problem comes out trying to get so many companies and foriegn countries to cooperate with you enough to look at their logs. He said it was because there isn't much goodwill towards the US in many parts of the world.
The continuing investigations into the Clinton presidency ran into a major problem. Every email sent to, and within, the administration was saved. But not indexed. Apparently there are terabytes of the stuff. On tape. Sorting through it looking for evidence is likely to take years. And then it has to go to the national archives. And that's just the email. Add in all the other digital content they generated, all of which had to be saved, and there's a major problem.
Best Slashdot Co
I worked for one of the state level governments in N.A. and had access, and "da-buck-stops-here" responsibility for the IT side of "Archives". Archives is leglislatively required to hold in permenant storage, "All materials relating to the ongoing business of the government". This caused some real problems:
1) we had a case of an outgoing elected official low level formatting their HDD on the way out the door. Had to be sent out to a special data recovery lab. (they can do some amazing things with scanning electron microscopes on half tracks and such)
2) there are stacks and stacks of 8" floppy disks, in formats like IBM DisplayWriter, and other chunks of physical hardware that haven't been seen by mortal man in 20 yrs.
Finding a chunk of info is damn tricky, but after you find it, you have to find something that can read the punchcard/papertape/magtape/floppydisk/harddisk in question. And due to a querk in how the original act was written (keeping in mind that these things were written back when data was carved on rock slates and format isn't a big consideration) we were required to keep it in its original form.
I feel for someone with my job in 50 yrs. I ran away from govt work after that. It was scary!
One plus side. EMP has a hard time taking out papertape!
On the whole, I find that I prefer Slashdot posts to twitter ones because I don't get limited to 140 chars before
In reality, the biggest difference between grep and so-called "forensics" software is the emphasis on examining the data without modifying it and maintaining the chain of custody and audit trail. In fact, many experienced computer investigators do their jobs with little more than DD, grep, and various other Unix utilities. Most of the digital forensics software out there simply attempts to make this funcionality more accessable to your less tech saavy investigator. (The problems caused by inexperienced/unqualified investigators performing this type of analysis are beyond the scope of this response.)
I am currently the designer and project lead for a cross-platform open source (GPL) digital evidence processing suite. It is intended to bring together the various functionalities required to perform this type of work, and (ideally) operate on whatever platform the investigator desires. Our primary development platform is RedHat 7.1.
There are currently software packages out there that attempt to do this, including EnCase and The Forensic Toolkit in the commercial arena and The Coroner's Toolkit in the open source arena, however they lack the broad filesystem support and/or true ease of use to make them usable by everyone. The other barrier is price as EnCase, for example, costs thousands of dollars per copy.
We're well funded, and have already done a significant amount of work. We have some of our core components functional and plan on starting beta testing and releasing our first code drop later this year. If this field interests you and you'd like more information, or you work in the investigative field and have thoughts on what you'd like to see in such a tool, I'd love to hear from you.
---
It seems to me that this could also bring about another niche for programmers... software that goes through a companies systems and completely wipes out data. I'm sure this would cause havoc with a company's ISO procedures, but with more and more companies getting caught because of e-mail or other data that is archived on their tapes, I could see the need to make sure that they have control over the "evidence" that exists on their own systems.
I'm not saying it's the right thing to do, or even that it's legal, but since most companies are not even aware of what data they do have backed up, and what is retrievable and what isn't, I could see this happening, if it hasn't already.