Slashdot Mirror


Obama Orders Federal Agencies To Digitize All Records

Lucas123 writes "President Obama this week issued a directive to all federal agencies to upgrade records management processes from paper-based systems that have been around since President Truman's administration to electronic records systems with Web 2.0 capabilities. Agencies have four months to come up with plans to improve their record keeping. Part of the directive is to have the National Archives and Records Administration store all long-term records and oversee electronic records management efforts in other agencies. Unfortunately, NARA doesn't have a stellar record itself (PDF) in rolling out electronic records projects. Earlier this year, due to cost overruns and project mismanagement, NARA announced it was ending a 10-year effort to create an electronic records archive."

6 of 186 comments

  1. Unit Of Measurement by wideBlueSkies · · Score: 5, Funny

    So, how many Library of Congress equivalents worth of material are they intending to scan??

    --
    Huh?
    1. Re:Unit Of Measurement by Anonymous Coward · · Score: 5, Funny

      At least 1.

  2. Lockheed Martin by dg41 · · Score: 5, Insightful

    Looking at the NARA article, as soon as I saw that some big IT contract was given to Lockheed Martin I saw all I needed to know about this initiative.

  3. Re:seriously, how hard is this? by kiwimate · · Score: 5, Interesting

    Is there some complication I don't understand?

    Yes. More than one.

    Nothing fancy, just a database of scanned forms in pdf format and the like.

    There's the first problem. It's never simple.

    First issue - if you're going to put documents in, you're going to want to get them out. How do you search for them? You're going to want to define the metadata, and that's a headache. Got lawyers? They'll want client and matter. But those fields are just about meaningless to anyone else. How do you resolve the incompatibility? Do you use different forms for different groups of users? How will the engineering department find the subpoena papers that the lawyers filed?

    What fields are globally useful? Are they so generic that any search will retrieve hundreds of documents? Conversely, are they so specific as to make your metadata field selections horribly long and therefore ambiguous? (Free text metadata? Let's not go there.)

    Remember that you've got to fill in that metadata any time you add a document. What's the balance between useful and annoying? Too many fields and nobody will want to fill it in. Too few, and you won't be able to find anything.
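
    One way to picture that balance, as a minimal sketch only: a handful of required global fields plus optional fields that only one group cares about. The field names (`title`, `author`, `doc_date`, `client`, `matter`, `project`) and the `validate_profile` helper are hypothetical and not taken from any particular DMS.

```python
# Hypothetical metadata schema: a few global fields everyone must fill in,
# plus optional fields that only matter to one group (e.g. legal).
GLOBAL_REQUIRED = {"title", "author", "doc_date"}
GROUP_FIELDS = {
    "legal": {"client", "matter"},
    "engineering": {"project"},
}


def validate_profile(profile: dict, group: str) -> list[str]:
    """Return a list of problems; an empty list means the profile is acceptable."""
    problems = []
    # Every required global field must be present.
    for name in sorted(GLOBAL_REQUIRED - set(profile)):
        problems.append(f"missing required field: {name}")
    # Anything beyond the global fields must be defined for this group.
    allowed = GLOBAL_REQUIRED | GROUP_FIELDS.get(group, set())
    for name in sorted(set(profile) - allowed):
        problems.append(f"field not defined for group {group!r}: {name}")
    return problems


subpoena = {"title": "Subpoena response", "author": "jdoe",
            "doc_date": "2011-11-28", "client": "ACME", "matter": "0042"}
print(validate_profile(subpoena, "legal"))        # no problems for the lawyers
print(validate_profile(subpoena, "engineering"))  # client/matter mean nothing here
```

    The second call shows the incompatibility: the same profile that is complete for legal carries fields that engineering's form does not even define.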

    That's for new documents. When you first implement a DMS, you have a truckload of documents to be imported. You're not going to do it manually; you're going to use an auto-import. But how do you define the metadata for all those millions of documents you're importing? What if you have client/matter, for instance? Hopefully they're all already sorted, and you can use something like Kofax Capture, together with a seriously powerful and fast scanner and separator sheets on which you can do forms recognition to define the metadata fields. But there's a lot of work involved up front to get that import working properly.

    Don't forget the OCR. Hopefully all your paper documents are clean and will OCR nicely, so you can do full text indexing.

    Security. Better get that set up right. Profile level security? It's more secure, but people will complain. With profile level security, a document you don't have permission to access won't even show up in your search results, so users can't tell that the document is there and that they just need to request access. Groups. And by the way, remember to define the permissions on all those millions of documents you're importing.
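
    The search-results behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not how any specific product works: the `Document` class, its `allowed_groups` field, and the `search` function are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    title: str
    allowed_groups: set = field(default_factory=set)


def search(docs, query, user_groups):
    """Return matching documents, silently dropping any the user may not see."""
    hits = [d for d in docs if query.lower() in d.title.lower()]
    # Security trimming: a hit with no overlapping group is removed entirely,
    # so the user gets no hint that it exists.
    return [d for d in hits if d.allowed_groups & user_groups]


docs = [
    Document("D1", "Subpoena filing", {"legal"}),
    Document("D2", "Subpoena handling procedure", {"legal", "engineering"}),
]
visible = search(docs, "subpoena", {"engineering"})
print([d.doc_id for d in visible])  # D1 is simply absent, not "access denied"
```

    An engineer searching for "subpoena" sees only D2 and has no way of knowing D1 exists, which is exactly the complaint.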

    Version control. How do you control check in and check out? Do you control check in and check out, or just audit it?

    I've only just scratched the surface of a document management system. Then there's records management. You'll want to make sure your system is DoD 5015.2 compliant. Setting up the retention schedules...hopefully you've got a records retention policy already, otherwise that's months' worth of work to define those policies and ensure you comply with all regulatory requirements while still balancing your need to purge/archive old records.
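
    To make "retention schedule" concrete, here is a minimal sketch. The categories and the year counts in `RETENTION_YEARS` are made-up placeholders, not regulatory values, and `disposition_date` and `is_due` are hypothetical helper names.

```python
from datetime import date

# Hypothetical retention schedule: record category -> years to keep.
# The numbers are placeholders, not real regulatory requirements.
RETENTION_YEARS = {"contract": 7, "correspondence": 3}


def disposition_date(category: str, closed_on: date) -> date:
    """Earliest date a record in this category may be purged or archived."""
    years = RETENTION_YEARS[category]
    try:
        return closed_on.replace(year=closed_on.year + years)
    except ValueError:
        # Closed on Feb 29 and the target year is not a leap year.
        return closed_on.replace(year=closed_on.year + years, day=28)


def is_due(category: str, closed_on: date, today: date) -> bool:
    """True once the retention period for this record has run out."""
    return today >= disposition_date(category, closed_on)


print(disposition_date("contract", date(2011, 11, 28)))
print(is_due("correspondence", date(2011, 11, 28), date(2013, 1, 1)))
```

    The code is the easy part. Deciding what goes into that table, and what event starts the clock, is the months of policy work.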

    How does something even become a record? Hopefully you've already got knowledgeable librarians (yes, that's what they're called), and you just need to train them on your new RM system.

    Are all your boxes already barcoded? Your RM system should be able to register where a record is - building, shelf, box.
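
    Registering where a record lives could look something like this sketch. The `Location` and `LocationRegistry` names, the record ID, and the barcode format are all invented for illustration.

```python
from typing import NamedTuple, Optional


class Location(NamedTuple):
    building: str
    shelf: str
    box_barcode: str


class LocationRegistry:
    """Maps a record ID to the physical place where the record is stored."""

    def __init__(self) -> None:
        self._by_record = {}

    def register(self, record_id: str, location: Location) -> None:
        self._by_record[record_id] = location

    def locate(self, record_id: str) -> Optional[Location]:
        # Returns None for records that were never registered.
        return self._by_record.get(record_id)


registry = LocationRegistry()
registry.register("REC-1", Location("Annex A", "S-14", "BX-000123"))
print(registry.locate("REC-1"))
print(registry.locate("REC-2"))  # unregistered record
```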

    You're probably getting the idea. The technology is easy. The processes are complicated, and they get exponentially more complicated as the size of your client base grows.

  4. Re:You've been smoking the hope by forkfail · · Score: 5, Informative

    Actually, I went and read the executive order here:

    http://www.scribd.com/doc/74042394/Managing-Government-Records-November-28-2011

    which says nothing about Web 2.0 itself. Nor about moving to the cloud. The requirements laid out there are business level, and basically translate to the following: "You have 120 days to come up with system level requirements to move our data from hard copy to soft copy."

    With this said, the section from the order that you're quoting is 2-b-i. It refers to the need to have a unified solution for archiving all existing electronic communication. Would you prefer that every department and agency have its own? And here I thought you might be in favor of cutting costs and improving efficiency.

    Finally, your link shows that Obama has issued 17 signing statements in 3 years. That's about 6 per year. Bush issued 161 over 8 years. That's about 20 per year. The number of executive orders is similar. And honestly, the Democrats in Congress didn't play the cloture games that the Republicans play now. They made a huge stink about the ONE appointment that the Democrats tried to block (remember the chants of "up ur down! up ur down!"). Now, the Republicans won't let a damn thing to the floor of the Senate for a vote that doesn't explicitly further their causes. In other words, false equivalence fail.

    --
    Check your premises.
  5. In the Archival Trenches... by jlaprise1 · · Score: 5, Informative

    As a professional historian who has worked in the National Archives in College Park, MD and at four different presidential libraries, which incidentally are also managed by NARA, I need to interject that this is an immense, costly, but valuable project.

    Remember "the warehouse" from the Indiana Jones movies? NARA is a little like that in terms of size but is better organized. Aisle upon aisle, shelf upon shelf, row upon row, room upon room, floor upon floor, building upon building of neatly indexed banker's boxes with labelled folders of documents. The labels may have been checked by the archivists at NARA, but they may also simply be the labels affixed to the records by the source federal agency. The individual documents in folders are almost never labelled. In the course of my work, I gathered 30k digital pictures of documents over the course of two months. The acquisition process sounds deceptively easy. Look in the index, find key words and request boxes from the archivist. Then you look through folders to locate individual documents. In point of fact, I probably visually scanned 3M pages to see if they were "interesting" and photo worthy for future research, usually taking only a few seconds per page to make a snap judgement. My decisions on which boxes of documents to request were far more time consuming. What is the right keyword for talking about computers in government in 1970? If you said "information automation" then you would be right. A few presidential libraries (Ford especially) have updated electronic files for indexing, which is a huge advantage.

    On my trips to the archives, it was interesting to see both professionals and amateurs using a range of technologies. I saw really old school researchers using 3x5 note cards and taking notes on legal pads. They sometimes supplemented their work by photocopying really important documents at $.75/copy. Some researchers avoided this cost by using flatbed scanners which they carried in with them. Still other researchers brought in high end digital cameras and tripods. I used a digital camera freehanded. All of these people still need to find a way to actually get into physical proximity with the records. Digitization would open up a new era in research.

    On the metadata issue, most of these records already have copious amounts of metadata recorded in well-established fields that are used by NARA.

    On the OCR issue, some documents have hand-written notes on them which would not be machine readable and sometimes are not human readable. It is likely that the documents will have to be digitally scanned and flagged if handwriting is detected.

    Making these records available to the general public would be a huge advantage to anyone interested in government and US history. Come to think of it, in terms of size and complexity, it would be a worthy challenge for Google. U.S. government documents run back to the founding of the country and the number of documents only increases over time.