Washington State Archives Go Digital
prostoalex writes "USA Today and dozens of others report that the Washington state archives went online. Over the past two years project participants scanned 1 million documents issued by state and county authorities. The archive is located at my alma mater, Eastern Washington University (go Eagles!). The 800 terabyte storage system was developed by Microsoft and EDS."
Just in case someone actually wanted the address for the archives, it's http://www.digitalarchives.wa.gov/
Posting on EDS site
From the NYSE Site
%blow
%blow: No such job
^how did the sex change go?
Modifier failed
The Provincial Archives of New Brunswick have been like this for quite some time now, with birth, death, and marriage certificates and census records. I have been able to search for information about my family history online using their handy dandy search tool, as well as by visiting the Archives themselves at the University of New Brunswick. It never occurred to me that others might be trying to catch up, but I guess that this type of service isn't something that most governments deem necessary for the public.
"Well you're not Fiona Apple, and if you're not Fionna Apple, I don't give a rat's ass."
That's right, you'll put all the data online in single partition disks hung off one server. Why didn't I think of that?
While there is nothing to stop an NTFS partition being 800TB, it is far more likely that some sort of nearline hierarchical storage is being used, the sort of system that is used the world over in workflow/image systems.
http://www.digitalarchives.wa.gov/
"God fights on the side with the best artillery." - Napoleon, Marshal of France - speaking truth to power
DjVu is a web-centric format and software platform for distributing documents and images. DjVu can advantageously replace PDF, PS, TIFF, JPEG, and GIF for distributing scanned documents, digital documents, or high-resolution pictures. DjVu content downloads faster, displays and renders faster, looks nicer on a screen, and consumes fewer client resources than competing formats. DjVu images display instantly and can be smoothly zoomed and panned with no lengthy re-rendering. DjVu is used by hundreds of academic, commercial, governmental, and non-commercial web sites around the world.
DjVuLibre is an open source (GPL'ed) implementation of DjVu, including viewers, browser plugins, decoders, simple encoders, and utilities.
I'm using Firefox (from Windows sadly) and I can access the content just fine.
As for OS X and Linux users, a plug-in is needed to view the content, but they report that they support OS X and "UNIX". The plug-in is called DjVu and has an open source equivalent at SourceForge (with RPMs, OS/2 and even Cygwin support).
Get your Unix fortune now!
The system isn't 800TB, but will scale to 800TB, according to this EDS press release. In fact, given that they've spent a mere $2.5M (powerpoint!) there's not a hope in hell that they've got 800TB! The powerpoint says it's a 5TB EMC SAN & an ADIC tape library for backup.
An interesting point is that they're delivering the documents using DjVu by LizardTech. The open source implementation, DjVuLibre, is GPLd and was developed by the creators of DjVu in conjunction with LizardTech (after a period of LT not-getting-it). The DjVuLibre home page is here. LizardTech still have the best encoders for the format.
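For anyone who wants to check the "not a hope in hell" arithmetic, here is a rough sketch in Python. It uses only the figures quoted in this thread ($2.5M, 800TB, 5TB) and assumes, purely for illustration, that the entire budget went on storage, which it almost certainly did not. The function name is made up for this example.

```python
# Back-of-the-envelope check using figures quoted in the thread.
# Assumption: the whole $2.5M budget is treated as storage spend,
# which overstates what was actually available for disks.
BUDGET_USD = 2_500_000
CLAIMED_TB = 800    # capacity the system is said to scale to
INSTALLED_TB = 5    # size of the SAN mentioned in the presentation


def usd_per_tb(budget_usd: float, capacity_tb: float) -> float:
    """Return the budget available per terabyte of capacity."""
    return budget_usd / capacity_tb


if __name__ == "__main__":
    print(f"At {CLAIMED_TB} TB: ${usd_per_tb(BUDGET_USD, CLAIMED_TB):,.0f} per TB")
    print(f"At {INSTALLED_TB} TB: ${usd_per_tb(BUDGET_USD, INSTALLED_TB):,.0f} per TB")
```

Even with the whole budget counted as storage, 800TB would leave about $3,125 per terabyte to cover the SAN, tape library, software and integration, while the 5TB figure works out to $500,000 per terabyte for the project as a whole.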
What years? This database seems to be limited to older archives... the most recent year for a record I found was 1965.
-Joe