Slashdot Mirror


How Would You Archive Mounds of Genealogy Data?

dexter riley asks: "Hello, all. My mother, a librarian, historian and genealogist for over twenty years, died about a year ago. She left a huge amount of genealogy information, culled from books, magazines, and the internet, mostly in the form of typewritten, photocopied, and printed pages. My main goals are: Preservation - converting the documents into a compact format that can be easily copied and transferred to others; and Indexing - making it possible for someone else to easily find the documents referring to a particular person, family, place, or document type (like land, marriage, military, birth or death records). To this end, I would like to convert her work into a format that can be stored digitally and scanned for keywords, to make it easier for others to use this information for their genealogy projects later on. What tools do you recommend for handling a project of this size?

I'd estimate there are at least 10,000 pages of documents in all. Much of it is organized by binder into family groups, but a lot of it is unorganized, loose paper. Besides being an irreplaceable resource for any future genealogists in my family, there are other researchers working on related lines that may find some part of this data useful. At the very least, I would like the satisfaction of keeping some part of her work from being lost for a few years more.

Here's a general list of things that I've determined I would need:
  • Scanners: What flatbed scanners would you recommend for fast, high-resolution scanning of documents?
  • Image formats: What lossless image formats would you scan your original documents into?
  • OCR software: Although OCR is not perfect, would you recommend using it to allow keyword searching of the original documents? If so, which software would you suggest?
  • Document Indexing: In addition to OCR, are there other tools (document tags?) that you would use to help classify and organize images and other digital documents?
  • File organization software: Ultimately, many thousands of text and image files will be generated. Since I don't want to just convert a paper mess into a digital mess, what tools would you use to organize related image and text files?
Did I miss anything in the above list? Any suggestions you all might have would be hugely welcomed."
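
For the indexing goal described above (finding every page that mentions a particular person, family, place, or record type), one possible approach is a simple inverted index built over the OCR output. The following is only a minimal sketch under stated assumptions: it assumes each scanned page has already been OCR'd to plain text, and the page IDs, sample text, and the function names `build_index` and `search` are all hypothetical, made up for illustration.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page IDs whose text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(page_id)
    return index


def search(index, *keywords):
    """Return the sorted page IDs that contain every one of the keywords."""
    matches = [index.get(k.lower(), set()) for k in keywords]
    if not matches:
        return []
    return sorted(set.intersection(*matches))


# Hypothetical OCR output, keyed by a made-up "binder/page" naming scheme.
pages = {
    "binder03/page_0012": "Marriage record of John Riley and Mary Smith, Kent County, 1842",
    "loose/page_0450": "Land deed, Kent County, granted to Thomas Riley",
}

index = build_index(pages)
print(search(index, "Riley", "marriage"))
print(search(index, "Kent", "Riley"))
```

Because OCR misreads words, an index like this will miss some pages, so it would make sense to keep the original page image next to each text file and treat the index as a finding aid rather than a complete record.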

1 of 73 comments

  1. iPod?? by SimianOverlord · Score: 0, Troll

    I find the most convenient method of carrying reams of data around is my iPod. All you need to do is scan in all your documents and use it like you would any other storage device. The advantages of this are:

    1) You can also listen to music and
    2) You could convert your genealogy data into music notes, record them into MP3 or AAC files, and listen to them. If you developed enough facility with this music-language (musuage), you could listen on the hoof and answer questions relatives may have in real time.

    Perhaps Aunt Nora may approach you at a BBQ and ask you about your mother's brother-in-law's second cousin's purported relationship to Henry VII. One quick spin of the patented iPod wheel later, and you're listening to that relationship aurally and giving her a running commentary on that side of the family, whilst thoughtfully munching on a burnt sausage roll. I can see big things with this approach.

    --
    My sister is very, very charming - Nietzsche