Slashdot Mirror


Multi-page PDF To Multi-page TIFF and Archiving?

GeorgeMonroy writes "One of my clients has aperture cards that they have been scanning into multi-page PDF files — but now they want them in multi-page TIFFs instead. One of the reasons they gave for this is that TIFF files require less storage space. While that is true, I wonder if TIFF is the best format going into the future. Are TIFFs better than PDFs for future use? I wonder what format you think would last longer. Are there any other formats that you think would be better or more future-proof? To me, storage is not a good enough reason to go to TIFF, because storage prices are always dropping anyway. Also, since they already have many of these files in PDF format and they want to convert them into multipage TIFFs, are there any programs that you can recommend that will perform batch processing of files so that we do not have to convert each PDF one by one? If another file format is better than TIFF, then are there any programs for batch processing that you can recommend?"

4 of 125 comments

  1. Re:Are these things images or documents? by pclminion · · Score: 4, Informative

    If they're images, then you should use TIFF (or perhaps PNG). However, it doesn't make sense for them to be "multi-page." If they're documents, then PDF is appropriate.

    Multi-page TIFF is well supported in the industry. There is nothing "weird" about it. It even supports embedded, searchable text (a Microsoft addition, but something that actually adds value). PDF archival can be difficult to do correctly. At the very least you want to use a product which supports PDF/A, followed up with some serious validation to make sure the results are actually compliant. Otherwise you may get bitten decades down the road. Searchable TIFF, on the other hand, will be around for freaking ever.

  2. Don't do this. But if you insist, here's how. by Cheesey · · Score: 5, Informative

    "Are TIFFs better than PDFs for future use? I wonder what format you think would last longer. Are there any other formats that you think would be better or more future-proof? To me, storage is not a good enough reason to go to TIFF, because storage prices are always dropping anyway."

    Don't use TIFF. Stay with PDF. PDF is what all the big digital libraries are using. It's a proper standard, and it's readable and writable by lots of free open source software, so even if Adobe disappears in a puff of intellectual property, you'll still be able to read your documents.

    TIFF, on the other hand, is a container format (like AVI, but worse). It isn't fully supported by every program - what sort of TIFF do you want, anyway? Compressed with LZW? With RLE? Not compressed at all? There's free software that will read and write the most common types of TIFF, so you can certainly do it, but why give up the convenience of using PDF?

    "Also, since they already have many of these files in PDF format and they want to convert them into multipage TIFFs, are there any programs that you can recommend that will perform batch processing of files so that we do not have to convert each PDF one by one?"

    Use Ghostscript, with something like the following command line:

    gs -dNOPAUSE -sDEVICE=tiffgray -sOutputFile=output%02d.tiff -dBATCH -r300 input.pdf

    This turns input.pdf into a series of 300 dpi TIFF files, one per page, called output01.tiff, output02.tiff, etc. Change the DEVICE to get a different sort of TIFF file, and use gs --help to get a list of options. You can easily wrap this command in a script of almost any sort to make the process fully automatic.
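
    A minimal sketch of such a wrapper, assuming bash and a gs binary on the PATH. The function name pdf_to_tiff and the DRY_RUN variable are made up for illustration; they are not part of Ghostscript. The output name here has no %d in it, in which case Ghostscript's TIFF devices write all pages into a single multi-page file, which is what the submitter asked for.

```shell
#!/bin/bash
# Sketch only: pdf_to_tiff and DRY_RUN are illustrative names, not Ghostscript features.

# Print the Ghostscript command for one PDF, then run it unless DRY_RUN=1.
# $1 = input PDF, $2 = output device (default tiffgray), $3 = resolution in dpi (default 300)
pdf_to_tiff() {
    local pdf=$1 device=${2:-tiffgray} dpi=${3:-300}
    local out="${pdf%.pdf}.tiff"    # no %d in the name: one multi-page TIFF per PDF
    local cmd=(gs -q -dNOPAUSE -dBATCH -sDEVICE="$device" -r"$dpi" -sOutputFile="$out" "$pdf")
    printf '%s\n' "${cmd[*]}"       # shown for logging only, not shell-quoted
    [ "${DRY_RUN:-0}" = 1 ] || "${cmd[@]}"
}

# Convert every PDF named on the command line.
for pdf in "$@"; do
    pdf_to_tiff "$pdf"
done
```

    Saved as, say, batch.sh (a hypothetical name), something like DRY_RUN=1 ./batch.sh scans/*.pdf prints the commands without running them, which is a cheap way to check the file names before committing to a long batch.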
    --
    >north
    You're an immobile computer, remember?
  3. pdf2tiff.sh by Anonymous Coward · · Score: 4, Informative

    Let's not reinvent the wheel. I did this about 9 months ago (//wolfmann), and this code is public domain (done on federal gov't time):

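    # cat pdf2tiff.sh
    #!/bin/bash
    # Convert each PDF one directory down into a multi-page TIFF beside it.

    for file in */*.pdf    # for each PDF
    do
        [ -e "$file" ] || continue    # the glob matched nothing
        filename="${file%.pdf}"       # strip only the .pdf extension
        if [ ! -e "$filename.tiff" ]
        then
            # Echo the command that is about to run. tiffg3 is CCITT Group 3
            # fax encoding; use tiffg4 in both lines for Group 4 instead.
            echo "gs -q -dNOPAUSE -dBATCH -sDEVICE=tiffg3 -sOutputFile=$filename.tiff $file"
            gs -q -dNOPAUSE -dBATCH -sDEVICE=tiffg3 -sOutputFile="$filename.tiff" "$file" 2> /dev/null
        else
            echo "$filename.tiff exists! skipping..."
        fi
    done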
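
    The loop above only looks one directory deep. If the scans are nested further, a variant along these lines might help. This is a sketch under assumptions: pending_pdfs is a made-up helper name, the gs options are copied from the script above, and file names containing newlines are not handled.

```shell
#!/bin/bash
# Sketch only: pending_pdfs is an illustrative helper name.

# Print every PDF under directory $1 (at any depth) that has no .tiff beside it yet.
pending_pdfs() {
    find "$1" -type f -name '*.pdf' | sort | while IFS= read -r pdf; do
        [ -e "${pdf%.pdf}.tiff" ] || printf '%s\n' "$pdf"
    done
}

# Convert whatever is still pending under the directory given as the first argument.
if [ -n "${1:-}" ]; then
    pending_pdfs "$1" | while IFS= read -r pdf; do
        # </dev/null keeps gs from consuming the list of file names on stdin
        gs -q -dNOPAUSE -dBATCH -sDEVICE=tiffg3 \
           -sOutputFile="${pdf%.pdf}.tiff" "$pdf" < /dev/null
    done
fi
```

    Because already-converted files are skipped, the script can be re-run after an interruption and will pick up where it left off.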

  4. Re:Are these things images or documents? by MBGMorden · · Score: 4, Informative

    "Multi-page TIFF is well supported in the industry." Better supported than PDF in some cases. Our records management (in addition to keeping electronic scanned copies) still insists on having a microfilm copy of all of our retained documents. We can send digital copies to a processing company to have them processed, but they don't accept PDF documents, only TIFFs (multi-page is acceptable). Given that our internal document management is all in PDF, I ended up having to find a program to convert all of that information about a year ago (though the name of the program we ended up using escapes me - I wouldn't recommend it anyways, since it crashed for me very frequently).
    --
    "People who think they know everything are very annoying to those of us who do."-Mark Twain