Slashdot Mirror


How Do You Store and Reconcile Email Archives?

heyitsjustme wants to know how you deal with old email. "I delete most of what I get but keep the stuff from friends and relations as an archive. Unfortunately, I have these email archives from the late '80s through today in the form of Macintosh, Linux and Windows mailboxes, including AOL 1.0 mailboxes. What does everyone use to archive email across multiple platforms and non-standard mailbox formats? Is there an easy solution out there? Does anyone archive IM?"

1 of 380 comments

  1. How I do it by Matt Perry · Score: 5, Informative
    I use a procmail recipe to archive my mail. I put it after filtering mailing lists and before I filter spam:

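    # Assigning MAILDIR changes procmail's working directory,
    # so the archive file lands in $ARCHIVE_DIR
    OLDMAILDIR = $MAILDIR
    MAILDIR = $ARCHIVE_DIR
    # c: work on a copy, so the message still reaches the later recipes
    :0 cW: archive.lock
    | /bin/gzip >>mailarchive-`date +%Y%m`.gz
    # Switch back to the normal mail directory
    MAILDIR = $OLDMAILDIR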
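
    Appending to a .gz file like this works because a gzip file may hold several compressed members back to back, and decompressors read them as one continuous stream. Here is a minimal Python sketch of the same append-per-message idea, assuming only the standard gzip module. The names archive_message and read_archive are made up for illustration and are not part of procmail or gzip:

```python
import gzip
import os
import tempfile
import time
from typing import Optional


def archive_message(raw_message: bytes, archive_dir: str,
                    when: Optional[float] = None) -> str:
    """Append one message to the month's gzip archive and return its path."""
    # Same naming scheme as the recipe: mailarchive-YYYYMM.gz
    stamp = time.strftime("%Y%m", time.localtime(when))
    path = os.path.join(archive_dir, f"mailarchive-{stamp}.gz")
    # Mode "ab" adds a new gzip member at the end of the file,
    # which is what `gzip >>file` does in the shell.
    with gzip.open(path, "ab") as archive:
        archive.write(raw_message)
    return path


def read_archive(path: str) -> bytes:
    """Decompress every gzip member in the file as one continuous stream."""
    with gzip.open(path, "rb") as archive:
        return archive.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        first = b"From alice@example.com Mon Jan  1 00:00:00 2024\nSubject: one\n\nhello\n\n"
        second = b"From bob@example.com Tue Jan  2 00:00:00 2024\nSubject: two\n\nworld\n\n"
        path = archive_message(first, tmp)
        archive_message(second, tmp)
        # The two separately compressed messages read back as one mbox stream.
        print(read_archive(path) == first + second)
```

    Compressing each message on its own costs some compression ratio, but it means no existing archive data is ever rewritten.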

    I use grepmail to find old emails that I might need. Grepmail searches with Perl regular expressions and prints the entire message wherever a match is found. You can use grepm to open grepmail matches as a mailbox in mutt; grepine does the same for Pine, which is what I use.
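
    grepmail itself is a Perl tool, but the idea of matching a regex and printing the whole message can be sketched in Python with the standard mailbox module. This is only an illustration of the technique, not how grepmail is implemented, and grep_mbox is a hypothetical name:

```python
import mailbox
import os
import re
import tempfile
from typing import Iterator


def grep_mbox(path: str, pattern: str) -> Iterator[str]:
    """Yield the full text of each message in an mbox file that matches pattern."""
    regex = re.compile(pattern, re.MULTILINE)
    box = mailbox.mbox(path, create=False)
    try:
        for message in box:
            text = message.as_string()  # headers plus body
            if regex.search(text):
                yield text
    finally:
        box.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.mbox")
        with open(path, "w") as fh:
            fh.write(
                "From alice@example.com Mon Jan  1 00:00:00 2024\n"
                "Subject: invoice 42\n\nPlease find it attached.\n\n"
                "From bob@example.com Tue Jan  2 00:00:00 2024\n"
                "Subject: lunch\n\nNoon on Friday?\n"
            )
        # Print every message whose Subject line mentions an invoice number.
        for hit in grep_mbox(path, r"^Subject: invoice \d+"):
            print(hit)
```

    A compressed monthly archive would need to be decompressed to a plain mbox file first, since mailbox.mbox reads uncompressed files.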

    At the end of each year I clean the spam out of my archives using a procmail recipe and SpamAssassin. The recipe marks suspected spam as deleted in the mailbox. I open the result in Pine, sort by deleted status, and double-check the marked messages. Once I'm sure they're all spam, I delete them:

    # vim:ft=procmail:

    LINEBUF = 8192
    SHELL = /bin/sh
    MAILDIR = $HOME/mail

    :0 fW: spamclean.lock
    | spamassassin -e --prefs-file=/home/matt/.spamassassin/user_prefs-spam_clean 2>/dev/null

    # If the message was deemed to be spam, set the status to "deleted" so that
    # we can delete it easily and optionally review it.
    :0 e
    {
    :0 fhw
    * ^^rom[ ]
    | sed -e '1s/^/F/'

    :0 f: formail.lock
    | formail -I 'X-Status: D'
    }

    # Fix the mangled "From" line
    :0 fhwE
    * ^^rom[ ]
    | sed -e '1s/^/F/'

    # Remove the last of the SpamAssassin headers
    :0 f: formail2.lock
    | formail -I 'X-Spam-Checker-Version'

    # File message in temporary mailbox
    :0: sandbox.lock
    z-cleaned_mbox
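
    The core of that recipe is "if the message looks like spam, set X-Status: D so the mail client shows it as deleted". A rough Python sketch of the same marking step follows, assuming the standard mailbox module. The spam test here is a stand-in predicate supplied by the caller, not SpamAssassin, and flag_spam_as_deleted is a hypothetical name:

```python
import mailbox
import os
import tempfile
from typing import Callable


def flag_spam_as_deleted(path: str,
                         is_spam: Callable[[mailbox.mboxMessage], bool]) -> int:
    """Add the mbox 'D' (deleted) flag to every message is_spam() accepts."""
    box = mailbox.mbox(path, create=False)
    flagged = 0
    try:
        # items() returns a list, so replacing entries while looping is safe.
        for key, message in box.items():
            if is_spam(message):
                message.add_flag("D")  # mailbox stores this flag in X-Status
                box[key] = message     # write the modified copy back
                flagged += 1
        box.flush()
    finally:
        box.close()
    return flagged


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "archive.mbox")
        with open(path, "w") as fh:
            fh.write(
                "From spam@example.com Mon Jan  1 00:00:00 2024\n"
                "Subject: cheap pills\n\nBuy now.\n\n"
                "From bob@example.com Tue Jan  2 00:00:00 2024\n"
                "Subject: lunch\n\nNoon on Friday?\n"
            )
        # Toy predicate for the demo only; a real run would ask a spam filter.
        count = flag_spam_as_deleted(
            path, lambda m: "pills" in (m["Subject"] or ""))
        print(count, "message(s) marked as deleted")
```

    Nothing is removed at this stage, so the marked messages can still be reviewed before they are expunged. The sketch skips locking because it targets an offline archive; on a mailbox that still receives mail, call box.lock() before modifying it.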

    The special SpamAssassin config turns off Bayesian filtering and sets the spam threshold high, so only obvious spam gets marked:

    required_hits 15
    clear_headers
    fold_headers 0
    use_bayes 0

    The rest of the spam I clean out by hand.
    --
    Slashdot: Failed Car Analogies. Amateur Lawyering. Anecdote Battles.