Slashdot Mirror


How Do You Store and Reconcile Email Archives?

heyitsjustme wants to know how you deal with old email. "I delete most of what I get but keep the stuff from friends and relations as an archive. Unfortunately I have these email archives from the late '80s through today in the form of Macintosh, Linux and Windows mailboxes, including AOL 1.0 mailboxes. What does everyone use to archive email across multiple platforms and non-standard mailbox formats? Is there an easy solution out there? Does anyone archive IM?"

11 of 380 comments

  1. Here's what I do... by sub7 · · Score: 5, Funny

    I archive all my pr0n on DVDs these days. It's really easy and oh wait... fsck!

    --
    rm -rf /bin/laden
    1. Re:Here's what I do... by The Amazing Fish Boy · · Score: 5, Funny

      I archive all my pr0n on DVDs these days. It's really easy and oh wait... fsck!

      One day... someone... somewhere is going to invent some sort of mechanism for removing text you've already typed. It shall be called "back-one-space" and will remove the letter before it.

      If this is impossible, surely they can keep a way of having all our text auto-submitted!

  2. Disk space is cheap. Why bother deleting? by heypete · · Score: 5, Insightful

    Save it all. With the exception of some mail archives lost to catastrophic disk failures (I keep archives for my own convenience, not for any official purposes, so I don't back them up), I keep all my email.

    Thunderbird is able to import all my old mail archives (from years and years of Eudora) and search it effectively. If I were inclined to export all my archives from my Mac to my Windows machine, I could use Google Desktop Search to really search through it all.

    1. Re:Disk space is cheap. Why bother deleting? by Libraryman · · Score: 5, Insightful

      Why delete?

      Because if you delete early and often, you've committed no crime. If you wait to delete it until someone (feds, cops, *IAA, UN-black-helicopter troopers, whoever) demands you turn it over to them, you're screwed.

      After all, you break laws too (everybody does, they are written that way). You just haven't been caught yet. (I know this because if you had, you wouldn't have all your email archived!)

  3. I work for Microsoft by Anonymous Coward · · Score: 5, Funny

    ...so I just delete everything after a major deal falls through.

  4. One Word by Zone-MR · · Score: 5, Insightful

    One word: IMAP. If you can read your email using any decent email client, it should support moving it to an IMAP server. If you are using web-based email or some crappy client which can't export emails to a standard/raw format, you'll have to write a script to convert the messages.
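
    For the "write a script to convert the messages" case, here is a minimal sketch, assuming the stubborn client can at least dump each message as a raw RFC 822 text file. The .eml extension, the function name and the paths are hypothetical; the only library used is Python's standard mailbox module. It gathers the files into one mbox, which most clients can import and then copy to an IMAP server.

```python
import mailbox
from email import message_from_bytes
from pathlib import Path


def raw_files_to_mbox(src_dir, mbox_path):
    """Append every *.eml file in src_dir to an mbox file; return the count.

    Assumption: each file holds exactly one raw message (headers, blank
    line, body). No locking is done, so nothing else should be writing
    to the mbox while this runs.
    """
    box = mailbox.mbox(mbox_path)  # created if it does not exist yet
    count = 0
    try:
        for path in sorted(Path(src_dir).glob("*.eml")):
            msg = message_from_bytes(path.read_bytes())
            box.add(msg)  # the mbox "From " separator line is generated here
            count += 1
    finally:
        box.close()  # flushes pending writes to disk
    return count
```

    Usage would be something like raw_files_to_mbox("exported_mail", "archive.mbox"). Uploading straight to a server with imaplib's IMAP4.append should also be possible, but that is left out here because it depends on the server and account setup.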

  5. It's simple: plain text by Faust7 · · Score: 5, Insightful

    Ever since I first got acquainted with e-mail on my Apple IIe in the '80s, I've used e-mail programs that offer plain-text storage as at least an option. It's one of the most universal formats in existence, and can be read one way or another on computers both decades old and brand new. I encountered some weird proprietary clients in the '90s that still stored e-mail in this format, because from a corporate perspective, this stuff was still in its infancy, plus HTML hadn't yet mucked everything up. To this day I still store in plain text from Eudora 6.2.

    I burn it to CD-Rs that I know won't get moved around or scratched. They stand a good chance of lasting the rest of my life.

  6. Re:One word by Padrino121 · · Score: 5, Insightful

    Gmail?

    I don't know about you but I generate about 6GB of email archives per year. Besides that having my email potentially available for searching doesn't sit well with me. I'm not sure where it stands now but there were a lot of potential privacy issues with Gmail.

    No, I don't receive hordes of email, just a lot of engineering-related mail with source code, research, and white papers attached. If you do anything business related it's important to keep all of the original emails received so there is an electronic paper trail.

  7. Re:Log everything... by Rosco P. Coltrane · · Score: 5, Funny

    I log and keep all my traffic including IRC logs going back to '94.

    Hey B5_geek, here's a trick to free up a lot of disk space *and* raise the S/N ratio in your logs:

    mv irclog.txt irclog.txt.fat && grep -vi lol irclog.txt.fat > irclog.txt && rm -f irclog.txt.fat

    --
    "A door is what a dog is perpetually on the wrong side of" - Ogden Nash
  8. Re:One Word by pHDNgell · · Score: 5, Interesting

    One word: IMAP

    Absolutely. I use no fewer than two mail clients on two different machines on any given business day. Every email I've sent since 1995 or something like that, and received since 1998 is available and searchable. Over this time, I've accessed this archive with the following clients:

    * pine (lots of pine)
    * mac mail
    * thunderbird
    * various netscapes/mozillas
    * ML (some random IMAP reader)
    * My phone (my old Sony Ericsson speaks IMAP)
    * My palm (two different apps)
    * python
    * a java webmail system I wrote
    * three or four other webmail systems
    * mutt

    ...who knows what else. I've got freedom to try whatever I want at any given moment without losing my current or past mail.

    --
    -- The world is watching America, and America is watching TV.
  9. How I do it by Matt Perry · · Score: 5, Informative

    I use a procmail recipe to archive my mail. I put it after filtering mailing lists and before I filter spam:

    OLDMAILDIR = $MAILDIR
    MAILDIR = $ARCHIVE_DIR
    :0 cW: archive.lock
    | /bin/gzip >>mailarchive-`date +%Y%m`.gz
    MAILDIR = $OLDMAILDIR

    I use grepmail to find old emails that I might need. Grepmail lets you use Perl regular expressions to find messages and then outputs the entire message where a match was found. You can use grepm to open grepmail matches as a mailbox in mutt. grepine does the same for Pine, which I use.
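
    For anyone without grepmail handy, the same idea can be roughed out in Python. This is only a sketch under assumptions, not how grepmail works internally: it assumes every archived message starts with an mbox-style "From " separator line, that body lines beginning with "From " were escaped, and it uses Python's re syntax rather than Perl's. The function names and the file name in the usage note are made up for illustration.

```python
import gzip
import re


def iter_messages(archive_path):
    """Yield each message in a gzipped mbox file as one string."""
    current = []
    # Appending with ">>" leaves several gzip streams back to back in one
    # file; gzip.open reads straight through all of them.
    # latin-1 maps every byte to a character, so decoding never fails.
    with gzip.open(archive_path, "rt", encoding="latin-1") as fh:
        for line in fh:
            # A "From " line starts the next message (mbox convention).
            if line.startswith("From ") and current:
                yield "".join(current)
                current = []
            current.append(line)
    if current:
        yield "".join(current)


def grep_archive(archive_path, pattern):
    """Return the full text of every message that matches the regex."""
    regex = re.compile(pattern, re.MULTILINE)
    return [msg for msg in iter_messages(archive_path) if regex.search(msg)]
```

    Something like grep_archive("mailarchive-200601.gz", r"^Subject: .*invoice") would then return the matching messages whole, headers included.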

    At the end of each year I clean the spam out of my archives using a procmail recipe and spamassassin. This recipe marks messages as deleted in the mailbox. I open these in pine, sort by deleted, and double check them. Once I'm sure they're all spam, I delete them:

    # vim:ft=procmail:

    LINEBUF = 8192
    SHELL = /bin/sh
    MAILDIR = $HOME/mail

    :0 fW: spamclean.lock
    | spamassassin -e --prefs-file=/home/matt/.spamassassin/user_prefs-spam_clean 2>/dev/null

    # If the message was deemed to be spam, set the status to "deleted" so that
    # we can delete it easily and optionally review it.
    :0 e
    {
    :0 fhw
    * ^^rom[ ]
    | sed -e '1s/^/F/'

    :0 f: formail.lock
    | formail -I 'X-Status: D'
    }

    # Fix the mangled "From" line
    :0 fhwE
    * ^^rom[ ]
    | sed -e '1s/^/F/'

    # Remove the last of the SpamAssassin headers
    :0 f: formail2.lock
    | formail -I 'X-Spam-Checker-Version'

    # File message in temporary mailbox
    :0: sandbox.lock
    z-cleaned_mbox

    The special spamassassin config turns off bayesian filtering and sets the threshold high:

    required_hits 15
    clear_headers
    fold_headers 0
    use_bayes 0

    The rest of the spam I clean out by hand.
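
    The double-check step can also be done outside pine. A rough sketch, assuming the cleaned mailbox is an ordinary mbox file in which the recipe above has set "X-Status: D" on suspected spam; the function name is hypothetical and the standard mailbox module does the parsing. It only lists what is marked and deletes nothing.

```python
import mailbox


def flagged_deleted(mbox_path):
    """Return (sender, subject) for every message marked as deleted.

    mboxMessage.get_flags() combines the Status and X-Status headers,
    so a "D" in the result means the message carries the deleted flag.
    """
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return [
            (msg.get("From", ""), msg.get("Subject", ""))
            for msg in box
            if "D" in msg.get_flags()
        ]
    finally:
        box.close()
```

    Running flagged_deleted("z-cleaned_mbox") gives a quick list to skim for false positives before anything is actually removed.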
    --
    Slashdot: Failed Car Analogies. Amateur Lawyering. Anecdote Battles.