Eisenstadt's Analysis Of 8 Years' Worth Of Email
Hylton writes "Thought this might be of interest: Marc Eisenstadt's saved every email he's gotten over the past eight years, including spam, and run an analysis of it."
on their webserver.
I have received more spam in the past week than I have legitimate email in the past 10 years.
Must be nice to be able to look back on porn-spam and feel old. 'Hot XXX - Newcomer Jenna!'
Bite me. Seriously, I enjoy it.
Wonder how this will affect Bayesian technology in the future...
Apparently the computer spent months compiling and cross referencing only to spit out this cryptic message: Host not found
If it's already slashdotted, he's probably saving all of his server logs as well.
my yahoo account i use to collect spam gets 1700 a month, while my "real" email account has received 1566 since august of 2003, only 10 of those being spam.
You call it excessive, I call it ambitious.
I will never buy anything from spam, and whoever does has got to be a complete moron.
And mirrordot didn't pick it up. Here is a google cache: http://64.233.167.104/search?q=cache:GshwWambHvEJ:www.corante.com/getreal/archives/2005/02/11/eight_years_of_email_stats_pass_1.php+eight+years+of+email+stats&hl=en
- Teja
I have managed to maintain a hotmail account for 10 years, which I consider a feat. It isn't clean of spam or anything, but 10 filters based on keywords in the subject or body make a HUGE dent in the amount that reaches my inbox.
Don't misread like I did. I was like, what the hell was Einstein doing with email..
That's what anti-spam laws should be targeting, the morons who use the services offered by spammers.
How we know is more important than what we know.
So I've got a question for analysis (although it seems the server could use a liquid nitrogen bath right now)...
If all the spam-based penis growth pill claims were stacked end to end, how many times would it circle the world, and would it be worth the money to have a member that large?
This is pretty interesting (sadly I can't access TFA). Google should have such a program: there should be a preference in your GMail account where you can allow/deny Google to take stats out of your email. A lot of interesting information could be collected, for example: amount of spam vs. legitimate e-mail, % of each kind of spam (viagra, drugs, porn, etc.), spam by country, % of text / HTML email, and even other interesting stats not e-mail related, for example language analysis, frequent misspellings, topics of interest by age, etc. I would gladly allow Google to make such stats; it can be done in such a way that no personal / sensitive information would be leaked.
(Thinks about what has just said, and puts tinfoil hat on)
ALMAFUERTE
WTF am I doing replying to an AC at 5 A.M on a Friday night?
Saved it for what exactly? Maybe vintage 1997 pr0n e-mails are now worth something to antique pr0n collectors...
Microsoftie Chen's analysis, slashdotted a while ago, has pictures too!
I remember a time when the size of my genitalia wasn't an issue.
I remember when I never had any Korean friends.
I remember a time when I went to the pharmacist for a drug I needed, not the pharmacist asking me which drugs I wanted to buy online.
I remember when consolidating a loan was a big decision instead of "just a click away!".
I remember a time when, once I left high school, there was no chance in hell I'd ever have to hear from those nitwits again.
God, I miss those days.
In summary, I don't utilize e-mail much to begin with, I didn't maintain the archives I had very well, and all my figures are speculative.
https://www.eff.org/https-everywhere
If you think this article is about spam, make sure you read it all the way to the end. It's not.
He's questioning the entire technology of email as an effective way of communicating.
Analyzes not just the spam-count in his email, but the work-time needed to respond to the non-spam emails, too.
This is one of the most thought-provoking articles posted on Slashdot in a long time.
I still have all my e-mail dating back to 1997. (My packrat mentality is alive and well on my computer.)
But running a scan on it wouldn't be much use, since I culled all the spam manually over the years...
Does it make you happy you're so strange?
I had a very similar setup going on for a while, but I lost it over a year ago. 6 years and 2 gigs of emails lost to a faulty power supply. Scouring turned up nothing usable and I didn't have backups of my emails.
I felt like I lost a part of my past...
Goes to show the value of backing up your data.
--RIAmAses! Let my MP3ople go!
This 'law' is based on the fact that of many thousands of emails, there were only about 3 or 4 that I judged to be of value (worth keeping) after three years.
A corollary:
Here is an example of the application of "Femto's Law". The boss sends you an email asking you to do something. If you ignore the email, the boss will either a) if it is important come and tell you personally or, b) find someone else to do the task. Ultimately I think the law is based on the fact that email is mainly used for trivial stuff and important stuff will eventually be presented to you in a form which is harder to ignore.
I guess the applicability might have changed since 1998, if email has come to be used for non-trivial stuff, but I reckon it's mostly still true.
Side note: the reason I ended up doing the analysis is because the 'delete' button stopped working on my mail client and I had to sort my emails when changing jobs. At the time I posted my conclusions to the rest of the University department, to other people's amusement.
PS. No, I'm not brave enough to ignore my email!
I rotate my email folders every 6-9 months to increase performance.
Even so, I have 2 folders with over 9000 Emails in them. My work Inbox alone has 1015. None of these are spam - I filter those out through a combination of SpamAssassin and manual filtering.
Anyways - my point is that the numbers in this article are small potatoes. He talks about 250 Emails in a week - I easily get 300-400 Emails **a day**, probably 40-50 of which are directly work related, the other 350 related to various other side projects of mine, so they are just as important.
I would say I read around 25-50% of my Emails. The rest I only give a cursory scan. His numbers for reply times are way off for a number of reasons:
- Hardly anyone replies to every email they receive. Most of it needs no reply.
- He basically says that the time spent reading the emails and responding is a waste. Well, what do you think managers did to communicate with you before email? You had faxes, daily memos, daily reports to file... it is just more streamlined now. It is not like this stuff is new.
Newsflash - work is difficult. People are distracting to your work. Shit happens. Deal with it, just like everyone else has for the past 150 years.
It depends on what you use it for.
I work for a company on the other side of the globe... couldn't do that without email. I also support an opensource project with 10,000 downloads a week... that generates 'a few' support queries
:) Heck, without email I don't even think I could do that by phone without hiring a call center.
Is this:
90% of all eMail is useless the moment it arrives in your inbox.
The First Corollary of eMail age is this:
All remaining eMail is useless no more than one year after the moment it arrives in your inbox.
The Second Corollary of eMail age is this:
eMail accidentally deleted will become instantly irrelevant or it will be resent without your request.
Links to the text cache only, so doesn't try to access the original site.
has Netcraft Confirmed that? I won't believe it until then ;)
Game Overdrive - Gaming News
*Sacrifices karma to protest idiotic mods*
Snowden and Manning are heroes.
It's interesting to think of where the time goes...
Even checking that, I didn't have spam. I don't know how I'm doing it but I seem to be able to keep spam away from my inbox. Except for my Yahoo acct.
I have 11 years worth!
/me too!
Karma: -2147483648 (Mostly affected by integer overflow)
If it takes your wife 2 hours to scan and prioritize 30 e-mails, I sure hope she doesn't work in triage.
https://www.eff.org/https-everywhere
Having your own domain offers a neat way of tracking where spam comes from. For example, if you see the email I use here, I will know any spam that comes from someone getting my address from here. Of course, /. isn't the best example. Say I sign up at a website, misfitriprapper.com. I will use misfitriprapper.com as the username before the @4la... I use this method EVERYWHERE. I just sent an email last night to Epson support. My email address? epson.com@4la... We've all learned years ago to not trust anybody, so, I don't even trust the big companies like Epson.
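For anyone who wants to tally which site leaked an address under this scheme, here is a minimal Python sketch. It assumes each site was given its own name as the part before the @, as described above. The domain `example.com` is a stand-in (the poster's real domain is truncated in the post), and the function names `leak_source` and `tally_leaks` are made up for illustration.

```python
from collections import Counter


def leak_source(recipient):
    """Return the part before the @, i.e. the site the address was given to.

    Returns an empty string when the address has no @ at all.
    """
    local, _, _domain = recipient.rpartition("@")
    return local.lower()


def tally_leaks(spam_recipients):
    """Count spam messages per tagged address (per site that leaked it)."""
    return Counter(leak_source(r) for r in spam_recipients)


# Hypothetical recipient addresses taken from a spam folder.
spam = [
    "epson.com@example.com",
    "Epson.com@example.com",
    "misfitriprapper.com@example.com",
]
for site, count in tally_leaks(spam).most_common():
    print(site, count)
```

Lower-casing the tag keeps `Epson.com` and `epson.com` in the same bucket, since spammers often change the case of harvested addresses.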
All your searching needs (and free money!) - 4Lancer.net
Seems slightly excessive for low priority emails; does she need to respond to a lot of them? Checking my work mail for today (we do email for everything that can't be handled outside the weekly staff meeting), I see I got 13 emails, plus a few spams and one or two I deleted. Doubt I spent more than 10 minutes on it, if that..
Twenties Retirement
No Images or Text
I've started filtering my email on what is basically a Stephen Covey 4 Quadrants principle:
Urgent is email that is important to _my_ goals in life, where there is a deadline. Usually that means other people are involved. For example, email from my PhD students who should be working on research that furthers my interests as well. (Covey quadrant 1)
Important is email that is important to my goals, with no deadline. The stuff that is good for me if I read it, but I didn't used to because of the deadline issue. I now make sure to read through the Important folder once a day. An example is conference announcements in my area. (Covey quadrant 2)
Distracting is stuff that is important to other people, but not really me. Most of my Staff mailing lists go in here. (Covey quadrant 3)
Timewasting is stuff that is fun but not really important to anyone. Friends mailing lists talking about the latest in computer games or eclectic news stories, for example. Stuff I can read for 5 minutes to get a chuckle before meetings. (Covey quadrant 4)
Other email gets put aside for me to find out how to not get it again. For example mailing lists I subscribed to once thinking they'd be useful for me, but really I'm better off searching the web when I need that info rather than wasting my time keeping on top of it every day/week.
It works very nicely, and I only have a couple of filters for the lot. I get 400+ emails a day, incidentally.
Try it -- just set up 4 filters copying rather than moving the emails, and run it in parallel with your current filters...
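A rough Python sketch of the four-folder idea, in case it helps to see it written down. This is only an illustration: the poster doesn't say how the filters are written, so matching on the sender address is an assumption, and every address in `RULES` is invented.

```python
# Hypothetical rules: every address below is made up for illustration.
# Folder names follow the four categories described in the post.
RULES = {
    "Urgent": {"student1@example.edu", "student2@example.edu"},
    "Important": {"conference-announce@example.org"},
    "Distracting": {"staff-list@example.edu"},
    "Timewasting": {"games-chat@example.net"},
}


def classify(sender):
    """Return the folder for a sender, or 'Other' when no rule matches."""
    sender = sender.strip().lower()
    for folder, addresses in RULES.items():
        if sender in addresses:
            return folder
    # Unmatched mail is set aside so you can work out how to stop getting it.
    return "Other"


print(classify("staff-list@example.edu"))
print(classify("stranger@example.com"))
```

To run it in parallel with existing filters, as suggested, the result would be used to copy a message into the folder rather than move it.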
R
The trouble with the world is that the stupid are cocksure and the intelligent are full of doubt.
-Bertrand Russell
I would highly recommend that you all check out http://www.dodgeit.com/ . They offer a free, no sign-up, receive-only, RSS-enabled, no-password email service. It is great for signing up for random things that require you to follow a link.
If I spent 3 minutes reading/responding to each of the 2,200 emails, then I've spent just over 4.5 DAYS on email in the last 5.5 months.
Just for comparison...how much time have you spent reading Slashdot in the last 5.5 months?
Twenties Retirement
Total Emails / Seven Days: 10503
Total Blocked Attempts: 3371
Total Filtered Junk Email: 1778
49.0% of email detected as junk
Estimated 20,596 junk emails per month.
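The percentage and the monthly estimate can be reproduced from the three counts above. The sketch below assumes (it isn't stated in the post) that "junk" means blocked attempts plus filtered junk, and that a month is counted as four weeks; those two assumptions happen to give the quoted figures.

```python
# Figures quoted in the post (seven-day window).
total_emails = 10503
blocked_attempts = 3371
filtered_junk = 1778

# Assumption: junk = blocked attempts + filtered junk.
junk = blocked_attempts + filtered_junk
junk_share = 100 * junk / total_emails   # percent of all email
# Assumption: a "month" is four weeks (28 days).
junk_per_month = junk / 7 * 28

print(f"{junk_share:.1f}% junk, about {junk_per_month:,.0f} per month")
# prints "49.0% junk, about 20,596 per month"
```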
The above is not worth reading.
I've saved every e-mail I've sent/received (minus spam) since early 1998 or so. I don't see any reason not to. It's sort of a journal of my life. I'm making regular, multiple, and offsite backups, so the data should survive.
I only saved a few IRC and IM chatlogs before 2003, when I started using Gaim. Since then I've saved every IM conversation I've had.
I don't really think the data has any value except maybe to reminisce about old friendships, or what things used to be like. It was kind of weird reading an old IM conversation I had with someone telling them about this new "MP3" file format. Who knew its popularity would explode and turn into a huge legal mess?
At least at the office I'm getting *paid* to go through the catch-all account - it's currently running greater than 99% spam.
Did anyone else find this article INCREDIBLY boring?? I just couldn't care how a balding man analyzes email..
i hate pansy republicans
I also save all my emails. Over the last two months I have seen a daily average of 708 spams, 75 messages on public mailing lists that I read for work (such as mozilla-webtools), 26 internal work emails, and some dribs and drabs.
I filter most of it by hand (an RBL filter refiles about a third of the spam). The spam takes me about one second per message (I press 's' to refile in the spam folder). I think I will deploy an automatic spam filter in 2005, but I'll still keep all the messages.
I have every email that I've ever received since 1992, with some exceptions for work accounts, and if I'd just thrown them out on the "if it's more than a year old..." principle, I'd have missed one of the best email romances any geek has ever had.
All remaining eMail is useless no more than one year after the moment it arrives in your inbox.
That is actually not true. One year is a local minimum.
As email ages it loses currency and it becomes increasingly difficult to act on the information contained. However, that's not all the value in email.
Older email is valuable as historical record. It contains details no longer stored in your head. Its value increases with age.
These curves cross at about 1 year.
Thus, the one year mark isn't the time to throw email away. It's the time to archive. Put it all aside and come back to it much much later.
Sorry for that spam but I think this program might actually be relevant: mboxstats generates a statistical report of a mailbox with information like who wrote the most messages, at what time are the most messages written, what is the most used subject, etc. etc.
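For a rough idea of what that kind of report involves, here is a small Python sketch using only the standard library. It is not how mboxstats itself works, just an illustration of the same sort of counts; the function name `mailbox_stats` is made up.

```python
import mailbox
from collections import Counter
from email.utils import parseaddr, parsedate_to_datetime


def mailbox_stats(path):
    """Count messages per sender, per subject, and per hour of day in an mbox file."""
    senders, subjects, hours = Counter(), Counter(), Counter()
    for msg in mailbox.mbox(path):
        _, addr = parseaddr(str(msg.get("From", "")))
        senders[addr.lower() or "(unknown)"] += 1
        subjects[str(msg.get("Subject", "(no subject)")).strip()] += 1
        try:
            # Hour as written in the Date header, in the sender's own timezone.
            hours[parsedate_to_datetime(str(msg["Date"])).hour] += 1
        except (TypeError, ValueError):
            pass  # missing or unparseable Date header
    return senders, subjects, hours
```

Calling `mailbox_stats("inbox.mbox")` (the file name is hypothetical) and then `.most_common(10)` on each counter gives the top senders, subjects, and busiest hours.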
www.vanheusden.com - home of Multitail, HTTPing, CoffeeSaint, EntropyBroker, rsstail, bsod, listener, nagcon, nagi
Does anyone know if he used the same email address throughout those 12 years? If he switched addresses the spammers might not have known about the new address, thus reducing the amount of spam he will receive the coming years.
I've only been reading slashdot since the end of January... but I've probably "wasted" weeks already!
I work in a small office of about 40 people, and of those 14 have actual PCs. While upgrading from one mail server to the next, I had the pleasure of going desk to desk and archiving every email account and moving the archives to a network drive. One account had 150,000 emails in the inbox alone. Over 100,00 replies, and well over 300,000 in the deleted email. I made the suggestion to empty deleted mail and this person informs me that they need the trashed items. Uhh, wouldn't that mean he wouldn't have trashed them? Either way, 4 1/2 hours later the account was archived over 10 pst files. Our spam account, which is only 3 months old, gets 10,000+ a day, and is deleted daily. My account... no more than 25 inbox, 0 deleted, 0 sent, and I still look busy :)
Last year I kept all my email at work for 6 months. I called all mail that I had not personally signed up for to be SPAM and that includes conference announcements. Approximately 51% was SPAM of about 3000 total. I don't have the exact numbers in front of me anymore. During the summer and fall, I let some graduate students use my computer, and now I get approximately 75 % SPAM. I don't read it all, but I also get email to my computer that has a different user name and email address.
I hate saving email. I consider most email to be like a telephone conversation. You get the info you need and then once it's over you don't sit and save the conversation. I don't want to re-read a conversation from 4 years ago about my friends and I planning a dive trip.
If my Inbox gets 50 messages in it, it's time to clean it. I archive some messages - mainly stuff I've sent to myself that has tech tips/tricks in it, or info on how to do something that I don't do that often.
I used to save everything - but then I realized that I never go and re-read it...I started being a delete nazi.
Never make me think of this again. NEVER!
A patriot must always be ready to defend his country against his government. -edward abbey
I posted an automated weekly analysis of the language used in my email some time ago.
http://www.2ad.com/~john/spam_zeitgeist/
This focuses more on language used rather than on message type. So it reveals some of the patterns used in marketing messages.
John
Your two suggestions don't address the problem and would not solve anything. This spam comes to non-existent user names, which instantly marks it as spam; I don't need to check the sender, the content, white lists or any kind of lists. As soon as the sender says who it is for with the RCPT TO command, I know it is spam, and can deal with it very simply, at least now that I have switched to a full time connection.
As for the spam that comes directly to the few real accounts, it is about 200 a day but very easy to deal with, since the idiots send several copies of each spam, which sticks out like a sore thumb. I probably spend 30 seconds a day dealing with it.
I didn't do anything for a long time because, being intermittent dialup, almost all mail came thru my ISP as secondary MX, and bouncing it would only cause them far more trouble than simply accepting it and diverting to the bit bucket. But I just switched to a full time account, and now I can tarpit it, reject it, drop the connection, whatever I want.
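The "reject it as soon as the recipient is known" step can be sketched in a few lines of Python. This is only an illustration of the decision, not the poster's actual mail server setup: `VALID_USERS`, the domain `example.com`, and the function name `rcpt_reply` are all made up, and the reply texts are typical wording rather than anything quoted from a specific server.

```python
VALID_USERS = {"alice", "bob"}  # hypothetical local accounts


def rcpt_reply(rcpt_to, domain="example.com"):
    """Return the SMTP reply for one RCPT TO address.

    Anything addressed to a user that does not exist is refused on the spot,
    before any message data is transferred.
    """
    local, _, dom = rcpt_to.strip("<>").lower().rpartition("@")
    if dom == domain and local in VALID_USERS:
        return "250 2.1.5 Recipient OK"
    return "550 5.1.1 User unknown"


print(rcpt_reply("<alice@example.com>"))
print(rcpt_reply("<qwerty123@example.com>"))
```

Tarpitting or dropping the connection, as mentioned above, would replace the 550 reply with a delay or a close; the check itself stays the same.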
Infuriate left and right
Qmail also can reject it as soon as it knows it is going to a non-existent user. But since my ISP collected almost all mail to me as secondary MX and forwarded it to me only when I was connected intermittently, rejecting it would usually mean just making trouble for the ISP, so I just received it into the bit bucket. I recently switched to a full time account, and now I can treat it appropriately.
Infuriate left and right
I had a yahoo account for five years with no spam at all.
Then my 10 year old cousin sent me an email from her new account at some child oriented webmail site. I think it was cutie.com but they don't seem to be there anymore.
I sent a reply from my linux box, which had its own SMTP server, and the message bounced because I was on a dialup netblock. These days I use my ISP's SMTP server.
So I sent the reply from my yahoo account and the next day I got 20 spam messages in that inbox.
So the thing which shits me about this is that this company made a big deal about their filtering software protecting children, fair enough, while at the same time they were clearly selling contacts to spammers.
http://michaelsmith.id.au