Eisenstadt's Analysis Of 8 Years' Worth Of Email
Hylton writes "Thought this might be of interest: Marc Eisenstadt's saved every email he's gotten over the past eight years, including spam, and run an analysis of it."
Wonder how this will affect Bayesian filtering technology in the future...
I have managed to maintain a hotmail account for 10 years, which I consider a feat. It isn't clean of spam or anything, but 10 filters based on keywords in the subject or body make a HUGE dent in the amount that reaches my inbox.
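A keyword filter of the kind described above could look roughly like this. It is only a minimal sketch: the comment does not say which ten keywords were used, so the list below is a hypothetical stand-in, and the function name is made up for illustration.

```python
# Hypothetical keyword list; the actual ten keywords are not given in the comment.
SPAM_KEYWORDS = ["viagra", "mortgage", "lottery"]


def is_probable_spam(subject, body, keywords=SPAM_KEYWORDS):
    """Return True if any keyword appears in the subject or body (case-insensitive)."""
    text = f"{subject}\n{body}".lower()
    return any(keyword in text for keyword in keywords)
```

Plain substring matching like this is easy to dodge with misspellings, which is probably why it only makes a dent rather than catching everything.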
This is pretty interesting (sadly i can't access TFA) /deny google to take stats out of your email. A lot of interesting information can be collected, for example: amount of spam vs. legitimate e-mail, % of each kind of spam (viagra, drugs, porn, etc.), spam by country, % of text vs. HTML email, and even other interesting stats not e-mail related, such as language analysis, frequent misspellings, topics of interest by age, etc. I would gladly allow Google to make such stats; it can be done in such a way that no personal / sensitive information would be leaked.
Google should have such a program, there should be a preference in your GMail account, where you can allow
(Thinks about what he has just said, and puts tinfoil hat on)
ALMAFUERTE
WTF am I doing replying to an AC at 5 A.M on a Friday night?
Microsoftie Chen's analysis, slashdotted a while ago, has pictures too!
If you think this article is about spam, make sure you read it all the way to the end. It's not.
He's questioning the entire technology of email as an effective way of communicating.
Analyzes not just the spam-count in his email, but the work-time needed to respond to the non-spam emails, too.
This is one of the most thought-provoking articles posted on Slashdot in a long time.
Try your own domain name on a dialup connection :-) My own account gets around 200 spams a day. It annoys me, but doesn't take long to delete, since so much of it is five or ten copies of the same thing, which sticks out like the proverbial sore thumb when viewed once or twice a day.
But starting last summer, maybe 9 months ago, some spammers realized they had an untapped (fools') gold mine to plunder, and my simple little home domain has been receiving more and more spam to accounts that don't exist, like bill123 and so on. My poor little dialup domain has been receiving around 50-60,000 spams a day to those bogus accounts. It hit 120,000 one day.
It's easy enough to deal with since it is known to be spam by definition of going to bogus accounts. I never see it unless I am curious. I collect stats daily on how many unique account names were used, around 3000. It just amazes me that those bozos would send so much pure crap with no hope of ever getting a response.
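The daily count of unique bogus account names could be collected with something like the following. This is a sketch under assumptions: the set of real mailboxes and the function name are hypothetical, and the input is taken to be a plain list of recipient addresses pulled from one day's mail log.

```python
from collections import Counter

# Hypothetical set of mailboxes that really exist on the domain.
REAL_ACCOUNTS = {"me", "postmaster"}


def bogus_account_stats(recipients, real_accounts=REAL_ACCOUNTS):
    """Count mail addressed to nonexistent local parts.

    recipients: iterable of addresses such as 'bill123@example.org'.
    Returns (total_bogus_messages, unique_bogus_accounts).
    """
    counts = Counter()
    for address in recipients:
        # Compare local parts case-insensitively.
        local_part = address.split("@", 1)[0].lower()
        if local_part not in real_accounts:
            counts[local_part] += 1
    return sum(counts.values()), len(counts)
```

Anything addressed to a local part outside the real set is spam by definition, so no content inspection is needed.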
Infuriate left and right
This 'law' is based on the fact that of many thousands of emails, there were only about 3 or 4 that I judged to be of value (worth keeping) after three years.
A corollary:
Here is an example of the application of "Femto's Law". The boss sends you an email asking you to do something. If you ignore the email, the boss will either a) if it is important come and tell you personally or, b) find someone else to do the task. Ultimately I think the law is based on the fact that email is mainly used for trivial stuff and important stuff will eventually be presented to you in a form which is harder to ignore.
I guess the applicability might have changed since 1998, if email has come to be used for non-trivial stuff, but I reckon it's mostly still true.
Side note: the reason I ended up doing the analysis is because the 'delete' button stopped working on my mail client and I had to sort my emails when I changed jobs. At the time I posted my conclusions to the rest of the University department, to other people's amusement.
PS. No, I'm not brave enough to ignore my email!
It's interesting to think of where the time goes...
No I'm afraid not. It's actually a variant of Gelfling's Axiom of voicemail which is:
You don't really need it, if it's important enough they'll call back.
"ahh the irony of your post combined with your sig (Ferion being amway for geeks and all)."
:)
Referral rewards != Amway. Besides, that's not what irony means. Perhaps if I had said "It's all a huge scam" and if Ferion were actually what you claim it to be, you could call me a hypocrite.
"Derp de derp."
Ferion: "It's free to play"
Amway: "Make $10,000 a week!"
Ferion: "well actually, it's only free to play if you sucker other people into paying for you."
Amway: "well actually, you can only make $10,000 a week if you sucker other people into selling for you."
How we know is more important than what we know.
I've started filtering my email on what is basically a Stephen Covey 4 Quadrants principle:
Urgent is email that is important to _my_ goals in life, where there is a deadline. Usually that means other people are involved. For example, email from my PhD students who should be working on research that furthers my interests as well. (Covey quadrant 1)
Important is email that is important to my goals, with no deadline. The stuff that is good for me if I read it, but I didn't use to because of the deadline issue. I now make sure to read through the Important folder once a day. An example is conference announcements in my area. (Covey quadrant 2)
Distracting is stuff that is important to other people, but not really me. Most of my Staff mailing lists go in here. (Covey quadrant 3)
Timewasting is stuff that is fun but not really important to anyone. Friends mailing lists talking about the latest in computer games or eclectic news stories, for example. Stuff I can read for 5 minutes to get a chuckle before meetings. (Covey quadrant 4)
Other email gets put aside for me to find out how to not get it again. For example mailing lists I subscribed to once thinking they'd be useful for me, but really I'm better off searching the web when I need that info rather than wasting my time keeping on top of it every day/week.
It works very nicely, and I only have a couple of filters for the lot. I get 400+ emails a day, incidentally.
Try it -- just set up 4 filters copying rather than moving the emails, and run it in parallel with your current filters...
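For anyone who wants to try the four folders above outside a mail client, here is a rough sketch. The comment does not say what the filters actually match on, so this assumes they key on the sender or list address; every address in the rule table is hypothetical.

```python
# Hypothetical rules: (substring of the sender or list address, folder).
RULES = [
    ("phd-students@example.edu", "Urgent"),              # quadrant 1
    ("conference-announce@example.org", "Important"),    # quadrant 2
    ("staff-list@example.edu", "Distracting"),           # quadrant 3
    ("games-chat@example.org", "Timewasting"),           # quadrant 4
]


def folder_for(sender, rules=RULES):
    """Return the folder for a message; the first matching rule wins."""
    sender = sender.lower()
    for pattern, folder in rules:
        if pattern in sender:
            return folder
    # Anything unmatched is set aside for review, as in the scheme above.
    return "Other"
```

To run it in parallel with existing filters, copy each message into the returned folder instead of moving it.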
R
The trouble with the world is that the stupid are cocksure and the intelligent are full of doubt.
-Bertrand Russell
As the From: address usually is forged they will just bounce to the forged address. So most spammers don't check if the mails arrive or not, they can use that bandwidth for sending more spam.
:)
I've given up trying to not get spam, I filter it instead. Usually it's around 400-500 spams/day.
I've only saved all my legitimate email for the last 10 years though
Erik Dalén
i'm curious how much of it is second-hand-spam (you submitted email, they sold it) vs. bot/spidering/harvesting (eg: wholly unsolicited email) ...
i have a catch all tld i use to watch. signup like kjamez-slashdot@tld.com which comes to all the same box, so when i start getting unsolicited emails to kjamez-slashdot@tld.com from random people, i can at least see the origin to some degree. i do the same with magazine subscriptions and credit cards and the like. all slight variations on my real name, some even wholly fictitious.
i'm just curious like that.
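a quick sketch of pulling the signup source back out of a tagged address like the one above. it assumes every tagged address follows the name-source@domain pattern from the example; the function name and the base_name default are only for illustration.

```python
def signup_source(recipient, base_name="kjamez"):
    """Return the tag after 'base_name-' in the local part, or None if untagged."""
    local_part = recipient.split("@", 1)[0].lower()
    prefix = base_name.lower() + "-"
    if local_part.startswith(prefix) and len(local_part) > len(prefix):
        return local_part[len(prefix):]
    return None
```

unsolicited mail arriving at kjamez-slashdot@tld.com would come back tagged as "slashdot", which points at where the address leaked from.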
you can't have everything, where would you put it?
I posted an automated weekly analysis of the language used in my email some time ago.
http://www.2ad.com/~john/spam_zeitgeist/
This focuses more on language used rather than on message type. So it reveals some of the patterns used in marketing messages.
John