Has Google Broken JavaScript Spam Munging?
Baxil writes "For years now, Javascript munging has been a useful tool to share email addresses on the Web without exposing them to spammers. However, Google is now apparently evaluating Javascript when assembling summary text for web pages' listings, and publishing the un-munged email addresses to the world; and spammers have started to take advantage of this kind service." Anyone else seen this affecting their carefully protected email addresses?
Seriously, cue the obfuscation != security thing. If your email address is carefully protected, it is not displayed on a web page, obfuscated or not.
Really, with the development of better OCR technologies and such comes the end of e-mail security through obscurity. If you don't want spam, either A) have a decent spam filter (I don't think I've had a single piece of spam pass through Gmail's filter, and only one false positive) or B) don't share your e-mail address. Those are the only two ways to prevent spam that will continue to work.
Taxation is legalized theft, no more, no less.
See http://en.wikipedia.org/wiki/Mung
That should be the title. That is, if it were newsworthy. Which it isn't.
This can easily be fixed, and should be right away. If Google is turning JavaScript into text output, they can easily parse that output (just like the spammers currently are) and see if the text contains an e-mail address. And if it does, they should omit it from search results (unless the address was originally plain text and not obfuscated, in which case they can assume the author wants it searchable).
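For what it's worth, the check described above could look roughly like this. It is only a sketch: the function name, the placeholder text, and the deliberately simplified address pattern are all my own assumptions, and nothing here reflects how Google actually builds its summaries.

```python
import re

# Deliberately simplified address pattern (an assumption for this sketch);
# real-world address syntax is much looser than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def redact_script_only_emails(raw_html: str, rendered_text: str) -> str:
    """Hide addresses that show up only after the page's scripts have run."""
    raw_lower = raw_html.lower()

    def replace(match: re.Match) -> str:
        address = match.group(0)
        # An address present verbatim in the source is assumed to be intentional.
        if address.lower() in raw_lower:
            return address
        return "[address hidden]"

    return EMAIL_RE.sub(replace, rendered_text)
```

Here `raw_html` is the page as served and `rendered_text` is whatever text the crawler ends up with after evaluating scripts. An address written out in the markup passes through untouched, while one assembled by `document.write()` gets replaced.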
Dear Google:
Welcome to the "Impossible to do anything right" club.
Regards,
Wal-Mart,
Microsoft,
G. W. Bush
WTF? Over?
So much content on the web these days is spat out by document.write(), I'm not surprised at all that Google evaluates certain JavaScript in order to get any content to index.
Ever done a "View Source" on a Google Mail or Google Maps page? The web is now JavaScript.
Nowadays, half of the pages I try to visit don't render at all without JavaScript. Sometimes the main content is missing (you just get the headline, the links that go on the sides, and the ads), sometimes it's just a blank page. It seems like all these traditional news organizations just _have_ to be "web 2.0" to appear relevant again.
Google needs to index the page, they don't have much choice.
--
Stay tuned for some shock and awe coming right up after these messages!
The spammers WILL get your email address. Be it web trawling, Google searches, or stealing email addresses off of compromised computers, the spammers will get, and then resell, your email address.
Trying to keep the spammers from getting your email address is a lost cause, and not a battle worth fighting.
Test your net with Netalyzr
Your email address will almost certainly get out. If not by a spambot then through an unscrupulous merchant.
That's why spam filtering is better than email hiding. Gmail's spam filter, for example, is very good. I get spam in my Inbox about once a quarter.
Google's job is to turn human-readable pages into machine-searchable pages. So it will always seek to expand what it can read: images, Flash, JavaScript, etc.
It's best not to hide in the direction that technology is advancing.
I assume if you load your obfuscation code from script.js and put script.js in robots.txt that you will be safe, although that is sort of a pain.
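A rough way to sanity-check that idea is Python's standard `urllib.robotparser`. The robots.txt contents, the bot name, and the example.com URLs below are made up for illustration; only the `/script.js` path comes from the suggestion above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the setup described above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /script.js
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A compliant crawler would skip the script but still fetch the page itself.
script_allowed = parser.can_fetch("ExampleBot", "https://example.com/script.js")
page_allowed = parser.can_fetch("ExampleBot", "https://example.com/contact.html")
```

Keep in mind that robots.txt is purely advisory. It only helps with crawlers that choose to honor it, so it does nothing against a harvester that fetches and runs the script anyway.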
What would be nice is if Google created a new tag along the lines of rel="nofollow" which would be an in-line way to keep the engine from seeing content.
When things get complex, multiply by the complex conjugate.
http://mailhide.recaptcha.net/
weinersmith
A better method is to have a Contact Me form that doesn't display your e-mail address anywhere on it. Yes, you'll get spammers filling it out, but you can cut down on those with some simple techniques. For example, make a "Phone Number" field and set the CSS display attribute to none. Normal users won't see this field and won't fill it out. Spam-bots will see it and attempt to fill it out. Then, have your submission script silently fail to send to e-mail if the "Phone Number" is filled out. (If you toss an error, the spammer might figure out the trick.) No method is fool-proof, of course, but this is much better than putting your e-mail address on your webpage and hoping that someone doesn't de-mung it.
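A minimal sketch of that honeypot trick, assuming a form with `name`, `message`, and the hidden `phone` field. The markup, the function names, and the `send_mail` callback are all hypothetical stand-ins for whatever your own submission script uses.

```python
# Hypothetical form markup: the "phone" input is hidden from people via CSS,
# so only a bot that fills in every field is expected to give it a value.
CONTACT_FORM_HTML = """\
<form method="post" action="/contact">
  <input name="name">
  <textarea name="message"></textarea>
  <input name="phone" style="display: none" tabindex="-1" autocomplete="off">
  <button type="submit">Send</button>
</form>
"""


def should_send(form: dict) -> bool:
    """Return True only when the hidden honeypot field was left empty."""
    return not form.get("phone", "").strip()


def handle_submission(form: dict, send_mail) -> str:
    """Send the message if it looks human; respond identically either way."""
    if should_send(form):
        send_mail(form.get("name", ""), form.get("message", ""))
    # Same response on both paths, so a bot gets no hint that it was dropped.
    return "Thanks, your message has been sent."
```

The key design choice is the identical response: the bot's submission is dropped without any error it could learn from.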
My sci-fi novel, Ghost Thief, is now available from Amazon.com.
Yeah, no kidding. I was wondering where Chowder and Schnitzel were
"When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
In order to prevent SPAMbots once and for all, you should require that everyone interested in contacting you first drive to the next geohash http://www.wiki.xkcd.com/geohashing/Main_Page in the region of your choosing, wearing a lumberjack outfit and carrying a case of jolt cola.
Then, and only then, does the real quest begin...
-Taylor
Worldwide Military budgets: $2100 billion. Worldwide Space Exploration budgets: $38 billion. Really, world? Really?
How about "pay to email"?
I register with a pay-to-email site, and give it my actual email address. It gives me my new publicly visible email address. Anyone who wants to can send me an email through this service if they pay me an amount of money that I set. After I receive the email, I can refund the sender. The pay-to-email site takes a 10% cut on all un-refunded emails.
Sound like a winner?
Education is the silver bullet.
>The wikipedia page also links to munge - modify until not guessed easily -
> which I guess is what the original person intended
Then the original poster is a chimp and so are you. If you aren't aware that adding ~e may change the meaning of a word, I should come round and rap your ears.
It's a hack. When moving technology forward, you need to pick your battles when asking "should we not improve this service? It will break the hacks"?
All in all, you are displaying text on a page. Google's job is to take text that humans can read and make it text that humans can find.
I agree, spam is a problem, but this kind of obfuscation will only get you so far. It's the same argument that can be said about MP3s. If you can hear it, we can steal it. Same as "if you can see it."
Spam stinks, but in the end, even with these tricks, you are making your address public. Public information will be harvested by mortals and robots alike.
I don't think the spammers got his email address from Google. I mean, to do that they'd have to send a fairly narrow query to Google -- something like 'chibi jesus' -- and then scrape the results ... just scraping the cached page wouldn't help -- that contains JS, not the email address. Plus, I imagine Google would notice if a bot started sending lots of search queries its way.
It's far more likely that spammer bots are now actively processing JS. As others on this thread have pointed out, it ain't hard to do.
Go somewhere random
> Then the original poster is a chimp and so are you. If you aren't aware that adding ~e may change the meaning of a word, I should come round and rap your ears.
Then the original poster is a chimp and so are you. If you aren't aware that adding ~e may change the meaning of a word, I should come round and rape your ears.
You're right, just one 'e' and the whole thing changes.
Just another ignorant American.
Yoeu're reight, juest onee 'e'e ande thee whoele tehing chaenges.
Canada: The US's more awesome sibling.
I publicly list my email whenever I need to. If I want someone to email me something, I say, "Send it to itoltz@gmail.com". In fact, if HTML is allowed wherever I'm writing that, I'll even be so kind as to make it a mailto link (i.e. <a href='mailto:itoltz@gmail.com'>itoltz@gmail.com</a>).
And you know what? I almost never get spam in my inbox. I'd say a piece squeaks through Gmail's filters every few months (though when it does, I usually seem to get 2-3 similar spams over the course of a day or two).
Granted, not everyone has the option of using Gmail, and of those who do, not everyone is comfortable with the idea of using it. That's fine. But the point is, if Gmail is that good at filtering out spam, anyone else can be too.
And if you double it:
I should come round and rape your arse
Do you even lift?
These aren't the 'roids you're looking for.
This only works for as long as spammers don't care about it. I think anyone who can figure out the HTML resulting from JavaScript can also figure out the style of an element.
What's really funny about this problem is that we used to talk about using captchas to tell the robots apart from the meatbags, so that you could discriminate against robots. But now people want the robots to make sense of their page (so that they get referrals from Google) but they don't want the robots to make sense of their page (so that their email box doesn't get referrals from spambot). You're on the web or you're not. Choose.
"Believe me!" -- Donald Trump
For everyone's information: the page the author links to as the one that has javascript munging also has a noscript tag with the email out in the open. Guess what Google and spammers' email-crawlers really do? ;)
Apple has "Mac vs PC", Microsoft has "Laptop Hunters", Linux has recession
Yay, Google. Judging by the responses I've seen so far, it seems most of us think this is a step forward for the search engine. That said, why don't we use this story as an opportunity to have a productive conversation about e-mail address security in a world where JavaScript's effectiveness is dwindling? Here's one from A List Apart that uses some fancy mod_rewrite stuff. http://www.alistapart.com/articles/gracefulemailobfuscation/ I know we've got a lot of geniuses and experts in here. Don't be modest! Show off how smart you are! And yes, the next brilliant security measure will someday be pummeled by a robot that some spammer puts together, but hell if that ain't just exciting! We're helping people build better, "smarter" robots, and criminals are some of society's greatest innovators.
Welcome to Slashdot. Replace this text with your desired signature before replying to a story.
I believe a 'WHOOSH' is in order.
Actually, proper English indicates that you double the consonant when adding 'ing' if the word ends with one, or drop the 'e' if it ends with one:
hop -> hopping
hope -> hoping
so:
munge -> munging
mung -> mungging
Jumpstart the tartan drive.
From Jargon File (4.4.4, 14 Aug 2003) [jargon]:
mung /muhng/, vt.
[in 1960 at MIT, "Mash Until No Good"; sometime after that the
derivation from the {recursive acronym} "Mung Until No Good" became
standard; but see {munge}]
1. To make changes to a file, esp. large-scale and irrevocable
changes. See {BLT}.
2. To destroy, usually accidentally, occasionally maliciously. The
system only mungs things maliciously; this is a consequence of
{Finagle's Law}. See {scribble}, {mangle}, {trash}, {nuke}. Reports
from {Usenet} suggest that the pronunciation /muhnj/ is now usual in
speech, but the spelling `mung' is still common in program comments
(compare the widespread confusion over the proper spelling of
{kluge}).
3. In the wake of the {spam} epidemics of the 1990s, mung is now
commonly used to describe the act of modifying an email address in a
sig block in a way that human beings can readily reverse but that will
fool an {address harvester}. Example: johnNOSPAMsmith@isp.net.
4. The kind of beans the sprouts of which are used in Chinese food.
(That's their real name! Mung beans! Really!)
Like many early hacker terms, this one seems to have originated at
{TMRC}; it was already in use there in 1958. Peter Samson (compiler of
the original TMRC lexicon) thinks it may originally have been
onomatopoeic for the sound of a relay spring (contact) being twanged.
However, it is known that during the World Wars, `mung' was U.S. army
slang for the ersatz creamed chipped beef better known as `SOS', and
it seems quite likely that the word in fact goes back to Scots-dialect
{munge}.
Charles Mackay's 1874 book Lost Beauties of the English Language
defined "mung" as follows: "Preterite of ming, to ming or mingle; when
the substantive meaning of mingled food of bread, potatoes, etc.
thrown to poultry. In America, `mung news' is a common expression
applied to false news, but probably having its derivation from mingled
(or mung) news, in which the true and the false are so mixed up
together that it is impossible to distinguish one from another."
See the third definition.
The email address is not munged, or you couldn't un-mung it.
You munged it; you can't un-mung it!
Stay tuned for more... Tales! Of! Internet!
I knew it! I'm surrounded by assholes!
"Necessity is the plea for every infringement of human freedom. It is the argument of tyrants; it is the creed of slaves."
Nice try, but that rule only applies to "[^ng]g$" words. It doesn't apply to "[n]g$" words, because the n modifies the sound of the g, and "gg$" is uncommon enough that it's an exception in itself.
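If anyone wants to poke at the pattern, here is a toy sketch. The `add_ing` helper is my own invention and covers only the g-final and e-final cases argued about in this thread. It is not a general English rule, and it ignores exceptions that keep the e.

```python
import re

# The pattern from the comment: a final "g" preceded by something other
# than "n" or "g".
DOUBLING_PATTERN = re.compile(r"[^ng]g$")


def add_ing(word: str) -> str:
    """Toy rule for words ending in 'g' or 'e' only; simplified on purpose."""
    if word.endswith("e"):
        return word[:-1] + "ing"   # hope -> hoping, munge -> munging
    if DOUBLING_PATTERN.search(word):
        return word + "ging"       # hug -> hugging
    return word + "ing"            # mung -> munging, sing -> singing
```

Under this reading both "mung" and "munge" come out as "munging", and "mungging" never appears.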
Unfortunately we don't have many examples of "ung$" because most of the words of that form are either nouns (e.g. dung, lung, young) or past participles (e.g. clung, hung, sung), so their present participles are generally formed from the present tense "ing$" form of the word (e.g. cling/clung/clinging, hang/hung/hanging, sing/sung/singing).
Note that we do have plenty of examples of "unge$" forming "unging$":
So that's plenty of reason to believe that the rule is "unge + ing = unging", despite the fact that "inge + ing" can be either "inging" or "ingeing" depending on the word (and in some cases both are valid):
Therefore I strongly contend that:
You may dispute the claim above, but there's no disputing:
:)
It has been happening for quite some time.
I have always said that the only way to keep your e-mail address safe from spammers is to not give it out at all. Although Google may be doing it now, it's been perfectly possible for as long as computing power has been cheaply available to the spammers (i.e. botnets).
About 4 years ago I conducted an experiment with anti-spam techniques for the comments on my blog. One of the things I tried was a bit of javascript which added a validation field to the form. The spammers kept on as if it wasn't there, which meant they had to be evaluating javascript.
And the thing is, once your obfuscation measures are broken by the spammers, because of places like archive.org the internet never forgets, so you can't claw it back. You can update your obfuscation code on your site, but there's nothing stopping the spammers from simply trawling the archives and mirrors to find it there.
The only way to protect your e-mail address is to never send it client-side - always put it behind a form and a server-side mailing script.
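As a rough illustration of the server-side approach, here is a sketch using Python's standard `email.message` module. The addresses, the function name, and the subject format are placeholders I made up. The point is only that the owner's address lives in server code and never appears in anything sent to the browser.

```python
from email.message import EmailMessage

# Assumption for this sketch: the real address lives only in server-side
# code or configuration and is never written into any page.
OWNER_ADDRESS = "owner@example.com"


def build_contact_message(sender_name: str, reply_to: str, body: str) -> EmailMessage:
    """Build the mail a contact-form handler would send to the site owner."""
    for value in (sender_name, reply_to):
        # Reject line breaks so visitors cannot inject extra headers.
        if "\r" in value or "\n" in value:
            raise ValueError("line breaks are not allowed in header fields")

    message = EmailMessage()
    message["To"] = OWNER_ADDRESS
    message["From"] = "Contact form <noreply@example.com>"
    message["Reply-To"] = reply_to
    message["Subject"] = f"Contact form message from {sender_name}"
    message.set_content(body)
    return message
```

The resulting message can then be handed to whatever mail transport the server uses, for example `smtplib.SMTP(...).send_message(message)`. You will still want some spam control on the form itself.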
There's thus no point whatsoever in any form of address obfuscation or munging: it's a complete waste of time indulged in only by the clueless, delusional few who haven't been paying attention to what's gone on during the past decade. What's truly ironic is how many of these people are actually running Windows and thus stand a reasonably good chance of having their own system be the point at which their address(es) are harvested.
A far better point to critique Google on would be their pointless munging of addresses in Usenet news articles -- spammers have had their own Usenet feeds for MANY years and all Google's done is make the archives less useful for everyone else.