Slashdot Mirror


Has Google Broken JavaScript Spam Munging?

Baxil writes "For years now, Javascript munging has been a useful tool to share email addresses on the Web without exposing them to spammers. However, Google is now apparently evaluating Javascript when assembling summary text for web pages' listings, and publishing the un-munged email addresses to the world; and spammers have started to take advantage of this kind service." Anyone else seen this affecting their carefully protected email addresses?

32 of 288 comments

  1. *rolleyes* by Anonymous Coward · · Score: 5, Insightful

    Seriously, queue the obfuscation != security thing. If your email address is carefully protected, it is not displayed on a web page, obfuscated or not.

    1. Re:*rolleyes* by hardburn · · Score: 4, Interesting

      Javascript did a pretty good job at this

      No, it didn't. Google isn't doing anything the spammers couldn't have done themselves with a little bit of Perl.

      --
      Not a typewriter
    2. Re:*rolleyes* by broken_chaos · · Score: 3, Informative

      Spambots don't, and never have, invested enough time to include JavaScript parsing. One of the linked articles suggests this is due to the possibility of crashing when trying to interpret badly formed or incorrect JavaScript, but it could also be because simple plaintext parsing (maybe with HTML tags stripped) has been producing enough results so far.

      Most spambots have been proven, in several experiments, to not even parse hex/decimal HTML character entities, so JavaScript parsing was considered to be mostly safe for the moment. It's not like people assume this is a perfect spam-blocking method - just that it's good enough to not get thousands upon thousands of spam, limiting it to a reasonable number.
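A minimal sketch of why character entities are weak protection, assuming a harvester that bothers to decode them before scanning. The address, the `EMAIL_RE` pattern and the function name are made up for illustration; `html.unescape` is the standard-library decoder.

```python
import html
import re
from typing import List

# Loose pattern for illustration only; real address grammar is more permissive.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest_entity_encoded(page_source: str) -> List[str]:
    """Decode HTML character references, then scan for address-like strings."""
    decoded = html.unescape(page_source)
    return EMAIL_RE.findall(decoded)


# A hypothetical address written as decimal and hex character references.
encoded = "".join(f"&#{ord(c)};" for c in "user@") + "".join(
    f"&#x{ord(c):x};" for c in "example.com"
)

print(EMAIL_RE.findall(encoded))          # plain-text scan of the raw source
print(harvest_entity_encoded(encoded))    # scan after decoding the entities
```

The plain-text scan misses the address only because the harvester skipped one library call, which is the "good enough for now" trade-off described above.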

    3. Re:*rolleyes* by NewWorldDan · · Score: 3, Interesting

      Yep, the keyword there is most spambots. It just takes one spammer motivated enough to write a JavaScript parser for common munging techniques or, in this case, to find an app out there that does it automagically for them. I would expect that email addresses stored as an image would be less subject to abuse for two reasons: first, it creates a much larger download, causing a bottleneck, and second, it's much more computationally intensive. Still, it can, of course, be done. After all, it may only be a matter of time until Google or MSN parse it and save the results for the rest of the world.

      What I find works best is to use a web form for submitting messages on our company website. That only gets spammed about once a month, and usually for something almost relevant to what we do. Then again, 2 years ago it never got spammed.
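To illustrate how little a "parser for common munging techniques" needs, here is a sketch that handles just one assumed technique: addresses built with `String.fromCharCode(...)` inside a `document.write` call. The munged snippet, the address and the function name are hypothetical, and nothing here evaluates JavaScript in general.

```python
import re

# Matches calls such as String.fromCharCode(117, 115, 101, 114)
FROM_CHAR_CODE_RE = re.compile(
    r"String\.fromCharCode\(\s*(\d+(?:\s*,\s*\d+)*)\s*\)"
)


def unmunge_char_codes(script: str) -> str:
    """Replace each String.fromCharCode(...) call with the text it produces."""

    def decode(match):
        codes = [int(n) for n in match.group(1).split(",")]
        return "".join(chr(n) for n in codes)

    return FROM_CHAR_CODE_RE.sub(decode, script)


# Hypothetical munged snippet as it might appear in a page's source.
munged = (
    "document.write(String.fromCharCode("
    "117,115,101,114,64,101,120,97,109,112,108,101,46,99,111,109));"
)

print(unmunge_char_codes(munged))
```

One regular expression per popular munging script is enough here, which is why the protection depends on spammers not bothering rather than on any real difficulty.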

    4. Re:*rolleyes* by david.given · · Score: 4, Funny

      <pedent>

      This, of course, is the traditional spelling/grammar flame typo. I think it's a law of nature.

  2. Really.... by Darkness404 · · Score: 4, Insightful

    Really, with the development of better OCR technologies and such comes the elimination of e-mail security by obscurity. If you don't want spam, either A) have a decent spam filter (I don't think I've had a single piece of spam pass through Gmail's filter, and only one false positive) or B) don't share your e-mail address. Those are the only two ways to prevent spam that will continue to work.

    --
    Taxation is legalized theft, no more, no less.
    1. Re:Really.... by buchner.johannes · · Score: 4, Insightful

      No it is not. If you increase the time used per website, you cannot process that many websites anymore. JS-obfuscated emails were protected because spammers didn't make the effort.
      You might say computers got faster, but unfortunately the web didn't get smaller.

      Anyway, I understand the need to post email addresses on a website. How else should people contact you the first time? Personally, I don't like contact forms. Would you advocate for a CAPTCHA or requiring a POST request to obtain the real email address? You could still cry "security by obscurity".

      But you can't take away the option of posting email addresses on websites from users, as it is very useful to contact people by email. Reminds me of people saying "Flash is proprietary, and too fancy for my taste anyway, so nobody must use it. Use Javascript.".

      Maybe one should make swf files with the email in them. Muhahaha

      --
      NB: The message above might reflect my opinion right now, but not necessarily tomorrow or next year.
  3. Re:Mung by eikonoklastes · · Score: 4, Informative
  4. "Google indexes correctly rendered page" by RichardDeVries · · Score: 5, Insightful

    That should be the title. That is, if it were newsworthy. Which it isn't.

    --
    Error 001
    Security Scan and Virus Detection do not work with your operating system.
  5. Welcome to the club by fataugie · · Score: 5, Funny

    Dear Google:

    Welcome to the "Impossible to do anything right" club.

    Regards,

    Wal-Mart,
    Microsoft,
    G. W. Bush

    --

    WTF? Over?

  6. What else can google do? by Bazman · · Score: 3, Insightful

    So much content on the web these days is spat out by document.write(), I'm not surprised at all that google evaluates certain javascripts in order to get any content to index.

    Ever done a "View Source" on a Google Mail or Google Maps page? The web is now JavaScript.

    1. Re:What else can google do? by hairyfeet · · Score: 3, Insightful

      Well I don't know about him, but I can tell you why I block JavaScript and use Noscript and ABP, and it is because JavaScript is becoming the new ActiveX. You see, ActiveX wasn't really that bad when it was just used by a couple of corporate types for very basic jobs but then along came everybody and their dog and soon the web became a giant ActiveX nightmare.

      Now we are seeing the same thing all over again with JavaScript and Flash. Sites that could have been perfectly fine in plain old HTML become this giant bloated mess that can cause even a good dual core and cable connection to go "WTF? Is the script not responding?" because they have overloaded it with crap. So I will happily block most scripts and keep my bandwidth and my sanity, thanks ever so much. I have found most sites that are the worst offenders rarely have anything worth looking at anyway.

      BTW, Who is writing the code for Slashdot anyway? I've found if I don't block JavaScript on Slashdot it looks like the page was rendered with a shotgun. Just really nasty and hard to read.

      --
      ACs don't waste your time replying, your posts are never seen by me.
  7. It's not google, it's the web developers by Punto · · Score: 5, Insightful

    Nowadays, half of the pages I try to visit don't render at all without JavaScript. Sometimes the main content is missing (you just get the headline, the links that go on the sides, and the ads), sometimes it's just a blank page. It seems like all these traditional news organizations just _have_ to be "web 2.0" to appear relevant again.

    Google needs to index the page, they don't have much choice.

    --
    Stay tuned for some shock and awe coming right up after this messages!

    1. Re:It's not google, it's the web developers by BlitzTech · · Score: 3, Insightful

      AJAX is a great technology that has vastly improved the usefulness of the web. However, like every other fad, it gets significantly overused in places where it just IS NOT reasonable. I wish more developers would come to the realization that AJAX != 'Web 2.0-ifying your page' and move back to using the right technology for a given problem. AJAX everywhere just reeks of the same kind of software bloat that makes modern computers run slow compared to 5-10 year old equipment.

      When all you have is a hammer...

    2. Re:It's not google, it's the web developers by Todd+Knarr · · Score: 3, Insightful

      Seconded. You don't need Javascript to do a simple hyperlink. You don't need a scrolling text-box to display your page, the browser can scroll the page just fine thankyouveddymuch. You don't need to dynamically replace elements to change content while maintaining a navigation header or sidebar when appropriate (note: appropriate) use of frames will accomplish exactly what you want.

      The two sins of engineering: making it more complicated than it needs to be, and making it simpler than it needs to be. Avoid them.

  8. Who CARES? by nweaver · · Score: 4, Interesting

    The spammers WILL get your email address. Be it web trawling, Google searches, or stealing email addresses off compromised computers, the spammers will get, and then resell, your email address.

    Trying to keep the spammers from getting your email address is a lost cause, and not a battle worth fighting.

    --
    Test your net with Netalyzr
  9. robots.txt by physicsphairy · · Score: 3, Interesting

    I assume if you load your obfuscation code from script.js and put script.js in robots.txt that you will be safe, although that is sort of a pain.

    What would be nice is if Google created a new tag along the lines of rel="nofollow" which would be an in-line way to keep the engine from seeing content.
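A rough sketch of the robots.txt idea, assuming the obfuscation code is served from `/script.js` on a hypothetical `example.com`. It uses the standard-library `urllib.robotparser` to show what a compliant crawler would conclude; robots.txt is only advisory, so a harvester is free to ignore it, and whether a blocked script stops a search engine from rendering the address is the commenter's assumption, not something this checks.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that asks crawlers to stay away from the script.
ROBOTS_TXT = """\
User-agent: *
Disallow: /script.js
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawler_may_fetch(url: str, user_agent: str = "Googlebot") -> bool:
    """Return what a rule-following crawler would decide for this URL."""
    return parser.can_fetch(user_agent, url)


print(crawler_may_fetch("https://example.com/script.js"))
print(crawler_may_fetch("https://example.com/index.html"))
```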

    1. Re:robots.txt by RajivSLK · · Score: 4, Insightful

      What would be nice is if Google created a new tag along the lines of rel="nofollow" which would be an in-line way to keep the engine from seeing content.

      That would be exploited by spammers to the extreme. Imagine clicking on a listing for disney kids fun house only to have a hidden ad for an online Viagra dispensary dominate the page.

  10. Contact Me Form by Jason+Levine · · Score: 5, Informative

    A better method is to have a Contact Me form that doesn't display your e-mail address anywhere on it. Yes, you'll get spammers filling it out, but you can cut down on those with some simple techniques. For example, make a "Phone Number" field and set the CSS display attribute to none. Normal users won't see this field and won't fill it out. Spam-bots will see it and attempt to fill it out. Then, have your submission script silently fail to send to e-mail if the "Phone Number" is filled out. (If you toss an error, the spammer might figure out the trick.) No method is fool-proof, of course, but this is much better than putting your e-mail address on your webpage and hoping that someone doesn't de-mung it.
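The server side of that hidden-field trick might look roughly like this. It is a framework-free sketch: the field names `phone_number` and `message`, the `send` callback and the response text are all assumptions, and the form data is taken to be a plain dict.

```python
def should_deliver(form: dict) -> bool:
    """Return True only when the hidden honeypot field was left empty."""
    return not (form.get("phone_number") or "").strip()


def handle_submission(form: dict, send) -> str:
    """Deliver real messages and quietly drop ones that filled the honeypot."""
    if should_deliver(form):
        send(form.get("message", ""))
    # Same response either way, so a bot gets no hint that it was filtered.
    return "Thanks, your message has been sent."


delivered = []
handle_submission({"message": "Hello!", "phone_number": ""}, delivered.append)
handle_submission({"message": "Buy pills", "phone_number": "555-0100"}, delivered.append)
print(delivered)
```

Returning the identical response in both branches is the "silently fail" part: an error page would tell the spammer which field gave them away.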

    --
    My sci-fi novel, Ghost Thief, is now available from Amazon.com.
  11. Pay to email by Viking+Coder · · Score: 5, Interesting

    How about "pay to email"?

    I register with a pay-to-email site, and give it my actual email address. It gives me my new publicly visible email address. Anyone who wants to can send me an email through this service if they pay me an amount of money that I set. After I receive the email, I can refund the sender. The pay-to-email site takes a 10% cut on all un-refunded emails.

    Sound like a winner?

    --
    Education is the silver bullet.
    1. Re:Pay to email by Kozz · · Score: 4, Funny

      How about "pay to email"?

      I register with a pay-to-email site, and give it my actual email address. It gives me my new publicly visible email address. Anyone who wants to can send me an email through this service if they pay me an amount of money that I set. After I receive the email, I can refund the sender. The pay-to-email site takes a 10% cut on all un-refunded emails.

      Sound like a winner?

      My... GOD... that's genius! Your plan clearly has no flaws. We should implement it right now.

      OK, honestly, I was just too lazy to fill out the ubiquitous rejection form.

      --
      I only post comments when someone on the internet is wrong.
  12. Re:Mung by Anonymous Coward · · Score: 5, Funny

    >The wikipedia page also links to munge - modify until not guessed easily -
    > which I guess is what the original person intended

    Then the original poster is a chimp and so are you. If you aren't aware that adding ~e may change the meaning of a word, I should come round and rap your ears.

  13. One might say Google "Fixed" it by dmomo · · Score: 3, Interesting

    It's a hack. When moving technology forward, you need to pick your battles when asking "should we not improve this service? It will break the hacks"?

    All in all, you are displaying text on a page. Google's job is to take text that humans can read and make it text that humans can find.

    I agree, spam is a problem, but this kind of obfuscation will only get you so far. It's the same argument that can be said about MP3s. If you can hear it, we can steal it. Same as "if you can see it."

    Spam stinks, but in the end, even with these tricks, you are making your address public. Public information will be harvested by mortals and robots alike.

  14. Re:Mung by digitalsolo · · Score: 3, Funny

    Then the original poster is a chimp and so are you. If you aren't aware that adding ~e may change the meaning of a word, I should come round and rap your ears.

    Then the original poster is a chimp and so are you. If you aren't aware that adding ~e may change the meaning of a word, I should come round and rape your ears.

    You're right, just one 'e' and the whole thing changes.

    --
    Just another ignorant American.
  15. Much ado about nothing by Asmor · · Score: 4, Insightful

    I publicly list my email whenever I need to. If I want someone to email me something, I say, "Send it to itoltz@gmail.com". In fact, if HTML is allowed wherever I'm writing that, I'll even be so kind as to make it a mailto link (i.e. <a href='mailto:itoltz@gmail.com'>itoltz@gmail.com</a>).

    And you know what? I almost never get spam in my inbox. I'd say a piece squeaks through Gmail's filters every few months (though when it does, I usually seem to get 2-3 similar spams over the course of a day or two).

    Granted, not everyone has the option of using Gmail, and of those who do, not everyone is comfortable with the idea of using it. That's fine. But the point is, if Gmail is that good at filtering out spam, anyone else can be too.

  16. Re:Mung by larry+bagina · · Score: 3, Funny

    And if you double it:

    I should come round and rape your arse

    --
    Do you even lift?

    These aren't the 'roids you're looking for.

  17. Some robots are more equal than others by Cajun+Hell · · Score: 5, Insightful

    For example, make a "Phone Number" field and set the CSS display attribute to none. Normal users won't see this field and won't fill it out. Spam-bots will see it and attempt to fill it out.

    This only works for as long as spammers don't care about it. I think anyone who can figure out the HTML resulting from javascript, can also figure out the style of an element.
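As a sketch of how little that takes, the following uses the standard-library `html.parser` to list only the inputs a human would see. It assumes the field is hidden with an inline style; a rule in an external stylesheet would need an actual CSS engine. The form markup and class name are invented for the example.

```python
from html.parser import HTMLParser


class VisibleFieldFinder(HTMLParser):
    """Collect names of <input> fields not hidden by an inline style."""

    def __init__(self):
        super().__init__()
        self.visible = []

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        style = (attributes.get("style") or "").replace(" ", "").lower()
        if "display:none" in style or attributes.get("type") == "hidden":
            return  # a human would never see or fill this field
        if attributes.get("name"):
            self.visible.append(attributes["name"])


# Hypothetical contact form with a honeypot "phone_number" field.
FORM = (
    '<form><input name="email"><input name="message">'
    '<input name="phone_number" style="display: none"></form>'
)

finder = VisibleFieldFinder()
finder.feed(FORM)
print(finder.visible)
```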

    What's really funny about this problem is that we used to talk about using captchas to tell the robots apart from the meatbags, so that you could discriminate against robots. But now people want the robots to make sense of their page (so that they get referrals from Google) but they don't want the robots to make sense of their page (so that their email box doesn't get referrals from spambot). You're on the web or you're not. Choose.

    --
    "Believe me!" -- Donald Trump
  18. Let's Get to Work by tomsomething · · Score: 3, Interesting

    Yay, Google. Judging by the responses I've seen so far, it seems most of us think this is a step forward for the search engine. That said, why don't we use this story as an opportunity to have a productive conversation about e-mail address security in a world where JavaScript's effectiveness is dwindling? Here's one from A List Apart that uses some fancy mod_rewrite stuff. http://www.alistapart.com/articles/gracefulemailobfuscation/ I know we've got a lot of geniuses and experts in here. Don't be modest! Show off how smart you are! And yes, the next brilliant security measure will someday be pummeled by a robot that some spammer puts together, but hell if that ain't just exciting! We're helping people build better, "smarter" robots, and criminals are some of society's greatest innovators.

    --
    Welcome to Slashdot. Replace this text with your desired signature before replying to a story.
  19. Re:Mung by PearsSoap · · Score: 3, Funny

    The email address is not munged, or you couldn't un-mung it.

    You munged it; you can't un-mung it!

    Stay tuned for more... Tales! Of! Internet!

  20. Re:Mung by Anonymous Coward · · Score: 5, Informative

    Nice try, but that rule only applies to "[^ng]g$" words.

    beg + ing = begging
    dig + ing = digging
    hog + ing = hogging
    rag + ing = ragging
    tug + ing = tugging

    but it doesn't apply to "[n]g$", because the n modifies the sound of the g, and gg$ is uncommon enough that it's an exception in itself.

    bang + ing = banging
    bring + ing = bringing
    (egg + ing = egging)
    hang + ing = hanging
    long + ing = longing
    ping + ing = pinging
    sing + ing = singing

    Unfortunately we don't have many examples of "ung$" because most of the words of that form are either nouns (e.g. dung, lung, young) or past participles (e.g. clung, hung, sung), so their present participles are generally formed from the present tense "ing$" form of the word (e.g. cling/clung/clinging, hang/hung/hanging, sing/sung/singing), etc.

    Note that we do have plenty of examples of "unge$" forming "unging$":

    expunge + ing = expunging
    lounge + ing = lounging
    lunge + ing = lunging
    plunge + ing = plunging
    scrounge + ing = scrounging

    So that's plenty of reason to believe that the rule is "unge + ing = unging", despite the fact that "inge + ing" can be either "inging" or "ingeing" depending on the word (and in some cases both are valid):

    binge + ing = binging or bingeing (both are valid; look it up)
    cringe + ing = cringing
    impinge + ing = impinging
    singe + ing = singeing
    twinge + ing = twinging or twingeing (both are valid)

    Therefore I strongly contend that:

    mung + ing = munging
    munge + ing = munging or mungeing (both are valid)

    You may dispute the claim above, but there's no disputing:

    mung + ed = munged
    munge + ed = munged

    :)
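For anyone who wants to check the rule mechanically, here is a sketch that encodes only the patterns claimed above: "unge$" drops the e, "[^ng]g$" doubles the g, and everything else just takes "ing". The "inge" words are deliberately left out since, as noted, they vary by word. The function name is made up.

```python
import re


def add_ing(word: str) -> str:
    """Apply the -ing rules exactly as stated in the comment above."""
    if re.search(r"unge$", word):      # expunge -> expunging
        return word[:-1] + "ing"
    if re.search(r"[^ng]g$", word):    # beg -> begging
        return word + "ging"
    return word + "ing"                # bang -> banging, egg -> egging


for word in ["beg", "bang", "egg", "lounge", "mung", "munge"]:
    print(word, "+ ing =", add_ing(word))
```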

  21. Re:Mung by SausageOfDoom · · Score: 3, Insightful

    It has been happening for quite some time.

    I have always said that the only way to keep your e-mail address safe from spammers is to not give it out at all. Although Google may be doing it now, it's been perfectly possible for as long as computing power has been available cheaply to the spammers (i.e. botnets).

    About 4 years ago I conducted an experiment with anti-spam techniques for the comments on my blog. One of the things I tried was a bit of javascript which added a validation field to the form. The spammers kept on as if it wasn't there, which meant they had to be evaluating javascript.

    And the thing is, once your obfuscation measures are broken by the spammers, because of places like archive.org the internet never forgets - so you can't claw it back. You can update your obfuscation code on your site, but there's nothing stopping the spammers from simply trawling the archives and mirrors to find it there.

    The only way to protect your e-mail address is to never send it client-side - always put it behind a form and a server-side mailing script.
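A minimal sketch of the server-side half, assuming the real address lives only in server configuration and the form supplies a name, a reply address and a body. The addresses, the function name and the subject line are placeholders; the message is built with the standard-library `email.message.EmailMessage`.

```python
from email.message import EmailMessage

# Assumption: the real address is kept in server-side config and is never
# written into any page sent to the browser.
RECIPIENT = "owner@example.com"


def build_contact_email(sender_name: str, reply_to: str, body: str) -> EmailMessage:
    """Build the message the mailing script would hand to the mail server."""
    message = EmailMessage()
    message["To"] = RECIPIENT
    message["From"] = "webform@example.com"  # hypothetical site-owned sender
    message["Reply-To"] = reply_to
    message["Subject"] = f"Contact form message from {sender_name}"
    message.set_content(body)
    return message


msg = build_contact_email("Alice", "alice@example.org", "Hello there")
print(msg["Subject"])
```

Actual delivery could go through something like `smtplib.SMTP(...).send_message(msg)`; what matters for this argument is that `RECIPIENT` never appears in anything the client receives.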

  22. Re:Google interprets javascript? Really? by The+Famous+Brett+Wat · · Score: 4, Interesting

    For everyone's information: the page the author links to as the one that has javascript munging also has a noscript tag with the email out in the open. Guess what Google and spammers' email-crawlers really do? ;)

    I've checked your claim, and it's not true. The "noscript" tag contains warning text about Javascript being turned off and an instruction to use a web form instead of email. I've also checked my own Javascript obfuscation, which uses "blah at domain" type descriptive text in the noscript tag, and Google's search results do not de-obfuscate it. This may be due to the fact that my Javascript is loaded from a separate file -- a point raised in TFA.

    Even if Google is rendering some amount of Javascript in this way, it's still a stretch to accuse Google of being the leak. If you correspond with a person who has malware installed on their computer, there's a high risk that your email address will be exposed to spammers via that route. Such malware is hardly uncommon, is it? The obfuscation technique was only ever going to buy a little extra spam-free time in any case.

    --
    proof, n. A demonstration that a conclusion is implied by certain premises and axioms.