Slashdot Mirror


Has Google Broken JavaScript Spam Munging?

Baxil writes "For years now, Javascript munging has been a useful tool to share email addresses on the Web without exposing them to spammers. However, Google is now apparently evaluating Javascript when assembling summary text for web pages' listings, and publishing the un-munged email addresses to the world; and spammers have started to take advantage of this kind service." Anyone else seen this affecting their carefully protected email addresses?

61 of 288 comments

  1. *rolleyes* by Anonymous Coward · · Score: 5, Insightful

    Seriously, queue the obfuscation != security thing. If your email address is carefully protected, it is not displayed on a web page, obfuscated or not.

    1. Re:*rolleyes* by eln · · Score: 2, Funny

      Maybe he's merely advocating that the "obfuscation != security" people should form a line. You shouldn't be so quick to judge.

    2. Re:*rolleyes* by hardburn · · Score: 4, Interesting

      Javascript did a pretty good job at this

      No, it didn't. Google isn't doing anything the spammers couldn't have done themselves with a little bit of Perl.

      --
      Not a typewriter
    3. Re:*rolleyes* by broken_chaos · · Score: 3, Informative

      Spambots don't invest, and never have invested, enough time to include JavaScript parsing. One of the linked articles suggests this is due to the possibility of crashing when trying to interpret badly formed or incorrect JavaScript, but it could also be because simple plaintext parsing (maybe with HTML tags stripped) has been producing enough results so far.

      Most spambots have been proven, in several experiments, to not even parse hex/decimal HTML character entities, so JavaScript parsing was considered to be mostly safe for the moment. It's not like people assume this is a perfect spam-blocking method - just that it's good enough to not get thousands upon thousands of spam, limiting it to a reasonable number.
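
To illustrate the character-entity encoding mentioned above, here is a minimal Python sketch. The address `user@example.com` and the helper name `to_decimal_entities` are made up for illustration. It only shows that a standard HTML library reverses the encoding in one call, so the trick protects solely against harvesters that never decode entities.

```python
import html

def to_decimal_entities(text: str) -> str:
    """Encode every character as a decimal HTML character reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)

address = "user@example.com"  # hypothetical address, not from the thread
encoded = to_decimal_entities(address)

# A browser renders `encoded` as the plain address. A harvester that only
# scans raw HTML for "@" finds nothing, but html.unescape() undoes it.
decoded = html.unescape(encoded)
```

The encoded string contains no literal "@", which is all that a naive plaintext scraper looks for.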

    4. Re:*rolleyes* by RJFerret · · Score: 2, Informative

      Recaptcha has a service specifically for email addresses, no obfuscation needed... Which also has the added benefit of aiding book digitizing!

    5. Re:*rolleyes* by twidarkling · · Score: 2, Funny

      I dunno. Lining up works. After all, there's likely a large number of people who'd say that. You'd hardly want them all running amok.

      --
      Canada: The US's more awesome sibling.
    6. Re:*rolleyes* by Chabil+Ha' · · Score: 2, Insightful

      To add:

      Relying on the expected behavior (Google not processing JS) of something over which you have no control for your security is pretty silly as well.

      --
      We're all hypocrites. We all have hidden parts, it's the contrast between them that make us more a hypocrite than others
    7. Re:*rolleyes* by NewWorldDan · · Score: 3, Interesting

      Yep, the keyword there is most spambots. It just takes one motivated enough to write a parser for javascript for common munging techniques. Or in this case, finding an app out there that does it automagically for them. I would expect that email addresses stored as an image would be less subject to abuse for two reasons: First, it creates a much larger download causing a bottleneck and second, it's much more computationally intensive. Still, it can, of course, be done. After all, it may only be a matter of time until Google or MSN parse it and save the results for the rest of the world.

      What I find works best is to use a web form for submitting messages on our company website. That only gets spammed about once a month, and usually for something almost relevant to what we do. Then again, 2 years ago it never got spammed.

    8. Re:*rolleyes* by interkin3tic · · Score: 2, Insightful

      Do you really think whipping up a perl script is beyond the abilities of somebody who has the ability to run a spamming "business"?

      Maybe you mistook that for a rhetorical question, sorry for that misleading question, it was semi-honest. I really don't know how much effort goes into a spamming business. Never met anyone who identified themselves as a spammer, so I don't know if they're as dumb as they seem. For that matter, I've never written a perl script.

      Just seems to me like if you have a decent head on your shoulders you'd be doing more than the equivalent of aggressively begging for change on the sidewalk.

    9. Re:*rolleyes* by david.given · · Score: 4, Funny

      <pedent>

      This, of course, is the traditional spelling/grammar flame typo. I think it's a law of nature.

    10. Re:*rolleyes* by enoz · · Score: 2, Insightful

      You miss the point.

      The Javascript obfuscation method allows you to make a mailto: url that was accessible to users yet difficult for spammers.

      Sticking your email in an image is probably worse than simply asking users to solve a captcha before giving them your email.

  2. Really.... by Darkness404 · · Score: 4, Insightful

    Really, with the development of better OCR technologies and such comes the elimination of e-mail security by obscurity. If you don't want spam either A) have a decent spam filter (I don't think I've had a single piece of spam pass through Gmail's filter and only one false positive) or B) don't share your e-mail address. Those are the only two ways to prevent spam that will continue to work.

    --
    Taxation is legalized theft, no more, no less.
    1. Re:Really.... by Anonymous Coward · · Score: 2, Informative

      It's TRIVIAL for a spambot to execute code like this sitting in script tags in the "js" binary and dumping the contents, and then grabbing emails with a regex.

      I use the "js" binary to rip porn off sites all the time.

      ~$ js -v
      JavaScript-C 1.7.0 2007-10-03
      usage: js [-PswWxCi] [-b branchlimit] [-c stackchunksize] [-v version] [-f scriptfile] [-e script] [-S maxstacksize] [scriptfile] [scriptarg...]

    2. Re:Really.... by buchner.johannes · · Score: 4, Insightful

      No it is not. If you increase the time used per website, you cannot process that many websites anymore. JS obfuscated emails were protected because spammers didn't make the effort.
      You might say computers got faster, but unfortunately the web didn't get smaller.

      Anyway, I understand the need to post email addresses on a website. How else should people contact you the first time? Personally, I don't like contact forms. Would you advocate for a CAPTCHA or requiring a POST request to obtain the real email address? You could still cry "security by obscurity".

      But you can't take away the option of posting email addresses on websites from users, as it is very useful to contact people by email. Reminds me of people saying "Flash is proprietary, and too fancy for my taste anyway, so nobody must use it. Use Javascript.".

      Maybe one should make swf files with the email in them. Muhahaha

      --
      NB: The message above might reflect my opinion right now, but not necessarily tomorrow or next year.
    3. Re:Really.... by mshieh · · Score: 2, Insightful

      I don't think I've had a single piece of spam pass through Gmail's filter and only one false positive

      You mean you've only noticed one false positive. I'm sure it's been mentioned in half of the comments in this thread, but security by obscurity is effective because there is value in stopping half of the spam, unlike traditional security where having your data stolen and sold once is not a big gain over having it done many times. There are many reasons why obscurity works towards this goal of reduction rather than elimination.

    4. Re:Really.... by DragonWriter · · Score: 2, Interesting

      Personally, I don't like contact forms. Would you advocate for a CAPTCHA or requiring a POST request to obtain the real email address?

      Never happen, but better would be:
      You get the actual e-mail address via a POST request over SSL secured by a valid client certificate from a reputable CA, the client certificate's public key and associated identity information are transferred to the owner of the e-mail address, who requires e-mail to also be digitally signed, and who filters by using a sender address whitelist and validating the signature against the associated key. Senders are added to the whitelist when their key is received (e.g., from the website system, or out-of-band) and presumed good until they send spam or do something else unwelcome, at which point the receiver removes them from the whitelist.

      Accountability, not obscurity.

  3. Re:Mung by eikonoklastes · · Score: 4, Informative
  4. "Google indexes correctly rendered page" by RichardDeVries · · Score: 5, Insightful

    That should be the title. That is, if it were newsworthy. Which it isn't.

    --
    Error 001
    Security Scan and Virus Detection do not work with your operating system.
  5. They should fix this right away by Null+Nihils · · Score: 2, Insightful

    This can easily be fixed, and should be right away. If Google is turning JavaScript into text output, they can easily parse that output (just like the spammers currently are) and see if the text contains an e-mail address. And if it does, they should omit it from search results (unless the address was originally plain text and not obfuscated, in which case they can assume the author wants it searchable).
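
The filtering proposed above could look roughly like the following Python sketch. It is only a guess at the idea, not anything Google is known to do: the function name `redact_emails`, the placeholder text, and the deliberately simple address pattern are all assumptions.

```python
import re

# Deliberately simple pattern; real address syntax is far messier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def redact_emails(rendered_text, raw_source):
    """Mask addresses that appear only after script evaluation."""
    def mask(match):
        address = match.group(0)
        # An address present verbatim in the raw source was published in
        # plain text, so assume the author wants it searchable.
        return address if address in raw_source else "[address omitted]"
    return EMAIL_RE.sub(mask, rendered_text)

# Hypothetical page: one munged address, one plain-text address.
raw = '<script>document.write("bob" + "@" + "example.org")</script> Sales: sales@example.org'
rendered = "bob@example.org Sales: sales@example.org"
snippet = redact_emails(rendered, raw)
```

Here the munged address is masked in the snippet while the plain-text one is left alone, which matches the distinction drawn in the comment.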

  6. Welcome to the club by fataugie · · Score: 5, Funny

    Dear Google:

    Welcome to the "Impossible to do anything right" club.

    Regards,

    Wal-Mart,
    Microsoft,
    G. W. Bush

    --

    WTF? Over?

  7. What else can google do? by Bazman · · Score: 3, Insightful

    So much content on the web these days is spat out by document.write(), I'm not surprised at all that google evaluates certain javascripts in order to get any content to index.

    Ever done a "View Source" on a google mail or google maps page? The web is now javascript.

    1. Re:What else can google do? by hairyfeet · · Score: 3, Insightful

      Well I don't know about him, but I can tell you why I block JavaScript and use Noscript and ABP, and it is because JavaScript is becoming the new ActiveX. You see, ActiveX wasn't really that bad when it was just used by a couple of corporate types for very basic jobs but then along came everybody and their dog and soon the web became a giant ActiveX nightmare.

      Now we are seeing the same thing all over again with JavaScript and Flash. Sites that could have been perfectly fine in plain old HTML become this giant bloated mess that can cause even a good dual core and cable connection to go "WTF? Is the script not responding?" because they have overloaded it with crap. So I will happily block most scripts and keep my bandwidth and my sanity, thanks ever so much. I have found most sites that are the worst offenders rarely have anything worth looking at anyway.

      BTW, Who is writing the code for Slashdot anyway? I've found if I don't block JavaScript on Slashdot it looks like the page was rendered with a shotgun. Just really nasty and hard to read.

      --
      ACs don't waste your time replying, your posts are never seen by me.
  8. It's not google, it's the web developers by Punto · · Score: 5, Insightful

    Nowadays, half of the pages I try to visit don't render at all without javascript. Sometimes the main content is missing (you just get the headline, the links that go on the sides, and the ads), sometimes it's just a blank page. It seems like all these traditional news organizations just _have_ to be "web 2.0" to appear relevant again.

    Google needs to index the page, they don't have much choice.

    --
    Stay tuned for some shock and awe coming right up after this messages!

    1. Re:It's not google, it's the web developers by BlitzTech · · Score: 3, Insightful

      AJAX is a great technology that has vastly improved the usefulness of the web. However, like every other fad, it gets significantly overused in places where it just IS NOT reasonable. I wish more developers would come to the realization that AJAX != 'Web 2.0-ifying your page' and move back to using the right technology for a given problem. AJAX everywhere just reeks of the same kind of software bloat that makes modern computers run slow compared to 5-10 year old equipment.

      When all you have is a hammer...

    2. Re:It's not google, it's the web developers by Todd+Knarr · · Score: 3, Insightful

      Seconded. You don't need Javascript to do a simple hyperlink. You don't need a scrolling text-box to display your page, the browser can scroll the page just fine thankyouveddymuch. You don't need to dynamically replace elements to change content while maintaining a navigation header or sidebar when appropriate (note: appropriate) use of frames will accomplish exactly what you want.

      The two sins of engineering: making it more complicated than it needs to be, and making it simpler than it needs to be. Avoid them.

    3. Re:It's not google, it's the web developers by JCSoRocks · · Score: 2, Informative

      Frames aren't a replacement. There's a reason people dropped frames. Layout limitations, limited scaling, poor bookmarking, broken back button, etc. I, for one, appreciate partial page refreshes - when done correctly. Full page postbacks suck.

      --
      You are using English. Please learn the difference between loose and lose; they're, there, and their; your and you're.
  9. Who CARES? by nweaver · · Score: 4, Interesting

    The spammers WILL get your email address. Be it web trawling, Google searches, or stealing email addresses off of compromised computers, the spammers will get, and then resell, your email address.

    Trying to keep the spammers from getting your email address is a lost cause, and not a battle worth fighting.

    --
    Test your net with Netalyzr
  10. Yes, but . . . by Art3x · · Score: 2, Insightful

    Your email address will almost certainly get out. If not by a spambot then through an unscrupulous merchant.

    That's why spam filtering is better than email hiding. Gmail's spam filter, for example, is very good. I get spam in my Inbox about once a quarter.

    Google's job is to turn human-readable pages into machine-searchable pages. So it will always seek to expand what it can read: images, Flash, JavaScript, etc.

    It's best not to hide in the direction that technology is advancing.

  11. robots.txt by physicsphairy · · Score: 3, Interesting

    I assume if you load your obfuscation code from script.js and put script.js in robots.txt that you will be safe, although that is sort of a pain.
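
As a rough check of the robots.txt idea, the Python standard library can parse a rule file and report what a compliant crawler may fetch. The robots.txt content and the `example.com` URLs below are hypothetical. Note that robots.txt is advisory: it only affects crawlers that choose to honor it, and whether blocking the script keeps the munged address out of search snippets is the commenter's assumption, not something this sketch verifies.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep all crawlers away from the munging script.
ROBOTS_TXT = """\
User-agent: *
Disallow: /script.js
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ask what a rule-abiding crawler would be allowed to request.
script_allowed = parser.can_fetch("Googlebot", "https://example.com/script.js")
page_allowed = parser.can_fetch("Googlebot", "https://example.com/contact.html")
```

With these rules the script is off limits while the contact page itself stays crawlable.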

    What would be nice is if google created a new tag along the lines of rel="nofollow" which would be an in-line way to keep the engine from seeing content.

    1. Re:robots.txt by RajivSLK · · Score: 4, Insightful

      What would be nice is if google created a new tag along the lines of rel="nofollow" which would be an in-line way to keep the engine from seeing content.

      That would be exploited by spammers to the extreme. Imagine clicking on a listing for disney kids fun house only to have a hidden ad for an online Viagra dispensary dominate the page.

    2. Re:robots.txt by Anonymous Coward · · Score: 2, Informative

      On Google appliances, there is actually a googleon / googleoff set of comment tags you can use.

  12. one answer by martas · · Score: 2, Informative
  13. Contact Me Form by Jason+Levine · · Score: 5, Informative

    A better method is to have a Contact Me form that doesn't display your e-mail address anywhere on it. Yes, you'll get spammers filling it out, but you can cut down on those with some simple techniques. For example, make a "Phone Number" field and set the CSS display attribute to none. Normal users won't see this field and won't fill it out. Spam-bots will see it and attempt to fill it out. Then, have your submission script silently fail to send to e-mail if the "Phone Number" is filled out. (If you toss an error, the spammer might figure out the trick.) No method is fool-proof, of course, but this is much better than putting your e-mail address on your webpage and hoping that someone doesn't de-mung it.
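
The hidden-field trick described above can be sketched on the server side as follows. This is a minimal illustration, not a complete form handler: the field name `phone_number`, the function `handle_contact_form`, and the `send_mail` callback are all hypothetical, and the form is assumed to arrive as a plain dictionary.

```python
def handle_contact_form(form: dict, send_mail) -> str:
    """Return the same success page whether or not mail was sent."""
    # "phone_number" is the decoy field hidden with CSS (display: none).
    if form.get("phone_number", "").strip():
        # Probably a bot: drop the message silently so the sender gets
        # no error to learn from.
        return "Thanks, your message was sent."
    send_mail(form.get("name", ""), form.get("message", ""))
    return "Thanks, your message was sent."

# Stand-in for a real mailer: just record what would have been sent.
sent = []
deliver = lambda name, message: sent.append((name, message))

human_reply = handle_contact_form(
    {"name": "Ann", "message": "Hi", "phone_number": ""}, deliver)
bot_reply = handle_contact_form(
    {"name": "x", "message": "buy pills", "phone_number": "555-0100"}, deliver)
```

Both callers see an identical response, but only the submission with the empty decoy field reaches the mailer.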

    --
    My sci-fi novel, Ghost Thief, is now available from Amazon.com.
  14. Re:Mung by TheRealMindChild · · Score: 2, Funny

    Yeah, no kidding. I was wondering where Chowder and Schnitzel were

    --

    "When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
  15. I have a new solution: by Facegarden · · Score: 2, Funny

    In order to prevent SPAMbots once and for all, you should require that everyone interested in contacting you first drive to the next geohash http://www.wiki.xkcd.com/geohashing/Main_Page in the region of your choosing, wearing a lumberjack outfit and carrying a case of jolt cola.

    Then, and only then, does the real quest begin...
    -Taylor

    --
    Worldwide Military budgets: $2100 billion. Worldwide Space Exploration budgets: $38 billion. Really, world? Really?
  16. Pay to email by Viking+Coder · · Score: 5, Interesting

    How about "pay to email"?

    I register with a pay-to-email site, and give it my actual email address. It gives me my new publicly visible email address. Anyone who wants to can send me an email through this service if they pay me an amount of money that I set. After I receive the email, I can refund the sender. The pay-to-email site takes a 10% cut on all un-refunded emails.

    Sound like a winner?
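
The money flow in this proposal can be written down in a few lines of Python. This is only a sketch of the arithmetic as described: the function name `settle` and the dictionary layout are invented here, and it assumes the 10% cut comes out of the recipient's share of an un-refunded payment.

```python
from decimal import Decimal

SITE_CUT = Decimal("0.10")  # the 10% cut on un-refunded emails

def settle(stamp_price: Decimal, refunded: bool) -> dict:
    """Split one stamp payment between sender, recipient, and site."""
    if refunded:
        # The recipient accepted the mail as legitimate: sender gets it all back.
        return {"sender": stamp_price, "recipient": Decimal("0"), "site": Decimal("0")}
    fee = stamp_price * SITE_CUT
    return {"sender": Decimal("0"), "recipient": stamp_price - fee, "site": fee}

kept = settle(Decimal("1.00"), refunded=False)
returned = settle(Decimal("1.00"), refunded=True)
```

On a kept $1.00 stamp the recipient ends up with $0.90 and the site with $0.10; a refunded stamp costs the sender nothing.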

    --
    Education is the silver bullet.
    1. Re:Pay to email by Kozz · · Score: 4, Funny

      How about "pay to email"?

      I register with a pay-to-email site, and give it my actual email address. It gives me my new publicly visible email address. Anyone who wants to can send me an email through this service if they pay me an amount of money that I set. After I receive the email, I can refund the sender. The pay-to-email site takes a 10% cut on all un-refunded emails.

      Sound like a winner?

      My... GOD... that's genius! Your plan clearly has no flaws. We should implement it right now.

      OK, honestly, I was just too lazy to fill out the ubiquitous rejection form.

      --
      I only post comments when someone on the internet is wrong.
    2. Re:Pay to email by Anonymous Coward · · Score: 2, Funny

      Well, here you go:
      ---
      Your post advocates a

      ( ) technical ( ) legislative (*) market-based ( ) vigilante

      approach to fighting spam. Your idea will not work. Here is why it won't work. (One or more of the following may apply to your particular idea, and it may have other flaws which used to vary from state to state before a bad federal law was passed.)

      ( ) Spammers can easily use it to harvest email addresses
      (*) Mailing lists and other legitimate email uses would be affected
      ( ) No one will be able to find the guy or collect the money
      ( ) It is defenseless against brute force attacks
      ( ) It will stop spam for two weeks and then we'll be stuck with it
      (*) Users of email will not put up with it
      ( ) Microsoft will not put up with it
      ( ) The police will not put up with it
      ( ) Requires too much cooperation from spammers
      ( ) Requires immediate total cooperation from everybody at once
      (*) Many email users cannot afford to lose business or alienate potential employers
      ( ) Spammers don't care about invalid addresses in their lists
      ( ) Anyone could anonymously destroy anyone else's career or business

      Specifically, your plan fails to account for

      ( ) Laws expressly prohibiting it
      ( ) Lack of centrally controlling authority for email
      ( ) Open relays in foreign countries
      ( ) Ease of searching tiny alphanumeric address space of all email addresses
      ( ) Asshats
      ( ) Jurisdictional problems
      ( ) Unpopularity of weird new taxes
      ( ) Public reluctance to accept weird new forms of money
      ( ) Huge existing software investment in SMTP
      ( ) Susceptibility of protocols other than SMTP to attack
      ( ) Willingness of users to install OS patches received by email
      ( ) Armies of worm riddled broadband-connected Windows boxes
      ( ) Eternal arms race involved in all filtering approaches
      ( ) Extreme profitability of spam
      ( ) Joe jobs and/or identity theft
      ( ) Technically illiterate politicians
      ( ) Extreme stupidity on the part of people who do business with spammers
      ( ) Dishonesty on the part of spammers themselves
      ( ) Bandwidth costs that are unaffected by client filtering
      ( ) Outlook

      and the following philosophical objections may also apply:

      (*) Ideas similar to yours are easy to come up with, yet none have ever been shown practical
      ( ) Any scheme based on opt-out is unacceptable
      ( ) SMTP headers should not be the subject of legislation
      ( ) Blacklists suck
      ( ) Whitelists suck
      ( ) We should be able to talk about Viagra without being censored
      ( ) Countermeasures should not involve wire fraud or credit card fraud
      ( ) Countermeasures should not involve sabotage of public networks
      ( ) Countermeasures must work if phased in gradually
      (*) Sending email should be free
      ( ) Why should we have to trust you and your servers?
      ( ) Incompatibility with open source or open source licenses
      ( ) Feel-good measures do nothing to solve the problem
      ( ) Temporary/one-time email addresses are cumbersome
      ( ) I don't want the government reading my email
      ( ) Killing them that way is not slow and painful enough

      Furthermore, this is what I think about you:

      (*) Sorry dude, but I don't think it would work.
      ( ) This is a stupid idea, and you're a stupid person for suggesting it.
      ( ) Nice try, assh0le! I'm going to find out where you live and burn your house down!

    3. Re:Pay to email by Viking+Coder · · Score: 2, Interesting

      Thanks for the sarcasm. I'll try to not stoop down as I respond to you:

      (*) Mailing lists and other legitimate email uses would be affected

      No they wouldn't. You can set up a whitelist.

      (*) Users of email will not put up with it

      If you have my private email account, you use it. I'm offering up an idea of a service that someone can use to mask their email address. If you really want to contact someone, you can send them a no-stamp email, and hope they happen to see it. This is no better and no worse than today. If you want them to see it, you affix a stamp. The receiver could easily let you know what their threshold level is. If you don't want to pay that much, then don't.

      (*) Many email users cannot afford to lose business or alienate potential employers

      Many email users will not use the idea. Okay. Some will. If you want to be employed by someone, or do business with them, give them your direct email address.

      (*) Ideas similar to yours are easy to come up with, yet none have ever been shown practical

      If you desire spam-free email, point out the actual problems with my system. If you don't care to point out the actual problems, then don't.

      (*) Sending email should be free

      Receiving email should be spam-free. Also, sending email is free under my idea, as long as the people who receive it agree that it wasn't spam. Yes, there's a "deposit" which is held, but it should be good for as long as you don't spam people.

      (*) Sorry dude, but I don't think it would work.

      That's legitimate. I have Skype credit right now for the simple purpose of making phone calls. I have a recurring credit card debit set up from Amazon to pay for my AS3 (JungleDisk) access. I pay my ISP, and I suspect you pay yours, too. I pay for every text I send from my phone; you might pay a monthly fee to have "unlimited" texts. Returning a book from a library after the due date has a nominal fee.

      If I want to send "larry (at) somesite (dot) com" an email, but Larry is as sick of getting spam as I am, and if we agree to trade the same reusable stamp with a group of like-minded individuals, would you seriously be completely unwilling to drop $1 onto a website to join the club?

      I remember way back when the signal to noise ratio of email was THOUSANDS of times higher than it is now. I'd be willing to drop a $1 deposit to get back into those kinds of numbers.

      --
      Education is the silver bullet.
    4. Re:Pay to email by Viking+Coder · · Score: 2, Interesting

      "Actually, No. It's designed to be open in this manner."

      Actually, email is a content delivery system. It's up to the participants to decide the content. A stamp is perfectly valid content.

      "require a specific definition to 'SPAM' that all agree on."

      No, each person decides what spam is. I thought that was pretty obvious from what I was saying, sorry.

      You do something publicly on the internet, and leave your stamp-required email address. I want to get in touch with you, so I send you an email with a stamp. If you decide, for whatever reason, to keep my stamp, I just have to accept that. The stamp was a nominal charge in the first place. Chances are, someone will send me an email I don't particularly want to receive, and I can keep their stamp to offset your action. Perhaps it will be considered rude to not return stamps. Perhaps it will be considered gracious to INSIST that recipients keep your stamps so they can donate them to their preferred charities, or use them themselves. Would you donate some email stamps to the homeless, so they can be more effective in emailing potential employers, or health care providers, or state representatives? ...just a thought.

      Physical mail has more impact than email, when you write to your senator. Perhaps stamped email will carry a tad more weight. "Oh, geez - this is a $20 stamp, and it's even marked for-charity-only." (The recipient CAN'T return it, and CAN'T use it themselves...?)

      "However, what if someone you haven't talked to in a while just sends an email out of the blue? is that spam? I know someone who considers that spam."

      Then people will either not mind buying stamps to email that person, or they will. If that person ever wants to send emails back, the original senders you described should keep their stamp as payback.

      "What about things that are not legally considered spam?"

      "Legally" has nothing to do with it. It's a reusable stamp. Apply it to any purpose you want to.

      "Or, you could get a Google account."

      I've already got one, but I'm not quite cavalier enough to post my gmail address all over creation. Are you? Does it really work well enough on your spam?

      --
      Education is the silver bullet.
  17. Re:Mung by Anonymous Coward · · Score: 5, Funny

    >The wikipedia page also links to munge - modify until not guessed easily -
    > which I guess is what the original person intended

    Then the original poster is a chimp and so are you. If you aren't aware that adding ~e may change the meaning of a word, I should come round and rap your ears.

  18. One might say Google "Fixed" it by dmomo · · Score: 3, Interesting

    It's a hack. When moving technology forward, you need to pick your battles when asking "should we not improve this service? It will break the hacks"?

    All in all, you are displaying text on a page. Google's job is to take text that humans can read and make it text that humans can find.

    I agree, spam is a problem, but this kind of obfuscation will only get you so far. It's the same argument that can be said about MP3s. If you can hear it, we can steal it. Same as "if you can see it."

    Spam stinks, but in the end, even with these tricks, you are making your address public. Public information will be harvested by mortals and robots alike.

  19. I don't think they got the email from Google by bheer · · Score: 2, Insightful

    I don't think the spammers got his email address from Google. I mean, to do that they'd have to send a fairly narrow query to Google -- something like 'chibi jesus' -- and then scrape the results ... just scraping the cached page wouldn't help -- that contains JS, not the email address. Plus, I imagine Google would notice if a bot started sending lots of search queries its way.

    It's far more likely that spammer bots are now actively processing JS. As others on this thread have pointed out, it ain't hard to do.

  20. Re:Mung by digitalsolo · · Score: 3, Funny

    Then the original poster is a chimp and so are you. If you aren't aware that adding ~e may change the meaning of a word, I should come round and rap your ears.

    Then the original poster is a chimp and so are you. If you aren't aware that adding ~e may change the meaning of a word, I should come round and rape your ears.

    You're right, just one 'e' and the whole thing changes.

    --
    Just another ignorant American.
  21. Re:Mung by twidarkling · · Score: 2, Funny

    Yoeu're reight, juest onee 'e'e ande thee whoele tehing chaenges.

    --
    Canada: The US's more awesome sibling.
  22. Much ado about nothing by Asmor · · Score: 4, Insightful

    I publicly list my email whenever I need to. If I want someone to email me something, I say, "Send it to itoltz@gmail.com". In fact, if HTML is allowed wherever I'm writing that, I'll even be so kind as to make it a mailto link (i.e. <a href='mailto:itoltz@gmail.com'>itoltz@gmail.com</a>).

    And you know what? I almost never get spam in my inbox. I'd say a piece squeaks through Gmail's filters every few months (though when it does, I usually seem to get 2-3 similar spams over the course of a day or two).

    Granted, not everyone has the option of using Gmail, and of those who do, not everyone is comfortable with the idea of using it. That's fine. But the point is, if Gmail is that good at filtering out spam, anyone else can be too.

    1. Re:Much ado about nothing by hplus · · Score: 2, Interesting

      Given the immense quantity of mail that Google processes, they are in a uniquely effective position to classify mail as spam based on heuristics and other techniques that are similar to the sorting that they do for page-rankings. I'm not saying that other entities could not necessarily do what Google does, just that Google has a nice head start.

    2. Re:Much ado about nothing by Phroggy · · Score: 2, Informative

      Something that most people don't understand is that spam is NOT universal. Every e-mail address is unique, and will get a different assortment of spam. Some of the users on my mail server get spam that I don't get, and I get spam that they don't get.

      In particular, a new e-mail address will never get spam, unless:

      1. A spammer randomly guesses the address, using a dictionary attack
      2. The address is posted on a web site, and scraped by a spammer
      3. The address is submitted to a company or organization which posts it on their site
      4. Malware extracts the address from somebody's address book
      5. Somebody hacks into a company or organization that has the address and takes it from their database
      6. Some sleazy company sells it

      That's pretty much it. #1 is only likely if your username is common (like just your first name). #3 isn't a common problem anymore, since most sites either don't post their users' e-mail addresses, or they obfuscate them (like Slashdot does). #5 isn't a common problem either. I've only gotten burned by #6 a few times.

      --
      $x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
      $x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;
  23. Re:Mung by larry+bagina · · Score: 3, Funny

    And if you double it:

    I should come round and rape your arse

    --
    Do you even lift?

    These aren't the 'roids you're looking for.

  24. Some robots are more equal than others by Cajun+Hell · · Score: 5, Insightful

    For example, make a "Phone Number" field and set the CSS display attribute to none. Normal users won't see this field and won't fill it out. Spam-bots will see it and attempt to fill it out.

    This only works for as long as spammers don't care about it. I think anyone who can figure out the HTML resulting from javascript can also figure out the style of an element.

    What's really funny about this problem is that we used to talk about using captchas to tell the robots apart from the meatbags, so that you could discriminate against robots. But now people want the robots to make sense of their page (so that they get referrals from Google) but they don't want the robots to make sense of their page (so that their email box doesn't get referrals from spambot). You're on the web or you're not. Choose.
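    For reference, the hidden-field trap quoted at the top of this comment comes down to one server-side check. This is only a minimal sketch: the field name `phone_number` and the helper `looks_like_bot` are made up for illustration, and it assumes the form data has already been parsed into a dict. As noted above, it stops working the moment a bot pays attention to CSS.

```python
# Hypothetical honeypot field: present in the HTML but hidden with
# "display: none", so a human visitor never sees or fills it in.
HONEYPOT_FIELD = "phone_number"


def looks_like_bot(form: dict) -> bool:
    """Return True if the hidden honeypot field came back non-empty."""
    value = form.get(HONEYPOT_FIELD, "")
    # Whitespace-only values are treated as empty.
    return value.strip() != ""
```

    A submission for which `looks_like_bot` returns True would simply be dropped instead of being posted or mailed.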

    --
    "Believe me!" -- Donald Trump
  25. Google interprets javascript? Really? by eugene2k · · Score: 2, Interesting

    For everyone's information: the page the author links to as the one that has javascript munging also has a noscript tag with the email out in the open. Guess what Google and spammers' email-crawlers really do? ;)

    --
    Apple has "Mac vs PC", Microsoft has "Laptop Hunters", Linux has recession
    1. Re:Google interprets javascript? Really? by The+Famous+Brett+Wat · · Score: 4, Interesting

      For everyone's information: the page the author links to as the one that has javascript munging also has a noscript tag with the email out in the open. Guess what Google and spammers' email-crawlers really do? ;)

      I've checked your claim, and it's not true. The "noscript" tag contains warning text about Javascript being turned off and an instruction to use a web form instead of email. I've also checked my own Javascript obfuscation, which uses "blah at domain" type descriptive text in the noscript tag, and Google's search results do not de-obfuscate it. This may be due to the fact that my Javascript is loaded from a separate file -- a point raised in TFA.

      Even if Google is rendering some amount of Javascript in this way, it's still a stretch to accuse Google of being the leak. If you correspond with a person who has malware installed on their computer, there's a high risk that your email address will be exposed to spammers via that route. Such malware is hardly uncommon, is it? The obfuscation technique was only ever going to buy a little extra spam-free time in any case.

      --
      proof, n. A demonstration that a conclusion is implied by certain premises and axioms.
  26. Let's Get to Work by tomsomething · · Score: 3, Interesting

    Yay, Google. Judging by the responses I've seen so far, it seems most of us think this is a step forward for the search engine. That said, why don't we use this story as an opportunity to have a productive conversation about e-mail address security in a world where JavaScript's effectiveness is dwindling? Here's one from A List Apart that uses some fancy mod_rewrite stuff. http://www.alistapart.com/articles/gracefulemailobfuscation/ I know we've got a lot of geniuses and experts in here. Don't be modest! Show off how smart you are! And yes, the next brilliant security measure will someday be pummeled by a robot that some spammer puts together, but hell if that ain't just exciting! We're helping people build better, "smarter" robots, and criminals are some of society's greatest innovators.

    --
    Welcome to Slashdot. Replace this text with your desired signature before replying to a story.
  27. Re:Mung by Anonymous Coward · · Score: 2, Insightful

    I believe a 'WHOOSH' is in order.

  28. Re:Mung by Midnight+Thunder · · Score: 2, Insightful

    Actually proper English indicates that you double the consonant when adding 'ing' if the word ends with one, or drop the 'e' if it ends with one:
        hop -> hopping
        hope -> hoping

    so:
        munge -> munging
        mung -> mungging

    --
    Jumpstart the tartan drive.
  29. Re:Mung by collinstocks · · Score: 2, Informative

    From Jargon File (4.4.4, 14 Aug 2003) [jargon]:

        mung /muhng/, vt.

              [in 1960 at MIT, "Mash Until No Good"; sometime after that the
              derivation from the {recursive acronym} "Mung Until No Good" became
              standard; but see {munge}]

              1. To make changes to a file, esp. large-scale and irrevocable
              changes. See {BLT}.

              2. To destroy, usually accidentally, occasionally maliciously. The
              system only mungs things maliciously; this is a consequence of
              {Finagle's Law}. See {scribble}, {mangle}, {trash}, {nuke}. Reports
              from {Usenet} suggest that the pronunciation /muhnj/ is now usual in
              speech, but the spelling `mung' is still common in program comments
              (compare the widespread confusion over the proper spelling of
              {kluge}).

              3. In the wake of the {spam} epidemics of the 1990s, mung is now
              commonly used to describe the act of modifying an email address in a
              sig block in a way that human beings can readily reverse but that will
              fool an {address harvester}. Example: johnNOSPAMsmith@isp.net.

              4. The kind of beans the sprouts of which are used in Chinese food.
              (That's their real name! Mung beans! Really!)

              Like many early hacker terms, this one seems to have originated at
              {TMRC}; it was already in use there in 1958. Peter Samson (compiler of
              the original TMRC lexicon) thinks it may originally have been
              onomatopoeic for the sound of a relay spring (contact) being twanged.
              However, it is known that during the World Wars, `mung' was U.S. army
              slang for the ersatz creamed chipped beef better known as `SOS', and
              it seems quite likely that the word in fact goes back to Scots-dialect
              {munge}.

              Charles Mackay's 1874 book Lost Beauties of the English Language
              defined "mung" as follows: "Preterite of ming, to ming or mingle; when
              the substantive meaning of mingled food of bread, potatoes, etc.
              thrown to poultry. In America, `mung news' is a common expression
              applied to false news, but probably having its derivation from mingled
              (or mung) news, in which the true and the false are so mixed up
              together that it is impossible to distinguish one from another."

    See the third definition.

  30. Re:Mung by PearsSoap · · Score: 3, Funny

    The email address is not munged, or you couldn't un-mung it.

    You munged it; you can't un-mung it!

    Stay tuned for more... Tales! Of! Internet!

  31. Re:Mung by RivieraKid · · Score: 2, Funny

    I knew it! I'm surrounded by assholes!

    --
    "Necessity is the plea for every infringement of human freedom. It is the argument of tyrants; it is the creed of slaves
  32. Re:Mung by Anonymous Coward · · Score: 5, Informative

    Nice try, but that rule only applies to "[^ng]g$" words.

    beg + ing = begging
    dig + ing = digging
    hog + ing = hogging
    rag + ing = ragging
    tug + ing = tugging

    but it doesn't apply to "[n]g$", because the n modifies the sound of the g, and gg$ is uncommon enough that it's an exception in itself.

    bang + ing = banging
    bring + ing = bringing
    (egg + ing = egging)
    hang + ing = hanging
    long + ing = longing
    ping + ing = pinging
    sing + ing = singing

    Unfortunately we don't have many examples of "ung$" because most of the words of that form are either nouns (e.g. dung, lung, young) or past participles (e.g. clung, hung, sung), so their present participles are generally formed from the present tense "ing$" form of the word (e.g. cling/clung/clinging, hang/hung/hanging, sing/sung/singing).

    Note that we do have plenty of examples of "unge$" forming "unging$":

    expunge + ing = expunging
    lounge + ing = lounging
    lunge + ing = lunging
    plunge + ing = plunging
    scrounge + ing = scrounging

    So that's plenty of reason to believe that the rule is "unge + ing = unging", despite the fact that "inge + ing" can be either "inging" or "ingeing" depending on the word (and in some cases both are valid):

    binge + ing = binging or bingeing (both are valid; look it up)
    cringe + ing = cringing
    impinge + ing = impinging
    singe + ing = singeing
    twinge + ing = twinging or twingeing (both are valid)

    Therefore I strongly contend that:

    mung + ing = munging
    munge + ing = munging or mungeing (both are valid)

    You may dispute the claim above, but there's no disputing:

    mung + ed = munged
    munge + ed = munged

    :)
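    The rules argued above can be written down as a rough sketch. The function name `add_ing` is invented here, and this covers only the "g" cases discussed in this comment: it always drops a final "e", so it yields "singing" rather than "singeing" for "singe", and it does not attempt the general consonant doubling (hop -> hopping) from the parent post.

```python
import re


def add_ing(verb: str) -> str:
    """Form the -ing spelling using only the rules from this comment."""
    # "unge$" and similar: drop the final e (lunge -> lunging).
    if verb.endswith("e"):
        return verb[:-1] + "ing"
    # "[^ng]g$": double the g (beg -> begging).
    if re.search(r"[^ng]g$", verb):
        return verb + "ging"
    # "ng$", "gg$" and everything else: just append (bang -> banging).
    return verb + "ing"
```

    Under these rules both "mung" and "munge" come out as "munging", which is the claim being made.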

  33. Re:Mung by SausageOfDoom · · Score: 3, Insightful

    It has been happening for quite some time.

    I have always said that the only way to keep your e-mail address safe from spammers is to not give it out at all. Although Google may be doing it now, it's been perfectly possible for as long as computing power has been available cheaply to the spammers (i.e. botnets).

    About 4 years ago I conducted an experiment with anti-spam techniques for the comments on my blog. One of the things I tried was a bit of javascript which added a validation field to the form. The spammers kept on as if it wasn't there, which meant they had to be evaluating javascript.

    And the thing is, once your obfuscation measures are broken by the spammers, because of places like archive.org the internet never forgets - so you can't claw it back. You can update your obfuscation code on your site, but there's nothing stopping the spammers from simply trawling the archives and mirrors to find it there.

    The only way to protect your e-mail address is to never send it client-side - always put it behind a form and a server-side mailing script.
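    As a rough illustration of the form-plus-server-side-script approach, the sketch below builds the outgoing message entirely on the server, so the owner's address never appears in anything sent to the browser. The address `owner@example.com` and the function `build_contact_message` are placeholders, and it assumes the form fields have already been extracted from the request.

```python
from email.message import EmailMessage

# Placeholder address: it exists only in server-side code or config
# and is never written into the page the visitor receives.
OWNER_ADDRESS = "owner@example.com"


def build_contact_message(sender_name: str, reply_to: str, body: str) -> EmailMessage:
    """Turn submitted form fields into a message addressed to the owner."""
    # Refuse line breaks in header fields so a visitor cannot inject
    # extra headers (for example a Bcc list) through the form.
    for value in (sender_name, reply_to):
        if "\r" in value or "\n" in value:
            raise ValueError("line breaks are not allowed in header fields")

    msg = EmailMessage()
    msg["To"] = OWNER_ADDRESS
    msg["From"] = OWNER_ADDRESS
    msg["Reply-To"] = reply_to
    msg["Subject"] = f"Contact form message from {sender_name}"
    msg.set_content(body)
    return msg
```

    The resulting message could then be handed to something like `smtplib.SMTP(...).send_message(msg)`; the mail server details are left out because they depend on the host.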

  34. Address-munging ceased being useful years ago by Arrogant-Bastard · · Score: 2, Interesting

    Spammers have many methods of acquiring addresses, including but not limited to:
    • subscribing to mailing lists
    • acquiring Usenet news feeds
    • querying mail servers
    • acquiring corporate directories (sometimes from their web sites)
    • insecure LDAP servers
    • insecure AD servers
    • use of backscatter/outscatter
    • use of auto-responders
    • use of mailing list mechanisms
    • use of abusive "callback" mechanisms
    • dictionary attacks
    • purchase of addresses in bulk on the open market
    • purchase of addresses from vendors, web sites, etc.
    • purchase of addresses from registrars, ISPs, web hosts, etc.
    • domain registration (some registrars are spammers)
    • AND harvesting of the mail, address books and any other files present on any of the hundreds of millions of compromised Windows systems.

    There's thus no point whatsoever in any form of address obfuscation or munging: it's a complete waste of time indulged in only by the clueless, delusional few who haven't been paying attention to what's gone on during the past decade. What's truly ironic is how many of these people are actually running Windows and thus stand a reasonably good chance of having their own system be the point at which their address(es) are harvested.

    A far better point to critique Google on would be their pointless munging of addresses in Usenet news articles -- spammers have had their own Usenet feeds for MANY years and all Google's done is make the archives less useful for everyone else.