Slashdot Mirror


Gmail Now Rejects Emails With Misleading Combinations of Unicode Characters

An anonymous reader writes: Google today announced a new effort to thwart spammers and scammers: it is adopting the Unicode Consortium's open "Highly Restricted" specification. In short, Gmail now rejects emails from domains that use what the Unicode community has identified as potentially misleading combinations of letters. Today's news follows Google's announcement last week that Gmail has gained support for accented and non-Latin characters. The company is clearly okay with international domains, as long as they aren't abused to trick its users.

10 of 79 comments

  1. FÜÇK ÿèàh by Anonymous Coward · · Score: 5, Funny

    ...

  2. Good that this applies to from: and not the body by CRCulver · · Score: 4, Interesting

    ...of the e-mail. Any attempt to block spam or phishing on the basis of mixing character sets would have to confront the fact that some people do need to mix character sets. Typical representations of Mari in the Latin alphabet, for example, also make use of the Greek letters beta and eta. In fact, eta is used in Latin representations of several minority languages of Russia. And the Reddit crowd loves making weird smilies in their English-language writing by means of symbols drawn from Indian scripts.

  3. Re:Good that this applies to from: and not the bod by Russ1642 · · Score: 3, Funny

    If this spells death to those ridiculous smilies then it's ok with me.

  4. Re:Good that this applies to from: and not the bod by mi · · Score: 3, Interesting

    I routinely substitute Cyrillic letters for Latin on Disqus and other forums to get around their filters (which block far more than mere "profanity").

    Slashdot does not allow non-ASCII characters — although it does not attempt to screen out profanity either.

    --
    In Soviet Washington the swamp drains you.
  5. Sounds bad by Anonymous Coward · · Score: 2, Insightful

    If I start a business with a unicode domain, and if later a scammer registers an ascii domain that is similar looking, then Gmail will blackhole my business, not the scammer, because I'm the one using unicode.

  6. whack-a-mole 3.0 by Anonymous Coward · · Score: 2, Insightful

    And the latest round of whack-a-mole begins...

  7. Why are we still blocking spam ? by Anonymous Coward · · Score: 3, Interesting

    90% of the population would be better off with a white listed email account, i.e. if you are not on their list the email does not get through. END OF STORY.

    It would seem to be more efficient to filter mail IN than to filter it out. Most people would have 20 or so people they actually want mail from.
    I have mail accounts strictly for family, and my local email rules enforce this.
    I have mail accounts for "sign up" sessions for competitions that I know are going to get spammed to hell.
    I have a mail account for work, another for my business, etc., all with differing contacts.

    White listing would pretty much kill off spam: if there is zero chance of it getting through, what is the point? Currently spammers get through because of outdated spam lists, new tricks to get around Bayesian filters, etc. White lists would negate the need.

    Google, if you set up a white listed email system, my friends and family will happily sign up.
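
    The "filter mail IN" idea above could be sketched roughly as follows. This is only an illustration: the function name sender_allowed and the example addresses are made up for this sketch, and it trusts the From: header as-is, which a real setup would not do since that header can be forged.

```python
from email.utils import parseaddr

# Hypothetical allowlist; the addresses are placeholders, not real contacts.
ALLOWLIST = {"mom@example.com", "brother@example.org"}

def sender_allowed(from_header, allowlist=ALLOWLIST):
    """Return True only if the address in a From: header is on the allowlist."""
    # parseaddr splits 'Name <addr>' into (display name, address).
    _, address = parseaddr(from_header)
    # Compare case-insensitively; an empty or unparsable header is rejected.
    return address.lower() in allowlist
```

    Anything for which sender_allowed returns False would simply be dropped or parked in a review folder, so there is no spam list or Bayesian filter to keep up to date.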

  8. Re:all of them then? by TheGavster · · Score: 2

    The "restrictive profile" that Google is using for the filtering flags any combination of the Latin script with another script or scripts, with the exception of very specific combinations (selected legitimate combinations of Asian scripts that contain radically different letter forms and thus are unlikely to cause confusion).

    --
    "Because Science" is one step from "Because old book". Try "Because of my experiment testing my falsifiable assertion".
  9. Re:Good that this applies to from: and not the bod by Ichijo · · Score: 2

    Slashdot does not allow non-ASCII characters...

    ...unless they're in code page 1252.

    --
    Any sufficiently unpopular but cohesive argument is indistinguishable from trolling.
  10. Sounds rather ethnocentric by Chrisq · · Score: 2

    It allows combinations of Latin + Han + Hiragana + Katakana; Latin + Han + Bopomofo; or Latin + Han + Hangul.

    There are a lot of equally safe combinations - what about Latin + Devanagari + Tamil? There would be no look-alike characters and it would allow a lot of people to put their name in multiple scripts that are likely to be meaningful to certain audiences (e.g. someone from Tamil Nadu sending an email to people throughout India and internationally). I'm sure that there are many other combinations that wouldn't have "look alike" issues but which would be useful.
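
    For anyone who wants to play with the three combinations listed above, here is a rough sketch of the check. It is not Gmail's implementation. Python's standard library has no Unicode script property, so this guesses the script from the first word of each character's name, which is an assumption that works for common letters but not for every code point. The names rough_script, scripts_in and passes are invented for this example.

```python
import unicodedata

# The multi-script combinations listed in the comment above.
ALLOWED_COMBINATIONS = [
    {"LATIN", "HAN", "HIRAGANA", "KATAKANA"},
    {"LATIN", "HAN", "BOPOMOFO"},
    {"LATIN", "HAN", "HANGUL"},
]

def rough_script(ch):
    """Approximate a character's script from its Unicode name (heuristic)."""
    name = unicodedata.name(ch, "")
    if not name:
        return "UNKNOWN"
    first = name.split()[0]
    # Han ideographs are named 'CJK UNIFIED IDEOGRAPH-xxxx'.
    return "HAN" if first == "CJK" else first

def scripts_in(label):
    """Collect the scripts of the letters in a label, ignoring digits and punctuation."""
    return {rough_script(ch) for ch in label if ch.isalpha()}

def passes(label):
    """True if the label is single-script or fits one of the allowed combinations."""
    scripts = scripts_in(label)
    if len(scripts) <= 1:
        return True
    return any(scripts <= allowed for allowed in ALLOWED_COMBINATIONS)
```

    With this sketch, a label that swaps a Cyrillic letter into an otherwise Latin name fails, Latin + Han passes, and Latin + Devanagari + Tamil fails, which is exactly the case being complained about here.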