Gmail Now Rejects Emails With Misleading Combinations of Unicode Characters
An anonymous reader writes: Google today announced it is implementing a new effort to thwart spammers and scammers: the open standard known as the Unicode Consortium's "Highly Restricted" specification. In short, Gmail now rejects emails from domains that use what the Unicode community has identified as potentially misleading combinations of letters. Today's news follows Google's announcement last week that Gmail has gained support for accented and non-Latin characters. The company is clearly okay with international domains, as long as they aren't abused to trick its users.
...
...of the e-mail. Any attempt to block spam or phishing on the basis of mixing character sets would have to confront the fact that some people do need to mix character sets. Typical representations of Mari in the Latin alphabet, for example, also make use of the Greek letters beta and eta. In fact, eta is used in Latin representations of several minority languages of Russia. And the Reddit crowd loves making weird smilies in their English-language writing by means of symbols drawn from Indian scripts.
If this spells death to those ridiculous smilies then it's ok with me.
I routinely substitute Cyrillic letters for Latin on Disqus and other forums to get around their filters (which block for more than mere "profanity").
Slashdot does not allow non-ASCII characters — although it does not attempt to screen out profanity either.
In Soviet Washington the swamp drains you.
If I start a business with a Unicode domain, and a scammer later registers a similar-looking ASCII domain, then Gmail will blackhole my business, not the scammer, because I'm the one using Unicode.
And the latest round of whack-a-mole begins...
90% of the population would be better off with a white listed email account, i.e. if you are not on their list the email does not get through. END OF STORY.
It would seem to be more efficient to filter mail IN than to filter it out. Most people would have 20 or so people they actually want mail from.
I have mail accounts strictly for family, and my local email rules enforce this.
I have mail accounts for "sign up" sessions for competitions that I know are going to get spammed to hell
I have a mail account for work, another for my business, etc. etc., all with differing contacts.
White listing would pretty much kill off spam: if there is zero chance of it getting through, what is the point? Currently spammers get through because of outdated spam lists, new tricks to get around Bayesian filters, etc. etc. White lists would negate the need for those filters.
Google, if you set up a white listed email system, my friends and family will happily sign up.
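For anyone wondering what the "filter mail IN" rule described above looks like in practice, here is a minimal sketch in Python. The function name is_whitelisted and the example.org addresses are made up for illustration; only email.utils.parseaddr comes from the standard library. It checks nothing but the From header, which can be forged, so treat it as an illustration of the idea rather than a complete defence.

```python
from email.utils import parseaddr


def is_whitelisted(from_header, whitelist):
    """Return True if the address in a From header is on the whitelist.

    `whitelist` is any iterable of addresses; the comparison ignores case.
    """
    # parseaddr splits 'Name <addr>' into (display name, address).
    _display_name, address = parseaddr(from_header)
    allowed = {entry.lower() for entry in whitelist}
    return address.lower() in allowed


# Hypothetical per-account list, e.g. the "family only" account.
family = {"mum@example.org", "brother@example.org"}

print(is_whitelisted("Mum <Mum@Example.org>", family))
print(is_whitelisted("Totally Legit <winner@example.net>", family))
```

Each account (family, work, competitions) would carry its own list, which is how the separate-accounts setup above keeps contacts apart.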
The "Highly Restricted" profile that Google is using for the filtering, as defined by Unicode, rejects any combination of the Latin character set with another set or sets, with the exception of very specific combinations (selected legitimate combinations of Asian sets that contain radically different letter forms and thus are unlikely to cause confusion).
"Because Science" is one step from "Because old book". Try "Because of my experiment testing my falsifiable assertion".
...unless they're in code page 1252.
Any sufficiently unpopular but cohesive argument is indistinguishable from trolling.
It allows combinations of Latin + Han + Hiragana + Katakana; Latin + Han + Bopomofo; or Latin + Han + Hangul.
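A rough sketch of that rule in Python, for the curious. The standard library has no script property, so this guesses the script from the first word of each character's Unicode name, which is a heuristic and not how Unicode defines the check. The names ALLOWED_SETS, scripts_in and is_allowed are made up here, and the allowed sets are just the three combinations listed above. Only letters are inspected; digits, hyphens and combining marks are skipped.

```python
import unicodedata

# The three mixed-script combinations listed in the comment above.
ALLOWED_SETS = [
    {"LATIN", "HAN", "HIRAGANA", "KATAKANA"},
    {"LATIN", "HAN", "BOPOMOFO"},
    {"LATIN", "HAN", "HANGUL"},
]


def scripts_in(label):
    """Approximate the set of scripts used by the letters in `label`."""
    scripts = set()
    for ch in label:
        if not ch.isalpha():
            continue  # ignore digits, hyphens, dots, combining marks
        name = unicodedata.name(ch, "")
        first = name.split(" ")[0] if name else "UNKNOWN"
        # Han ideographs are named "CJK UNIFIED IDEOGRAPH-XXXX".
        scripts.add("HAN" if first == "CJK" else first)
    return scripts


def is_allowed(label):
    """Single-script labels pass; mixed ones must fit an allowed set."""
    used = scripts_in(label)
    if len(used) <= 1:
        return True
    return any(used <= allowed for allowed in ALLOWED_SETS)


print(is_allowed("google"))            # all Latin
print(is_allowed("g\u043e\u043egle"))  # Latin with Cyrillic 'o' look-alikes
print(is_allowed("abc\u4e2d\u3042"))   # Latin + Han + Hiragana
```

Under this sketch a label mixing Latin with Greek beta or eta, as in the Mari example earlier in the thread, would also be rejected.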
There are a lot of equally safe combinations. What about Latin + Devanagari + Tamil? There would be no look-alike characters, and it would allow a lot of people to put their name in multiple scripts that are likely to be meaningful to certain audiences (e.g. someone from Tamil Nadu sending an email to people throughout India and internationally). I'm sure that there are many other combinations that wouldn't have "look-alike" issues but which would be useful.