Gmail Recognizes Addresses Containing Non-Latin Characters
An anonymous reader writes: In response to the creation in 2012 by the Internet Engineering Task Force (IETF) of "a new email standard that supports addresses incorporating non-Latin and accented Latin characters", Google has now made it possible for its Gmail users to "send emails to, and receive emails from, people who have these characters in their email addresses." Its goal is to eventually allow users to create Gmail addresses using these characters.
So the next lot of phishing will come from: róót@gmail.com / Àdministrator@gmail.com or BìllGàtes@gmail.com etc?
Great.
Sendmail is like emacs: A nice operating system, but missing an editor and an MTA.
Now we'll get spam from addresses that use code pages whose characters just look like valid Latin characters.
Google updated their regular expression. Good for them.
Finally I can get motörhead@gmail.com!
From what I can tell, a mail server has two options when receiving this mail:
Accept it.
Reject it.
The default, with software that doesn't understand this RFC yet (which seems to be... just about everything), is to reject. So trying to use this as an email is not only going to mess up every form you try to fill in online (because they won't see it as an email address either), but quite likely just gets you bouncebacks from everyone you email.
What was needed was surely a system similar to the IDN system for internationalisation, which would allow those with ASCII-only DNS servers etc. to STILL WORK, by converting the Unicode characters to ASCII subsets and then sending the email as normal, through the entire PLANET-worth of working email servers out there that could accept it.
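For reference, the IDN-style conversion mentioned above can be sketched with Python's built-in "idna" codec (which implements the older IDNA 2003 rules). This is only an illustration: the helper name to_ascii_domain and the example address are made up, and the conversion applies to the domain only, because the local part has no equivalent ASCII fallback.

```python
# Sketch: convert only the domain of an address to its ASCII-compatible
# ("xn--") form using Python's built-in "idna" codec (IDNA 2003).
# The local part is passed through unchanged.

def to_ascii_domain(address: str) -> str:
    """Return the address with its domain in ASCII-compatible encoding."""
    # rpartition splits on the LAST "@", so the domain is always the tail.
    local, _, domain = address.rpartition("@")
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local}@{ascii_domain}"


print(to_ascii_domain("info@bücher.example"))  # info@xn--bcher-kva.example
```

An ASCII-only DNS server can resolve the converted domain as usual, which is the property the comment is asking for on the mailbox side as well.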
Having a content negotiation option at the SMTP level, that mail servers have to implement and handle specifically, is just ridiculous, and even with GMail's kickstart it could be decades before you can guarantee that your UTF-8 email address will work across the Internet and even then there'll be some old legacy server that will just bounce all your email BECAUSE of that character set in your address. And it will be perfectly legitimate to do so.
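To make the negotiation concrete: under the 2012 standard (RFC 6531), a server announces support by listing the SMTPUTF8 keyword in its EHLO reply, and a sender is only supposed to use a UTF-8 address if that keyword is present. A minimal sketch of the check, assuming the reply is already available as text; the function name and the sample reply are invented for illustration.

```python
def supports_smtputf8(ehlo_response: str) -> bool:
    """True if any line of an EHLO reply advertises the SMTPUTF8 keyword."""
    for line in ehlo_response.splitlines():
        # Reply lines look like "250-KEYWORD params" or "250 KEYWORD params".
        if len(line) < 4 or not line.startswith("250"):
            continue
        keyword = line[4:].split(" ", 1)[0]
        if keyword.upper() == "SMTPUTF8":
            return True
    return False


# Hypothetical server reply, made up for this example.
sample = (
    "250-mx.example.net at your service\r\n"
    "250-SIZE 35882577\r\n"
    "250-8BITMIME\r\n"
    "250 SMTPUTF8"
)
print(supports_smtputf8(sample))  # True
```

Any relay along the path that does not advertise the keyword forces a bounce, which is the failure mode described above.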
However, as others have pointed out, if this goes through, it will be nigh-on impossible to spot phished/faked email addresses, just like it is with IDN links unless you know how to find the original ASCII-encoding of them.
My e-mail address ends with the suffix ".name". It is perfectly correct (even if not common), but I still sometimes have issues today because some stupid website has an outdated regular expression which says that ".name" is not correct.
Now imagine this with non-latin characters (or just non-ASCII characters)... If you only write to people also using GMail, it might work.
The Internet is about interoperability. Intentionally breaking that goes against everything that has made the Internet worthwhile.
I hope they implement the same kind of anti-phishing measures that browsers are taking for displaying domain names with non-Latin scripts. http://en.wikipedia.org/wiki/I...
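One of the measures browsers use is refusing to display names that mix scripts. A rough sketch of that idea for a local part, assuming the first word of each letter's Unicode character name is a good enough stand-in for its script (it is an approximation, not the real Unicode Script property). The function names are hypothetical.

```python
import unicodedata


def scripts_used(text: str) -> set:
    """Approximate the scripts in `text` from each letter's Unicode name."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            # e.g. "LATIN SMALL LETTER A" -> "LATIN"
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts


def looks_mixed_script(local_part: str) -> bool:
    """Flag local parts whose letters come from more than one script."""
    return len(scripts_used(local_part)) > 1


print(looks_mixed_script("p\u0430ypal"))     # True: contains a Cyrillic "а"
print(looks_mixed_script("mot\u00f6rhead"))  # False: accented Latin is still Latin
```

This catches the Cyrillic-in-Latin trick but not look-alikes within a single script, so it is only a partial defense.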
How on earth am I supposed to email someone when I don't even have a key that corresponds to a letter in their email address? And no, I'm not keeping a huge chart of Alt+number combinations handy.
You cannot use solely international characters; the first one needs to be simple ASCII.
http://www.irongeek.com/homogl...
Maybe now my e-mails to Tutankhamun will quit bouncing.
Sheesh, evil *and* a jerk. -- Jade
By default Ubuntu doesn't unless your codepage requires it. Most of the 'complete' Unicode fonts aren't included by default.
The IETF devoted time to a _new_ email standard and still hasn't fscking solved the spam problem?
WTF? Send them to bed without dinner.
So there was an IRC network running in CP1251.
So there were attempts to substitute letters and impersonate other users, so an ircd patch was written to treat all similar-looking characters as Latin.
The next wave of Unicode dirty hacks and workarounds will surely be even stranger than that. // irccity, nya.
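The kind of patch described above can be sketched as a folding table applied before nicknames are compared. This is not the actual ircd patch: it works on Unicode strings rather than CP1251 bytes, the table covers only a handful of Cyrillic letters, and all names are made up for illustration.

```python
# Tiny, deliberately incomplete table of Cyrillic letters that look like
# Latin ones. A real deployment would need a far larger table.
LOOKALIKES = str.maketrans({
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0445": "x",  # CYRILLIC SMALL LETTER HA
    "\u0443": "y",  # CYRILLIC SMALL LETTER U
})


def fold_nick(nick: str) -> str:
    """Lowercase the nick, then map look-alike letters onto Latin ones."""
    return nick.lower().translate(LOOKALIKES)


def same_identity(a: str, b: str) -> bool:
    """Treat two nicks as the same user if they fold to the same string."""
    return fold_nick(a) == fold_nick(b)


print(same_identity("\u0440\u0430ul", "paul"))  # True: Cyrillic "ра" + "ul"
print(same_identity("paul", "saul"))            # False
```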
Don't be afraid of it.
My e-mail address ends with the suffix ".name". It is perfectly correct (even if not common), but I still sometimes have issues today because some stupid website has an outdated regular expression which says that ".name" is not correct.
I'm still waiting for web developers to realize that a "+" character is valid in the local-part of an e-mail address. It's been around since RFC 822, and yet the web folks still can't get their shit together.
That "signed char" was a bad coding choice back in the day.
5 times pile of poo at gmail dot com!
They better be filtering out the non-printing characters that do fun stuff like reverse the text direction, overstrike, etc. How long until people start registering gmail addresses with Zalgo text?
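A filter along those lines could look roughly like this. It is a sketch under stated assumptions: it rejects every format control (Unicode category Cf, which includes the direction overrides but also joiners that some scripts legitimately need), and the limit of two stacked combining marks is an arbitrary threshold chosen for the example.

```python
import unicodedata


def is_display_safe(text: str, max_marks: int = 2) -> bool:
    """Reject control/format characters and long runs of combining marks."""
    run = 0  # length of the current run of combining marks
    # NFC first, so ordinary accents are folded into precomposed letters.
    for ch in unicodedata.normalize("NFC", text):
        category = unicodedata.category(ch)
        if category in ("Cc", "Cf"):
            return False  # e.g. U+202E RIGHT-TO-LEFT OVERRIDE
        if category in ("Mn", "Me"):
            run += 1
            if run > max_marks:
                return False  # Zalgo-style stacking
        else:
            run = 0
    return True


print(is_display_safe("mot\u00f6rhead"))    # True
print(is_display_safe("abc\u202edef"))      # False: direction override
print(is_display_safe("e" + "\u0336" * 4))  # False: four stacked overlays
```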
And how long until someone registers pile of poo @gmail.com?
#naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
As a webdev who gets irritated at websites that fail badly with their email validation (e.g. not allowing + in the local part, or only allowing 2 or 3 char TLDs), I do try very hard to get this right. So I've got a solid(ish) email validation function. But, I'm a bit sketchy on what to do with UTF-8.
For the domain, I'd hope that the MTA (Postfix in my case) would allow UTF-8 and convert to punycode as required, but I'm not sure it does. So currently I don't allow for that. I _could_ convert to punycode myself, but I don't.
And as for the local-part, I'm fairly certain Postfix doesn't allow for UTF-8 at present.... at least, not the Postfix version supported on Debian 7.
So I'm just wondering what everyone else is doing? Should I improve my support, or should I just wait for support to be added to my MTA before I bother?
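For what it's worth, a permissive check along the lines described might look like the sketch below. It is not a full RFC parser: quoted local parts and address literals are ignored, the 64-octet limit on the local part is the only length check, and the domain goes through Python's built-in "idna" codec (IDNA 2003), which does not validate ASCII labels beyond their length. The function name is hypothetical, and whether the MTA will actually deliver to the result is a separate question.

```python
import re

# ASCII "atext" characters plus the dot; note that "+" is in the set.
_ATEXT = r"A-Za-z0-9!#$%&'*+/=?^_`{|}~.\-"
# Allow any non-ASCII code point in the local part as well.
_LOCAL_RE = re.compile("[" + _ATEXT + r"\u0080-\U0010FFFF]+")


def check_address(address: str):
    """Return (local_part, ascii_domain), or None if the address is rejected."""
    local, at, domain = address.rpartition("@")
    if not at or not local or not domain:
        return None
    if not _LOCAL_RE.fullmatch(local):
        return None
    if local.startswith(".") or local.endswith(".") or ".." in local:
        return None
    if len(local.encode("utf-8")) > 64:
        return None
    try:
        ascii_domain = domain.encode("idna").decode("ascii")
    except UnicodeError:
        return None  # empty or over-long label, or unencodable characters
    return local, ascii_domain


print(check_address("user+tag@example.com"))
print(check_address("motörhead@bücher.example"))
print(check_address("a..b@example.com"))
```

Returning the converted domain lets the caller store or send the ASCII form even if the MTA does not do the punycode conversion itself.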
Probably set up so that if the Russian gets bounced, it tries again with the latin alphabet.
Also, the signature of all emails sent from this should have a copy of the latin email address, so that people that don't have the Russian capability can reply.
excitingthingstodo.blogspot.com
Now that Google has implemented 2012 i18n technology, maybe vaunted technology site Slashdot can catch up to 1998 and implement UTF-8 properly?
Nah.
Liberty in your lifetime
First off, I went down the slippery, death-defying slope of email address validation recently. Our software had simple regex rules, so I thought I would just implement the RFC rules, or find a library that did. Wow. The RFC is a mess, and the APIs are worse.
This is a valid email address:
dude"".dude@[192.168.1.1]
so is this:
a@com
also valid:
test+test=gmail.com@test.com
None of those will work in MS Outlook or Exchange, and none of them will work with the jQuery validation plug-in; some addresses close to them will work with the JavaMail API. Most funky but standards-compliant email addresses will pass Apache Commons validation.
In the end, I went with a two-part validation: 1) Apache Commons validation (mostly RFC-correct), then 2) a second pass through javax.mail, because if I can't send email to it, then what is the point of having it? We still get addresses that pass both validations and bounce at some SMTP relay due to "invalid address format."
I am sure internationalization will make all this better.
Isn't this something that was introduced years ago?
...between these two addresses:
firstname.lastname@gmail.com
firstnamelastname@gmail.com
I keep getting email at the former addressed to the latter. Anyone else encounter this oddity with Gmail?
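That matches how Gmail treats dots: they are ignored in the local part of gmail.com addresses, so both spellings reach the same mailbox. A small sketch of the equivalence, with a made-up helper name; it also strips a "+tag" suffix, which Gmail likewise delivers to the base address.

```python
def gmail_canonical(address: str) -> str:
    """Collapse a gmail.com address to one form; leave other domains alone."""
    local, _, domain = address.rpartition("@")
    domain = domain.lower()
    if domain == "gmail.com":
        # Drop any "+tag" suffix, then the dots, then normalize case.
        local = local.split("+", 1)[0].replace(".", "").lower()
    return f"{local}@{domain}"


a = gmail_canonical("firstname.lastname@gmail.com")
b = gmail_canonical("firstnamelastname@gmail.com")
print(a == b)  # True: both map to firstnamelastname@gmail.com
```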
What's next, smiley faces in phone numbers?
We're one step closer to "I'll just text you the address" level of frustration with someone who needs a cutesy e-mail address that you can tell them over the phone.
It's like IPv6 . . . I want to remember an address by the time I cross the room.
The "From:" header has been spoofable in ASCII since the beginning of e-mail. Given its unreliability, you are foolish if you put much stock into it.
Fascism should more properly be called corporatism because it is the merger of state and corporate power. -- Mussolini
I just can't help being amazed at how fast the tech industry keeps moving and innovating!
It feels like yesterday when the first Unicode specification was published 23 years ago.
Just amazing! What next?
Looking forward to wingdings in my email address.
Try using those email addresses to register on various web sites and watch them say "invalid email address".