Scammers Abuse Multilingual Domain Names (bbc.com)
Cyber-criminals are abusing multilingual character sets to trick people into visiting phishing websites. BBC: The non-English characters allow scammers to create "lookalike" sites with domain names almost indistinguishable from legitimate ones. Farsight Security found scam sites posing as banks, loan advisers and children's brands Lego and Haribo.
Smartphone users are at greater risk as small screens make lookalikes even harder to spot. The Farsight Security report looked at more than 100 million domain names that use non-English character sets -- introduced to make the net more familiar and usable for non-English speaking nations -- and found about 27% of them had been created by scammers. It also uncovered more than 8,000 separate characters that could be abused to confuse people.
Farsight founder Paul Vixie, who wrote much of the software underpinning the net's domain names, told the BBC: "Any lower case letter can be represented by as many as 40 different variations."
small screens make lookalikes even harder to spot....Farsight Security
Yes, this does sound like a job better suited for Nearsight.
Saw this coming years ago. Unicode assignment is a god-awful mess, made worse now that nearly every single noun has an emoji version. Pity that we're probably stuck with it until the end of humanity.
Safe use of the Internet requires digital "street smarts."
One should not need to be told that it is unsafe to click links in emails, or that virus scanners don't alert you via popups on a web page. An understanding of the basics of how these things work makes that obvious, and makes safe browsing practices just as obvious.
The industry has bent over backwards to grant access to swarms of people too stupid to be safe online.
So, the scammers take them for all they are worth.
Personally, I consider stupidity to be a vice (and largely a choice), so I don't have much sympathy for people who fall for this sort of thing.
Seriously... what were they thinking?!
Browsers should have you choose a language and, by default, not allow URLs containing characters from other languages. You could then go into a settings page and either allow everything or populate a list of acceptable languages. It should at least give a popup.
DNS entries are ASCII. Punycode is a way to put Unicode in ASCII in a way that is sort of mostly human readable. If you are an English speaker (AKA an ASCII character user), always set your browser to display the raw Punycode and not the Unicode code points. For the less technical but still English-speaking, you should be fine as long as you only visit sites with HTTPS. No reputable CA should be signing EV certs for Punycode names that look like English words. Ones that do will quickly be removed from the browsers.
For the non-English, you're f#@ked. Seriously. This was a god-awful idea. We are going to return to an English-only internet because everything else will be untrustworthy.
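To make the "DNS entries are ASCII" point concrete, here is a minimal sketch using Python's built-in "idna" codec. The hostname bücher.example and the helper name to_dns_form are illustrative assumptions, not anything from the report, and the codec follows the older IDNA 2003 rules, so a browser may treat some names differently.

```python
def to_dns_form(hostname: str) -> str:
    """Return the ASCII ("xn--") form of a hostname, label by label.

    Hypothetical helper for illustration; it relies on Python's built-in
    "idna" codec, which implements the older IDNA 2003 rules.
    """
    return hostname.encode("idna").decode("ascii")


if __name__ == "__main__":
    # A name containing a non-ASCII letter is rewritten with an xn-- prefix.
    print(to_dns_form("b\u00fccher.example"))  # xn--bcher-kva.example
    # A plain ASCII name passes through unchanged.
    print(to_dns_form("www.example.com"))      # www.example.com
```

The xn-- form is what actually goes into the DNS query; the Unicode form is only what the browser chooses to show you.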
I remember this was a big deal, what, 10 years ago? Various desktop browsers implemented features to make the real URL of websites more obvious, and then a variety of TLDs were certified as not allowing such domain name spoofing. Everything old is new again, huh?
Never saw that coming.
Not at all.
"Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
In Firefox's about:config page,
set network.IDN_show_punycode to true
to force Firefox to always display the Punycode, e.g.:
https://www.xn--80ak6aa92e.com...
Good write-up here (which is where the above example, which looks like 'www.apple.com', comes from):
https://www.xudongz.com/blog/2...
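If you want to see what is actually hiding in that hostname, a rough sketch with Python's built-in "punycode" codec follows. The function name decode_hostname is made up for illustration, and this does raw Punycode decoding only, without the extra validation a browser applies.

```python
import unicodedata


def decode_hostname(hostname: str) -> str:
    """Decode every xn-- label of a hostname back to Unicode.

    Hypothetical helper: raw Punycode decoding only, no IDNA validation.
    """
    labels = []
    for label in hostname.split("."):
        if label.lower().startswith("xn--"):
            # Strip the "xn--" prefix and decode the rest as Punycode.
            label = label[4:].encode("ascii").decode("punycode")
        labels.append(label)
    return ".".join(labels)


if __name__ == "__main__":
    decoded = decode_hostname("www.xn--80ak6aa92e.com")
    # List the code point and Unicode name of each character in the
    # middle label, which renders like "apple" in many fonts.
    for ch in decoded.split(".")[1]:
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Every one of the five characters comes out as a Cyrillic letter, not a Latin one, which is why the name renders like 'apple' while pointing somewhere else entirely.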
Give an option to disable the display of IDNs. Instead display the "Punycode" translation of the name.
Better yet, make that the default for English and any other language that doesn't require non-ASCII characters.
I can understand the logic behind adding support for characters that weren't necessarily a priority back when the internet was a project of DARPA and some mostly anglophone universities; but are there any non-scam, non-amusing-novelty use cases for mixed-alphabet domain names?
I ask in sincere curiosity. With the possible exception of non-Latin alphabets used alongside Hindu-Arabic numerals, I can't think of any situation where a human natural language is written such that it would use domain names that are a mixture of multiple alphabets from a Unicode perspective. (And if there were such a language, it would arguably be on Unicode to fix that by assigning the necessary codepoints to the alphabet currently being cobbled together out of several: since Unicode is about characters rather than glyphs, the fact that the same symbol is used doesn't make it the same thing for Unicode purposes, as with all the Greek letters that get one codepoint as mathematical symbols and another as Greek letters, or the visually identical overlaps between Latin and Cyrillic that get coded as completely distinct things because they are.) But what I don't know about linguistics and contemporary natural language usage is very much not an impressive argument.
Are there any legitimate/expected use cases; or should a domain name cobbled together from multiple alphabets be treated as deeply suspicious in essentially all cases?
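For what it's worth, flagging a mixed-alphabet label is easy to sketch. Python's standard library doesn't expose the Unicode Script property, so the snippet below guesses the script from the first word of each character's Unicode name, which is a crude heuristic and an assumption on my part. The function names and the "paypal" lookalike label are made up for illustration.

```python
import unicodedata


def scripts_in_label(label: str) -> set:
    """Guess which scripts the letters of a label come from.

    Heuristic only: takes the first word of the Unicode character name
    (e.g. "LATIN", "CYRILLIC") as a stand-in for the Script property.
    Digits and hyphens are ignored.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed_script(label: str) -> bool:
    """True if the letters of the label appear to span several scripts."""
    return len(scripts_in_label(label)) > 1


if __name__ == "__main__":
    # Latin "p" + Cyrillic "a" (U+0430) + Latin "ypal": flagged.
    print(is_mixed_script("p\u0430ypal"))                     # True
    # Plain Latin label: not flagged.
    print(is_mixed_script("apple"))                           # False
    # All-Cyrillic lookalike of "apple": also not flagged.
    print(is_mixed_script("\u0430\u0440\u0440\u04cf\u0435"))  # False
```

Note the last case: a lookalike written entirely in one non-Latin alphabet sails straight through a mixed-script check, so treating mixed labels as suspicious would catch only part of the problem.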