Scammers Abuse Multilingual Domain Names (bbc.com)
Cyber-criminals are abusing multilingual character sets to trick people into visiting phishing websites. BBC: The non-English characters allow scammers to create "lookalike" sites with domain names almost indistinguishable from legitimate ones. Farsight Security found scam sites posing as banks, loan advisers and children's brands Lego and Haribo.
Smartphone users are at greater risk as small screens make lookalikes even harder to spot. The Farsight Security report looked at more than 100 million domain names that use non-English character sets -- introduced to make the net more familiar and usable for non-English speaking nations -- and found about 27% of them had been created by scammers. It also uncovered more than 8,000 separate characters that could be abused to confuse people.
Farsight founder Paul Vixie, who wrote much of the software underpinning the net's domain names, told the BBC: "Any lower case letter can be represented by as many as 40 different variations."
small screens make lookalikes even harder to spot....Farsight Security
Yes, this does sound like a job better suited for Nearsight.
Saw this coming years ago. Unicode assignment is a god-awful mess, made worse now that nearly every single noun has an emoji version. Pity that we're probably stuck with it until the end of humanity.
Safe use of the Internet requires digital "street smarts."
One should not need to be told that it is unsafe to click links in emails, or that virus scanners don't alert you via popups on a web page. An understanding of the basics of how these things work makes that obvious, and makes safe browsing practices just as obvious.
The industry has bent over backwards to grant access to swarms of people too stupid to be safe online.
So, the scammers take them for all they are worth.
Personally, I consider stupidity to be a vice (and largely a choice), so I don't have much sympathy for people who fall for this sort of thing.
Seriously... what were they thinking?!?!
Hooray!
This was new (news?) about a decade ago, perhaps more. This just goes to show that what is old becomes new again.
Or rather that eventually the yung'uns discover what we old farts have known for ages ... and they think it is a new discovery!
Browsers should have you choose a language and not allow sites in other languages (in the url) by default. You go in somewhere and say allow everything or populate a list of acceptable languages. It should at least give a popup.
DNS entries are ASCII. Punycode is a way to encode Unicode in ASCII that stays sort of, mostly, human readable. If you're an English speaker (AKA an ASCII character user), always set your browser to display the raw Punycode and not the Unicode code points. For the less technical but still English speaking, you should be fine as long as you only visit sites with HTTPS. No reputable CA should be signing EV certs with Punycode that looks like English words. Ones that do will quickly be removed from the browsers.
For the non-English, you're f#@ked. Seriously. This was a god-awful idea. We are going to return to an English-only internet because everything else will be untrustworthy.
I remember this was a big deal - what, 10 years ago? Various desktop browsers implemented features to make the real URL of websites more obvious, and then a variety of TLDs were certified as not allowing such domain name spoofing. Everything old is new again, huh?
Never saw that coming.
Not at all.
"Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
In Firefox's about:config page,
set network.IDN_show_punycode to true
to force Firefox to always show the Punycode, e.g.:
https://www.xn--80ak6aa92e.com...
Good write-up here (where the above example, which looks like 'www.apple.com', comes from):
https://www.xudongz.com/blog/2...
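For anyone curious what that xn-- label actually hides, here is a minimal Python sketch that decodes it with the standard library's raw punycode codec. The helper name decode_label is made up for illustration, and it skips the validation a real IDNA implementation performs, so treat it as a peek under the hood rather than how a browser does it.

```python
import unicodedata

def decode_label(label: str) -> str:
    """Decode one DNS label; labels without the xn-- prefix are returned as-is."""
    if label.lower().startswith("xn--"):
        # Strip the ACE prefix and run the raw Punycode decoder (no IDNA checks).
        return label[4:].encode("ascii").decode("punycode")
    return label

decoded = decode_label("xn--80ak6aa92e")
for ch in decoded:
    # Print each code point alongside its official Unicode name.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

All five code points come out as Cyrillic letters, none of them Latin, which is why the name can render like 'apple' while being a completely different domain.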
Safe use of the road requires "road smarts". But we found that we could make road use safer by, say, introducing seat belts, airbags, crumple zones, and so on.
We could have done something similar for the computers connected to the public internet. No html rendering in emails, no idiot accents in domain names, no toy OSes that drop their pants and shout "COME GET IT BOYS" at the slightest provocation. But we didn't. We liked our crap protocols and idiot committees and silly standards that were anything but. We liked the playground full of monkeys producing "code" that is flat-out unsafe to use. We blamed "the user", while trying to make'im even dumber. The entire shtick was "no training needed, it's intuitive!"
So we elevated stupidity to the gold standard of "using a computer". And then we stuffed the 'net with as many idiots as we could find, repeating again and again that you didn't have to be smart, or have "street smarts", or whatnot, to "be online".
So we get stuck with the finest idiocies, the very idiotest of idiot, with the worst software that bends over backwards to better serve the most idiotic people and ideas both, and so on, and so forth.
You can't just "have no sympathy" for those people, since we (well, certain companies' marketeering departments, but everyone else helped, even if only by simple acquiescence) did not merely let them in, or invite them in. We went out and sought them out and "connected" them.
And just like with many other things, you need a certain percentage of people knowledgeable or the knowledge becomes drowned in stupidity. We've dropped well into the range where the stupidity doesn't merely drown out the smart, it's self-reinforcing. Including idiot posts like yours. Reasons why left as an exercise for the less than entirely stupid.
Mixing upper and lower thresholds in one sentence - please stop doing that. That's just like "Save up to 95% on select in-store items!" It's completely meaningless other than to attempt to grab attention. It's just abusing a typically small number of outliers to suggest a much broader fact.
I work for the Department of Redundancy Department.
Give an option to disable the display of IDNs. Instead display the "Punycode" translation of the name.
Better yet, default that for English and any other language that doesn't require non-ASCII characters.
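A rough Python sketch of that display policy, assuming a hypothetical helper called to_display_form: any label containing non-ASCII characters is shown in its xn-- form, and everything else is left alone. It uses the raw punycode codec and skips IDNA normalization, so it is an illustration of the idea, not what any browser actually ships.

```python
def to_display_form(hostname: str) -> str:
    """Return the hostname with every non-ASCII label replaced by its xn-- form."""
    shown = []
    for label in hostname.split("."):
        if label.isascii():
            shown.append(label)
        else:
            # Raw Punycode plus the ACE prefix; real IDNA also normalizes first.
            shown.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(shown)

# The first letter below is CYRILLIC SMALL LETTER A (U+0430), not a Latin "a".
print(to_display_form("\u0430pple.com"))  # xn--pple-43d.com
```

A plain ASCII name such as example.com passes through unchanged, so English-only users would only ever see the ugly form on names that deserve a second look.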
Everything old is new again, huh?
This is just the chickens coming home to roost.
It's also a good example of how letting the idiots do their thing brings on the idiocy. You cannot have a browser that doesn't leak memory because of the complexity of "the DOM". Websites are insecurable because of the way HTML is written and driven. HTML is the way it is because --in the words of the late Erik Naggum-- the W3C is built on the idea that competence doesn't really matter if you're in a committee. IOW, being stupid is okay if you're being stupid together. Their complete cowing to the WHATWG (Google et al.) shows that they don't even have any agency left, leaving us wondering if they ever had any. But it's no surprise for anyone who has ever tried to read their "standards" with a critical eye.
Unicode similarly is built on a highfalutin' premise that "nobody could have predicted" would leave us in dire straits. They're busily eating their own tail and pooping emojiiiii. Because that's a really important feature to have, being able to drop smiling poops on your text. And have big fights about whether the "gun" is a water pistol or more like a people killing one, and some vendors do it like this, some do it like that. Also really important. And the diversity, oh the inclusiveness of it all. Whole families!
The intersection of both kinds of idiocy is in IDN. Yes, we all predicted it wouldn't end well. It didn't. We still let it happen.
About a decade ago we had domains like paypai.com (presented in the email with the i capitalized) and Unicode doppelganger domains like this story mentions. How is this news?
W3C didn't really have much choice in the matter. They rejected the two proposals that were later merged to become HTML5. The browser vendors and others went off and formed WHATWG to develop HTML5, saying they would not implement XHTML 2.0.
The mistake, or lack of foresight, was made much earlier, in the design of XHTML 1.0. That required a rewrite that wasn't backward compatible, XHTML 2.0, which didn't meet the needs of the way the web was evolving.
... unicode ?!
IT _SHouLD_ BE LoLo-L-o-LO-LOL-OLO-LOLOL-OL-OL.
I've often wondered how much malware could have been stopped if email clients simply didn't allow clickable links
I can understand the logic behind adding support for characters that weren't necessarily a priority back when the internet was a project of DARPA and some mostly anglophone universities; but are there any non-scam/amusing novelty use cases for mixed-alphabet domain names?
I ask in sincere curiosity. With the possible exception of non-Latin alphabets used alongside Hindu-Arabic numerals, I can't think of any situation where a human natural language is written such that it would use domain names that are a mixture of multiple alphabets from a Unicode perspective. (And if there were such a language, it would arguably be on Unicode to fix that by assigning the necessary codepoints to the alphabet currently being cobbled together out of several: since Unicode is about characters rather than glyphs, the fact that the same symbol is used doesn't make it the same thing for Unicode purposes, as with all the Greek letters that get one codepoint as mathematical symbols and another as Greek letters, or the visually identical overlaps between Latin and Cyrillic that get coded as completely distinct things because they are.) But what I don't know about linguistics and contemporary natural language usage is very much not an impressive argument.
Are there any legitimate/expected use cases; or should a domain name cobbled together from multiple alphabets be treated as deeply suspicious in essentially all cases?
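For what it's worth, flagging such names is easy to sketch in Python. The standard library doesn't expose the Unicode Script property, so this approximates a character's script with the first word of its Unicode name; that heuristic and both helper names are assumptions for illustration, and a real check would want a proper script table. Digits and hyphens are ignored on purpose.

```python
import unicodedata

def scripts_in_label(label: str) -> set:
    """Approximate the set of scripts in a label from Unicode character names."""
    scripts = set()
    for ch in label:
        if ch.isdigit() or ch == "-":
            continue  # digits and hyphens are shared across scripts
        # Heuristic: "LATIN SMALL LETTER A" -> "LATIN", "CYRILLIC ..." -> "CYRILLIC".
        name = unicodedata.name(ch, "UNKNOWN")
        scripts.add(name.split(" ")[0])
    return scripts

def is_mixed_script(label: str) -> bool:
    return len(scripts_in_label(label)) > 1

print(is_mixed_script("\u0430pple"))  # Cyrillic first letter + Latin "pple": True
print(is_mixed_script("apple"))       # pure Latin: False
print(is_mixed_script("\u0430\u0440\u0440\u04cf\u0435"))  # all Cyrillic: False
```

Two caveats. An all-Cyrillic lookalike sails straight through this check, so it only catches the lazier spoofs. And Japanese names that combine kanji with kana would be flagged (CJK plus HIRAGANA under this heuristic), so a blanket "mixed means suspicious" rule would need exceptions.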
Been going on for quite awhile, hasn't it?
The framing of this article is distorting. The right question is, what tools are most used and most effective for scammers to steal from people? Then see if internationalised domain names make the top of this list. Instead, the article frames the topic as:
> Cyber-criminals are abusing multilingual character sets to trick people into visiting phishing websites.
Well sure, cyber criminals are abusing everything they can to trick people. But what I understand from security experts is that the top tools have nothing to do with internationalised domain names. Authoring an email where the link text says "Friendlybank.com", but the underlying url is "scammer.com" is a big one. Another is using a long domain name like "friendlybank.com.login.distraction.scammer.com", and relying on the mark not to read the whole domain name. Exploiting confusables within a single script, such as "Il" (capital 'i' vs lower-case 'L') or "O0" (letter capital-O vs digit zero) is also a tool. Abusing the large character set of internationalised domain names is definitely a tool, just not one of the most significant ones.
> found about 27% of [domain names in non-latin domains] had been created by scammers
Taking that statistic at face value: in .com and .net and .uk, what is the fraction of domain names which have been created by scammers?
I understand that journalists frame articles to illuminate a point of view or to tell a story. But if we want to understand security risks, the framing used by this article is a distortion.
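The first two tricks described above (link text that doesn't match the real URL, and a trusted name buried in a long chain of subdomains) need no Unicode at all, and both are simple to check for. A minimal Python sketch, with hypothetical helper names, assuming the link text is itself a bare hostname and using a naive "last two labels" rule for the registrable domain (which is wrong for suffixes like .co.uk):

```python
from urllib.parse import urlsplit

def naive_registrable_domain(hostname: str) -> str:
    """Last two labels only; a simplification that ignores suffixes like .co.uk."""
    return ".".join(hostname.lower().rstrip(".").split(".")[-2:])

def link_text_mismatch(link_text: str, href: str) -> bool:
    """True when the visible text names a different domain than the href targets."""
    target = urlsplit(href).hostname or ""
    shown = link_text.strip().lower()
    return naive_registrable_domain(shown) != naive_registrable_domain(target)

print(link_text_mismatch("Friendlybank.com", "http://scammer.com/login"))  # True
print(naive_registrable_domain("friendlybank.com.login.distraction.scammer.com"))  # scammer.com
```

Reading the hostname from the right-hand end is the whole point: whatever comes before the last couple of labels is just decoration chosen by whoever owns the domain.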
See the uproar over the {U+0262}oogle.com domain a couple of years ago. The merry Russian prankster doing that was just playing "Hey! Look what I did! Ha Ha Ha!" with it, whoever he could get to click on it, but it was certainly obvious then that it could be used for nefarious purposes.
I use the font Verdana where possible -- the letters all look different.
The lI is bullshit that every font designer believes in.