How To Evade URL Filters With (Not-So) Fancy Math
Trailrunner7 writes "In their constant quest to find new and interesting ways to abuse the Internet, attackers recently have begun using an old technique to obfuscate URLs and IP addresses to bypass URL filters and direct users to malicious sites. The technique takes advantage of the fact that modern browsers will allow users to specify IP addresses in formats other than base 10. So a typical IP address that looks something like this — 192.10.10.1 — can also be written in base 8, hexadecimal or a handful of other formats, and the browser will recognize it and take the user to the specified site. What is interesting though is that due to the relative obscurity of using such methods to denote an IP or URL, it is quite feasible that existing security products do not correctly identify the URLs as valid or flag them as malicious when they point to existing known bad websites."
The linked article is next to worthless. The real details are in this blog post.
I actually preferred using a url with the 10 digit number that was my base 10 IP address in E-Mails as it got people's attention in an otherwise bland sea of domains. This has been a feature of libc as long as I can remember (in Linux you should be able to ping an IP address in some other number base) but Firefox actually makes an effort to disallow using IP addresses with this notation. So if they're using Firefox, it won't work so well.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
That's the same combination I have on my luggage!
It doesn't matter which way you enter the address into your browser, it still resolves to the same IP. If that IP is blocked, you won't get through even if you use this method.
FTFA:
In other words, no testing has been done at all. What is this poorly-thought-out bit of speculation doing on the front page of Slashdot?
"A week in the lab saves an hour in the library"
All the alternate methods of specifying IP addresses for URLs work in Chrome. When you mouse over the link, you see it with the traditional decimal IP address, so it's not as obfuscated as it could be. Similarly when you reach the site, the URL displayed is in the traditional format.
Addresses like http://0xdeadbeef/ and http://0xdeadd00d/ are assigned to a Chinese telecom company (they have all of 0xdead....).
You can't just do things like this based on the syntax of the input, but rather on the semantics. In this case, to properly block the URLs, you need to parse them and transform them into an abstract representation of what they mean, e.g. a struct that encodes the protocol, host, port, document and query strings, and then examine the parse result to check if it matches the rule.
The IT industry just systematically fails this over and over, because of people's bad habit of doing shit with regular expressions instead of parsing and semantic analysis. See, for example, the gazillion ways that people get around cross-site scripting filters; or if you want to see it from the other angle (generation instead of parsing), see SQL injection.
Are you adequate?
The problem with this approach is that the requested URL doesn't provide a hostname, just the IP address. As IP addresses are in short supply, it has been an extremely common practice for years to assign multiple websites to a single IP address, otherwise known as name-based virtual hosting. This is common even for large companies. When you specify the URL with an IP address, the browser doesn't provide an appropriate Host: HTTP header, so any web server set up this way won't know which of the many websites it hosts should be returned. This means that anybody browsing the web with this technique will find that some websites work and some won't, seemingly at random to them.
Bogtha Bogtha Bogtha
Who thought it was a good idea to allow IP addresses to be entered in so many different formats? Who are you to decide that 0x01 is not a domain name? This is a feature which is hardly ever going to be used legitimately, but the code must be written and tested. KISS. Keep it simple, stupid.
Here is some text to get past the filter.
I'm glad Slashdot is here to tell us about these things, or else I might not have found this important security bulletin.
We must have had 20 different ways to get to goatse.cx.
I didn't need 20 different ways. I just had it bookmarked for quick and easy viewing.
Unfortunately you now cannot configure your ADSL modem until you install and configure local DNS and add the modem to the zone. Hardly something most grandmothers can do.
Not all IP address filtering is done by IP firewalls. These days there are many applications, most notably web browsers, that consult online databases of known or suspected malicious hosts in order to protect users from malicious hosts. I know for a fact that Firefox and Safari do this--if you try to go to a known suspected malware site, the browser pops up a warning page instead of the page you asked for. Google also do it for their search results--suspected malware site results don't link to the site in question, they link to a warning page. Many websites also have anti-XSS submission filters that perform textual matching against known "bad" addresses, to protect their users from attacks.
Apparently, many such programs are not parsing the textual IP addresses into a canonical form, and are therefore vulnerable to this sort of obfuscation. So the typical result here is that a comment submission system will fail to block a comment that has some XSS in it, and the users' browsers, running on a network whose firewally doesn't filter the IP address in question, will then fetch a malicious script from a known malware site.
Are you adequate?
The author apparently does not realize this, but you can also partly concatenate octets and mix various notations:
http://0x4a.8196963/
And yes, congratulations on being cutting edge: this thing is so old and well-known that it's even explicitly covered in RFC 3986, section 7 ("Security Considerations"), subsection 7.4 ("Rare IP Address Formats").
This is actually just a watered down version of a very, very old trick wherein you'd take a URL like http://3273372964/en/weblog?weblogid=208188044 and insert www.cnn.com@ before the ip address in long form. This of course meant the browser would try to login to the "real" website with the login "www.cnn.com". So you'd end up with a url that looked very much like it was part of CNN's website but was in fact something else entirely. I'd show you a demonstration URL, but Slashdot filters out the obfuscating part of urls formatted in that way so it would look identical.
At any rate, these days, not only do forums like Slashdot actively weed out those sorts of URLs as obvious attempts at obfuscation, but browsers pretty much universally will throw up a warning before you taking you to a website obfuscated in that manner. And as a result, that trick long ago fell out of fashion.
But it seems everything old is new again, if you wait long enough.