Firefox, Opera Allow Phishing By Data URI Claims New Paper
hypnosec writes "A student at the University of Oslo, Norway has claimed that phishing attacks can be carried out through the use of data URIs, and that users of Firefox and Opera are vulnerable to such attacks. Malicious web pages can be stored in data URIs (Uniform Resource Identifiers), whereby an entire webpage's code is stuffed into a string which, if clicked on, instructs the browser to unpack the payload and present it to the user in the form of a page. This is where the whole thing gets a bit dangerous. In his paper, Phishing by data URI [PDF], Henning Klevjer claims that through his method he was able to successfully load the pages in Firefox and Opera. The method, however, failed on Google Chrome and Internet Explorer."
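For readers unfamiliar with the mechanism, here is a minimal Python sketch of how a whole page ends up inside one string. The function name and the sample page are made up for illustration; the only fixed part is the `data:<media type>;base64,<payload>` shape of the URI.

```python
import base64

def html_to_data_uri(html: str) -> str:
    """Pack an HTML document into a single data: URI (hypothetical helper)."""
    payload = base64.b64encode(html.encode("utf-8")).decode("ascii")
    return "data:text/html;charset=utf-8;base64," + payload

# A harmless sample page; any HTML string would be packed the same way.
page = "<!doctype html><title>Demo</title><p>Hello from a data URI.</p>"
uri = html_to_data_uri(page)

print(uri[:60] + "...")
print(len(page), "bytes of HTML ->", len(uri), "characters of URI")
```

The page lives entirely in the link, so the browser has nothing to fetch when it opens it.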
In other words, IE and Chrome do not implement the data URI to the specification.
Lucky them, they can pose now as "more secure".
Questions raise, answers kill. Raise questions to stay alive.
What are some benevolent use cases of these data URIs that justify supporting them? I'm not baiting you, just ignorant and curious.
How do these malicious URIs get access to the underlying Operating System?
AccountKiller
small inline images on a webpage, it saves a separate retrieval of the image. Of course the download is a bit larger since the image is first base64 encoded.
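A rough Python sketch of the inline-image case, assuming a 10 KB stand-in for a real PNG (the bytes below are not a valid image, and the helper name is invented). It only shows the shape of the URI and how much larger base64 makes it.

```python
import base64

def image_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Wrap raw image bytes in a base64 data: URI (hypothetical helper)."""
    payload = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{payload}"

# Stand-in for a small image: 10 KB of arbitrary bytes, not a real PNG.
fake_image = bytes(range(256)) * 40

uri = image_data_uri(fake_image)
img_tag = f'<img alt="icon" src="{uri}">'  # how it would sit in the page

overhead = len(uri) / len(fake_image)
print(f"{len(fake_image)} raw bytes -> {len(uri)} URI characters ({overhead:.2f}x)")
```

Base64 turns every 3 bytes into 4 characters, so the encoded image is roughly a third larger before any HTTP compression.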
I'm not actually sure that this is the case. (Change the Wikipedia entry if it's wrong, then.)
it's more about how IE and Chrome don't support DATA uri's. the article is stupid. that's what the article really is about, if it supports data uri's or not.
Chrome and IE both support data URIs. Chrome doesn't allow redirecting to a data URI for this reason
I've been reading the Wikipedia entry, and if I grasp it correctly there's a distinct negative repercussion to use of them: they could apparently be used to stuff HTML elements into one "get" and possibly defeat all sorts of HTTP proxy filters, ad blockers, and other sundry Web-page tweakers in the process. If that's true, I would not be in favor of their use or support at all. I use all sorts of tools and extensions to "take back the Web"; I don't want to lose the abilities those tools enable.
Testing if I can embed
Can anyone explain to me why this is worse than serving up the same "malware" on a web page instead of a data URL? The screenshot in the paper clearly shows the URL starting "data:text/html;" instead of http://en.wikipedia.org/ so surely it is just doing the same thing as if I hosted a mock Wikipedia login on "mysite.com" - and is a lot less likely to fool people than if I used a domain like wikipediaLogin.com
This might technically be a phishing exploit, but you would still have to be pretty stupid to fall for it, as the address bar at the top of the page would not show your bank's web address.
I dont read
So I click on a link and a page loads, as expected. What happens then? How does that page compromise my browser?
Whatever may or may not be true in regards to IE security, this particular vulnerability does not work on IE because it has a length limit on data URIs, not because anyone thought of it and secured it against it. It's accidental. Chrome is the browser that has an actual security feature preventing this attack.
Assorted stuff I do sometimes: Lemuria.org
They were originally implemented to contain data inside documents where you need everything to be contained in one file - such as early e-mail systems.
Take a website with 100 small images, with an average image size of 10 KB, latency (3-way handshake + data) = 25 ms, and bandwidth = 10 Mbit/s (1280 KB/s).
Using 5 parallel connections (max allowed by http), the site will download in 10/1280*100 + 0,025*20 = 1,28 seconds
Embedding all images in the original document using data URIs (~1.37x overhead to data size but no latency impact), the site will download in 10*100*1,37/1280 = 1,07 seconds
HTTP2.0 / SPDY will solve this, but it will take many years till they are widely adopted.
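The back-of-the-envelope numbers above can be reproduced with a short Python sketch. It uses the same simplified model as the comment (transfer time plus one latency hit per batch of parallel requests); the function and parameter names are invented here, and real browsers behave less tidily.

```python
import math

def separate_requests_time(n_images, size_kb, bandwidth_kb_s, latency_s, connections):
    """Transfer time plus one latency penalty per batch of parallel requests."""
    transfer = n_images * size_kb / bandwidth_kb_s
    batches = math.ceil(n_images / connections)
    return transfer + batches * latency_s

def inlined_time(n_images, size_kb, bandwidth_kb_s, overhead):
    """Everything arrives in one response, inflated by the encoding overhead."""
    return n_images * size_kb * overhead / bandwidth_kb_s

BANDWIDTH_KB_S = 10 * 1024 / 8  # 10 Mbit/s expressed as 1280 KB/s

separate = separate_requests_time(100, 10, BANDWIDTH_KB_S, 0.025, 5)
inlined = inlined_time(100, 10, BANDWIDTH_KB_S, 1.37)

print(f"separate requests: {separate:.2f} s")
print(f"inlined data URIs: {inlined:.2f} s")
```

Changing `latency_s` or `connections` shows how quickly the balance shifts, which is what most of the replies argue about.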
IE does support Data URI's though??? at least it did, have they removed this functionality or is it merely a more secure implementation than firefox and opera?
I noticed from the Wikipedia article that it had a long history dating back to the late Nineties. Webmail certainly had more limitations back then, not that HTML was really designed to handle that job.
My worry, if I understand this correctly, is that this could be used as a means to thwart every ad-blocker and page tweaker and HTTP proxy filter in existence. That would not be a good thing at all....
HTTP2.0 / SPDY will solve this, but it will take many years till they are widely adopted.
Not entirely. You still need to completely fetch and parse the main web page before you start fetching the images from it. If you use data URLs, then you implicitly fetch them before you even know that you need them. This is one advantage that Flash and Java applets have over JavaScript + HTML + image + sound files. There was some plan for allowing browsers to grab a page plus all of its resources in some kind of container file, but I don't recall it going anywhere.
I am TheRaven on Soylent News
I'm sorry, but that's downright untrue. See for yourself: https://en.wikipedia.org/wiki/Data_Uri#Web_browser_support
Microsoft has limited its support to certain "non-navigable" content for security reasons, including concerns that JavaScript embedded in a data URI may not be interpretable by script filters such as those used by web-based email clients. Data URIs must be smaller than 32 KiB in Version 8.
Version 9 does not have the 32 KiB limit.
Avoiding expensive HTTP requests.
Oh Lord, it's Twitter! Long time, man, WTF have you been up to? I can't believe you made a knockoff of my user ID. Hell, you made knockoffs of Macthorpe and all the other old-timers, so I was starting to feel left out there. Nice to see I'm well regarded enough to rip off, thanks.
As for TFA, that's why I give my customers Comodo Dragon, based on Chromium but without all the Google phone-home bits. The new IE may be secure, but frankly who gives a rat's ass? When they refused to backport it to their STILL UNDER SUPPORT operating systems like XP (and isn't the next version only gonna work on 7 & 8?) it became completely useless as far as I'm concerned.
With Dragon you can run any Windows from XP to 8 and have the same browser synced, so it all "just works", which is nice. As a bonus, if you have any problems, their devs are constantly on their forums and are quick to get back to you with help.
ACs don't waste your time replying, your posts are never seen by me.
and a guy is likely to click the link rather than be redirected to one in the first place, no?
No - that was one of the points of the article.
Increasingly the source of phishing URLs is social media rather than email. In tweets (and to a lesser extent on other social media) it's common to send a shortened URL (tinyurl, bit.ly, goo.gl etc) that redirects to the actual URL, and consequently users won't be surprised to receive such a short URL, and will probably click on it - whereas if they received a massively long "data:" URL with lots of base64 data after it, their suspicions would be more likely to be raised...
Need to type accents and special characters in Windows? Use FrKeys
I actually went and read the paper that this is supposedly all based on. (I know, it's not the done thing and I apologise) I don't know if it has changed since the other article was written but I couldn't find any reference to Opera or Firefox.
It does mention that Chrome will throw an error but if you hit enter or reload it will work. There is a one sentence reference to the fact that IE has "a limit to URIs". I presume that means a length limit and if so IE is not invulnerable - only the initial payload has to be smaller.
While there is much hand-wringing about the fact that it cannot be shut down because there is no central server it is hosted on, I don't see it as an issue. For phishing to be effective the stolen data has to actually GO somewhere, which probably provides a target that can be shut down. It doesn't matter how long the URI circulates after the target is shut down - all that stolen data is probably going to the great byte bucket in the sky.
I think the more interesting point that the paper made is that phishing sites can effectively be hosted on link shortening services using this method.
it's more about how IE and Chrome don't support DATA uri's. the article is stupid. that's what the article really is about, if it supports data uri's or not.
WRONG! From the PDF:
"In Google Chrome in particular, a control for unsafe redirection is implemented, disabling the user direct access to a data URI if that URI is the target of a redirection, such as from a URL shortening service."
"Internet Explorer has a limit to data URIs,"
Google Chrome and IE have implemented security features to prevent this form of attack.
One use case is the local generation of downloadable content. For example, if you generate a sound file with Javascript, you can present the file as a clickable link that will allow the user to save the file, without first sending all the data to the server and then downloading it from there. A DTMF sequence generator could be written as a 100% client side script. An image processing script could likewise allow the user to save the edited image locally. There exist Javascript libraries for that purpose which convert HTML5 canvases into data URIs.
Generally data URIs come in handy when a roundtrip to the server is undesirable. Many use cases have different solutions now ("sprite collections" instead of many small files in web design, MIME embedded images in email.) Unfortunately, the potential for mischief has been quite obvious for a long time, because data URIs avoid many detection schemes by avoiding the server roundtrip, and consequently they have fallen from grace somewhat and saw no further improvement. You can't even suggest a filename if you use data URIs as a downloadable link.
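The use case described above is client-side JavaScript. The sketch below uses Python only to illustrate the packaging step: generate a short DTMF tone (the digit "1" is the 697 Hz and 1209 Hz pair), write it as a WAV file in memory, and wrap the bytes in a data URI that could serve as a link target. The function names, sample rate and duration are assumptions chosen for the example.

```python
import base64
import io
import math
import struct
import wave

def dtmf_wav_bytes(low_hz=697.0, high_hz=1209.0, seconds=0.2, rate=8000) -> bytes:
    """Render one DTMF tone as a 16-bit mono WAV file held in memory."""
    frames = bytearray()
    for n in range(int(seconds * rate)):
        t = n / rate
        # Average the two sine waves so the sum stays within [-1, 1].
        sample = 0.5 * (math.sin(2 * math.pi * low_hz * t)
                        + math.sin(2 * math.pi * high_hz * t))
        frames += struct.pack("<h", int(sample * 32767 * 0.8))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(bytes(frames))
    return buf.getvalue()

def wav_data_uri(wav_bytes: bytes) -> str:
    """Wrap WAV bytes in a data: URI suitable for an href or audio src."""
    return "data:audio/wav;base64," + base64.b64encode(wav_bytes).decode("ascii")

tone_uri = wav_data_uri(dtmf_wav_bytes())
print(tone_uri[:48] + "...", len(tone_uri), "characters")
```

Nothing is uploaded anywhere: the "file" exists only as the text of the link.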
Sandboxing is just another layer of security; it isn't a silver-bullet solution. In fact it is often (as in Chrome) used as an excuse to skip proper checks and develop more carelessly from a security point of view. All is well until someone finds a way to break out of the sandbox (just look at the recent Java security problems); then one of the many holes can be used to jump the sandbox and reach the OS.
Firefox mostly doesn't have a sandbox, but it has many other proper security checks that others lack, and it is secure because of them. Of course a sandbox is yet another layer that should exist, and they are slowly sandboxing key areas. It's harder because they want to support various OSes at the same level, whereas Chrome has a full sandbox on Windows but a much weaker one on Linux (see https://code.google.com/p/chromium/wiki/LinuxSandboxing and https://code.google.com/p/chromium/wiki/LinuxSUIDSandbox... things might be better when seccomp is enabled by default in Chrome)
So yes, a sandbox is good, but it should not be trusted as the main security barrier in an application; other checks are always needed.
Higuita
You sound like a 7-year-old pretending to be a robot.
OTOH if those small images can be cached, the advantage of using data-URIs disappears (is negated) on the 2nd time someone visits the page. So I don't think it's a very good idea to do it in this case.
"I love my job, but I hate talking to people like you" (Freddie Mercury)
... in an alert box of its own:
javascript: and data: URIs typed or pasted in the address bar are disabled to prevent social engineering attacks.
Developers can enable them for testing purposes by toggling the "noscript.allowURLBarJS" preference.
Browsing the Web w/o NoScript is dangerous to the core anyway.
Just my 2cents
- Holger
They're doing it wrong, the correct security implementation is a warning prompt for the user.
In some cases, data URIs might still be faster (though less bandwidth-efficient). E.g. if you take the original example and account for 54 ms latency (3-way handshake + initial response packet), then reloading the page (with all images cached) would take 0,054*20 = 1,08s since a query to the server for each image is still required
When using a high-latency, high-throughput connection (e.g. mobile, satellite), data URIs will be a lot faster than caching.
WRONG! "Has a limit to data URIs" is not a security feature, it's just poor implementation. Not redirecting to data URIs is a feature, as evidenced by ERR_UNSAFE_REDIRECT code.
I never click on a shortened URL. Maybe I am too old :-(
Interesting. Then Wikipedia and TFA are at odds regarding their claims regarding IE data support. Thanks for the correction.
Your argument is invalid.
Most of us are using http1.1 which has connection keep alive.
That would make your example 0.80625 seconds where uris would still need 1,07 seconds.
Also if you live somewhere where you need 25ms for a tcp handshake to complete, consider changing your ISP.
The best use case I know of is to inline all your small images you use for styling the site in the master stylesheet for the website. This way you only have one request instead of the hundred plus that many sites have.
Based on some tests I have done on many sites the vast majority of the time is spent on just getting 304s back on all the resources that have not changed. Inlining those small images can save 90% or more of the page load time.
Computer modeling for biotech drug manufacturing is HARD!
If you nest your languages, you can do a remarkable amount in a data URI: here's a Javascript chess-playing app, and an unbounded supply of webpages exploring the Collatz Graph, respectively. I expect you could get a small phishing site (which pulled graphics, etc, from the real thing) done similarly, and there's no server to take down. Writing a viral data URI that mailed itself to your friends might be harder.
It would block proxy filters and adblockers, /if/ the ads were kept onsite (which is one of the main problems with most ads today - loading them from offsite takes ages). Otherwise, any browser-based tools will simply treat it like an image/object from the page, which can then be blocked accordingly. It will be loaded, but the extra KB or so in the single main page request won't really affect load time on anything but dialup, and the time will be far less than if the image were separate.
Only until you create rules for blocking that include it; it won't prevent the ad from /loading/, but it can stop it from displaying. And this sort of ad wouldn't allow for load tracking specifically, so it wouldn't matter - you're loading the mainpage anyway, right?
Well played Anonymous Coward.
This changes everything
Here's an online Base64 decoder for those unwilling to click the link
http://www.motobit.com/util/base64-decoder-encoder.asp
Don't forget to set it for "decode"
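If you'd rather not paste anything into a website either, a data URI can be unpacked locally. This is a minimal Python sketch with an invented helper name; it handles the two common forms (base64 and percent-encoded) and nothing fancier.

```python
import base64
from typing import Tuple
from urllib.parse import unquote_to_bytes

def decode_data_uri(uri: str) -> Tuple[str, bytes]:
    """Split a data: URI into (media type, decoded bytes) without opening it."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URI has no comma separating header and payload")
    is_base64 = header.endswith(";base64")
    media_type = header[:-len(";base64")] if is_base64 else header
    media_type = media_type or "text/plain;charset=US-ASCII"  # default type
    data = base64.b64decode(payload) if is_base64 else unquote_to_bytes(payload)
    return media_type, data

# Build a sample URI here rather than trusting a hand-typed payload.
sample = "data:text/html;base64," + base64.b64encode(b"<h1>Hi</h1>").decode("ascii")
print(decode_data_uri(sample))
print(decode_data_uri("data:,Hello%2C%20World"))
```

Reading the decoded bytes as text shows what a link would have displayed, without rendering it.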
[Fuck Beta]
o0t!
If your webserver supports GZIP compression in HTTP responses the difference might not be too bad.
Chrome uses data uris internally for inline background-images in CSS.
Webpages can use them for similar purposes. One less resource to query for.
They can also be used to easily construct files for the user to download, then you can stick them in a data uri and present them to the user as a link, or navigate to a data uri to force a download or display the resource to the user.
It's not at all nice to deliberately misframe the comments of another merely to create an artificial opportunity to mock him. You're being a tool, and not the useful kind.
I said it was a WORRY that it COULD be used that way. I said nothing at all about whether I thought such extensions and proxies could be updated to compensate, and I said nothing about it because I don't actually know that to be the case with any authority; for all I know it might not even be possible to intercept such repackaged elements. I also don't know with any authority that it ISN'T the case... I SIMPLY DON'T KNOW and thus didn't address it. Then you came along gunning for an opportunity to make yourself feel superior and decided to manufacture an opportunity out of thin air. You help no one with that bullshit, least of all yourself.
Any relation to account 527904? Hope not. That one's a tool.
1,08s since a query to the server for each image is still required
That's not the case if the images came with an "Expires" header or similar, browsers will just reuse them without any network operation. You can verify this with all the built-in header/network debugging facilities in major browsers.
That's why you put the data URIs in an external CSS file. Then they're cached with the rest of the CSS that is only referenced by each page. OTOH most web designers use sprite collections (many small images combined into one file, separated by CSS). That's even faster, as it doesn't incur the base64 encoding overhead and only needs one additional HTTP request (one for the CSS plus one for the image collection, instead of just one for the CSS).
Embedding all images in the original document using data URIs (~1.37x overhead to data size but no latency impact), the site will download in 10*100*1,37/1280 = 1,07 seconds
The size overhead would likely not be much of an issue; the data URIs would compress quite reasonably back down to something close to the originating data size (assuming a compressed data format, of course, but that's the overwhelmingly common case). Reworking the math to allow for the compression takes the download time estimate back to 0.78s; that's about 40% faster, not about 20% faster as you had worked out.
On the other hand, I suspect your calculations are an oversimplification anyway due to the fact that a full TCP initiation handshake isn't required for every image due to connection reuse and pipelining, and there's the possibility to do parallel downloading of the containing document (through use of the HTTP Range header) and so on. Plus there's the effect of caches, and using CSS sprites to reduce the per-image overhead, and a whole bunch of other things that I can't think of right now. But data URLs can most certainly be part of the mix.
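The compression claim is easy to try out. The sketch below uses seeded random bytes as a stand-in for already-compressed image data (an assumption; real PNG or JPEG bytes will differ slightly) and zlib's DEFLATE as a stand-in for HTTP gzip compression.

```python
import base64
import random
import zlib

# Seeded random bytes approximate an already-compressed image (incompressible).
rng = random.Random(0)
raw = bytes(rng.getrandbits(8) for _ in range(10_000))

encoded = base64.b64encode(raw)          # what the data URI payload would carry
compressed = zlib.compress(encoded, 9)   # roughly what gzip would put on the wire

print(f"raw image bytes:     {len(raw)}")
print(f"base64 encoded:      {len(encoded)} ({len(encoded) / len(raw):.2f}x)")
print(f"base64 then deflate: {len(compressed)} ({len(compressed) / len(raw):.2f}x)")
```

Base64 spends 8 bits on a character that carries only 6 bits of information, and that slack is what the compressor gets back.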
"Little does he know, but there is no 'I' in 'Idiot'!"
You could also stitch all the images together in a grid and then use CSS to display different portions of the same image. JQuery does this for widgets.
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
If the server side code pulled the ads and the page was dynamically generated, you'd have the "best" of both worlds. Unblocked ads that weren't stored locally.
Or consider changing the speed of light. If you live 1/2 way around the world from the server, it would take 133 ms just for a single round trip. And that's only taking into account the speed of light, and not counting real world scenarios. In the real world, you effectively have to double that, giving you about 266 ms just for a single ping. A TCP handshake is a little more complex. Even New York to Los Angeles is about 4000 km, which would give a theoretical minimum ping time of about 26.7 ms.
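The speed-of-light floor can be checked in a few lines of Python. The distances are rough assumptions (about 20,000 km for half the Earth's circumference, about 4,000 km coast to coast), and the function name is invented; the result is a lower bound, not what a real network delivers.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum

def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time: there and back at light speed."""
    return 2 * distance_km / SPEED_OF_LIGHT_KM_S * 1000

for label, km in [("half way around the world", 20_000),
                  ("New York to Los Angeles", 4_000)]:
    print(f"{label}: {min_round_trip_ms(km):.1f} ms minimum round trip")
```

Light in optical fibre travels at roughly two thirds of this speed, so actual pings sit well above these figures.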
Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
SPDY can actually push content to the browser without waiting for requests. So you can push all the images that the client will need to display the document without waiting for the client to parse the document and make requests.
And they're AWESOME for packing a few small images into a CSS file to save round-trips to the server and make life MUCH better on mobile devices with high latency.
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
Ads are always loaded from other domains, don't worry about that.
New things are always on the horizon
What they usually do is create one CSS-file with all the smaller images included. The CSS-file can be cached, but still saves a whole bunch of requests for small images.
The biggest problem with TCP is Slow Start, which means many small HTTP requests will still be a lot slower than sending one large HTTP response.
The problem I ran into is that in practice people rarely actually use expires. They just let their web framework send a 304 when the browser checks. Also if you use expires it makes things more complex since if you change the image the new information does not show up instantly.
Mostly though the problem is very very crappy web frameworks.
What I do is set the expires etc stuff to 30 years or so and then I change the urls to the images whenever the image changes. That works insanely well since everything caches it correctly but you still see changes instantly.
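The "far-future expiry plus changing URLs" trick can be sketched as follows. This is one possible way to do it, not necessarily what the poster's setup does: the URL is derived from a hash of the image content, so a changed image automatically gets a new URL. The helper name, the 8-character digest and the one-year `max-age` are all assumptions for illustration.

```python
import hashlib
from pathlib import PurePosixPath

# Assumed header set: cache for a year, since the URL changes with the content.
FAR_FUTURE_HEADERS = {"Cache-Control": "public, max-age=31536000"}

def versioned_url(url_path: str, content: bytes) -> str:
    """Insert a short content hash before the extension (hypothetical helper)."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    path = PurePosixPath(url_path)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))

old_logo = b"old image bytes"   # stand-ins for real image files
new_logo = b"new image bytes"

print(versioned_url("/img/logo.png", old_logo))
print(versioned_url("/img/logo.png", new_logo))
```

Browsers never need to revalidate the old URL, and the page simply starts pointing at the new one when the image changes.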
Many do both, include the sprite (as such a grid is called) in a CSS-file, the CSS-file is referenced on many pages on the site.
No worries, IE still supports MHTML, which doesn't have the length limit.
The appropriate url is displayed, data URIs serve a purpose. OP, this is ridiculous. Quit giving this guy a voice.
Where genius and insanity become confused true wisdom is found
Are you crazy? 25ms is very fast for a tcp handshake.
It's convenient to put a black-and-white low-res placeholder image on a page using a data uri, so the area of the page where the full-resolution image will eventually be placed isn't blank while waiting for the much larger image to download.
"Lame" - Galaxar
As the author of the paper I feel the need to clarify a tiny point before I fall asleep. Google Chrome is vulnerable; it is only REDIRECTION TO A DATA URI that Chrome sees as dangerous and denies. For more details, please contact me on Twitter (@hennikl) or by email (it's in the paper title). I'll try to watch this thread and give more exhaustive answers after some hours of beauty sleep. It seems a lot of the commenters do not grasp the idea completely ;) --Henning Klevjer
As the author of the cited paper, I feel that I have to clarify a few issues here: As well as Opera and Firefox, GOOGLE CHROME ALSO "suffers" from the ability to host data URIs. It just distrusts being redirected to one. IE (it is said) has a size limit for data URIs of 32 KB. However, in my tests, a ~26 KB URI was tried, unsuccessfully. The data URI phishing pages can be made in many ways, differing in how they use other data. One can make a true offline (or local) version of a web page if all linked content on the page is contained in the "root page" through yet another data URI. If the data URI web pages are presented on a computer running a related trojan program, this program may handle the communication of the "secret information" (credit card #, passwords, etc.). This can be done P2P (as in botnets), so there is no need for server infrastructure. Another issue I'm discussing in my paper (http://klevjers.com/papers/phishing.pdf) is that of ownership of the data URI contents. I feel TinyURL unwittingly takes ownership of whatever content is hosted there, as they store the entire (phishing) web page on their servers.
To solve this latency problem, most well-designed websites use a single large GIF or PNG for all their tiny CSS images, then slice the image to indicate each independent icon, border, etc. This not only reduces the total image overhead but also greatly reduces the total number of 304s to receive.
Example: one of Facebook's icon resource files
File this one under uninteresting and an obvious forgery.
Technically this is not much different than just hosting a look-alike page to collect passwords. The phishing attack would be much more interesting if the URL wasn't so obviously bogus. According to the paper an attack could use a URL shortener to further hide the obviously odd URI. The problem with this is that the URI attack described in the article requires that you send the URI with payload to the victim. A URL shortener service has no reasonable way to direct the short URL to the crafted URI.
Ironically, using a shortened URL (tinyurl, bit.ly goo.gl etc) would make it easy to hide a real phishing site hosted out on the Internet. To say that this is a security hole is to say that because all browsers allow people to go to sites that can claim to be who they are not all browsers are insecure.
-rd
The security feature in Chrome is that redirection to data URIs is disallowed, not data URIs themselves. If you enter a data URI into the address field, or are linked to one directly, without redirection, it works.
The problem is advertisers don't trust their customers not to cheat. What's to stop you, a site owner, from requesting the ad a billion times and sending it to /dev/null?
Only a customer-originated request hitting your adserver assures the customer got the ad.
Then, client-side adblocks would still be able to block the ads...
For me, the greatest yet-unrealized advantage of data URIs is what the article lists as their disadvantage: the ability to easily integrate rich user content without needing to upload it to some server. Posting an image to a forum? Don't upload it to imageshack or other services that will delete it in a month. Don't have a fancy system of uploading to the forum host. Simply have some javascript create your [IMG]data:image/jpeg;base64,86fciyitv==[/IMG] and hide the actual base64 content in the message typing preview.
45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2