Google Bots Doing SQL Injection Attacks
ccguy writes "It seems that while Google could really care less about your site and has no real interest in hacking you, their automated bots can be used to do the heavy lifting for an attacker. In this scenario, the bot was crawling Site A. Site A had a number of links embedded that had the SQLi requests to the target site, Site B. Google Bot then went about its business crawling pages and following links like a good boy, and in the process followed the links on Site A to Site B, and began to inadvertently attack Site B."
not just "could care less". Sheeesh.
If you have http GET requests going (effectively) straight into your database, that's YOUR problem, not Google's.
How is that news? Zalewski wrote a book on that years ago ("Silence on the wire")
TFA seems to place all the faults on Google.
Fact is, Google is not the only one who is crawling the Net. Yahoo does it as well as Bing, among others.
If the Google "bots" can be tricked into doing the "heavy lifting", so can the Yahoo "bots", Bing "bots", and "bots" from other search engines.
In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe".
My Hello World is 512 bytes. But it's also a valid Fat12 boot sector, Fat12 file reader, and Pmode routine.
This is Slashdot. What do we know about GET HEAD methods?
That doesn't really have much to do with anything, a lot of DB connection/query libraries allow stacked queries to be performed (i.e. more than one queries, separated by ';') so by appending your own SQL query (say, a DELETE one) via a vulnerable input you can still do plenty of damage, even via a GET method.
TFA isn't newsworthy in my opinion, this has been known for a while now.
The trick is that retrieval can be dangerous by itself if you're using the database and forgot to sanitize your SQL. Being a moron can't be solved by an RFC.
This is Slashdot. What do we know about GET HEAD methods?
I was going to say that they return Futurama quotes but then I checked and they are gone. When did that happen?
My Hello World is 512 bytes. But it's also a valid Fat12 boot sector, Fat12 file reader, and Pmode routine.
In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe".
That's the funny thing about SQL injection attacks - it can turn a SELECT into a DELETE or UPDATE. So you may have *meant* your GET request to be a simple retrieval, but a successful attack could make it do so much more.
Which is a great segue to the obligatory xkcd comic!
http://xkcd.com/327/
The problem with this line of thinking is that spiders are only supposed to crawl links. If you use a live link without authentication, shame on you. If you use a query to a db for something like a parts catalog that's capable of r/w, then shame on you. If you tether your logic through a pipe, the pipe needs parser constraints on the query.
Blaming Google or any other crawler-spider-bot, despite my other distain for Google, is pointing the finger at the wrong culprit. Everyone wants sub-second response times, but if you don't parse, you're a target for all sorts of injection goodies.
---- Teach Peace. It's Cheaper Than War.
If Microsoft follows links shown in "private" skype conversations (and probably several NSA programs too) they could be used to attack sites this way. Could be pretty ironic to have government sites with their DBs wiped from a SQL attack coming from an NSA server.
I'm not sure to which line of thinking you're referring, both myself and the GP just posted a technical remark each. Also (to my great joy and surprise) no-one is blaming Google (at least not yet) and rightly so.
As for the back-end countermeasures you described, you are of course spot on, however it's safe to assume that if you're vulnerable to something as trivial and mundane as SQL injection, you won't have the required foresight to setup and use different DB roles, each with the absolutely least privs for the queries you expect to perform through them.
This guy(who I won't name, you know who you are), was once writing some PHP code for some webapp. Well in app, he had some delete links and he hadn't finished the authentication code apparently, so googlebot crawled is site, followed all of the delete links and completely wiped out his database.
Of course, you can keep googlebot away from your crappy code with robots.txt too...
'Someone failed at the most basic level here and it wasn't Google. From RFC 2616 (HTTP) Section 9.1 Safe and Idempotent Methods - "In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe"."`, Matthieu Heimer
The point is not that you can attack lousy website using GET requests. The idea is that HTTP firewalls shoud not blatlantly white-list google bots and other website crawlers in the sake of SEO optimization, because google bot will follow malicious links from other website..
So lets say you have a filter with rules that prevent common SQL injections in GET requests parameters, this is a weak security practice but can be useful to mitigate some 0-day attacks on vulnerable scripts. This protection can be by-passed IF you white-listed google bot.
If you want more performance, you should be using prepared statements and statement caching, not string concatenation to construct your queries.
Then you don't need to waste CPU time and memory escaping input data.
It's not the admins of the sites embedding the links that are the problem. They're they attackers. The fault lies with the admins of the sites the links point to.
When I first started doing web apps, I made a basic demo of a contacts app and used links for the add, edit, and delete functions. One day I noticed all the data was gone. I figured someone had deleted it all for fun so I went in to restore from a backup and decided to look at the logs and see who it was. It was googlebot -- it had come walking through, dutifully clicking on every "delete" and "are you sure?" link until the content was gone.
(I knew about when to use GET versus POST -- it was just easier to show what was happening when you could mouse over the links and see the actions.)
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
So they do care a bit then?
The Caring Continuum - http://incompetech.com/Images/caring.png
Wanna buy a shirt?
https://www.redbubble.com/people/stealthfinger/shop?asc=u