Google Bots Doing SQL Injection Attacks

Posted by Soulskill on Tuesday November 5, 2013 @12:19PM from the it's-not-a-bug,-it's-a-feature dept.

ccguy writes "It seems that while Google could really care less about your site and has no real interest in hacking you, their automated bots can be used to do the heavy lifting for an attacker. In this scenario, the bot was crawling Site A. Site A had a number of links embedded that had the SQLi requests to the target site, Site B. Google Bot then went about its business crawling pages and following links like a good boy, and in the process followed the links on Site A to Site B, and began to inadvertently attack Site B."

could not care less by Anonymous Coward · 2013-11-05 12:22 · Score: 5, Informative

not just "could care less". Sheeesh.
1. Re:could not care less by Anonymous Coward · 2013-11-05 12:47 · Score: 5, Funny
  
  Means the same thing irregardless.
2. Re:could not care less by sootman · 2013-11-05 15:40 · Score: 4, Interesting
  
  It's probably laziness, but it could also be a shortened version of "I could care less, but I'd have to try."
  "Sure as hell" and "sure as shit" have no meaning either, right? How sure is hell, or shit? Those are shortened versions of "as sure as hell is hot" and "as sure as shit stinks". Language happens.
  I'm more concerned with errors on non-idiomatic speech, like "should of" and "could of" instead of "should have" and "could have", "try and" instead of "try to", and #1 on my list, "literally" meaning "figuratively".
  After we sort that out, we can come to an agreement on split infinitives, the Harvard comma, and whether punctuation that isn't part of a quote should be inside quotation marks or out. :-)
  
  --
  Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
3. Re:could not care less by Neil+Boekend · 2013-11-05 20:01 · Score: 2
  
  It becomes a problem when the literal meaning is the exact reverse of the intended meaning. "Sure as hell" does not have another meaning. For natives that is no problem. Non natives have more trouble with this and I am sure I don't need to remind you that most people do not have English as their mother language.
  
  In a similar case I actually had a supplier put a line on a quote that translates to: "all products are available from stock if another date is mentioned in the line". What they meant was: "all products are available from stock unless another date is mentioned in the line".
  
  --
  Well, I might have a way, but it only works on a semi spherical planet in a vacuum.
4. Re:could not care less by GodGell · 2013-11-05 20:36 · Score: 3, Informative
  
  I'm more concerned with errors on non-idiomatic speech, like "should of" and "could of" instead of "should have" and "could have"
  THIS, a thousand times this!
  I'm not much of a grammar nazi, as I view communication to be the primary purpose of text and not syntax... but "should of" actively takes chunks out of my brain every time I read it. It honestly makes me feel like I'm trying to talk to a retard, it just makes so little sense.
  The worst part is, while currently it's almost exclusively native English speakers who make this mistake (which is pretty odd), soon enough people like me who learnt by practice are going to start using it en masse, and then it'll be here to stay (like "could care less" - another one perpetuated by native speakers, btw).
  
  --
  [SHOW SOME LENIENCY TOWARDS ... I mean, FUCK BETA] Eat. Survive. Reproduce. GOTO 10
5. Re:could not care less by stealth_finger · 2013-11-05 21:19 · Score: 2
  
  Actually, since they are there already, crawling it, they really could care less. They could not be there at all, but no. They do care, and are crawling.
  So it was correct to say they could care less.
  Well, the quote says 'could care less about your site'. I doubt google care one iota about most individual sites in isolation. They care a great deal about the web in general to be sure, but if they cared about your site, they'd send a person to look at it rather than an automated crawler.
  
  --
  Wanna buy a shirt?
  https://www.redbubble.com/people/stealthfinger/shop?asc=u
Uhh... by Anonymous Coward · 2013-11-05 12:23 · Score: 5, Insightful

If you have http GET requests going (effectively) straight into your database, that's YOUR problem, not Google's.
1. Re:Uhh... by Anonymous Coward · 2013-11-05 12:49 · Score: 3, Informative
  
  I wholeheartedly agree. Database programming 101: you cannot trust any inputs (user or otherwise). You must assume that any input is malicious and sanitize it as such. Maybe the devs that are researching/complaining about this should consider the target as the problem, not the 12,000 different ways to input malicious code.
2. Re:Uhh... by Anonymous Coward · 2013-11-05 13:18 · Score: 2, Insightful
  
  Suppose there is a way to mitigate this issue on Google's end, is there something wrong with taking action to reduce the amount of attacks, even if the website is at fault?
  Yes, there is something "wrong": Google has no idea what is or is not a "malformed" request. You're basically asking Google to sanitize the database input, which is generally not possible if you don't know anything about what the database should or should not accept. Adding something along the lines of 'user=root' or 'page=somekindofdata' to a query may be perfectly legitimate for one site, and a massive problem for a different one.
3. Re:Uhh... by smellotron · 2013-11-05 17:21 · Score: 4, Informative
  
  As long as you escape them properly
  
  Friends don't let friends generate dynamic SQL. Please use prepared statements!
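A minimal sketch of what "use prepared statements" can look like, assuming Python's built-in sqlite3 module and a made-up `users` table (neither comes from the thread). The `?` placeholder passes the value as data, so a quote inside it cannot change the query:

```python
import sqlite3

def find_user(conn, name):
    # The ? placeholder binds `name` as a value; it is never parsed as SQL.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()

# Hypothetical in-memory table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

normal = find_user(conn, "alice")
hostile = find_user(conn, "alice' OR '1'='1")  # treated as one odd-looking name
```

With binding, the hostile string is just a name that matches nobody, so no escaping step is needed.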
How is that news? by d33tah · 2013-11-05 12:24 · Score: 2

How is that news? Zalewski wrote a book on that years ago ("Silence on the Wire").
How about Yahoo "bots", Bing "bots" ? by Anonymous Coward · 2013-11-05 12:27 · Score: 5, Insightful

TFA seems to place all the faults on Google.
Fact is, Google is not the only one who is crawling the Net. Yahoo does it as well as Bing, among others.
If the Google "bots" can be tricked into doing the "heavy lifting", so can the Yahoo "bots", Bing "bots", and "bots" from other search engines.
1. Re:How about Yahoo "bots", Bing "bots" ? by _Sharp'r_ · 2013-11-05 12:31 · Score: 5, Insightful
  
  Why, it's not just bots! If you put a link out on a public web site, real people might even click on the link for you!
  Next you'll be suggesting that you could do that transparently to the user and have their browser re-use their already logged in session on another site to do things with their credentials for you!!!!
  What will they think of next? It's a good thing we have these wonderful stories to explain how this whole web thingy works with all its links and stuff...
  
  --
  The party of stupid and the party of evil get together and do something both stupid and evil, then call it bipartisan.
2. Re:How about Yahoo "bots", Bing "bots" ? by aztracker1 · 2013-11-05 12:48 · Score: 4, Informative
  
  What's funny is bing has bots that will actually execute and follow through JavaScript requests... last year, I worked to refactor our link structure (normalizing, and reducing variance), this caused a reindex of the site (about 50k urls), however Bing bots went nuts, and because they executed JS, this really affected our unique visitors on our Google Analytics (they don't actually filter bots). It looked like our unique visitors went up by 40% (all from 3 locations, all Microsoft), while our pages per visit plummeted. Bots are necessary, but can be dangerous if you don't account for them.
  
  --
  Michael J. Ryan - tracker1.info
3. Re:How about Yahoo "bots", Bing "bots" ? by icebike · 2013-11-05 13:16 · Score: 3, Informative
  
  Why, it's not just bots! If you put a link out on a public web site, real people might even click on the link for you!
  Real people don't have to click that link. Their computers and devices have web browsers that follow links ahead of time to improve browsing experience. Chrome calls this "Predict network actions to improve page load performance".
  But such hits would come from a wide variety of IPs, not from Google.
  
  --
  Sig Battery depleted. Reverting to safe mode.
4. Re:How about Yahoo "bots", Bing "bots" ? by Anonymous Coward · 2013-11-05 14:17 · Score: 4, Informative
  
  No need to use links, either.
  Good old <img src="http://your.site.is/dumb?and=has+sql+injection%22;drop table users;--"/> would work just by visiting the site, as would an iframe, whether browser tries to be smart or not.
5. Re:How about Yahoo "bots", Bing "bots" ? by ArsenneLupin · 2013-11-06 03:58 · Score: 2
  Why, it's not just bots! If you put a link out on a public web site, real people might even click on the link for you!
  This is actually very useful for more persistent attacks.
  
  Site A and Site B have both SQL injection vulnerabilities
  Site A is the real target, and site B is a high traffic site getting lots of visitors
  Site B somehow gets an <img width=1 height=1 src="http://www.site-a.com/cms?id=%3Bupdate content set text%3D'%3Cimg src%3D%22http://goatse.fr/hello.jpg%22%3E"--> tag added somewhere in its content
  Random visitor visits site B, visitor's browser attempts to fetch the 1x1 pixel from site A
  Site A now looks lots nicer
  ... but somehow this does not please site A's administrator, who restores it from backup
  Another person visits site B
  Site A is again lots nicer
  
  ... but this still doesn't please A's admin, so he goes for another restore from backup
  ... and he can't even block the artist's IP because, well, those IPs are all over the place
  ... all the while site B's admin is completely oblivious to the phenomenon, as nobody is going to notice a small 1x1 pixel image, much less complain about it
6. Re:How about Yahoo "bots", Bing "bots" ? by AliasMarlowe · 2013-11-06 05:08 · Score: 2
  
  Actually, bingbot is particularly stupid. It has downloaded several zip files of public domain material (each exceeding 1GB with total over 10GB) from our web site at home. It does so about once per month despite the fact that these files are unchanging, instead of merely doing a conditional GET and checking for a 304 return. The various googlebots all do it this way, as do other bots (e.g. docomo, yahoo, yandex).
  We don't yet bar bingbot, but if it starts downloading several GB at times when other visitors are looking at videos (mostly 720p and 1080p), it will find itself in the wrong part of robots.txt. If I get really irritated, then it will get customized garbage results, just like the ZmEu crap...
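For readers unfamiliar with the conditional GET mentioned above, here is a rough server-side sketch in Python. The `respond` function, the ETag value, and the file size are all invented for illustration; a real server would also handle `If-Modified-Since` and other headers:

```python
def respond(request_headers, current_etag):
    """Return (status, body_bytes_sent) for a hypothetical large static file."""
    if request_headers.get("If-None-Match") == current_etag:
        return 304, 0              # Not Modified: headers only, no body sent
    return 200, 1_000_000_000      # full download (size is illustrative)

etag = '"v1-archive"'              # hypothetical validator for an unchanging zip
first = respond({}, etag)                          # first visit: no validator yet
revisit = respond({"If-None-Match": etag}, etag)   # well-behaved revisit
```

A crawler that remembers the validator from its first visit pays for the body once; one that never sends it pays every time.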
  
  --
  Those who can make you believe absurdities can make you commit atrocities. - Voltaire
HTTP RFC - Section 9.1 Safe and Idempotent Methods by ChaseTec · 2013-11-05 12:27 · Score: 4, Informative

In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe".

--
My Hello World is 512 bytes. But it's also a valid Fat12 boot sector, Fat12 file reader, and Pmode routine.
Re:HTTP RFC - Section 9.1 Safe and Idempotent Meth by Anonymous Coward · 2013-11-05 12:30 · Score: 5, Funny

This is Slashdot. What do we know about GET HEAD methods?
Re:HTTP RFC - Section 9.1 Safe and Idempotent Meth by Zapotek · 2013-11-05 12:37 · Score: 2

That doesn't really have much to do with anything, a lot of DB connection/query libraries allow stacked queries to be performed (i.e. more than one query, separated by ';') so by appending your own SQL query (say, a DELETE one) via a vulnerable input you can still do plenty of damage, even via a GET method.

TFA isn't newsworthy in my opinion, this has been known for a while now.
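Whether stacked queries actually run depends on the library call being used. As a small illustration, assuming Python's built-in sqlite3 module and an invented `parts` table: `execute()` refuses a string containing two statements, while `executescript()` runs both, including the appended DELETE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO parts (name) VALUES ('widget')")

# Hypothetical attacker-controlled value taken from a GET parameter.
user_input = "1; DELETE FROM parts"
query = "SELECT name FROM parts WHERE id = " + user_input  # vulnerable concatenation

try:
    conn.execute(query)  # single-statement API: rejects the stacked query
    single_statement_rejected = False
except (sqlite3.Warning, sqlite3.ProgrammingError):
    # The exception type differs between Python versions, so catch both.
    single_statement_rejected = True

rows_after_execute = conn.execute("SELECT COUNT(*) FROM parts").fetchone()[0]

conn.executescript(query)  # multi-statement API: the DELETE runs as well
rows_after_script = conn.execute("SELECT COUNT(*) FROM parts").fetchone()[0]
```

Other drivers and databases make different choices here, so this only shows that the behaviour is a property of the API, not of the HTTP method.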
Re:HTTP RFC - Section 9.1 Safe and Idempotent Meth by d33tah · 2013-11-05 12:38 · Score: 2

The trick is that retrieval can be dangerous by itself if you're using the database and forgot to sanitize your SQL. Being a moron can't be solved by an RFC.
Re:HTTP RFC - Section 9.1 Safe and Idempotent Meth by ChaseTec · 2013-11-05 12:38 · Score: 4, Interesting

This is Slashdot. What do we know about GET HEAD methods?
I was going to say that they return Futurama quotes but then I checked and they are gone. When did that happen?

--
My Hello World is 512 bytes. But it's also a valid Fat12 boot sector, Fat12 file reader, and Pmode routine.
Re:HTTP RFC - Section 9.1 Safe and Idempotent Meth by hawguy · 2013-11-05 12:41 · Score: 2

In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe".
That's the funny thing about SQL injection attacks - it can turn a SELECT into a DELETE or UPDATE. So you may have *meant* your GET request to be a simple retrieval, but a successful attack could make it do so much more.
Which is a great segue to the obligatory xkcd comic!
http://xkcd.com/327/
Re:HTTP RFC - Section 9.1 Safe and Idempotent Meth by postbigbang · 2013-11-05 12:50 · Score: 2

The problem with this line of thinking is that spiders are only supposed to crawl links. If you use a live link without authentication, shame on you. If you use a query to a db for something like a parts catalog that's capable of r/w, then shame on you. If you tether your logic through a pipe, the pipe needs parser constraints on the query.
Blaming Google or any other crawler-spider-bot, despite my other disdain for Google, is pointing the finger at the wrong culprit. Everyone wants sub-second response times, but if you don't parse, you're a target for all sorts of injection goodies.

--
---- Teach Peace. It's Cheaper Than War.
Skype too by gmuslera · 2013-11-05 12:52 · Score: 5, Interesting

If Microsoft follows links shown in "private" skype conversations (and probably several NSA programs too) they could be used to attack sites this way. Could be pretty ironic to have government sites with their DBs wiped from a SQL attack coming from an NSA server.
Re:HTTP RFC - Section 9.1 Safe and Idempotent Meth by Zapotek · 2013-11-05 13:26 · Score: 2

I'm not sure to which line of thinking you're referring, both myself and the GP just posted a technical remark each. Also (to my great joy and surprise) no-one is blaming Google (at least not yet) and rightly so.

As for the back-end countermeasures you described, you are of course spot on, however it's safe to assume that if you're vulnerable to something as trivial and mundane as SQL injection, you won't have the required foresight to set up and use different DB roles, each with the absolutely least privs for the queries you expect to perform through them.
reminds me of someone from irc... by AndroSyn · 2013-11-05 13:39 · Score: 2

This guy (who I won't name, you know who you are) was once writing some PHP code for some webapp. Well, in the app he had some delete links and he hadn't finished the authentication code apparently, so googlebot crawled his site, followed all of the delete links and completely wiped out his database.
Of course, you can keep googlebot away from your crappy code with robots.txt too...
Read RFC 2616: Safe and Idempotent Methods .. by codeusirae · 2013-11-05 14:02 · Score: 2

'Someone failed at the most basic level here and it wasn't Google. From RFC 2616 (HTTP) Section 9.1 Safe and Idempotent Methods - "In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe"."' - Matthieu Heimer
1. Re:Read RFC 2616: Safe and Idempotent Methods .. by dkleinsc · 2013-11-05 15:00 · Score: 2
  
  I can't tell if you're serious, so I'm going to act like you are in case you or some other reader doesn't understand the problem:
  In the URL that you'd be using to hit that page, change the "id=42" or whatever you have there to "id=0 OR 1=1". Poof, your page is now reading the entire catalog in rather than the single record you wanted. Hit that fast enough, and if that catalog is large enough, the bad guy may have just brought your nice database server to a screaming halt as it loads up 300 million records multiple times every second.
  Or, if you're really not careful, a black-hat can use that to start pulling whatever they want out of your database: usernames, passwords (hashed, but a bad guy can crack 1-way hashes relatively quickly if it's useful to do so).
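The "id=0 OR 1=1" trick is easy to reproduce on a toy table. This is only a sketch, assuming Python's sqlite3 module and an invented three-row `catalog`; the function names are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE catalog (id INTEGER PRIMARY KEY, item TEXT)")
conn.executemany("INSERT INTO catalog (id, item) VALUES (?, ?)",
                 [(41, "bolt"), (42, "nut"), (43, "washer")])

def lookup_unsafe(raw_id):
    # Vulnerable: the URL parameter is pasted straight into the SQL text.
    return conn.execute("SELECT item FROM catalog WHERE id = " + raw_id).fetchall()

def lookup_safe(raw_id):
    # Bound parameter: the whole string is compared as a single value.
    return conn.execute("SELECT item FROM catalog WHERE id = ?", (raw_id,)).fetchall()

one_row = lookup_unsafe("42")          # the intended single record
all_rows = lookup_unsafe("0 OR 1=1")   # WHERE clause is now always true
safe_rows = lookup_safe("0 OR 1=1")    # no id equals that string
```

Swap the three rows for 300 million and the unsafe version is the load problem described above.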
  
  --
  I am officially gone from /. Long live http://www.soylentnews.com/
Did anybody read TFA? by ghn · 2013-11-05 14:07 · Score: 4, Interesting

The point is not that you can attack a lousy website using GET requests. The idea is that HTTP firewalls should not blatantly whitelist Googlebot and other website crawlers for the sake of SEO, because Googlebot will follow malicious links from other websites.
So let's say you have a filter with rules that prevent common SQL injections in GET request parameters. This is a weak security practice, but it can be useful to mitigate some 0-day attacks on vulnerable scripts. This protection can be bypassed IF you whitelisted Googlebot.
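A toy model of that bypass, in Python. Everything here is hypothetical: the pattern is a deliberately naive stand-in for a real rule set, and the IP addresses are documentation-range placeholders, not real crawler addresses.

```python
import re

# Hypothetical, deliberately naive signature list; real filters are far larger.
SQLI_PATTERN = re.compile(r"('|--|;|\bunion\b|\bor\b\s+\d+\s*=\s*\d+)", re.IGNORECASE)

def allow_request(query_string, source_ip, whitelisted_ips):
    """Return True if the (hypothetical) HTTP firewall lets the request through."""
    if source_ip in whitelisted_ips:
        return True  # whitelisted source: the query string is never inspected
    return SQLI_PATTERN.search(query_string) is None

CRAWLER_IP = "192.0.2.10"     # placeholder for a whitelisted crawler address
VISITOR_IP = "198.51.100.7"   # placeholder for an ordinary client
payload = "id=0 OR 1=1"

visitor_allowed = allow_request(payload, VISITOR_IP, {CRAWLER_IP})
crawler_allowed = allow_request(payload, CRAWLER_IP, {CRAWLER_IP})
crawler_allowed_no_whitelist = allow_request(payload, CRAWLER_IP, set())
benign_allowed = allow_request("id=42", VISITOR_IP, {CRAWLER_IP})
```

The same payload is stopped when it arrives from an ordinary client and waved through when the crawler delivers it, which is the whole attack.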
Re:HTTP RFC - Section 9.1 Safe and Idempotent Meth by viperidaenz · 2013-11-05 14:15 · Score: 2

If you want more performance, you should be using prepared statements and statement caching, not string concatenation to construct your queries.
Then you don't need to waste CPU time and memory escaping input data.
Re:HTTP RFC - Section 9.1 Safe and Idempotent Meth by viperidaenz · 2013-11-05 14:17 · Score: 2

It's not the admins of the sites embedding the links that are the problem. They're the attackers. The fault lies with the admins of the sites the links point to.
I had that happen to me once. by sootman · 2013-11-05 15:52 · Score: 2, Interesting

When I first started doing web apps, I made a basic demo of a contacts app and used links for the add, edit, and delete functions. One day I noticed all the data was gone. I figured someone had deleted it all for fun so I went in to restore from a backup and decided to look at the logs and see who it was. It was googlebot -- it had come walking through, dutifully clicking on every "delete" and "are you sure?" link until the content was gone.
(I knew about when to use GET versus POST -- it was just easier to show what was happening when you could mouse over the links and see the actions.)
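The GET-versus-POST point can be sketched without a web framework. The router below is hypothetical (the paths, status codes chosen, and `contacts` data are invented); it only shows the idea that a followed link, which arrives as GET, is refused for a destructive action:

```python
# Hypothetical in-memory data standing in for the contacts app.
contacts = {1: "Ada", 2: "Grace"}

def handle(method, path):
    """Tiny hypothetical router: returns an HTTP status code."""
    parts = path.strip("/").split("/")
    if len(parts) == 3 and parts[0] == "contacts" and parts[2] == "delete":
        if method != "POST":
            return 405   # Method Not Allowed: GET and HEAD never mutate data
        contacts.pop(int(parts[1]), None)
        return 303       # See Other: send the browser back to the list
    return 404

crawler_status = handle("GET", "/contacts/1/delete")   # what a followed link sends
count_after_crawl = len(contacts)
user_status = handle("POST", "/contacts/1/delete")     # what a submitted form sends
count_after_post = len(contacts)
```

Crawlers typically follow links rather than submit forms, so moving deletes behind POST keeps them out of reach even before authentication is finished.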

--
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
Could care less? by stealth_finger · 2013-11-05 20:53 · Score: 2

So they do care a bit then?
The Caring Continuum - http://incompetech.com/Images/caring.png

--
Wanna buy a shirt?
https://www.redbubble.com/people/stealthfinger/shop?asc=u