Slashdot Mirror


Googling Your Way Into Hacking

knifee writes "New Scientist is running an article explaining how hackers can use Google's cache to quickly hunt down sensitive pages, for example, by searching the terms "bash history", "temporary" and "password". Might be worth looking at this tutorial about robots.txt if you think you might be at risk." That's pretty amusing.

58 of 431 comments

  1. This happens because of dumb admins, not google by mjmalone · · Score: 5, Insightful

    For example, one common filename for passwords is "bash history".

    This guy is a security consultant? Come on, what admin in their right mind would enter a password in cleartext on the command line and allow it to be stored in ~/.bash_history? The first thing I do when I log onto a box is link bash_history to /dev/null, just out of habit. The security problem isn't google's fault, it's stupid admins who don't know what they are doing.

    1. Re:This happens because of dumb admins, not google by numbski · · Score: 5, Funny

      Wouldn't it be more fun to ln -s ~/.bash_history /dev/random instead?

      Would make for interesting google logs. ;)

      Don't have to worry about that particular problem. Both FreeBSD and MacOS X use tcsh by default anyway, and all of my users are Unix stupid, so they never log into shell.

      --

      Karma: Chameleon (mostly due to the fact that you come and go).

    2. Re:This happens because of dumb admins, not google by Bigby · · Score: 5, Funny

      Even better yet, "rm ~/.bash_history && ln -s /dev/dsp ~/.bash_history". Now everything you type will literally "sound like crap".

    3. Re:This happens because of dumb admins, not google by gooru · · Score: 5, Insightful

      It's not even just ~/.bash_history but ~/ itself! Who in the world would make that world-readable and published on the web?!?!? This isn't even the default for any configuration I've seen. (Does anyone else know differently?) It's one thing to spider ~/public_html or /var/www or whatever you have set up for your webserver...quite another to have ~/ published on the web. I can't believe this is a security problem for people, though I suppose it is a proven possibility.

    4. Re:This happens because of dumb admins, not google by dan14807 · · Score: 5, Informative

      > The first thing I do when I log onto a box is link bash_history to /dev/null

      unset HISTFILE

    5. Re:This happens because of dumb admins, not google by Zigg · · Score: 4, Funny

      Except that it doesn't work, unless you intended to try to execute /dev/audio.

    6. Re:This happens because of dumb admins, not google by Anonymous Coward · · Score: 5, Funny

      OHMYGOD!! TEH SECURITY RAMIFICATIONS!!1!
      http://custom.lab.unb.br/pub/dce/.bash_history
      pwd
      ls -l
      ls -l
      ls -la
      whoami

      http://www.mhhe.com/socscience/.bash_history
      vi test1
      ls -l
      who am i
      touch test2
      ls -l
      pwd
      cd ../business/
      ls -l
      vi randomfile
      ls
      ls -l
      cd marketing
      ls -l
      pwd

    7. Re:This happens because of dumb admins, not google by inertia187 · · Score: 5, Interesting
      It's happened to me. My .bash_history has contained passwords. Why? Because I'd type too fast and not look at the screen. For example:
      bash-2.05a$ ssh inertia@whatevre
      ssh: whatevre: no address associated with hostname.
      bash-2.05a$ f33lokihum
      Oops.
      --
      A programmer is a machine for converting coffee into code.
    8. Re:This happens because of dumb admins, not google by Bigbutt · · Score: 4, Funny

      Well, we had a stupid admin who, as a test put the /etc/passwd file into webspace.

      We had another admin who tried to su to root and typed in su [root password]. We check the logs searching for someone typing in a non-user account that looks like garbage and we notify the admin to change their password.

      --
      Shit better not happen!
    9. Re:This happens because of dumb admins, not google by SeanAhern · · Score: 4, Informative

      ln -s ~/.bash_history /dev/random

      Whoops!

      You meant: ln -s /dev/random ~/.bash_history
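
      The argument order is easy to flip, so here is a small sketch of the corrected form in Python. It assumes a POSIX box, uses a throwaway temp directory in place of a real home directory, and points the link at /dev/null (as in the parent comment) rather than /dev/random. The helper name link_history_to is made up for the demo. os.symlink takes the target first and the link name second, the same order as ln -s.

```python
import os
import tempfile


def link_history_to(target, history_path):
    """Replace history_path with a symlink that points at target."""
    # Remove whatever is there now (regular file or old symlink).
    if os.path.lexists(history_path):
        os.remove(history_path)
    # Same order as `ln -s TARGET LINK_NAME`.
    os.symlink(target, history_path)


# Demo in a scratch directory standing in for a home directory.
with tempfile.TemporaryDirectory() as fake_home:
    history = os.path.join(fake_home, ".bash_history")
    with open(history, "w") as f:
        f.write("ls -l\n")

    link_history_to(os.devnull, history)
    is_link = os.path.islink(history)
    resolved = os.readlink(history)

print(is_link, resolved)
```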

    10. Re:This happens because of dumb admins, not google by Ascender · · Score: 3, Insightful

      One possibility is that some 'clever' admin has set the 'webmaster' user's home directory to /var/www (or whatever your docroot is). Then, as well as easy access to the html files, the .bash_* files would be left there too.

  2. Google Cache, in case of slashdotting by Anonymous Coward · · Score: 5, Funny
    1. Re:Google Cache, in case of slashdotting by vgaphil · · Score: 4, Funny

      Or go here google

      --
      A clever person solves a problem. A wise person avoids it. -- Einstein
    2. Re:Google Cache, in case of slashdotting by Scott+Hale · · Score: 5, Funny
      Google is not affiliated with the authors of this page nor responsible for its content.

      Now I'm really confused.

    3. Re:Google Cache, in case of slashdotting by SlayerofGods · · Score: 4, Funny

      That is really cool, the whole site is done in it. Someone try to read this and not have your head explode.

      --

      Technology, the cause of and solution to all of life's problems.
    4. Re:Google Cache, in case of slashdotting by joynt · · Score: 4, Funny

      The sad thing is I can read it.

  3. RIAA Logic: by connsmythe96 · · Score: 5, Funny

    Google can be used to illegally hack into computers (possibly stealing copyrighted information). Google must be shut down and all of its users owe us lots of money.

    --
    if(!cool) exit(-1);
  4. It's a little harder... by Tweakmeister · · Score: 3, Insightful

    A quick search for "Password" doesn't yield any "promising" hacking results. It's too common a word.

    --

    Colossians 2:8

    1. Re:It's a little harder... by Elminst · · Score: 4, Insightful

      But the third link down gives us this-
      http://216.239.57.104/search?q=cache:p5ouM32marEJ:www.necmitsubishi.com/markets-solutions/government/necfiles/Chicago911.doc+%22do+not+distribute%22+password&hl=en&ie=UTF-8

      Which at the bottom of the document has-

      Editors Note:
      Product photography is available at http://www.liska.com/necmit.
      Username: necmit
      Password: monitors


      Which seems to prove the point of the search...

      --
      No unauthorized use. Trespassers will be shot. Survivors will be shot again.
  5. Yea by mao+che+minh · · Score: 4, Funny
    Must be how that guy found out that my phpnuke code had a MySQL injection flaw in the news module. My article about a Hulk doll with big penis wasn't exactly fine journalism, but I would imagine that it was better than 40 lines of "hacked by Stacey 100% brasil LOL" that it was overwritten with.

    Damn script kiddies.

  6. Even better than Google by Anonymous Coward · · Score: 3, Interesting
    I tried this a while back - it isn't as easy as it looks with Google. I recently discovered WhittleBit and it is pretty good at narrowing down what you are searching for because it lets you indicate which search results are good and which aren't, and re-search on that basis.

    This is particularly useful for this type of thing since it isn't always obvious what the criteria are for what you want to search for - with WhittleBit you don't need to know, it figures it out for itself.

  7. problem with robots.txt tutorial by brlewis · · Score: 5, Interesting

    They should mention that disallowing a URI in robots.txt tells crackers which URIs on your site have sensitive information. What I do is create a top-level /unpub/ URI, and everything sensitive goes underneath it with hard-to-guess names. In robots.txt I disallow /unpub only.
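
    A minimal sketch of what that robots.txt looks like to a well-behaved crawler, using Python's standard urllib.robotparser. The bot name "ExampleBot" and the hard-to-guess directory name are invented for illustration. Only the /unpub prefix shows up in the file, never the names underneath it.

```python
from urllib.robotparser import RobotFileParser

# robots.txt that disallows only the top-level /unpub prefix.
ROBOTS_TXT = """\
User-agent: *
Disallow: /unpub
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url_path):
    """Would a polite crawler (hypothetical 'ExampleBot') fetch this path?"""
    return parser.can_fetch("ExampleBot", url_path)


public_ok = allowed("/index.html")
# Hypothetical hard-to-guess name; it is never listed in robots.txt itself.
hidden_ok = allowed("/unpub/x7f3k9-family-photos/")

print(public_ok, hidden_ok)
```

    This only says what polite robots will skip. Anything that ignores robots.txt can still fetch the directory if it learns the name.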

    1. Re:problem with robots.txt tutorial by PetoskeyGuy · · Score: 4, Insightful

      I hope you at least have an .htaccess on the files to put a password on that directory. Hard-to-guess names is good, but making them password protected is better.

      Of course on some of the cheaper web hosting companies out there you can just search the /home/*/web folders. They have to be public so the web server can read them. Stupid I know, but all too common. Config.php for most apps will have all the users' passwords in plaintext.

      The HTTPD user should be a member of each user's group so you don't have to set world rights to your files. Assuming it's just hosting and no other rights are required.
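
      If you want to see how exposed a directory is, a rough sketch along these lines lists regular files that carry the world-readable bit. It assumes POSIX permissions; the file names and the demo() helper are made up, and the demo builds its own scratch directory rather than touching a real web folder.

```python
import os
import stat
import tempfile


def world_readable(root):
    """Return sorted paths of regular files under root readable by 'other'."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            if stat.S_ISREG(mode) and mode & stat.S_IROTH:
                hits.append(path)
    return sorted(hits)


def demo():
    """Build a scratch 'web' folder with one exposed and one protected file."""
    with tempfile.TemporaryDirectory() as web:
        exposed = os.path.join(web, "config.php")
        protected = os.path.join(web, "private.php")
        for path in (exposed, protected):
            with open(path, "w") as f:
                f.write("<?php /* placeholder */ ?>\n")
        os.chmod(exposed, 0o644)    # owner rw, group r, other r
        os.chmod(protected, 0o640)  # owner rw, group r, no access for other
        return [os.path.basename(p) for p in world_readable(web)]


found = demo()
print(found)
```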

    2. Re:problem with robots.txt tutorial by brlewis · · Score: 3, Interesting

      Password-protected directories wouldn't need to be in robots.txt. Using robots.txt + security by obscurity is for things like family photos, where I don't want to maintain usernames and passwords for my entire extended family, but it isn't absolutely critical that no unauthorized person ever see them. I doubt I could trust my entire extended family to keep passwords secure anyway.

      Yeah, cheap shared hosting is largely insecure. I wonder how tough it would be to set up shared hosting using squid as an http accelerator, and let users run web servers under their own UID on different ports, while squid forwards from port 80.

  8. robots.txt? by Karma+Sucks · · Score: 4, Interesting

    You're kidding right? Putting stuff in robots.txt is the best way to *guarantee* that robots will go specifically for the file/directories you choose to deny.

    Don't be naive about robots.txt... expect to have to do some relatively fancy hacking to actually enforce it.

    --
    (Please browse at -1 to read this comment.)
    1. Re:robots.txt? by rossz · · Score: 3, Insightful

      And that's why I have a disallow for a trap directory. Accessing it gets you added to a mysql database and you are blocked with iptables.
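
      A rough sketch of the detection half of that idea: scan access-log lines and collect the client addresses that requested anything under the trap directory. The /trap/ path, the sample log lines, and the Common Log Format layout are all assumptions for the demo. Recording the hits in a database and blocking with iptables is not shown.

```python
import re

# Hypothetical trap directory, listed under Disallow in robots.txt.
TRAP_PREFIX = "/trap/"

# Matches the start of a Common Log Format line: client, two ident fields,
# a bracketed timestamp, then the quoted request method and path.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>[A-Z]+) (?P<path>\S+)'
)


def trapped_clients(log_lines):
    """Return the set of client addresses that requested a trap URL."""
    offenders = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group("path").startswith(TRAP_PREFIX):
            offenders.add(match.group("ip"))
    return offenders


# Made-up sample lines using documentation-only IP ranges.
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Oct/2003:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '198.51.100.7 - - [10/Oct/2003:13:56:01 -0700] "GET /trap/secret.html HTTP/1.0" 200 512',
    '192.0.2.10 - - [10/Oct/2003:13:57:12 -0700] "GET /robots.txt HTTP/1.0" 200 64',
]

offenders = trapped_clients(SAMPLE_LOG)
print(sorted(offenders))
```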

      --
      -- Will program for bandwidth
  9. Sesitive? by GoofyBoy · · Score: 3, Funny


    use Google's cache to quickly hunt down sesitive pages,

    Try hacking a dictionary.

    --
    The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
  10. robots.txt by panaceaa · · Score: 5, Interesting

    Robots.txt only makes well-behaved search engines not index certain portions of your site. You're still going to be vulnerable until you take the sensitive pages off-line completely. But even then, if a passwords list has been indexed by Google, updating your robots.txt file won't remove it from Google's cache until Google spiders your site again. At which time, Google will discover the passwords list doesn't exist and remove it from the cache.

    At least that's how it should work. Is anyone aware of Google requesting robots.txt more often than they spider pages? And then proactively removing pages from their cache based on new robots.txt entries?

    While the article deals with Google specifically, lots of non-well-behaved spiders go through common locations looking for password files regardless of what you've blocked out with robots.txt. The only way to completely protect your data is to remove it from your site.

    1. Re:robots.txt by Jugalator · · Score: 4, Funny

      ROFL -- It's also amusing when the admins don't understand what the file is for!

      Look at IBM:

      http://www.ibm.com/robots.txt

      First comment:

      Date: 19950130
      By: epc
      Reason: finally understood what the file was for!

      At least the admin was honest, but a bit embarrassing for being on ibm.com. :-P

      --
      Beware: In C++, your friends can see your privates!
    2. Re:robots.txt by UncleOlethros · · Score: 3, Informative
      According to my experience with my webservers, Google will request robots.txt frequently as it spiders a site. And yes, they do remove pages from their cache based not only on new robots.txt entries but also on new META tags in individual pages.

      If you can't wait until the next time Google crawls your site to have your information removed, you can always use Google's Automatic URL Removal System. Details are available here.

      A few months back I updated all of my web pages to include the NOARCHIVE META tag. I then submitted my site to Google's Removal System and within three days Google had crawled everything and updated their database. The result was that my pages were still searchable, they just weren't cached.

      As you noted, though, there are plenty of robots that do not obey robots.txt. Google may be conscientious, but others are not.
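
      To check that a page really carries the NOARCHIVE tag, something like this sketch works with just Python's standard html.parser. The sample page is invented, and treating "googlebot" as a second accepted meta name is an assumption on my part. It only reports which directives the page declares, not whether any crawler honors them.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values keep their case.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots meta directives declared in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Invented sample page.
PAGE = """<html><head>
<title>My page</title>
<META NAME="ROBOTS" CONTENT="NOARCHIVE">
</head><body>Still searchable, just not cached.</body></html>"""

directives = robots_directives(PAGE)
print(directives)
```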

    3. Re:robots.txt by frodo+from+middle+ea · · Score: 5, Interesting
      Check out Sun's robots.txt

      The part I like best:

      # If you do actually go to the trouble of figuring out how to download
      # the files without registering, what you'll end up with is 1 or 2MB of
      # stuff that is meaningless to you unless you have purchased an
      # Ultra AX board from Sun. So, please do purchase an Ultra AX board,
      # but then you might as well use the URL you'll be given along with it.

      --
      for the last time people, I am "frodo from middle eaRTH", not "middle eaST".
  11. robots.txt by zero-one · · Score: 4, Interesting

    Having a robots.txt is a good idea but it always amuses me when web sites use robots.txt to list all the areas of their site that they don't want people to look at. When robots.txt contains entries like "Disallow: /admin.asp" or "Disallow: /backdoor.asp" it stops being a way of controlling search engines and becomes a site map of all the places hackers might be interested in.
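
    To see how little effort that "site map" takes to read, here is a minimal sketch that pulls the Disallow entries out of a robots.txt body. The sample file reuses the two entries quoted above; the /images/ line and its comment are invented for the demo.

```python
def disallowed_paths(robots_txt):
    """Return every non-empty Disallow value, in file order."""
    paths = []
    for raw in robots_txt.splitlines():
        # Drop trailing comments, then split "Field: value".
        line = raw.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


SAMPLE = """\
User-agent: *
Disallow: /admin.asp
Disallow: /backdoor.asp
Allow: /images/   # invented extra line
"""

listed = disallowed_paths(SAMPLE)
print(listed)
```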

  12. use deflection in mod_rewrite to keep crawlers out by stonebeat.org · · Score: 3, Informative

    It is always a good idea to keep the robots out of anywhere there is sensitive information. I use several methods for added security. robots.txt is a good way, but I also use the deflection technique in Apache's mod_rewrite to keep the crawlers out.

  13. ICQ by bazik · · Score: 5, Interesting

    A friend of mine actually used this to steal ICQ numbers. He wrote a perl script which googles from "00000001.idx 00000001.dat" to "99999999.idx 99999999.dat" and spits out the result links to a textfile if it gets a full match.

    The ICQ password is stored in one of those two datafiles and there are dozens of free decryption programs for that out there.

    But if you think about it... how or why does someone put his ICQ directory on a webserver?!

    On the other hand... some people are hosting pr0n sites and don't even know about it ;)

    --
    One by one the penguins steal my sanity...
  14. Forgotten by orange_6 · · Score: 4, Funny

    So if I forgot my password, google can just tell me what it is? Can it tell me my credit card number too?

  15. My favorite... by inertia187 · · Score: 5, Informative
    My favorite Google search phrase is:
    "Index of" "Name Last modified Size Description"
    Then you add file extensions or other things. For example:
    Anyway, as you can see, it's pretty effective. Sometimes admins wise up, and all you have is the Google cache. But sometimes they don't, and you get to look. Thanks Google!
    --
    A programmer is a machine for converting coffee into code.
    1. Re:My favorite... by barryfandango · · Score: 3, Funny

      Oooh that's cool! check this link out that it turned up:

      http://www.liada.net/~secret/

      all in spanish, but the documents are all about toxic substances, i think... and there's one JPEG that appears to be a sketch of a missile! Now that's top secret!

      --
      In all matters of opinion, our adversaries are insane. -Oscar Wilde
  16. Well, duh! by panda · · Score: 3, Insightful

    If something is meant to be private, then why even temporarily put links to it on your publicly visible pages? Additionally, if something really is private, then lock it down in the httpd.conf so that only certain IP addresses can access it. Then, it's basically invisible to the rest of the world.

    Of course, if there's a bug in your server software all bets are off. Which is why it's better not to put private stuff where it can be seen on a public network.

    I would have thought that was pretty obvious.

    --
    Just be sure to wear the gold uniform when you beam down -- you know what happens when you wear the red one.
  17. Re:/etc/passwd by jared_hanson · · Score: 3, Funny

    You should really use something other than '*' for your password. It is far too easy to guess. Just a suggestion.

    --
    -- Fighting mediocrity one bad post at a time.
  18. Interesting Website Ideas by fastdecade · · Score: 3, Funny

    This article gives me great ideas for a website:

    * bash.history blog - Everything I ran today
    * /dev/tty blog - Everything I typed today
    * /dev/stdout blog - Everything I saw today

    COMING SOON: Welcome to My Bank Account Details, Favourite Passwords I Enjoy Using

  19. Scuse me? by arth1 · · Score: 5, Insightful

    Shouldn't that be bash_history, passwd and tmp?
    Was this written down by a non-techie from an audio interview?

    Regards,
    --
    *Art

  20. Wrong use of robots.txt by vadim_t · · Score: 5, Insightful

    It's supposed to be used to tell bots not to access some parts of your site due to other reasons.

    Common reasons would be that you host a site with a forum on a DSL line and don't want google to index all 5000 threads on it. It's also good for dynamic pages, for example it makes no sense to index a generated page that will be out of date tomorrow. It'll be much better to let it index the archive instead.

    Using this for security is just stupid though, as it'd contain a list of vulnerable places. Maybe it will make it harder for people to find your vulnerabilities from google, but it will help a lot whoever wants to attack you specifically.

    Security problems have to be fixed by setting proper permissions and keeping your server up to date, and not by relying on every spider that comes to your site being polite enough to follow robots.txt.

  21. phpmyadmin same thing by joeldg · · Score: 4, Interesting

    I have seen more phpmyadmin pages wide open on google than anything else.. Not putting things like that under htaccess at least is pure laziness and stupidity.

    Also it seems people put mysql dumps on their webservers as well..
    search for ' "SELECT * FROM credit" + "###" ' and you will see.

    This has been going on since google introduced the site cache.

  22. some guide! by mblase · · Score: 4, Funny

    Long says an obvious combination of search terms would include the terms "bash history", "temporary" and "password".

    Hmph. When I searched for those phrases at Google, all I got were a bunch of Linux technical how-tos and code samples. If this guy wants to teach us how to be hackers using Google, he's going to have to be more helpful than that!

  23. For more h4x0r fun . . by scarolan · · Score: 3, Interesting

    try searching for _vti_pvt and service.pwd on Google. There are lots of people still using frontpage 4.0 or whatever, with their frontpage password file in plain view. I won't tell you what to do with that file, if you don't know already.

  24. Google Warez Machine by dhodell · · Score: 5, Interesting

    I posted a comment regarding the ability to use Google as a warez search machine. The article was about Google censorship and the one response to my post pinpointed almost exactly the point that I brought up, which is the point discussed in this article.

    Google has a nice long list of directory lists containing warez (remember the days of l33t FTP searching for filenames? Google for something like, in my last article: "xwin32*.exe * * * * *" "listing of"), serial numbers (Oh, I've found XP's serial number several times in Google's cache) and other "sensitive" information. My question is if other commercial sites are being constantly shut down due to these links (intentional or not), why aren't people targeting Google as well?

    In fact, if I'm *cough*too cheap to buy software*cough* or just want to evaluate some crippleware or such before I buy it, I often skip astalavista and cracks.am and just Google it up. Saves me the porn and pop ups, and I don't have to cripple my browser for this (yes I know it's possible to do in other ways, yes I enjoy javascript, no thanks, I don't want comments about how I'm retarded because I don't do it the right way).

    This is similar for sites such as the Internet Archive's Wayback Machine that contains other sensitive information.

    Because of the academic merit of both of these search mechanisms, I doubt either one will be shut down. Indeed, I highly doubt restrictions will be placed. They're valuable tools for finding more valuable tools. For more information about this sort of stuff, I suggest searching on Fravia+'s web-searching lore. Other information on there relates to "reality cracking", reverse engineering, and other taboo topics. Google's got it all cached. Interested? Just search for (insert topic here) site:searchlores.org.

    Sometimes I don't think the comparison of Google to God is that far off. Pardon my heresy.

    --
    Kind regards, Devon H. O'Dell
  25. Google file searching.... by Rahga · · Score: 4, Interesting

    I honestly know of nobody else who uses this technique, I just figured I would try it back when I was hunting down upgrades for old games like Quake 2 while places like FilePlanet were getting hammered:

    At google, type "index of", followed by the precise name of the file you are looking for.

    I'd say this gives me good results on a fast server 95% of the time.
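
    For what it's worth, the query format described above is simple to build in code. This sketch only constructs the search string and a URL for it; the file name is a made-up example, and the search endpoint shown is an assumption rather than anything from the comment.

```python
from urllib.parse import urlencode


def index_of_query(filename):
    """Build the query text: the quoted phrase followed by the file name."""
    return f'"index of" {filename}'


def search_url(filename):
    """Return a search URL for the query (endpoint assumed for illustration)."""
    return "https://www.google.com/search?" + urlencode(
        {"q": index_of_query(filename)}
    )


# Hypothetical file name.
query = index_of_query("q2-3.20-x86-full.exe")
url = search_url("q2-3.20-x86-full.exe")
print(query)
print(url)
```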

  26. Re:My favorite... Searchlores by sICE · · Score: 3, Informative

    If you like this kind of trick, you can find dozens like these, and better, on Fravia's web site SearchLores.

  27. Doesn't work by lawpoop · · Score: 5, Funny
    I tried "bash history", "password", and "temporary", hit "I feel lucky" and I didn't get to hack anything.

    I guess I don't have the patience to be a real hacker.

    --
    Computers are useless. They can only give you answers.
    -- Pablo Picasso
  28. SCO Logic: by KillerHamster · · Score: 4, Funny

    Google uses operating systems! All your code are belong to us! Google must be shut down and all of its users owe us lots of money.

  29. publishing analogy by muppet · · Score: 3, Insightful
    as an author of a web page or even a log file, you have the right to publish and de-publish it. just because it's on the net does not give google the right to cache it indefinitely.
    by the publishing analogy, doesn't this mean that libraries don't have the right to lend books that are no longer in print? in that respect i see google's cache as a library's copy of a book; they let you look at it, and you can see when it was published. they don't claim it's the most up-to-date, and at any time you can go to the source and see for yourself (e.g. go to a bookstore and buy a new copy).
  30. A little bit OT by edmz · · Score: 3, Informative

    Not the same kind of "hacks", but more than one of you might have missed that O'Reilly recently published Google Hacks. It's mostly targeted at webmasters or "power users".

  31. Not always dumb... depends on what's there by jd · · Score: 5, Interesting
    #include "IANAL.h"


    You can probably use this to set up "honeypots" which may be legal in States where traditional fake services would be considered illegal as entrapment.


    Simply set up a virtual machine (user-mode linux is a good one for this). Have the root account publicly read/write and somehow "accidentally" visible to httpd.


    Have the login shell a program which acts as your honeypot, logging activity, tracing back to the user, etc. All the stuff honeypots do so well.


    Next is to ensure that the root password is visible, plain-text, and in a file that is visible to search engines. Your average skript kiddie is not going to question the apparent generosity of the admin. To get the engine to find the account, you probably want to have your main web page link into your virtual machine's root account - say via an FTP.


    Now, none of this is entrapment, in the sense that the person must pro-actively attempt to present a false identity before the service is accessed. There can be no question that the identity of any user logging in is fake, that the user logging in knows that it is fake, and that there has been a deliberate, pre-meditated attempt to compromise an account.


    If you want to go one step further, have the login shell transfer some goodies, such as cpuburn. Now, these have to have a "legit" use by a "legit" user, as anyone who gets burned is likely to complain. You have to be able to stand your ground and say "hey, I use this service as a convenient way to do hardware tests on remote machines - I locked that account against intruders, so if an intruder gets in, it's not my fault if they get burned."


    (If you leave something dangerous "just lying around", you could probably be held accountable if someone gets hurt, even if they were stupid or malicious. But if you make a "reasonable" attempt to deny access, then it's not your problem.)


    In fact, if you do any freelance tech stuff, you might very well use the service for real as a way of fetching over stress-testing software. It would make it a lot harder for "victims" of your root snare to complain, as you could then prove a legitimate use by legitimate users - the victim not being one of them.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  32. Google Hacking Tutorial by hohokus · · Score: 3, Informative
    while randomly googling for "index of" and ".bash_history", i found this, which may be amusing:

    http://www.smart-dev.com/texts/google.txt

  33. Re:Oops by clary · · Score: 3, Informative

    Nope...doesn't pass the LUHN check. See LUHN Check.
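
    For anyone who hasn't seen it, the Luhn check is short enough to sketch in a few lines of Python. The function name is my own, and the numbers in the usage lines are well-known test values, not the one from the parent post. Passing the check only means the digits are self-consistent, not that the number belongs to a real account.

```python
def luhn_valid(number):
    """Return True if the digits in number pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


good = luhn_valid("4111 1111 1111 1111")  # common test number
bad = luhn_valid("4111 1111 1111 1112")   # last digit changed
print(good, bad)
```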

    --

    "Rub her feet." -- L.L.

  34. Re:Entrapment by fizbin · · Score: 4, Interesting

    Probably not, but his statement of the situation squares with my experience when I talked to an FBI agent after having discovered (and logged) some IRC kiddies who were constructing a DDOS network out of sub7-infected machines.

    I'd created a sub7 honeypot on my linux box with a little perl script; after that collected the IRC server ip and channel name, I connected with a random username (pretending to be a bot) and just logged the conversation.

    The FBI agent interviewed me very carefully to make certain that my setting up monitoring, etc., was not in any way instigated by a law enforcement officer. (No, I'd just gotten annoyed at random SYN packets) Then, he had no trouble with it. I don't know if this makes the evidence I provided useable legally, but it never came to that. As he explained it, the question was whether I was acting as an agent of the state when setting up the honeypot. Committing entrapment is not anything that non-state actors ever need worry about.

    Not that this lets you off the hook entirely - there may be charges of wiretapping involved; monitoring your own machine should be safe legal ground, but connecting to the IRC network (as I did) is a slight bit more dicey legally, and shouldn't be done if you have any reason to believe that the relevant prosecutor would like to hang something on you as well.

  35. Re:Entrapment by PenguiN42 · · Score: 3, Informative

    Also, entrapment is only illegal if the law officers used fraud or undue persuasion to cause someone to commit a crime -- so much so, that an ordinarily law-abiding person would be compelled to commit the crime.

    Cops can tempt criminals to commit crimes, and even initiate or plan out the criminal act (ie, buying or selling drugs, offering or buying prostitution, planning a bank robbery heist). None of this is entrapment, unless their actions would have caused a normally law-abiding person to commit the crime.

    If a cop tricks someone into unintentionally breaking the law, or harasses them so much that they eventually cave in and break the law, or threatens them, etc, it may be entrapment. It's actually pretty subjective and up to the jury, usually.

    But a lot of misconceptions of entrapment abound -- ie the ever-popular, "if you ask them if they're a cop, and they say no, then it's entrapment." And also the misconception that entrapment is a crime and can apply to non-law-enforcement. It's not a crime, it's a defense against being charged with a crime. (Well, unless you perform a crime while trying to get someone to perform a crime -- that's still a crime)

    For a somewhat inflammatory discussion, see this: http://www.libertyhaven.com/politicsandcurrentevents/nationalbudgetsdefecitsorspending/lawdeceit.html

    I had a more objective look at it, written by a lawyer, but I can't find it.

    sorry if this is off-topic.

    --
    The following sentence is true. The preceding sentence was false.