Online Search Engines Lift Cover Of Privacy
Rican writes "MSNBC has an interesting article about how 'Googledorks' are using the powerful search engine to do searches across the web for sensitive and/or private information. Some of this information includes 'Medical records, bank account numbers, students' grades, and the docking locations of 804 U.S. Navy ships, submarines and destroyers.'"
Go into kazaa and gnutella and search for any .doc files. Or some likely sounding names like "resume" or "job application"
It's surprising what people will leave sitting in their kazaa upload directory, using it like a documents dump. Legal papers, companies' employee policy documents, employee records, sensitive stuff, medical records.
Taken straight from people's HDs, no hacking, cracking or other media-unfriendly terms needed, just the ignorance of the people who leave this stuff open is needed.
This isn't anything too new. For kicks, I once searched for "Resume" and "Credit card" on KaZaA and got hundreds of results. Presumably, the trouble is that people sometimes believe that security through obscurity works - or, in the case of KaZaA, a lack of attention leads people to share files they didn't really want to.
Interestingly, I found a text file with all the user names and passwords for brokerage firms, and bank accounts, of the IT director at the firm I was working in. Scary, considering he was supposed to have "15 years in the IT industry".
A while back I Googled my credit card number for a laugh. I was shocked to find it in an indexed webserver log for a site I had previously 'tried' to purchase from (the form timed out and I gave up).
A quick call to the bank and a few angry calls to the company sorted it, but I was not impressed.
Perhaps a tool to search for one's own private details should be developed to keep an eye on this?
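For what it's worth, the matching half of such a tool is not hard. Here is a rough sketch, assuming you have already downloaded the text of the pages or logs you want to check (the `pages` dict and both function names are made up for illustration, and 4111 1111 1111 1111 is the usual dummy test number, not a real card). It only finds the number when it is written with optional spaces or dashes between digits.

```python
import re


def normalize_digits(text: str) -> str:
    """Drop everything except digits, so '4111-1111 1111 1111' becomes '4111111111111111'."""
    return re.sub(r"\D", "", text)


def find_card_mentions(pages: dict, card_number: str) -> list:
    """Return the names of pages whose text contains the card number.

    `pages` maps a page name (hypothetical, e.g. a URL) to its text.
    Digits may be separated by a single space or dash in the page text.
    """
    needle = normalize_digits(card_number)
    # Allow one optional space or dash between consecutive digits.
    pattern = "[ -]?".join(needle)
    # Lookarounds stop the match from being part of a longer run of digits.
    regex = re.compile(r"(?<!\d)" + pattern + r"(?!\d)")
    return [name for name, text in pages.items() if regex.search(text)]
```

Running it locally over text you already have means the number itself never has to be typed into a search box.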
The problem comes when Google hunts down records on web servers and, using partners such as Opera, crawls into pages that are normally not publicly accessible!
Here's how it works. Let's say you put a page on your site called
http://yoursite.com/temporary/hidden/dontreadthis/private_document.html
And it is not linked to ever.
If you send that URL to someone using Opera with the right settings (but you don't know that) and they read the private document, within minutes GOOGLE WILL CRAWL THAT DOCUMENT!
Nothing is private any more in situations like that. Let's say that private document then links to all your older private documents. Google can then freely crawl its way in to read the rest.
Who's to blame for this then? Not you. You've already ensured you hadn't linked to it. Not the Opera user, as they have read the document and, respecting your privacy, they've not mentioned it to anyone else.
However, sneaking in a Google crawl in this underhanded manner is unacceptable to me. My firewall blocks all Google crawler bots for this very reason.
Hmmm, let's see:
1. Microsoft has stated it wants to win the search engine war.
2. MSNBC (Microsoft owned) puts out story calling Google insecure because it invades your privacy.
3. MSN Search comes out with "secure, private searching" for only $9.95 a month.
4. Profit???
Conclusion: This is nothing more than a FUD story designed to sow the seeds of doubt about Google.
Visceral Psyche Films
The article seems to imply that the problem is Google, but that simply isn't fair--the problem is that people are posting private info to the web. If you don't want the public to see it, don't post it in public.
I read once that an old trick some people used to use is to do a search for "root" on Altavista (yeah, this was back in the day) and it would actually return useful information for gaining access. Not sure if that was just a geek urban legend but it sounds plausible to me.
EvilCON - Made Famous by
You suck at making fun of people's sucky FP.
...to coincide with Google's IPO, had they not delayed it. A story saying Google is a threat to privacy AND national security. May as well throw Intellectual Property into the mix too, for all the warez searches. Just like that operating system our congresspeople were just informed about by the alert people at SCO.
Wow, this clearly shows that the better solution would be a more limited search engine that doesn't actually let the user search for whatever he/she wants, just in case it's naughty. Perhaps something tied into a Trusted platform that can make these legal judgement calls on the user's behalf.
Wasn't SCO planning to sue Google soon? Wow, what an incredible coincidence! Bad timing for your IPO, Google!
I'd end this with [/tinfoil hat], but I think I could actually be right...
What "privacy"? The information is posted on the WORLD WIDE Web... One person's blog topic is another's secret sometimes. There's a big difference between information to give to your family and information you should be leaving within view of Google... but some people don't realize that yet.
People have used this for years to find things like Bill Gates' social security number
For the curious, it's 539-60-5125. Leaked in 1995. The 539 means it was issued in Washington.
Nothing is private any more. I wholly agree. But:
Anyone else notice that the site is msnbc.msn.com? Isn't Microsoft trying to develop a google competitor?
Am I just another cynical bastard?
I am a member of a university organisation called the Assassins Guild, the basic premise being that, on the basis of the most limited possible information, we hunt down and "kill" other guild members with weapons such as cap guns and cardboard swords. As such, I have some personal experience of the use of Google in stalking. I can tell you that, in a university composed presumably of some of the most net-savvy people around, I have only found a photo once. Occasionally I have found a usenet posting or slashdot account. Old schools are common, but the folk at my uni are often those who are mentioned in school newsletters. The average web presence of the average user is approximately nil. In a range of cases, someone may become more prominent (either by accident or design - Darl McBride for example), but on the whole there is very little you can gather from Google. Occasionally it's enough to kill your target, but don't count on bank details.
For the love of God, please learn to spell "ridiculous"!!!
NCIC is a closed system. It's one thing to have the codes to query computerized criminal history (CCH) information. It's another thing to get into the system to make the query. It'd be easier to social engineer a police dispatcher and get her/him to run it for you.
Heck is a place for people that don't believe in gosh.
And why wouldn't the guy at Sears be considered a 'tool'? He is a 'device' _used_ for finding the information you want.
The same as a metal detector or store directory leaflet - these are tools used for information retrieval.
I.O.U One Sig.
Most of the codes are actually to enter stolen property. To query a CCH on a person you need a name, sex and DOB. You can also use a SSN.
Most of the info you get back is kinda boring. With the exception of juvenile arrest data, it's all public record. But you'd have to know what courthouse to go to. The NCIC CCH file brings it all into one place.
You'd get name, race, sex, DOB, SSN and DL info, along with height, weight, hair and eye color, and fingerprint classification, plus a listing of arrests and the court dispositions of those arrests.
If you are going to steal someone's identity, you could do better than stealing a crook's.
If you know someone has been arrested by the Anytown Police Department, go to their records section and do an open records act request for the last arrest's booking sheet. Most likely you'll get most of their identifying info except the SSN.
But whatever you do, don't ever run the President's DL. The Secret Service gets real nasty about that!
I was looking at a few examples and tried out intitle:"Index of..etc" passwd. The first result is a honey pot :)
They have some Webalizer stats for the honey pot too.
(\(\
(^.^)
(")")
*This is the cute bunny virus, please copy this into your sig so it can spread
How to use this for evil is obvious. (Actually I do searches on myself every now and then just to see what I look like on the Internet. Do it yourself, it's fun.)
You're an evil bad guy and go nuts on Google... Credit cards... Hooray... Now to go nuts.
Leave it to MSNBC to neglect to mention that this is also a tool for good.
You're a credit card holder..... Now go google your credit cards... DO IT NOW.
Did you find it? I didn't.
I've got 4 credit cards: two store cards, one business Visa and one personal MasterCard.
(Oh yeah, hackers, the name on the card is Felinoid.) Yeah, they'll buy that.. not...
Don't need to use Google BTW... Use AltaVista.. or Microsoft search.. or Lycos...
Oh yeah, and when you're done put your credit cards away. (I had to leave my desk while entering this post and left my wallet on the desk... Now my credit cards are gone and I think I saw a stuffed teddy bear running down the street yelling "Charge it"... Just kidding, got all my cards..)
(Oh yeah, if you do see a teddy bear running down the street, your missing credit cards are the least of your concerns.)
Now to set up a bot to trap all these searches on Google....
(Oh come on it had to be said)
I don't actually exist.
This isn't news when it comes to Navy ships. I have been a member of a small group of warship fans in Seattle who have swapped emails for years about ship X being at location Y. It basically amounts to: "That new destroyer put into Bremerton last week. Go take a look at it!" Of course the only difference here is that now that information is available to the general public. Whoopee! Disaster! You might know something!
Google will leave you right the fuck alone
All it takes is one cross-link from a site that links, and a number of hits, and google will advertise the cross-link, robots.txt or not.
Google has been great for catching plagiarism - my mother has used it to verify essays she suspected of being plagiarized.
To beat, to kill, just because I'm a fox!
Heh, in about 1996 or so I got a hit on my home page from gatekeeper.eop.gov. I have no idea what that was about.
$x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
$x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;
"The scariest thing is that this could be happening to the government and they may never know it was happening," Long said.
This isn't "happening to the government", as if the government is some innocent victim. Rather, "the government screwed up big time". Likewise, if some company has sensitive personal information lying around on a public web server, the company is at fault and should be liable.
Let's not make victims out of perpetrators.
From the article:
Rican writes "MSNBC has an interesting article about how 'Googledorks' are using the powerful search engine to do searches across the web for sensitive and/or private information."
---
From the website:
googleDork (gOO gol'Dork) noun 1. Slang. An inept or foolish person as revealed by Google.
---
Ok... So who here is the googledork (hint: It's not me)? The dork who googles for the victim's information or the clever person who googles for the dork's information? Confused? If the website is more authoritative than the original slashdot poster (Rican) then maybe Rican is the dork?
I don't know why Google never indexes this stuff, it's clearly public record and can be of interest to a lot of people, but they never did (I checked them many times, including just now, and they show no indication of the document). I wonder what other good government documents are out there if you only know where to look for them.
I'm an American. I love this country and the freedoms that we used to have.
The allinurl and site search features can be used to good effect when looking for machines with vulnerable CGI scripts that give one execute or read permissions.
For example:
allinurl: cgi print site:.mil
You would cry if you realized that to hack .gov and .mil one only needs a web browser to gain the foothold on their DMZ/LAN. (Heh, DMZ, giving them way too much credit.)
Anyway, it comes down to common CGI tricks like dot traversal, poison null byte (RFP you can kiss my ass), obfuscation (".." == "%2e%2e"), etc... Oh, don't forget the pipe operator.
I agree with other posters who say it is not Google's fault. They do a great job. It is the people who program those CGIs who need to really take a bit more time.
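To make "take a bit more time" concrete, here is a minimal sketch of the kind of check a CGI author could run on a requested file name before touching the disk. It is only an illustration under assumptions: `DOC_ROOT` and `safe_resolve` are made-up names, the list of rejected shell characters is not exhaustive, and a real script would need more than this (symlinks, for one, are not handled).

```python
import posixpath
from typing import Optional
from urllib.parse import unquote

DOC_ROOT = "/var/www/docs"  # hypothetical document root, not from the thread


def safe_resolve(requested: str, root: str = DOC_ROOT) -> Optional[str]:
    """Return the normalized path for `requested` under `root`, or None if it looks unsafe."""
    # Decode percent-encoding first, so "%2e%2e" is seen as ".." before any check.
    decoded = unquote(requested)
    # Poison null byte: refuse anything that would truncate the name in C-level calls.
    if "\x00" in decoded:
        return None
    # Refuse shell metacharacters such as the pipe (illustrative list, not complete).
    if any(ch in decoded for ch in "|;&`$"):
        return None
    # Join onto the root and collapse any "." and ".." segments.
    candidate = posixpath.normpath(posixpath.join(root, decoded.lstrip("/")))
    # Dot traversal: the result must still live inside the root.
    if candidate != root and not candidate.startswith(root + "/"):
        return None
    return candidate
```

The order matters: decoding before checking is what defeats the ".." == "%2e%2e" obfuscation, and comparing the normalized result against the root is what catches traversal no matter how the dots were spelled.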
Click on the "show me some pictures" button at the upper-right.
Google is fetching these pages to analyse them for displaying AdSense (AdWords text ads targeted to the webpage you're viewing) in the free version of Opera.
This does not end up in Google's web search index.
If it's sensitive, it shouldn't be world readable. Ever. It shouldn't matter if you know that http://www.CIA.gov/secret/topsecret/locationsOfAllAgentsInTheWorld.xls is where the file is; the server shouldn't let anyone load it.
my sig's at the bottom of the page.
How about robots.txt? Is it forgotten? Do current search engines ignore it?
Above all, isn't it a stupid idea to hide information just by having no link point to it? You must make sure it's properly secured with access control, such as the visitor's IP address or a password.
Maybe for some people it's not simple to build access control using a content management system or a self-built script. But I think it's simple enough to use the HTTP authentication provided by most web servers.
----
so many dreams r swinging out of the blue we let them come true (forever young, alphavile)
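As a rough picture of what the HTTP authentication mentioned above does on the server side, here is a sketch of checking a Basic `Authorization` header. The function names, the user "alice", the password and the realm are all invented for the example; in practice you would normally switch on the web server's built-in support rather than write this yourself.

```python
import base64
import hmac


def basic_auth_ok(header, user, password):
    """Return True if `header` carries the expected HTTP Basic credentials."""
    if not header or not header.startswith("Basic "):
        return False
    try:
        # The part after "Basic " is base64 of "user:password".
        decoded = base64.b64decode(header[len("Basic "):], validate=True).decode("utf-8")
    except ValueError:
        # Covers malformed base64 and undecodable bytes.
        return False
    expected = f"{user}:{password}"
    # Constant-time comparison, to avoid leaking how much of the string matched.
    return hmac.compare_digest(decoded.encode("utf-8"), expected.encode("utf-8"))


def respond(header, user="alice", password="s3cret"):
    """Return (status, extra headers) for a request to the protected page."""
    if basic_auth_ok(header, user, password):
        return 200, {}
    # 401 plus WWW-Authenticate is what makes the browser prompt for a password.
    return 401, {"WWW-Authenticate": 'Basic realm="private"'}
```

Note that Basic credentials are only base64-encoded, not encrypted, so this protects the page from crawlers and casual visitors but should travel over HTTPS.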
O'Ferrell said.
But the MSN story, just a few lines later, says:
"And it is all legal, using the world's most powerful Internet search engine."
Hmm... Excuse me if I smell a rat.
Try this if you like that sort of thing. Does an automated search through GIS.
Did anyone notice how heavily "enhanced" the cited MSNBC web page is? Try to print it using Mozilla 1.2.1 on Linux and it crashes the browser. Try to view it with Mozilla 1.1 on Windoze XP and the page is displayed very incorrectly. Even printing with IE from XP took 3 tries.
These fuckers never give up.
The problem with this is, anybody can now download your robots.txt and have a list of your unprotected sensitive data.
Not really. I mean, you're not really giving much away with
Disallow: /personal/
unless going to http://mysite.com/personal/ returns a directory listing.
The general point is that yes, you do have to trust people to respect the robots.txt. The problem we're talking about is Google, though, and we know they do respect it.
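Both sides of this exchange can be seen in a few lines, using Python's standard `urllib.robotparser`. This is only a sketch: the robots.txt content and the mysite.com URLs are the made-up examples from this thread, and `disallowed_paths` is a hypothetical helper, not part of any library.

```python
from urllib.robotparser import RobotFileParser

# Example file contents, matching the /personal/ case discussed in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /personal/
"""


def make_parser(robots_txt):
    """Build a parser from robots.txt text without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def disallowed_paths(robots_txt):
    """List every non-empty Disallow path: what any visitor can read from the file."""
    paths = []
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


parser = make_parser(ROBOTS_TXT)
# A well-behaved crawler asks before fetching and skips what is disallowed.
blocked = not parser.can_fetch("Googlebot", "http://mysite.com/personal/diary.html")
allowed = parser.can_fetch("Googlebot", "http://mysite.com/index.html")
```

A polite crawler calls something like `can_fetch` and stays out, while anyone else can run something like `disallowed_paths` and learn exactly which directories you would rather not have looked at. That only matters if those directories are otherwise unprotected.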
He, he. I just googled for "filetype:txt inurl:robots.txt" and the first hit was www.whitehouse.gov/robots.txt. It contains very interesting entries like:
Disallow: /911/911day/iraq
Disallow: /911/response/iraq
Disallow: /vicepresident/iraq
Disallow: /space/iraq
Disallow: /president/winterwonderland/iraq
The listings end in either "text" or "iraq". Is "iraq" an acronym? If so, it's pretty funny.