I remember a few years ago, when I made the switch to Google. I was impressed from the get-go, and have never looked back. Everyone I talked to, everybody who was using some other search engine, I turned them on to Google. (It wasn't hard.)
And now, in some places, rather than saying "do a search for [something]" people say "google-search it" (even if they don't use google).
You know something's great when people make a verb out of its name.
-- "Peace, Love and Apathy"
Re:The Switch
by
cyborch
·
· Score: 2, Interesting
I don't know why, possibly it's the lack of web portal-ness of google, but very few non-geeks I know use google. They all stick to the local Yahoo clone.
I may be missing something, but I really can't see the reason why... could anyone enlighten me, is there something geekish about google? Or is it just me thinking that non-geeks want to use more bloated and less efficient solutions than geeks?
I'm pretty sure it's similar to the "use windows because it's on my machine when I buy it".
"I'll use this search engine because that's what appears when I install AOL, @home, Sympatico, etc."
That, and most people can't be bothered remembering more than two web addresses: www.hotmail.com (being replaced by simply using MSN Messenger) and www.my-favourite-porn-site.com.
Perhaps Google doesn't do such a good job when searching on non-geek topics. I use the Web mostly for computer stuff and random urban legend / Kevin Bacon searches, so I wouldn't know. But maybe if you want to book a holiday a semi-automated index like Yahoo does a better job than just counting links (after all, who links to a competitor's site?).
-- --
Ed Avis
ed@membled.com
Re:The Switch
by
markov_chain
·
· Score: 2, Interesting
I've been using the expression "just google for X" when referring friends to some site about X that I know shows up high on the hit list. It's funny, google is starting to replace DNS for me-- instead of remembering or bookmarking URLs, I just remember the keyword to google for. For example, the URL http://www.uk.research.att.com/vnc/ is harder to remember than to just type "vnc" and hit "I'm feeling lucky."
I often wonder how much less productive I would be if google went away tomorrow :) If anyone from google.com is reading this, thank you!
Perhaps you're right and it can be attributed to the higher incidence of geeky topics in Google as opposed to other search engines... But do a search on non geek topics.
Slightly more in all categories, however try the searches with PHP and MySQL as search terms in google and altavista (php:5,440,000/8,925,806 and mysql:1,980,000/16,577,970 respectively.) If anything Google has a lower tech:nontech ratio for search results returned. At least on the topics I've searched for.
The *QUALITY* of the results returned was higher for all categories in Google.
-Sara
Nice description
by
torqer
·
· Score: 3, Informative
In case you were like me and really had no idea what the submitter was talking about in his description...
The link is to an article that gives some insight into how google searches through the hordes and hordes of webpages. And bashes other search engines.
Note to submitter: while brevity may be the soul of wit try to remember we haven't read the article yet and need just a little more information.
There's a problem with this
by
Tim+Ward
·
· Score: 5, Insightful
Now that Google will find anything you want so easily, isn't there a danger that people will stop putting links to useful and interesting sites on their pages?
I don't need to tell people, via a link, about some wonderful site I've found if they can find it for themselves quicker and easier using Google. So I might not bother to maintain my collections of useful links, and Google will lose its information source. A victim of its own success.
What happens then?
Re:There's a problem with this
by
GigsVT
·
· Score: 4, Interesting
I've thought of this myself. I know I don't do nearly as much "surfing" between related sites now that Google is here and works. I usually hit Google up, then if that site isn't what I want, I don't bother clicking their links section, I just go straight back to Google.
The one thing that may save us though is AOLers. Bear with me here. :) I think that maybe we have found the most efficient way to get the information we want, mostly because the novelty of the Internet has worn off for us. We no longer spend hours bouncing from site to site, just reading random stuff. We use the Internet as a tool to expand our effective knowledge and intelligence.
This is obvious with the various Googlebots that have sprung up in lots of IRC chat rooms. This happens a lot in help rooms, if no one knows the answer, or doesn't want to take the time to explain it fully, they just !google and the bot returns the first link in the search.
So while people like us, if we were the only people on the net, would cause Google to fail, so long as there are still "surfers" out there, it should allow Google to remain meaningful.
Just my two cents.
-- I've had enough abrasive sigs. Kittens are cute and fuzzy.
Re:There's a problem with this
by
ForceOfWill
·
· Score: 2, Interesting
Then, it becomes harder to find things using google, and people start giving each other links again, and google gets better again.... see the cycle? I expect that there would be some sort of damping effect on this oscillation, so it would all even out in the end, with google being just short of good enough to warrant using it instead of passing links manually.
--
-- Seeing is believing; You wouldn't have seen it if you didn't believe it.
Re:There's a problem with this
by
guiding_knight
·
· Score: 2, Funny
In addition, what about all these /. links to google searches? Does google have a check in its programming to find links to itself? If not, as more and more people link to google searches, google could convince itself that it is the most authoritative site for any and every subject. I dunno about you, but I would find this very entertaining...
Re:How to abuse Google
by
PeterClark
·
· Score: 3, Informative
Well, this has been known for a long time. But really, it's not as big a deal as one might think. "Scientology" as a search term pulls up an entire page of Scientologist sites, except for #4, which is xenu.net. However, the first page for "Scientology secrets" is full of sites that debunk Scientology. So yes, the Church of Scientology has a virtual monopoly on the search "Scientology" but is far, far from controlling other search terms. It all works out in the end.
I'd not call it "abuse". It's simply that more pages (by real and virtual people) link to "real" scientology pages. After all, the COS is the source of information about scientology, don't you think? Telling this is the only job of google.
The same way, when you search for microsoft, you don't expect linux.org to come out at the top, and vice versa. In the COS case, the picture has more shades, obviously, but any serious research should be done not only on the first link.
You can help the opponents by linking Scientology to xenu.net this way on all the pages you maintain, after all.
-- "Ten years from now, they could do it in a few seconds." --
The Racketeer of the Hellfire Club, 1993, Phrack 42
Where's the magic?
by
guerby
·
· Score: 2, Insightful
In the age of DMCA, SSSCA, and angelic companies running after all those evil pirates in order to protect their beloved authors that deserve their protection, how comes no one has yet sued the biggest copyright infringer of all times... the Google cache?
The short version: The DMCA makes provisions for certain caches used in the transmission of information, such as your ISP may use. There are certain defined procedures that the ISP must implement to allow people to get their content out of that cache.
Google implements those procedures, and claims protection under the DMCA for their cache. (Note the hoops you must jump through to get them to remove stuff are the legally mandated hoops under the DMCA; they are not trying to be nasty.) Now, a careful reading of the DMCA will show that Google probably doesn't meet the qualifications of this cache exception; but nobody has cared enough to fight it yet. The few who care just jump through the hoops and forget about it.
The long version is: Read the DMCA and compare against Google's DMCA page and decide for yourself.
No human decisions ?
by
EpsCylonB
·
· Score: 2, Interesting
Do you think that the google search could be improved by more human decisions?
An example might be that goat.ce page (or whatever the url is), which might get linked to a lot as an example of bad taste (I've seen a few pages that link to it and describe the page, urging people not to visit it). That's fine, except that this web site is now getting linked to (or voted for, which is how the google algorithm treats a link), yet it isn't a particularly good or informative website.
Even if someone was searching for something on bad taste, that page is not really an authoritative page about bad taste, just an example of it.
More Google Links
by
Schwarzchild
·
· Score: 5, Informative
Google is brilliant if you know what you are looking for. It finds the best pages straight away. However, when I'm idly surfing (tm) I use something else.... I want to wander around the 'net, not be taken straight to my destination.
Bit like driving somewhere along the back roads. You never know what you might find.
--
Anyone quoted by a reporter knows how little they understand
Don't believe what you read is the truth.
In a world of degradable storage, replicating copies is the surest way to guarantee longevity. Whether your data is in atoms or bits, the more copies you make of it and the more widely you disperse it, the greater the likelihood that your data will persist forever. (That's why Jaron Lanier jokingly proposed encoding printed matter into the DNA of the notoriously prolific cockroach, as a means of ensuring archives through a nuclear war and beyond.)
I can see some future biologist doing the heavy work on decoding this now. And the arguments. Of course, if it contained something like the Linux kernel, figuring it out could take a while.
Heck I am still waiting for folks to find a licensing and copyright statement in the human genome.
;-)
-- "It is a greater offense to steal men's labor, than their clothes"
I would go on worrying if i were you
by
limbop
·
· Score: 4, Insightful
Google works on the recursive principle that an important document is one linked to by a lot of important documents. search for "child pornography" and (i'm generalizing here) you're likely to find two kinds of sites: sites offering child pornography and sites opposing it. those will probably create two separate cliques (if you look at the web as a graph) or clusters. It will be quite easy to offer them as two separate lists both satisfying the search query. i believe northern light (http://www.northernlight.com/) does exactly this.
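For anyone who wants to see that recursive principle in action, here is a minimal sketch in Python. It is a toy power iteration in the spirit of the Brin and Page paper, not Google's actual code; the function name `pagerank`, the damping factor of 0.85, the iteration count and the four-page graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank. `links` maps each page to the list of pages it links to.

    The damping factor and iteration count are illustrative assumptions.
    """
    # Every page that appears either as a source or as a link target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Each page starts the round with the small "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly among its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank everywhere.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "hub" is linked to by two other pages.
toy_web = {"a": ["hub"], "b": ["hub"], "hub": ["a"], "c": ["b"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Note that "a" ends up ranked well above "b" and "c" even though only one page links to it, because that one page is the important one. That is the recursion doing its job.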
Now how about a similar principle for people? A suspicious person is one who communicates with suspicious people. If you have access to Email messages sent on the internet this is quite easy to achieve. Filter the messages to those mentioning "child pornography" and now do the same analysis as google does. voila! you are left with lists of child pornographers and of internet vigilantes. easy. automatic. you can start worrying again.
btw, if you are looking for an interesting technical description of the best search engine around, the original google article (http://citeseer.nj.nec.com/brin98anatomy.html) by Brin and Page does the job a lot better than Doctorow's.
Re:I would go on worrying if i were you
by
GigsVT
·
· Score: 2
It is a little too simplistic, but it is totally feasible.
What about when a vigilante emails a bunch of sites, flaming them and telling them to take their stuff down?
This happens a lot in the spam/antispam world, antispammers probably trade more email with spammers than other antispammers.
-- I've had enough abrasive sigs. Kittens are cute and fuzzy.
nothing new...
by
illaqueate
·
· Score: 2, Informative
A puff piece with poor logic
by
XDG
·
· Score: 4, Insightful
The article boils down more or less to the following:
1. "Old" search technologies (Altavista, Yahoo) failed because they used approaches that found words but not content (Altavista) or relied on non-scalable human editorial judgement (Yahoo).
2. Google works (and is cool) because it uses available information about the number of links to determine (a) valuable content and (b) smart judges of other valuable content
3. The government efforts at creating the Panopticon will fail because they'll be stuck using "old" keyword approaches that can't pick out real content.
This argument is flawed in two key ways:
1. The author confuses the nature of the "search". Web searching is about finding *content* and the challenge is differentiating "good" content from "bad" content. Governmental "security" searching is more akin to traffic analysis and the goal is identifying dangerous *individuals* based on the content and pattern of their traffic. The challenge there is differentiating "good" (safe) speakers from "bad" (dangerous) speakers.
2. The author assumes (based apparently simply on opinion and what is popularly reported in the press) that the government will blindly apply "alta-vista style" techniques. His lack of fear of the Panopticon is based on an assumption of incompetence in the application of surveillance methods. Given the motivation and resources (both of which the government now has in spades), there is no reason to believe that more sophisticated and effective techniques will not be developed and pursued. Assuming Echelon has really been in operation, it's hard to imagine that, in the closed halls of the NSA, researchers aren't well aware of the limitations of keyword search and are far along applying cryptanalytical techniques to the real problem identified above.
It would seem that the author is trying to take advantage of hype and concern about government surveillance not to make a serious comment about it or whether one should truly be concerned, but rather to get an audience for his opinion that Google is really cool, which most of us already knew anyway.
-XDG
Re:A puff piece with poor logic
by
sam_handelman
·
· Score: 3, Insightful
the challenge is differentiating "good" content from "bad" content.... The challenge there is differentiating "good" (safe) speakers from "bad" (dangerous) speakers.
I agree with all else you say - including that the government has the resources to come up with new approaches to the problem - but I don't think that this challenge is really different from distinguishing between good and bad content. In so far as the government is trying to do what it shouldn't even remotely be doing, using this technology to identify subversives, you are right. However, in so far as Carnivore might *actually* be used to intercept a criminal communique, I think that the challenge is very similar to what is faced by google.
Suppose that Inoccuous260@hotmail.com only ever sends one message, from some terminal in a public library, and it is the delivery schedule for a nuclear weapon. The best, most morally (if not legally) defensible use of Carnivore would be to intercept this message and hand it over to the Feds. If the Feds can do this, even once, Carnivore will be with us forever, however else it may be abused, b/c you will never rally the public will to end use of such a tool. The problem of identifying that message, and I don't want to brainstorm ideas here, but I'm sure we could come up with several, is very similar to the problem of picking out a biographical sketch of Alan Turing among all the sci-fi and hoopla, which Google can do using characterisation by links, and which the government would be hard-pressed to do without that human resource.
So, the author raises a fair point about the limitations on the "legitimate", let us say intended, use of carnivore. However, the unintended/illegitimate use, simple identification of dissidents, could indeed be carried out by a clever 10 year old, and is plenty worrisome even if Carnivore never does what it was supposedly intended to do.
-- The good and new comes from no quarter where it is looked for, and is always something different from what is expected.
Wrong about email
by
Karellen
·
· Score: 5, Informative
He's wrong about one thing. Email does have links. It has links indicating who it came from and who it went to. Even without the content, that sort of information, about who is talking to whom, and in what patterns, can be really informative to those who know what they're looking for.
If you include the content, it's a goldmine.
URLs embedded in email would make it better again.
Aside from that though, great article.
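To make the "email does have links" point concrete, here is a rough sketch of pulling a who-talks-to-whom graph out of nothing but the From, To and Cc headers, using Python's standard `email` module. The function name `header_edges` and the example.com addresses are made up for illustration, and this says nothing about how any real surveillance system works.

```python
from collections import Counter
from email import message_from_string
from email.utils import getaddresses


def header_edges(raw_messages):
    """Count (sender, recipient) pairs using only the message headers."""
    edges = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        senders = getaddresses(msg.get_all("From", []))
        recipients = getaddresses(
            msg.get_all("To", []) + msg.get_all("Cc", [])
        )
        for _, sender in senders:
            for _, recipient in recipients:
                # Lower-case so Alice@... and alice@... count as one node.
                edges[(sender.lower(), recipient.lower())] += 1
    return edges


# Hypothetical messages; the bodies are never looked at.
messages = [
    "From: Alice <alice@example.com>\n"
    "To: bob@example.com, Carol <carol@example.com>\n"
    "Subject: hi\n\nbody text",
    "From: alice@example.com\nTo: Bob <bob@example.com>\n\nmore text",
]
for (sender, recipient), count in sorted(header_edges(messages).items()):
    print(sender, "->", recipient, count)
```

The resulting edge counts are exactly the kind of link structure a link-analysis ranking could be run over, content or no content.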
-- Why doesn't the gene pool have a life guard?
Re:Wrong about email
by
guiding_knight
·
· Score: 2, Funny
Privacy concerns aside, if the google technique was applied to emails in the same manner, spam and pornography would be more prominent than any relevant info on many search pages. The sheer volume of this would tip google-style search results. I'm sure the spammers would love this, sending extra, no cost (to them) copies of spam to everyone at the NSA :)
I'm sure that UBE would be easily identifiable by a google type of database as practically no mail will exist that goes _back_ to the source.
Filters based on that (to either look for UCE, or to discard it) would probably be trivial based on ratios of sent/received messages to/from a particular envelope.
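Something along these lines, as a back-of-the-envelope sketch: count how much each envelope address sends versus how much ever comes back to it, and flag the heavy senders that almost nobody writes to. The function name `likely_bulk_senders`, the thresholds and the addresses are all invented for the example; a real filter would need far more care.

```python
from collections import Counter


def likely_bulk_senders(envelopes, min_sent=100, max_reply_ratio=0.01):
    """Flag addresses that send a lot but receive almost nothing.

    `envelopes` is an iterable of (sender, recipient) pairs.
    Both thresholds are arbitrary assumptions for this sketch.
    """
    sent = Counter()
    received = Counter()
    for sender, recipient in envelopes:
        sent[sender] += 1
        received[recipient] += 1

    flagged = set()
    for address, n_sent in sent.items():
        # Counter returns 0 for addresses that never received anything.
        ratio = received[address] / n_sent
        if n_sent >= min_sent and ratio <= max_reply_ratio:
            flagged.add(address)
    return flagged


# Hypothetical traffic: one bulk sender plus two people who reply to each other.
traffic = [("bulk@example.net", f"user{i}@example.org") for i in range(200)]
traffic += [("a@example.org", "b@example.org"), ("b@example.org", "a@example.org")]
print(likely_bulk_senders(traffic))
```

A mailing list would trip the same wire, which is one reason to treat this as a first-pass signal and not a verdict.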
-- Why doesn't the gene pool have a life guard?
Wrong panopticon
by
dallen
·
· Score: 5, Insightful
Doctorow's point, I believe, is that we have a luxury of choices for searching information, but those who want to wiretap us do not have the luxury of infinite time and infinitely improved ways to find the information they want.
If they could only track us via the public internet, I would probably agree.
I would say we don't know what sort of technology they ultimately have for searching our data; until we knew that, we should not assume anything such as he has, that they're not able to keep up with the flood of data.
Remember that they're not only recording elements of email, phone, and other communications; but they are also tracking who is sending and receiving it; and those who are under "wiretap" are nearly perfectly trackable as long as they can associate an identity to an IP to a person. That is the Panopticon, the prison with ideal surveillance; mapping a person to their communication and selectively watching those who bear suspicion.
Maybe the semantic web will...
by
wiresquire
·
· Score: 2, Interesting
That is a very interesting point. If you check out the Semantic Web activity there is a move to semantic definitions. DAML+OIL and several other efforts are all looking at defining the spoken/written language for computers.
I wonder if the number 1 ranked page will always end up being a single document - the ontology?
--
So does Anonymous Coward have good karma?
incredibly short-term viewpoint
by
AdamBa
·
· Score: 4, Funny
1) Google sucks. All search engines suck right now. Altavista may suck 99% and Google may only suck 97%, but they are all terrible, and will remain so until they can actually start to understand what a page is about. The author may bag on AI, and it is bad now, but it's the only hope for workable search engines in the future.
2) What is this absolute crapola about how bytes are more reliable than allegedly "fragile" books? Does this tubesteak realize that there are 500 year old books that are completely legible, while 15-year-old electronic data is unreadable? Yeesh. The only bright spot is that this guy's ravings are in electronic form, so future generations won't have to worry about them.
- adam
Re:Sad simplification of storage issues
by
AdamBa
·
· Score: 2
Think of the Library of Congress who want to be able to store data forever. Let's think just 50 years from now. Even if they had the appropriate hardware, do you think they would have a copy of Microsoft Word 2000 handy? MS sure as hell won't be for sale and won't be supported. Would it run on any of the hardware available in 2052?
Informative but Not Conclusive
by
MadFarmAnimalz
·
· Score: 2, Interesting
> Then they must use some hybrid approach: human editors and AI
Well, there's the implied assumption here that the people running this surveillance operate with standard hardware, where standard means something google, altavista, lycos, etc. can get their hands on. Sketchy information suggests that they do not; specialised hardware seems to be the order of the day.
Besides, there's a lot of research going on in terms of context recognition, here to name one place.
This article is insightful? It is deceiving.
I read something interesting about the "Panopticon" not long ago...
"The agency which Poindexter will run is called the Information Awareness Office. You want to know what that is? Think, Big Brother is Watching You. IAO will supply federal officials with 'instant' analysis on what is being written on email and said on phones all over the US. Domestic espionage."
--John Sutherland of UK's Guardian.
Remember John Poindexter? Mr. Iran-Contra? He lied to Congress and kept Ronald out of the loop. He also was responsible for shredding lots of docs on the subject as well. Now he'll be spying on US domestic electronic transmissions.
There is some irony in him destroying thousands of emails to cover his ass then and now being in charge of watching everyone else's emails.
I'm also sure that the billions of dollars for his new office may be able to overcome shortcomings of certain search engines. Nobody's going to have to type all those boolean operators.
Cheers to all the spooks! I think it is a job well done!
-b.
Google: Big improvement, but not perfect
by
livingdots
·
· Score: 2, Insightful
I like Google; it weeds out most of the spam -- unlike AltaVista. It isn't perfect, though. I once searched for prostate milking, after reading this. The search results were quite interesting: It brought up hundreds of, apparently fake, headlines ("Located here! Prostate Milking") and domain names ("childhood-disease.accurate-health.com/prostate-milking.html"); it in fact still does, even though a month has passed since. Many of the links don't work, but some redirect you to other sites (this one amazingly owned by Novartis, a supposedly "respectable" biotechnology company). Question: How do they do this?
The /. contradiction in one sentence
by
Anonymous Coward
·
· Score: 2, Funny
"I hate it how everything is being cached and observed and indexed, but I love it cuz its cool!!!"
It seems to me that those 50 or so "official" hits are not a result of a deliberate attempt to dominate Google results. They're just a symptom of the way Scientologists -- like any other religious zealots -- love to blather about themselves.
The grim era before Google, when searching was a spew of boolean mumbo-jumbo, NEAR this, NOT that, AND the other?
I kind of liked the "NEAR" operator - wish google had it!
-- I'm a 2000 man.
alleged fragility of books
by
AdamBa
·
· Score: 3, Insightful
Maybe 500 was an exaggeration (given that the printing press was about that old)...but there are certainly 300 year-old books that are fine (not having been vacuum-sealed) and 100 year-old books are not even that unusual.
The article (or that part of it) reminds me of the people who claimed that newspapers were going to fall apart and they all needed to be microfilmed and stored that way...now the newspapers that were dumped are in such great shape that The Sharper Image is selling them for $30 a pop, and the microfilms are deteriorating, that is the ones that were made legible to begin with.
Copying bytes may be easy but every time I switch computers I have to worry about moving stuff and where is it stored, then there is 20-year-old stuff on 5 1/4" floppies...meanwhile my books from childhood are all doing great. Even the cheap-o dot-matrix printouts from my BBS days in 1983 are perfectly preserved, which is more than I can say for any data I had from back then.
- adam
Re:alleged fragility of books
by
NaturePhotog
·
· Score: 2
Maybe 500 was an exaggeration (given that the printing press was about that old)...
Actually, there are books that pre-date the printing press. The oldest printed book still around is The Diamond Sutra, at The British Library. It dates from 868 AD.
It may also be the oldest existing Open Source document:
The colophon, at the inner end, reads: `Reverently [caused to be] made for universal free distribution by Wang Jie...
:-)
Re:Another problem... Google Spyware now in use!
by
PurpleFloyd
·
· Score: 2
Google has had ads for years. They have been off to the side in little pastel-tinted boxes. The great thing is you never even notice them until one of them is useful -- unlike all the damn popups everywhere, they stay out of the way until you need them. As such, I will willingly click on a Google ad if it relates to what I want (and it usually does!), while popups are killed via Mozilla, or, failing that, immediately destroyed as soon as they come up. Interesting to note that while everyone else seems to take the idea of "ads should obscure content", Google has taken the rational and sane approach of "ads should be relevant content".
--
That's it. I'm no longer part of Team Sanity.
Google Can Search Your Apartment and Your Brain
by
weston
·
· Score: 2
Paul Ford wrote a hilarious piece on what life might be like if google tried to index the world.
Me, I think that the reason that the Harry Potter film ended up looking uncannily like what was in everybody's head is because Google can index the brain.
Google sometimes defies explanation.....
by
fwc
·
· Score: 3, Interesting
I was talking to a friend about "mystery email attachments", and wanted to find this user friendly strip.
So, without thinking I fire up google and type the search:
"user friendly the comic strip" email attachment
and then clicked on search. The first hit is the cartoon I wanted, so I click on it. When I pull up the page, I realize that the text words "email attachment" don't appear anywhere on the screen other than the graphic text in the comic itself, so google shouldn't have found the page - at least according to how I thought google worked. So I pulled up the source to see if there was a meta tag there which would explain this. Nope.
The only thing I can think of is that google OCR's the pictures (seems scary, and that font which Illiad uses doesn't look very OCR-able). The other thing I thought about is that perhaps google also matches text found within <A> tags which link to that page or something.
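The <A> tag guess lines up with the original Brin and Page paper, which describes associating a link's text with the page the link points to. A bare-bones sketch of that idea, using only Python's standard `html.parser`: the class name `AnchorCollector`, the function `anchor_index` and the example.com URL are hypothetical, and a real crawler does far more than this.

```python
from collections import defaultdict
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (href, link text) pairs from one HTML document."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            self.anchors.append((self._href, text))
            self._href = None


def anchor_index(pages):
    """Map each word of link text to the URLs that the links point to."""
    index = defaultdict(set)
    for html in pages:
        parser = AnchorCollector()
        parser.feed(html)
        parser.close()
        for href, text in parser.anchors:
            for word in text.lower().split():
                index[word].add(href)
    return index


# Hypothetical page that links to a comic whose own text is only in an image.
linking_page = '<p>Funny: <A HREF="http://example.com/strip">the email attachment one</A></p>'
index = anchor_index([linking_page])
print(index["attachment"])
```

With an index like this, a search for "email attachment" can return a page that never contains those words as text, as long as somebody else's link to it does.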
I've shot a message off to google to ask about this but I haven't heard back yet. I'll be interested to find out how the *@(#*$ they did this.
I think that I saw an ad somewhere which said "How the @(#$* did they do that?" was the highest praise one web designer could give to another. If that's true, they've definitely earned my praise in this case. Regardless, some wizard at google got their search engine to do exactly what I wanted with whatever technology they used. Technology sufficiently advanced is indistinguishable from magic. And google is definitely magic.
Actually Google's system can be, and is being, abused.
echo '[q]sa[ln0=aln80~Psnlbx]16isb572CCB9AE9DB03273snlbxq' |dc
In the age of DMCA, SSSCA, and angelic companies running after all those evil pirates in order to protect their beloved authors that deserve their protection, how comes no one has yet sued the biggest copyright infringer of all times ... the Google cache?
So where's the magic?
--
Laurent Guerby <guerby@acm.org>
Undocumented Google Commands
Google Time Bombs
Google Science-Fiction
"sweet dreams are made of this..."
Vannevar Bush, As We May Think (July 1945)
Ben Schneiderman, Codex, Memex, Genex (December 1997)
Henry Jenkins, Information Cosmos (April 2001)
The article boils down more or less to the following:
1. "Old" search technologies (Altavista, Yahoo) failed because they used approaches that found words but not content (Altavista) or relied on non-scalable human editorial judgement (Yahoo).
2. Google works (and is cool) because it uses available information about the number of links to determine (a) valuable content and (b) smart judges of other valuable content.
3. The government efforts at creating the Panopticon will fail because they'll be stuck using "old" keyword approaches that can't pick out real content.
This argument is flawed in two key ways:
1. The author confuses the nature of the "search". Web searching is about finding *content* and the challenge is differentiating "good" content from "bad" content. Governmental "security" searching is more akin to traffic analysis and the goal is identifying dangerous *individuals* based on the content and pattern of their traffic. The challenge there is differentiating "good" (safe) speakers from "bad" (dangerous) speakers.
2. The author assumes (based apparently simply on opinion and what is popularly reported in the press) that the government will blindly apply "alta-vista style" techniques. His lack of fear of the Panopticon is based on an assumption of incompetence in the application of surveillance methods. Given the motivation and resources (both of which the government now has in spades), there is no reason to believe that more sophisticated and effective techniques will not be developed and pursued. Assuming Echelon has really been in operation, it's hard to imagine that, in the closed halls of the NSA, researchers aren't well aware of the limitations of keyword search and are far along applying cryptanalytical techniques to the real problem identified above.
It would seem that the author is trying to take advantage of hype and concern about government surveillance not to make a serious comment about it or whether one should truly be concerned, but rather to get an audience for his opinion that Google is really cool, which most of us already knew anyway.
-XDG
He's wrong about one thing. Email does have links. It has links indicating who it came from and who it went to. Even without the content, that sort of information, about who is talking to whom, and in what patterns, can be really informative to those who know what they're looking for.
If you include the content, it's a goldmine.
URLs embedded in email would make it better again
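As a rough illustration of the "email does have links" point, the sketch below pulls sender and recipient pairs out of message headers with Python's standard email module and counts them. The function name and the sample messages are invented for the example, and it only looks at From, To and Cc.

```python
from collections import Counter
from email import message_from_string
from email.utils import getaddresses


def traffic_edges(raw_messages):
    """Count (sender, recipient) pairs using only the message headers."""
    edges = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        senders = [addr for _, addr in getaddresses(msg.get_all("From", []))]
        recipient_fields = msg.get_all("To", []) + msg.get_all("Cc", [])
        recipients = [addr for _, addr in getaddresses(recipient_fields)]
        for s in senders:
            for r in recipients:
                # Lowercase so the same mailbox is not counted twice.
                edges[(s.lower(), r.lower())] += 1
    return edges


# Hypothetical messages; the bodies are never inspected.
sample_mail = [
    "From: Alice <alice@example.org>\n"
    "To: bob@example.org, Carol <carol@example.org>\n"
    "Subject: hi\n\nfirst body\n",
    "From: alice@example.org\n"
    "To: Bob <BOB@example.org>\n"
    "Subject: again\n\nsecond body\n",
]
edges = traffic_edges(sample_mail)
for (sender, recipient), count in edges.most_common():
    print(sender, "->", recipient, count)
```

The resulting pair counts are a directed graph of who writes to whom, built without reading a single message body.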
Aside from that though, great article.
Why doesn't the gene pool have a life guard?
Doctorow's point, I believe, is that we have a luxury of choices for searching information, but those who want to wiretap us do not have the luxury of infinite time and infinitely improved ways to find the information they want.
If they could only track us via the public internet, I would probably agree.
I would say we don't know what sort of technology they ultimately have for searching our data; until we know that, we should not assume, as he has, that they're not able to keep up with the flood of data.
Remember that they're not only recording elements of email, phone, and other communications; but they are also tracking who is sending and receiving it; and those who are under "wiretap" are nearly perfectly trackable as long as they can associate an identity to an IP to a person. That is the Panopticon, the prison with ideal surveillance; mapping a person to their communication and selectively watching those who bear suspicion.
HOWTO get better dates on slashdot
Same for Islam.
I wonder if the number 1 ranked page will always end up being a single document - the ontology?
So does Anonymous Coward have good karma?
2) What is this absolute crapola about how bytes are more reliable than allegedly "fragile" books? Does this tubesteak realize that there are 500 year old books that are completely legible, while 15-year-old electronic data is unreadable? Yeesh. The only bright spot is that this guy's ravings are in electronic form, so future generations won't have to worry about them.
- adam
Exactly...that's why we need open data formats for everyone.
- adam
> Then they must use some hybrid approach: human editors and AI
Well, there's the implied assumption here that the people running this surveillance operate with standard hardware, where standard means something google, altavista, lycos, etc. can get their hands on. Sketchy information suggests that they do not; specialised hardware seems to be the order of the day.
Besides, there's a lot of research going on in terms of context recognition, here to name one place.
Blearf. Blearf, I say.
Remember John Poindexter? Mr. Iran-Contra? He lied to Congress and kept Ronald out of the loop. He also was responsible for shredding lots of docs on the subject as well. Now he'll be spying on US domestic electronic transmissions.
There is some irony in him destroying thousands of emails to cover his ass then and now being in charge of watching everyone else's emails.
I'm also sure that the billions of dollars for his new office may be able to overcome shortcomings of certain search engines. Nobody's going to have to type all those boolean operators.
The quote above is from the UK's Guardian... Check out what you might have been missing
An interesting story, curiously not in CNN..
Nor MSNBC...
Couldn't find it in Washington Post..
Article in LA times on his appointment does not describe what he is to do in his new job except to blather about Sputnik and stealth aircraft.
Not in CBC.ca : (
Cheers to all the spooks! I think it is a job well done! -b.
I like Google; it weeds out most of the spam -- unlike AltaVista. It isn't perfect, though. I once searched for prostate milking, after reading this. The search results were quite interesting: It brought up hundreds of, apparently fake, headlines ("Located here! Prostate Milking") and domain names ("childhood-disease.accurate-health.com/prostate-milking.html"); it in fact still does, even though a month has passed since. Many of the links don't work, but some redirect you to other sites (this one amazingly owned by Novartis, a supposedly "respectable" biotechnology company). Question: How do they do this?
"I hate it how everything is being cached and observed and indexed, but I love it cuz its cool!!!"
It seems to me that those 50 or so "official" hits are not a result of a deliberate attempt to dominate Google results. They're just a symptom of the way Scientologists -- like any other religious zealots -- love to blather about themselves.
The grim era before Google, when searching was a spew of boolean mumbo-jumbo, NEAR this, NOT that, AND the other?
I kind of liked the "NEAR" operator - wish google had it!
I'm a 2000 man.
The article (or that part of it) reminds me of the people who claimed that newspapers were going to fall apart and they all needed to be microfilmed and stored that way...now the newspapers that were dumped are in such great shape that The Sharper Image is selling them for $30 a pop, and the microfilms are deteriorating, that is, the ones that were made legible to begin with.
Copying bytes may be easy but every time I switch computers I have to worry about moving stuff and where is it stored, then there is 20-year-old stuff on 5 1/4" floppies...meanwhile my books from childhood are all doing great. Even the cheap-o dot-matrix printouts from my BBS days in 1983 are perfectly preserved, which is more than I can say for any data I had from back then.
- adam
Google has had ads for years. They have been off to the side in little pastel-tinted boxes. The great thing is you never even notice them until one of them is useful -- unlike all the damn popups everywhere, they stay out of the way until you need them. As such, I will willingly click on a Google ad if it relates to what I want (and it usually does!), while popups are killed via Mozilla, or, failing that, immediately destroyed as soon as they come up. Interesting to note that while everyone else seems to take the idea of "ads should obscure content", Google has taken the rational and sane approach of "ads should be relevant content".
That's it. I'm no longer part of Team Sanity.
Paul Ford wrote a hilarious piece on what life might be like if google tried to index the world.
Me, I think that the reason that the Harry Potter film ended up looking uncannily like what was in everybody's head is because Google can index the brain.
Just a theory.
Tweet, tweet.
So, without thinking I fire up google and type the search:
"user friendly the comic strip" email attachment
and then clicked on search. The first hit is the cartoon I wanted, so I click on it. When I pull up the page, I realize that the text words "email attachment" don't appear anywhere on the screen other than the graphic text in the comic itself, so google shouldn't have found the page - at least according to how I thought google worked. So I pulled up the source to see if there was a meta tag there which would explain this. Nope.
The only explanations I can think of: either google OCRs the pictures (seems scary, and that font which Illiad uses doesn't look very OCR-able), or perhaps google also matches text found within <A> tags which link to that page or something.
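The second guess is easy to try out on a small scale. Below is a minimal sketch, assuming an index built purely from the text inside links pointing at a page; the class and function names, the sample HTML and the example.com URLs are all hypothetical, and this is not a claim about how google actually does it.

```python
from collections import defaultdict
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (href, link text) pairs from one HTML document."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <A HREF=...> works too.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors.append((self._href, "".join(self._text).strip()))
            self._href = None


def build_anchor_index(pages):
    """Map each word of link text to the set of URLs it points at."""
    index = defaultdict(set)
    for html in pages:
        collector = AnchorCollector()
        collector.feed(html)
        collector.close()
        for href, text in collector.anchors:
            for word in text.lower().split():
                index[word].add(href)
    return index


def search(index, query):
    """Return URLs whose incoming link text contains every query word."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical pages that link to a comic whose own text is only an image.
linking_pages = [
    '<p>See <A HREF="http://example.com/comic.html">the strip about an '
    "email attachment</A> for a laugh.</p>",
    '<p><a href="http://example.com/other.html">unrelated email page</a></p>',
]
index = build_anchor_index(linking_pages)
print(search(index, "email attachment"))
```

The target page never has to contain the words itself. The pages that link to it supply them, which would explain a hit on text that only exists inside a graphic.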
I've shot a message off to google to ask about this but I haven't heard back yet. I'll be interested to find out how the *@(#*$ they did this.
I think that I saw an ad somewhere which said "How the @(#$* did they do that?" was the highest praise one web designer could give to another. If that's true, they've definitely earned my praise in this case. Regardless, some wizard at google got their search engine to do exactly what I wanted with whatever technology they used. Technology sufficiently advanced is indistinguishable from magic. And google is definitely magic.