Are We Searching Google, Or Is Google Searching Us?

Posted by kdawson on Tuesday July 29, 2008 @09:43PM from the eye-to-eye dept.

An anonymous reader writes "The folks at the Edge have published a short story by George Dyson, Engineer's Dreams. It's a piece that fiction magazines wouldn't publish because it's too technical and technical publications wouldn't print because it's too fictional. It's the story of Google's attempt to map the web turning into something else, something that should interest us. The story contains some interesting observations such as, 'This was the paradox of artificial intelligence: any system simple enough to be understandable will not be complicated enough to behave intelligently; and any system complicated enough to behave intelligently will not be simple enough to understand.' After you read it, you'll be asking the same question the author does — 'Are we searching Google, or is Google searching us?'"

19 of 346 comments

Are We Searching Google, Or Is Google Searching U by darkheart22 · 2008-07-29 21:49 · Score: 1, Interesting

I think the times of big brother are ahead of us. Any big company that controls many aspects of our daily life "searches" us. I think it's time for another big company to take the lead of the search engines (not Microsoft, though)...

--
Ever to excel
If only.. by QuantumG · 2008-07-29 22:06 · Score: 4, Interesting

all "magical thinking" in the field of artificial intelligence was reserved for fiction.
There's so much rigorous mathematically described hooey in AI that it's hard to tell the naive geniuses from the crackpot morons. Consider this paper by Solomonoff. Brilliant stuff! A fantastic read. Then, at the end, it says:

In our view, however, the most interesting situation in machine learning, arises when we do not know ahead of time what program will solve a given problem and where the machine discovers the program itself. It seems to be very hard to find out much about this by theory alone. Running experiments is crucial.
This is Solomonoff's way of reminding us that he is a mathematician and hasn't actually run any experiments. His other papers make similar pronouncements in the footnotes about the uncomputability of his math or acknowledge the requirement of perfect (aka impractical) training data, etc. He makes it abundantly clear that his work is purely theoretical and unimplementable, but does this stop enthusiastic amateurs from reading his papers and declaring that AI is "solved"? Well no, of course not.

--
How we know is more important than what we know.
Re:Assuming that Google could reach consciousness by phoenix321 · 2008-07-29 22:21 · Score: 5, Interesting

Any biological intelligence does exactly the same as described: gather data (try to assess external universe model), find correlations (build internal universe model), act according to internal needs (act upon internal universe model) and repeat.
This chain of processing is done by all brains from the fruit fly to humans. Everything else is a consequential result from this process.
A human brain has very few hardwired constants, and many of them can be overridden.
Feedback loops are a natural result of acting upon the external universe to fulfill internal needs according to an internal model - one that is always incomplete or wrong, see Goedel. In the next step data is gathered, correlations found (which constitutes the feedback loop) and then acted out according to the adapted internal model.
A fruit fly has simple sensors, a very simple correlation engine and a tiny memory for its internal model. But that doesn't mean it's following a different path than a newborn Einstein. Einstein has detailed sensors (easily surpassed by those of dogs and eagles, but still ok), a yet-unmatched correlation engine and a sufficient amount of internal model memory.
All other inputs come from the external universe and while some of them are absolutely necessary and come from other organisms (parents, teachers), they do not impose a hard limit on Einstein: with enough correlation power, he can easily discover new facts, unknown to any of his inputs (teachers, parents).
Einstein's brain was never designed to do anything other than processing input signals, detecting correlations and contacting motor neurons to act upon its internal model. How did he discover Relativity then?
Re:Assuming that Google could reach consciousness by QuantumG · 2008-07-29 22:39 · Score: 4, Interesting

You make it sound so easy!
Of course, at a certain level of abstraction, everything is easy.
Intelligence really is that simple.. except there's one little detail you're ignoring.

Any biological intelligence does exactly the same as described: gather data (try to assess external universe model using limited computational resources), find correlations (build internal universe model using limited computational resources), act according to internal needs (act upon internal universe model using limited computational resources) and repeat.
That's the hard part. If you have infinite computational resources it's really trivial to act intelligently. All you need do is enumerate all possible outcomes of all possible actions with an idealized model of the world (Gödel notwithstanding) and pick whichever maximizes your expected reward. You can write nice long mathematical papers on this.. or even a whole book. The question is, how do you do it with a sensible amount of processing power and memory?
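To make the "enumerate everything" idea concrete, here is a minimal Python sketch. The toy world model (`outcomes`), the two actions, and the reward (the final state itself) are all made up for illustration; nothing here comes from the papers mentioned above. It simply tries every action sequence and keeps the one with the highest expected reward.

```python
from itertools import product

# Hypothetical toy action set, chosen only for illustration.
ACTIONS = ("add", "double")

def outcomes(state, action):
    """Assumed world model: list of (probability, next_state) pairs."""
    if action == "add":
        return [(1.0, state + 3)]
    # "double" works three times out of four, otherwise the state resets to 0.
    return [(0.75, state * 2), (0.25, 0)]

def expected_reward(state, plan):
    """Expected final state after following `plan` from `state`."""
    if not plan:
        return float(state)
    return sum(p * expected_reward(nxt, plan[1:])
               for p, nxt in outcomes(state, plan[0]))

def best_plan(start, horizon):
    """Brute force: score every possible plan of length `horizon`."""
    plans = product(ACTIONS, repeat=horizon)
    return max(plans, key=lambda plan: expected_reward(start, plan))

def plan_count(horizon):
    """Number of plans the brute-force search has to score."""
    return len(ACTIONS) ** horizon

print(best_plan(1, 3))   # ('add', 'add', 'double')
print(plan_count(20))    # 1048576
```

With only two actions the number of plans already doubles with every extra step of lookahead, which is the "limited computational resources" problem in miniature.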
All the geeks have a great laugh when Matt Groening causes Bender to become transparent and we see a 6502 inside. The joke is that Bender has about the same processing power as a C64 from the early 80s. The show is littered with additional Commodore jokes which I'm sure 90% of the viewers just don't get. But that's not what really makes it funny. What really makes it funny is that all us geeks know that you need a lot more processing power than a 6502 to do the complex things that Bender does in the complex environment he does them in. But how is that? We don't know how to do AI. We don't even have the slightest clue. For all we know, there is a tight little algorithm for AI that could run on a 6502 and produce all those crazy behaviors that Bender gets away with.
And that's the problem with AI. The allure is that some short little algorithm exists that will magically evolve into a super-human intelligence if you just could find it and hook it up to the world. After all, nature figured it out, so how hard could it be? This has led many a would-be mad scientist to code up a genetic algorithms implementation. In fact, most every programmer I know has given it a go. The mystery of what you'll find if you give it the right fitness function is a powerful motivator - with a little magical thinking, it could be anything!
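For anyone who hasn't given it a go, the kind of genetic algorithm being described looks roughly like the Python sketch below. Everything in it is an assumption picked for brevity: bit-string genomes, single-point crossover, keeping the top half as parents, and a toy fitness function (`one_max`, which just counts ones). It is not any particular published implementation.

```python
import random

def one_max(genome):
    """Toy fitness function (an assumption): number of 1 bits."""
    return sum(genome)

def evolve(fitness, genome_length=20, population_size=30,
           generations=60, mutation_rate=0.05, seed=0):
    """Minimal genetic algorithm over bit strings."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    best_history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        best_history.append(fitness(population[0]))
        parents = population[: population_size // 2]   # selection
        children = [population[0][:]]                  # elitism: keep the best
        while len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_length)      # single-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
    return max(population, key=fitness), best_history

best, history = evolve(one_max)
print(one_max(best), history[0], history[-1])
```

The whole thing is a few dozen lines, which is exactly the allure. All of the difficulty hides in the choice of genome encoding and fitness function.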

--
How we know is more important than what we know.
Re:AI - A Myth by QuantumG · 2008-07-29 22:45 · Score: 3, Interesting

Semantic word games do not an argument make.
Go read about machine learning. There's plenty of things that we *can* do. It's not hard to sort the bunk from the legitimate results. Just don't look for anyone saying what we *can't* do. That's a little too pessimistic for the compsci crowd and is considered dangerous to the math crowd (who have a habit of not saying anything they can't prove).

--
How we know is more important than what we know.
Not everyone believes that by Namarrgon · 2008-07-29 22:54 · Score: 4, Interesting

It's perfectly possible for insanely complex systems to arise from very simple rules. We cannot grasp the entirety of the system, but we can know exactly how to create it, or perhaps manipulate it.
By way of example: the Mandelbrot set.
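The entire rule is "iterate z = z*z + c and see whether z escapes". A minimal Python sketch follows; the iteration limit, the ASCII rendering, and the grid bounds are arbitrary choices made here for illustration.

```python
def escape_time(c, max_iter=50):
    """Iterations before |z| exceeds 2 under z -> z*z + c (max_iter if never)."""
    z = 0j
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter

def render(width=60, height=24, max_iter=50):
    """Crude ASCII picture: '#' for points that never escaped, '.' otherwise."""
    rows = []
    for row in range(height):
        y = 1.2 - 2.4 * row / (height - 1)
        line = ""
        for col in range(width):
            x = -2.1 + 3.0 * col / (width - 1)
            inside = escape_time(complex(x, y), max_iter) == max_iter
            line += "#" if inside else "."
        rows.append(line)
    return "\n".join(rows)

print(render())
```

One line of arithmetic produces the whole set. Knowing that line tells you how to create the picture, but not what any particular region of it will look like until you compute it.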

--
Why would anyone engrave "Elbereth"?
Re:Short answer... no by mccalli · 2008-07-29 23:02 · Score: 3, Interesting

We're not searching Google, we're searching the Internet.
Nope, quite definitely searching Google. "The internet" cannot be searched, there's no protocol for it. You can search a concentration of culled pages stored in a particular place, but you're not searching the internet. You're searching what that place has stored, believing it to be a subset of the internet.

You can trivially see this with pages that present one thing to Google spiders and another to the real browsing user. Or with 404 links - they existed at the time they went in the index, but they don't exist now. It's not the internet being searched, it's the snapshot subset that's been indexed.
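A toy Python sketch of that snapshot effect. The page contents and the `example.org` URLs are made up, and a real search engine's index is of course far more elaborate than a dictionary of words.

```python
def build_index(pages):
    """Map each word to the set of URLs that contained it at crawl time."""
    index = {}
    for url, text in pages.items():
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(url)
    return index

def search(index, word):
    """Search the stored snapshot, not the live pages."""
    return sorted(index.get(word.lower(), set()))

# Hypothetical "live web" at crawl time.
live_web = {"example.org/a": "fresh kittens", "example.org/b": "old news"}
index = build_index(live_web)

# The web moves on after the crawl; the index does not.
del live_web["example.org/b"]                # this page is now a 404
live_web["example.org/a"] = "puppies only"   # this page changed its content

print(search(index, "news"))      # ['example.org/b'], a dead link
print(search(index, "puppies"))   # [], not indexed yet
```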

Cheers,
Ian
Am I the only one getting a bit tired of ... by fadir · 2008-07-29 23:09 · Score: 3, Interesting

... the never ending "Google, the data monster will eat us all" hype?
A few years ago the same people were hyping Google for rescuing us from MS and now they are trying to tell us that Google is bad and we should use $random_unknown_startup instead to save our lives.
Bring me facts or leave me alone!
Re:Assuming that Google could reach consciousness by adamofgreyskull · 2008-07-30 00:06 · Score: 4, Interesting

Unless that's just his morality co-processor.
Genes and self-modifying programs by tucuxi · 2008-07-30 00:29 · Score: 5, Interesting
Parent is right. As long as there is no way for the programs running on Google's hardware to grow past their original programming (beyond optimization and load-balancing), there will be no Skynet.
Yes, many computer programs work in a feedback loop, and so do all organisms. But as long as only the data entry part of the loop can change, and the system lacks the flexibility to change the type of processing that takes place (the 'program'), no spontaneous evolution will occur.
Several factors are needed to get us to the bleak, dark, machine-vs-human Sci-Fi universe slashdotters know and love.
- The would-be AI programs must be free to rewrite portions of themselves. Self-modifying code is generally frowned upon as being very hard to write and debug, and outside academia (evolutionary programming?), nobody is pushing it. Also, current approaches need massive amounts of processing for meager results.
- The programs should be free to replicate. While Google has a lot of machines, they probably don't want runaway programs hogging the CPU cycles (they are not in the heating business). Internet-roaming malware is much more likely than Google-sponsored code to take over the Internets, partly because the cheapest way to replicate is not asking for permission, and evolutionary systems will take shortcuts whenever available.
- There must be evolutionary constraints to help weed out the "unsuccessful" strains. If a viral, self-modifying program manages to get everywhere and "kill the host" (bog down the net completely), it will no longer evolve. Fortunately, there's lots of different systems hooked up to the 'net, and colonization would be hard enough.
The first point is the most difficult. It is *not* easy to take pieces out of two programs and build a third program that does things that both do. Whatever OO promises, code is not yet "easy as lego blocks" to assemble. You need very well thought-out constraints to mix code in a meaningful way - any self-modifying program would need a small, hard-to-modify kernel that would take care of the mixing mechanism. Nobody knows how to design such a kernel correctly, or what exactly to include as 'genes' (mixable code modules). Computational biology (and biology itself) are hard at work on this problem.
But mixing blocks would not be enough. A successful system would need to build new, unseen blocks by modifying existing ones -- or starting from scratch. How many different things can you say in 20 words? How many of these things make any sort of sense? And how many of those require a very, very specific context to fit into? The way that evolution can sort this out is by, very slowly, building things that sort-of, kind-of get the job done. However you look at it, there will be huge amounts of trial-and-error involved.
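To put a rough number on the "20 words" question, here is a back-of-the-envelope Python sketch. The vocabulary size of 10,000 words and the rate of one billion candidates checked per second are assumptions picked purely for illustration.

```python
def sequence_count(vocabulary_size, length):
    """Number of distinct word sequences of the given length."""
    return vocabulary_size ** length

def years_to_enumerate(count, checks_per_second):
    """Years needed to try every sequence at a fixed checking rate."""
    seconds_per_year = 365 * 24 * 60 * 60
    return count / (checks_per_second * seconds_per_year)

total = sequence_count(10_000, 20)            # assumed 10,000-word vocabulary
print(total == 10 ** 80)                      # True
print(years_to_enumerate(total, 10 ** 9))     # on the order of 1e63 years
```

Nearly all of those sequences are nonsense, which is why blind trial-and-error over them is so expensive.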
And another problem is that of intelligence "scale". Imagine a super-self-modifying internet worm. The ability to probe and infect does not automatically lead to self-consciousness. There are many, many evolutionary steps from bacteria (very good at self-modification and breeding) to humans. And the current installed base of Internet-connected computers and their "stability" (the time-frame during which a given system remains 'constant') is tiny in comparison to the resources that Earth's organisms have had at their disposal for evolutionary purposes. Yes, computers are way fast and this can compensate for some parallelism issues. But I think that emerging AI is still very, very far off.
Re:This is slashdot by phulegart · 2008-07-30 00:33 · Score: 5, Interesting

Hey, we all know the unspoken rules... if you read the article, you aren't supposed to post... and if you post, you aren't supposed to read the article. That's how a million geeks can slam a site from a Slashdot link, because there surely aren't a million posts in the thread of discussion about the same article.
Sorry about crossing the 30 word barrier though, and all the pain I caused those who have read this far...

--
"I love deadlines. I love the whooshing sound they make as they fly by." -D. Adams
Re:Assuming that Google could reach consciousness by ShieldW0lf · 2008-07-30 00:39 · Score: 4, Interesting

What makes your comment really funny is, the Commodore didn't use its CPU for everything and connect to dumb IO devices. It had a good deal more intelligence in its various components, keeping the load on the CPU low in the same way SCSI drives don't tax the CPU like ATA does. Which is how humans work... most data doesn't ever make it to the brain, but is pre-filtered by our organs, and most complex co-ordination exhibited by our bodies is not directly orchestrated by our brain, but through various biological dumb circuits.
The Commodore 64 had more in common with how humans work than modern computers do. I expect that once we begin grappling with the "avalanche of cores" problem in a meaningful way, modern computers will begin to be programmed in a fashion more reminiscent of how biological systems work.

--
-1 Uncomfortable Truth
Re:Well by pdwalker · 2008-07-30 00:42 · Score: 3, Interesting

Have you considered the Firefox extension Googlepedia? It will present you a split screen with the Google search results on the left, and the Wikipedia results on the right.
Very useful.
Re:Google's information gathering techniques. by Janos421 · 2008-07-30 00:44 · Score: 2, Interesting

If you refer to the "onmousedown" event, I think you get it wrong. It just informs google that you clicked on a link.
They use javascript instead of href so they can record the rank of the result you clicked on (it's a parameter of the javascript function). This would not be possible with href.
As I'm working on a FF extension which simulates search activities to protect privacy, I investigated the javascript code (to simulate clicks). AFAIK, they do not record other events than clicks. I have made a couple of captures, but let me know if I missed something. Furthermore, they do not obfuscate code, I think they just want to reduce the size of the code to reduce bandwidth consumption.

Anyway, if you worry about privacy, you might:
+ Block google cookies (google-analytics, safebrowsing, adsense, ...)
+ Use a query obfuscation tool (either the one I am working on or TrackMeNot)
Ockham's razor by Anonymous Coward · 2008-07-30 00:48 · Score: 1, Interesting

If you know what selects for intelligence, by all means post it here
God.
Re:This is slashdot by Kelbear · 2008-07-30 01:39 · Score: 3, Interesting

It had a bunch of interesting excerpts, but overall I found a parallel to the "mother earth" concept.
Humans love to find patterns, it makes it easier to conceptualize and organize information. We look at our environment and find a wealth of information, when we expand our vision to include entire ecologies we find interaction between entities, and when we expand enough we become lost in the vastness of it all.
So how can we characterize all this information? We see the earth as a living organism, with a system of self-correcting processes that help sustain its "life". A predator evolves a new advantage and the prey evolves a new defense. Overly successful species eat themselves to extinction or become eaten by a predator who is flooded with a bountiful food source.
In order to capture this ongoing balancing act, we just call the pattern "life". But the exercise is left to the reader to determine the difference between the natural order and the human concept of "life as we know it". /. can refer to Riker's arguments against Data as a living life form. Pop open the back-plate of Google's head and switch it off. So long as the process requirements are in place, Google can function, but with us, once we die we are irreparably dead even when we bring the body back to life.
Gmail invites.. Social Network ping/mapping? by moorley · 2008-07-30 01:45 · Score: 2, Interesting

One thing I never understood and would "drool" over the information with morbid curiosity is how they did the gmail rollout.
You had to be invited in. I think you still do. That means to get what most of us finally have you had to have someone invite you.
That chronological tree of who is connected to whom would be pretty interesting data. Who is friends with whom? How long did it take to propagate?

--
"Don't fear death... fear not living..." -me :)
Re:Assuming that Google could reach consciousness by mrogers · 2008-07-30 02:46 · Score: 3, Interesting

If you know what selects for intelligence, by all means post it here; I've asked every biology teacher I've had since 9th grade and never gotten a reasonable answer.
Maybe you should ask a more specific question. :-) If "What selects for intelligence?" means roughly the same as "Which survival problems can be solved by intelligence?" then the answer is "Pretty much all of them."
Let's take a fairly broad definition of intelligence:
1. The ability to learn which actions, in which states, lead to which outcomes
2. The ability to predict the outcomes of actions without performing them
3. The ability to compose predictions (action A will lead to state X; in state X, action B will lead to state Y)
Evolution takes thousands of generations to solve problems, good solutions can only spread by reproduction, and it can only search the solution space greedily (each step of the solution must be neutral or beneficial compared to the previous step). Intelligence solves problems in a single generation, solutions can spread by imitation, and solutions can include steps away from the goal. Where evolution has trial and error, intelligence has thought experiments - falling into a river costs your life, but imagining falling into a river only costs a few calories.
On the other hand, intelligence isn't free - for part 1 of the definition above you need sensors, a state classifier and a memory. For part 2 you also need an imagination. For part 3 you need a way of finding paths through a network of imaginary states. So a hard-coded solution might be cheaper if the solution space is static on an evolutionary timescale. But every organism's environment contains other organisms that help to define the solution space, and those organisms are evolving, changing the solution space on an evolutionary timescale. So it may turn out that there are few problems for which a hard-coded solution is better than an intelligent solution; perhaps non-intelligent organisms will only survive in niches where intelligence is impossible to implement (eg microorganisms) or unnecessary for survival (eg domesticated animals and plants).
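The contrast between greedy search and composed predictions can be sketched in a few lines of Python. The one-dimensional "fitness landscape" below is made up for illustration, and both functions are deliberately simplistic stand-ins for evolution and for planning, not models of either.

```python
from collections import deque

def greedy_climb(heights, start):
    """Evolution-like search: only take neutral or beneficial steps."""
    position = start
    visited = {start}
    while True:
        options = [n for n in (position - 1, position + 1)
                   if 0 <= n < len(heights)
                   and n not in visited
                   and heights[n] >= heights[position]]
        if not options:
            return position          # stuck on a local peak
        position = max(options, key=lambda n: heights[n])
        visited.add(position)

def plan_path(heights, start):
    """Planner: chain predicted steps (breadth-first) to the highest state,
    even when the path runs through worse states first."""
    goal = max(range(len(heights)), key=lambda n: heights[n])
    paths = {start: [start]}
    queue = deque([start])
    while queue:
        position = queue.popleft()
        if position == goal:
            return paths[position]
        for n in (position - 1, position + 1):
            if 0 <= n < len(heights) and n not in paths:
                paths[n] = paths[position] + [n]
                queue.append(n)
    return [start]

landscape = [3, 4, 2, 1, 5, 9]       # hypothetical fitness values
print(greedy_climb(landscape, 0))    # 1: stuck on the small peak
print(plan_path(landscape, 0))       # [0, 1, 2, 3, 4, 5]: crosses the valley
```

The planner pays for its result with a model of the landscape and the memory to search it, which is the cost side of the trade-off described above.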
Paradox of artificial intelligence by kyliaar · 2008-07-30 07:37 · Score: 2, Interesting

The story contains some interesting observations such as, 'This was the paradox of artificial intelligence: any system simple enough to be understandable will not be complicated enough to behave intelligently; and any system complicated enough to behave intelligently will not be simple enough to understand.'
I have a problem with this 'interesting observation'. This is basically asserting that the system behind human intelligence is too complex to understand. That assertion rests on engineers, subscribing to modern psychological and biological theories, assuming that the physical components of the body comprise the total sum of that which is human and thus of that which is intelligence.
I would posit that anything not sufficiently understood looks complex. Greater understanding brings greater simplicity. If you have a branch of research or knowledge that is leading into greater and greater complexities, you can be assured that there is basic data in the area that is either missing or is false.
I think the concept of intelligence, artificial or otherwise, could be easily understood if those studying intelligence applied more science and less reliance on proven 'authorities' and 'established' patterns of scientific thought.
Today's scientists are taught in an environment that stresses the importance of known data over a self-determined approach to phenomena. What crazy world is it where reading other people's papers and writing your own, without doing any actual real world observation, can be called research?