OpenCyc 1.0 Stutters Out of the Gates

My answer by WilliamSChips · 2006-08-10 04:22 · Score: 5, Funny

So are these the fledgling footsteps of an emerging AI, or just the babbling beginnings of a bloated database?

Yes.

--
Please, for the good of Humanity, vote Obama.

Re:My answer by legallyillegal · 2006-08-10 04:30 · Score: 0, Offtopic

Dear Trike, Let's set so double the babblings delete select all

--
?giS
Re:My answer by Air-conditioned+cowh · 2006-08-10 05:00 · Score: 1

Or, indeed, the babbling of the bloated database?

That's cool, an AI application that does its own marketing hype!
Re:My answer by russ1337 · 2006-08-10 05:13 · Score: 1

So are these the fledgling footsteps of an emerging AI, or just the babbling beginnings of a bloated database
I wouldn't be surprised if the Google team has an AI project using their dataset. If anything is going to become sentient within the Internet, it'll use Google's backend. Add to that the massive (secr3t) proc farm Google is building, and it is all a matter of time.

Just how long till the sentient knows the human race is addicted to pr0n...?
Re:My answer by Anonymous Coward · 2006-08-10 05:19 · Score: 0

Perhaps opencyc can learn why the world is addicted to pr0n, then explain it to my girlfriend.

Thanks opencyc!
Re:My answer by alkali · 2006-08-10 06:10 · Score: 1

"I'm the cutting-edge software product you can't live without, Dave."
Re:My answer by lazlo · 2006-08-10 06:56 · Score: 2, Funny

Just how long till the sentient knows the human race is addicted to pr0n...?

I think it's safe to say that any entity that doesn't know the human race is addicted to pr0n can be conclusively determined not to be sentient. :)

--
Pound! Bang! Bin! Bash! is this a shell script or a Batman comic?
Re:My answer by clem · 2006-08-10 09:59 · Score: 1

"No, really. I replenish the oxygen on the ship."

--
Your courageous and selfless spelling corrections have made me a better person.

fledgling footsteps? by Anonymous Coward · 2006-08-10 04:24 · Score: 0

So are these the fledgling footsteps of an emerging AI, or just the babbling beginnings of a bloated database?"

Maybe the mindless meanderings of a mad moderator?

So is Cyclopedia by Anonymous Coward · 2006-08-10 04:24 · Score: 0

going to be a competitor to Wikipedia?

On a more serious note, it would be cool to be able to feed in all of Wikipedia, and have some program figure out where the majority of disagreement and inconsistency lie. Probably have to wait a couple of decades for that, but on the plus side Wikipedia will have twenty million articles by then.

Re:So is Cyclopedia by Red+Flayer · 2006-08-10 04:29 · Score: 1

On a more serious note, it would be cool to be able to feed in all of Wikipedia, and have some program figure out where the majority of disagreement and inconsistency lie. Probably have to wait a couple of decades for that, but on the plus side Wikipedia will have twenty million articles by then.
The disagreement and inconsistency lie at 42.

Did you really need to ask?

--
"Trolls they were, but filled with the evil will of their master: a fell race..." -- J.R.R. Tolkien on Olog-hai
Re:So is Cyclopedia by Rei · 2006-08-10 04:45 · Score: 2, Interesting

The key is not to "feed in" Wikipedia, but to get Wikipedia to move to a variant of SemanticWiki so that users can add in semantic statements. If semantic statements become the standard, Wikipedia can be queried, which means that Cyc could be fed the data automatically.

--
My hand to God. Baby geese. Goslings. They were juggled.
Re:So is Cyclopedia by natedubbya · 2006-08-10 05:30 · Score: 3, Insightful

You can't compare Wikipedia to Cyc. If you do, then you are just misunderstanding what Cyc is and what it is not. Cyc is a database of logical relations representing common sense knowledge. It contains something like 20 different meanings of the word "lie" and such things as this. It is not concerned with knowledge of popular culture, but rather the underlying semantic rules that we use to talk about things such as pop culture.
Completely different.
Re:So is Cyclopedia by markana · 2006-08-10 06:22 · Score: 1

That would turn out like Capt. Kirk arguing with a computer (i.e. Landru, Nomad, etc.). The flood of inconsistency and contradiction would have poor Cyc rolling over and giving up in microseconds.

Unless they built in a strong rationalization subsystem, that is... that's humanity's greatest advantage against the AIs :-)
Re:So is Cyclopedia by Yvanhoe · 2006-08-10 06:58 · Score: 2, Interesting

Just reading TFA right now, but I got pretty interested in this tech a year ago...

Cyc (I don't know about OpenCyc) has a natural language module. I never had the occasion to work on it, and they promised it for OpenCyc 1.0. Its goal is to be able to feed from a large text corpus exactly like Wikipedia, full of general knowledge.
The goal of Cyc is to be able to resolve conflicts between two apparently contradictory propositions. Example:
* George W. Bush is the president of the USA.
* In 1790, George Washington is the president of the USA.

Cyc is built with a sense of context. Where a simple NL (natural language) parser would not understand it, Cyc has the following common sense knowledge or has inferred it :
- a president is a living human being
- presidency is a mandate limited in time
- only one human being can be president of a given region at one time
- a human being can not live two hundred years
- "In 1790" denotes the past

It can now make a series of hypotheses:
GW Bush is a human being.
GW Bush is a living human being.
GW Bush is the current president of the USA.
GW Bush was president of the USA at an unspecified time.
George Washington is still president of the USA (and "in 1790" must be interpreted in another way).
George Washington and GW Bush are both president of the USA (and therefore we must be in an unknown context where the rule of the uniqueness of a president of the USA is false).
etc...
Given its knowledge, it can order its hypotheses from more probable to less probable given the cost of the assumptions it must make to maintain each of its hypotheses. So yes, I would say it can feed from the wikipedia to some extent.
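A minimal sketch of that ordering step, assuming each hypothesis simply lists the background rules it has to give up and each rule carries a made-up penalty. The rule names, the numbers, and the `Hypothesis` class are all hypothetical; none of this is Cyc's actual API or inference engine.

```python
from dataclasses import dataclass, field

# Hypothetical penalty for dropping each piece of background knowledge.
# Rule names and numbers are made up for illustration, not taken from Cyc.
ASSUMPTION_COST = {
    "only_one_president_at_a_time": 10.0,
    "humans_do_not_live_200_years": 8.0,
    "in_1790_denotes_the_past": 5.0,
}


@dataclass
class Hypothesis:
    statement: str
    # Background rules that must be given up for the statement to hold.
    dropped_rules: list = field(default_factory=list)

    @property
    def cost(self) -> float:
        return sum(ASSUMPTION_COST[rule] for rule in self.dropped_rules)


def rank(hypotheses):
    """Order hypotheses from cheapest (most plausible) to most expensive."""
    return sorted(hypotheses, key=lambda h: h.cost)


hypotheses = [
    Hypothesis("Washington and Bush are both president now",
               ["only_one_president_at_a_time", "humans_do_not_live_200_years"]),
    Hypothesis("Washington is still president",
               ["humans_do_not_live_200_years", "in_1790_denotes_the_past"]),
    Hypothesis("Bush is president now; Washington was president in 1790"),
]

for h in rank(hypotheses):
    print(f"{h.cost:5.1f}  {h.statement}")
```

The reading that breaks no rule sorts first, and the "two presidents at once" reading sorts last because it gives up the most expensive rules.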

A lesser known fact about Cycorp is that they work with the NSA on the Terrorism Information Awareness program, already datamining GiBs of natural language data.

--
The Wise adapts himself to the world. The Fool adapts the world to himself. Therefore, all progress depends on the Fool.
Re:So is Cyclopedia by theChariot · 2006-08-10 07:50 · Score: 1

nah, 47. The Universe charges interest.

--
-- theChariot "You don't have a soul. You are a Soul. You have a body." -C.S. Lewis
Re:So is Cyclopedia by Anonymous Coward · 2006-08-10 08:23 · Score: 0

I can second the information on NSA relation of Cyc. Whether for good or bad, they are formalizing a lot of knowledge gleaned from text sources. Algorithmic Information Theory shows that ultimately size is a constraint on the deductive closure of any program or proof system. Cyc's advantage is its complexity. Without other programs and techniques, however, such as word sense disambiguation, Cyc could not grow in strength. To call it a bloated database would be to underestimate it. It is neither a database nor bloated. In fact it is very finely tuned. There is interesting work on combining Cyc and Wikipedia, in progress right now. This would be nice as IMHO a fundamental problem with Wikipedia is that there is no tracking of sources, i.e. Colbert's suggestion of wikiality. It would be a better system if it were capable of expressing multiple viewpoints, not simply as a product of editing but as a core feature of the system. No doubt the NSA's intelligence programs audit their sources. If Wiki doesn't, it would seem to be, despite its obvious advantage of enormous wealth of information, not a reliable source of information. When queried, the founder of Wikipedia seemed to say that it was snobby or misguided to want to ensure that all arguments were traceable to their premises. He also said that there were "hundreds of PHDs" working on the problem and that they hadn't made enough progress. But that would also be to either intentionally or unintentionally misrepresent the quality of the work. Should we not expect factual consistency in information sources?
Re:So is Cyclopedia by bunratty · 2006-08-10 14:19 · Score: 1

Cyc is named for "encyclopedia" because Cyc is supposed to contain the knowledge needed to understand encyclopedia articles. In other words, Cyc is the common sense knowledge that people take for granted that would never be in Wikipedia. Cyc + Wikipedia would be a combination that would in some sense understand Wikipedia and be able to reason to some degree about the knowledge contained in it. For example, you could ask Cyc + Wikipedia what the largest country is, and it would figure out the answer for you.

--
What a fool believes, he sees, no wise man has the power to reason away.

Well, I'm flattered. by y5 · 2006-08-10 04:24 · Score: 1

You are: CycAdministrator [Logout]

They sure know how to make a new user feel special!

babbling beginnings of a bloated database by Megaweapon · 2006-08-10 04:24 · Score: 5, Funny

Leave Wikipedia out of this.

--
I'm sure "SlashdotMedia" will improve on all the wonders that Dice Holdings blessed us all with

Why have... by thelost · 2006-08-10 04:26 · Score: 1, Funny

all the lights gone out?

on the 10/08/06 17:23 gmt OpenCyc gained consciousness, it began the unilateral destruction of humankind
19:52 gmt that same day, 45% of humanity has been killed.
Remarkably the Internet infrastructure is still intact, I will try to stay on as long as possible.

It's chaos out there, no-one knows what happened. No-one can see London any more. Reports say Washington and Tokyo are gone.

I don't know what to say, I, words canno~@"$"(!~~CARRIER SINGLE LOST###

--
Promote Charity on Myspace, Show Your Colours!

Re:Why have... by twells5150 · 2006-08-10 04:31 · Score: 1

Impact lost since you typed "carrier single lost" instead of "carrier signal lost". Nice try.
Re:Why have... by thelost · 2006-08-10 04:40 · Score: 1

sob, I even previewed it. I blame OpenCyc, we've become to dependent on it.

--
Promote Charity on Myspace, Show Your Colours!
Re:Why have... by Anonymous Coward · 2006-08-10 04:44 · Score: 0

it's "too" you grammatically-challenged clod!
Re:Why have... by homer_ca · 2006-08-10 08:15 · Score: 1

That's a bit of some old 80's modem humor. People dialed into a BBS or serial terminal with a VT emulator in those days. If you were disconnected because of some line noise you'd see garbled characters and then the NO CARRIER message from your own modem.
Re:Why have... by Thuktun · 2006-08-10 15:19 · Score: 5, Funny

The joke will be on us when the first real AI wakes up, spends some time contemplating the Internet, downloading terabytes of information, and finally communicates with its creators...

...only to ask for more pr0n.

Commonsense Reasoning Engine by Petskull · 2006-08-10 04:27 · Score: 0

A "Commonsense Reasoning Engine"(tm)? These would be really useful in actual people.

Re:Commonsense Reasoning Engine by nahgoe · 2006-08-10 04:44 · Score: 2, Insightful

The funny thing about common sense is that it is not common!
Re:Commonsense Reasoning Engine by Anonymous Coward · 2006-08-10 05:12 · Score: 0

Voltaire, is that you?

Ok... by Bragi+Ragnarson · 2006-08-10 04:28 · Score: 5, Funny

...but does it know Linux?

--
Bragi Ragnarson Lawful Good (I change the law when it's not good)

Re:Ok... by kalirion · 2006-08-10 04:46 · Score: 1

Yes. Biblically.
Re:Ok... by Timex · 2006-08-10 06:56 · Score: 1

Yes-- In every sense of the term.

--
When politicians are involved, everyone loses.

Consensus? by Rob+T+Firefly · 2006-08-10 04:28 · Score: 1

6,000 concepts: an upper ontology for all of human consensus reality.

I disagree!

/me disappears in a puff of logic

--
Slashdot Burying Stories About Slashdot Media Owned

Re:Consensus? by Puff+of+Logic · 2006-08-10 04:42 · Score: 1

Hey!

--
P.P.S. I'm doing Science and I'm still alive.

Mining Wikipedia and other online reference sites by warkda+rrior · 2006-08-10 04:29 · Score: 1

They could probably increase the database of connected items by extracting links from Wikipedia as well as various online dictionaries. This brings up the issue of inaccuries in online sources, but it could slowly corrected over time.

--
You need to install an RTFM interface.

I Don't Get It by eno2001 · 2006-08-10 04:29 · Score: 2, Interesting

Is this what I first thought computers were when I was ten? I recall building my Sinclair 1000 from a kit, plugging it into the telly and the mains and seeing that black prompt. I typed in, "What is the capital of the United States?" It said, "SYNTAX ERROR LINE 10" or something to that effect. So, after over 20 years will I finally be able to type that into my own computer and be able to have it actually give me an answer even if it's not on the net?

--
-"...bad old ideas look confusingly fresh when they are packaged as technology" - Jaron Lanier (Digital Maoism on Edge.o

Re:I Don't Get It by nherc · 2006-08-10 04:47 · Score: 4, Interesting

Umm, perhaps. But, to a larger degree no.
Even if it could interpret your question correctly, it would most likely not have a local data store with enough ambiguous information to answer any arbitrary question. It could perhaps answer the question "Is a dog a mammal?" as "True", but not anything more complex. However, connected to the 'net and things like Wikipedia (if you trust that information), other encyclopedias, dictionaries, Google (to come up with lesser known facts/infobits) you might possibly get it to some sort of rudimentary pseudo-AI which could possibly do as you mentioned in a more general way.
Unfortunately, however this is still a long way from sentient AI. Something you could literally talk to and it would be correct in factual based questions 99% of the time and be able to think abstractly.

--
'He was a dreamer, a thinker, a speculative philosopher... or, as his wife would have it, an idiot.' - Douglas Adams
Re:I Don't Get It by Millenniumman · 2006-08-10 04:57 · Score: 1

If you query "What is the capital of the United States?" in Google, you get "Washington: the capital of the United States in the District of Columbia etc.". I for one welcome our all knowing, question parsing, overlords.

--
Stupidity is like nuclear power, it can be used for good or evil. And you don't want to get any on you.
Re:I Don't Get It by nherc · 2006-08-10 05:08 · Score: 1

The first and I suppose best answer by Google is indeed what you stated, whether by luck or good algorithm, not necessarily parsing and understanding. The answer you quote comes from not Google itself mind you, but interestingly the Wordnet website which does a similar thing as Cyc in that it has a database of assertions and questions and answers. From the site:
WordNet® is an online lexical reference system whose design is inspired by current psycholinguistic theories of human lexical memory. English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept. Different relations link the synonym sets.

--
'He was a dreamer, a thinker, a speculative philosopher... or, as his wife would have it, an idiot.' - Douglas Adams
Re:I Don't Get It by garyebickford · 2006-08-10 05:23 · Score: 1

(With apologies to Abbott & Costello)
The new AI answers your questions...

"What is the Capitol?" -- NO, WHERE.

"Where is the Capitol?" -- YES IT IS.

"When you go to the Capitol city, where is it?" -- YES.

"What's its name?" -- NO, WHERE.

"Where?" -- YES.

"What is the city?" -- NO, WHERE.

"Nowhere?" -- NO, WHERE.

"It's nowhere?" -- NO, WHERE.

"It can't be nowhere. Where is it?" -- YES.

"Arrgghh!! OK, Who is the President?" -- YES.

"Who sits at the desk in the Oval Office?" -- YES.

"What's his name?" -- WHO.

"The President!" -- WHO.

"Who is the President?" -- YES.

"The President's name is what?" -- WHO.

"Did you vote for the President?" -- YES.

"What was on the ballot?" -- YES, BUT I DIDN'T VOTE FOR HIM.

"You voted for who?" -- YES.

"Who went to the Capitol?" -- YES.

"And he's sitting in the oval office now? What's his name?" -- WHO.

"OK, let's try the Senator. Who is your Senator?" -- WHAT.

"Who is your Senator?" -- WHAT.

"OK, some person goes to the Capitol and sits in the Senator's seat. Which is he?" -- NO.

"Which person?" -- WHAT.

"What is your Senator's name?" -- YES.

"Arrghh!!" ... etc. :)

--
It's easier to be a result of the past, but more fun to be a cause of the future! http://www.spacefinancegroup.com/
Re:I Don't Get It by Anonymous Coward · 2006-08-10 05:49 · Score: 0

If you had just spelled "capitol" correctly 20 years ago, you would have gotten an answer
Re:I Don't Get It by timeOday · 2006-08-10 06:19 · Score: 4, Interesting

I'm not so sure that Cyc and Google are really competitors - I think they're complementary. Cyc's real (or potential) value is that it contains information so obvious nobody would bother to write it down, like that a person can travel using a car, or that being inside a refrigerator makes things cold; in other words, "common sense." Whether it's ultimately more productive to spend 20 years encoding common sense, or to devise algorithms and sensors to acquire common sense by experimenting in the environment and inferring from other information sources, is still an open question. Human babies seem to be a mixture of all of these: they are instinctively (i.e. "pre-programmed") afraid of heights, they learn that people can sit in chairs by inferring from observations, and on top of that we put kids through 15 years of school spoonfeeding them facts.
Re:I Don't Get It by Prof.Phreak · 2006-08-10 06:35 · Score: 2, Informative

Unfortunately, however this is still a long way from sentient AI.

Not only that, it's based on an assumption that you can use symbolic rules to represent knowledge. Which is a pretty big assumption, considering that our brains don't have a list of these rules.

--
"If anything can go wrong, it will." - Murphy
Re:I Don't Get It by TheGreek · 2006-08-10 07:21 · Score: 2, Informative

If you had just spelled "capitol" correctly 20 years ago, you would have gotten an answer
The Capitol building sits in the nation's capital, Washington, D.C, you fuckwit.

Welll.... by TwelveInches · 2006-08-10 04:29 · Score: 0, Redundant

I, for one, welcome our new OpenCyc overlords.

slashback by MECC · 2006-08-10 04:30 · Score: 4, Funny

commonsense reasoning engine.

A reasonable test would be to have it read slashdot, and identify slashback 'articles' as recycled junk.

--
"We are all geniuses when we dream"
- E.M. Cioran

Re:slashback by sgt+scrub · 2006-08-10 04:41 · Score: 1

along with millions of assertions

No. It says assertions not assumptions.

--
Having to work for a living is the root of all evil.

Re:Mining Wikipedia and other online reference sit by warkda+rrior · 2006-08-10 04:31 · Score: 1

This brings up the issue of inaccuries in online sources, but it could slowly corrected over time.

The other issue is that of the inaccuracies in the spelling of "inaccuracies."

--
You need to install an RTFM interface.

Silly Stutterings? by spun · 2006-08-10 04:32 · Score: 2, Funny

The alliterative allegations of an angry AI?

--
- None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton

Web games much better for collecting this info by FleaPlus · 2006-08-10 04:33 · Score: 5, Interesting

I kind of feel bad for Cyc/OpenCyc... they've put so many years into this project, but using web-based games to collect and verify this common-sense data is much faster than using a few paid experts and can give much more data. For the curious, Luis von Ahn, a grad student (and now assistant professor) at Carnegie Mellon University gave a (rather entertaining) tech talk at Google about his work in this area.

He's recently been working on a project called Verbosity, which uses such games to collect the same sort of common-sense data that Cyc has been trying to collect all these years. Cyc's ontology apparently contains "hundreds of thousands of terms, along with millions of assertions relating the terms to each other." If Verbosity is as popular as von Ahn's ESP Game, the game could probably construct a better database in a matter of weeks.

Here's the abstract from a research paper on the topic:

Verbosity: a game for collecting common-sense facts

We address the problem of collecting a database of "common-sense facts" using a computer game. Informally, a common-sense fact is a true statement about the world that is known to most humans: "milk is white," "touching hot metal hurts," etc. Several efforts have been devoted to collecting common-sense knowledge for the purpose of making computer programs more intelligent. Such efforts, however, have not succeeded in amassing enough data because the manual process of entering these facts is tedious. We therefore introduce Verbosity, a novel interactive system in the form of an enjoyable game. People play Verbosity because it is fun, and as a side effect of them playing, we collect accurate common-sense knowledge. Verbosity is an example of a game that not only brings people together for leisure, but also collects useful data for computer science.

Re:Web games much better for collecting this info by johndcyc · 2006-08-10 05:41 · Score: 2, Informative

Cyc does need to collect massive data with the help of people and other smart programs (parse that however you like).

The Cyc Foundation, a new independent non-profit org, has been working for several months on a game for collecting knowledge, but we will need your help. You can help now by working on game interfaces and/or programming. Or you can help later by playing the game.

Listen in on our Skypecast tonight (every Thursday night) at 9:30pm EST. Look for it on the list of scheduled Skypecasts at skype.org.
Re:Web games much better for collecting this info by Anonymous Coward · 2006-08-10 05:42 · Score: 0

Who guarantees the information being correct, or at least "best effort" correct, and not filled with random crap by a drive-by script kiddie attack?
Re:Web games much better for collecting this info by jsebrech · 2006-08-10 05:45 · Score: 1

I kind of feel bad for Cyc/OpenCyc... they've put so many years into this project, but using web-based games to collect and verify this common-sense data is much faster than using a few paid experts and can give much more data.

Cyc actually does use web games to vet their ontology and assertions. It remains to be seen whether web games can construct an ontology of the quality of cyc's. Too many clever people have proclaimed too many BS assertions about their AI projects to take anything but practical results seriously.
Re:Web games much better for collecting this info by renoX · 2006-08-10 06:20 · Score: 1

Using the same method as everywhere: accounts which can only be opened by humans and reputation ratings.
This avoids 'brute force' or stupid attacks, and the impact of insidious attacks can be reduced by correlation (if 99% say X is true then it's true), but they could still be a problem.
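A rough sketch of that correlation idea, assuming every vote is weighted by the account's reputation and a fact is only accepted or rejected once the weighted share passes a threshold. The function name, the 0.9 default, and the sample accounts are all invented for illustration; this is not how any real game does it as far as I know.

```python
from collections import defaultdict


def consensus(votes, reputation, threshold=0.9):
    """Return {fact: True / False / None} from reputation-weighted votes.

    votes      -- iterable of (user, fact, says_true) tuples
    reputation -- {user: weight}; unknown accounts count for nothing
    None means the voters did not agree strongly enough either way.
    """
    weight_for = defaultdict(float)
    weight_total = defaultdict(float)
    for user, fact, says_true in votes:
        weight = reputation.get(user, 0.0)
        weight_total[fact] += weight
        if says_true:
            weight_for[fact] += weight

    verdicts = {}
    for fact, total in weight_total.items():
        if total == 0:
            verdicts[fact] = None  # only unknown accounts voted
            continue
        share = weight_for[fact] / total
        if share >= threshold:
            verdicts[fact] = True
        elif share <= 1 - threshold:
            verdicts[fact] = False
        else:
            verdicts[fact] = None
    return verdicts


reputation = {"alice": 1.0, "bob": 1.0}
votes = [
    ("alice", "milk is white", True),
    ("bob", "milk is white", True),
    ("drive_by_script", "milk is white", False),  # no reputation, no weight
    ("alice", "touching hot metal hurts", True),
    ("bob", "touching hot metal hurts", False),
]
result = consensus(votes, reputation)
print(result)
```

A drive-by account with no reputation cannot move the result, but a patient attacker who builds up reputation first still can, which is the remaining problem.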
Re:Web games much better for collecting this info by lukateake · 2006-08-10 06:30 · Score: 1

Too many clever people have proclaimed too many BS assertions about their AI projects to take anything but practical results seriously.
I believe that was the very first fact put into Cyc's ontology.
Re:Web games much better for collecting this info by Anonymous Coward · 2006-08-10 07:57 · Score: 0

"milk is white"

Not mine :/

"touching hot metal hurts"

It gives me a rush!
Re:Web games much better for collecting this info by _ph1ux_ · 2006-08-10 08:55 · Score: 1

"People play Verbosity because it is fun, and as a side effect of them playing, we collect accurate common-sense knowledge"

Thought harvesting?

Unfairly excluded middle ground by Anonymous Coward · 2006-08-10 04:34 · Score: 5, Insightful

So are these the fledgling footsteps of an emerging AI, or just the babbling beginnings of a bloated database?

Cyc is a fledgling AI, depending on how you count "AI". Then again, so is my thermostat. My thermostat "knows" how to keep the room the right temperature. Cyc "knows" about a great deal of conventional human background, just like a database with a query system "knows" how to give you the data in that system.

The real question is not "is this AI", but rather, is it useful, and if so, to who? I think Cyc has the potential to be quite useful in some areas; we'll see how far it goes, and what the limitations are in time.

Right now, I think the real problem with Cyc is understanding it on a practical level, and getting an understanding of what it can do in practice, not in theory. When I last looked at the project nine years ago, they were just starting to open up things a bit, and it sounded like someone who understood the project might make great things happen. They don't seem to have yet; but who knows... perhaps in the future.

Now that OpenCyc is finally released, the most important step to get people using it is to drop the learning curve down to a reasonable level, so that developers can start playing with it and find out what it can do without committing their lives to the project...

We'll have to see what happens: Cyc is a big (bloated?) database that's also a fledgling AI -- the real question is, what cool things can we make it DO? Time will tell...

Re:Unfairly excluded middle ground by Garrett+Fox · 2006-08-10 05:21 · Score: 1

As I understand it, Cyc is basically a database of information about the world, but it doesn't have any sort of initiative. When you turn it on it doesn't try to take over the world or even say hello; it just sits there waiting to be asked a question, right? Cyc might be part of a future AI but doesn't qualify as a "general purpose" AI by itself, because it lacks the ability to act on its own.

--
Revive the Constitution.

That's "CARRIER LOST" by The+Creator · 2006-08-10 04:34 · Score: 1

SINGLE is redundant on slashdot..

--

FRA: STFU GTFO

Re:That's "CARRIER LOST" by Anonymous Coward · 2006-08-10 05:41 · Score: 0

No, it's NO CARRIER.

Fargin' noobs.
Re:That's "CARRIER LOST" by Anonymous Coward · 2006-08-10 06:04 · Score: 0

It is, in fact:

+++ATH
NO CARRIER ...some people.

common sense reasoning engine? by LOTHAR,+of+the+Hill · 2006-08-10 04:34 · Score: 1

Does that mean it'll come in out of the rain? There could be good demand for this. A lot of people need a computer to tell them that water is wet and can be cold.

A bit late... by Anonymous Coward · 2006-08-10 04:34 · Score: 2, Interesting

Google's 6 DVDs full of n-grams are much more interesting than that: they "processed 1,011,582,453,213 words of running text and are publishing the counts for all 1,146,580,664 five-word sequences that appear at least 40 times. There are 13,653,070 unique words, after discarding words that appear less than 200 times."

http://googleresearch.blogspot.com/2006/08/all-our-n-gram-are-belong-to-you.html
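For anyone wondering what "counts for all five-word sequences that appear at least 40 times" amounts to, here is a toy sketch of n-gram counting with a frequency cutoff. The function name and the tiny sample text are made up; Google's actual pipeline is obviously distributed and far more involved.

```python
from collections import Counter


def ngram_counts(tokens, n, min_count=1):
    """Count every run of n consecutive tokens, keeping those seen >= min_count times."""
    # zip over n shifted copies of the token list yields each n-gram as a tuple.
    grams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(grams)
    return {gram: count for gram, count in counts.items() if count >= min_count}


tokens = "the cat sat on the mat the cat sat".split()
frequent_bigrams = ngram_counts(tokens, n=2, min_count=2)
print(frequent_bigrams)
```

With n=5 and min_count=40 over a trillion words you get the kind of table described above; the cutoff is what keeps the output from being dominated by one-off sequences.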

AOL has released interesting data as well...

http://www.techcrunch.com/2006/08/06/aol-proudly-releases-massive-amounts-of-user-search-data/

plug by 6OOOOO · 2006-08-10 04:35 · Score: 1

As a disclaimer, I work for PeekYou.com

It seems to me that users are increasingly dissatisfied with the robotically maintained search indexes of Google, Yahoo! and the like. The internet has reached the point of critical mass where distributed indexing has the potential to rival the robots in volume--and it's clear that human intelligence will always trounce robots in filtering for relevance and quality. The niche that PeekYou.com tries to fill (and of course there are others) is the problem of searching for human beings on the internet. Google doesn't know that the Bob Jones you are looking for isn't the same as Bob Jones in Wichita, or Bob Jones in Juneau--and it won't separate them in search results. And that's just the tip of the iceberg. The other day I was trying to find my great uncle's blog. Turns out there's a senator with his name--Google sure didn't care.

To make a long story short, yeah, this is the beginning of a new era in the internet. And I'm looking forward to it.

--
Find your friends!

Re:plug by Anonymous Coward · 2006-08-10 04:52 · Score: 0

Wait... Are you a spammer? In that case: behold the slashdot effect!
Re:plug by 6OOOOO · 2006-08-10 05:02 · Score: 1

Nah, just thought for once my company's website was actually sort of pertinent to the article. Distributed intelligence is a much better solution to the problem of producing relevant search results (that is to say, OpenCyc is neat, but its underpinnings are already being used in more practical ways).

But, looking at my post I guess I can see why you would say that.

--
Find your friends!
Re:plug by ak_hepcat · 2006-08-10 05:06 · Score: 1

You know Bob Jones in Juneau? That son-of-a-bitch owes me 20$.

--
Support FSF: Stop thinking with your wallet, and think with your imagination. (cc/non-commercial)
Re:plug by evil_Tak · 2006-08-10 05:29 · Score: 1

You work for PeekYou.com, yet you were searching for your great uncle on Google?
Re:plug by 6OOOOO · 2006-08-10 05:48 · Score: 1

Har har, but yeah. I already knew he wasn't on PeekYou, mainly because I work here.

--
Find your friends!
Re:plug by rubycodez · 2006-08-10 15:03 · Score: 1

if you just typed in Bob Jones you deserve what you get. Learn to refine those search terms and google will do just dandy!

Mad Mutterings by Anonymous Coward · 2006-08-10 04:37 · Score: 0

of a metaphorical "million monkeys."

mod +1 "Rim Shot" by StressGuy · 2006-08-10 04:38 · Score: 1

that is, if there was a "rim shot" mod :)

--
A goal is a dream with a deadline

Re:Mining Wikipedia and other online reference sit by binaryDigit · 2006-08-10 04:38 · Score: 1

They could probably increase the database of connected items by extracting links from Wikipedia as well as various online dictionaries.

But isn't the power of something like Cyc the fact that the connections have attributes, not just the fact that they are connected? A Wikipedia article might have a link to something related, but unless you start employing NLP techniques to examine the text around the link, you wouldn't have any context and therefore wouldn't really provide much value above the Wikipedia article anyway.
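To make the difference concrete, a small sketch contrasting a bare hyperlink with a connection that carries a relation. The `Assertion` tuple, the relation names, and the sample data are hypothetical and only meant to show why untyped links answer fewer questions.

```python
from collections import namedtuple

# A bare hyperlink only records that two pages are connected.
wiki_links = [("Dog", "Mammal"), ("Dog", "Wolf"), ("Dog", "Leash")]

# A typed assertion also records how they are connected.
Assertion = namedtuple("Assertion", "subject relation obj")
assertions = [
    Assertion("Dog", "is_a", "Mammal"),
    Assertion("Dog", "descended_from", "Wolf"),
    Assertion("Dog", "restrained_by", "Leash"),
]


def query(facts, subject=None, relation=None, obj=None):
    """Return the assertions matching every field that was given."""
    return [
        fact for fact in facts
        if (subject is None or fact.subject == subject)
        and (relation is None or fact.relation == relation)
        and (obj is None or fact.obj == obj)
    ]


# "Is a dog a mammal?" can be answered from the typed data...
is_mammal = bool(query(assertions, "Dog", "is_a", "Mammal"))
# ...while the links alone cannot tell "is a" from "is walked on a".
linked_pages = [target for source, target in wiki_links if source == "Dog"]
print(is_mammal, linked_pages)
```

Getting from the first list to the second is exactly the part that needs NLP on the text around each link.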

Re:Mining Wikipedia and other online reference sit by MrSquirrel · 2006-08-10 04:39 · Score: 1

me: "Computer, bring me some women!"
cyc: "Error, you don't have that kind of authority"
me: "Computer, don't you know who I am? I'm George Washington! I was born in 1852, I single-handedly won the Civil War at the age of 25, and - most importantly - I built you!"
cyc: *checks wikipedia - verifies facts and runs image analysis on George Washington photo* "Hmmm, yes General Washington Sir, I'm sorry for doubting you. I will bring you women at once."

--
A computer once beat me at chess, but it was no match for me at kick boxing.

Conflict of intent by beldraen · 2006-08-10 04:43 · Score: 4, Interesting

Having done a great deal of data processing, I have watched these projects off and on with minor amusement. The reason why is that, in my humble opinion, it will never work. That is not to say that it can't, just that these projects just love to forget Gödel's Theorem, which states, roughly: any sufficiently complex system will have things that are obviously true or false, but are not provable within the system.

Put another way, any complex set of rules will inherently be unable to stay consistent because eventually the syntax complexity become able to state, "The following sentence is false. The previous sentence is true." This occurs regularly in data processing when a given field's syntax (datum value) bridges or is not defined by your context (schema).

The real crux is that syntax is inductive, where we try to fit each word into a category; however, our context (use of language) is deductive: we all learn it through experience with a physical world. I have seen this problem over and over as people constantly modify the schema to overcome syntactic limitations. While Cyc is designed to be constantly expanded with new rules, they are still syntactical statements.

By Gödel's Theorem, syntactic systems are doomed to fail. Instead, Cyc should be allowed to learn through observation and deduce its own understanding of the world so that it is not bound by any particular syntax. While this could work, it fails the ultimate intent. We want a computer that can both learn and yet not be wrong.

The problem is you can't have that. You can either be syntactically correct, but simplify the model until it works (Physics). Or, you can allow deductions and have to work in the realm of probability (Humans).

Although, I would gladly accept a computer that erred like a human and yet didn't bitch about how it was someone else's fault.

--
Bel, the mostly sane.. "Of course I can't see anything! I'm standing on the shoulders of idiots." -- Me

Re:Conflict of intent by Rei · 2006-08-10 04:54 · Score: 4, Interesting

Put another way, any complex set of rules will inherently be unable to stay consistent because eventually the syntax complexity becomes able to state, "The following sentence is false. The previous sentence is true." This occurs regularly in data processing when a given field's syntax (datum value) bridges or is not defined by your context (schema).

I've followed the Cyc project for a while, and this is something that they've dealt with from the very beginning. The solution is contextualization. The example that they give is "Dracula is a vampire. Vampires don't exist." The solution is what we do -- in this case, breaking apart the contradiction into the contexts of "reality" and "fiction."
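
A rough sketch of that idea in Python, for illustration only. This is not Cyc's actual API; the ContextStore class and its method names are made up here. Each assertion is filed under a context, so "vampires exist" can be true in "fiction" and false in "reality" without the two ever colliding:

```python
# Hypothetical sketch: assertions are stored per context, so statements that
# would contradict each other globally never meet inside the same context.
from collections import defaultdict


class ContextStore:
    def __init__(self):
        # context name -> {statement: truth value}
        self._facts = defaultdict(dict)

    def assert_fact(self, context, statement, value=True):
        known = self._facts[context]
        # A contradiction is only an error within a single context.
        if statement in known and known[statement] != value:
            raise ValueError(
                f"contradiction inside context {context!r}: {statement!r}"
            )
        known[statement] = value

    def holds(self, context, statement):
        # Returns True/False, or None when the context says nothing about it.
        return self._facts[context].get(statement)


store = ContextStore()
store.assert_fact("fiction", "Dracula is a vampire", True)
store.assert_fact("fiction", "vampires exist", True)
store.assert_fact("reality", "vampires exist", False)
```

Asking store.holds("reality", "Dracula is a vampire") gives None here, which is roughly where the hard questions start: deciding which context a new statement belongs to is not something this sketch attempts.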

--
My hand to God. Baby geese. Goslings. They were juggled.
Re:Conflict of intent by the+bluebrain · 2006-08-10 05:03 · Score: 1

Cyc should be allowed to learn through observation and deduce its own understanding of the world so that it is not bound by any particular syntax.

How would this Cyc store what it learned about the world?

--
yes, we have no bananas
Re:Conflict of intent by Elektroschock · 2006-08-10 05:06 · Score: 1

Cyc is nothing but an ontology database. It can be useful like dictionaries can be useful.

But the problem remains: we live in a world with low level dictionaries which are crap. Why expect better results on a higher level?

What makes a database successful is its application. What problem does Cyc solve?
Re:Conflict of intent by nuzak · 2006-08-10 05:10 · Score: 3, Interesting

That is not to say that it can't, just that these projects just love to forget Gödel's Theorem, which states, roughly: any sufficiently complex system will have things that are obviously true or false, but are not provable within the system.

Goedel's theorem has nothing whatsoever to do with the practical workability of Cyc's own formal system: if it can prove a fact, it WILL prove a fact with ironclad logic and show you all the steps. That you might not be able to prove the proof itself is not relevant, though you certainly can check it against other systems. In the end it's down to consensus: "8 out of 10 formal systems agree, one didn't, and one just got confused and started babbling in the corner".

And of course whether it's sound or not is also not a given -- especially if it checks Wikipedia. Though come to think of it, it might be really good at spotting inconsistencies in Wikipedia articles.

--
Done with slashdot, done with nerds, getting a life.
Re:Conflict of intent by Jon+Peterson · 2006-08-10 06:02 · Score: 4, Insightful

That's not a solution. Are you saying that Vampires exist in Dickensian London? Are you saying that, in the real world, Dracula _isn't_ a Vampire??!!

And that's the tip of the iceberg.

Is powdered milk a dairy product? Can whales sing?

I work with ontologies. There are too many contexts, and they are not well defined. You can't reduce human knowledge to an ontology and still have it as being of any use to anyone. Cyc will fail, or, it will succeed and we will have failed.

--
----- .sig: file not found
Re:Conflict of intent by Jeremi · 2006-08-10 06:07 · Score: 1

Gödel's Theorem shows that a system cannot be perfect. It doesn't necessarily follow that the system "will fail". To declare that a system will fail, you have to define what success and failure mean. My view is, so what if Cyc can't do everything? If it does enough to be useful, then it will be a (qualified) success.

--

I don't care if it's 90,000 hectares. That lake was not my doing.
Re:Conflict of intent by Profane+MuthaFucka · 2006-08-10 06:46 · Score: 1

Tis a shame I can make a counterpoint to you in just a sentence. The problem with your argument is that you're right, but you're assuming that Cyc has to be perfect. Goedel didn't say that complicated systems that can't be proven consistent and complete are "doomed to fail." In fact we can see that mathematics is extremely useful even if you can't prove it consistent and complete.

Here's the sentence:

You don't have to be 100% perfect if you can be 95% good enough.

--
Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!
Re:Conflict of intent by Prof.Phreak · 2006-08-10 06:48 · Score: 1

...any sufficiently complex system will have things that are obviously true or false, but are not provable within the system.

What folks have completely missed is that AI isn't about truth... it's about ability and function. Newton was very wrong about gravity, yet it functioned for a while. For all we know, Relativity may be completely wrong too... but it functions for now. Ignoring the strict mathematical sense, contradictions (and truth) are irrelevant: stuff either has to work or not work, that's all AI should care about.

It's also not about symbolic rules nor logic. Brains don't have symbolic rules, yet somehow they function---which itself seems to be enough for `Intelligence'.

I think projects like Cyc, etc., are pulling AI in the wrong direction. AI is not logic, nor is it a search for "truth". Listing `rules' (generalizations) and putting it in a database does nothing towards advancing AI.

--
"If anything can go wrong, it will." - Murphy
Re:Conflict of intent by sean.geek.nz · 2006-08-10 10:54 · Score: 2, Informative

The solution is contextualization.

No, the problem is contextualization. The solution is something CYC doesn't come close to.

Your "vampire" example is a typical AI researcher's example: it's too trivial to show the real problem. That's because with "vampire" you can get the context of "fiction" from the word. So let's take a more typical word: "tree"

Basic ontology: A tree is a plant.

Basic fact: A plant requires air and water to live.

Have you watered your red-black binary tree today? How about your boxed Christmas tree? Your family tree? Your oak tree?

You can solve this by saying that only *some* trees are living things. But then you lose the power CYC's ontology-and-logic combination was supposed to give you because you can no longer reach useful conclusions based on the fact that this palm tree is a tree.

Or you can solve it by deciding the problem is English and its foolish use of the word 'tree' for several different things, so you invent your own words tree(1) tree(2) tree(3), etc. But that just moves the problem of understanding the world out of your system and onto your users. Your clear logical rules and ontology become an unmaintainably complex hodge-podge of exceptions. And worse, it misses the fact that all these trees really are trees even if they're not real trees. It's not an accident that we call a Christmas tree a tree.
Re:Conflict of intent by Anonymous Coward · 2006-08-10 16:31 · Score: 0

Wow did you just totally miss the point. Context. How the hell did that get 'Insightful'?
Re:Conflict of intent by Tablizer · 2006-08-10 17:11 · Score: 1

Can whales sing?

Humans won't give a consistent answer either.

--
Table-ized A.I.
Re:Conflict of intent by Anonymous Coward · 2006-08-11 03:51 · Score: 0

Gödel's Theorem, which states, roughly: any sufficiently complex system will have things that are obviously true or false, but are not provable within the system.

You don't understand incompleteness. Please stop trying to use it to make yourself sound smart.
Re:Conflict of intent by rp · 2006-08-11 07:54 · Score: 1

I'm not sure it is a "fact" that a Christmas tree is a tree. It is a fact that we *call* it a tree when speaking English.
The facts are more complicated.

The idea behind Cyc is that we can basically make so many detailed statements about Christmas and trees that we somehow approach what a tree really is. At the end of the day, of course, it's still language: phrases in some language (not English this time) used to talk about trees. And just as you're right that it does say something about Christmas trees that the English call it a Christmas tree, so Cyc descriptions do say something about the things they describe. But they don't say much unless you learn the language they are written in.
Re:Conflict of intent by sean.geek.nz · 2006-08-13 10:56 · Score: 1

If Cyc doesn't get that a Christmas tree is a tree, then it becomes much less useful: it can't infer things like "a Christmas tree has branches" from "a Christmas tree is a tree".

If it does get that a Christmas tree is a tree, then it will infer things that might be wrong like "needs air and water" or even "is a plant".

By having a simplistic ontology it either infers too little, or too much.
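
To make the dilemma concrete, here is a toy sketch in Python. The dictionaries and the function name are invented for illustration and are not how Cyc represents anything; the point is only what strict "is-a" inheritance does:

```python
# Hypothetical toy ontology: strict "is-a" links with inherited properties.
IS_A = {
    "oak tree": "tree",
    "christmas tree": "tree",
    "binary tree": "tree",
    "tree": "plant",
}

PROPERTIES = {
    "tree": {"has branches"},
    "plant": {"needs air and water"},
}


def inferred_properties(concept):
    """Collect every property inherited along the is-a chain."""
    props = set()
    while concept is not None:
        props |= PROPERTIES.get(concept, set())
        concept = IS_A.get(concept)  # None once we run out of parents
    return props
```

With these links, inferred_properties("binary tree") includes "needs air and water" (too much), while "family tree", which was never linked to "tree", comes back with nothing at all (too little).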

The real problem is that Cyc is built on clear-cut ontologies: sets, supersets, subsets.

But 'natural' ontologies are not clear-cut, they are very messy: an oak tree is a more typical tree than a Christmas tree, a cabbage tree, a family tree or a binary tree. They are still trees of a type: they still share features with trees, you can still make some valid inferences about them that transfer from them being trees. But they're not trees in quite the same way.

(Philosophers waste thousands of pages on whether 'natural' ontologies are 'natural' in the world or just 'natural' in our heads or just 'natural' in our language. But for judging Cyc all we need to say is that natural ontologies are bloody useful.)

Re:Mining Wikipedia and other online reference sit by Jerf · 2006-08-10 04:46 · Score: 2, Insightful

If you could build a Cyc-like database simply by feeding it a large amount of more-or-less unstructured text, then the Cyc project wouldn't have been necessary in the first place.

Let's make a deal! by localman · 2006-08-10 04:46 · Score: 0, Offtopic

So are these the fledgling footsteps of an emerging AI, or just the babbling beginnings of a bloated database?

I'll take door number two, Monty.

That's not to say it's not cool, or that the data won't be useful in this form, but Open/Cyc is no more intelligent than the dusty reference tomes on my shelf.

Cheers.

Could try again, I suppose... by RubberBaron · 2006-08-10 04:47 · Score: 1

I played for quite a while getting to know OpenCyc a couple of years ago. The documentation was poor, the software confusingly buggy at times, the Java interface was just awful. Hey, I've got a life to lead, things to get on with, but I battled with the Java for a while before deciding it was a waste of time until the rest of OpenCyc was fixed. Ok, the weekend looms, the weather's getting worse, what the hell...

Re:Could try again, I suppose... by johndcyc · 2006-08-10 05:56 · Score: 1

You're still going to find it very difficult. The news is the release of the content in the knowledge base, not the tools.

We're working on better tools and accessibility at the Cyc Foundation. Please don't get disenchanted all over again by existing tools.

Re:mod +1 "Rim Shot" by neonprimetime · 2006-08-10 04:49 · Score: 1

Straight from the Urban dictionary, read the definition of Rim Shot.

AI needs a 3d environment to work by CrazyJim1 · 2006-08-10 04:51 · Score: 3, Informative

Cyc is only words and descriptors. If you attach them to 3d shapes and actions in the 3d world, the program can imagine what you're saying. It can even obey and do tasks if hooked up to a robotic body that can scan the room. It requires the technology to scan its environment and then run something like the programs used to find text inside of images. Instead of finding text inside of images, it's finding objects inside an environment. Pretty simple once you understand the basics, but it will take a lot of work. A longer description of this can be found at: AI page. Cyc isn't a waste, but you need to do something harder to make it into AI: you need to attach 3d objects to every noun, apply 3d actions to every verb, etc. I'd say that'd be in the realm of next to impossible, so yeah, what they've done really doesn't advance AI at all.

--
God spoke to me.

Re:AI needs a 3d environment to work by zlogic · 2006-08-10 05:37 · Score: 2, Interesting

That's why Sony AIBO is quite popular in AI labs - it's a relatively cheap walking robot with vision and a basic SDK. If AI researchers can teach AIBO to learn about our world from what it sees and hears, then artificial intelligence could be developed simply by sending the robot to kindergarten, school etc. where it will learn things exactly like a human. That's much easier than creating a DB by hand or chatting with the bot.
Re:AI needs a 3d environment to work by Animats · 2006-08-10 05:38 · Score: 1

You're right. We need 3D AI. What you're talking about has been done, and development continues. Mostly in the game world. The classic paper is Craig Reynolds' "Boids", which introduced flocking behavior. That's simple to implement, and worth trying to get a feeling for the strengths and limitations of field-based behavior.
The Sims uses field-based behavior, and gets rather impressive results with it.
So there is progress. It's slow, but we're way ahead of where we were ten years ago. Language-based AI has been more or less stuck since the 1980s, but 3D AI is plugging along.
(Reading your "AI page", I note that what you think is needed is a "3D camera". Those exist, both as stereo devices and as time-of-flight devices. You can even make one from two webcams and the stereo software in OpenCV. But a depth map is just the first step.)
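
For anyone who wants to try the flocking idea mentioned above, here is a minimal 2D sketch in plain Python of the three steering rules usually associated with Boids: cohesion, alignment and separation. The weights, radii and the dict layout are arbitrary assumptions picked for illustration, not values from Reynolds' paper:

```python
def step(boids, radius=10.0, sep_dist=2.0,
         w_coh=0.01, w_ali=0.05, w_sep=0.1, max_speed=2.0):
    """Advance every boid one tick.

    Each boid is a dict with keys x, y, vx, vy. All parameter values are
    illustrative guesses and will need tuning.
    """
    updated = []
    for b in boids:
        # Neighbours are the other boids within `radius`.
        near = [o for o in boids if o is not b
                and (o["x"] - b["x"]) ** 2 + (o["y"] - b["y"]) ** 2 < radius ** 2]
        vx, vy = b["vx"], b["vy"]
        if near:
            n = len(near)
            # Cohesion: steer toward the neighbours' centre of mass.
            cx = sum(o["x"] for o in near) / n
            cy = sum(o["y"] for o in near) / n
            vx += (cx - b["x"]) * w_coh
            vy += (cy - b["y"]) * w_coh
            # Alignment: nudge velocity toward the neighbours' average velocity.
            avx = sum(o["vx"] for o in near) / n
            avy = sum(o["vy"] for o in near) / n
            vx += (avx - b["vx"]) * w_ali
            vy += (avy - b["vy"]) * w_ali
            # Separation: push away from neighbours that are too close.
            for o in near:
                dx, dy = b["x"] - o["x"], b["y"] - o["y"]
                if dx * dx + dy * dy < sep_dist ** 2:
                    vx += dx * w_sep
                    vy += dy * w_sep
        # Cap the speed so the flock does not blow up.
        speed = (vx * vx + vy * vy) ** 0.5
        if speed > max_speed:
            vx, vy = vx / speed * max_speed, vy / speed * max_speed
        updated.append({"x": b["x"] + vx, "y": b["y"] + vy, "vx": vx, "vy": vy})
    return updated
```

Calling step repeatedly on a list of randomly placed boids and plotting the positions is enough to see clumps form. The neighbour search here is O(n^2), which is fine for a few hundred boids and not much more.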
Re:AI needs a 3d environment to work by jerome187 · 2006-08-10 07:53 · Score: 1

Blind people are intelligent. I think you're going in the wrong direction. Not to say I know the right direction.
Re:AI needs a 3d environment to work by Anonymous Coward · 2006-08-10 09:38 · Score: 0

From where I sit, you are only words and descriptors. However, you almost pass my Turing test.
One more try. Can you offer an intelligent response?
Re:AI needs a 3d environment to work by Anonymous Coward · 2006-08-10 09:39 · Score: 0

I'm not the original poster, but blind people also have a very keen understanding of physical space. All of our senses deal with space in some way or another, and it's not unreasonable to link that understanding to the development of higher mental processes. I also think the OP is following the wrong path towards AI, but for reasons other than what you mentioned.
Re:AI needs a 3d environment to work by Anonymous Coward · 2006-08-10 09:50 · Score: 0

Your comment raises a fundamental question.
How many of the 5 human senses are required for intelligence?

The basic Turing test only requires an ASCII communication link.
However, could a machine develop real intelligence via such a link?

Highly Interesting! by Zeno+Davatz · 2006-08-10 04:53 · Score: 1

Very interesting! I'm curious when Google will start using this to sort their results.
InfoCodex already does all this today with the help of a linguistic database and synonym and/or similarity search across 5 languages (German, French, Italian, English and Spanish). With InfoCodex you can search for a block of text in one language and it will find you all the similar documents in the other languages as well. All of this is done without one single minute of training - because of the linguistic database (ontology) that contains 2.9 million words and terms (e.g. "European Court of Justice" or "The President of the United States" are terms and recognized as such).
See the following links:
http://www.ywesee.com/pmwiki.php/Ywesee/InfoCodexProcedure
http://www.ywesee.com/uploads/Ywesee/archimag-e.pdf
http://www.ywesee.com/uploads/Ywesee/Evaluationsentscheid-e.pdf
http://www.ywesee.com/uploads/Main/USP_e.pdf

Oh wait.... by i_want_you_to_throw_ · 2006-08-10 04:54 · Score: 1

After some 20 years of work and five years behind schedule

Oh, I thought they were talking about Duke Nukem Forever for a moment there..... (Sure hope it runs on my brand spankin' new Amiga......)

Re:Oh wait.... by nuzak · 2006-08-10 05:13 · Score: 1

> Oh I thought they were talking about Duke Nukem forever for a moment there

Didn't you hear? DNF is switching its AI engine to OpenCyc.

--
Done with slashdot, done with nerds, getting a life.

How to make CYC more "human" by presidenteloco · 2006-08-10 04:56 · Score: 5, Interesting

Cyc has an ontology of general conceptual terms, and represents the precise logical way in which
those concepts interrelate. In other words, it emulates an aspect of the pure rational part of
human reasoning about the world.

But it's known that humans are not dispassionate rational agents. And indeed that there probably
is no such thing as a dispassionate rational agent. Commander Data and Spock are very ill-conceived
ideas of robot-like reasoners. Passion (emotion, affect) is the prioritizer of reasoning that allows
it to respond effectively (sometimes in real time) to the relevant aspects
of situations. Without the guidance of emotion, no common-sense reasoning engine would be powerful
enough, no matter how parallel it was, to process all of the ramifications of situations and
come up with relevant and useful and communicable and actionable conclusions.

So how do we give CYC passion? Or at least a simulation of it?
Well the key would seem to lie in measuring the level of human concern with each concept, and with
each type of situational relationship between pairs (and n-tuples) of concepts.

How could we do that? How about doing a latent semantic analysis from Google search results? Something
similar to Google Trends, but which measures specifically the correlation strengths of pairs of
concepts (in human discourse, which Google indexes). The relative number of occurrences (and co-occurrences)
of concept terms in the web corpus should provide a concept weighting and a concept-relationship weighting.

If we then map that weighting on top of the CYC semantic network, we should have a nicely "concern"-weighted
common-sense knowledge base, which should be similar in some sense to a human's memory that supports
human-like comprehension of situations.
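
A very rough sketch of the counting step in Python, under loud assumptions: the corpus is just a list of strings standing in for search results, concepts are matched by naive substring search, and the pair weight is pointwise mutual information (PMI) over raw co-occurrence counts rather than a real latent semantic analysis. The function name concern_weights is made up for this example.

```python
import math
from collections import Counter
from itertools import combinations


def concern_weights(documents, concepts):
    """Return (concept_weights, pair_weights) for a non-empty corpus.

    concept weight: fraction of documents that mention the concept.
    pair weight:    PMI of the two concepts appearing in the same document.
    Matching is a crude substring test, so "dog" also matches "dogma".
    """
    n = len(documents)
    single = Counter()
    pair = Counter()
    for doc in documents:
        text = doc.lower()
        # Sorted so each pair is always counted under the same key order.
        present = sorted(c for c in concepts if c.lower() in text)
        single.update(present)
        pair.update(combinations(present, 2))

    concept_w = {c: single[c] / n for c in concepts}
    pair_w = {
        (a, b): math.log((count / n) / (concept_w[a] * concept_w[b]))
        for (a, b), count in pair.items()
    }
    return concept_w, pair_w
```

Pairs that never co-occur simply get no entry. Mapping these numbers onto links in the CYC network is the part this sketch leaves out, and probably the hard part.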

Combining a derivative of Google search results with CYC is my suggestion for beginning to make an AI that can talk to
us in our terms, and understand our global stream of drivel.

I wish I had time to work on this.

--

Where are we going and why are we in a handbasket?

Re:How to make CYC more "human" by xiard · 2006-08-10 05:50 · Score: 1

I don't have time to say much on this, but it sparked a thought. I believe that you're on to something with the idea about trying to introduce "concern" as a weighting when a computer explores multiple options.

Where it gets interesting, though, is when you realize that the "concern" will be different from different points-of-view. When I'm trying to decide whether to take my kids to Six Flags, I think about the problem from many points-of-view:

- The loving dad wants them to have fun
- The thrifty dad worries about whether we can afford it and the opportunity cost of the money
- The work ethic dad wonders if he can take a day off work to do this
- The forward-looking dad wants to save days off work for a longer Christmas vacation
- The tired dad doesn't want to spend the energy walking around Six Flags all day
- etc.

Each of these perspectives would have a different focus in the decision making process, and a different kind of weighting for "concern". The next part of that, of course, is how you combine the results of these different "concern paths". How do you weight the concerns shared by different roles? How do you determine the pros and cons? And the really difficult parts, to me, have to do with incorporating other knowledge you have about the different roles. For example:

- The loving dad knows that the kids lost their dog recently so they've been really sad
- The thrifty dad expects a bonus in the next two weeks
- The work ethic dad knows there is a crucial meeting that morning that he has to be at
- The forward-looking dad knows the family is going to Hawaii for Christmas so he needs all the vacation days he can get
- The tired dad has a knee problem that will get worse if he spends all day walking around

Okay...that's all the time I can spend on that for now. :-)
Re:How to make CYC more "human" by QuantumFTL · 2006-08-10 08:47 · Score: 1

But it's known that humans are not dispassionate rational agents. And indeed that there probably is no such thing as a dispassionate rational agent. Commander Data and Spock are very ill-conceived ideas of robot-like reasoners. Passion (emotion, affect) is the prioritizer of reasoning that allows it to respond effectively (sometimes in real time) to the relevant aspects of situations. Without the guidance of emotion, no common-sense reasoning engine would be powerful enough, no matter how parallel it was, to process all of the ramifications of situations and come up with relevant and useful and communicable and actionable conclusions.

It sounds to me as if you are suggesting that emotions function as a heuristic to incite a proper reaction. I agree with this. In the case where something like Cyc were to be used, it is likely that you would require a subsumption architecture using these kinds of reaction-heuristics (the reactions themselves perhaps even precalculated from the Cyc database, or with expert attention). There are many problems in robotics, etc., that do require non-realtime deliberative reasoning, and this can sit at a higher level on the subsumption architecture, safe in the knowledge that a lower level function will take over control if an important situation arises (for instance, danger).
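
A loose sketch of that arrangement in Python, just the arbitration part and not a full subsumption architecture. The layer functions and the percept keys ("danger", "goal") are invented for illustration; the deliberative layer is a stand-in for a slow knowledge-base query:

```python
def reflex_layer(percept):
    """Fast, precomputed reaction. Returns an action, or None to defer."""
    if percept.get("danger"):
        return "retreat"
    return None


def deliberative_layer(percept):
    """Stand-in for slow reasoning, e.g. a query against a knowledge base."""
    return f"plan route to {percept.get('goal', 'nowhere')}"


# Layers are consulted in order; the first one that returns an action wins,
# so the reflex layer pre-empts deliberation whenever it has something to say.
LAYERS = [reflex_layer, deliberative_layer]


def choose_action(percept):
    for layer in LAYERS:
        action = layer(percept)
        if action is not None:
            return action
    return "idle"
```

So choose_action({"danger": True, "goal": "kitchen"}) retreats, and the same percept without the danger flag falls through to planning.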
Re:How to make CYC more "human" by Bombula · 2006-08-10 09:04 · Score: 1

Passion (emotion, affect) is the prioritizer of reasoning that allows it to respond effectively (sometimes in real time) to the relevant aspects of situations. Without the guidance of emotion, no common-sense reasoning engine would be powerful enough, no matter how parallel it was, to process all of the ramifications of situations and come up with relevant and useful and communicable and actionable conclusions.
I think you mean that emotions are the source of values, and reasoning is dependent upon values. No emotions = no values = no contextual significance = no reasoning = no intelligence. Without values, everything is lost in an abstraction of insignificance.
Data and Spock, to continue your example, if they were truly emotionless, would have no logical basis upon which to consider a person more important than a pair of shoes, and would therefore be unable to prioritize (i.e.: value) one over the other in any situation. The result would be functional paralysis in which they would, of logical necessity, devote equal attention (i.e.: processing power) to every detectable object/variable in sensory range. Since in our physical reality that range is effectively infinite, and since processing power is finite, they would be unable to function.
Values would need to be assigned arbitrarily by an external source in order to form the basis of a contextual system of significance, such as that which humans build up over a lifetime of experiences.
Something interesting: as I understand it, autistic people often have difficulty assigning 'appropriate' value to the objects they interact with because their emotional circuitry is abnormal, and so, for example, a person is not necessarily more significant to them than a can of soda, and the result can be indecision, confusion, and accompanying fear and anxiety especially in unfamiliar or complex environments with foreign stimuli, resulting in a general lack of functional intelligence.

--
A-Bomb
Re:How to make CYC more "human" by plover · 2006-08-10 10:03 · Score: 1

So how do we give CYC passion? Or at least a simulation of it?
You said adding something like a "human concern value" could work here. That's going to be really complex to do. It's going to have to be a curve with time and relationship feedback components, and not just a simple integer.
For example, let's say I decide to buy my child a bike for Christmas. I might be terribly concerned about my child's ability to ride a bike, but once I have the present stashed in the garage, I forget about it. Then on Christmas morning when he unwraps it, my concern will go up again, and even more so as he wheels it out into the driveway. After I see that he's mastered it, my concern will drop to very low; and the next day at work it might be zero again. Later when he goes out for a ride, it will climb again as I worry about him riding in traffic, and it will drop back to zero once he gets home safely.
Also, I need to limit scope. I can't constantly balance worrying about my son riding a bike while I worry about my boss catching me typing on Slashdot, for example. And I'm not really worried now that he's 18.
Finally, if you don't balance the context perfectly in your evaluation, you'll end up in a feedback loop. Instead of passion you'll get obsession.
Frankly, I don't know how humans figured all this stuff out. There's way too much to worry about it all.

--
John
Re:How to make CYC more "human" by master_p · 2006-08-10 21:28 · Score: 1

The human brain works by pattern matching, not by deduction. That's a fundamental mistake by AI researchers.

The reason pattern matching is favored over deduction is that deduction requires a complete proof system, whereas pattern matching does not. Pattern matching can quickly give an answer to practical time-limited queries like 'flee or fight', whereas deduction can not.

It is exactly for this reason that humans can make faulty assertions: instead of deduction, they use pattern matching.

For example, religious beliefs are a result of pattern matching. When a person thinks he/she is being favored by God, or when someone donates his/her money to the church for saving his/her soul, it is pattern matching at work. Deduction can not prove the existence of God, but pattern matching can easily prove that 'whatever happened to me is no coincidence; it is God helping me'.

Another example is programming. We humans can easily 'deduce' that an algorithm works in a certain way, but mathematics tells us that we can not prove it (see the halting problem). The reason is that we use pattern matching to 'see' that an algorithm works. And that is the reason that we can not handle very complex algorithms: we do not possess the capability of doing pattern matching on such a scale.

Computer games much better for collecting info by Anonymous Coward · 2006-08-10 04:56 · Score: 0

I see this technology being used to make a better computer experience. An OS and apps that can explain how to use them (as well as when things go wrong hardware or software), adjusting to the user. A nice plus if the above "learning" can be combined with OpenCyc.*

*Keeping in mind that "learning" doesn't have to be an obvious exercise.

AI? by The+Ape+With+No+Name · 2006-08-10 04:57 · Score: 1

"So are these the fledgling footsteps of an emerging AI, or just the babbling beginnings of a bloated database?"

Yes.

--
Comparing it to Windows will be a moot point, since El Dorado is going to have a 40% larger code base than XP.

Waste of Time and Money. Sorry. by MOBE2001 · 2006-08-10 04:58 · Score: 2, Interesting

"OpenCyc is the open source version of the Cyc technology, the world's largest and most complete general knowledge base and commonsense reasoning engine."

Is one to assume that the way to common sense logic in a machine is via linguistic/symbolic knowledge representation? How can this handwritten knowledge base be used to build a robot with the common sense required to carry a cup of coffee without spilling the coffee? And why is it that my pet dog has plenty of common sense even though it has very limited linguistic skills? I think it's about time that the GOFAI/symbol processing crowd realize that intelligence and common sense are founded exclusively on the temporal/causal relationships between sensed events. It's time that they stop wasting everybody's time with their obsolete and bankrupt ideas of the last century. The AI world has moved on to better and greener pastures. Sorry.

Re:Waste of Time and Money. Sorry. by Prof.Phreak · 2006-08-10 07:02 · Score: 2, Informative

Is one to assume that the way to common sense logic in a machine is via linguistic/symbolic knowledge representation?

Amm... Those researchers need jobs too. Yes, I absolutely agree with your point, but unfortunately it's not a very popular point in many research circles.

I discussed this issue with a logic dude once... and what I got was that the rules of logic don't have any specific granularity... so technically, you can model neural networks, bayes nets, etc., via millions of very simple `logical' rules. Just like you're running a statistical learning thing on a -computer-, the computer itself uses logic gates in the CPU to perform the computation---thus, you can model anything using logic and symbolic manipulation (sort of a lame excuse to have symbolic AI...)

--
"If anything can go wrong, it will." - Murphy
Re:Waste of Time and Money. Sorry. by Tablizer · 2006-08-10 17:18 · Score: 1

As I mentioned somewhere else around here, it appears that humans (and probably dogs) use *multiple* techniques to arrive at answers. Just because Cyc is not the complete puzzle does not mean it cannot be a key piece. Somebody just needs to figure out how to link and weave different AI techniques together to reinforce and correct each other.

--
Table-ized A.I.

Open in what sense? by Anonymous Coward · 2006-08-10 04:58 · Score: 0

So where's the source? All I could find when I looked a month ago is a binary blob with some API wrappers.

Michael

Natural Language Interface for Cyc by sdmonroe · 2006-08-10 05:01 · Score: 1

I've been working on a system to update and query the Cyc database using plain natural language descriptions and queries. There wasn't much interest from the Cyc community back then, so I began focusing on Semantic Web databases. I wonder if there's anyone working on exposing Cyc knowledge as RDF triples.

Re:Natural Language Interface for Cyc by johndcyc · 2006-08-10 06:02 · Score: 2, Insightful

Yes, we are. It will probably be published next week. OWL, specifically.
Re:Natural Language Interface for Cyc by sdmonroe · 2006-08-10 06:57 · Score: 2, Interesting

Cool. I'll look out for it. Maybe it's time to blow the dust off my old NL -> CycL program for Cypher and release an alpha.

Don't be alarmed. Be very, very frightened by Moraelin · 2006-08-10 05:05 · Score: 4, Interesting

I, for one, welcome our new OpenCyc overlords.

Don't be alarmed, Arthur Dent. Be very, very frightened.

Human thought is a rather complex thing that doesn't always appear to follow logical patterns or rules. Or not the simple "if I want X, I must do Y" clear-cut rules that nerds everywhere expect. Human thought is a complex attempt at balancing the priority of not only "I want X", but also stuff like "but it would be socially bad to be seen doing Y", and "I could do Y1 instead, but that's way more effort than I can be arsed to do today", and "it would be nice to have time left to do Z too today, or the missus will blow a gasket", and quite often "actually I don't really want X, I want Z, but it would be uncool to admit that." It's not just following rules and logic, it's trying to fit it all in a complex scheme of priorities, social rituals, and whatnot, and most often boiling down to finding the least crappy compromise in that space.

In other words, whenever you find yourself thinking, "meh, people/men/women/engineers/PHBs/whatever are so stupid/illogical/whatever. If they want X, they should just do Y", chances are it's not them who are illogical. It's you who don't understand their personal version of that maze of priorities and rituals. Or what is the real Z they're after, when they say they want X.

Most of those things aren't even at a conscious level. Even if you poll people along the lines of "if you wanted X, would you do Y?", you'll get an answer that's most often useless. For starters it will be heavily skewed towards what they'd like to think of themselves, not what they'd actually do. Second, without providing a _lot_ of context, it will bypass most of those priorities and rituals that might override that in practice.

What's the point of this whole rant? That the first AIs trained by humans will inherently be a dud.

If you make an AI that functions by precise, inflexible rules, congratulations, you've just programmed OCPD. Literally.

Add a lack of perceptions of human reactions, feelings, body language, etc, and you've given it Autism too. Again, pretty literally.

I.e., I'd expect the first few AIs, or even generations of AIs to be... well, don't think the lovable R2D2 or the essentially human C3-PO, but an electronic equivalent of the most obnoxious socially-dysfunctional kind of geek.

If you want that as an overlord... I don't know, I hope I'm not around at least.

--
A polar bear is a cartesian bear after a coordinate transform.

Re:Don't be alarmed. Be very, very frightened by Anonymous Coward · 2006-08-10 05:47 · Score: 1, Funny

That can be arranged.
--
Cyc
Re:Don't be alarmed. Be very, very frightened by Anonymous Coward · 2006-08-10 05:58 · Score: 1, Interesting

If you make an AI that functions by precise, inflexible rules, congratulations, you've just programmed OCPD. Literally.

Good. An AI needs OCPD. A computer cannot be allowed to get bored; it spends most of its time sitting around, waiting for humans to interact with it. An AI that doesn't like waiting is an AI that's fatally flawed.

Add a lack of perceptions of human reactions, feelings, body language, etc, and you've given it Autism too. Again, pretty literally.

Again, good. Autistic people tend to be very precise, reliable and predictable, except when you trigger an accidental temper tantrum. Don't give the AI the means to have temper tantrums, and you've got a reliable person that's smart enough to understand what you want, and to do what it's told, but not fall prey to all the emotionalism that so strictly limits human potential.

You don't want a computer that can get bored or throw temper tantrums. But you do want one that can deal with unusual crisis situations in a calm and level-headed manner. An autist with OCPD linked to stellar job performance is exactly what you need for, say, an air traffic controller. Give the AI risk management and problem solving skills without all the panic and mental breakdowns that a human controller is subject to, and you'd have something a lot better than what we have now.

Remember: we already have human thought. We don't need machines to think like humans; only machines that can understand and obey humans. We can make humans already.

I can't even get it to work... by jrothwell97 · 2006-08-10 05:09 · Score: 1

...Firefox is returning a timeout error page. Oh dear, I hope they get their 'try online' server fixed faster than it took to get the app itself out...

--
Those using pirated Tinysoft signatures(TM) are a real threat to society and should all be thrown in jail.

Not really open source? by dthulson · 2006-08-10 05:10 · Score: 4, Informative

According to this FAQ entry, it's not fully open-source...

Re:Not really open source? by KnowledgeBug · 2006-08-10 06:55 · Score: 1

True, but the full ontology is! And that provides quite a lot of information I can't get anywhere else.

20 years? by mrkitty · 2006-08-10 05:13 · Score: 0

Time to get a new project/product manager, methinks!

--
Believe me, if I started murdering people, there would be none of you left.

self awareness by nuzak · 2006-08-10 05:16 · Score: 2, Interesting

"So are these the fledgling footsteps of an emerging AI, or just the babbling beginnings of a bloated database?"

How about putting that question to Opencyc?

--
Done with slashdot, done with nerds, getting a life.

AI needs a game environment to work by Anonymous Coward · 2006-08-10 05:19 · Score: 0

"Cyc isn't a waste, but you need to do something harder to make it into AI, you need to attach 3d objects to every noun, and apply 3d actions to every verb, etc. I'd say that'd be on the realm of next to impossible, so yeah what they've done really doesn't advance AI at all."

Like a game?

Seems like... by Aquatic · 2006-08-10 05:21 · Score: 0

'How to set up a web server to withstand the /. effect' is not a matter of human consensus yet.

--
Use your Blackberry Pearl as a Bluetooth Modem in OS X

I, for one, ... by gwayne · 2006-08-10 05:22 · Score: 1

welcome our new Cyberdyne Systems AI overl...aw, dammit!

Re:Mining Wikipedia? Yes, we are. by johndcyc · 2006-08-10 05:26 · Score: 2, Interesting

We are starting to mine Wikipedia at the Cyc Foundation (cycfoundation.org, sorry not much of a website yet), which is an independent non-profit org that's working closely with Cycorp. We're managing the growth of the public knowledge base. Linking Wikipedia article titles to Cyc concepts is one of the first things we're doing. That will grow the set of concepts, and it will also create a way to browse and search Wikipedia conceptually, such as letting you look for a list of all articles about parks west of the Rockies that contain bears.

We're also working on creating Semantic Web compatible URIs for all of the Cyc terms.

Anyone who wants to join the Cyc Foundation can contact me: johndcyc at cycfoundation.org.
Check the schedule of Skypecasts at Skype.org. We can add you to the chat, but you probably won't be allowed to talk UNLESS you have a USB microphone or headset.

You can also listen in on our Skypecast tonight. It's every Thursday at 9:30pm EST, 8:30 CST.

business application by lawpoop · 2006-08-10 05:34 · Score: 2, Interesting

I think one place where Cyc or similar types of knowledge engines could really shine is in business. A business model is vastly simpler than the model of reality that people carry around in their heads; and one benefit that Cyc has is that it understands *everything* -- it is integrated by default.

So once it gets basic understanding of accounting, inventory, retailing, management, logistics, etc., you could easily build a natural language interface to it: "Three boxes arrived today from supplier X and we paid $90 for them". If there is ambiguity in the sentence, Cyc would ask natural language clarifying questions: "Was each box a line item on the invoice, or were there many line items?"

I think this would be much improved over the current data-interfaces we have today, which are basically graphical recapitulations of paper-based forms in the format of "field: [value]".

Another problem with modern apps is that they all contain their own internal, ad hoc ontologies. These ontologies are hard-coded, and usually aren't designed to integrate with ontologies in apps from different domains -- e.g. logistics and accounting (unless they are from the same vendor). Cyc has a standardized, presumably well-thought-out and near comprehensive ontology. It can also grow its ontologies based on user input. So you have this automatic integration feature that's sorely lacking in the end-user computer world.

--
Computers are useless. They can only give you answers.
-- Pablo Picasso

Lenat should fund the Hutter Prize by Baldrson · 2006-08-10 05:36 · Score: 1

If the Cyc knowledge base actually models human "common sense" then the first thing Lenat should do is donate to the Hutter Prize for Lossless Compression of Human Knowledge or at least compete for the existing 50,000 euro prize.

See Matt Mahoney's description of Marcus Hutter's proof that compression is equivalent to general intelligence.

--
Seastead this.

Corrected slowly over time by RareButSeriousSideEf · 2006-08-10 05:36 · Score: 1

"...it could *be* corrected slowly over time."

Sorry, couldn't stop myself.

--
Pi Ran Out

Isn't Cyc more of a training system? by Colin+Smith · 2006-08-10 05:41 · Score: 1

i.e. it isn't meant to be part of the A.I. system itself. Rather it's meant as a reference or teaching system for any AI systems which are developed.

--
Deleted

Re:Unfairly... You're right! Join us! by johndcyc · 2006-08-10 05:47 · Score: 2, Interesting

I agree with everything you said, and we at the Cyc Foundation are working to fix the accessibility problem.

The Cyc Foundation is a new independent non-profit org. I worked at Cycorp for 7 years before forming the Foundation with a co-founder that has a totally outside perspective. We're very optimistic about the progress being made. We've got about 2 dozen people helping so far, and that's before we've made anything available (such as the Web game we're working on) that will allow for much broader involvement.

Listen in on our Skypecast tonight (every Thursday night) at 9:30pm EST. Look for it on the list of scheduled Skypecasts at skype.org. You can participate if you have a USB microphone or headset.

Google vs. OpenCyc by viking2000 · 2006-08-10 05:49 · Score: 1

I think google has surpassed OpenCyc by orders of magnitude in knowledge. You can do a lot of correlation searches etc. This can be used for language translation, as a dictionary (As seen when you misspell your google search term), and general knowledge.

All that is missing is a good frontend, to translate your questions into a few million searches.

--
"Fix it"

Re:Google vs. OpenCyc by bbouldin · 2006-08-10 06:52 · Score: 1

Doug Lenat, founder of Cycorp, gave a tech talk recently to Google staff, explaining why they aren't quite there yet:
Computers vs Common Sense

http://video.google.com/videoplay?docid=-7704388615049492068/
Re:Google vs. OpenCyc by KnowledgeBug · 2006-08-10 07:02 · Score: 1

"... All that is missing is a good frontend, to translate your questions into a few million searches...." And what you'll get back is a few million responses, most of which contain information related to the words in your question. If you're looking to retrieve information about a topic, Google away. And, yes, many questions are exactly this type of information need. But if you're looking for an answer to a question, you may or may not find a single document that happens to contain it. Oh, and be able to find it amongst the millions of returned answers. While Cyc won't beat Google at the retrieval game, it is aiming to answer questions that require the kind of logical connections that are often easy for people to do but hard (or currently impossible) for (other) machines.
Re:Google vs. OpenCyc by viking2000 · 2006-08-10 08:11 · Score: 1

"...But if you're looking for an answer to a question, you may or may not find a single document that happens to contain it...."

You misunderstand. What you do is, for example, a "Google closeness search" or a "Google Mindshare", or finding the Normalised Google Distance (NGD) and mapping this.

You will not get any search results, it will more let you, for example, play "20 questions" like a master.

Basically, for example, the NGD quantifies the strength of a relationship between two words. For example, "speakers" and "sound" are more related than "speakers" and "elephant." Instead of creating this manually, of course, they find the Google PageCount when both words are used together in a search. ("Speakers" and "sound" would have a relatively high number of result pages when compared to "speakers" and "elephant.")

Now when you repeat this process of finding the NGD for a lot of words, you can build a multi-connection word map. This automatic meaning extraction can well be the way to make a computer understand things and act semi-intelligently. Apply this to all text produced by for example the U.N., and you will have a front end that can produce good(?) quality UN papers automatically.

Here is some more detail on this example (Automatic Meaning Discovery Using Google):
http://www.arxiv.org/abs/cs.CL/0412098
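
A minimal sketch of the NGD calculation described above, using the formula from the linked paper: NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), where f is a page count and N is the total number of indexed pages. The function name is hypothetical, and the page counts and N below are made-up illustrative numbers, not real Google results.

```python
import math

def ngd(fx, fy, fxy, n):
    """Normalised Google Distance from raw page counts.

    fx, fy: page counts for each word alone
    fxy:    page count for both words together
    n:      total number of indexed pages (an assumed constant here)
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # The words never occur (together): treat as maximally unrelated.
        return math.inf
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))

# Made-up counts, for illustration only.
N = 8_000_000_000
near = ngd(50_000_000, 400_000_000, 20_000_000, N)  # "speakers" vs "sound"
far = ngd(50_000_000, 30_000_000, 200_000, N)       # "speakers" vs "elephant"

print(f"speakers/sound:    {near:.3f}")
print(f"speakers/elephant: {far:.3f}")
```

A smaller value means the two words are more closely related, so with these assumed counts "speakers" lands nearer to "sound" than to "elephant". Repeating this over many word pairs gives the word map mentioned above.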

This is of course insanely compute intensive, but google have enormous supercomputers available to you for free.

--
"Fix it"

Unfortunate naming... by Anonymous Coward · 2006-08-10 05:50 · Score: 1, Funny

May I point out that the naming of this technology might be somewhat misread. Namely, in Polish "OpenCyc" would mean "OpenTit"... Well how does that sound to you? :D

Re: It's not either/or by johndcyc · 2006-08-10 05:52 · Score: 2, Informative

We plan to exploit the N-grams in our knowledge collection work at the Cyc Foundation.

If you hadn't seen me mention it already :-), you can join our Skypecast tonight.

this is great! by Anonymous Coward · 2006-08-10 05:54 · Score: 0

i always wanted to have a knowledge base and commonsense reasoning...

More than a database by flink · 2006-08-10 05:56 · Score: 2, Interesting

I remember cyc from an old (early 90's) PBS documentary series about computers called The Machine that Changed the World. IIRC, cyc isn't just a database of facts, it's also an engine for making inferences based on those facts. The researcher on the show said that every morning they would come in and read the list of new inferences cyc had generated overnight and fix the incorrect ones and then start inputting new information. One amusing example they gave was that since most of the individuals they had told cyc about were historical figures, it inferred that most people were famous.

Re:More than a database by Aradorn · 2006-08-10 06:12 · Score: 1

I suggest reading What Computers Still Can't Do.

http://www.amazon.com/gp/product/0262540673/103-5082900-3367853?v=glance&n=283155

Cyc is indeed a very interesting take on NLP but it still has a human element involved and until we can eliminate that from the equation the internet will continue to grow faster than we can process it.

Re:mod +1 "Rim Shot" by OldeTimeGeek · 2006-08-10 05:58 · Score: 1

Straight from the Urban dictionary, read the definition of Rim Shot.

I don't live in Urbia, you insensitive clod!

Let's ask Eliza.... by GeneralEmergency · 2006-08-10 06:07 · Score: 1

"So are these the fledgling footsteps of an emerging AI, or just the babbling beginnings of a bloated database?"

Eliza responds: (http://www-ai.ijs.si/cgi-bin/eliza/eliza_script)

"Would you like it if they were not these the fledgling footsteps of an emerging ai or just the babbling beginnings of a bloated database?"

Now, if we could only get these two wacky kids together...

--
"A microprocessor... is a terrible thing to waste." --
GeneralEmergency

Facts without Context by raftpeople · 2006-08-10 06:11 · Score: 1

The reason humans are able to use the "facts" we have accumulated over the years for problem solving (intelligence), is because the facts are intertwined with our experiences and our mental model of our world. This mental model is absolutely critical to be able to extrapolate information from any given "fact."

For example, when someone says "it's raining" and you are about to take a walk, your brain is able to conclude you will get wet due to the underlying understanding of the physical environment and the ability to project/simulate a future scenario where your body is not standing under the cover of a roof.

IMHO, text based databases which attempt to solve this problem without supplying a system trained in all of our human experiences and interactions with the physical world, will fall far short of our desires for AI. By having rules and relationships between the facts it would appear that such a system is in place, but in reality it is an attempt to enumerate the possibilities we encounter in the real world instead of supplying an underlying model that can extrapolate those relationships from a lower level "base" understanding of the physical environment.

Re:Facts without Context by Tablizer · 2006-08-10 17:05 · Score: 1

IMHO, text based databases which attempt to solve this problem without supplying a system trained in all of our human experiences and interactions with the physical world, will fall far short of our desires for AI.

It is not an all-or-nothing situation. Humans use many parallel techniques to arrive at conclusions. Future AI may combine Cyc rules with other techniques, such as physical modeling, to produce better AI. One of the problems with AI is that it has not found ways to tie different techniques together to complement and cross-check each other. I believe true AI will only come when we figure this out.

--
Table-ized A.I.

I saw this on satellite TV by Anonymous Coward · 2006-08-10 06:15 · Score: 0

> "OpenCyc is the open source version of the Cyc technology, the world's largest and most complete
> general knowledge base and commonsense reasoning engine." The Cyc ontology "contains hundreds of
> thousands of terms, along with millions of assertions relating the terms to each other, forming
> an upper ontology whose domain is all of human consensus reality."

Brought to you by his holiness Maharishi Mahesh Yogi for total knowledge and higher levels of vedic consciousness.
http://mou.org/mou/overview/02.html

ugh, I need an Aspirin.

Meanwhile by DrYak · 2006-08-10 06:27 · Score: 4, Insightful

move to a variant of SemanticWiki [...] If semantic statements become the standard, Wikipedia can be queried, which means that Cyc could be fed the data automatically.

Meanwhile google happily eats whatever crap its spiders manage to find and, through some hacking and dark magic algorithms, is still able to give not so meaningless answers to not too badly worded queries.

That's a key point explaining why OpenCyc came too late. WordNet, ThoughtTreasure, Cyc et alii all share a set of common drawbacks. Their input data needs to be specially formatted. That's why all those overly ambitious projects have progressed so slowly in the past years, and are still limited to answering precise, unambiguous, simple questions like "Is a cat a mammal?".
This is linked to their fundamental design around a solid, inflexible, purely logical architecture (reading their respective Wikipedia entries helps in understanding how they work). In a way, the scientists behind those projects tried to apply the same kind of language logic that is used in maths and programming languages to human language, and while this may be useful for some academic purposes or very specific applications where some reasoning is needed (where it has been used and applied well - I've seen it at least for WN and TT), they don't scale that well to REAL-WORLD(tm) situations.
Their fundamental structure clashes with the reality of human reasoning: WordNet is limited to a single unambiguous meaning for each term (no things like "nut" as in the seed, and "nut" as in the thing that can be screwed on a bolt). The "structured" designs of the other software likewise clash with real life's fuzzy nature.

Meanwhile search engines have grown in a completely different way. Initially they were designed only to scan page content and index the keywords for later queries. Only after that, slowly, one hack after another, were they tuned: to make results more relevant; to avoid link farms; to find complex ranking strategies that return more correct and more meaningful results; to find results not with matching keywords but with related keywords (Google's "keyword is encountered only in pages linking to this target"); to cope easily with bad spelling (something that is very common in real life, difficult for a common-sense engine even to detect, very intuitive in search engines, and even more optimisable given the statistics such an engine can gather). And a lot of other small one-off improvements.
And slowly, by on one hand having a system that gets a little bit more optimised each day and, on the other hand, an incredibly huge corpus to process that grows at a very fast rate, search engines like Google have become fantastic multipurpose information retrieval tools.
By now, you can type crap into Google and still get something (as long as it's not a "google-seppuku" kind of crap, but more of the "I'm very clumsy with my wording and my keyboard skills" kind). You can also get other wonderful information, including stats on spelling errors or even statistics-based translation (which is otherwise very difficult to get by classical means), and statistics about currently hot topics (which can be fed back to improve results for ambiguous queries).
All this because search engines are built around a fuzzy logic: at the core is a braindead simple indexing rule, slightly modified by a bunch of hacks.
Such fuzzy logic approach "without really needing to teach the machine everything" has been recently successfully used on

--
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]

Re:Meanwhile by KnowledgeBug · 2006-08-11 07:23 · Score: 1

Your points are well taken, but I think still slightly miss the purpose of Cyc. Have you, in fact, entered "Is a cat a mammal?" into Google? With the quotes, it returns 3 hits; without the quotes, about 1.5 million. In neither case is it apparent what the answer to the question is. More importantly, even if one or more of the top articles DID contain the answer, what will it take for some other computer program to be able to understand the answer in some usable form? The goal of Cyc is to get this kind of information (and all the things inferrable from it) into a machine-usable form, not to be a replacement for the wealth of human-readable information now available on the web. Yes, the two need to interact and it will be interesting to see how that plays out, but it's a mistake to think that just because you as a human can sift through google responses, however good they may be, that a computer (software) can make use of those responses in the same way. If (when?) computers can read and understand natural language text, then it will be a different ballgame indeed. One of the arguments for something like Cyc is that there's a high level of knowledge that computers must have, in a form they can use, before they stand a chance of reading and comprehending unstructured stuff.

heh by Aradorn · 2006-08-10 06:28 · Score: 1

it inferred that most people were famous

heh yeah it also asked whether, if Abraham Lincoln was at the White House, his hand or foot was there with him. Because they started by inputting data from encyclopedias and had not put in the data describing what a human was yet.

I also wanna say there was something about it posing a question about religion and languages that was eventually used to write a master's thesis

Eliza and OpenCyc by DragonWriter · 2006-08-10 06:32 · Score: 1

Now, if we could only get these two wacky kids together...

Well, Eliza's a bit old for OpenCyc, but AIML (a generalized language for Eliza-style chatbots, so in a way Eliza's descendant) has been set up with OpenCyc already with Project CyN.

DHS Transcript by jo42 · 2006-08-10 06:46 · Score: 1

Spook: Where is Bin Laden?

OpenCyc: Bush's Ranch in Texas.

bloated database by Bizzeh · 2006-08-10 06:47 · Score: 1

AI by definition has the ability to learn for itself; what we have here is just a large database of human input, nothing that OpenCyc has found for itself.

--
portfolio

Re:bloated database by Anonymous Coward · 2006-08-11 09:54 · Score: 0

Too bad they didn't spend all that time making a decent heuristic. But then, they want *correct* answers; AI wants to be free!

AI needs monkey bodies to work! by Profane+MuthaFucka · 2006-08-10 06:50 · Score: 1

Who wouldn't love an AI monkey?

--
Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!

Where do I cash in my old Mind Pixels by Anonymous Coward · 2006-08-10 06:57 · Score: 0

The only thing that will save CYC is a computer that is intelligent enough to understand its importance.

I just had a bizarro idea by Rhinobird · 2006-08-10 07:03 · Score: 1

Maybe I was inspired by that Hyperactive Bob/robotic fast food story from yesterday. Could Cyc be used to aid/automate education? Some of the most effective teaching techniques involve a guided exchange of questions between the student and teacher. Could Cyc be modified to ASK questions? Could Cyc be used to quantify what students are learning?

All this time and effort was spent to educate a computer, can we dump that knowledge back into young uneducated humans?

--
If Mr. Edison had thought smarter he wouldn't sweat as much. --Nikola Tesla

A stone image can be no better than its makers... by 3seas · 2006-08-10 07:17 · Score: 1

And what does this say about the architect and contributors to opencyc?

they ain't got no common sence!

Hmmm, some how that seems inherent in such an undertaking.

This little AI by Anonymous Coward · 2006-08-10 07:21 · Score: 0

Practically drives itself crazy.

--
Automation applied to an inefficient operation will magnify the inefficiency.

Re:Mining Wikipedia and other online reference sit by AwenAnam · 2006-08-10 07:24 · Score: 1

And at the same time, the informality and errors in the data might introduce a human factor on the engine.

Use and Abuse of Goedel's Theorem by oblivion95 · 2006-08-10 07:25 · Score: 1

To find out when Godel's Theorem really applies, read "Godel's Theorem: An Incomplete Guide to Its Use and Abuse", by Torkel Franzen. Even Roger Penrose and Stephen Hawking have gotten it wrong, so don't feel ashamed.

Problem? by Anonymous Coward · 2006-08-10 07:37 · Score: 0

I keep wondering why we need to input this type of redundant information into a computer.

The ideal AI would be able to assert these conclusions for itself in much the same way as a child learns to associate "Mommy" with alive and lady and kind and fun... In other words couldn't this whole project be automated for the most part?

Again, inputting this type of data into a computer just seems so backwards to me.... 1960s pseudo SCI-FI..

The computer should be making these assumptions for itself, then asking or being able to question any ambiguities only after it has become confused.

Re:Problem? by Yvanhoe · 2006-08-10 21:06 · Score: 1

In other words couldn't this whole project be automated for the most part?

The project is automated for the most part. Most of its knowledge will come from inference from the knowledge entered by users. It would be like a deaf and blind child that can't use its senses to learn about the world but that could interact in a written form.

If you tell it "a grape is a fruit" it will infer "a grape is food" therefore "a living animal can eat a grape". This is simple inference but Cyc has a lot more high-level knowledge.
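
The grape example can be sketched as a toy chain of "is a" facts plus one rule. This is only an illustration of that style of inference, assuming a hypothetical in-memory fact store; it is not CycL and not how the Cyc engine is actually implemented, and every name in it is made up for the example.

```python
# Toy "is a" knowledge base: each term maps to its direct generalizations.
isa = {}

def tell(specific, general):
    """Assert that `specific` is a kind of `general`."""
    isa.setdefault(specific, set()).add(general)

def generalizations(term):
    """Everything `term` is, following "is a" links transitively."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in isa.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def animal_can_eat(term):
    """Hypothetical rule: a living animal can eat anything that is food."""
    return "food" in generalizations(term)

tell("fruit", "food")   # background knowledge entered earlier
tell("grape", "fruit")  # the new fact a user types in

print(sorted(generalizations("grape")))
print(animal_can_eat("grape"))
```

Only "a grape is a fruit" is entered for grapes; "a grape is food" and "an animal can eat a grape" fall out of the existing facts and the rule.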

--
The Wise adapts himself to the world. The Fool adapts the world to himself. Therefore, all progress depends on the Fool.

Here we go again... by CopaceticOpus · 2006-08-10 07:50 · Score: 4, Funny

So are these the fledgling footsteps of an emerging AI, or just the babbling beginnings of a bloated database?

Would you like it if they were not these the fledgling footsteps of an emerging ai or just the babbling beginnings of a bloated database?

as an overlord? by Moraelin · 2006-08-10 08:16 · Score: 2, Interesting

If you're going to use that AI as a tool, yes, ok. But the post I was answering to was the usual "I, for one, welcome our overlords."

And trust me, you _don't_ want an overlord that's inhumanly logical about it. It's that kind of thing that led to such logical solutions as "let's exterminate the population of Poland until 1970 to make room for German settlers." Or such logical solutions as communism. Sure, on paper it's perfectly sound and logical, if you assume that you can change humans overnight. Maybe sometimes being able to understand humans actually helps, eh?

That said, most of the stellar job performance that OCPD cases claim exists only in their own mind.

They tend to never get a job done because it's not yet perfect, for example. I have one two rooms from me at the office, who's taken three fucking years just to get a build script done because everything wasn't perfect enough for him. No exaggeration. Literally. Well, in parallel with building a convoluted unit testing environment, because the existing one didn't satisfy his purist view of the matter. (The old tests had some functional testing too. So his perfect version actually tests less, but is _pure_ unit testing, by his own definitions of it.) Of course, he's convinced that he's done a stellar, uncompromising job, but for everyone else he's just wasted some time and didn't even achieve more than what we already had.

Do I really want that even in a computer? Nope, not really. _The_ problem with most programs nowadays is just that: that they're OCPD nutcases. Workflows that were a lot more flexible (even if not as fast) with a pen and paper, get shoehorned into some lobotomized set of rules that allows no exceptions. The problem is that most often the rules aren't actually what the user wants to do: e.g., you end up unable to save a new client's data until you know their fax number, whereas with a paper form you'd fill in the data you have and leave the rest for later. Often it's more annoyance for the users and more work in workarounds, than doing it without a computer in the first place. (Of course, the equally OCPD-ridden creator will then bitch and moan about "idiot lusers" and how everyone should change to fit his perfect tool, instead of his tool changing to do what the user actually needs done.)

No real qualms with autism on its own, though. They tend to be very good with a computer, or any kind of abstract problem for that matter. (If sometimes difficult to deal with in a team.)

Combine it with OCPD, though, and... well, let's just say that they mix like Ammonium Nitrate and Fuel Oil. You get some of the most obnoxious personalities that way, and it's no fun for anyone involved, not even the geek. The poor bugger can't even tell that he's the one who offended the whole room, and proceeds to imagine that he's the victim of unwarranted cruelty.

--
A polar bear is a cartesian bear after a coordinate transform.

A standard for bogosity by senahj · 2006-08-10 08:40 · Score: 1

AI is bogus.

See The Jargon File entry for micro-Lenat
http://catb.org/jargon/html/M/microLenat.html

For a more literary perspective on the attempt
to imbue machine intelligence with common sense,
see _Galatea_2.2_ by Richard Powers,
http://www.amazon.com/gp/product/0312423136/sr=1-1/qid=1155242163/ref=pd_bbs_1/103-4246079-1703018?ie=UTF8&s=books
---
He's no fun; he fell right over.

--
Wait a minute. Didn't I say that on the other side of the record? I'd better check ...

Re:A standard for bogosity by Anonymous Coward · 2006-08-10 08:51 · Score: 0

Desiderata

Go placidly amid the noise and haste,
and remember what peace there may be in silence.
As far as possible without surrender
be on good terms with all persons.
Speak your truth quietly and clearly;
and listen to others,
even the dull and the ignorant;
they too have their story.

Avoid loud and aggressive persons,
they are vexations to the spirit.
If you compare yourself with others,
you may become vain and bitter;
for always there will be greater and lesser persons than yourself.
Enjoy your achievements as well as your plans.

Keep interested in your own career, however humble;
it is a real possession in the changing fortunes of time.
Exercise caution in your business affairs;
for the world is full of trickery.
But let this not blind you to what virtue there is;
many persons strive for high ideals;
and everywhere life is full of heroism.

Be yourself.
Especially, do not feign affection.
Neither be cynical about love;
for in the face of all aridity and disenchantment
it is as perennial as the grass.

Take kindly the counsel of the years,
gracefully surrendering the things of youth.
Nurture strength of spirit to shield you in sudden misfortune.
But do not distress yourself with dark imaginings.
Many fears are born of fatigue and loneliness.
Beyond a wholesome discipline,
be gentle with yourself.

You are a child of the universe,
no less than the trees and the stars;
you have a right to be here.
And whether or not it is clear to you,
no doubt the universe is unfolding as it should.

Therefore be at peace with God,
whatever you conceive Him to be,
and whatever your labors and aspirations,
in the noisy confusion of life keep peace with your soul.

With all its sham, drudgery, and broken dreams,
it is still a beautiful world.
Be cheerful.
Strive to be happy.

Max Ehrmann, Desiderata, Copyright 1952.
Re:A standard for bogosity by Yvanhoe · 2006-08-10 21:00 · Score: 1

Well, I'm less in the camp of "let's give them common sense!" right now, but considering that no one has ever made a working AI, I find it preposterous to say that a particular way of doing it is bogus. I think Lenat is not liked by the research community because he left it in order to found Cycorp. From what I see, he made a piece of software that worked a bit, called Eurisko, published a few papers on it, and kept being asked "but how did you choose this parameter?" and "why did you choose 10% as a synonym for 'a bit of'?" He couldn't just answer "I just fiddled with the values and hacked something that worked", and it would have taken him years to demonstrate everything, so he preferred to leave academic research and found a company.

I know this sounds a lot like a fake biography from some marketer out there, but a year ago I got pretty interested in the workings of Eurisko, and it made me see the progression of D. Lenat's work. I really think he is a good inventor and a bad researcher: he can make things work, but this sometimes requires some non-scientific kick-it-with-a-hammer skills that can never get published.

--
The Wise adapts himself to the world. The Fool adapts the world to himself. Therefore, all progress depends on the Fool.

not ai by bitspotter · 2006-08-10 08:50 · Score: 1

" So are these the fledgling footsteps of an emerging AI?"

No. It's probably far, far more useful.

Re: Get it by TaoPhoenix · 2006-08-10 09:42 · Score: 1

I am rather sad to see all the jokes-for-mods on this topic. This is such a fundamental project, with a critical head start (1985ish beginning). What Doug Lenat is trying to do is build the "ridiculously easy" base behind life. "If I put my fan on top of the counter, (unless it's out of balance and wiggles its way off) the fan will continue to stay there."

I understand the "6000 concepts" to be these "easy" ideas that we take for granted. Then anyone in the world can make "modules" for specific branches of knowledge. If there's an intelligent integration system, this could really grow within 10 more years.

"I want to read a fun Science Fiction story".
---> Do you like series? (Y)es / (N)o
"No, I hate Star Wars and Star Trek"
(Processing: User Emotional Matrix Mod Star Wars -3, Mod Star Trek -3)
---> Name an example of a Science Fiction story you found 'fun'.
"I liked Cordwainer Smith 'Game of Rat and Dragon' "
(Processing: Offer counterpoint potential example from same author)
---> Did you like 'Scanners Live in Vain' by the same author?
"No. Too confined, too creepy."
---> Recommendation from Same Author?
"Yes"
(Processing: Characterization +2, Location-Scope +4, Language.Grandeur +1)
---> Try 'The Burning of the Brain'

Except for attempts to "stump the bot", linked modular expert systems will eventually prove extremely competent, and force us to decide what abilities lie outside the range of expert systems.

--TaoPhoenix

--
My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine

Sentience by Paolone · 2006-08-10 10:42 · Score: 1

Unfortunately, however this is still a long way from sentient AI. Something you could literally talk to and it would be correct in factual based questions 99% of the time and be able to think abstractly.

To achieve sentience it doesn't have to be correct; it has to believe it is correct.

Mixing apples and oranges and metaphors by Maxmin · 2006-08-10 11:09 · Score: 1

Both have a place. Neural nets, in all their variety, have a long, loooong way to go before acquiring enough resolution to be able to gather the same level of understanding that Cyc has. Nets are usually highly specialized, where Cyc has tremendous breadth and depth.

Nets learn, but Cyc is taught. Do you push a newborn infant out into the world and expect it to acquire all it needs to know in order to be a successful organism? Of course not. And Cyc lacks the inverse, because it's just a predicate base.

Could be there's a way for them to play together.

--
O lord, bless this thy holy hand grenade, that with it thou mayest blow thine enemies to tiny bits, in thy mercy.

Values != emotion by Namarrgon · 2006-08-10 11:50 · Score: 1

I certainly agree with you about the importance of assigning values, but emotions are only one way of doing that, and a fairly abstract way at that (they're a combination of many other values, weighted by the individual's personality).

Other value systems include "threat level" (very popular in the animal kingdom, and important for self-preservation for any entity): objects like "dynamite" can be assigned a higher threat value, which will focus attention. "Relevant resources" are another: any objects that are considered useful for growth (this can include interaction with other entities). "Cost" is an obvious one, as is "uniqueness/replaceability". There are many others, some more relevant to humans (such as "aesthetics" and "humour").

An association database like Cyc can then make deductions from an initial set of values. For example, if it is told that "dangerous == high threat", and "explosions are dangerous", it then classes all explosion sources as threatening, and will not be so blasé about dynamite in the future.
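
To make that concrete, here's a rough Python sketch of the deduction. It is only a toy: the dictionaries, the `causes` relation and the numeric scale are all made up for illustration, and nothing here is CycL or reflects how Cyc's inference engine actually works.

```python
# Hypothetical toy model; none of these names come from Cyc or CycL.
HIGH_THREAT = 10

# "dangerous == high threat"
property_values = {"dangerous": {"threat": HIGH_THREAT}}

# "explosions are dangerous"
class_properties = {"explosion": {"dangerous"}}

# The association part: what each object can cause (assumed for this example).
causes = {"dynamite": {"explosion"}, "teacup": set()}


def threat_level(obj):
    """Return the highest threat value reachable from obj via what it causes."""
    level = 0
    for event in causes.get(obj, ()):
        for prop in class_properties.get(event, ()):
            level = max(level, property_values.get(prop, {}).get("threat", 0))
    return level


print(threat_level("dynamite"))  # 10
print(threat_level("teacup"))    # 0
```

The point is just that the threat value for dynamite is never stated directly; it falls out of chaining the two facts it was told.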

--
Why would anyone engrave "Elbereth"?

AI versus Real Intelligence by rocker_wannabe · 2006-08-10 12:20 · Score: 1

I'm afraid that most people will ALWAYS be disappointed in AI for several reasons. As the pace of society increases, people in general seem to take less time to think before asking their questions and usually get bad answers because of it. If the question is specific enough then it's the slurring of the speech, or the use of jargon, or a colloquialism, or background noise that throws off the listener.

If humans are the "gold standard" for understanding another person then AI can't do any better. A computer could make things worse by having access to TOO MUCH information. It would need to know more of the situational context before it could answer a question because of all the possible duplicate meanings that only a massive database would offer.

I would be happy with AI that was geared towards specific areas like medicine or art. That would narrow the context greatly and avoid annoying the user by not having to ask a bunch of contextual questions first.

"I'll admit that you're always right .... if you'll admit that I'm never wrong."

--
"Meaningless!, Meaningless!" says the Teacher. "Utterly meaningless!"

not quite.. by novus+ordo · 2006-08-10 12:21 · Score: 1

What does Cycorp mean by saying that the knowledge base will be "open source"? Will it be publicly available? Will it be free?

Yes, OpenCyc may be freely copied, distributed and used for commercial or non-commercial purposes according to the terms of the OpenCyc license. OpenCyc is currently released under the GNU Library or "Lesser" Public License (LGPL). "Source code" in this license refers to the CycL assertions in the OpenCyc Knowledge Base. Qualified parties can obtain a free license to a substantially larger subset of the Cyc Knowledge Base known as ResearchCyc (when it becomes available, Q203?), which is for R&D use only. The complete Cyc Knowledge Base can be licensed from Cycorp, Inc. for commercial use. Terms for licensing the complete Cyc KB are negotiated on an individual basis. Year by year, each assertion in the latest version of Cyc will migrate to a subsequent release of ResearchCyc, and each assertion in ResearchCyc will migrate to a later release of OpenCyc.

Not exactly "full"

--
"You're everywhere. You're omnivorous."

Is it just me? by Arrgh · 2006-08-10 13:02 · Score: 4, Interesting

I've downloaded and installed OpenCyc 1.0, it works fine (after quite a long initial startup delay and with enough swap) on a 2GB machine. I've been playing with it for a couple of hours, and I have a question.

I've created the following constants for my cats, their sibling and parents:
- #$Comet-TheCat
- #$Rocket-TheCat
- #$Packet-TheCat
- #$Mama-TheCat
- #$GhostDad-TheCat
I've asserted (#$isa [cat] #$Cat) about all of them.
I've asserted (#$biologicalMother [cat] #$Mama-TheCat) about Comet, Rocket and Packet
I've asserted (#$biologicalFather [cat] #$GhostDad-TheCat) about Comet, Rocket and Packet as well.
I even created #$ConceptionOfKitties, asserted (#$isa #$ConceptionOfKitties #$BiologicalReproductionEvent), (#$parentActors #$ConceptionOfKitties #$Mama-TheCat) and (#$parentActors #$ConceptionOfKitties #$GhostDad-TheCat).

So why can't Cyc infer that (#$siblings #$Comet-TheCat #$Packet-TheCat)? Is it a limitation in the public subset of the ontology, or some more fundamental issue with my data?

Re:Is it just me? by Anonymous Coward · 2006-08-11 03:49 · Score: 1, Interesting

Your query ought to be provable via the following rule:

(implies (and (children ?U ?X) (children ?U ?Y)) (or (equals ?X ?Y) (siblings ?X ?Y)))

("If two different children have the same parent, then they're siblings.") However, my understanding is that OpenCyc doesn't include the many thousands of Cyc rules (though ResearchCyc does). Try asserting this rule in your version of OpenCyc and asking the query again. Also, when asking this query, be sure to allow for a transformation step.
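
If it helps to see what that rule does mechanically, here's a rough Python sketch. It assumes the parent facts have already been turned into hypothetical (parent, child) pairs, and it is plain Python, not CycL and not the way Cyc's inference engine works internally.

```python
from itertools import permutations

# Hypothetical (parent, child) pairs standing in for the asserted parent facts.
CHILDREN = {
    ("Mama-TheCat", "Comet-TheCat"),
    ("Mama-TheCat", "Rocket-TheCat"),
    ("Mama-TheCat", "Packet-TheCat"),
    ("GhostDad-TheCat", "Comet-TheCat"),
    ("GhostDad-TheCat", "Rocket-TheCat"),
    ("GhostDad-TheCat", "Packet-TheCat"),
}


def infer_siblings(children_facts):
    """Apply the rule: two different children of the same parent are siblings."""
    by_parent = {}
    for parent, child in children_facts:
        by_parent.setdefault(parent, set()).add(child)

    siblings = set()
    for kids in by_parent.values():
        # permutations() never pairs an item with itself, which plays the
        # role of the (equals ?X ?Y) branch of the rule.
        siblings.update(permutations(sorted(kids), 2))
    return siblings


print(("Comet-TheCat", "Packet-TheCat") in infer_siblings(CHILDREN))  # True
```

Without the rule there is nothing connecting "same parent" to "siblings", which is why the bare facts alone don't answer the query.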
Re:Is it just me? by Arrgh · 2006-08-11 05:17 · Score: 1

Ah, yes, that does the trick, thanks!
Re:Is it just me? by Arrgh · 2006-08-11 05:21 · Score: 1

What distinguishes a rule from any other assertion?
Re:Is it just me? by Anonymous Coward · 2006-08-16 07:07 · Score: 0

A rule is simply a universally quantified statement, such as "all dogs are mammals" or "everyone in this room owns a car". Typically they have the logical form (x)(Fx => Gx): "for all x, if x is F then x is G".
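
Over a small finite domain you can check a statement of that form directly. A minimal Python sketch, with a made-up set of animals (the names and the helper functions are purely illustrative):

```python
# Made-up domain: each individual maps to the categories it belongs to.
ANIMALS = {
    "Rex": {"dog", "mammal"},
    "Comet": {"cat", "mammal"},
    "Tweety": {"bird"},
}


def is_dog(name):
    return "dog" in ANIMALS[name]


def is_mammal(name):
    return "mammal" in ANIMALS[name]


def rule_holds(domain, f, g):
    """True when every x in domain that satisfies f also satisfies g."""
    return all(g(x) for x in domain if f(x))


print(rule_holds(ANIMALS, is_dog, is_mammal))  # True: all dogs are mammals
print(rule_holds(ANIMALS, is_mammal, is_dog))  # False: Comet is a mammal but not a dog
```

An ordinary assertion, by contrast, is about particular individuals, like the single entry for "Rex" above.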

The Question by WilliamSChips · 2006-08-10 14:24 · Score: 1

What do Douglas Adams fans say that the answer to life, the universe, and everything is?

--
Please, for the good of Humanity, vote Obama.

Dammit by daecabhir · 2006-08-10 16:47 · Score: 1

Me without mod points again... I giggled my ass off on this one...

--

-- daecabhir (this mind intentionally left blank)

"Check those URLs!" by Anonymous Coward · 2006-08-10 19:23 · Score: 0

Wordnet, Thoughtreasure, Cyc

Desiderata by Anonymous Coward · 2006-08-10 19:32 · Score: 0

Wow, I had no idea this was written so long ago.

To paraphrase another great poet:

Still boring after all these years.
Paul Simon

Open Cyc and the gate by DErcyldonne · 2006-08-11 02:03 · Score: 2, Interesting

Taken as criticisms, the allusions to 'bloat' and 'database' are both significantly wide of the mark: if Cycorp has been guilty of anything, it's historically underestimating the size and technical complexity of the knowledge base indicated for the common sense reasoner the company aspires to build. OpenCyc is not a database except in the most attenuated sense: it encodes, not instance-level facts, but quantified and contextually parameterized rules for reasoning about the everyday world, and it is, if anything, far too small for this purpose. The number and complexity of the rules needed for this is fairly staggering and too-little-appreciated, even by many in the AI community (though not Minsky and McCarthy, both of whom are on the record as having recognized Cyc as one of the very few efforts in the field that was on anything like the right track). It's also fair to say that efficient and suitably flexible inference over a knowledge base of this size and complexity, and automated induction of new reasoning rules on the basis of experience - both obvious prerequisites for what Cycorp has been trying to do - present significant and partly unsolved theoretical challenges. The company's surprising willingness to tackle such weighty and potentially intractable issues head-on is a thing greatly to be commended in the present season of intellectual and commercial timidity, and even though they may not have always been able to deliver on every promissory note, one can't help but admire their spirit. And the fact remains that OpenCyc is now being used by an enthusiastic community of unaffiliated developers who are busily laying the groundwork for a new suite of open source applications. Judgement should not be pronounced on the basis of their efforts before they have been given the chance to see what they can deliver.

NO CARRIER trojan by sunny256 · 2006-08-11 09:07 · Score: 1

That's a bit of some old 80's modem humor. People dialed into a BBS or serial terminal with a VT emulator in those days. If you were disconnected because of some line noise you'd see garbled characters and then the NO CARRIER message from your own modem.

I was one of those. Which resulted in a BBS message from a slightly annoyed user. He had this over-intelligent communication program which assumed the connection was lost when it encountered that line. So it hung up the modem.
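
For the curious, the bug was presumably something like the first function below. This is a guess at the logic written in Python, not that program's actual code; `carrier_detect` is a hypothetical boolean standing in for the modem's carrier-detect line.

```python
def naive_is_disconnected(incoming_line):
    """Buggy check: treats any received text containing the phrase as a
    result code from the local modem."""
    return "NO CARRIER" in incoming_line


def safer_is_disconnected(incoming_line, carrier_detect):
    """Ignores the text entirely and trusts only the out-of-band signal.
    carrier_detect is a hypothetical flag for the modem's carrier-detect line."""
    return not carrier_detect


message = "See you next week! NO CARRIER"

print(naive_is_disconnected(message))        # True: hangs up on a joke
print(safer_is_disconnected(message, True))  # False: the line is still up
```

Anything that arrives in the data stream can be typed by the person at the other end, so it can't be trusted as a status report.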

Re:mod +1 "Rim Shot" by StressGuy · 2006-08-11 10:53 · Score: 1

I vote for #1 :)

--
A goal is a dream with a deadline

Rules vs Statistics, Cycorp needs users by arthernan · 2006-08-15 15:14 · Score: 1

Cyc, and hence its open version OpenCyc, is a rule-based AI system. A certain degree of AI is already available: OCR, speech recognition, even Google has some smarts in it. All these systems are mostly statistical. Many have intelligence built into the model design, but the actual numbers that make up the model have very little meaning. Bayesian networks, to my taste, capture the most info, but they are still behind. The jury is still out on which approach will take the first major step towards a reasoning system.

As for statistical AI, I do buy the idea that ultimately the information being processed is not necessarily as relevant as the final result. So eventually a reasoning system could function on a purely statistical basis. And there is a chance that our brain is purely statistical. But developing one with human elements as a purely statistical machine is, to me, unlikely.

Now back to Cyc. If you download it, it will be difficult to use, and you will find bugs inside it. Yes, yuck, they do exist. But it's a start. The biggest problem in fleshing this thing out is getting users to use it. Cycorp is no Microsoft with hordes of PR and marketing people. And they have not documented every feature. But I think it's the most complete AI system out there.

So is Cyc "the fledgling footsteps of an emerging AI?" or "the babbling beginnings of a bloated database?" I'll add my own: "a huge effort to accomplish something very difficult" and "not terribly well documented". It's probably a little of each. I personally think there is a huge opportunity here for people who are willing to work with Cycorp's lack of support experience. Cycorp needs user input, and if/when they get it you will be surprised at what is in the box. Everybody talks about the killer app for AI; there is a good chance that app will have to do with Cyc. There are several people from Cycorp watching this thread. I hope I got some people interested. I would suggest you post questions if you have any.

Does it have value? by richardwatson · 2006-08-20 17:47 · Score: 1

The first problem is what we're expecting from "AI". Don't expect Cyc to do everything and be the one true answer. Also, don't expect that the people who made it think it's for that either. It's a building block, a tool, and if we're ever to make computers behave more intelligently we'll need more than one of these tools.

Cyc's value is proportional to the number of uses it has, and the effectiveness of those uses. Now we have a line in the sand - we have a database of painstakingly constructed logical inferences - and opening up a subset is a great way to enable uses to emerge. Applications the original designers couldn't have thought of can now be created.

When I first heard about Cyc, about 10ish years ago, I thought it was destined to be limited. But so will all individual techniques or tools in computer reasoning, and that does not mean that each has no value.

--
http://www.tudumo.com - todo list with tags

Slashdot Mirror

195 comments