Domain: opencyc.org
Stories and comments across the archive that link to opencyc.org.
Comments · 44
-
Re:already been done
Cyc is a controversial project in the AI community, and I'm glad that you brought it up. I don't think anyone yet knows how to use a database of commonsense facts, which is what Cyc is (though limited - the open source version only has a few hundred thousand facts) and which is one thing NELL could create. However, researchers continue to think about ways that an AI could use knowledge of the real world. There are numerous publications based on Cyc: http://www.opencyc.org/cyc/technology/pubs.
-
still impossible for supercomputers
I wish we had *any* computer today that could do the things you mention. If it cannot be done on a supercomputer, forget about the handheld thingies.
The problem is that both language translation and voice-to-text need a full understanding of the context. Any spoken language has so many different interpretations that it's useless to try automatic processing without full artificial intelligence. A classic example used in AI courses is "he saw that gasoline can explode". This sentence means either "he realized that it's possible for gasoline to explode" or "he watched a gasoline container as it blew up"; one needs further examination of the context to know which meaning was intended.
A project that has tried to create a solution for this problem is Cyc, but it still seems to be very far from realizing the original intent. Computers can do amazing things, but they still don't have the common sense of a four-year-old child.
-
Not really open source?
According to this FAQ entry, it's not fully open-source...
-
Parsing != Understanding
"Time flies like an arrow, fruit flies like a banana."
In which context are you talking? Take this one: "he saw that gasoline can explode". Did he see one particular can of gasoline exploding or did he realize that it's possible for gasoline to explode?
These and many other examples of ambiguous parsing problems have been running around the AI/NLP community for decades. The simple answer to that problem is that parsing a natural language sentence depends, ultimately, on the sense of the words, which can only be disambiguated from the context. And that's why NLP is an impossible problem by itself. One cannot process natural language alone, without an understanding of the situations that NL describes.
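As a rough illustration of why the words alone don't settle the parse, here is a minimal Python sketch. The lexicon and its tag names are hand-written assumptions for this one sentence, not the output of any real tagger. It only enumerates the tag assignments the lexicon allows; picking the intended one is exactly the part that needs context.

```python
from itertools import product

# Hand-written toy lexicon (an assumption for illustration only).
# Each word maps to the informal part-of-speech tags it could carry.
LEXICON = {
    "he": ["pronoun"],
    "saw": ["verb"],
    "that": ["complementizer", "determiner"],
    "gasoline": ["noun"],
    "can": ["modal verb", "noun"],
    "explode": ["verb"],
}

def readings(sentence):
    """Return every tag assignment the lexicon permits for the sentence."""
    words = sentence.lower().split()
    options = [LEXICON[w] for w in words]
    return [list(zip(words, tags)) for tags in product(*options)]

for reading in readings("he saw that gasoline can explode"):
    print(reading)
```

Both readings quoted above are among the candidates ("that" as complementizer with "can" as modal verb, and "that" as determiner with "can" as noun). Nothing in the lexicon says which one the speaker meant.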
It's possible to create NLP programs to talk about limited situations, like Eliza, which has been around for more than forty years, and several other more sophisticated programs. But to have a program that really understands natural language, one needs a program that understands the subject of the text. There are several projects to create a program like that; one of those is Cyc.
-
"Fact checker" needs a KB first
"a fact checker
... that compiles things from various sources and then presents it to a human to do final checking?"
Offtopic, but you may wish to play around with FACTory, a 'game' where you answer trivia-like questions mined from the web and other sources. If enough people agree that a statement is true, it is entered in the Cyc Knowledge Base, a quite large knowledge base suitable for natural language processing, AI and/or Semantic Web research.
(There is also an LGPL version, OpenCyc, and ResearchCyc -- free for non-commercial/research use.)
-
Symbolic vs semantic
Basically: a symbol is a variable and can hold any value. If a system knows that Dolly is a sheep and that sheep are animals and that animals eat, it can guess that Dolly eats. But it cannot tell if Dolly is a plane, unless someone somewhere made that relation (planes are machines, machines are not living beings, animals are living beings, so Dolly can't be a plane). They would need an unlimited amount of rules.
A human "knows" about the meaning (semantic) of the symbol "sheep". Although this has never been discussed, he could answer that a sheep will not stand still if set on fire. The question is how the human is able to tell this. He does not need a sharp line of arguments.
But maybe he simply uses an enormous amount of small rules that seem to form something more complex called semantic in the sense of the article. The OpenCyc project assumes this and tries to teach a machine millions of small rules (assertions and concepts) to create sort of common sense based on a real world view (requiring to "know" about the world) in software.
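The Dolly example can be sketched in a few lines of Python. This is only a toy under stated assumptions: the dictionaries and function names are made up for illustration and have nothing to do with OpenCyc's actual representation or API.

```python
# Toy knowledge base. All names here are hypothetical, not OpenCyc's.
ISA = {"Dolly": "sheep", "sheep": "animal", "animal": "living being",
       "plane": "machine"}
DISJOINT = {("machine", "living being")}  # machines are not living beings
CAN = {"animal": {"eat"}}                 # animals eat

def ancestors(term):
    """Return the term plus everything reachable by following is-a links."""
    chain = [term]
    while chain[-1] in ISA:
        chain.append(ISA[chain[-1]])
    return chain

def can_do(term, action):
    """A term can do something if any of its ancestors can."""
    return any(action in CAN.get(t, set()) for t in ancestors(term))

def could_be(term, category):
    """False only when an explicit disjointness fact rules the category out."""
    for a in ancestors(term):
        for b in ancestors(category):
            if (a, b) in DISJOINT or (b, a) in DISJOINT:
                return False
    return True

print(can_do("Dolly", "eat"))      # follows sheep -> animal -> eat
print(could_be("Dolly", "plane"))  # ruled out by the disjointness fact
```

Remove the single entry in `DISJOINT` and `could_be("Dolly", "plane")` comes back true, which is the point made above: the system can only exclude what somebody bothered to write a rule for.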
-
Ahh, wonderful
We now can have a 5-year anniversary of bashing a completely useless project brought to you by some internet kook who thinks he has "solved" AI by writing a program that even a 5-year-old would understand is useless.
If you want a real database of "common-sense" knowledge, you should check out CYC instead. It might be harder to do it that way, but it sure pays off if you actually want to use it for something beyond spamming usenet groups and slashdot.
-
Re:A sample?
This completely illustrates the main problem with chatbot technology. They're all very good at canned responses to single questions but most fail to follow a conversation. ALICE in particular tries to use pattern matching to detect every single possible thing that might be said to it. That might sound hard in itself but now imagine doing that with every single sequence of sentences two or three deep and the whole problem becomes intractable. The truth is we're going to need a reasoning engine like http://opencyc.org/ before we're going to be able to handle realistic conversations realistically.
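To put a number on "intractable", here is a back-of-the-envelope sketch in Python. The figure of 10,000 distinct single inputs is an assumption picked purely for illustration; it is not a measurement of ALICE's pattern set.

```python
def patterns_needed(distinct_inputs: int, depth: int) -> int:
    """Count the input sequences of a given depth that would each need a pattern."""
    return distinct_inputs ** depth

# Assumed figure, for illustration only: 10,000 distinct things a user might say.
for depth in (1, 2, 3):
    print(depth, patterns_needed(10_000, depth))
```

Under that assumption, one-deep matching needs ten thousand patterns, two-deep needs a hundred million, and three-deep needs a trillion, so enumerating conversations by hand stops scaling almost immediately.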
-
Babies have an instinctive understanding of 'real'
...and parents/pain for what is 'correct.' I don't think the concept is gone, but there are problems that are buried in the question as posed which (I think) became clearer stumbling blocks as technology advanced. NOTE: I'm not an AI theorist, nor do I play one on TV; I just like the idea and read a lot. Hence, this is all pulled out of my fundament.
Cycorp is not a poorly funded idea in the wrong direction. Cycorp chose a different tack; they decided that rather than trying to build a reality and correctness filter, they'd rely on human brains to do it for them (like trusting your parents implicitly) and instead concentrated on the connectivity of the 'facts' accrued by the 'baby.' CYC is still very much around, and is very much in demand by various parts of the government and industry - if you want to play with it yourself, you can download a truncated database of assertions called OpenCYC. Folks have even gone so far as to graft it onto an AIML engine, to produce a chatbot with the knowledge of OpenCYC behind it.
The problem: how does your baby learn what's real and what's REAL NINJA POWER? Or, pardon me, what's REAL NINJA POWER and what's just a poser? Someone's gotta teach it. Which means it has to learn not only facts, but how to evaluate facts. So it has to learn facts, and how to handle facts - which means it has to learn how to learn. Which means you need to know that answer from the git-go. Tortuous games with logic aside, the onus is now much more heavily on the designer to have a functioning base - whereas with the Cyc approach, the only 'correctness' that is required is that of information, and perhaps that of associativity or weight - which can be tweaked, dynamically. The actual structure of how that information is acquired, stored and related is not relevant once decided. Having said all this, Cyc is (from the limited demos I've seen) quite impressive at dealing with information handed to it. It just wouldn't do very well at deciding what to do with that information - that's the job of the humans that gave it the info. It can tell you about the information, but not what to do with it. That task requires volition, really.
Volition is a killer. What is it? How do you simulate it? How do you create it? Is it random action? Random weighted action? Path dependent action? Purely nature, purely nurture? When it comes down to it, the human is (as far as we know) not a purely reactive system, which Cyc (AFAIK) is. Learning requires not only accepting information, but deciding what to do with it - deciding how it will be integrated into the whole. If the entity itself isn't making that decision, then the programmer/designer/builder has already made it in the design or code - and then it's not really learning, is it?
Sorry if this is confused. As I said, I don't do this for a living.
-
Take Search Technologies in a Different Direction
Since the dawn of the web, workarounds and cheats have continually been found to "optimize" search results. The sad result of every web site's quest to appear at the top of search results is that it has prevented search engines from providing "objectively relevant" results.
While Google, Yahoo!, and Microsoft continue to develop "search relevance technologies", someone out there needs to develop and bring to market a cognitive search engine that can actually understand the content of a page the way a human does and connect it with the requested search terms. Something similar to the Cyc project that Doug Lenat has been working on since the 80's (and its subsequent OpenCyc F/OSS derivative), only tied into search engines. And, no, I am not talking about Ask Jeeves or other silliness like that. ; )
Otherwise, "relevance" is just going to become a euphemism for "the people with the most money to 'optimize' their results"
-
Meanwhile OpenCYC has not been updated since 2003
The OpenCYC.org project's SourceForge CVS repository has not been updated since October 22nd 2003. I hope some of that DARPA money will go a little way towards completing the 1.0 release.
-
SR + Google + OpenCyc
One of the things I'm really anticipating is a speech-to-text processor that can work with OpenCyc's Natural Language Processor, so that we could interact with a truly intelligent system. Imagine, you say, "Computer, Earl Grey, Hot," and the computer responds with, "There are a number of meanings for what you said. Based on your previous queries, I expect you want the Tea Machine to steep you a cup of Earl Grey Tea. Is this correct?"
Of course, using such a processor, OpenCyc would also be able to use the video camera at your front door to ID you as you approach, open the door for you, and say "You have 5 new voicemail messages, one from 555-6789, from someone who sounded like your mother. Her tone was urgent. Would you like to listen to this message first?"
I haven't even got to Google integration yet, but that was mainly added as a way to get people to read this ;)
OpenCyc can already do independent Google searches and collate the results.
-
Re:5 years?
I have studied neural networks extensively and believe me these do not have the potential to revolutionize anything.
NN are as simplistic & bogus as the next thing. Other methods like Support Vector Machines have been shown to be more powerful. Not to say that there isn't room for improvement or that AI will never be fruitful. It's coming, slowly but surely. Here are a few references to interesting AI research:
1
2
3
4
-
Re:Atomic view of content
I would look at the following technologies:
WordNet is well known although not that powerful.
Common sense is really a beta but still it's a big database.
Cyc is really cool, but not all free. Look at CycL, the language they developed.
I think a simple thing like having integrated access to Wikipedia articles or dictionary.com from the browser would be cool. Amazon I don't know.
-
Re:Are there any...
Depends on what you mean by AI, of course, but OpenCyc is a great project that could really use more contributors.
-
Re:AI and adventure games
I'd love to see an open source project that integrates OpenCyc into an interactive fiction programming suite.
The primary benefit I see in doing this is that instead of requiring users to complete excruciatingly specific chains of actions to achieve a goal, programmers could set goalstates and let the creativity of their players run wild trying to achieve them. OpenCyc's inference engine should be able to determine whether the goalstate was achieved or not, based on the properties of the objects.
This would, of course, make for an entirely different interactive fiction experience. Up until now, interactive fiction programming has focused on creating intuitive but nonobvious chains of reasoning and rewarding the player for discovering these sequences. Goal-based interactive fiction would place a greater focus on designing situations based on the properties of your objects. For example:
The Guard Room is filled with weapons. There are several shotguns mounted on the wall, next to a cabinet full of ammo. There is a filing cabinet in the corner, and a map of the prison on the wall.
There is a desk here with a phone, a lamp, a letter opener, and a guard who seems to have fallen asleep while doing paperwork. It's Jimmy. The nice guard. Poor kid. You feel bad that he has to die so you can be free.
In a normal IF game, there would be one preferred way to solve this problem. Perhaps two, if the author felt especially creative. But an OpenCyc enabled game would let you examine the room in increasing detail, and use any and all of the objects you find to achieve the goal of incapacitating Jimmy.
Instead of being required to, say, grab a gun from the shotgun rack and shoot Jimmy in order to move past him, you might decide electrocuting Jimmy is quieter and smarter:
> get letter opener from desk.
Taken. Jimmy snores quietly but does not budge.
> cut lamp cord with letter opener
You are electrocuted. You have died.
Oops. OpenCyc knew that the letter opener was metal and that the lamp cord was plugged in, and that a human being could be electrocuted by doing this. Next time you unplug the lamp before cutting the cord and electrocuting Jimmy. Or maybe you tie him up with the lamp cord, and don't kill him. Your choice.
What makes this style of gameplay especially intriguing is that solutions could emerge which would surprise the author. It might even be fun to create situations which have no immediate solution and see if, through clever introspection, one might not emerge. Sharing your unique solutions with others would be part of the fun of playing the game.
By building on OpenCyc, the effort one programmer takes to define objects could be used and amplified by other authors. It could perhaps even be used by the general OpenCyc community in other applications. If nothing else, the challenge of trying to create a goal-based interactive fiction language that was powered by a common-sense inference engine like OpenCyc would be a heck of a lot of fun.
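Here is a minimal Python sketch of what goal-based checking might look like for the Guard Room. It is a toy under loud assumptions: the state dictionary, the action functions, and the goal test are all invented for illustration, and real inference in OpenCyc would work from its own assertions rather than hand-coded rules like these.

```python
import copy

# Hypothetical starting state for the Guard Room (not OpenCyc's real API or data).
START = {
    "letter opener": {"material": "metal"},
    "lamp": {"plugged_in": True, "cord_cut": False},
    "player": {"alive": True},
    "Jimmy": {"incapacitated": False},
}

def cut_cord(state, tool):
    """Cutting a live cord with a metal tool electrocutes the player."""
    if state["lamp"]["plugged_in"] and state[tool]["material"] == "metal":
        state["player"]["alive"] = False
    else:
        state["lamp"]["cord_cut"] = True

def unplug_lamp(state):
    state["lamp"]["plugged_in"] = False

def tie_up(state, target):
    """Tying someone up only works once there is a loose cord to use."""
    if state["lamp"]["cord_cut"]:
        state[target]["incapacitated"] = True

def goal_reached(state):
    """The author specifies only the goal state, not the steps that lead to it."""
    return state["player"]["alive"] and state["Jimmy"]["incapacitated"]

first_try = copy.deepcopy(START)
cut_cord(first_try, "letter opener")
print(goal_reached(first_try))

second_try = copy.deepcopy(START)
unplug_lamp(second_try)
cut_cord(second_try, "letter opener")
tie_up(second_try, "Jimmy")
print(goal_reached(second_try))
```

The first attempt fails because the player dies; the second succeeds without the author ever having scripted "unplug, cut, tie" as the solution. In this sketch the consequences are still hard-coded per action, which is precisely the work a common-sense knowledge base would be expected to take over.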
-
Re:Using heuristics in searches
You're thinking of CYC, as in enCYClopedia. (The open source version of this system was released in the wake of the movie AI, and is available at opencyc.org.)
As another poster has pointed out, this project had nothing to do with heuristics, and everything to do with ontology -- that is, the formal specification of knowledge using logical constructs.
In the way of background, the project was the brainchild of Douglas Lenat, who proposed to take traditional AI techniques to their limit by giving a computer program all of the knowledge of the world which a toddler might have. Once a computer (so his reasoning went) had that knowledge, it could then be fed additional facts, and it would be able to understand them as well, with some occasional guidance from humans (much as a toddler might). Eventually the program would have enough knowledge that it would become self-propagating. The project took dozens of computer scientists and philosophers specializing in ontology the better part of the 1990s, and was frequently covered in the popular press.
The end result had not been so widely discussed or covered. I infer the program was not in fact self-propagating as was intended. Clips I saw towards the end of the project showed the enormous potential problems in this approach. For instance, one might tell CYC about an electric shaver. Later that night, it would go through and find inconsistencies between this new knowledge and its existing ontological database. For instance, in the case of the electric shaver, it might ask whether the human was also an electrical appliance while using the shaver, because someone had previously specified a rule that anything incorporating an electrical appliance was itself an electrical appliance. Hence, I gather that rather than becoming self-propagating, the larger the ontological database became, the greater the number of logical inconsistencies that arose, thereby miring the entire approach. At some point, progress would presumably be bottlenecked by the fact that many ontological experts trained in the CYC software would have to be working around the clock to attempt to sort out these problems.
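The shaver problem is easy to reproduce in miniature. The Python sketch below is purely illustrative: the facts and the rule are made up to mirror the anecdote, and nothing here reflects how CYC actually stores or applies its rules.

```python
# Made-up facts and a made-up rule, only to illustrate the failure mode.
appliances = {"electric shaver"}
incorporates = {
    ("shaving human", "electric shaver"),  # the human is holding and using the shaver
    ("bathroom cabinet", "mirror"),
}

def apply_rule(appliances, incorporates):
    """Overly general rule: anything incorporating an electrical appliance is one too."""
    derived = set(appliances)
    changed = True
    while changed:  # repeat until no new conclusions appear
        changed = False
        for whole, part in incorporates:
            if part in derived and whole not in derived:
                derived.add(whole)
                changed = True
    return derived

print(sorted(apply_rule(appliances, incorporates)))
```

The rule is perfectly reasonable for a car with a radio in it, yet here it dutifully concludes that the shaving human is an electrical appliance. Each such surprise needs a person to notice it and narrow the rule, which is the bottleneck described above.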
Is anyone aware of any software projects that actually use CYC or openCYC? I am also greatly interested if anyone has a link to a good discussion by Lenat or others on their assessment of the CYC project at its completion. It is a monumental chapter in the history of AI, but despite this, I have never seen many technical articles published by CYC team members. I suspect that it may be nearly impossible to have a fully self-consistent set of ontological definitions of the world in the manner that CYC attempted. If so, that would be an amazing statement about AI, and indeed, the nature of knowledge itself.
Bob
-
Re:Using heuristics in searches
-
Sounds like CYC
-
Hence the CYC project
The compelling dream is that you laboriously load up a computer with enough facts so that it can glean understanding of what it's reading, and one glorious day the computer has enough smarts to make sense of things on its own, and two weeks after crawling the entire Internet, it knows everything.
Hence Doug Lenat's Cyc, now partly open source. Unfortunately that glorious day has been "a few years away" for over 13 years.
The knowledge base is built upon a core of over 1,000,000 hand-entered assertions (or "rules") designed to capture a large portion of what we normally consider consensus knowledge about the world.
But I haven't come across any postings from Cyc on Slashdot correcting misinformation and lies.
Clearly this is possible because all those darn human kids do it; maybe you have to use a more complex computer and leave it for a few years crawling on the floor putting things in its mouth.
-
CYC
This sounds like the CYC Project. For over a decade they have been trying to collect all human knowledge and explain it to a computer using a logical language they developed. They claim that it has applications in search, among many other things, and a natural language translator is part of the system they are developing. They have even released part of CYC as Open Source!
I haven't seen any "WOW!" things come out of the project yet, but you have to admire their "just do it" approach to AI.
-
addendum: Re:interesting
WTF is up with Boromir son of Faram?
Looking at his user page shows an astounding mass of clueless posts (sadly, some moderate quite high).
Looks like a live experiment with OpenCyc.
-
Re:Why?
Google gains self awareness.
Google already scares me a little. If you look at Google Labs, their Google Sets and WebQuotes already show simple "knowledge" of real world items.
Most AI research projects (like Cyc) face a huge problem: data entry. All facts and rules must be manually entered by human operators. What if you could connect a Cyc-like AI frontend to Google's world-knowledge backend? Sure, much of the Internet is porn, spam, scams, banner ads, and lies, but Google already relies on PageRank of reliable sites to weed out the truth.
-
Re:The Cyc project
It's been 9 years since that critique. Since then, lots of people have been trying Cyc themselves, and having quite a bit of success. OpenCyc is the open source version of Cyc, which differs from the commercial version mainly in that it includes a fraction of the assertions in the commercial product. I'm surprised that we haven't yet seen a community effort to create an equivalent assertion database, but I imagine it's only a matter of time.
No one, not even Lenat, expects Cyc to become the humanlike AI that sci-fi authors have written about for decades, but I think it's becoming increasingly clear that Cyc is finally beginning to prove its worth. Cyc-enabled derivative projects like CycSecure will likely become much more important in the near future, and I suspect that the next decade will vindicate Lenat's approach to creating software that we can legitimately label "intelligent".
-
Re:Get AI moving with open source
Maybe it's time to start encouraging open source projects and development in this field.
There already is at least one such project.
Using open source development, a project to establish a tool kit for AI programming fundamentals could be born.
The source code (and the development thereof) is the least of the problems. The problem is what to write, not what language or API to use. AI isn't hurting for lack of code or API. It's hurting because humans don't know how to develop AI systems. As I mentioned previously, we won't know until we know how humans are intelligent.
Adding code-in-their-spare-time open-source developers having very little if any education or understanding of AI isn't going to help.
-
Re:AI, as a field, doesn't have a clue.
I just saw a presentation Doug Lenat gave about Cyc a few weeks ago that seemed fairly impressive. Even better, all developments in Cyc have been committed to eventually flowing into OpenCyc. Just wondering if you (or anyone else) had thoughts on the promise of Cyc technology.
While I'm not sure if this will lead directly to Strong AI, it seems that having some sort of ontology would be a prerequisite (and quite useful in the short term to boot).
-
Re:The enCYClopedia of AI 'common knowledge'
Hadn't heard that specific one, but that sounds like the sort of thing Cyc comes up with. Another time it deduced that "everyone is famous." The cyclists had to explain to Cyc that almost all the humans it had specific knowledge of were famous, but the majority of us aren't.
"The researchers also told Cyc to ask questions if it decides it needs more clarity about a concept.
In 1986 Cyc asked whether it was human. That same year it asked whether any other computers were engaged in such a project."
That's either cool or scary, take your pick :-)
-
Check out OpenCyc
One of the best speech understanding systems in existence is OpenCyc - and it is open source!
-
Re:The CHINEESE ROOM
Good point. This comes down to the AI theorist's counter jibe:
I think, but you only simulate the process of thinking.
It is still, despite the fascinating open brain experiments and the PET monitoring, very difficult to evaluate what is happening to the mind inside the brain, other than through the conventional I/O paths.
I have always wondered what would happen if you sufficiently extended Alice with world knowledge such as that from the OpenCyc Project, and how hard it would become to prove that Alice doesn't think and that humans do (well, some of them at least).
-
World facts on a hard drive
The Cyc project aims to collect world knowledge ("common sense"). However, many AI tasks show that this job is probably too huge to do manually.
Do you think we will eventually get to a point where an AI system is able to gather common sense knowledge from a giant corpus, such as the web? What are the problems we will have to solve?
-
Prevention
-
Re:bah
There is a project currently out there to open source common sense. It's called OpenCyc and there was even a recent Slashdot article discussing the project.
-
Re:Something else to think about ...
Actually, Cyc is pretty good at formulating sentences itself. I think you can even test this out in the currently released OpenCyc v0.6.
(:,
eca
-
Open source code for the Cyc project available
There is an open source version of Cyc called OpenCyc, and it's available right here.
-
A few links
Here's the unofficial Cyc FAQ and a collection of Cyc resources.
Cyc's corporate page has links to many recent news articles, the OpenCyc project, and other stuff of potential interest.
-
Old article
The first announcement of this can be found here.
Anyway, there is a related open source project for anyone interested.
Cycorp can be found here.
-
OpenCyc project on SourceForge
I don't personally have anything to do with the project, but I thought it might be worth mentioning that there's an OpenCyc project being hosted by SourceForge. From their website:
OpenCyc is the open source version of the Cyc technology, the world's largest and most complete general knowledge base and commonsense reasoning engine. Cycorp, the builders of Cyc, have set up an independent organization, OpenCyc.org, to disseminate and administer OpenCyc, and have committed to a pipeline through which all current and future Cyc technology will flow into ResearchCyc (available for R&D in academia and industry) and then OpenCyc.
--Cycon
-
How about a few links...
Cycorp's home page.
OpenCyc is the open source version of the project, due to be released in July 2002.
The artificial intelligence FAQ mentions this project.
An interview with founder Doug Lenat.
A dissenting view from 12 years ago, by Christopher Locke.
-
websites
OpenCyc.org, the open source Cyc website, and Cycorp, the commercial website.
-
download cyc
(anonymous karma whoring -- whoo hoo)
Cycorp web site
OpenCyc
Sourceforge project
-
Google Sets precursor to Google AI?
I'm convinced that Google will become a giant AI. Google Sets seems like a small step towards machine understanding. The problem with older AI was bootstrapping their knowledge base. The Google AI systems will use the entire internet as an encyclopedia of self-correcting, peer-reviewed, continually-updated "facts". Suddenly, the problem of manual data entry for an AI system like OpenCyc is massively parallelized to the entire population of web users! Of course, the web is full of lies and self-promotion, but the web contains multiple voices, multiple "truths", that will create a general consensus using Google's PageRank algorithm.
-
Cyc
-
Re:Oh, NO!
"Reading between the lines" involves not some native common sense that is wedded to intelligence, but a collectively evolved cultural contextualization. When we read an article in an encyclopedia, a lot of other stuff other than intelligence comes into play: x years of public school education, idiomatic constructs, varying by geographic location, that may or may not enhance or obscure meaning, and, of course, the double meanings and entendres inserted by bored or biased encyclopedia writers.
That is exactly what Cyc is doing. They're defining a contextual database of terms and concepts, trying to form subjective links, including idiomatic constructs, double meanings and entendres. Don't knock it before you've read up on it.
-
Now combine this ...
... with OpenCyc and see what turns up.
-bch