Google Open-Sources SyntaxNet Natural-Language Understanding Library, Parsey McParseface Training Model
Google announced on Thursday that it is open sourcing its new language parsing model called SyntaxNet. It's a piece of natural-language understanding software, Google says, that you can use automatically parse sentences, as part of its TensorFlow open source machine learning library. The company also announced that it is releasing something called Parsey McParseface (Google has a sense of humor), which is a pre-trained model for parsing English-language text. Nate Swanner of The Next Web attempts to explain it: Combining machine learning and search techniques, Parsey McParseface is 94 percent accurate, according to Google. It also leans on SyntaxNet's neural-network framework for analyzing the linguistic structure of a sentence or statement, which parses the functional role of each word in a sentence. If you're confused, here's the short version: Parsey and SyntaxNet are basically like five-year-old humans who are learning the nuances of language. In Google's simple example above, 'saw' is the root word (verb) for the sentence, while 'Alice' and 'Bob' are subjects (nouns). Parsey's scope can get a bit broader, too.
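For readers wondering what "parsing the functional role of each word" produces, here is a minimal sketch of a dependency parse stored as plain data. It does not call SyntaxNet or any real parser. The sentence "Alice saw Bob", the `Token` class, the helper functions, and the relation labels are all assumptions for illustration, based only on the summary's remark that 'saw' is the root.

```python
from dataclasses import dataclass


@dataclass
class Token:
    index: int      # 1-based position in the sentence
    word: str
    head: int       # index of the governing word; 0 means "root"
    relation: str   # illustrative label, not real parser output


# Hypothetical hand-written parse of "Alice saw Bob".
parse = [
    Token(1, "Alice", 2, "nsubj"),
    Token(2, "saw", 0, "root"),
    Token(3, "Bob", 2, "dobj"),
]


def root_word(tokens):
    """Return the word whose head is 0, i.e. the root of the sentence."""
    return next(t.word for t in tokens if t.head == 0)


def dependents(tokens, head_word):
    """Return the words attached directly to the given head word."""
    head_index = next(t.index for t in tokens if t.word == head_word)
    return [t.word for t in tokens if t.head == head_index]
```

In this toy parse, `root_word(parse)` picks out the verb, and `dependents(parse, "saw")` lists the two nouns hanging off it. A parser's job is to predict the `head` and `relation` fields for every word.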
We need useful AI, not stupid AI that acts like a five-year-old. Why is this useful to anyone at all?
It's a piece of natural-language understanding software, Google says, that you can use automatically parse sentences, as part of its TensorFlow open source machine learning library.
YOU CAN USE AUTOMATICALLY PARSE SENTENCES
So, can Parsey McParseface make sense of what manishs posts? Because I generally can't. I assume that the example sentence from the summary probably came from the article, but for some reason the "editor" didn't think to read his summary to make sure that it actually made sense out of context.
How can we continue to believe in a just universe and freedom to eat crackers if we have no ale?
What we really need is software to automatically proofread and edit slashdot submissions.
Another memememe mcmemeface to suffer through
"Parsey McParseface (Google has a sense of humor)"
more like dour corp peons at google tries hard, very hard, to appear humorous.
even tay had better humor
James while John had had had had had had had had had had had a better effect on the teacher.
The company also announced that it is releasing something called Parsey McParseface (Google has a sense of humor)..
If by 'sense of humor' you mean 'a repeat of something that was humorous a while ago under a different context'.
"I like to lick butts!" by MobileTatsu-NJG (#32700246) (Score:5, Informative)
Fruit flies like a banana.
> (Google has a sense of humor)
No. They really don't.
Since google basically owns the internet, it stands to reason that the company's intelligence level equals that of the hive. XXXsey McXXXface was *never* funny, in any context.
Bison from Buffalo, New York, are known to bully other Buffalo bison, who in turn bully (buffalo) other New York bison. In other words:
Buffalo buffalo buffalo buffalo buffalo buffalo buffalo.
A concise version of the Library of Babel expressing every idea of a language?
Not really, because Xy McXface is not funny for any value of X.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
94% syntax is definitely good, for a machine learning parser. Now if you were to come to the land of rule-based parsers, 94% is the norm.
Google loves machine learning, and it's easy to see why. That's how they made their whole stack. They have the huge amounts of data to train on, and the hardware to do so. It's so seductive to just throw a mathematical model at huge amounts of data and let it run for a few weeks.
Rule-based systems don't need any data to work with - they just need a computational linguist to spend a year writing down the few thousand rules. But the end result is vastly better, fully debuggable, easily updatable, understandable, and domain independent. That last bit is really important. A system trained for legalese won't work on newspapers, but a rule-based system usually works equally well for all domains.
In 2006, VISL had a rule-based parser doing 96% syntax for Spanish (PDF) - our other parsers are also in that range, and naturally improved since then. Google is hopelessly behind the state of the art.
A two-year-old gelding destined to race in Australia has been saddled with the name Horsey McHorseface. (pun intended by editors)
http://www.bbc.com/news/world-...
94% of all sentences are parsed correctly, or 94% of all words?
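The distinction matters, and parser accuracy is usually reported per word rather than per sentence. A rough sketch of how far the two numbers can drift apart, using made-up data (the function names and head indices below are hypothetical, not Google's evaluation code):

```python
def word_accuracy(gold, predicted):
    """Fraction of words whose predicted head matches the gold head."""
    total = sum(len(g) for g in gold)
    correct = sum(
        1
        for g, p in zip(gold, predicted)
        for gold_head, pred_head in zip(g, p)
        if gold_head == pred_head
    )
    return correct / total


def sentence_accuracy(gold, predicted):
    """Fraction of sentences in which every head is correct."""
    exact = sum(1 for g, p in zip(gold, predicted) if g == p)
    return exact / len(gold)


# Made-up head indices (0 = root) for three short sentences.
gold = [[2, 0, 2], [2, 0, 2, 3], [0, 1, 2]]
pred = [[2, 0, 2], [2, 0, 2, 2], [0, 1, 2]]  # one wrong head in sentence 2
```

Here 9 of 10 words are attached correctly (90%), but only 2 of 3 sentences are fully correct (about 67%). If per-word errors were independent, a 20-word sentence at 94% per word would come out fully correct only about 0.94^20 ≈ 29% of the time.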