Open Source Natural Language Processing?
fieldmethods asks: "One area where Open Source and Free Software don't seem to have really taken off is Natural Language Processing (using computers to deal with human languages). There are a few projects that are open source, such as Festival (a speech synth system, now ported to Java), NLTK, a general-purpose NLP system in Python, and the Linguana project, a Perl implementation of a semantic network not unlike WordNet (but better). Generally, though, there doesn't seem to be a lot of Open Source momentum behind the field as a whole. It's a challenging, difficult field that would benefit from collaboration, especially given the potential of replacing static corpora with on-the-fly corpora developed by search engines. Is anybody else interested in this?"
Actually, I am interested in this. I did some computational linguistics work while I was doing my BSc/MSc and really enjoyed it.
You should also have mentioned your interesting website fieldmethods.net as a good source for exploring all things NLP [which I thought referred to Neuro-Linguistic Programming when I first saw it...].
If machines that attempt the Turing Test count as NLP, then NLP is a solved problem. You just need a random number generator to choose from a list of prechosen responses (face it, there's nothing less believable than talking to someone who actually listens to you.) Therefore, I submit Virtual Boglin:
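#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(void) {
    char line[256];
    int i = 1;
    srand((unsigned)time(NULL)); /* seed so each conversation differs */
    printf("Hello\n");
    while (i) {
        /* "Listen": read one line of input and ignore it. */
        if (!fgets(line, sizeof line, stdin)) break;
        switch (i) {
        case 1: printf("Microsoft Sucks! Use Linux!\n"); break;
        case 2: printf("I need to boot back over to the Windows side to play System Shock 2.\n"); break;
        case 3: printf("Sony is an evil monster who won't be content until we have lost all our rights.\n"); break;
        case 4: printf("Have I shown you my Clie? Look, it can play the Spiderman Trailer!\n"); break;
        }
        i = rand() % 5; /* 0 ends the conversation */
    }
    printf("Leave me alone; I'm about to get a new high score.\n");
    return 0;
}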
There are plenty of open-source NLP resources and software packages for many languages on the web.
Last but not least:
Will.
...the people that are knowledgeable in this field enjoy getting paid for their work.
http://www.speech.cs.cmu.edu/
There are probably others (search google.com, freshmeat.net, sourceforge.net).
The POESIA project (an open-source Internet content filter, partly funded by the European Commission under the Safer Internet Access Plan, IAP2117/27572) will have some open-source NLP components (for English, Spanish, Italian...).
See POESIA site for details.
POESIA (Public Opensource Environment for a Safer Internet Access) aims to protect European youth (in educational institutions) against harmful or inappropriate Internet content, and uses several techniques (including NLP, image processing, ...) to achieve this goal.
> [...] - primarily because it is intrinsically complicated.
Yeah right. And databases, math and user interfaces are not?
"Is anybody else interested in this?"
Judging from the lack of comments on this story, I'd say the answer is a resounding "No."
One of the best speech understanding systems in existence is OpenCyc - and it is open source!
http://opennlp.sourceforge.net
http://nlpfarm.sourceforge.net
If you're looking for speech software, there isn't much good open-source software, since just about every aspect of modern speech processing is patented.