Slashdot Mirror


Linguistics Meets Linux: A Review of Morphix-NLP

Emre Sevinc writes "Zhang Le, a Chinese scientist working on Natural Language Processing, has decided to pack the most important language analysis and processing applications into a single bootable CD: Morphix-NLP. More than 640 MB of NLP-specific software is included and there's still a lot of place on the CD which uses a compressed filesystem for bringing us the best of both worlds."

186 comments

  1. Ironic.. by grub · · Score: 5, Funny


    All this language processing packed onto a single CD yet /. can't run a spellchecker... :)

    --
    Trolling is a art,
    1. Re:Ironic.. by Anonymous Coward · · Score: 0

      yet /. can't run a spellchecker...

      If people used an Operating System that has a built in spell checker that can be turned on for any text entry field you like we wouldn't have to worry about this...

    2. Re:Ironic.. by 0x12d3 · · Score: 1, Funny

      And there's still a lot of place on the CD which uses a compressed filesystem for bringing us the best of both worlds."

      Maybe Slashdot is out of place on their servers.

  2. Noooo by lakeland · · Score: 4, Funny

    I was in the process of downloading this already. Damn you slashdot!

    1. Re:Noooo by FooAtWFU · · Score: 2, Informative

      Should have used BitTorrent. Then it'd be "I was in the process of downloading this already. Yay for Slashdot!!!"

      --
      The World Wide Web is dying. Soon, we shall have only the Internet.
    2. Re:Noooo by lakeland · · Score: 1

      Indeed. When I get it down I promise to put up a .torrent. Unfortunately I'm only getting 15Kb/s currently (9 hours remaining)

  3. that's pretty cool by homerjs42 · · Score: 3, Insightful
    This is a pretty cool thing. It seems like the kind of thing that would be of great use to anthropologists or others translating from a language that is more or less unknown. By unknown, I mean not used commonly outside of its people group, and probably unwritten.
    Neat.

    --dw

    1. Re:that's pretty cool by belmolis · · Score: 5, Interesting

      Actually, not very many anthropologists these days do much linguistic work. That's partly because linguistics has developed as a separate field and partly because cultural anthropology was largely taken over by Postmodernists, as a result of which it has nearly died. Most research on "exotic" languages these days is done either by linguists or by missionaries (who want to translate the New Testament).

      I am a linguist and have done extensive fieldwork, mostly on Carrier, the native language of a large region of northern British Columbia. (I also hack a little. Once upon a time I wrote the head-final shell mentioned in Charles Dodgson's comment.) Software is increasingly used for this kind of work, but for the most part it is not the sort of NLP software provided on the Morphix-NLP CD. A lot of that software is useful primarily if you've got a large corpus to work with, and it often presupposes that some basic resources exist, such as a lexicon, or at least a wordlist with part of speech information. For many languages even basic resources such as a lexicon don't exist or aren't available in electronic form, and when you're dealing with really small languages, there aren't any ready-made corpora, such as news text. If you want a text corpus, you've got to make it yourself, usually by recording people telling stories or whatever, and transcribing it. This is an important part of fieldwork, but it's incredibly slow and tedious.

      There are some tools designed specifically for this kind of linguistic research. One is Transcriber, a tool that assists a human being in transcribing audio recordings. One of the older tools is Shoebox, a dictionary database program for field linguists, originally written to run under DOS.

      Some of us have used Unix tools to extract and process information, e.g. grep to do regular expression searches. Ken Church at Bell Labs used to give a tutorial "Unix for Poets" on how to use Unix tools for linguistics. Here is his handout. For example, I've produced dictionaries of several dialects of Carrier using scripts written mostly in AWK plus the usual Unix tools, controlled by elaborate Makefiles. Some of us also use emacs a lot, not only as an editor but for doing searches. If you're interested in what kinds of software are of interest to linguists, you might check out the Computational Resources for Linguistic Research page.
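
      As a rough illustration of the kind of word-frequency count those Unix pipelines produce, here is a minimal sketch in Python rather than AWK. The function name and the sample sentence are made up for this example, and the tokenizer simply treats any run of letters as a word, which is an assumption that will not suit every orthography.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count lowercased word tokens, where a token is any run of letters."""
    # [^\W\d_]+ matches one or more Unicode letters (no digits, no underscore).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(tokens)


# Hypothetical sample text, standing in for a transcribed corpus.
sample = "the dog saw the cat and the cat saw the dog"

# Print the most frequent words first, one "count word" pair per line.
for word, count in word_frequencies(sample).most_common(3):
    print(count, word)
```

      For a real wordlist you would probably want a tokenizer that knows about the apostrophes, diacritics and digraphs of the language in question.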

      It is worth mentioning that the spread of the internet has made available a lot of useful material for linguistic research. There are now quite a few languages for which you can obtain a good chunk of text (say at least 100K words), and often you can find parallel text (that is, the language you're interested in plus a translation into English or another language that is useful to you). But this works mostly for relatively big languages, that is, say, languages with a million or more speakers. There are around 340 such languages, depending on how you count, about 2% of the world's oral languages.

      One topic that concerns some of us is how software and other technology can speed up the process of documenting dying languages. Languages are rapidly becoming extinct - some experts estimate that as many as 90% of the languages currently spoken will be extinct in 100 years. [Computer languages may be proliferating at the same rate.:)] The late Ken Hale had seven languages die on him. If we don't find a way to speed up the documentation, or slow down the rate of extinction, most of those languages are going to die without very much being known about them.

    2. Re:that's pretty cool by judicar · · Score: 0

      Yeah, too bad it's all in chinese :(

  4. Great... by Anonymous Coward · · Score: 2, Funny

    This means that GCC will have to be expanded to support all human languages as well as programming languages...

    1. Re:Great... by lakeland · · Score: 5, Funny

      Actually, I saw someone working on something like parsing english as a programming language; try a Google for 'controlled english' sometime. The general idea is that management may not be able to write the specifications, but they can read them and tell you it isn't what they're really after _before_ you code the thing.

    2. Re:Great... by adrianbaugh · · Score: 1

      I wondered (for about 5 seconds, once) about writing a doctype for english, similar to those for HTML.

      --
      "'I pass the test,' she said. 'I will diminish, and go into the West, and remain Galadriel.'"
      - JRR Tolkien.
    3. Re:Great... by Hanji · · Score: 1

      I once considered trying to write out a rough BNF definition of English ... I gave up when I realized it'd be largely useless without a way to differentiate between different parts of speech, which I was too lazy to try to figure out how to do better than just a massive hand-entered database :-D

      --
      A Minesweeper clone that doesn't suck
    4. Re:Great... by lakeland · · Score: 2, Interesting

      You can get such lists pretty easily without having to type them in. Just looking up the most frequently used POS for that word gives almost 90% accuracy. Alternatively I wrote a program that automatically predicts the POS for new words.
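
      A minimal sketch of that "most frequently used POS" lookup, to make the idea concrete. The tiny training list, the tag names and the NOUN fallback for unknown words are all made up for illustration; a real lexicon would be built from a large tagged corpus, and this toy says nothing about the accuracy figure.

```python
from collections import Counter, defaultdict


def train_most_frequent_tag(tagged_words):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for word, tag in tagged_words:
        counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag_words(words, lexicon, default="NOUN"):
    """Tag each word by lookup; unknown words get the default tag."""
    return [(w, lexicon.get(w.lower(), default)) for w in words]


# Hypothetical hand-made training sample: (word, tag) pairs.
training = [
    ("the", "DET"), ("dog", "NOUN"), ("can", "VERB"), ("run", "VERB"),
    ("the", "DET"), ("can", "NOUN"), ("can", "VERB"),
    ("run", "NOUN"), ("run", "VERB"),
]

lexicon = train_most_frequent_tag(training)
print(tag_words(["The", "dog", "can", "run", "fast"], lexicon))
```

      Note that the lookup ignores context entirely, so "can" always gets the same tag whether it is a verb or a container.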

      However, your BNF grammar is likely to come unstuck as soon as you try to parse either casual english or moderately complex english. Either one very quickly leads to adding lots of infrequently used grammar rules, and hence lots of ambiguity in even simple sentences.

      The idea of controlled english was to create a useful subset of english that does conform to a BNF grammar (or LL(1), or something, I forget). Writing in it turns out to be quite hard -- very easy to forget you're writing in a programming language. But there is at least one machine-assisted translator from english into controlled english.

      Given a few years, I wouldn't be surprised to see a program like that be the basis of the next big thing in programming languages.

    5. Re:Great... by millette · · Score: 3, Interesting

      I guess this would interest you too. BTW, have you read "Le Ton Beau de Marot" by Hofstadter?

      In 1977, Xerox adopted Systran for internal translations by creating a Multinational Customized English that's easier to translate. [1]

      In 1930, C.K. Ogden proposed a tiny version of English: just 850 words that could be learned in a few months and used to say anything. He called it Basic English (BE). [2] [3]

      1. basic english
      2. machine translation
      3. xerox systran
    6. Re:Great... by Doomdark · · Score: 1
      Given a few years, I wouldn't be surprised to see a program like that be the basis of the next big thing in programming languages.

      Why is that? I generally believe in using the right tool for the job... and controlled or non-controlled, human languages that I'm familiar with do not seem to have many benefits over existing programming languages?

      --
      I like paying taxes. With them I buy civilization -- Oliver Wendell Holmes
    7. Re:Great... by Anonymous Coward · · Score: 0

      That was the idea behind COBOL!

    8. Re:Great... by bstone7 · · Score: 1

      Didn't they call that COBOL?

    9. Re:Great... by PenrosePattern · · Score: 1

      I propose a very tiny version of English: just 2 words that can be used to say anything. "On" and "Off".

      --
      Seuss - I'm telling you this 'cause you're one of my friends. My alphabet starts where your alphabet ends
    10. Re:Great... by Lussarn · · Score: 1

      Yepp, like no doctype HTML... No one knows if it's supposed to be human or machine readable.

    11. Re:Great... by Anonymous Coward · · Score: 0

      That would be a pretty big upgrade because the type of parser used for most programming languages isn't powerful enough for use on human languages, i.e. programming languages use context free grammars while human languages use context sensitive or unrestricted grammars.

  5. When... by irokitt · · Score: 0, Redundant

    Does my computer do my Spanish homework for me?

    --
    If my answers frighten you, stop asking scary questions.
  6. So this means by YoungBonzi · · Score: 3, Funny

    Maxis will have The Sims actually talking, instead of looking "special".

  7. Anyone remember Forum 2000? by Stile+65 · · Score: 2, Interesting

    Does anyone remember Forum 2000 (link does not actually work)? It's got some neat technology behind it. And the conversations between surfers and the SOMADs were hilarious. When I first saw the site, I thought it was actual people imitating the different characters. Does anyone know what happened to the site and why it no longer functions? I miss it.

    --
    I claim first use of "Error No. 0B" - or "No. 0B error." It'll be the new ID 10T!
    1. Re:Anyone remember Forum 2000? by Anonymous Coward · · Score: 4, Informative

      New version? Got this after some googling
      http://www.forum2010.org/

    2. Re:Anyone remember Forum 2000? by Stile+65 · · Score: 1

      I love you. Thanks!

      For those who aren't surfing at 0 or -1, someone graciously provided this link.

      Now I'm going to surf and hope I find the wisdom of Ayn Rand on the new site as well. *cackles*

      --
      I claim first use of "Error No. 0B" - or "No. 0B error." It'll be the new ID 10T!
    3. Re:Anyone remember Forum 2000? by generic-man · · Score: 3, Informative

      There was a brief time when they were Forum 3000, but the domain has fallen into the hands of domain squatters.

      Forum 2000 and 3000 died mainly because the people who ran them got bored and/or wanted to work on their graduate theses. It sure was fun to play with the Zephyr interface while it lasted, though. :)

      I wonder whether Forum 2010 is run by the same folks. I doubt it since Forum 2000 and 3000 were both Carnegie Mellon projects, and forum2010.org is registered to someone in St. Louis.

      --
      For more information, click here.
    4. Re:Anyone remember Forum 2000? by Stile+65 · · Score: 1

      No, it appears to be run by another person. And it's missing Ayn Rand. Still, it's quite amusing. :)

      --
      I claim first use of "Error No. 0B" - or "No. 0B error." It'll be the new ID 10T!
  8. But? by Anonymous Coward · · Score: 0, Funny

    But can it translate Perl code?

  9. Re:Good Chinese Compression by MoThugz · · Score: 5, Funny

    If you want to play the typical stereotype... please at least get it right.

    It's the Japanese who have problems pronouncing L's... and the Chinese have problems pronouncing R's.

    The Westerners on the other hand, can pronounce almost anything, but will never ever get facts right :)

  10. It can by Anonymous Coward · · Score: 0

    It just converts it to Chinese.

  11. Why Linux is great for doing applied linguistics? by dark-br · · Score: 4, Informative


    This page has some reasons.

  12. Re:Good Chinese Compression by Anonymous Coward · · Score: 0, Offtopic

    Well, the reason that they have trouble pronouncing them is that L and R are the same letter in their alphabet.

    And since when can Westerners pronounce anything?
    Last time I checked they couldn't pronounce a single Russian word half-way correct.

  13. eckcha isa outa by CPUgrind · · Score: 1, Funny

    Ia oundfa aa anguagela ita antca igurefa outa!

    1. Re:eckcha isa outa by Anonymous Coward · · Score: 0

      Apparently the moderators can't either.

  14. It's actually useless for that by scheme · · Score: 4, Interesting
    This is a pretty cool thing. It seems like the kind of thing that would be of great use to anthropologists or others translating from a language that is more or less unknown. By unknown, I mean not used commonly outside of its people group, and probably unwritten. Neat.

    Actually, this software seems like it would be totally useless for that purpose. The software was developed and has a bunch of heuristics and domain knowledge put in by experts in english or the relevant language. Without similar expertise, the software can't be adapted to a new language. The software isn't a universal translator.

    So your hypothetical anthropologists or translators would still need to spend time and learn the language in question.

    --
    "When you sit with a nice girl for two hours, it seems like two minutes. When you sit on a hot stove for two minutes, it
    1. Re:It's actually useless for that by homerjs42 · · Score: 1, Informative
      Actually, this software seems like it would be totally useless for that purpose. The software was developed and has a bunch of heuristics and domain knowledge put in by experts in english or the relevant language. Without similar expertise, the software can't be adapted to a new language. The software isn't a universal translator.

      So your hypothetical anthropologists or translators would still need to spend time and learn the language in question.

      Well, yeah. I _know_ that. I was just speculating that such tools would be useful in the effort of learning/translating/etc. a language that had not as yet been studied formally.

      --dw

    2. Re:It's actually useless for that by Anonymous Coward · · Score: 4, Informative

      While right on this probably not being of much help to the typical anthropologist, it's not at all true that most of the software has lots of built in domain knowledge.

      At least half the tools are general purpose applications for constructing various kinds of models, whether they be trees or HMMs or n-gram models or entropy models.

      Believe it or not a lot of NLP work gets done on understanding algorithms that apply broadly across languages.

      There is some English specific stuff on the CD, but most of it isn't.

      The only software

    3. Re:It's actually useless for that by Anonymous Coward · · Score: 0

      There is a lot of research going into this sort of stuff but honestly, most of the work that applies "broadly across languages" is fairly lacking. It's a very hard problem that isn't going to be solved today or tomorrow.

      The cases in which it works well are when the two languages have very much in common from the beginning. In this case it's hardly surprising that you would manage without much domain language, since in a way you are using a well known language for comparison.

      Basically you will need some domain knowledge, the question is if you need it in a separate form or if the source language is enough in itself.

    4. Re:It's actually useless for that by Anonymous Coward · · Score: 0

      Most research does pick some sort of language to focus on, but unless it's an engineering task, solutions that can't be fitted to other (kinds of) languages are not usually the ideal.

      Domain knowledge, as you say, is crucial, but as far as the kind of software on this distribution goes, domain knowledge is part of your data store and not the algorithms involved.

      That said, the software on this CD is NOT IBM Via Voice or Google Language tools which more than likely have all kinds of engineering hacks to increase performance.

      The stuff on this CD for the most part has NO real language in mind. The big exceptions are the parsers and part of speech taggers. And there are even some general purpose parsing and tagging tools on there too! The software treats domain knowledge as part of a data store and NOT usually a crucial part of the algorithm. A lot of the same parsing techniques used in CS are used in NLP, n-grams and HMMs are not even remotely techniques limited to a certain language, and most of the other linguistic models too are congruent modulo some parametric variations imposed on the model by constraints in the model.

      English is one of the primary model languages used BUT parametric variation aside, good theories shouldn't be crippled from working with other languages or at least languages in the same genetic family.

  15. Alright! by Robotbeat · · Score: 1

    I was JUST googling for stuff about grammar and sentence diagramming on computers when I saw this story! Anyways, hopefully this will encourage people trying to make AI (AI capable of passing the Turing test) to use true grammatical parsing/analyzing (a non-open-source unsuccessful attempt is http://www.brainhat.com/). Also, perhaps this will encourage the development of an open-source grammar checker for OpenOffice.org or KOffice.

    1. Re:Alright! by anonomouse · · Score: 1

      My Turing test questions: 1. Describe an orgasm. 2. What does very cold ice cream taste like? 3. Describe your worst experience with anxiety. My full Turing test question: 1. Would you like to go for a swim?

  16. How many of you really support OSS? by citog · · Score: 0, Flamebait

    Amazing, isn't it? An article is posted about the latest Microsoft hole or the latest RIAA/MPAA engagement and the slashdot rabble section will harp on about how wonderful OSS is. Then an article extolling the benefits of OSS comes along that, more than likely, adds to the potential for global adoption of OSS. The result: the rabble section heaps mind-numbing stereotypes upon the article, killing useful discussion of the subject.

    Any chance people could be vaguely consistent and get behind OSS for reasons other than elitism?

    1. Re:How many of you really support OSS? by Stoptional · · Score: 0, Offtopic

      No.

      --
      Stoptional
    2. Re:How many of you really support OSS? by citog · · Score: 1

      Concise and decisive at least.. :)

    3. Re:How many of you really support OSS? by sisukapalli1 · · Score: 1

      I would first try to put the things in perspective. NLP is a relatively esoteric field, and most common "techies" wouldn't be so keen on delving into the internals.

      You should take a look at posts on Mozilla, KDE, and GNOME, and you will see that people do get behind OSS for reasons other than elitism.

      S

    4. Re:How many of you really support OSS? by citog · · Score: 1

      I appreciate that, I guess this morning I'm in bad humour. Must have picked the wrong articles to read first :) There are just a lot of times when I think the support of OSS isn't motivated by the philosophy; rather, it is being used as a stick too often.

  17. actually by DrLZRDMN · · Score: 2, Funny

    no states have laws like that; this summer Texas ditched theirs, they were the last to do so
    stiff sodomy laws? there's a joke in there somewhere...

  18. Funny? No - Informative! by Anonymous Coward · · Score: 0

    Thank you very much for posting the 'controlled english' comment - I had never heard of it before. As it turns out, it was exactly what I was looking for. Thanks.

  19. Re:Good Chinese Compression by Anonymous Coward · · Score: 0

    depends on what part of china you're from about the L/R thing actually

  20. Download Link by Hal+The+Computer · · Score: 3, Informative

    Here is where you can go to download the .iso image .
    Try not to kill their site. If someone has downloaded it, it would be nice of them to post a .torrent on Slashdot.

    --

    int main(void){int x=01232;while(malloc(x));return x;}
    1. Re:Download Link by Anonymous Coward · · Score: 0, Flamebait

      you sir, are a fucking karma whore. you post a direct link to the iso so that you get mod points, and also try to come off as a compassionate fruitcake by asking people not to download it. may you rust in pieces.

  21. Chomsky and stuff by Saint+Stephen · · Score: 2, Interesting

    This article is about linguistics, and he said "go read Chomsky", so I went and read Chomsky's bibliography. What I'm about to say applies to all modern philosophers and mathematicians:

    God damn, them are some fancy-schmancy sounding titles! Does anybody ever get the feeling sometimes that maybe things are simpler than our smartest people currently make them out to be? If you can't talk as simple as I'm talking now, you ain't really "nailed it."

    The reason I think this is true: back when all mathematicians only had Roman Numerals, the process for explaining how to multiply 3-digit numbers was extremely opaque, and it was nearly impossible to describe how to do long division. Now we can teach 3rd/4th graders how to do it before they watch "Barney".

    I saw some links about all the math they never teach anymore (compound arithmetic, like pounds shillings pence comes to mind). I think something similar will be the case in 1000 years with everything Chomsky and any arbitrary math guy says: they just haven't thought about how to say it simply yet. Life just *ain't* that complicated (if you have the right way to think.)

    1. Re:Chomsky and stuff by Anonymous Coward · · Score: 0

      Does anybody ever get the feeling sometimes that maybe things are simpler than our smartest people currently make them out to be? If you can't talk as simple as I'm talking now, you ain't really "nailed it."

      Arithmetic is a simple artificial system that is easy to understand. At the lowest levels, the real world may be governed by simple laws, but when multiplied by millions of interactions it becomes much more difficult to ascertain what these rules are. So instead we build heuristics to understand the higher level behaviour that we see, but these heuristics are really only special case simplifications of the real overall effect of the real laws. Often the most difficult thing is not understanding the underlying laws and heuristics but being able to see which are most applicable in each unique situation - weeding through all the possible influences and knowing what the overriding factors are. This is why studies like atmospheric physics, psychology, political science, and linguistics are so easy to sound good at, but so difficult to truly understand and pin down. So no, nothing having to do with the real world is ever simple.

    2. Re:Chomsky and stuff by idlemachine · · Score: 2, Insightful

      I both agree and disagree: life *is* that complicated, we just haven't yet come up with workable abstractions for a lot of things that allow us to handle them in the simplified manner you're asking for.

      What you're seeing here is the process by which that happens. Chomsky especially is someone whom I don't consider to want to "make [things] out" to be more complicated than they are; on the contrary, he seems to be more about wanting to understand the *true* process that is at work, not the pre-accepted social fiction that we might currently use as an explanation.

    3. Re:Chomsky and stuff by tepples · · Score: 1

      the math they never teach anymore (compound arithmetic, like pounds shillings pence comes to mind)

      Like days, hours, minutes, seconds? There still exist measurements that haven't been decimalised.

    4. Re:Chomsky and stuff by monecky · · Score: 5, Interesting

      I'm a programmer getting my master's in linguistics. Computer Science undergrad. Trust me. This is some tough stuff... until you learn the basics. Then everything starts making sense. There is a huge hurdle getting into any field... and it is usually because of the terminology. Every field has its own terminology because every field needs to be extremely precise in its explanations.

      Linguists don't think Knuth is very lucid.

      Linguistics is neat. Syntax (the study of the structure of language), Phonology (the study of the interactions of sounds and what a child has to actually 'learn'), Phonetics (the study of the human language system and the sounds that it can produce/hear), and Morphology (the study of the smallest possible unit that holds 'meaning') all work together to form an idea of what goes on in the human mind.

      --
      http://jones.ling.indiana.edu/~prrodrig
    5. Re:Chomsky and stuff by revividus · · Score: 1
      back when all mathematicians only had Roman Numerals, the process for explaining how to multiply 3-digit numbers was extremely opaque, and it was nearly impossible to describe how to do long division.

      Especially considering that 3, 7 and 12 were all 3 digit numbers, whereas 2, 6, and 9 had 2 digits, and 1, 5 and 10 had one; and 8 had four! Holy crap!
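
      For anyone who wants to check those symbol counts, here is a small sketch of a standard greedy Roman numeral converter. The function name is made up, and it uses the usual subtractive notation (IV, IX and so on), which is an assumption about which convention is meant.

```python
def to_roman(n):
    """Convert a positive integer to a Roman numeral, greedily, largest value first."""
    pairs = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    out = []
    for value, symbol in pairs:
        # Emit this symbol as many times as it still fits into n.
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


# Print each number with its numeral and the number of symbols it uses.
for number in (1, 2, 3, 5, 6, 7, 8, 9, 10, 12):
    numeral = to_roman(number)
    print(number, numeral, len(numeral))
```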

      This has to be the funniest troll I've read in ages. My compliments!

    6. Re:Chomsky and stuff by kramer2718 · · Score: 4, Interesting

      Well, I'll answer your questions both in respect to NLP, and also more generally.

      First of all, most practical NLP techniques aren't *that* complicated simply because they must be able to be computed quickly. There are quite a few statistical hacks prevalent.

      Most NLP techniques use probabilistic variants of two models: finite automata and pushdown automata (both models are actually pretty simple, but if you don't know what they are, they may sound complicated).

      Finite automata consume input and transition to different states (a finite number of them) based on that input. They can also be interpreted as generating output instead of consuming input.

      Pushdown automata are almost the same except that they have a stack that they can push symbols onto and pop symbols off. They are the machine counterpart of Context Free Grammars (CFGs): pushdown automata recognize exactly the languages that CFGs generate.
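
      To show what the stack buys you, here is a minimal sketch of a pushdown-style recognizer for strings of the form a^n b^n (some a's followed by the same number of b's), a textbook example that no finite automaton can recognize. The function name and the state labels are made up for this illustration.

```python
def accepts_anbn(s):
    """Accept strings made of n 'a's followed by exactly n 'b's, with n >= 1."""
    stack = []
    state = "reading_a"
    for ch in s:
        if state == "reading_a" and ch == "a":
            # Remember each 'a' by pushing a marker onto the stack.
            stack.append("A")
        elif ch == "b" and stack:
            # Each 'b' cancels one remembered 'a'; no more 'a's allowed now.
            state = "reading_b"
            stack.pop()
        else:
            # An 'a' after a 'b', a 'b' with nothing to match, or a stray symbol.
            return False
    # Accept only if at least one 'b' was seen and every 'a' was matched.
    return state == "reading_b" and not stack


for candidate in ("ab", "aabb", "aab", "abab", ""):
    print(repr(candidate), accepts_anbn(candidate))
```

      The same push-and-pop bookkeeping is what lets a parser keep track of nested clauses.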

      As I said above, most NLP techniques use probabilistic variants of and small extensions to these two concepts.

      The reason that Markov models (probabilistic finite automata) work so well to model speech is because they are flexible, simple, and linear just like speech. The reason that CFGs work so well to model language is that they are flexible, and hierarchical, and so can capture the recursive nature of language (think about "the man who killed the horse who killed the dog who...").
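
      As a concrete toy version of the Markov idea, here is a minimal sketch of a bigram model: it counts which word follows which and turns the counts into transition probabilities. The two training sentences, the function names and the `<s>` and `</s>` boundary markers are assumptions made for this example; real systems add smoothing and train on far more data.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Pad with start and end markers so boundaries count as transitions.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def transition_prob(counts, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev); 0.0 if prev was never seen."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0


# Hypothetical two-sentence corpus.
corpus = ["the man killed the horse", "the horse killed the dog"]
model = train_bigrams(corpus)

print(transition_prob(model, "the", "horse"))
print(transition_prob(model, "man", "killed"))
```

      Each state here is just the previous word, which is why the model is linear: it has no memory of anything further back.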

      Having said all of that, I don't think that these models capture the way that humans process language/speech. I think that neural networks have the potential to capture this better. They just aren't mature enough. We also don't really have a good architecture to run neural networks. A human brain has about 10^11 neurons, with something like 10^14 connections between them (within a couple of orders of magnitude), all running in parallel. Try simulating that on today's serial architectures, and you'll run into problems.
      So my hypothesis is that there is probably some inherently simple learning algorithm for neural networks that we just don't know yet that will help solve many different types of problems (there is some biological evidence of there being a single learning algorithm implemented in the brain).

      So yes, there is likely a simpler answer, but until we know it, we have to use heuristics and statistical hacks in order to build systems that work.

      As to science in general, the reason it all sounds complicated is twofold:

      First, things interact in a very chaotic way. Even if the interactions are simple, when you compose many very small interactions, you find complex behavior.

      Secondly, even if the interactions are actually simple, we humans with our Newtonian intuitions have a hard time understanding non-Newtonian interactions.

      Hope that helped.

    7. Re:Chomsky and stuff by revividus · · Score: 1
      I was exposed to some of Chomsky's linguistic work (as opposed to his political writing/interviews) awhile back, and it was indeed neat. I wasn't taking the course myself, and so didn't dig too deep into it, but I was trying to help someone else with their homework, and even the surface bits I comprehended while I was helping were pretty cool. Chomsky is a smart guy.

      I still find it hard to believe the original parent was serious, though... Roman numerals... :)

    8. Re:Chomsky and stuff by monecky · · Score: 4, Informative

      There is no talk of linguistics complete without mentioning Chomsky's political diatribes. :)

      He pretty much defined linguistic theory for the past 40 years. Once he had a voice he turned into somewhat of a political critic. A conspiracy-theorist. I don't see him solving any political problems, and I don't know how well respected he is by those who study such things, but I think he's a loon. (But, oh god, I wish I could study with him. :) )

      Chomsky's papers are tough to comprehend for beginners. (Which I am.) Those who are interested in learning Chomskian theory may wish to pick up some Andrew Radford. (he is very understandable, and his book "Transformational Grammar" is aimed at the undergraduate level syntax class. Once you tackle that, you can read Haegeman, "Government and Binding," which seems to be the most used graduate level book... but this one is quite boring.)

      In the meantime, a linguistic glossary which may help you get through some of the papers you may find: http://tristram.let.uu.nl/UiL-OTS/Lexicon/

      --
      http://jones.ling.indiana.edu/~prrodrig
    9. Re:Chomsky and stuff by Saint+Stephen · · Score: 1

      Quick: in your head: how much is 6 dozen and 3 times 7 and 1/2 score? This is the kind of math they used to teach in elementary school in the 1800s.

      They don't anymore.

    10. Re:Chomsky and stuff by Anonymous Coward · · Score: 0
      If you can't talk as simple as I'm talking now, you ain't really "nailed it."

      Sometimes, the ones who have really "nailed it" are precisely the ones who cannot explain it. A Taoist or a Zen Buddhist would tell you this. You can see the same phenomenon (although perhaps with different causes) among university professors: The highly accomplished ones are sometimes the worst teachers. My General Chemistry prof was such. It's an unfortunate fact that Feynmans are rare.
    11. Re:Chomsky and stuff by WFFS · · Score: 1, Insightful

      Um, you forgot Semantics (the meaning of language), one of the more currently important topics.

      I'm doing my BSc, majored in maths and CS, and currently doing honours in CS. However, my project/thesis is on Language Technology, based squarely around semantics (for verbs to be precise).

      Now, my point is basically agreeing with the above poster. I can't really go in depth about my project with the average Joe/Jo, because it is just too complicated. There is too much jargon and linguistic basics that would need to be covered first, and well, that takes up a whole chapter of my thesis, by which time Joe/Jo would've gone back to their game of Quake.

    12. Re:Chomsky and stuff by smithwis · · Score: 0, Troll

      You do know, they probably feel the same way about you. You're being brainwashed by the religious right ;) It's time you stopped listening to their lies and the lying liars who tell them.

      So why don't you give up on this hate thing and focus on your own fucking family.

      Thank you and good day ;)

    13. Re:Chomsky and stuff by zby · · Score: 1

      Some things can be compressed, others not. Computer geeks should grasp this notion pretty easily. If not, there is the whole theory of Chaitin's Omega numbers, which proves that there are hard problems.

    14. Re:Chomsky and stuff by qui_tollis · · Score: 1

      You couldn't describe Chomsky as a conspiracy theorist. Read any of his political works: they are thorough to the point of dullness, and make no claims without a high standard of evidence. MUCH higher than you get in the mainstream press.

    15. Re:Chomsky and stuff by The+Limp+Devil · · Score: 1

      Does anybody ever get the feeling sometimes that maybe things are simpler than our smartest people currently make them out to be?

      I think you're right, but it's not a new phenomenon. Some 40 years ago the Norwegian historian Jens Arup Seip coined the phrase "The American disease" to signify academics who use difficult words as substitutes for original ideas. That disease is alive and well, not just in the USA.

    16. Re:Chomsky and stuff by dylan_- · · Score: 1
      Quick: in your head: how much is 6 dozen and 3 times 7 and 1/2 score? This is the kind of math they used to teach in elementary school in the 1800s.
      Well, I'm not that old, but at primary school I certainly learned what a dozen was and my six times table, which covers the first. 3 times 7 is supposed to be difficult? I also learned at primary school what a score was, so I don't think I'm going to have any difficulty halving it.

      I don't understand the point of your example. I suspect any 10 year old child nowadays could calculate that in their head.
      --
      Igor Presnyakov stole my hat
    17. Re:Chomsky and stuff by Toddlerbob · · Score: 1, Insightful
      As someone who likes Chomsky's work (and gets modded down for mentioning it - so I'm glad to see it didn't happen with people this time, though maybe to me, we'll see) and as someone who's studied cognitive science, I agree with this poster and also the poster two steps up.

      That is, yes, things are that complicated. In fact that's a point that Chomsky himself makes, not only in reference to language, but also in reference to economics and sociology. He often says this in reference to economists and sociologists who claim to understand how human societies work, but I could also imagine him saying this in the context of human psychology / linguistics.

    18. Re:Chomsky and stuff by Saint+Stephen · · Score: 0, Offtopic

      Let me tell you my plans are uniquely my own. My grief has nothing to do with right or left. I'm gonna turn 'em into fertilizer :-)

    19. Re:Chomsky and stuff by gusman · · Score: 1

      That's pretty funny considering you're talking about Chomsky. He talks about this all the time in his political writings. You might want to go check them out. If you do check them out, remember to take the red pill...

    20. Re:Chomsky and stuff by Tin+Foil+Hat · · Score: 1

      I don't know about you, but I am not prepared to argue with an MIT Professor of Linguistics over anything related to language. Certainly not the titles of his papers. That would be sort of like trying to argue morality with a nun: you just can't win.

      --
      No matter how many of my rights are taken away, somehow I still don't feel safe. -Frigid Monkey
    21. Re:Chomsky and stuff by dido · · Score: 2, Informative

      Actually, Chomsky (or one of his contemporaries anyhow) discovered early on that almost no natural language can be represented solely by regular languages, or even context-free languages. Chomsky initially even tried to use unrestricted/semi-Thue grammars to represent natural languages, but realized just as quickly that this HUGE class of languages is much, much too big (in fact, it's actually Turing complete, and only useful to those doing research in the theory of computation, not the theory behind human language). That left the context-sensitive languages in the original Chomsky hierarchy, but even those languages were found to be much too general, and the most general simulators for the linear bounded automata needed to process CSLs apparently require exponential time to operate. Current research in computational linguistics seems to concentrate on classes of languages between CFLs and CSLs: formal languages which are "mildly" context-sensitive, to characterize human languages. One example is the tree-adjoining grammars (which incidentally have also been found to characterize RNA secondary structures very well, and are of great use in bioinformatics). There are a few other models out there which I researched while making a writeup on the Chomsky hierarchy for E2, but unfortunately E2 is still down... :(
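
      To make the hierarchy a bit more concrete, here is a minimal Python sketch. It is only an illustration, not code from any package on the CD, and the function names are made up. It contrasts a regular pattern, a*b*c*, with the language a^n b^n c^n, the textbook example of a set that no context-free grammar generates but that mildly context-sensitive formalisms can still handle:

```python
import re

# Regular language: any number of a's, then b's, then c's.
REGULAR = re.compile(r"a*b*c*")

def in_regular(s):
    """True if s matches a*b*c* (a finite automaton is enough)."""
    return REGULAR.fullmatch(s) is not None

def in_anbncn(s):
    """True if s is a^n b^n c^n for some n >= 0.

    No context-free grammar generates exactly this set, because the
    three blocks have to agree in length; here we simply count.
    """
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

for s in ["", "abc", "aabbcc", "aabbc", "abcabc"]:
    print(repr(s), in_regular(s), in_anbncn(s))
```

      The string "aabbc" is the interesting case: it passes the regular check but fails the counting check, which is the kind of length agreement the lower levels of the hierarchy cannot enforce.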

      Apparently computational linguistics is taking the same course that most other fields in artificial intelligence have taken lately. One camp takes the formal symbol manipulation approach (the original Chomsky theory and its descendants), and the other camp includes more recent approaches based on neural nets, fuzzy logic, genetic algorithms, and so forth, which are more grounded in biology rather than abstract mathematics. Sorta like the traditional SMPA robotics vs. Dr. Brooks' behavioral robotics.

      --
      Qu'on me donne six lignes écrites de la main du plus honnête homme, j'y trouverai de quoi le faire pendre.
    22. Re:Chomsky and stuff by ak_hepcat · · Score: 1

      hmm.

      (6doz+3) * (7.5score) =

      (6*12 +3) * (7.5 * 20) =
      (72 + 3) * (140 + 10) =
      75 * 150 =
      (75 * 100) + (75 * 50) =
      7500 + 3750 = 11,250

      In my head.

      Use shortcuts, it's a lot easier.

      --
      Support FSF: Stop thinking with your wallet, and think with your imagination. (cc/non-commercial)
    23. Re:Chomsky and stuff by Dan+D. · · Score: 1

      The reason I think this is true: back when all mathematicians only had Roman Numerals, the process for explaining how to multiply 3-digit numbers was extremely opaque, and it was nearly impossible to describe how to do long division. Now we can teach 3rd/4th graders how to do it before they watch "Barney".

      As with pretty much everything. You have to understand it before you can express it simply. Just because the "smart guys" don't express it simply yet doesn't mean they should just give it up. They still understand it better than most everyone else. Chomsky understands linguistics better than most people understand it, and at the moment most people who study linguistics are probably able to understand Chomsky as though he were talking to a child. Someday they will figure it out. Sorta, but then it'll evolve into something more complicated, so the linguistics we know now will be expressible simply, but something else far more interesting will be difficult to express and only the enlightened will be able to express it at all.

      --
      People who quote themselves bug the crap out of me -- Me.
    24. Re:Chomsky and stuff by Saint+Stephen · · Score: 1

      How many dozens is that?

      They used to teach specialized techniques in the 1800s for "compound arithmetic", so you don't have to "blow up" the compounds into their constituent parts. You just gave the answer in pennies or ounces; they needed to know how many shillings or stone + pence or drams. :-)
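
      For what it's worth, going back from a flat total to compound units is just repeated division with remainder. A minimal Python sketch follows. The helper name to_compound is made up for illustration, and it starts from the "blown up" total rather than using the 1800s schoolbook techniques:

```python
def to_compound(total, units):
    """Split total (in the smallest unit) into larger units, biggest first.

    units is a list of (name, size_in_smallest_unit), largest first.
    """
    parts = []
    for name, size in units:
        count, total = divmod(total, size)  # how many fit, and what is left
        parts.append((count, name))
    return parts

# 11,250 ones expressed in dozens and ones
print(to_compound(11250, [("dozen", 12), ("ones", 1)]))

# Pre-decimal British money: 12 pence to the shilling, 20 shillings to the pound
LSD = [("pounds", 240), ("shillings", 12), ("pence", 1)]
print(to_compound(1000, LSD))
```

      So 11,250 comes to 937 dozen and 6, and 1000 pence is 4 pounds, 3 shillings and 4 pence.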

    25. Re:Chomsky and stuff by okmokmokmmm · · Score: 1

      >>things are simple.

      i agree

      >>life *is* that complicated...

      no its not

      >>Chomsky's political diatribes...

      are performance art. he teaches by example. listen to his tone of voice.

      >>Roman numerals...

      math is poetry: numbers exist but not physically.
      poetry doesn't belong in accounting.
      we need a better accounting system than math.

      >>I can't really go in depth about my project with the average Joe/Jo, because...

      you don't know what you're talking about.

      >>extremely well respected intellectual
      >>indeed
      >>among those on the left
      >>neo-conservatives
      >>despise him immensely
      >>however
      >>virtually
      >>compatible
      >>unfortunately
      >>rhetoric
      >>regarded
      >>dismissing

      this language makes me feel bad.

      the user

    26. Re:Chomsky and stuff by starm_ · · Score: 1

      Trying to understand how language works is almost like trying to understand how we think. Some people have suggested that a lot of our thoughts are made of an internal dialog. Language is created in our brain by HUGE neural networks that cannot be accurately modeled by simple statistical models. Now, if a language is grammatically correct and structurally sound, it is a little bit easier for a computer to analyse. This is almost never the case. People deform language while still conveying somewhat clear ideas. So the deformation is not a big drawback for us. But for a computer, analysing language that is not structurally perfect is a lot harder.

    27. Re:Chomsky and stuff by rnd() · · Score: 1

      I'm not sure how much Chomsky you've read, but it sounds like you've read a lot. I think that based on the following facts it is likely that syntax (the field) will have a tough time making unified scientific progress:

      1) There are few people who are both trained syntacticians and native speakers of all of the obscure languages needed to provide data to test an aspect of a theory of syntax, and so native speaker judgments are required in order to "prove" a given theoretical contribution... There end up being a few "classic" sentences (such as "John loves Mary" and a few from Icelandic that I am thinking of) that make up the primary data points for any particular theory of syntax, to the extent that the theory is empirically testable at all, and the possibility that there are some sentences which are simply tough for any native speaker to judge due to some structural aspect that is just as likely a quirk as it is a consistent universal.

      2) Few syntax papers (journal articles) are based on a particular Chomskian theory: Most are based on a variety of citations from a variety of different works, some by Chomsky, and some in response to Chomsky's various works. The point is, there is not a consistent theoretical starting point for most papers. Most typically start out with the latest theory and then dredge up something from P&P or one of Chomsky's older works, and claim that it combined with some new data (in some heretofore unstudied language) ought to be considered proof that warrants a change to "the theory"...

      3) There are no formal tools for describing theories of syntax, merely formal tools (most typically trees and x' notation) for describing sentences. Such notation for theories of syntax would likely resemble the notion of sets of functions, as in the lambda calculus, or something like that. A given piece of sentence data (along with a given tree interpretation) could then be checked against n theories to see which class of theories allowed it, and why.

      It seems to me that the formal tools and the methodological approach to syntax do not make it likely that the field will go much beyond Chomsky's initial (and insightful) observations about the existence of potentially quantifiable universals.

      Thoughts?

      --

      Amazing magic tricks

    28. Re:Chomsky and stuff by Anonymous Coward · · Score: 0

      Another name for push down automata are Context Free Grammars.

      Actually PDA (Pushdown Automata) recognize CFLs (Context Free Languages) which are described using a CFG (CF Grammar).
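
      A tiny illustration of the distinction, as a Python sketch with a made-up function name: the grammar S -> ( S ) S | empty describes the balanced-parentheses language, while the recognizer below is the pushdown machine that accepts it, using a list as its stack.

```python
def accepts_balanced(s):
    """Pushdown-style recognizer for the language of the grammar
    S -> ( S ) S | empty
    i.e. balanced parentheses. The list acts as the stack."""
    stack = []
    for ch in s:
        if ch == "(":
            stack.append(ch)      # push on an opening bracket
        elif ch == ")":
            if not stack:
                return False      # nothing to match against
            stack.pop()           # pop the matching opener
        else:
            return False          # symbol outside the alphabet
    return not stack              # accept only with an empty stack

for s in ["", "(()())", "(()", ")("]:
    print(repr(s), accepts_balanced(s))
```

      The grammar generates the strings; the automaton decides membership. Same language, two different kinds of object.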

    29. Re:Chomsky and stuff by smithwis · · Score: 1

      Apparently I was too subtle for the moderators.

  22. Re:Why Linux is great for doing applied linguistic by d3faultus3r · · Score: 0

    I always thought it was because of so much Perl code being obfuscated purposefully. After all, if you can figure out what some of that does without frying your brain from confusion, translating Mandarin Chinese is no big deal.

    --
    read my blog
    musings on politics and technol
  23. Omission of Gate by use_compress · · Score: 2, Informative

    I was surprised to read that GATE was not listed in the package list. It's the best piece of software to tie together the discrete components that were included. Another complaint is that there are a lot of so-so implementations of very good algorithms. (#define NOT_FLAMEBAIT 1) I suppose that you have to turn to corporate software to get the really robust implementations and to free software when you want the cutting edge.

    1. Re:Omission of Gate by iplayfast · · Score: 1

      Gate needs info filled in, in order to download it. So it looks like they didn't want to step on toes.

    2. Re:Omission of Gate by elwood.ufl.edu · · Score: 1


      use_compress,

      Would you care to reply to me off-list on what you use GATE for? We're thinking of using it for named entity recognition and tagging in full-text digital collections.

      Gus

      gusclif at mail.uflib.ufl.edu

  24. Should it be patented? by Progman3K · · Score: 1

    Can the idea of producing a modular-on-a-cd OS be patented?
    Because if it can be, we have to secure it with something before a corporation patents it!

    --
    I don't know the meaning of the word 'don't' - J
    1. Re:Should it be patented? by Exiler · · Score: 1

      Done.

      It's called prior art.

      --
      Banaaaana!
    2. Re:Should it be patented? by AlXtreme · · Score: 1
      Being the first (afaik), some people (no, hordes of people) have told me to do this, they believe such a patent would make me rich. I counter it with: If it wasn't free, nobody would use it. In only 11 months, there have been many people who have used Morphix to build their own livecd's, and that's the whole idea of the project. Make livecd's without having to rebuild the whole damn thing at every update.

      So, unless the borg get me, this is one patent that won't fly :)

      OT: Having said that, there seem to be reports yesterday on our forum that Sun's Java Desktop uses Morphix's basemodule. No word on their website about any of this, so it might be a mistake. Even so, there's your evil corporate conspiracy for you...

      Alex

      --
      This sig is intentionally left blank
    3. Re:Should it be patented? by Progman3K · · Score: 1

      I didn't mean patenting it so that it could be charged for.
      I meant that that way, a corporation can't come along and patent it, even if the patent is not just.

      Take Microsoft patenting the long-filename extensions to FAT.

      It's NOT a just patent, but because they are a huge corporation, and can use lawyers to scare people, they'll probably get fees back from media manufacturers that ship their devices FAT32 ready-formatted anyway, because no one can afford to go to court to defend something like this.

      So I pray the GPL covers this and it can't be hijacked by Microsoft or some other huge corporation.

      --
      I don't know the meaning of the word 'don't' - J
  25. Forum2000 is dead. Long live Forum 2010! by Neuracnu+Coyote · · Score: 3, Informative

    I wonder whether Forum 2010 is run by the same folks. I doubt it since Forum 2000 and 3000 were both Carnegie Mellon projects, and forum2010.org is registered to someone in St. Louis.

    That's me, actually. You can't expect hundreds of Slashdot geeks suddenly slamming my site and having me not notice. ];-)

    Forum 2010 had, in fact, nothing to do with the great fellows at Forum2k/3k aside from inspiration. And, just to end the rumors, I built the F2.01k matrix and all my own SOMADs as a senior project for my Comp Sci degree at Fontbonne University.

    Now, I'm late for a date! Please don't destroy the matrix while I'm gone!

    --
    --
  26. Memories by gidds · · Score: 2, Interesting

    I remember when I was first let loose on a Unix system, and discovered tools like 'lex' and 'yacc' for lexical analysis and parsing. I was amazed that advanced language processing was so well supported - it was a short while before I discovered that they weren't for natural language processing :)

    --

    Ceterum censeo subscriptionem esse delendam.

  27. Re:Good Chinese Compression by log2.0 · · Score: 2, Insightful

    I would say that Westerners cannot pronounce simple Chinese.

    English is the only language I know but I studied Mandarin Chinese for a few years.

    There are all sorts of things in there that we have a lot of trouble pronouncing.

    --
    Can your karma go above being Excellent?
  28. Natural languages useful for spam filters? by joelparker · · Score: 2, Insightful
    Can anyone here comment on if/how
    any of these natural language tools
    can be helpful for spam filtering?

    Cheers, Joel

    1. Re:Natural languages useful for spam filters? by INT+21h · · Score: 2, Interesting

      Let's see... if it had a good language guesser that could be fit into a plugin then we could toss all messages in languages we can't read (or see no use for). For instance, all messages I get that are in English are either from some mailing list, or spam. I've actually been working on a "spot English" plugin to use on the mail that isn't automatically shunted into the mailing-list folders, but if the work is already done, yay!

      You might think that looking at the charset used would be enough, but 'taint so! Frequency of letters isn't good enough either; two good ways are checking for the most frequent words or the most frequent letter trigrams. If you want to know more, see if you can find the paper "Comparing two language identification schemes" by Gregory Grefenstette. It used to be openly hosted at Xerox but now the server is gone.
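
      A rough idea of what the trigram approach looks like, as a minimal Python sketch. The two sample sentences are toy data invented for this example, the function names are hypothetical, and a real guesser would build its profiles from far larger corpora:

```python
from collections import Counter

def trigrams(text):
    """Count letter trigrams, with a space padded on both ends."""
    padded = " " + " ".join(text.lower().split()) + " "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

# Toy training text, assumed for illustration only.
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and then "
               "the dog runs after the fox through the woods",
    "german": "der schnelle braune fuchs springt auf den faulen hund und "
              "dann rennt der hund dem fuchs durch den wald nach",
}

def build_profiles(samples):
    """Turn each sample into relative trigram frequencies."""
    profiles = {}
    for lang, text in samples.items():
        counts = trigrams(text)
        total = sum(counts.values())
        profiles[lang] = {g: c / total for g, c in counts.items()}
    return profiles

def guess_language(text, profiles):
    """Pick the profile whose frequent trigrams overlap the text most."""
    grams = trigrams(text)
    scores = {
        lang: sum(n * prof.get(g, 0.0) for g, n in grams.items())
        for lang, prof in profiles.items()
    }
    return max(scores, key=scores.get)

PROFILES = build_profiles(SAMPLES)
print(guess_language("the dog and the fox", PROFILES))
print(guess_language("der hund und der fuchs", PROFILES))
```

      For a mail plugin you would probably also want a minimum score, so that very short or mixed-language messages come back as "don't know" instead of a forced guess.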

    2. Re:Natural languages useful for spam filters? by tealwarrior · · Score: 1

      The short answer is yes. Spam filtering can be thought of as a document classification problem. Some documents are classified as going in the inbox and others to the trash. The Maximum Entropy or SVM classifier software which is included could be used to train a model for this type of classification. You would need data to train it (in this case email marked as either spam or inbox or any other category you want). The model will produce a probability of whether or not it's spam. To make it really useful you'd want to integrate something like this into your email client so that you could tell it when it makes a mistake and retrain it. Tight email integration would also allow you to use your contacts as a source of information for the model, so that it could learn that even if your close friends mention the words penis enlargement in the same mail, you might still want to read that.
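
      To show the shape of the idea, here is a minimal Python sketch. It is not the MaxEnt or SVM software from the CD; it is a plain bag-of-words logistic regression (the two-class case of a maximum entropy model) trained on four made-up messages, with hypothetical function names and no bias term to keep it short:

```python
import math
from collections import defaultdict

def tokens(text):
    return text.lower().split()

def probability(words, weights):
    """Probability of spam given the set of words in a message."""
    score = sum(weights.get(w, 0.0) for w in words)
    return 1.0 / (1.0 + math.exp(-score))

def train(examples, epochs=200, rate=0.5):
    """examples: list of (text, label), label 1 = spam, 0 = inbox.
    Returns a dict of per-word weights learned by gradient steps."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for text, label in examples:
            words = set(tokens(text))
            p = probability(words, weights)
            for w in words:
                weights[w] += rate * (label - p)  # push towards the label
    return weights

def spam_probability(text, weights):
    return probability(set(tokens(text)), weights)

# Made-up training mail, assumed for illustration only.
TRAINING = [
    ("cheap pills buy now", 1),
    ("special offer cheap enlargement pills", 1),
    ("meeting moved to tomorrow morning", 0),
    ("lunch tomorrow with the project team", 0),
]

WEIGHTS = train(TRAINING)
print(spam_probability("cheap pills offer", WEIGHTS))
print(spam_probability("project meeting tomorrow", WEIGHTS))
```

      Retraining after a user correction is then just appending the corrected message to the training list and calling train again. A "sender is in my contacts" flag could be added as one more feature alongside the words.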

      --
      In theory, there is no difference between theory and practice, in practice there is.
    3. Re:Natural languages useful for spam filters? by dvdeug · · Score: 1

      You might think that looking at the charset used would be enough, but 'taint so! Frequency of letters isn't good enough either; two good ways are checking for the most frequent words or the most frequent letter trigrams.

      Try looking at mguesser (http://mnogosearch.org). It's been quite accurate for me, but I've never tried it on spam.

  29. Re:Forum2000 is dead. Long live Forum 2010! by Stile+65 · · Score: 1

    Neat! Did you write the QSA code yourself or adapt the code written by the original CMU researchers?

    --
    I claim first use of "Error No. 0B" - or "No. 0B error." It'll be the new ID 10T!
  30. The base Morphix by unmadindu · · Score: 2, Informative

    I have been using the base Morphix system for a Bengali l10n Live CD project (which was mentioned at slashdot a few days back). I am really amazed by its capabilities - if you want to have a LiveCD of your own - this is probably the best starting point.
    For documentation, you may want to have a look at the Morphix Wiki.

    1. Re:The base Morphix by AlXtreme · · Score: 1
      Well, that _was_ the whole idea of the project now :)

      Have you joined our mailinglist/forums? It's great that all these derivatives are getting this much press [insert proud father-of-morphix photo], it would be even better to keep in touch and exchange bugreports & featurerequests. If you have, just ignore me, doing my best to get as much feedback as possible on the different modules...

      --
      This sig is intentionally left blank
  31. Re:Good Chinese Compression by Anonymous Coward · · Score: 0

    Pronounce syllables, dumbass.

    Unless pronouncing "l" and "r" constitutes speaking English?

    Anyway, since when does being "imaginative" make a language good?

    White boy monkeys?

    Squint.

  32. Slashborging by PurpleBob · · Score: 2, Funny

    Wow. That's the first slashborging ("All Slashdotters should have the same opinions! Be consistent, dammit!") post I've seen in a long time.

    Even though they're stupid as hell, I was beginning to miss them.

    --
    Win dain a lotica, en vai tu ri silota
    1. Re:Slashborging by citog · · Score: 1

      No, I referred to a section of the Slashdot community. The ones that jump in at the start of the discussion with drivel. There is a lot of valuable input from members of the remaining section; however, it is frequently lost in the noise.

      My posting was not an "All Slashdotters should have the same opinions!". Read it again and you might see that the sentiment is 'support not subvert' the OSS movement.

  33. There is a downside to Natural Language Processing by Anonymous Coward · · Score: 2, Interesting

    While NLP has many benefits, it can also freeze certain linguistic elements that should be removed or amended.

    As a simple example, take spell checking. When the computer can remember the spelling for every word and fix it automatically, who is going to worry about spelling simplification or reform? Yet changing to a standardized phonetic spelling would probably help people in the long run, if only by allowing children time to actually *write* rather than spending time in rote memorization and spelling bees.

    The same holds true for grammar. Program existing grammatical rules -- in all of their illogical complexity -- into computers, and you reduce the incentive to simplify and improve such rules. If we had continued to use Roman numerals until the advent of handheld calculators, would there be as much incentive for using Arabic numerals? And yet, without zero and the simplicity of the latter, mathematics would be far poorer for it today. And if computers can soon parse logographic languages like Chinese, will it prevent simplification or even conversion to a (arguably better) phonetic alphabet?

    NLP is important, granted, and will help more than it hurts, but it is important to realize that it has some potential drawbacks.

  34. Aren't patents written in that? by tepples · · Score: 1

    I saw someone working on something like parsing english as a programming language

    I thought English was already a programming language, designed for querying PICK databases.

    But seriously, don't patents try to describe a process in a limited subset of the English language?

  35. Re:Good Chinese Compression by Anonymous Coward · · Score: 0
  36. and there's still a lot of place on the CD by frovingslosh · · Score: 1, Funny
    and there's still a lot of place on the CD

    OK, I get that it's a Chinese scientist working on this, but it's about language. Should the Slashdot article really have been written in Chinglish?

    --
    I'm an American. I love this country and the freedoms that we used to have.
  37. Re:My wish by revividus · · Score: 1
    ???

    You do know this is /., right?

  38. Re:Why Linux is great for doing applied linguistic by Anonymous Coward · · Score: 0
    You sir, are comic gold. Please add these delightful gems to your hilarious arsenal:

    Take my wife! Please!

    Why did the chicken cross the road? To get to the other side!

  39. Do you honestly believe that? by Kjella · · Score: 2, Interesting

    The reason I think this is true: back when all mathematicians only had Roman Numerals, the process for explaining how to multiply 3-digit numbers was extremely opaque, and it was nearly impossible to describe how to do long division. Now we can teach 3rd/4th graders how to do it before they watch "Barney".

    That's also why none of the good stuff was made by the Romans - it was the Greeks, then the Arabs that had good numerals, made the discoveries, before the knowledge of a proper number system finally returned to Europe in more recent centuries. The Roman numerals were more like the Dark Ages of mathematics.

    I think something similar will be the case in 1000 years with everything Chomsky and any arbitrary math guy says: they just haven't thought about how to say it simply yet. Life just *ain't* that complicated (if you have the right way to think.)

    Life might not, but math certainly can. E.g. x^n + y^n = z^n is not true for positive integers x,y,z and n > 2. Proof: 250 pages long or so alone. The final article to put it all together is 100+ pages alone. And you won't understand shit until you've read a couple thousand pages of basic number theory. If you think that's ever going to be something you can slap up on the blackboard in an hour, you're wrong.

    For all that's been said and done, I think most "simplifying" moves have been made. I've done quite a bit of higher math, and I certainly haven't found any "easy" way to explain it to others. Sure, I can *show* you how phasors rotating in the complex plane can be used to derive the output of an AC circuit of resistors, capacitors and inductors, but no one will understand why.
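
    The *showing* part really is short. A minimal Python sketch for a series R-L-C circuit follows. The component values are assumed example numbers and the function name is made up:

```python
import cmath
import math

def series_rlc_current(v_amplitude, freq_hz, r, l, c):
    """Phasor current through a series R-L-C circuit driven by a sine source.

    Each component becomes a complex impedance, and Ohm's law then
    works exactly as it does for DC: I = V / Z.
    """
    omega = 2 * math.pi * freq_hz
    z = r + 1j * omega * l + 1 / (1j * omega * c)
    return v_amplitude / z

# Assumed example values: 10 V source at 50 Hz, 100 ohm, 0.1 H, 10 uF
i = series_rlc_current(10.0, 50.0, 100.0, 0.1, 10e-6)
print("amplitude (A):", abs(i))
print("phase (degrees):", math.degrees(cmath.phase(i)))
```

    The absolute value of the result is the current amplitude and its angle is the phase shift against the source. The *why* is still the hard part.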

    Most people will never get past the "apples" math. 3, 1/2, sqrt(2), all operations on them can be understood by thinking of it in terms of physical objects. Now try to make people "understand" e.g. complex numbers and operations. Hell, most people have trouble understanding a trivial induction proof.

    Now say I got a standard induction proof:
    f(1) is true.
    if f(n) is true, f(n+1) is true.
    And this proves it for every n, however large.

    Then, people believe it's some "infinity magic". But in reality it's simply that for every finite number there is a conventional, finite proof.

    Let's say I want to prove it for f(325266235235352):
    f(1) is true.
    Since f(1) is true, f(2) must be true.
    Since f(2) is true, f(3) must be true.
    ....
    Since f(325266235235352 - 1) is true, f(325266235235352) is true.

    But people don't understand that. Which tells me they will never understand 90% of higher math, because it won't get much simpler than that...
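
    The finite chain can even be walked mechanically. A minimal Python sketch, using the sum formula 1 + 2 + ... + n = n(n+1)/2 as an assumed example claim (the function names are made up). Note that it only checks each link numerically; the real inductive step is the algebra that shows the link holds for an arbitrary k:

```python
def closed_form(n):
    """Claim f(n): 1 + 2 + ... + n == n * (n + 1) / 2."""
    return n * (n + 1) // 2

def chain_up_to(n):
    """Walk the chain f(1), f(2), ..., f(n) one step at a time.

    Each step uses only the previous one: if the sum up to k matches the
    closed form, adding k + 1 must match the closed form for k + 1.
    Returns the number of inductive steps taken.
    """
    running_sum = 1
    assert running_sum == closed_form(1)          # base case f(1)
    steps = 0
    for k in range(1, n):
        running_sum += k + 1                      # go from f(k) to f(k + 1)
        assert running_sum == closed_form(k + 1)  # the step holds here
        steps += 1
    return steps

print(chain_up_to(1000))   # 999 steps, every one of them finite
```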

    Kjella

    --
    Live today, because you never know what tomorrow brings
    1. Re:Do you honestly believe that? by themightythor · · Score: 0
      Life might not, but math certainly can. E.g. x^n + y^n = z^n is not true for positive integers x,y,z and n > 2. Proof: 250 pages long or so alone. The final article to put it all together is 100+ pages alone. And you won't understand shit until you've read a couple thousand pages of basic number theory. If you think that's ever going to be something you can slap up on the blackboard in an hour, you're wrong.
      You're correct...for now. Just because that's the only proof that we have right now of Fermat's last theorem doesn't mean that there isn't a simpler one. I think that's what the OP was saying...
    2. Re:Do you honestly believe that? by Saint+Stephen · · Score: 1

      If you think that's ever going to be something you can slap up on the blackboard in an hour, you're wrong.


      All I'm saying is that 2000 years ago it took a 60 year old man hundreds of pages to describe techniques for long division, and they had LONG, LONG discussions about how stuff was made of earth, wind, and fire. You *seriously* believe similar advancements won't be made 1000 years from now that put our science in a similar light?

      I'm not saying the techniques 2000 years ago weren't valid, or the ones today are offbase. But odds are it is possible (from the correct perspective) to express them in plain English, to a preschooler. That's all I'm saying: not that we're *wrong*, only that it's possible to explain it all much, much more simply.

    3. Re:Do you honestly believe that? by CableModemSniper · · Score: 1
      But in reality it's simply that for every finite number there is a conventional, finite proof. Let's say I want to prove it for f(325266235235352): f(1) is true. Since f(1) is true, f(2) must be true. Since f(2) is true, f(3) must be true. .... Since f(325266235235352 - 1) is true, f(325266235235352) is true.

      That's not really explaining how it works. It's not infinity magic, yes, but demonstrating it with an arbitrarily large number doesn't explain it any more than saying it is infinity magic. Which makes me wonder if you understand it as much as you think you do (on the other hand, the shit about phasors was pretty impressive). The reason it works is because (to paraphrase) you can always tell whether 2 is bigger than 1 (or that if I have 2 of something and you have 1 of something, I have more than you). Which almost brings us back to the "apples" math. But what do I know.

      --
      Why not fork?
    4. Re:Do you honestly believe that? by maysonl · · Score: 1
      Now try to make people "understand" e.g. complex numbers and operations.

      Well - just draw them a picture and show how complex numbers can be considered to be an absolute value in combination with a rotation, so that they multiply by multiplying absolute values and adding rotations, and then show them how this can simplify trigonometry, and finally dazzle them with

      e^(pi*i) + 1 = 0.
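
      The same picture in a few lines of Python, as a minimal sketch with a made-up helper name. It multiplies two complex numbers the "absolute values times, rotations plus" way and compares the result with ordinary multiplication:

```python
import cmath
import math

def multiply_polar(z1, z2):
    """Multiply two complex numbers by multiplying their absolute values
    and adding their angles (rotations)."""
    r1, theta1 = cmath.polar(z1)
    r2, theta2 = cmath.polar(z2)
    return cmath.rect(r1 * r2, theta1 + theta2)

a = 1 + 1j        # length sqrt(2), rotated 45 degrees
b = 2j            # length 2, rotated 90 degrees
print(multiply_polar(a, b), a * b)    # both are close to -2+2j

# Euler's identity: rotating by pi radians lands on -1
print(abs(cmath.exp(1j * math.pi) + 1))   # tiny, zero up to rounding error
```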
    5. Re:Do you honestly believe that? by Tony-A · · Score: 1

      But odds are it is possible (from the correct perspective) to express them in plain english, to a preschooler.
      Methinks you're right, but getting that perspective is not going to be easy. I have seen (algebraic topology) an arbitrary-dimension generalization of Green's and Stokes' theorems expressed in four symbols. I didn't really understand it then and I sure don't remember it well now ;-(, but if you have the right machinery in place, some things are an awful lot easier.
      Long division of Roman numerals is doable, but somehow I doubt that the Romans ever did it. Algebraic division of polynomials, and it can be ground out.

    6. Re:Do you honestly believe that? by bluGill · · Score: 1

      Would you go back and explain that to the math professor who gave me a zero on one problem I proved by induction? There was no mistake in my proof, the initial condition was right, as was the induction step, and about half the class got exactly the same answer (ignoring a few trivial mistakes).

      Seems that induction, for all its power, isn't perfect, and it took less than 1 minute to demonstrate a contradiction (which was obvious to the other half of the class).

      As we were then reminded, induction is not a method to prove things, it is a method to understand and communicate your proof after you have proved it a better way. Once someone has [correctly] proved something the hard way, induction is a valid way to re-prove it for yourself.

    7. Re:Do you honestly believe that? by BubbleNOP · · Score: 1

      Can I see your problem and the proof you wrote? I am curious to see an example of induction leading to a wrong result.

    8. Re:Do you honestly believe that? by bluGill · · Score: 1

      I'm 5 years out of that class, and only remember the result, not the exact problem. The homework and tests I did save were destroyed in the flood last summer. It was in logic, I remember that much, though I'm sure most math studies can bring up a similar problem.

    9. Re:Do you honestly believe that? by Anonymous Coward · · Score: 0

      it was the Greeks, then the Arabs that had good numerals, made the discoveries.

      Actually Indians invented Arabic numerals along with the concept of zero (shUnya). Arabs brought it to the west.

  40. Re:Good Chinese Compression by Anonymous Coward · · Score: 0

    I would think the Chinese have problems pronouncing 'v' rather than 'r'. There is no 'v' sound in Mandarin, but plenty of 'r'. Such as in 'ri' (day/sun), 'ru guo' (if), etc.

  41. Re:Good Chinese Compression by frovingslosh · · Score: 1
    It's the Japanese who have problems pronouncing L's... and the Chinese have problems pronouncing R's.

    No, I specifically remember Maxwell Smart's old Chinese enemy, the Craw!

    --
    I'm an American. I love this country and the freedoms that we used to have.
  42. Random musings from an ex-linguist. by Charles+Dodgeson · · Score: 4, Insightful
    I'm a PhD drop-out in linguistics, and happen to know precisely what a head-lexicalized context-free grammer is. (And, no, reading Chomsky is not the way to find out what it is). Below are some random musings on the geekiness of linguists.

    Linguists have always been geeky. Don't forget that Larry Wall is a linguist first.

    The only computer class I ever took was in 1983 called "Computer tools for natural language analysis". It was an introductory Unix course. We learned grep, awk, sed as well as tools like vi, Mail, and rogue. And a tiny little bit of C. But since then I've taught C at the graduate level.

    Linguistics is all about the representation and manipulation of information. But instead of it being about languages we design for particular purposes, it is about the language system that we use naturally.

    Suppose you have a few thousand languages that you know were written with the same tools (like lex and yacc, but not lex and yacc), but you have no access to those tools. Suppose you are trying to figure out what those tools are from examining the languages (not the compilers) that have been specified using those tools. That is what theoretical linguistics is trying to do. We know that the specification of English and the specification of Dyirbal and every other human language out there are somehow "written" with the same tools. It's pretty neat stuff.

    Linguists were early adopters of TeX, have had a Unix affinity for a while, and as people who are interested in how information is internally represented and manipulated, like reading the source.

    I remember once nagging the sys admins to always make sure that there is a man page for anything added to /usr/bin or /usr/local/bin. The next day, they asked me to look at the manpage for something to see if it met with my approval. The DESCRIPTION was the C source. I was happy to say that it did, indeed, meet with my approval.

    At one point, a well known professor (Geoffrey Pullum) had written a little essay for a newsletter on the "grammer of Unix" using linguistic style analyses of the shell. Naturally several of us feigned outrage at his confusion of "Unix" with the shell. Another linguist (Bill Poser), went so far as to write a shell which was verb (command) final, and post-positional. That is instead of saying
    cat foo bar > bang
    you would say
    foo bar bang > cat
    That is, the arguments precede the command, and the redirect symbols go after the filename they redirect to or from. Now for various reasons, I had root access on a machine that Pullum used. So I changed his shell to this command final one. He actually caught on remarkably quickly. And after a quick
    /bin/sh chsh
    he was ready to concede the point.
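
    The word-order rule described above (arguments first, each redirect symbol after its filename, command last) is simple enough to sketch. This is not Bill Poser's shell, just a hypothetical Python translator from the command-final form back to the conventional one; the function name to_conventional and the set of redirect symbols it knows about are assumptions made for the illustration.

```python
import shlex

# Redirect symbols this sketch recognises (an assumption, not a full shell grammar).
REDIRECTS = {">", ">>", "<"}


def to_conventional(line):
    """Translate a command-final, post-positional line such as
    'foo bar bang > cat' into the usual 'cat foo bar > bang'."""
    tokens = shlex.split(line)
    if not tokens:
        return ""
    *rest, command = tokens          # the verb (command) comes last
    args, redirects = [], []
    for token in rest:
        if token in REDIRECTS:
            if not args:
                raise ValueError("redirect symbol with no filename before it")
            target = args.pop()      # the filename precedes its redirect symbol
            redirects.append(f"{token} {shlex.quote(target)}")
        else:
            args.append(token)
    quoted_args = [shlex.quote(arg) for arg in args]
    return " ".join([shlex.quote(command), *quoted_args, *redirects])
```

    Under these assumptions, to_conventional("foo bar bang > cat") gives back "cat foo bar > bang", and "/bin/sh chsh" becomes "chsh /bin/sh".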

    For me, there is no surprise that linguists, and particularly computational linguists, are OSS enthusiasts. But that is enough of my random musings for now.

    --
    Prime numbers are exactly what Alan Greenspan says they are -S. Minsky
    1. Re:Random musings from an ex-linguist. by torpor · · Score: 1

      I remember once nagging the sys admins to always make sure that there is a man page ... The DESCRIPTION was the C source ...

      So, not only are you working in a cool field, but you're also working with some of the best sysadmins in the world (in my opinion).

      Nice.

      Guess I'm gonna hit the little pill next to your name, as your post has also rejuvenated a slightly diverted personal inspiration of mine to go back to school and study linguistics.

      What are sources for the more interesting field journals/publications worthy of swot - care to make some suggestions?

      --
      ; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
    2. Re:Random musings from an ex-linguist. by Charles+Dodgeson · · Score: 2, Informative
      What are sources for the more interesting field journals/publications worthy of swot - care to make some suggestions?

      I dropped out 15 years ago, so I'm not really the best person to ask. For popular books on linguistics, I'd recommend The Language Instinct by Steven Pinker. (It is the book I wish I'd written). My favorite journal back in the days when I was reading them was Natural Language and Linguistic Theory.

      If you've had any contact, you'll know that linguistics is a bitterly divided field. I was of the west-coast variety. But you need advice from someone working in the field now. I'd suggest that you drop by your local university and ask around. But do remember that there are substantial divisions in linguistics, so take what you are told with a grain of salt.

      --
      Prime numbers are exactly what Alan Greenspan says they are -S. Minsky
    3. Re:Random musings from an ex-linguist. by Anonymous Coward · · Score: 0
      I'm a PhD drop-out in linguistics, and happen to know precisely what a head-lexicalized context-free grammer is. . . . At one point, a well known professor (Geoffrey Pullum) had written a little essay for a newsletter on the "grammer of Unix"

      'Grammer', huh?

      Well, as a Ph.D. dropout in Nucular Physics, you have my sympathy.

    4. Re:Random musings from an ex-linguist. by lenester · · Score: 1

      [...]happen to know precisely what a head-lexicalized context-free grammer is. (And, no, reading Chomsky is not the way to find out what it is).

      I don't know if you actually meant this as a dig at Chomsky's writing style, but I laughed my ass off.

    5. Re:Random musings from an ex-linguist. by Charles+Dodgeson · · Score: 1
      I don't know if you actually meant this as a dig at Chomsky's writing style, but I laughed my ass off.
      It was not an accident. The question everyone has asked (and there are even some speculative answers out there, too) is "how can such a clear and persuasive speaker be such a terrible writer?" Reading Chomsky is not a way to understand anything.

      But I also meant that Chomsky is not at all sympathetic to head-lexicalized context-free grammars.

      --
      Prime numbers are exactly what Alan Greenspan says they are -S. Minsky
    6. Re:Random musings from an ex-linguist. by starm_ · · Score: 1

      Pinker also wrote "Words and Rules" a little later, which is a nice introduction to NLP for the general public. (He also wrote "How the Mind Works" and "The Blank Slate".)

      He really makes it more interesting than my NLP textbooks by inserting comic strips and other amusements with each subject.

      I have to mention Pinker is a psychologist, so there is not much explanation of the programming, machine learning, or statistical sides.

      Also if you are interested in the psychological side read Ray Jackendoff.

    7. Re:Random musings from an ex-linguist. by foqn1bo · · Score: 1

      Are you a former UCSC student?

    8. Re:Random musings from an ex-linguist. by Charles+Dodgeson · · Score: 1
      Are you a former UCSC student?

      Yes. Best undergraduate linguistics program ever.

      I graduated in 1984, went on to Stanford/CSLI, where I dropped out in 1987. I guess you recognized Jorge's computer class. Please don't ruin a good story by telling people that rogue was not actually part of the course.

      I take it that you have memories of UCSC, too. Feel free to email me. My email address is "jeffrey" at the domain you see for my URL.

      --
      Prime numbers are exactly what Alan Greenspan says they are -S. Minsky
  43. Re:Good Chinese Compression by Anonymous Coward · · Score: 0

    Westerners typically can't differentiate between different tones in tonal languages.

  44. Re:Good Chinese Compression by Anonymous Coward · · Score: 1, Insightful

    there are something like 80 phones of linguistic merit capable of being produced by humans. english has like 40, i think.

    and any linguist will tell you, it's impossible to pronounce something wrong...linguistics is a descriptive, positive science as opposed to a normative, prescriptive science. NO ONE speaks wrong, unless they think they do, i.e. a speech error

    i didn't try very hard in ling 101, it was so easy....

  45. what we need to do by bash-2.02$ · · Score: 1

    is figure out which one of us is going to download this and torrent it. all the rest of us will stop downloading immediately.

    sure, that'll work

    but seriously, my wife is very interested in linguistics (Spanish major, almost a Russian minor, some ESL) and I'm curious as to how easy this will be to use for someone with no Linux experience

    --
    tofu is made of little baby seals
  46. NLP? by anethema · · Score: 1

    I totally read it as neuro-linguistic programming. Or... maybe it IS really neuro-linguistic programming, you listen to their CD for a while and you end up in one of these stories.

    --


    It's easier to fight for one's principles than to live up to them.
  47. Star Trek & Farscape by Anonymous Coward · · Score: 0

    I guess this means we're just a step closer to having Star Trek/Farscape style universal translator implants in our bodies. Can't wait until I can talk to a Klingon or a Vulcan... maybe even Elves, Dwarves, and Orcs! ;)

  48. Not really. there's already UG by geekpuppySEA · · Score: 1
    Eh, there is a universal translator: the idea that all ideas are experienced in the same way by humans. Ergo Universal Grammar.

    So what are you waiting for... linguists are waiting for the geeks to make data gathering easier, to give us more grist for the microscope.

    And besides - the more language data we get, the more complex mindlike matter we can incorporate into games and sims... so hop to it, people. You've got that girlfriend or nemesis to animate, har.

    --
    Intelligent Design: because MATH is HARD.
  49. MOD THIS UP by AmVidia+HQ · · Score: 1

    i want to know

    --
    VIVA1023.com | Political Fashion.
  50. Mousings fur ah excelling west. by Anonymous Coward · · Score: 0

    Of course, we are aware that "grammer" is a spelling error and not a grammar error, so there won't be 27 comments about that...

  51. Amazing.. by 0utlaw · · Score: 0, Offtopic

    It's amazing how you can have so much functionality and so many interesting products surface out of the free Linux kernel but still not dominate the market. Reminds me of this joke:

    if operating systems ran the airplanes

    Air DOS Everybody pushes the airplane until it glides, then they jump on and let the plane coast until it hits the ground again. Then they push again, jump on again, and so on...

    Mac Airlines All the stewards, captains, baggage handlers, and ticket agents look and act exactly the same. Every time you ask questions about details, you are gently but firmly told that you don't need to know, don't want to know, and everything will be done for you without your ever having to know, so just shut up.

    Windows Air The terminal is pretty and colorful, with friendly stewards, easy baggage check and boarding, and a smooth take-off. After about 10 minutes in the air, the plane explodes with no warning whatsoever.

    Windows NT Air Just like Windows Air, but costs more, uses much bigger planes and takes out all the other aircraft within a 40-mile radius when it explodes.

    Linux Air Disgruntled programmers decide to start their own airline. They build the planes, ticket counters, and pave the runways themselves. They charge a small fee to cover the cost of printing the ticket, but you can also download and print the ticket yourself. When you board the plane, you are given a seat, four bolts, a wrench and a copy of the seat-HOWTO.html. Once settled, the fully adjustable seat is very comfortable, the plane leaves and arrives on time without a single problem, the in-flight meal is wonderful. You try to tell customers of the other airlines about the great trip, but all they can say is, "You had to do WHAT with the seat?"

  52. what about historical etymology? by Anonymous Coward · · Score: 1, Interesting

    I'm not at all interested in airy analysis about sentence structure -- I like historical linguistics. Ever wonder about a word like "go"? Why is its preterit "went"? Well, the preterit used to be "eode" which actually comes from the same stem from which Latin ire comes. And from ire, we get only the French future stem ir- (as in J'irai -- I will go). This is important and all, but why are linguists so interested in this computer-related stuff, and not in the rich and varied history of our language, as well as those of many others?

    1. Re:what about historical etymology? by Anonymous Coward · · Score: 0

      Linguistics is concerned with the analysis of human language structure. Getting computers to understand language (however badly) aids this project immensely. Studying why we use a particular set of sounds for a particular meaning is not fundamental, and does not advance scientific understanding.

      Your example is like learning why { ... } is used for blocks in Java programs - mildly interesting, but somewhat beside the point.

    2. Re:what about historical etymology? by Anonymous Coward · · Score: 0

      Translation --

      I like history. Why do you like computers and not history?

      I don't know
      (Linguistics is concerned with the analysis of human language structure.)

      Computers understand me.
      (Getting computers to understand language (however badly) aids this project immensely.)

      History sucks.
      (Studying why we use a particular set of sounds for a particular meaning is not fundamental, and does not advance scientific understanding.)

      You suck.
      (Your example is like learning why { ... } is used for blocks in Java programs - mildly interesting, but somewhat beside the point.)

  53. Re:There is a downside to Natural Language Process by goodbye_kitty · · Score: 1

    And if computers can soon parse logographic languages like Chinese, will it prevent simplification or even conversion to a (arguably better) phonetic alphabet?

    Why is this a "potential drawback"? Written Chinese is a beautiful, expressive language and a phonetic alphabet (there are several around) results in a severe watering down of the meaning attached to any particular word. Having said this, 90% of educated mainland Chinese can read romanized pinyin anyway, but few would choose to write with it unless there was a specific need.

  54. Clarification: Controlled Language [Re:Great...] by j.leidner · · Score: 2, Informative
    Controlled language is the conscious decision of an organisation to use only a subset of what a natural language like English offers in technical documentation (medical leaflets, submarine documentation, maintenance manuals, software documentation) in order to avoid confusion.

    (1) Insert the knob behind the lever.

    In (1) you could perhaps use a handful of terms instead of "knob" -- controlled language enforces only certain licensed terms, thus increasing overall consistency (same terms for same thing). This can be checked automatically once a positive list (or typically a hierarchy called "thesaurus") has been set up.

    (2) He saw the girl on the hill with the telescope.

    The second and third cases are lexical and structural ambiguity: we want to avoid problems like those in (2), where "saw" could be past of "to see" or have another (more morbid) interpretation. Even worse, it is unclear whether the girl is on the hill, carrying the telescope or whether "he" is spying on the girl with the telescope. I leave it as an exercise to the reader how many combinations (possible interpretations) there are in a sentence like (2) [Hint: Which verb? Who is where? Who carries the telescope?].

    In a Controlled Language scenario, e.g. ACE, after some initial investments in thesaurus construction, thesaurus lookup and simple parsing techniques are used to report problematic passages to a human editor, who has to correct them manually.

    This is not programming in natural language. Typically only large companies can afford the initial investment.
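
    The thesaurus-lookup step can be sketched in a few lines. This is only an illustration, not how ACE or any real controlled-language checker works; the THESAURUS table, the function name check_terms and the sample text are all invented for the example.

```python
import re

# Hypothetical thesaurus: each unlicensed variant maps to the one licensed term.
THESAURUS = {
    "button": "knob",
    "dial": "knob",
    "handle": "lever",
}


def check_terms(text, thesaurus=THESAURUS):
    """Return (line number, offending word, licensed term) for every
    unlicensed term, so a human editor can correct the passage."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in re.finditer(r"[A-Za-z]+", line):
            word = match.group()
            licensed = thesaurus.get(word.lower())
            if licensed is not None:
                findings.append((lineno, word, licensed))
    return findings


# Invented sample text: line 1 is fine, line 2 uses two unlicensed terms.
doc = "Insert the knob behind the lever.\nTurn the dial, then pull the Handle."
report = check_terms(doc)
```

    The checker only reports; as described above, the correction itself is left to the human editor. Catching the ambiguities in (2) would need actual parsing on top of this lookup.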

  55. Where oh where? by rock_climbing_guy · · Score: 2, Funny

    Where are the "All Your Base" trolls when it's actually relevant?

    --
    Wh47 d1d j00 541, 31337 15n't t3h r0xor5 ne m0r3???
  56. Zhang Le is a cunning linguist ! by Anonymous Coward · · Score: 0

    And a Linux lover too.

    1. Re:Zhang Le is a cunning linguist ! by ballpoint · · Score: 1

      This is a picture of a Linux-loving cunning linguist.

      --
      Flourescent (adj): smelling like ground wheat.
  57. Re:There is a downside to Natural Language Process by zsau · · Score: 1

    Computers haven't stopped grammar evolving. I find some sentences without contractions ungrammatical, but a computer won't mark them as such, but nor will it mark the contracted form ungrammatical. Also, in my dialect, there takes ''s', regardless of whether it's plural or singular ('there's a million people here'). A relatively recent innovation (I didn't even realise it was an innovation till it was brought to my attention though).

    o:sVu, Di @dv{:ntidZ @v @ kn=sist@nt sp{liN @z D{t D@ pkj1}j{r@d@iz @v mai dailekt n@id@n bOD@ j1} (XSAMPA. Less oddly, we might have:
    orso, dhe advowntidj ov a consistant spalling iz dhat dhe pekuyaredyz ov my dylect need'n bodha you)

    --
    Look out!
  58. Another disc by frostman · · Score: 1

    can be found here.

    It's either much harder or much easier to read, depending on your point of view.

    --

    This Like That - fun with words!

  59. Speaking of musings..... by gosand · · Score: 1
    Linguists have always been geeky. Don't forget that Larry Wall is a linguist first.

    Let's not forget about Douglas Hofstadter either. He has written some books I think every geek should read: The Mind's I, and Gödel, Escher, Bach. If you can get through those, you should try Metamagical Themas. As melon-scratchers go, it's a honey-doodle.

    Funny story, that I am sure nobody cares about: My wife (then girlfriend) and I were both in a bookstore looking for books, and were in different parts of the store. She was getting her Masters in French Linguistics. We met up to check out, and she was excited about the book she found. I told her I found a really cool one too, Metamagical Themas. I showed her the cool stuff in this book, and she agreed it was interesting. Then she showed me some of the interesting stuff from the book she picked: Le Ton Beau De Marot: In Praise of the Music of Language. We then realized that they were written by the same guy! Hofstadter is really awesome, and ties the whole geek/linguistics thing together pretty well.

    --

    My beliefs do not require that you agree with them.

  60. How about translation of requirements to code? by master_p · · Score: 1

    Since both are languages, can, for example, these tools be used for translation of software requirements to code?

  61. linguistics and computer science by TheTick · · Score: 1

    Back when I was an undergrad, I was taking Principles of Compiler Design in one building on campus and Principles of Linguistics in another. However, the division seemed purely arbitrary.

    In Compiler Design we were learning all about lexical analysis, parse trees, and context free grammars. In Linguistics we were learning all about...lexical analysis, parse trees, and context free grammars. It was really interesting taking the two classes back-to-back, and observing the similarities (and differences).
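
    To make the overlap concrete, here is a toy version of the shared machinery: a lexical analyzer feeding a recursive-descent parser for a three-rule context free grammar (S -> NP VP, NP -> Det N, VP -> V NP). It is just a sketch in Python; the grammar, the LEXICON and the function names are invented for the example and come from neither class.

```python
# Hypothetical lexicon mapping each word to its lexical category.
LEXICON = {
    "the": "Det", "a": "Det",
    "dog": "N", "cat": "N",
    "sees": "V", "chases": "V",
}


def lex(sentence):
    """Lexical analysis: turn raw text into (category, word) tokens."""
    tokens = []
    for word in sentence.lower().split():
        if word not in LEXICON:
            raise ValueError(f"unknown word: {word}")
        tokens.append((LEXICON[word], word))
    return tokens


def parse(tokens):
    """Recursive descent over S -> NP VP, NP -> Det N, VP -> V NP.
    Returns the parse tree as nested tuples."""
    pos = 0

    def expect(category):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != category:
            raise SyntaxError(f"expected {category} at token {pos}")
        node = tokens[pos]
        pos += 1
        return node

    def noun_phrase():
        return ("NP", expect("Det"), expect("N"))

    def verb_phrase():
        return ("VP", expect("V"), noun_phrase())

    tree = ("S", noun_phrase(), verb_phrase())
    if pos != len(tokens):
        raise SyntaxError("unexpected tokens after the sentence")
    return tree


tree = parse(lex("The dog chases a cat"))
```

    Swap the lexicon for keywords and identifiers and the rules for statements and expressions, and the same two functions are the front end of a compiler.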

    Don't even get me started on how Compiler Design (and Linguistics) put me leaps and bounds ahead of the curve when I took Modern English Grammar.

    --

    --
    bachiatari na torisetsu o yome!

  62. Linguistics and Anthropology by Enkerli · · Score: 2, Informative

    As both a partly self-labeled linguistic anthropologist and a cultural anthropologist, I would like to respectfully qualify the parent's statements on the state of the field. This really isn't meant as a flame but I do enjoy discussions on the difficult relationship between linguistics and anthropology.
    First, while anthropology seems to emphasize linguistics to a much lesser degree than in Boas' era, a large number of anthropologists do work on language, in one way or another. Granted, the groundwork of deciphering unknown languages isn't really part of the discipline anymore, but thorough research projects on how language and language varieties work in social and cultural settings are prominent in the work of many anthropologists, from Michael Silverstein to Alessandro Duranti. Whether or not you call this type of language science "linguistics" is a matter of choice. The fact remains that language still plays a prominent role in contemporary anthropology.
    The matter of whether or not "post-modernism" killed cultural anthropology is also open to debate. While I understand the claim and did feel some frustrations caused by "post-modern" anthropology, I think that the ultimate impact is that of enhancing anthropology. True, most cultural anthropologists have stopped writing monographs about "The Xs," but "post-modern" self-criticism is now being replaced by hybrid research activities combining theory and practice. Interestingly enough, language has a large impact on much of this work, at least in the form of meaningful exchanges. Again, maybe not "linguistic" in the strictest sense, but surely enough to warrant language training.

    --
    Alexandre http://enkerli.wordpress.com/
    1. Re:Linguistics and Anthropology by belmolis · · Score: 1

      I think we're basically in agreement on the relative roles of linguists and anthropologists in language work these days. The main point that I meant to make was that it is for the most part linguists rather than anthropologists who now do the "bulk documentation" work. Secondarily, I think that it is true that anthropological involvement in linguistic work has declined in two other ways. First, my impression is that it is considerably rarer than it once was for anthropologists to learn the language of the people they are studying unless their research is specifically on linguistic topics. Second, what some would call "cognitive anthropology", e.g., specifically, the study of such things as kinship systems, color terms, and folk biological taxonomy, is out of fashion with, indeed despised by, a large fraction of anthropologists. This isn't just my own perception: I know of very distinguished senior anthropologists working in this area who say that they feel that they no longer have any disciplinary home: their work is appreciated by linguists and psychologists, but, they say, not by most anthropologists. But, as the parent says, there is an active area of anthropology concerned with language use.

      I am not so sanguine about Postmodernism, though in part it depends on what you mean, and Postmodernism is a slippery creature. If we're talking about what I would regard as the hard core of Postmodernism, with deconstruction at its center, I view it as wholly negative. The epistemological and linguistic foundation is infantile, the result is not "self-criticism" but intellectual nihilism ("there is no truth and nothing can be known"), and it replaces the search for fact and valid argument with a lack of concern for data, ad hominem argument, and arguments based on putative political implications. On top of all this, Postmodernist writing tends to be truly awful, its only virtue the fact that it exemplifies Chomsky's observation that there are grammatical sentences of natural languages that have no semantic interpretation. The Postmodernism Generator seems to me to be entirely realistic.

      I don't accept the idea that Postmodernism has led to worthwhile self-criticism because Postmodernism doesn't actually motivate self-criticism. In fact, self-criticism is part of the standard scientific method of which Postmodernists are so critical.

  63. Rogue? by DrCode · · Score: 1

    Hey, that was one of my favorite tools too, back in the 80's. Can't think of anything better for finding the Amulet of Yendor.

  64. Common in mathematics by DrCode · · Score: 1

    You're likely correct. I've heard that often, the first person to prove a theorem in mathematics does it in a very complex way. Later, other mathematicians figure out how to simplify it. It's a little like cleaning up someone else's code.

  65. First application: by DrCode · · Score: 1

    spreadsheet.eng:
    ---
    Write a spreadsheet that's Excel-compatible.
    ---

    gcc -o spreadsheet spreadsheet.eng

  66. Re:Good Chinese Compression by greendot · · Score: 1

    Yes, exactly. I spent a year in Vietnam and had the hardest time understanding the tonal differences.

    What I hear as one word will actually have 6 tonal differences. Very sing-songy. Westerners imply emotion in their tones. There are hidden meanings in our tones. All of this actually confused the locals too.

    And yes, western languages do have more sounds than eastern languages. At the university there, I had a side-by-side chart of the sounds in each language. English had maybe two or three times as many sounds as Vietnamese. Their language is straightforward, albeit "foreign". :)

    Another thing English has over a lot of languages: "flavor" or "color". We can be very colorful in how we say things. We can twist something with a hint of sarcasm or irony or humor. We can throw words together like "asshat" or "prairie dogging" and we "get it". They won't. And other simple things where we personify items. Nobody ever got used to me personifying items. It would rain, I would look up and say "stupid rain" and people would ask me "how can the rain be stupid?"

  67. Re:Good Chinese Compression by aldousd666 · · Score: 1

    It's because when you are born, you're capable of pronouncing anything. As you grow up listening to the sounds of those around you talking, your brain 'tunes in' to sounds you perceive as relevant and hear often. All other sounds are treated as background noise. So -- Americans raised on English not only have trouble speaking Chinese, but they have trouble hearing it correctly. Look at the Nguni languages of certain African tribes. They serialize syllables consisting of clicks and ticks of the tongue that we would at best interpret as salivary overhead. This is why westerners, not just Americans, have trouble with eastern languages. Tones, pauses, pitch, all of these things are involved in every language, far beyond the simple translation of text into phonetic looking symbols. The phonetics we're familiar with limit us to certain languages. You'd write out 'exactly' as EX-AKT-LEE or something like that, but how would you write out the ticking clicking clopping sounds of the Nguni? You wouldn't know where to begin. You have no context for them. Hell, even most people can't properly hear the pronunciation of Yiddish or Hebrew, and that's more closely related to western languages than the oriental suite. Face it, it's the upbringing of people speaking oriental tongues that causes the R L confusion -- not their intelligence or so called lack thereof as the pundits seem to assume.

    --
    Speak for yourself.
  68. Re:Clarification: Controlled Language [Re:Great... by dvdeug · · Score: 1

    (2) He saw the girl on the hill with the telescope.

    where "saw" could be past of "to see" or have another (more morbid) interpretation.


    No, it couldn't. I saw, you saw, but he saws. Proper verb conjugation won't allow your alternate interpretation.

  70. Re:Aren't patents written in that? (No) by waterbear · · Score: 1

    I saw someone working on something like parsing English as a programming language

    I thought English was already a programming language, designed for querying PICK databases.

    But seriously, don't patents try to describe a process in a limited subset of the English language?


    Seriously, no, patents don't have any linguistic axe to grind. The function of a patent specification is to tell the world, in language that the ordinary specialist in the field will be able to understand, that here is a new and useful thing, this is what its essentials are, and then here is the inventor telling how to make and use it. Many peculiarities of patent drafting are learned as precautionary reactions to some one or other of the pitfalls that await to trap the unwary, especially when it later turns out that amendment is needed: the patent falls into the pit when the desired subject of the amendment is not found in the original document .... it would take far too long to give examples, I'd better stop right here!

    -wb-

  71. Any working mirrors for the iso? by Anonymous Coward · · Score: 0

    Does anybody know of any working mirrors for this ISO?

    Thanks in advance.

  72. torrent here by Anonymous Coward · · Score: 0

    http://snorg.org/morphix-nlp-1.1.iso.torrent