Slashdot Mirror


Open Source Grammar Checkers?

DaveBarr asks: "Maybe I'm more sensitive to this than most, but after continuing to see "it's" instead of "its" and "loose" instead of "lose" everywhere in the media and on web sites of supposedly reputable origin, I began to wonder. Are there any Open Source projects trying to develop a reliable grammar checker -- one that would catch these common foibles? Are all these algorithms proprietary? Are there any University research projects which could be used as a basis for even a halfway-decent grammar checker?"

17 comments

  1. Grammatik (closed source) by Anonymous Coward · · Score: 0

    I don't know what the status of the software is now, but there was once a company called Reference Software which made a really good grammar checker called Grammatik. Unfortunately, they were bought by WordPerfect, which was subsequently bought by Corel. I think that Grammatik may have been integrated into WordPerfect. Since the guys at Corel are so big into Linux right now, maybe they would consider open-sourcing Grammatik?

  2. Re:Perl snippet, translated by Anonymous Coward · · Score: 0

    Not everyone understands Perl?!

    O.k...in the universal language of vi...
    s/\([Ii]t\)'s/\1 is/g

  3. Sorry, but I think you're very optimistic! by Anonymous Coward · · Score: 0

    What I read is that encoding meaning is a remarkably difficult task. The Fifth Generation Project in Japan was said to be really about making it easy to enter Japanese text into computers; however, it apparently turned out that cultural context was simply too difficult to define, and that goal couldn't be reached. There's an ambitious project in Texas (?) that's trying to encode meaning, as I remember, and they've found out it's a huge task.

  4. Syntax error in that? by Anonymous Coward · · Score: 0
    I don't know that syntax, yet, but didn't you mean

    s/([iI]t)'s/$1 its/g ?

    NB

  5. FIRST SUPERIOR POST by Velox · · Score: 0

    THIS IS A
    S
    M
    A
    R
    G
    L
    E
    ANNOUNCEMENT!

    The smargle race is superior!

    All furry creatures - bunnies, hamsters, sheep, muppets, kitties and fuzzy smarglegoblins -
    realize this, for they are allies of the smargle.

    Do not attempt to challenge the smargle, for then you will surely gain a smargle right in the *bleep*.

    You may now look to your right.

    Plekt you!
    - smargle frep

  6. Slashdot could do it's [sic] part. by Anonymous Coward · · Score: 1

    I'm always amazed when a Slashdot article
    is posted *without* any grammar mistakes.

    I've often wondered what would happen if
    the "preview" function for submitting an
    article included something like
    s/([iI]t)'s/$1 is/g

    Can we force geeks to recognize "it's"
    for what it is through technology?
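
    For anyone who wants to try the idea outside Perl, here is a minimal sketch of the same substitution in Python. The function name expand_its is made up for illustration, and the \b word boundaries are an addition that the one-liner above does not have.

```python
import re

# Matches "it's" / "It's" as a whole word; the capture group keeps the case of "it".
ITS_PATTERN = re.compile(r"\b([Ii]t)'s\b")


def expand_its(text: str) -> str:
    """Rewrite every "it's" as "it is" so a misused possessive reads wrong in preview."""
    return ITS_PATTERN.sub(r"\1 is", text)


if __name__ == "__main__":
    # The second "it's" should have been "its"; after expansion the mistake is hard to miss.
    print(expand_its("It's a shame the site lost it's way."))
```

    Two caveats: "it's" can also mean "it has" ("it's been fixed"), which this expands incorrectly, and typographic (curly) apostrophes are not matched.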

    --kyler

  7. Dr. Bruce E. Wampler and Grammatik by Anonymous Coward · · Score: 1
    It is interesting to note that Grammatik was written by a very big Open Source author and advocate, Dr. Bruce E. Wampler. Dr. Wampler is also the author of the Open Source, LGPL-licensed V C++ Portable GUI. It is a very easy to learn C++ GUI that is portable between Windows, OS/2, and Linux. It was designed that way from the start and is quite a robust and clean product.

    Hmmm, I wonder if Dr. Bruce has any thoughts on designing an Open Source grammar checker? He probably could offer a lot of guidance to any group who wanted to start such a product.

  8. 1) Homophone checker: A needed addition to WPs by Anonymous Coward · · Score: 1
    First, what's a homophone? It's the preferred term for words (like "sale" and "sail") that sound identical but mean different things. Our present system of teaching reading fails to set up the neural pathways to recognize and remember which of several possible spellings was used in some text. It seems to be enough to obtain the correct sound, and to forget about the spelling. It's not enough. We seem to be getting away from communicating via text, and only sound seems to matter to many people. It's really sad when "they're" ( = "they are") is used for "there".

    Building upon spelling checker code, a fairly small dictionary could provide all the data needed to identify most homophones. At the user's choice, each homophone could be flagged with alternate spellings shown in a dialog box, with really-concise meanings for each. The user would select the intended meaning.
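
    A rough sketch of what such a flagger might look like, assuming a small hand-written table of homophone groups. The table contents, the glosses, and the name flag_homophones are all invented for illustration; a real tool would load a proper dictionary and show the alternatives in a dialog instead of returning them.

```python
import re

# Hypothetical table: each group maps the alternate spellings to a very short meaning.
HOMOPHONE_GROUPS = [
    {"sale": "the act of selling", "sail": "cloth that catches wind on a boat"},
    {"there": "in that place", "their": "belonging to them", "they're": "they are"},
    {"its": "belonging to it", "it's": "it is / it has"},
]

# Index every spelling to the group it belongs to.
LOOKUP = {word: group for group in HOMOPHONE_GROUPS for word in group}

# A word is a run of letters, optionally followed by an apostrophe and more letters.
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def flag_homophones(text):
    """Return (position, word_as_written, {spelling: meaning}) for each homophone found."""
    flags = []
    for match in WORD_RE.finditer(text):
        group = LOOKUP.get(match.group().lower())
        if group is not None:
            flags.append((match.start(), match.group(), dict(group)))
    return flags


if __name__ == "__main__":
    for position, word, choices in flag_homophones("They're going to sail over there."):
        print(position, word, choices)
```

    The sketch flags every listed homophone, right or wrong, and leaves the choice to the user, which is the behaviour described above.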

    So far, this idea seems to have generated little interest, but it would help create fewer ridiculous bodies of text.

    Far more ambitious would be a lexical analyzer that would try to deduce whether a given homophone seemed appropriate for the meanings of the words (a bottomless pit?) in the surrounding text. (Bloatware, anyone?)
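
    To give a feel for the simplest possible version of that, here is a toy sketch that votes between "sale" and "sail" by counting cue words in the same sentence. The cue lists and the name suggest_spelling are assumptions made up for this example; it does no real lexical analysis.

```python
import re

# Hypothetical cue words that tend to appear near each spelling.
CONTEXT_CUES = {
    "sale": {"price", "discount", "store", "buy", "sold"},
    "sail": {"boat", "wind", "mast", "ship", "sea"},
}


def suggest_spelling(sentence, candidates=("sale", "sail")):
    """Return the candidate with the most cue words in the sentence, or None if unclear."""
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    scores = {c: len(words & CONTEXT_CUES[c]) for c in candidates}
    best = max(scores.values())
    winners = [c for c, score in scores.items() if score == best]
    # Refuse to guess when no cue was seen or when the candidates tie.
    return winners[0] if best > 0 and len(winners) == 1 else None


if __name__ == "__main__":
    print(suggest_spelling("The wind tore the sale off the boat"))
```

    Every homophone group needs its own cue list, and the lists grow quickly, which is the bottomless pit mentioned above.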

    Nicholas Bodley // nbodley@world.std.com

  9. Re:Parsing English (or any other language) by K-Man · · Score: 1

    It's true that most AI programs have a necessarily limited semantic model, based on a few logic predicates and deductive rules. Logic itself is a philosophical construct, derived from observations of how people reason and solve problems, but it's not really a model of how the brain works, and efforts to get computers to assemble their own sets of rules and facts have been largely unsuccessful.

    "When people try to get computers to learn, the people do and the computers don't" - Alan Perlis

    --
    ---- "If we have to go on with these damned quantum jumps, then I'm sorry that I ever got involved" - Erwin Schrodinger
  10. Not open source but... by mwillis · · Score: 1

    WordPerfect for Linux comes with a grammar checker, Grammatik, licensed from Novell.

  11. Koffice? by DanKolb · · Score: 1

    Forgive me if I'm wrong here, but won't KOffice come with a grammar checker?

    --
    Common sense is a set of prejudices built up over a lifetime
  12. Consider the following: by MostlyHarmless · · Score: 1

    Squad helps dog bite victim (a classic indeed).

    Or:

    The boy is hungry
    The boy is a toad

    Or:

    The boy carried a sandwich to the playground and ate it. (the playground? Note that conjunctions are the most ambiguous words in the English language.)

    It's easy for us to tell how to parse those, but a computer would have to maintain a database of the following:

    playground is big
    sandwich is small
    people normally eat small things
    when dogs bite, they harm humans
    a noun indicating [a] human[s] (squad) would not harm humans.

    One can argue that the purpose of learning is to fill in those pieces of knowledge, but:

    1) The amount of knowledge that would have to be stored and recalled is *huge*.
    2) Even if we have the storage and recall capacity, computers need to be able to interpret everything and know that, among other things, squad can be a group of people, "normally" may not always apply, etc. etc.
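
    As a toy illustration of the sandwich case only, the facts listed above can be written as a lookup table. The table layout and the name resolve_eaten are invented for this sketch; nothing here parses English.

```python
# Hypothetical fact table mirroring the list above.
SIZE = {"playground": "big", "sandwich": "small"}

# "people normally eat small things"
NORMALLY_EATEN_SIZES = {"small"}


def resolve_eaten(candidates):
    """Return the candidate nouns that could plausibly be what "ate it" refers to."""
    return [noun for noun in candidates if SIZE.get(noun) in NORMALLY_EATEN_SIZES]


if __name__ == "__main__":
    # "The boy carried a sandwich to the playground and ate it."
    print(resolve_eaten(["sandwich", "playground"]))
```

    A noun missing from the table yields no answer at all, which is point 1) in miniature: every noun the program might meet needs an entry.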

    void recursion (void)
    {
    recursion();
    }
    while(1) printf ("infinite loop");
    if (true) printf ("Stupid sig quote");

    --
    Friends don't let friends misuse the subjunctive.
  13. WHO WANTS TO WRITE ONE by jdigital · · Score: 1

    I'm very interested in the project of writing such a beast. I have been interested in natural language processing for years. I'm also a C coder (under *nix). Anyone interested please email me at joshr@netspace.net.au

    --
    :wq ~ ~ ~ ~ ~
  14. Perl snippet, translated by noc · · Score: 1

    "s/([iI]t)'s/$1 is/" is (ugh) Perl for "substitute `it is' or `It is' for every instance of `it's' and `It's'" I don't know why people expect everyone on /. to understand Perl. I only use it for fixing other ppl's broken Perl code.

  15. Re:Parsing English (or any other language) by GenCuster · · Score: 1

    The problem with parsing English lies in the nature of English itself. English was not designed to be parsed. It was not designed with a logical structure that has been consistently implemented.

    The question is what you mean by a grammar checker. If you simply mean a program that reads text and tries to find obvious errors, you do not need to parse English completely. To extend the example from above, you do not need to know exactly what "The cow is brown" means, only whether the subject and verb agree. That program would just need to recognize certain patterns as wrong. That is not impossible.
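
    A minimal sketch of that pattern-matching approach, assuming a handful of hand-written rules. The rules, messages, and the name check are hypothetical, and the rules are deliberately narrow; a usable checker would need many more and would still produce false positives.

```python
import re

# Hypothetical rule list: (pattern that is almost always wrong, message to show).
RULES = [
    (re.compile(r"\b(?:a|an|this)\s+\w+\s+are\b", re.I),
     "singular determiner with plural verb 'are'"),
    (re.compile(r"\b(?:these|those)\s+\w+\s+is\b", re.I),
     "plural determiner with singular verb 'is'"),
    (re.compile(r"\b(\w+)\s+\1\b", re.I),
     "repeated word"),
]


def check(text):
    """Return (matched_text, message) for every rule hit, grouped in rule order."""
    hits = []
    for pattern, message in RULES:
        for match in pattern.finditer(text):
            hits.append((match.group(0), message))
    return hits


if __name__ == "__main__":
    for matched, message in check("This cow are brown, and those cows is brown too."):
        print(f"{matched!r}: {message}")
```

    None of the rules knows what a cow is; each one only recognizes a shape of text that is rarely correct.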

    The other side of it, a program that actually understands what you are writing and figures out the best way to communicate that, is much more complex. It would be a very cool program if it could be completed. Besides, what better than OSS to harness the immense mindshare that would require?

    That being said, my grammar is so horrible I would love to see either one working as soon as possible.

    Nate Custer

    --
    "The poet presents his thoughts festively, on the carriage of rhythm; usually because they could not walk" Nietzsche
  16. GNU/FSF has one by severett · · Score: 2

    There is a program called diction.

    This is a GNU program still in development. It's available at:
    this link

    I've played with diction and it's not bad, not great but not bad. :)

  17. Parsing English (or any other language) by Greg+Merchan · · Score: 2

    Frankly, I'm surprised that I haven't seen a program that understands a spoken human language. The rules are codified in millions of textbooks and semantics should be parsable from WordNet, the OED or various other sources. And there are plenty of 'M-x doctor'-like programs that try to emulate conversation; and some of them, like megahal, can 'learn' well enough to fool some people.

    I've even played with coding a C library that reads like English without proper writing mechanics. A natural language interpreter shouldn't be too hard, though it would be time consuming and would probably not produce a substantial return on investment to a financial sponsor.

    I am inclined to think that the problem is ideological. There are so many disagreements among philosophers, linguists, and computer scientists as to the meaning of 'The cows are brown.' that unless one person is sufficiently savvy of all three and some other disciplines, no consensus or plan will ever be implemented.