Slashdot Mirror


A Useful Grammar Checker?

burtdub asks: "With the amount of raw text data available, there seems to be no shortage of ambitious language projects on the horizon, from Universal Language Translators to Junk Email Filtering. However, the mess that is the English language still seems to elude commercial attempts while being relatively ignored by the open source community. What would it take to make a useful, functional grammar checker?"

28 of 503 comments

  1. Simple enough. by FireballX301 · · Score: 4, Funny

    All you need is my 7th grade English teacher staring over your shoulder all day.

    That'll get you twisted into shape real good.

    1. Re:Simple enough. by LiquidCoooled · · Score: 3, Funny

      My missus does that all the time and when I showed her the original reply I had written she corrected me on that, then went away and banged her head on the wall because she realised what I was posting about.

      --
      liqbase :: faster than paper
  2. Make it for Latin by ari_j · · Score: 5, Interesting

    The best way to write a useful grammar checker is to write it for a language with a rational syntax.

    1. Re:Make it for Latin by parvenu74 · · Score: 5, Interesting

      Rational syntax? Latin? It's one of the few languages in which you can scramble the order of the words in the sentence and not lose any meaning because each word carries enough meta-data in the form of all of the various endings. Heck, regular verbs alone have 140 different forms, and irregular verbs are exactly that, with unique endings per item. And who's to say that the "nominative-ablative-dative-accusative-verb" syntactical ordering is either correct or ideal? Cicero doesn't write like that half of the time and Caesar almost never did in his "Gallic Wars." And consider that the Catholic Church, which has used Latin as its official language longer than the Romans did, has adopted a simplified vulgatum form officially, not that the various Popes and writers throughout the centuries have bothered to use that instead of the higher-browed Classical Latin.... whose rules are you proposing to follow?

      English might actually be an easier task than trying to parse Latin.

    2. Re:Make it for Latin by ari_j · · Score: 4, Insightful

      All those different forms and the nearly syntax-free sentence structure are precisely why it is easier to parse Latin than English.
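      The claim that endings, not position, carry the roles can be sketched in a few lines. This is only a toy under loud assumptions: it knows first-declension singular nouns and one verb ending, and the function name `roles` and the ending table are made up for illustration.

```python
# Toy illustration only: first-declension singular nouns and a
# third-person-singular verb ending. Real Latin morphology is far richer.
def roles(sentence):
    """Assign subject/object/verb from word endings, ignoring word order."""
    result = {}
    for word in sentence.lower().rstrip(".").split():
        # Check the longer endings first, since "-am" and "-at" both contain "a".
        if word.endswith("am"):
            result["object"] = word      # accusative, e.g. "agricolam"
        elif word.endswith("at"):
            result["verb"] = word        # e.g. "amat"
        elif word.endswith("a"):
            result["subject"] = word     # nominative, e.g. "puella"
    return result
```

      Under these assumptions, `roles("Puella agricolam amat.")` and `roles("Agricolam amat puella.")` return the same assignment, which is the point being argued: the analysis never looks at position.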

    3. Re:Make it for Latin by dgatwood · · Score: 5, Insightful
      The thing is that most Romance languages also have word order restrictions. French, for example, adjectives come after the noun they modify.

      What makes English such a pain in the backside is that the language has been so utterly simplified over the millennia that we have lots of words with identical spellings, but different parts of speech. This makes the word order critical.

      Technically, word order isn't critical in English. I can say "Campus green and tow'ring trees" and you understand I'm talking about a green campus. This was actually common usage in the not-so-distant past.

      The problem, though, is that words have become overloaded and/or multiple words combined to a single term. For example, the green lantern is probably something you carry around to provide light when the power goes out. The Lantern Green is probably a place where they play cricket.

      We're seeing this happening with things like "it's vs. its" and "their vs. they're vs. there" in some people's usage as well. Every time the spelling distinction between words breaks down, it becomes significantly more difficult for anything short of a person to get meaning out of a sentence. That's why there are so many spelling/grammar nazis on slashdot. If we don't, in a matter of just a few years, we'll get to the point where nobody can understand anything.

      There is another theory which states that this has already happened.

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    4. Re:Make it for Latin by ari_j · · Score: 4, Informative

      While I am one to appreciate good grammer and spelling, I hardly think that people English will become more difficult for native speakers to understand or use. As long as everyone screws it up in a consensual manner, we'll know what others mean.

      Q.E.D.

    5. Re:Make it for Latin by brpr · · Score: 5, Insightful
      It's depressing being a linguistics student. Every time a language-related topic is raised you have to listen to people who don't know what they're talking about spouting off and getting modded +5 insightful (or whatever the non-Slashdot equivalent of this accolade may be).

      What makes English such a pain in the backside is that the language has been so utterly simplified over the millennia

      No, it hasn't been simplified. At least, you won't find any linguist or student of Old or Middle English who'll claim that it has simplified as opposed to changed. Presumably you'll back up this outlandish statement with, say, a detailed analysis of the history of the case system in English from the Norman conquest onwards?

      that we have lots of words with identical spellings, but different parts of speech.

      Yeah, just like every other language. Do you have any data suggesting that English is unusual in this respect?

      This makes the word order critical.

      Word order isn't critical because of homographs, it's critical because the rules of English grammar are strict about word order. From a more practical point of view, it's critical because English is too poorly inflected for a parser to work out the structure of a sentence without reference to the order of the words. In any case, there's nothing particularly difficult about parsing languages with strict word order rules, or parsing languages with homographs and homophones, or parsing languages with both.

      Every time the spelling distinction between words breaks down, it becomes significantly more difficult for anything short of a person to get meaning out of a sentence.

      Not really. The problem of people writing "their" instead of "they're" is absolutely trivial compared to the staggeringly difficult task of accurately parsing natural language, or machine translation, or any other NLP problem of similar complexity. For God's sake, just list "their" as a synonym for "they're" in your parser and it will figure out which meaning was intended from the grammatical structure (there are few, if any, syntactic contexts in which more than one of "there", "their" or "they're" is correct).
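      A rough sketch of that idea follows: treat the three spellings as one interchangeable set and let the neighbouring word decide which was meant. It is not a parser. The word lists, `expected_form` and `check` are hypothetical stand-ins for the grammatical knowledge a real parser would supply.

```python
import re

# Hypothetical confusion set: spellings treated as interchangeable on input.
CONFUSION_SET = {"there", "their", "they're"}

# Tiny hand-picked word lists standing in for real part-of-speech information.
BE_FORMS = {"is", "are", "was", "were"}
NOUNS = {"house", "dog", "car", "grammar", "books"}
PARTICIPLES_OR_ADJECTIVES = {"going", "coming", "late", "happy", "wrong"}


def expected_form(next_word):
    """Guess which member of the confusion set fits before next_word."""
    if next_word in BE_FORMS:
        return "there"      # existential: "there is a dog"
    if next_word in NOUNS:
        return "their"      # possessive: "their dog"
    if next_word in PARTICIPLES_OR_ADJECTIVES:
        return "they're"    # pronoun plus "are": "they're going"
    return None             # not enough context to decide


def check(sentence):
    """Return (position, written, suggested) for each suspicious token."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    problems = []
    # The last token has no following word, so it is never examined.
    for i, token in enumerate(tokens[:-1]):
        if token in CONFUSION_SET:
            guess = expected_form(tokens[i + 1])
            if guess is not None and guess != token:
                problems.append((i, token, guess))
    return problems
```

      For example, `check("Their going to lose there dog.")` flags both misuses, while `check("They're late and their car is there.")` returns an empty list. Looking only one word ahead is a deliberate simplification and will miss plenty of real cases.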

      If we don't, in a matter of just a few years, we'll get to the point where nobody can understand anything.

      People have been saying this for hundreds of years.

      So, basically, you've taken one of the most difficult areas of AI (NLP) and argued that it's really difficult these days because sometimes people spell "they're" incorrectly. Weird.

      --
      Freedom is not increased by mere diminution of government. Anarchy is freedom for the strong and slavery for the weak.
    6. Re:Make it for Latin by cfuse · · Score: 4, Interesting

      We must polish the Polish furniture.

      He could lead if he would get the lead out.

      The soldier decided to desert his dessert in the desert.

      Since there is no time like the present, he thought it was time to present the present.

      A bass was painted on the head of the bass drum.

      When shot at, the dove dove into the bushes.

      I did not object to the object.

      The bandage was wound around the wound.

      The farm was used to produce produce.

      The dump was so full that it had to refuse more refuse.

      The insurance was invalid for the invalid.

      There was a row among the oarsmen about how to row.

      They were too close to the door to close it.

      The buck does funny things when the does are present.

      A seamstress and a sewer fell down into a sewer line.

      To help with planting, the farmer taught his sow to sow.

      The wind was too strong to wind the sail.

      After a number of injections my jaw got number.

      Upon seeing the tear in the painting I shed a tear.

      I had to subject the subject to a series of tests.

      How can I intimate this to my most intimate friend?

  3. How about LEARNING the English language? by TripMaster Monkey · · Score: 3, Insightful


    What would it take to make a useful, functional grammar checker

    How about a competently taught high school English class?

    Seriously, people...learn to use the language...you'll be better off.

    --
    ____

    ~ |rip/\/\aster /\/\onkey

    1. Re:How about LEARNING the English language? by Haeleth · · Score: 3, Insightful

      Agreed! We already have the problem of people not knowing how to spell (reliance on spellchecking) and people not being able to do basic math (reliance on calculators) - this would just dumb people down even more.

      And don't forget the problem of people not knowing how to shoe a horse (reliance on motor vehicles), or light a fire (reliance on electricity), or plough a field (reliance on supermarkets).

      Wait, those aren't problems, they're examples of how the advance of technology has completely obsoleted things that used to be vital life skills. Whereas clearly spelling, grammar, and basic maths are completely different, and we should not be making any effort to help people take their mind away from niggling details and let them concentrate on the content of their writing or the implications of their calculations.

      No, wait, I'm still not quite following the logic here...

    2. Re:How about LEARNING the English language? by PitaBred · · Score: 5, Interesting

      And you wonder why people are stranded on the side of the road with a flat they can't change. You can't abstract out all the mechanics of anything, no matter how advanced.
      The problem is that "content" without proper mechanics loses all of its value, and without proper mechanics built into the content generation process, thoughts are muddled and incoherent. There's no structure enforced. That's why people start thinking crap like Scientology is a good idea. They have no rational thought processes, they're governed solely by "content", i.e. "emotion". Kinda like the gorillas and monkeys you see in zoo exhibits.

    3. Re:How about LEARNING the English language? by Deanalator · · Score: 5, Insightful

      Not to be a jerk, but how is that insightful? It's not even really that funny. An open source grammar checker would be extremely useful. Everyone mistypes from time to time, and oftentimes spellcheckers are unable to catch it.

      To the best of my knowledge, it's one of the harder open problems in the OSS community. I'm actually surprised that someone didn't enter something like that into the Google Summer of Code. If I had any idea where to start, I know I would have (and I did consider it). It's a very valid question, and I look forward to seeing if anyone here comes up with any good answers.

  4. AI by Roguelazer · · Score: 4, Insightful

    Grammar can often only be determined by context, especially in English, where the rules of grammar change so much. Until a computer can for itself understand context, no grammar checker can be successful (or even marginally useful). Thus, my answer to your question is two words: "Artificial Intelligence." Artificial stupidity can also be used to simulate bad English.

    1. Re:AI by tktk · · Score: 3, Funny
      Artificial stupidity can also be used to simulate bad English.

      What's the point in having artificial stupidity when we have natural stupidity in abundance?

  5. Bask in it! by TheTranceFan · · Score: 5, Funny

    Ahhh the irony of asking Slashdot how to build a grammar checker!

  6. Biofeedback by Doc Ruby · · Score: 4, Interesting

    People are always making these grammar checkers that work "from the inside out": look at the words, surround them with expectations of what words can agree with them grammatically, and flag contradictions. But humans are interactive with language, like everything else we do. Proper speakers and writers of English are good listeners (and readers). When we hear what we've said, we imagine what that would mean to us if it had been said to us. When the words make us think of something different from what we thought before we said them, we correct ourselves. A better grammar checker might work "from the outside in": compose imagery or relationships between recorded objects as represented in the written words, and show implications to the writer, to match against their expectations.

    That might be a mightily complex undertaking, akin to a machine "understanding" the words. But it would replicate the feedback we humans already use to keep our grammar correct, and to understand each other. If we aimed that high, we could probably find a less ambitious assistance that's easier to automate, but goes a long way towards helping us express our words to computers, and to each other using computers.
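    For contrast, the conventional "inside out" approach described above can be sketched as follows: each word carries features, neighbouring words are expected to agree, and any contradiction is flagged. The mini-lexicon and the name `flag_contradictions` are invented for illustration, and only number agreement between adjacent words is checked.

```python
# Hypothetical mini-lexicon: word -> (part of speech, grammatical number).
# A number of None means the word agrees with either singular or plural.
LEXICON = {
    "the": ("DET", None),
    "this": ("DET", "sg"),
    "these": ("DET", "pl"),
    "dog": ("NOUN", "sg"),
    "dogs": ("NOUN", "pl"),
    "barks": ("VERB", "sg"),
    "bark": ("VERB", "pl"),
}

# Adjacent pairs whose number features are expected to agree.
AGREEING_PAIRS = {("DET", "NOUN"), ("NOUN", "VERB")}


def flag_contradictions(sentence):
    """Return adjacent word pairs whose number features contradict."""
    words = sentence.lower().rstrip(".").split()
    flags = []
    for left, right in zip(words, words[1:]):
        if left not in LEXICON or right not in LEXICON:
            continue  # unknown word: no expectation to check
        left_pos, left_num = LEXICON[left]
        right_pos, right_num = LEXICON[right]
        if (left_pos, right_pos) in AGREEING_PAIRS:
            if left_num and right_num and left_num != right_num:
                flags.append((left, right))
    return flags
```

    Here `flag_contradictions("These dog barks.")` flags the pair `("these", "dog")` and nothing else. The checker has no idea what the sentence means, which is exactly the limitation the comment is pointing at.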

    --
    make install -not war

  7. Re:The Elements of Style and a good eye. by iced_773 · · Score: 5, Informative


    Speaking of The Elements of Style, the full text of the book can be found here. It's online now. Use it.

  8. English needs to be mutable. by vertinox · · Score: 4, Insightful

    One of the concepts that most people should realize is that the main success (and downfall) of the English language is that it can mutate quite easily.

    Remember... English is the bastard child of Celtic, Latin, and various other Germanic languages. Language also affects the way we think and also is the key limiting factor in grasping concepts.

    If your language cannot express a certain concept then you need a way to bend the rules (which English has a bad habit of doing) so that you can share that idea with others.

    To enforce a view or a proper method of speaking will often stagnate a society's ability to assimilate new ideas or methods. George Orwell pointed this out when he came up with the idea of Newspeak, in which society can restrain itself from unwanted aspects by removing society's ability to even discuss them.

    We obviously do not speak Elizabethan English or the olde English of the Middle Ages. Should our descendants be forced to speak an archaic language 200 years from now because we demanded to have our software set in stone what is the proper way to express ideas and communication?

    Man, this sounds a bit hippy-esque, but hopefully you understand what I mean.

    Still, there should be some ground rules to what proper English is and should be so we can understand each other without going "Huh?" but it shouldn't be a hard-line stance that is unchangeable for the next 50 years.

    --
    "I am the king of the Romans, and am superior to rules of grammar!"
    -Sigismund, Holy Roman Emperor (1368-1437)
    1. Re:English needs to be mutable. by Mr. Bad Example · · Score: 4, Informative

      A couple of nitpicks here:

      > Remember... English is the bastard child of Celtic, Latin, and various other Germanic languages.

      English isn't really related to the Celtic languages. There are a few Celtic loan words, but that's about it. Also, Celtic languages and Latin aren't Germanic. You can see the relationships here.

  9. Two englishes are coming by hawk · · Score: 3, Interesting

    American and British English remain, for the most part, mutually intelligible. They have largely drifted together.

    However, that has happened with a large english speaking population.

    I'm expecting it to split over time into an international english, which will be largely today's american english, and whatever the english speaking countries drift into speaking. I suppose that they *could* be enough of an anchor to slow the mutation of the language, but I doubt it. I'm even more skeptical of the idea that the now established international english would follow the changes of the native speakers--there's no reason for a french-speaker and a korean speaker, both of whom speak english as an international language, to change their english due to americans or brits.

    hawk

  10. best solution: by circletimessquare · · Score: 5, Funny

    1. break text source into a handful of slashdot comments, and submit each comment

    2. wait for the inevitable uppity howling condescending grammar nazi to response to whatever grammatical errors exist, however slight or unimportant

    3. reassemble text source and apply grammar nazis' edits

    voila! grammar checking via redundant network of distributed grammar nazis (tm)

    --
    intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
    1. Re:best solution: by the phantom · · Score: 5, Funny

      there should be a comma between 'uppity' and 'howling'
      there should be a comma between 'howling' and 'condescending'
      'response' should be 'respond'
      'voila' should be capitalized
      should read: 'via [a|the] redundant' OR 'via redundant networks'
      there should be a period after '(tm)'

  11. Re:What would it take? by the phantom · · Score: 5, Funny

    A linguistics professor is giving a lecture. He explains that in English, prescriptive grammar dictates that a double negative creates a positive, for instance "I ain't got no money" would parse as "I have money." He then goes on to explain that in many languages, a double negative creates a more emphatic negative, for instance, in Russian "U menya nyet nichyevo" (literally, "By me is not had nothing") uses two negative phrases to create a stronger negative. Furthermore, the prof explains, in most languages, using two positives will create a more emphatic positive, or at the very least, will not change the meaning of a phrase, for instance "Yes, I have bananas" is fundamentally the same as "I have bananas." However, the professor concludes, in no language does a double positive create a negative.

    A student, in the back of the class, muttering under his breath, was heard to utter "Yeah, right."

  12. adjective-noun order in French by Tumbleweed · · Score: 5, Interesting

    French, for example, adjectives come after the noun they modify.

    Actually, that's only true for some adjectives. There is a rule to remember which ones go before the noun: 'BANGS'

    B - beauty
    A - age
    N - numerical order
    G - goodness (or badness)
    S - size

    Everything else goes after the noun.
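    The BANGS mnemonic is mechanical enough to sketch as a lookup. The category table below is a small hand-picked sample, not a real dictionary, and `place_adjective` is a hypothetical helper; gender and number agreement are ignored entirely.

```python
# Hypothetical sample of French adjectives tagged with their BANGS category.
# Any adjective not listed is assumed to follow the noun.
BANGS = {
    "joli": "beauty",
    "jeune": "age",
    "premier": "numerical order",
    "bon": "goodness",
    "mauvais": "goodness",   # "or badness"
    "petit": "size",
    "grand": "size",
}


def place_adjective(noun, adjective):
    """Put a BANGS adjective before the noun and anything else after it."""
    if adjective in BANGS:
        return f"{adjective} {noun}"
    return f"{noun} {adjective}"
```

    So `place_adjective("chien", "petit")` gives "petit chien", while `place_adjective("chien", "rouge")` gives "chien rouge". A flat lookup like this cannot handle an adjective that is allowed in both positions with different meanings.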

    This has been your online French grammar lesson for the day. :)

    1. Re:adjective-noun order in French by P0ldy · · Score: 4, Interesting

      And yet neither this rule nor the earlier "adjectives always follow" claim accounts for adjectives whose meaning changes depending on their placement.

      Whereas "ma chambre propre" means "my clean room", "ma propre chambre" means "my own room".

  13. Fruit flies like a banana by BlueStraggler · · Score: 5, Interesting
    Is fruit an adjective or a noun? Is flies a noun or a verb? Is like a verb or an adjective?

    This requires some serious AI (or just plain I) to sort out. And that only gets you past the subject line. Now re-read each of the sentences in my opening paragraph, but literally this time. Each of them would choke a grammar checker, yet for most readers they will parse perfectly well within the context.
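    The ambiguity in the subject line can be made concrete by enumerating every part-of-speech assignment. This is a toy under stated assumptions: the lexicon and the two accepted sentence patterns are invented for illustration, and "like" is tagged as verb or preposition here rather than the verb-or-adjective split asked about above.

```python
from itertools import product

# Hypothetical lexicon listing the possible tags for each word.
AMBIGUOUS_LEXICON = {
    "fruit": ["NOUN", "ADJ"],
    "flies": ["NOUN", "VERB"],
    "like": ["VERB", "PREP"],
    "a": ["DET"],
    "banana": ["NOUN"],
}

# Two toy sentence patterns standing in for a real grammar.
PATTERNS = {
    ("ADJ", "NOUN", "VERB", "DET", "NOUN"),   # the insects are fond of a banana
    ("NOUN", "VERB", "PREP", "DET", "NOUN"),  # fruit travels the way a banana does
}


def readings(sentence):
    """Return (all tag sequences, the sequences that match a known pattern)."""
    words = sentence.lower().split()
    options = [AMBIGUOUS_LEXICON[word] for word in words]
    candidates = list(product(*options))
    grammatical = [tags for tags in candidates if tags in PATTERNS]
    return candidates, grammatical
```

    Calling `readings("Fruit flies like a banana")` yields eight candidate taggings, two of which fit a pattern. Syntax alone cannot choose between those two, which is where context (the "AI, or just plain I") comes in.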

    Easier just to pay attention in Grade 7 English class, as someone already pointed out.

  14. Re:adjective-noun order in French (BANGS) by geminidomino · · Score: 3, Funny

    A man's shirt is a feminine object, and a woman's blouse is a masculine object? Why?!

    Hey, anything that wants to be pressed against boobies all day can be assumed to be masculine. :)