A Useful Grammar Checker?
burtdub asks: "With the amount of raw text data available, there seems to be no shortage of ambitious language projects on the horizon, from Universal Language Translators to Junk Email Filtering. However, the mess that is the English language still seems to elude commercial attempts while being relatively ignored by the open source community. What would it take to make a useful, functional grammar checker?"
All you need is my 7th grade English teacher staring over your shoulder all day.
That'll get you twisted into shape real good.
The best way to write a useful grammar checker is to write it for a language with a rational syntax.
What would it take to make a useful, functional grammar checker?
How about a competently taught high school English class?
Seriously, people...learn to use the language...you'll be better off.
____
~ |rip/\/\aster /\/\onkey
Grammar can often only be determined by context, especially in English, where the rules of grammar change so much. Until a computer can for itself understand context, no grammar checker can be successful (or even marginally useful). Thus, my answer to your question is two words: "Artificial Intelligence." Artificial stupidity can also be used to simulate bad English.
My Systems
Ahhh the irony of asking Slashdot how to build a grammar checker!
People are always making these grammar checkers that work "from the inside out": look at the words, surround them with expectations of what words can agree with them grammatically, and flag contradictions. But humans are interactive with language, like everything else we do. Proper speakers and writers of English are good listeners (and readers). When we hear what we've said, we imagine what that would mean to us if it had been said to us. When the words make us think of something different from what we thought before we said them, we correct ourselves. A better grammar checker might work "from the outside in": compose imagery or relationships between recorded objects as represented in the written words, and show implications to the writer, to match against their expectations.
That might be a mightily complex undertaking, akin to a machine "understanding" the words. But it would replicate the feedback we humans already use to keep our grammar correct, and to understand each other. If we aimed that high, we could probably find a less ambitious assistance that's easier to automate, but goes a long way towards helping us express our words to computers, and to each other using computers.
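For illustration only, here is a minimal sketch of the "inside out" approach described above: each word carries expectations about its neighbour, and adjacent pairs that contradict them get flagged. The lexicon, the feature names, and the `check` function are all hypothetical, made up for this example; a real checker would need a far larger dictionary and some way to resolve ambiguous words.

```python
# Toy "inside-out" checker. Everything here (LEXICON, AGREEMENT_PAIRS, check)
# is a hypothetical illustration, not any real checker's design.

# Each known word has a part of speech and a grammatical number.
LEXICON = {
    "a":     {"pos": "det",  "number": "sg"},
    "this":  {"pos": "det",  "number": "sg"},
    "these": {"pos": "det",  "number": "pl"},
    "dog":   {"pos": "noun", "number": "sg"},
    "dogs":  {"pos": "noun", "number": "pl"},
    "barks": {"pos": "verb", "number": "sg"},
    "bark":  {"pos": "verb", "number": "pl"},
}

# Adjacent (left, right) parts of speech whose number must agree.
AGREEMENT_PAIRS = {("det", "noun"), ("noun", "verb")}


def check(sentence):
    """Return the adjacent word pairs whose number features contradict."""
    words = sentence.lower().rstrip(".!?").split()
    problems = []
    for left, right in zip(words, words[1:]):
        l, r = LEXICON.get(left), LEXICON.get(right)
        if l is None or r is None:
            continue  # unknown word: no expectations to violate
        if (l["pos"], r["pos"]) in AGREEMENT_PAIRS and l["number"] != r["number"]:
            problems.append((left, right))
    return problems


print(check("These dog barks."))
print(check("These dogs bark."))
```

Note that the sketch has to pretend "bark" is only ever a verb. The moment a word can be a noun or a verb depending on context, this kind of local check starts guessing, which is the weakness the "outside in" idea is aimed at.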
--
make install -not war
Speaking of The Elements of Style, the full text of the book can be found here. It's online now. Use it.
One of the concepts that most people should realize is that the main success (and downfall) of the English language is that it can mutate quite easily.
Remember... English is the bastard child of Celtic, Latin, and various Germanic languages. Language also affects the way we think and is the key limiting factor in grasping concepts.
If your language cannot express a certain concept then you need a way to bend the rules (which English has a bad habit of doing) so that you can share that idea with others.
To enforce a view or a proper method of speaking will often stagnate a society's ability to assimilate new ideas or methods. George Orwell pointed this out when he came up with the idea for Newspeak, in which society can restrain itself from unwanted aspects by removing society's ability to even discuss them.
We obviously do not speak Elizabethan English or the olde English of the Middle Ages. Should our descendants be forced to speak an archaic language 200 years from now because we demanded that our software set in stone the proper way to express ideas and communicate?
Man, this sounds a bit hippy-esque, but hopefully you understand what I mean.
Still, there should be some ground rules for what proper English is and should be, so we can understand each other without going "Huh?" But it shouldn't be a hard-line stance that is unchangeable for the next 50 years.
"I am the king of the Romans, and am superior to rules of grammar!"
-Sigismund, Holy Roman Emperor (1368-1437)
American and British English remain, for the most part, mutually intelligible. They have largely drifted together.
However, that has happened with a large English-speaking population.
I'm expecting it to split over time into an international English, which will be largely today's American English, and whatever the English-speaking countries drift into speaking. I suppose that they *could* be enough of an anchor to slow the mutation of the language, but I doubt it. I'm even more skeptical of the idea that the now-established international English would follow the changes of the native speakers--there's no reason for a French speaker and a Korean speaker, both of whom speak English as an international language, to change their English due to Americans or Brits.
hawk
1. break text source into a handful of slashdot comments, and submit each comment
2. wait for the inevitable uppity howling condescending grammar nazi to respond to whatever grammatical errors exist, however slight or unimportant
3. reassemble text source and apply grammar nazis' edits
voila! grammar checking via redundant network of distributed grammar nazis (tm)
intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
A linguistics professor is giving a lecture. He explains that in English, prescriptive grammar dictates that a double negative creates a positive; for instance, "I ain't got no money" would parse as "I have money." He then goes on to explain that in many languages, a double negative creates a more emphatic negative; for instance, in Russian "U menya nyet nichyevo" (literally, "By me is not had nothing") uses two negative phrases to create a stronger negative. Furthermore, the prof explains, in most languages, using two positives will create a more emphatic positive, or at the very least, will not change the meaning of a phrase; for instance, "Yes, I have bananas" is fundamentally the same as "I have bananas." However, the professor concludes, in no language does a double positive create a negative.
A student, in the back of the class, muttering under his breath, was heard to utter "Yeah, right."
Rhapsody in Numbers
In French, for example, adjectives come after the noun they modify.
:)
Actually, that's only true for some adjectives. There is a rule to remember which ones go before the noun: 'BANGS'
B - beauty
A - age
N - numerical order
G - goodness (or badness)
S - size
Everything else goes after the noun.
This has been your online French grammar lesson for the day.
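For anyone who wants the BANGS rule above in executable form, here is a rough sketch. The tiny adjective table and the `place` function are hypothetical and only cover a handful of words; the sketch also ignores gender and number agreement and forms like "bel" or "vieil", so treat it as a mnemonic, not a French grammar engine.

```python
# Hypothetical mini-lexicon: a few French adjectives and their BANGS category.
BANGS = {
    "beau": "beauty",
    "joli": "beauty",
    "jeune": "age",
    "vieux": "age",
    "premier": "numerical order",
    "deuxième": "numerical order",
    "bon": "goodness",
    "mauvais": "goodness",
    "grand": "size",
    "petit": "size",
}


def place(adjective, noun):
    """Put a BANGS adjective before the noun, anything else after it."""
    if adjective in BANGS:
        return f"{adjective} {noun}"
    return f"{noun} {adjective}"


print(place("petit", "chien"))
print(place("noir", "chien"))
```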
This requires some serious AI (or just plain I) to sort out. And that only gets you past the subject line. Now re-read each of the sentences in my opening paragraph, but literally this time. Each of them would choke a grammar checker, yet for most readers they will parse perfectly well within the context.
Easier just to pay attention in Grade 7 English class, as someone already pointed out.
A man's shirt is a feminine object, and a woman's blouse is a masculine object? Why?!
:)
Hey, anything that wants to be pressed against boobies all day can be assumed to be masculine.