A Useful Grammar Checker?
burtdub asks: "With the amount of raw text data available, there seems to be no shortage of ambitious language projects on the horizon, from Universal Language Translators to Junk Email Filtering. However, the mess that is the English language still seems to elude commercial attempts while being relatively ignored by the open source community. What would it take to make a useful, functional grammar checker?"
All you need is my 7th grade English teacher staring over your shoulder all day.
That'll get you twisted into shape real good.
The best way to write a useful grammar checker is to write it for a language with a rational syntax.
What would it take to make a useful, functional grammar checker
How about a competently taught high school English class?
Seriously, people...learn to use the language...you'll be better off.
____
~ |rip/\/\aster /\/\onkey
Doing it yourself is the best method I know of (ending sentence with a preposition).
Remember Linguo? Or am I dating myself? (ew)
How 'bout useful, functional grammar? Ain't not no problem otherwise, right?
If brevity is the soul of wit, then how does one explain Twitter?
I think this might help
Grammar can often only be determined by context, especially in English, where the rules of grammar change so much. Until a computer can for itself understand context, no grammar checker can be successful (or even marginally useful). Thus, my answer to your question is two words: "Artificial Intelligence." Artificial stupidity can also be used to simulate bad English.
My Systems
me no needs no stupid grammer checker
People who have language skills, not just 1ee7-speak.
Don't pick up the pho*(@)$*@&@!@ NO CARRIER
How about a useful, functional grammar? One that the English language apparently lacks.
Garbage in, garbage out and all that rot.
How about a dictionary and classes in English, like those given in schools? That should be all that is needed.
Theru my lief xperiences, Ive troble width my grammer, but my speeling is verry god.
If you have any other links to tools please let me know.
1 tbsp of crazy
1 ounce of nuts
4 cups of pure genius
1/2 tsp of wit
5 gallons of caffeine*
*Your product of choice.
Just post the text to slashdot, wait for the flames, and do the opposite of what they suggest.
Have you read my blog lately?
The best way would be to re-implement English as a context-free grammar, preferably LL(k) or LALR.
"Yields falsehood when preceded by its own quotation" yields falsehood when preceded by its own quotation.
The SourceForge link in the summary didn't require all words in the search. Requiring BOTH 'english' and 'grammar' yielded a few interesting projects:
Queequeg, an English grammar checker for non-native English speakers
LanguageTool, an Open Source language checker for the English and German language.
graviax, Grammar rules (XML files containing regular expressions) and grammar checker.
Yes, they are even less developed than commercial alternatives. But they are all interesting starts...
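For anyone wondering what "grammar rules as regular expressions" might look like in practice, here is a minimal sketch in Python. The rule table, the messages, and the check() function are all made up for illustration; none of this is taken from Queequeg, LanguageTool, or graviax. The rules are deliberately crude (the article rule looks at spelling, not sound, so it will wrongly flag "a user").

```python
import re

# Hypothetical rule table: (compiled pattern, message).
# Illustrative only; not taken from any of the projects listed above.
RULES = [
    (re.compile(r"\b(would|could|should) of\b", re.IGNORECASE),
     "did you mean '... have'?"),
    (re.compile(r"\b(\w+) \1\b", re.IGNORECASE),
     "repeated word"),
    (re.compile(r"\ba (?=[aeiou])", re.IGNORECASE),
     "'a' before a vowel letter; maybe 'an'?"),
]


def check(text):
    """Return a sorted list of (offset, matched_text, message) findings."""
    findings = []
    for pattern, message in RULES:
        for match in pattern.finditer(text):
            findings.append((match.start(), match.group(0), message))
    return sorted(findings)


if __name__ == "__main__":
    for offset, matched, message in check(
            "I would of used a application the the other day."):
        print(offset, repr(matched), message)
```

Pattern rules like these only see surface strings, which is probably why this kind of tool ends up closer to a style checker than a grammar checker.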
Here's a great online grammar checker.
What would it take to make a useful, functional grammar checker?
An act of God, essentially. Who's going to write such a program, the English teachers? Have you noticed that programmers are not exactly ideal grammarians?
I love that old joke: what's so hard about the English language?
Plural of Goose is Geese
Plural of Moose is Moose.
Tooth, Teeth
Booth, Booth
What's not to get?
(More at places like this: http://www.edu-cyberpg.com/Literacy/reading.asp )
It would certainly reduce the number of errors like "would of", but people who write that are FUBAR anyway and shouldn't rely on software to get stuff like that right. The ability to write semi-competend is like breathing or eating: better done without artificial help.
P.S.: I've hidden 5 errors in this post. Check your grammar checker by finding them all.
Fleur de Sel
Ahhh the irony of asking Slashdot how to build a grammar checker!
You mean the U of Wash prof, Sandeep Krishnamurthy, http://sandeepworld.blogspot.com/ who criticized MS Word's grammar checker in March http://slashdot.org/article.pl?sid=05/03/28/1923231/ http://seattlepi.nwsource.com/business/217802_grammar28.asp/ hasn't done it already? He made it sound so easy.
People are always making these grammar checkers that work "from the inside out": look at the words, surround them with expectations of what words can agree with them grammatically, and flag contradictions. But humans are interactive with language, like everything else we do. Proper speakers and writers of English are good listeners (and readers). When we hear what we've said, we imagine what that would mean to us if it had been said to us. When the words make us think of something different from what we thought before we said them, we correct ourselves. A better grammar checker might work "from the outside in": compose imagery or relationships between recorded objects as represented in the written words, and show implications to the writer, to match against their expectations.
That might be a mightily complex undertaking, akin to a machine "understanding" the words. But it would replicate the feedback we humans already use to keep our grammar correct, and to understand each other. If we aimed that high, we could probably find a less ambitious assistance that's easier to automate, but goes a long way towards helping us express our words to computers, and to each other using computers.
--
make install -not war
I have absolutely no idea what the appropriate requirements for a grammar checking engine would be.
However, I doubt slashdot would be an appropriate place to seek advice on the subject.
English is a complex and "dirty" language; effective usage can involve breaking the accepted rules.
Where's the Kaboom?
There's supposed to be an Earth-shattering Kaboom.
i just ahte that stupid offcie paper clip, who pretensd to teach me engilsh no need to a open source one
Back when WordPerfect was actually giving MS Word a fight, Grammatik was a great grammar checking program for DOS, Windows, Macintosh and Unix & years ahead of anything which made it into MS Word. It was developed by Reference Software, before WordPerfect acquired them. I assume Corel still has this & uses it in their WordPerfect Office Suite.
Not perfect (our language is eccentric & computers are stupid), but the best I've seen.
so it will take a miracle.
It's a lot easier to parse
I was
you was
We was
simple!
Open Source Drum Kit, LPLC deve board - mjhdesigns.com
Imagine a proofreading clearinghouse in India...
Retired from software... maybe. Sort of.
... making C the national language? We already have grammar checkers for that. They're very good, too. Although, I suppose string literals would still be a problem. Damn!
One of the concepts that most people should realize is that the main success (and downfall) of the English language is that it can mutate quite easily.
Remember... English is the bastard child of Celtic, Latin, and various Germanic languages. Language also affects the way we think and is the key limiting factor in grasping concepts.
If your language cannot express a certain concept then you need a way to bend the rules (which English has a bad habit of doing) so that you can share that idea with others.
To enforce a view or a proper method of speaking will often stagnate a society's ability to assimilate new ideas or methods. George Orwell pointed this out when he came up with the idea of Newspeak, in which society is restrained from unwanted ideas by removing its ability to even discuss them.
We obviously do not speak Elizabethan English or the olde English of the Middle Ages. Should our descendants be forced to speak an archaic language 200 years from now because we demanded that our software set in stone the proper way to express ideas and communicate?
Man, this sounds a bit hippy-esque, but hopefully you understand what I mean.
Still, there should be some ground rules for what proper English is and should be, so we can understand each other without going "Huh?", but it shouldn't be a hard-line stance that is unchangeable for the next 50 years.
"I am the king of the Romans, and am superior to rules of grammar!"
-Sigismund, Holy Roman Emperor (1368-1437)
What would it take? Someone who REALLY REALLY cares about this stuff or someone with a lot of money.
Ever notice how the open source community is full of really cool 90% finished products? People like to spend their spare time doing stuff that's fun, not the mundane crap that occupies the last 10% of software development.
If you need applications like a grammar checker or that other 10% of "cool" software to be built, you will most likely have to pay someone to do it.
Null half the output and NOT the rest.
The FSF has a program called Diction. It's not perfect but it's better than nothing right now.
http://www.gnu.org/software/diction/diction.html
The webpage for Queequeg gives a good overview that the SourceForge project page lacks (and the link on the SourceForge page to their webpage is for a non-English index).
What would it take to make a useful, functional grammar checker?
Paying attention in high school English classes coupled with mandatory testing for proficiency as a graduation requirement.
"I'd rather be a lightning rod than a seismometer." -Ken Kesey
My Grammar's in the kitchen. Should I check on her?
Please sign petition to restore sanity to our banking system!!!
http://financialpetition.org/
I cain't think of any reasons why programmer's times are spent making that kind a software. Maybe they just dont have alot of reasons for needing them.
Comment forecast: Bits of genius surrounded by a sea of mediocrity.
Based on the fact that you posted this to Ask Slashdot, you probably need to start by studying the work that has already been done in the field of computational linguistics. Read the journals. Then figure out how to apply your knowledge.
LOAD "SIG",8,1
If you want to know why it's hard, look at the following sentence:
While cooking dinner, my dog threw up on the carpet.
It's correct, grammar-wise, but nearly everyone who reads it sees what's wrong right away. But just try to write a program that can do that, and you won't get anywhere.
When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl
Natural language is built upon context - who is speaking to whom, where they are located, tone and body language, and the meanings they are trying to convey. Simply mapping correlations between words in one language and another will get you nowhere. Better versions of Babelfish will still be babble.
I am currently living in Japan and learning the language. Here is a simple phrase that I was thinking about the other night...pretty much first-year college level Japanese.
Japanese version:
Itta koto ga aru
English version:
I/he/she/they have been (t)here before
or
Have you been there before?
The latter translation is correct if said with a rising tone of voice.
Note that I have just listed nine completely correct translations of a very simple Japanese sentence, all of which differ based on context - who is speaking to whom, their current location relative to the one in question, and tone of voice. A human reading a passage containing "itta koto ga aru" would know all of this information and correctly translate with ease. We are a long way from having a computer that can do such.
Even ignoring these problems, just to illustrate the difficulties of translation from vastly differing languages, I will explain the Japanese phrase.
"itta" means "went"
"koto" turns a verb into a noun. In this case, "experience" is probably the best single-word translation. It can also mean "idea", "process", or "concept".
"ga" has no translation into English whatsoever. It marks the grammatical subject of the clause - in this case, "itta koto", the experience of having gone.
"aru" implies existence.
Note that the natural Japanese sentence does not contain the subject of the sentence (I/you/he/she/etc) nor the implied location (here or there). The English version would use pronouns in these places. The translator needs to know this when translating, inserting them into the English from nothing in the Japanese, while dropping them in the reverse case. The translator needs to understand not only that Japanese rarely use pronouns, but it also needs to know when they do need to be used. If it turned every English pronoun into the corresponding Japanese, it would be wrong. If it threw them all out, it would be wrong. Instead, it must decide based on context whether they are necessary - and trust me, this is not easy at all.
Language processing is still a million miles off, and analyzing a mountain of text is only a fraction of the solution.
A possibility is to assign every word in a sentence a number of descriptors (tense, part of speech, etc...) and see if they are in a logical order. For example:
I use a grammar checker.
Nominative pronoun, present tense transitive action verb, indefinite article for non-vowel sounds, adjective, noun.
Similarly, "She kicks a red ball" would have the same pattern.
This assumes that an adequate dictionary is compiled, containing all the descriptors and relying on context for a word such as "grammar" (before a noun, "grammar" is an adjective; otherwise, it is a noun).
While this system would be very difficult to design, I believe that the basic approach would work.
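A rough sketch of the descriptor idea in Python, under heavy assumptions: the LEXICON and ALLOWED_PATTERNS tables below are tiny and invented for this example, and a real dictionary would need far more descriptors (tense, number, case) than the single part-of-speech tag used here. Ambiguous words get every possible tag, and a sentence passes if any combination of tags fits an allowed pattern.

```python
from itertools import product

# Toy dictionary: word -> set of possible descriptors (hypothetical, tiny).
LEXICON = {
    "i": {"PRON"},
    "she": {"PRON"},
    "use": {"VERB"},
    "kicks": {"VERB"},
    "a": {"ART"},
    "red": {"ADJ"},
    "grammar": {"ADJ", "NOUN"},  # ambiguous; context decides
    "checker": {"NOUN"},
    "ball": {"NOUN"},
}

# Descriptor sequences treated as "logical order" (also hypothetical).
ALLOWED_PATTERNS = {
    ("PRON", "VERB", "ART", "ADJ", "NOUN"),
    ("PRON", "VERB", "ART", "NOUN"),
}


def tag_options(sentence):
    """List the possible descriptors for each word, in sentence order."""
    words = sentence.lower().rstrip(".!?").split()
    return [sorted(LEXICON.get(word, {"UNKNOWN"})) for word in words]


def matching_patterns(sentence):
    """Return every allowed pattern that some tagging of the sentence fits."""
    return [combo for combo in product(*tag_options(sentence))
            if combo in ALLOWED_PATTERNS]


def looks_grammatical(sentence):
    return bool(matching_patterns(sentence))


if __name__ == "__main__":
    print(looks_grammatical("I use a grammar checker."))
    print(looks_grammatical("Ball a kicks she red"))
```

Trying every tag combination is fine for five words but grows quickly with sentence length and ambiguity, which is one place the "very difficult to design" part shows up.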
Nobody will buy a spam detection system which flags all of the mail from their kids as spam.
...that OpenOffice doesn't have a grammar checker either?
Maybe that'll be the best way for one to happen; to come out of OO that is.
The question seems to imply that implementation of a "good" grammar checker is made more difficult by the nature of English grammar. Does that mean there are effective grammar checkers out there for other languages, like German or Gaelic? I kind of doubt it.
Great men are almost always bad men--Lord Acton's Corollary
Well, with the frequently asked questions (I'd hate to see the infrequently asked ones!) on Dr. Grammar, I'd say that it is a monstrous task to make a good checker. Grammarians don't even agree on what is grammatically correct.
Slashdot can't even detect dupes and now you want grammar?
Go hit your knuckles with a ruler.
A grammar checker need I not.
AT&ROFLMAO
American and British English remain, for the most part, mutually intelligible. They have largely drifted together.
However, that has happened with a large English-speaking population.
I'm expecting it to split over time into an international English, which will be largely today's American English, and whatever the English-speaking countries drift into speaking. I suppose that they *could* be enough of an anchor to slow the mutation of the language, but I doubt it. I'm even more skeptical of the idea that the now-established international English would follow the changes of the native speakers--there's no reason for a French speaker and a Korean speaker, both of whom speak English as an international language, to change their English due to Americans or Brits.
hawk
A million monkeys.
Why would anyone think that technology could provide a check to see if their Grandma is still useful? Of all the insensitive, downright cold things to ask for...
Soko
"Depression is merely anger without enthusiasm." - Anonymous
Technology far beyond the current state of the art, even in academia. Language is hard.
~ a PhD student in computational linguistics
Elint my_stuff.doc or whatever...
No? There should be.
1. Grammar is easier than other languages. 2. Spelling is really messed up. Spelling reform is needed. Ef aj wonted tu spell logikalli, aj wud spell laik thys. 3. As far as the US is concerned, we have spelling that is more logical, but we are still using the imperial system... WE NEED TO CHANGE THAT.
-Palal
There isn't any. :)
the major advances in civilization are processes which all but wreck the societies in which they occur - A.N. White
A good place to start on English is Richard Lederer's "CRAZY ENGLISH". If nothing else, it's entertaining.
Falcon
Should there be a Law?
If the /. community provides any indication, good grammar checkers wouldn't be used even if they existed. Spell checkers work very well and no one seems to pay them any heed.
Chance 'em.
Any language that can express the thoughts of Shakespeare, Auden, Ogden Nash _and_ e.e.cummings is NOT a mess.
Your room is a mess, as well as your logic.
I am not a computational linguist, but here's my take on the situation:
There are all sorts of ambiguities in the English language - you have to be able to understand what part of speech a word is at the moment, and in order to do that you often have to understand enough context to use clues from neighbouring sentences. Oftentimes the grammar checker is going to have no clue what part of speech a word takes, for the same reason that your spellchecker frequently complains about words you spelled correctly.
In a language where "Buffalo buffalo buffalo Buffalo buffalo" is a complete sentence, I think that a trustworthy grammar checker is a few years out. =)
...saying "Just learn the grammar correctly in the first place", here's a question: can you really see no use in a computerised tool to help you learn correct grammatical usage?
It's like someone coming on asking about natural media painting apps being told "Just go to art school and learn how to use REAL paint, you lazy bastard!" - you're missing the point entirely. A grammar checker would be useful even for people with a decent grasp of grammar, as a double-check. Like spell checking, do you get it yet?
Game dev and music blog
English has no fixed grammar. The language is organic. What does this mean? It means that there are no fixed rules that define proper sentence structure; instead, it is a fuzzy set. The set changes with each new speaker.
If there is no grammar for English, then what is it that your teachers taught you in grammar school? Simply this: Class. That's right, what you are learning when you study grammar is social cueing that signals what social class you belong to. This works both ways; the 'educated elite' have a set of rules that distinguish them from the riff raff, but also, there are tribes that speak pidgin in order to distinguish their culture from the dominant one.
I think this is what you're looking for comrade!
Grammar checks YOU!
When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl
Microsoft Word has a great spell checker. Not only does it flag the wrong words, but half the time, it's correctly fixing stupid typos. Word isn't the only spell checker that works well; I'd argue that everything I've used lately, except for Lotus Notes, does a fabulous job.
However, I have yet to see a 'grammar checker' that works well at all. I don't believe we will anytime soon, either. The reason is that English syntax can be somewhat insane, and is full of exceptions. It's easy to say that "teh" should be "the." It's not so easy to point out mistakes in comma usage, for example.
I've found that Word is rarely right about grammar. In its defense, I'm a decent writer, so I don't have flagrant errors for it to be catching. But when Word repeatedly pops up incorrect, sometimes nonsensical, corrections, it becomes incredibly frustrating.
There might one day be a halfway decent grammar checker, but I don't anticipate it. There's no substitute for a human proofreader. (This applies to spelling, two, because you might spell a word correctly, but use the wrong won.)
________________________________________________
suwain_2
1. break text source into a handful of slashdot comments, and submit each comment
2. wait for the inevitable uppity howling condescending grammar nazi to respond to whatever grammatical errors exist, however slight or unimportant
3. reassemble text source and apply grammar nazis' edits
voila! grammar checking via redundant network of distributed grammar nazis (tm)
intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
I think *I* write grammar checker is ok?
If by "proper" you mean the listener or reader can clearly and unambiguously parse your sentence, then "proper" English can change every time it is uttered.
What IS within reach is a parser for particular flavors of proper English. For example, you could write a parser for The Queen's English, or Proper Written American English as defined in the AP style manual.
Neither parser would necessarily work for the other, and both would stumble on non-formal English such as most Slashdot posts.
If you want to create a general-purpose parser, a key element will be a failure-detector, flagging items that need human verification or outright intervention.
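To make the failure-detector idea concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the vocabulary, the single "committee have" rule (a made-up rule for one hypothetical house style; another flavor of English might accept it), the 0.8 coverage threshold, and the triage() function itself. The point is only the three-way outcome: accept, reject, or hand off to a human when the text falls outside what the parser covers.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    NEEDS_HUMAN = "needs human review"


# Hypothetical vocabulary covered by one style-specific checker.
FORMAL_VOCABULARY = {"the", "committee", "has", "have", "approved", "budget"}

# Hypothetical rule for this one flavor: adjacent word pairs it rejects.
REJECTED_PAIRS = {("committee", "have")}


def triage(sentence, min_coverage=0.8):
    """Accept or reject only when enough of the sentence is understood."""
    words = [w.strip(".,!?;:").lower() for w in sentence.split()]
    words = [w for w in words if w]
    if not words:
        return Verdict.NEEDS_HUMAN
    # Failure detector: fraction of words the checker actually knows.
    coverage = sum(w in FORMAL_VOCABULARY for w in words) / len(words)
    if coverage < min_coverage:
        return Verdict.NEEDS_HUMAN
    for pair in zip(words, words[1:]):
        if pair in REJECTED_PAIRS:
            return Verdict.REJECT
    return Verdict.ACCEPT


if __name__ == "__main__":
    for text in ("The committee has approved the budget.",
                 "The committee have approved the budget.",
                 "lol teh committee iz gud"):
        print(text, "->", triage(text).value)
```

The useful property is that informal text gets routed to a person instead of being buried in false corrections.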
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
more important *distinction* between
I should have been more careful in a post about grammar. *G*
Grammar checking is a thousand-fold more complicated than most people realize. English's hoary syntax, which pretty much boils down to "8 million exceptions in search of a rule", doesn't parse easily into computer code.
But I, too, would be interested in seeing this field develop - because it has the side effect of making bot AI better! Now, a voice-activated console that understood commands in plain, sloppy English would be worth striving for. Grammar-checking in a word-processor usually just provokes me: "How *dare* you red-line this sentence; I'm quoting *Shakespeare*, you illiterate rock!"
But we'll have perfect machine-generated grammar before we've reached the level of innovation required to put a spell-checker on the comment box on Slashdot!
How about the /. editors?
**runs to the patent office**
There might develop a movement to make the language fit the technology. If it doesn't, then the message will be ignored.
An example of this is the simple code-like messages sent by young people over cell phones and instant messaging mediums. People will tailor their messages to a language dialect that is appropriate to the medium.
Why would open source people need grammar checkers? All we have to do is post a message to Slashdot and it will be prodded, poked, parsed, and insulted until nothing is left, it's great! Spelling, grammar, translation, jargon checking, and even *^%hole tests are available! We don't even have to be on topic, any message can be submitted....
Here's to losing my Karma Bonus again....
Any chance when it's done that we can integrate it with slashcode to stop all the whining all day about grammatically correct posts?
Curiosity was framed; ignorance killed the cat. -- Author unknown
Yes, you right quite are, it's plenty enough superiorly good. Whom was I that did wanted to used they're opened source shit that to?
I use it all the time, it okay'd this posting.
ôó
In grade school I remember using an application for the Mac OS (7.x at that time) called "Correct Grammar". It seemed to work really well, much better than the grammar checker in Word.
I'd recommend trying to follow up with the company that produced the program. Hell, since it was so long ago they might have released the program to the public domain.
I looked around and it appears that the program was, at least at one point, owned by wordstar.org. There is very little mention of it on their (horrible) site.
Perhaps all we need is a logically structured language; then grammar checking wouldn't be a problem at all! I do say... that idea is double plus good!
Just look at jamaican english
http://niceup.com/patois.txt
some sample phrases:
"No cup no broke, no coffee no dash wey". Even if disaster strikes your home it's always possible
that all may not be lost. (22)
you don't make a fuss there won't be a fight. (29)
"Wha eye no see, heart no leap" means that something terrible could happen but if you don't
see it, you are not frightened. (29)
"mi come here fi drink milk, mi noh come here fi count cow". A reminder
to conduct business in a straightforward manner. (22)
"The higher the monkey climbs the more him expose". A truly comic image if
you've ever been to the zoo, and comforting to any of us whose backs have been
used as a stepping-stone for someone else's success. (22)
"A city upon the hill cannot be hidden." same as above (29)
I personally believe that language will just evolve so that our children's children will be almost incomprehensible to us. As you can see, having Africans speak English for 400 years in Jamaica gave them their own particular flavour of the language.
I'll just use my special getting high powers one more time...
...language will not further codify/lower in context until we either learn to recognize briefer sounds or can afford to remember umpteen different forms and then eventually prefer to use them in casual conversation. Unfortunately, high context is convenient and maybe language today is as advanced as it's going to get, until we achieve a more biologically advanced stage...2 tongues, whatever...mmm 2 tongues
Often wrong but never in doubt.
I am Jack9.
Everyone knows me.
A language with a constant ideal grammar (an unmessy one) would not be very useful for human purposes. Meaning crucially depends on the context of utterance, and grammatical infractions are always possible and sometimes necessary for the sake of bettering communication. Strange but true. There is no way, in principle, that a grammar checker can ever get things quite right. An improvement would be to check sentence grammar in relation to surrounding sentences, rather than in isolation (pretty obvious and I'm sure this is already being done by people working on this stuff).
Tense is also a thorny issue. The different uses of the present perfect, for example, depend on pragmatic, context-based features for disambiguation. And if you're writing a novel there are all sorts of grammar rules (pertaining to normal referring speech) that you can and must break to be intelligible. For example, narrative fiction is always written in the past tense, yet a narrator can use deictics like "here" and "now" (in a sentence using the preterite tense). If I were to try that in a business letter, however, e.g., "I have told you now that...", should the grammar checker catch me? Not if I'm referring to the paragraph above; but yes if "now" is referring to the present sentence (the grammar checker should point out that the present tense should be used).
Maybe there should be an option for the user to tell the grammar checker what genre a particular document is being written in. That is, set down the grammatical rules for particular types of writing. That might help a bit.
But at the end of the day, the idea of a perfect grammar checker is in principle impossible. Precisely because language is messy. And that's a good thing.
Ludwig Wittgenstein
In French, for example, adjectives come after the noun they modify.
:)
Actually, that's only true for some adjectives. There is a rule to remember which ones go before the noun: 'BANGS'
B - beauty
A - age
N - numerical order
G - goodness (or badness)
S - size
Everything else goes after the noun.
This has been your online French grammar lesson for the day.
Get an education, you illiterate clod.
you check the require all words option
http://sourceforge.net/search/?type_of_search=soft&exact=1&forum_id=280640&group_id=82185&atid=0&words=english+grammar&Search=Search
- create a porn site
- submit a sentence to 5 users.
- Ask them to correct it for the password.
- Assume that if 3 answers come back identical, that it is the correct sentence.
- Then allow those 3 through.
- Charge the user for doing their grammar checking.
- PROFIT!
If the word nuclear is changed to nuculear, then you may assume an uneducated individual (or a bought education) and can just deny them. It will drive at least one person crazy. I prefer the "u" in honour as it seems to be missing these days.
... slashdot. The /. community will shred your post to its grammatical undershorts.
I believe that the African-American dialect has developed into a different type of language that is primarily based on English. This unnamed language type is differentiated from standard English by its rate of change.
Africans in the early New World were in a difficult situation. All were slaves and most came from small tribes and villages that spoke their own local language. Most Africans in the New World could not understand the native speech of most other Africans. English had to become a common language.
However, it was also necessary to develop a form of this common language that was not understood by the slave owners. I believe that the Africans used the vocal inflections of the native African languages with English words to communicate among themselves meanings that would not be understood by the slave owners, even when the slave owners used exactly the same words. I believe that the key to this higher level of communication was a constant change in the meaning of the words spoken. As soon as the slave owners figured out what the slaves meant, a whole new set of code words and vocal inflections would have spread through the community.
After 500 years the Africans became the African-American people and their language is still used in the same manner. Ebonics differs from other language types by its unusually rapid and constant rate of change. It's this ability to use and understand the pace of change in the language that African-American parents pass to their children, not the vocabulary itself. The vocabulary and grammar structure (for the most part) are identical to standard English.
The characteristic of having the Whites think that the Blacks 'talk funny' is one more general quality of this language. Adopting a word style that made the slaves appear 'funny' to the Whites was a necessary trait at a time when the people had no defence at all, legal or social, against arbitrary brutality and murder. It is now one of the first characteristics of Ebonics to be dropped as the African-American community develops parity with the world middle-class. The passage and general acceptance of civil-rights laws negates the need for 'Amos & Andy' dialog styles.
One disadvantage of the rapid rate of change of Ebonics is that if a group separates from the community, it loses the ability to communicate with the community after several generations. For example, the 'Gullah' dialect spoken by the African-Americans on the Carolina coast who have lived in near isolation for generations is difficult for others in the community to understand and nearly incomprehensible to the whites. The Ebonics of New Orleans and Texas can be difficult to understand in the communities of the northern cities. This is offset by the constant circulation of music recordings throughout the entire range of the community.
Ebonics is not like the Chinese languages, where entire meanings of common phonemic clusters change according to the frequency tone accorded to the word. The meanings of the words are not solid but are a range of shades that is constantly changing. This makes Ebonics a different language type from word-ordered or declined languages.
(an oft-quoted saying in England, and few Americans would disagree)
There is an "International English" (at least, in Windows) but I don't know anyone who actually uses it in practice.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
This requires some serious AI (or just plain I) to sort out. And that only gets you past the subject line. Now re-read each of the sentences in my opening paragraph, but literally this time. Each of them would choke a grammar checker, yet for most readers they will parse perfectly well within the context.
Easier just to pay attention in Grade 7 English class, as someone already pointed out.
Well, they are completely different. They are thinking and communication skills. Spelling may be slightly mechanical, but knowing proper spelling can also save you from embarrassing conversational gaffes. (Can also help with other languages, depending on your native tongue.) Try puzzling out even mildly complicated calculus without a firm grasp of basic math. You don't have any built-in sanity checking, and you don't develop the room for intuitive leaps. Communication is more so, only without the advantage of being able to be explicitly incorrect in the same way one can be with math.
If your theory is that we're moving into an age where mechanical abilities are automated (and I think you're right), then thinking and communicating are going to be highly valued at the expense of automation, and how one communicates will be even more of a social identifier than it is now.
I forget what 8 was for.
that your brain is supposed to check the grammar as well as the spelling. If trends keep going the way they are, with spell checkers and grammar checking, people are going to end up stupid, with no idea how to spell or form a sentence properly. I already find myself not really trying to spell a word; when that happens I jump on dictionary.com and search the word. This makes me read (a few times) the word and the definition, which in turn shows me how to properly use the word. People don't need spell checkers and grammar checkers. What they need is to learn what they are trying to spell and say. Having something do it for you must shrink your brain.
w00t
I'm doing my MA project on spell checkers for L2 learners, and in the course of readings for it I've run across a few papers on grammar checkers as well.
There are open source grammar checkers out there. Or at least one. Of course this article hits the day I don't have my big notebook of papers with me at work, so I don't have the reference and URL. But someone did their MA thesis work writing a grammar checker (more accurately called a style checker) for KWord. You have to patch and recompile KWord, but it gives you something along the same lines as MS Word's.
Most of the comments about grammar here have been incredibly stupid, by the way. Here's an important thing you learn in an intro to ling class: all languages are equally complicated. It's not going to be easier to write a grammar checker for any language above any other. e.g. You might have to worry more about morphology in one language and word order in another.
There isn't a complete grammar written for any language. There's good reference grammars for a lot of the major ones, and for dead languages you can call it a complete grammar if it makes you feel better. No native speakers can tell you what you got wrong there.
In reality, languages have rational syntax in the sense that people can understand them, but none have the sort of rational syntax that a computer is good at understanding.
True. This means that computers should be more like people. That is to say, they should be intelligent. So, short of having a truly intelligent machine that can learn a language the way children do, a really good grammar checker is still in the future. But who knows, true AI may be just around the corner.
You are right that the grammar must be context-sensitive, but such power can largely be gotten without having the program know what the words mean. In fact, with a modest core grammar, an adaptive grammar parser can work starting with dictionaries consisting of only a handful of words.
Here is some background:
There are four classes of languages in the Chomsky hierarchy, each a subset of the next:
Regular languages - parsed by finite-state automaton
Context-free languages - parsed by push-down automaton
Context-sensitive languages - parsed by linear bounded automaton (finite-tape Turing machine)
Unrestricted languages - parsed by infinite-tape Turing machine
Computer science has mostly stuck to the first two types of language and used ad-hoc hacks to make context-free languages imitate true context-sensitive languages when needed, for example in parsers for compilers. The grammars for languages have traditionally been static collections of rules written in Backus-Naur Form. In the late 80's and early 90's, researchers such as Christiansen, Burshteyn, Shutt and Boullier began working on grammars with rules that could be modified on the fly, known as modifiable, adaptive or dynamic grammars. Natural languages have context-sensitive grammars (at least) and cannot be parsed by lower-level grammars without special-purpose code for a totally impracticable number of commonly encountered cases.
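To make the hierarchy above a little more concrete, here is a rough Python sketch with one toy recognizer per level. The three toy languages (a*b*, a^n b^n, a^n b^n c^n) and the function names are my own illustrations, not anything from the research mentioned here.

```python
import re

def is_regular_ab(s: str) -> bool:
    """Regular language a*b*: a finite-state matcher (here a regex) is enough."""
    return re.fullmatch(r"a*b*", s) is not None

def is_anbn(s: str) -> bool:
    """Context-free language a^n b^n: needs a stack, as in a push-down automaton."""
    stack = []
    i = 0
    # Push one marker for every leading 'a'.
    while i < len(s) and s[i] == "a":
        stack.append("a")
        i += 1
    # Pop one marker for every 'b' that follows.
    while i < len(s) and s[i] == "b" and stack:
        stack.pop()
        i += 1
    # Accept only if all input was consumed and every 'a' was matched.
    return i == len(s) and not stack

def is_anbncn(s: str) -> bool:
    """Context-sensitive language a^n b^n c^n: a single stack cannot do this,
    so this sketch just measures the three runs and compares their lengths."""
    m = re.fullmatch(r"(a*)(b*)(c*)", s)
    return m is not None and len(m.group(1)) == len(m.group(2)) == len(m.group(3))
```

The point of the third function is that it has to remember a count and reuse it twice, which is exactly the kind of bookkeeping that compiler parsers bolt on with ad-hoc code.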
Quinn Tyler Jackson, in work from 1993-1998, created a theoretical framework called Meta-S calculus and extended the Backus-Naur notation for adaptive, truly context-sensitive grammars. His first parsing library release applied this to "an example natural language grammar that, with only a handful of preset words, which included only one noun ("man"), parsed the Gospel According to Mark (King James Version), acquiring nouns during the parse by context." In late 1998 he released a parser generator, PAISLEI, implementing these ideas and in mid-1999 he began publishing papers on the topic.
In 2002, with feedback from Bjarne Stroustrup (the inventor of C++) and Boris Burshteyn (one of the seminal theorists in modifiable grammars) he demonstrated that a clean, elegant adaptive grammar system could parse C++ at a speed similar to that of conventional parsers that use ad-hoc code to handle special cases of context-sensitivity, with parse time growing linearly on longer input files.
When doing any parse, whether of natural or computer language, PAISLEI can provide a graphical derivation tree, which in the case of natural languages amounts to a diagram of the sentence. This remarkable software even naturally handles sentences such as "Time flies like an arrow, fruit flies like a banana", correctly identifying the first instance of "flies" as a verb and the second as a noun.
With a good dictionary and training on a corpus of known-good text, Dr. Jackson's program should be able to do even more astounding things. If I were putting together a grammar checker, OCR or voice-recognition product team, Quinn would be at the top of my must-recruit list.
"Is life so dear, or peace so sweet, as to be purchased at the price of chains and slavery?" - Patrick Henry
let me be the first to say... "wAt R u Tahkin abowt"
English is extremely, enormously complex. Take, for example, these two sentences:
On the lowest level of structure, they seem highly similar, but the human mind sees the structure behind them and understands the difference. It knows that "flies" is a noun in one and a verb in the other.
This kind of knowledge becomes important for grammar checking. Suppose you added the word "These" at the beginning of both sentences. Is that grammatically legal? As a human, you can rule that out for the first sentence because you know that "time" is not the type of noun that can be modified by a word like "these". So, a grammar checker has to know that. That's a royal pain, but a grammar checker could probably be coded to handle that. (That is, if you first figure out what attribute you really mean when you say "a noun like 'time'" -- what property is it of that noun that you are focusing on? What's that called? How do you code it?)
Suppose, instead, that you add the word "always" before the word "like". In the second sentence, that is legal. In the first sentence, whether it's legal depends on whether there is a type of "fly" called a "time fly" that is capable of being fond of an arrow. If there is no such thing as a "time fly", then that makes "flies" a verb, and putting "always" after the verb isn't legal, because that's not where adverbs go in English. But if there is such a thing as a "time fly", then the meaning of the sentence is ambiguous and it could be grammatically legal or not depending on the context.
The point is, a truly good grammar checker requires you to understand the text being written. That goal is not achievable until we have achieved Strong AI (true, generalized machine intelligence). Even approaching that level of quality still requires a huge amount of effort. That's one reason that I don't see it happening as open source. There's not much use for a grammar checker that catches 1/10th of your errors, so it's sort of an all or nothing game, and all requires lots of resources.
English does have what can be defined as a classical structure. It's a shame to see it often not taught, or left completely off the English syllabus, in most forms of what we call education.
If we were to start a grammar tester for the FOSS community, the best starting point is to do simple tests. The first thing that comes to mind is a test for two words that are exactly the same in the same sentence.
It's not hard but it's a start, and yes, we won't find an all-in-one auto_magic algorithm to correct our grammar, so let's start with the simple stuff and see where it leads us.
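A minimal sketch of that first simple test, in Python. I'm assuming "two words that are exactly the same" means the same word twice in a row ("the the"); the function name and the tokenizing regex are just illustrative choices.

```python
import re

def find_repeated_words(sentence: str) -> list[str]:
    """Return every word that appears twice in a row, ignoring case."""
    # Keep letters and apostrophes so "don't" stays one token.
    words = re.findall(r"[A-Za-z']+", sentence.lower())
    # Compare each word with the one right after it.
    return [a for a, b in zip(words, words[1:]) if a == b]
```

Something like find_repeated_words("I went to the the store.") flags "the". It will also flag legitimate doubles such as "I know that that is true", which is a small taste of why even the baby steps need a human to confirm them.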
Often in this age of high technology, the best tool is a rock.
I think we're all aiming too high right now. How about we take baby steps first, for the simple things, and work our way up?
Is this due to that new bankruptcy law? My two Grammars are dead. They do not need checking or any other type of account. Bush and his Business Buddies at it again.
That would have made French class so much easier!
My first French teacher was a Swiss woman with no sense of humour. She taught us the rule as 'BAGS' instead of 'BANGS,' until I noticed the French movie poster for The Fifth Element had Fifth before Element, not after, so I added the 'N' to the rule. She wasn't impressed, which rather irked me.
The _really_ unfortunate part of the French language is not the adjective-noun order, which at least has the BANGS method, but the 'gender' of nouns, for which there is no way to know short of memorization, as far as I know. A man's shirt is a feminine object, and a woman's blouse is a masculine object? Why?! Totally bizarre, though still the most beautiful sounding and looking language to me of any I know of.
There is one. The Link Grammar tool is a research project that performs English grammar checking. Recently, the AbiWord folks built a grammar-checking plugin that uses Link Grammar and highlights phrases with questionable grammar with green squiggly lines. It isn't perfect, but it definitely works, and works now. The 2.3 development series of AbiWord is currently the only one with this plugin; however, the 2.4 release is weeks away, and 2.3.x is quite stable. For those of you who don't know, AbiWord is a free/open source (GPL) word processor that is full-featured but fast. It runs on Windows, Linux (GTK+ and GNOME versions), and Mac OS X, as well as a new port to the Nokia 770 which is under development by INDt, a Nokia research lab. You can get it here: http://abisource.com/
Full Disclosure - I help out with AbiWord, as the Windows packager for 2.3 and 2.4, as well as some other random things. I started helping because it works great for me, though.
I recognize people by their sigs. Is that a bad thing?
I would gladly accept people not comminucating using the neuances of proper grammer iPh +h3y \/\/0u1d ju$+ $+0P +@1/iNg 1i/3 +hi$.
I took French for 3 years to meet the requirements, and this is the first time I've heard of this rule. It really would have helped. For some reason I could never grasp the language; no matter how hard I tried it completely eluded me!
My first year French teacher was a woman from Canada with a serious attitude. She did teach us how to insult people, which was fun, but beyond that I had a miserable time with it. I think it was even more frustrating because I taught myself a decent amount of German playing around with mods for the RA BBS software.
rm -rf
But I don't think anything based on simple word correlations can do a good job, Google or otherwise.
Right now, I understand about 50% of a typical native Japanese conversation. However, this number is strongly dependent on context. If I am present for the beginning of a conversation, and know who/what/where/when is being talked about, I usually can follow the conversation. However, if I enter in the middle of the conversation, I usually cannot enter it, precisely because I am missing all of this context. That, combined with the words or expressions I do not yet understand, leaves me in a muddle.
It is my opinion that a key to understanding and translation is precisely this context - who is speaking, who the audience is (both within and outside of the text), where these people are, what relationships they have, etc. Therefore, any system which wants to translate must have a method for codifying such things, or its translations will always be garbage.
But there are so many problems that I seriously doubt anybody will ever solve. Not only does my experience (trying to perfect a model of English grammar) lead me to believe that the task is inherently impossible, but...
Syntax is not consistent between dialects of English. Who here doesn't have a problem with this sentence, "Anymore I like to go fishing." Probably most of you. But not all! For some speakers, "anymore" no longer requires a negative sense verb. And this isn't a strange foreign dialect... find somebody from rural PA and they will probably have heard it. You can ask them what it means! They can be amazed that you don't understand it!
Smaller differences abound! In actuality, no two speakers of English will have the same syntax. What's grammatical for one will be ungrammatical for another. I can say "There's pigs in the garden", but you may balk and say "There're pigs in the garden". For me, either works both in natural speech and academic writing. Some of you will agree. Some of you will think the former is OK in casual speech only. Yet others will entirely reject the first utterance.
So are we to expect a grammar checker with little checkboxes for each well known variation in dialectal syntax? Sure, presuming we could pin down the major stuff first, that would work. For about one generation. Then a whole new breed of English speakers comes about with their own slightly edited syntax.
It's not a matter of being taught wrong or right in school, you see. Academia has forced us to believe that we speak English wrongly sometimes. But if it's the natural language, as people actually use it, is it so wrong? Isn't it natural for language to change? (Yes). Isn't the grammar of Shakespeare and Chaucer dramatically different from our own? Along the way, hordes of people spoke "wrongly" and now we have a different standard. Consider split infinitives ("to boldly go")... everybody uses them in casual speech. I know hardly a soul who will correct me unless they're being pedantic. Yet our standards of "Academic discourse" prevent me from using it in a paper. Sometimes. In actuality, few college professors I know actually give a damn what some stick-in-the-ass bozo has to say about split infinitives. And trust me on this: changes that we find repulsive (perhaps "Where's he at?" will be one!) will eventually become the "right" way to speak.
When a child learns the language, he doesn't learn the same syntax as I have. He hears what others utter and from this evidence constructs a grammar of his own which will very probably be different.
Anyway... as a linguist, I'm still very impressed with the grammar checker in Microsoft Word. For the true complexity of the problem, it's a solid algorithm. It can always use tweaking - don't get me wrong.
Still, I don't trust it further than I can spit.
Who actually uses grammar checkers? I don't. Yes, I'm a professional writer and have a pretty good grasp on grammar, but it's not because I think constantly about the rules of commas or something. I just write like I'm talking.
I'm not trying to sound arrogant, but how is it that you can speak English your whole life and still not know how to write it?
The only time I ever have the grammar checks in word processing programs highlight anything, it's when I'm doing something that the computer is too dumb to understand or when I'm using a newer form of a word - for example, "to e-mail" is now a normal verb, but it used to just be a noun.
Overall, though, I'd say that if you need a grammar check to reform your sentences, you also need a logic check to reform your thoughts. Am I wrong?
P.S. As I previewed this, it occurred to me that people who don't speak English natively could really use a grammar checker, and I'd sure appreciate one if I had to write Spanish.
1337 930913 |)0|\|'7 |\|33|) 59311(|-|3(|3|25
I've never seen such literate comments in a /. article!
I'm not sure if it's a function of every poster triple checking their comments for grammatical errors so they don't lose credibility, or merely the fact this article has attracted every grammar nazi out there.
But whatever the cause, this is a truly bizarre sight to behold.
I stole this Sig
try reading this I cdnuolt blveiee taht I cluod aulaclty uesdnatnrd waht I was rdgnieg. The phaonmneal pweor of the hmuan mnid. Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are,the olny iprmoatnt tihng is taht the frist and lsat ltteer be in the rghit pclae. The rset can be a taotl mses and you can sitll raed it wouthit a porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe. Amzanig huh? Yaeh and I awlyas thought slpeling was ipmorantt !!!
Among other examples ....
.... neither the German pilot nor the young, inexperienced Cypriot co-pilot could speak the same language fluently, and each had difficulty understanding how the other spoke English, the worldwide language of air traffic control. ... ... ... ....
.... the crew at over 14,000 feet would already be experiencing some disorientation because of a lack of oxygen.
Crew confusion found in Athens plane crash
By Don Phillips International Herald Tribune
WEDNESDAY, SEPTEMBER 7, 2005
PARIS The crew members of a Cypriot airliner that crashed Aug. 14 near Athens became confused by a series of alarms as the plane climbed, failing to recognize that the cabin was not pressurizing until they grew mentally disoriented because of lack of oxygen and passed out....
The plane had a sophisticated new flight data recorder that provided a wealth of information.
At 10,000 feet, or 3,000 meters, as designed, an alarm went off to warn the crew that the plane would not pressurize.
At 14,000 feet, oxygen masks deployed as designed and a master caution light illuminated in the cockpit. Another alarm sounded at about the same time on an unrelated matter, warning that there was insufficient cooling air in the compartment housing avionics equipment.
The radio tapes showed that this created tremendous confusion
During this time, the German captain and the Cypriot co-pilot discovered they had no common language and that their English, while good enough for normal air traffic control purposes, was not good enough for complicated technical conversation in fixing the problem....
LanguageTool is the program. He has apparently rewritten it in Java since I last looked at it. His thesis is in the docs section of the webpage. It looks to be more of a standalone app now with an API that other programs can call.
AskOxford.com has a question about whether to use the article 'a' or 'an' with an abbreviation that is pronounced with a leading vowel only when abbreviated, but not when spelled out (i.e., MP is pronounced "empee" or "military police").
Google tells us:
511k sites use "a MP"
but
"an MP" wins with 1.75 million hits.
The experts at AskOxford.com concur. Why are we paying them again?
As a bonus, Google doesn't just return binary results. If the results are pretty close, you realize either usage is acceptable.
One might criticize using a poll of the marginally literate denizens of the net to determine best practices. But language is usage, and Google hasn't failed me yet.
Now, if you were automating grammar checking for a document, it'd be a little harder. But we just need some logic for phrase grabbing and word substitution... piece of cake, right?
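For the single-phrase case, the comparison logic is about this much Python. The hit counts are the ones quoted above, typed in by hand; nothing here queries Google, and the function name and the 1.5x "close enough" threshold are arbitrary assumptions of mine.

```python
def pick_by_usage(hits: dict[str, int], close_ratio: float = 1.5) -> list[str]:
    """Return the phrasing(s) that usage counts support.

    A phrasing is kept if its count, multiplied by close_ratio, reaches
    the highest count. So a near-tie returns several acceptable answers.
    """
    if not hits:
        return []
    best = max(hits.values())
    return sorted(p for p, n in hits.items() if n * close_ratio >= best)

# Counts as reported in the post above (hand-entered, not fetched).
mp_hits = {"a MP": 511_000, "an MP": 1_750_000}
winners = pick_by_usage(mp_hits)
```

With those numbers only "an MP" survives. The hard part for a whole document is still the phrase grabbing: deciding which word sequences are worth looking up in the first place.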
If I had to venture a guess as to why there are no serious open-source grammar checkers, I'd say it's because most of the open-source coders mastered basic grammar in elementary school before they mastered programming in high school. Why would they spend their own time writing a program that they don't need? If you feel the need for a better grammar checker, perhaps you should work on your grammar.
Most grammar is common sense. Start with that.
Aero
Please stop hurting America -- Jon Stewart
Like Esperanto, for example. Its syntax is completely regular. Parsing is straightforward, and independent of semantics. Just as important, the language is also effective for ordinary human communication.
It's known that languages such as English are difficult, and in some cases impossible, to parse without understanding of the world which the language is intended to represent. So, for example:
Parity: What to do when the weekend comes.
Given random feeds from t'intarweb, the best possible tool would be incapable of generating any rules (or, in reality, would generate rules such as "it's" means "its", or "their" and "there" are identical).
In reality, we'd need some academics to agree on grammar in the first place, to be able to verify any such software.
Here's a novel idea... why not educate children about grammar (and spelling, come to think of it), and expect people to be able to do this stuff for themselves?
What else do you want? A machine to wipe your bum for you because you're too lazy to work out how to do it yourself??!
Author, Shell Scripting : Expert Re
Is it possible for this to be used by the slashdot editors - it could write their summaries!
...a cliche checker.
Jeez, how you younguns forget! In my day, we had style and diction, and we liked it. None of that fancy-schmancy parsing irregular grammar, just pattern match a few of the worst cases, throw out a few statistics, and wow!
Of course, that was when the line printer was state of the art, and you had to cut your printout into sheets to turn your English assignment in, and two or three nroff submissions could bring the PDP 11-44 to its knees...
Envy my 5 digit Slashdot User ID!
Do you know what that word means? Humpty Dumpty does. Lewis Carroll wrote "Alice in Wonderland"(1865) and "Through the Looking Glass"(1871) as books for young children. In America today, it is considered material for grades 9-12. Critique his post if you like, it does not invalidate his assertion. In this case, it actually helps to prove his point. Now that you've googled it, quiz your peers at work tomorrow. See what percentage of the college 'educated' adults answer correctly. Feel free to point out grammatical errors in my post as well. I also received a very poor education in literature in America's public school system.
A man's shirt is a feminine object, and a woman's blouse is a masculine object? Why?!
:)
Hey, anything that wants to be pressed against boobies all day can be assumed to be masculine.
What makes English such a pain in the backside is that the language has been so utterly simplified over the millennia that we have lots of words with identical spellings, but different parts of speech. This makes the word order critical.
Firstly, don't say it's been "simplified". Say rather that it has gained complexity in some areas and lost complexity in others.
Your point will help me illustrate:
<expound>
English used to have a larger set of grammatical suffixes (known as inflectional morphology), kind of like Latin. You put a particular suffix on a noun to mark it as the direct object; you put a particular suffix on a verb to mark its tense, number, or whatever. English has largely lost these endings, mostly due to some heavy phonological reduction of lots of its vowels during the late Old English and early Middle English periods, starting around 1000 CE and ending around 1200 CE. Basically, vowels in unstressed syllables turned to schwa (which is the first vowel in the word under, as pronounced by a typical American newscaster). Because of this, inflectional suffixes became ambiguous; because they were ambiguous, people stopped using them.
So English lost all that inflectional morphology. So what? Well, before this happened, English word-order was relatively free. Afterward, people could no longer disambiguate syntactic categories by the endings. So word-order took up that role, and English word-order became more fixed.
For more details, see [1].
</expound>
So just like a big game of whack-a-mole, a loss of complexity in one area led, in a rather straightforward manner, to an increase in complexity in another.
If we don't, in a matter of just a few years, we'll get to the point where nobody can understand anything.
This is patently untrue, but I forgive you. From an earlier post of mine:
<windbag>
This is a very common sentiment among educated people, cross-linguistically and cross-culturally. In basically every culture around the world, there is a group of people, usually middle-aged, that believes that people spoke their language "correctly" about a generation or two ago. They lament the imminent doom of their language. They blame the young, the uneducated, and the poor.
The fact is that languages change constantly, and lots of these changes can be pretty well understood as natural processes. For instance, if you're from the US, you probably pronounce the word butter with a d-like sound in normal speech (linguists call the sound a "voiced alveolar tap"). So it sounds just like "budder". When people started using that pronunciation, their elders probably thought them "lazy" as well. I can almost hear them saying, "Pronounce your t's properly!"
But think about it. In order to pronounce the word with a proper tt in the middle, you'd have to turn your voice on to say the b and the u, then turn it off to say tt, and then turn it back on to say er. It's much easier to just leave your voice on! And that's what people started doing. If you say the word with a "hard" t sound in America today, people will probably consider it strange.
</windbag>
People do not "mispronounce" and misspell words because they are stupid, lazy, poor, or young. (I realize the parent was not asserting that such is the case; however, the sentiment is common enough to warrant mentioning here.) The true reasons for these phenomena are remarkably subtle. Linguists have made great strides in understanding them, but there is still a very long way to go.
In any case, people have been misspelling words for a good healthy number of centuries now. Yet here we are, writing in English back and forth to each other. I'm not too worried.
References:
I don't agree with your premise. Spelling, while important, is not vitally important in exchanging information. The same is true, to a lesser extent, of grammar.
Your examples are good ones at showing how wrong your premise is. If someone writes, "I can't believe their taking it's toy away" you know what they mean. They mean, "I can't believe they're taking its toy away". You might have to work a little harder to understand the first sentence, but you can understand it.
This is because English, and most, perhaps all, other human languages, has an abundance of redundant information, so you can usually tell from context the meaning of a word even if it's misspelled, and you can usually tell the meaning of a sentence even if it is grammatically incorrect.
The redundancy of human languages actually makes it harder to write a good grammar tool. Since you can say the same thing so many different ways (more than Perl even!), it's really hard to determine whether a string of words is grammatical or not.
French adjectives don't always come after the noun, just usually. It's not a rule. Instead, your belief that it is a rule shows again why writing a grammar tool is so hard. Most things we think of as grammar rules are not.
Language is a living thing. Grammar changes much faster than our rules for what correct grammar is do. Get used to it because television, the internet, and SMS, are just making grammar change faster.
I write for a living. I love a well crafted sentence, but I don't care if anyone uses correct grammar because I don't believe there is only one correct grammar. The important thing is making your meaning clear. Appropriate grammar for your audience can help with that, but it only needs to be good enough to communicate the message. My post has a lot of incorrect grammar in it, but if you don't understand it, it's not due to the grammar, it's because I didn't write it well.
BTW, interesting use of the word nazi. You've turned it into a verb.
Oh, and being a grammar or spelling nazi isn't something to be proud of, it just makes you sound like a fuddy-duddy. Next you'll be yelling at the neighbor kids to stay off your lawn.
As I see it, a good grammar checker needs:
1. A comprehensive lexicon, that lists all English words, and ALL their possible roles as "parts of speech", that is, verb, noun, etc.
2. A hunk of statistics for each word, sampled from error-free text, showing the probabilities of the parts of speech when surrounded by certain words.
3. Some good well rounded algorithms to resolve ambiguities. "Time flies like an arrow", is time an adj. or a noun? Is "flies" a noun (plural) or verb?
4. Good rule checks to see if the sentence fits proper patterns. Best to encode all the exceptions.
And all the painful details about extracting the root words from tense, etc. may come to play.
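Points 1 to 3 can be sketched in a few lines of Python. Everything below is a toy: the five-word lexicon, the tag names, and especially the transition probabilities are numbers I made up for illustration, not statistics sampled from any real text.

```python
from itertools import product

# Point 1: a (tiny, hypothetical) lexicon listing every possible role of each word.
LEXICON = {
    "time": ("NOUN", "VERB", "ADJ"),
    "flies": ("NOUN", "VERB"),
    "like": ("VERB", "PREP"),
    "an": ("DET",),
    "arrow": ("NOUN",),
}

# Point 2: invented probabilities of one part of speech following another.
TRANSITIONS = {
    ("START", "NOUN"): 0.5, ("START", "ADJ"): 0.2, ("START", "VERB"): 0.1,
    ("NOUN", "VERB"): 0.5, ("NOUN", "NOUN"): 0.2, ("NOUN", "PREP"): 0.2,
    ("ADJ", "NOUN"): 0.6,
    ("VERB", "PREP"): 0.3, ("VERB", "DET"): 0.4, ("VERB", "NOUN"): 0.2,
    ("PREP", "DET"): 0.6,
    ("DET", "NOUN"): 0.8,
}
UNSEEN = 0.01  # small probability for any pair not listed above

def score(tags: tuple[str, ...]) -> float:
    """Multiply the transition probabilities along one candidate tag sequence."""
    total = 1.0
    previous = "START"
    for tag in tags:
        total *= TRANSITIONS.get((previous, tag), UNSEEN)
        previous = tag
    return total

def best_tags(words: list[str]) -> tuple[str, ...]:
    """Point 3: try every combination of roles and keep the most probable one."""
    candidates = product(*(LEXICON[w.lower()] for w in words))
    return max(candidates, key=score)
```

With these made-up numbers, best_tags("Time flies like an arrow".split()) picks noun, verb, preposition, determiner, noun. Brute-force enumeration blows up quickly on long sentences, so a real tagger would use dynamic programming instead, and the whole thing is only as good as the lexicon behind it.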
One possible reason the free software community hasn't played around with this yet is that the cost of developing a usable lexicon is HUGE -- it's several man years, if not tens of man years, to develop, debug, review, etc. such a lexicon. The guys at Webster's, etc., have a definite head start, but... even such may not be useful enough; hardly any dictionary provides the level of detail you need, in painful accuracy, describing the parts of speech in a useful way right now. The GNU Dictionary project (history described here) gives you a taste of the scope of the project. From what I've heard, it was mostly done by Russians (when they were cheap), because OCR is just not there.
From the standpoint of a grammar checking lexicon, the GCIDE in xml/html format is peppered with errors, omissions, irregularities, and problems, lacks all kinds of useful info, but is the best shot I've seen yet at a free lexicon.
Seems like most of the grammar checking s/w these days is rule/pattern based, and can spot a lot of common probs, but...
To sum it up, my guess as to why the open software movement seems to ignore grammar checking software is that a key piece of technology, a good lexicon, is missing. When one exists, you'll see all sorts of folks taking pot-shots at really good grammar checkers.
Dogs look up to men; cats look down on men; But Pigs! Pigs can look men square in the eye. -Churchill
"It's one of the few languages in which you can scramble the order of the words in the sentence and not loose any meaning because the word carries enough meta-data in the form of all of the various endings."
:)
It's not like I'm a grammar nazi or anything, I just like the irony
"When the atomic bomb goes off there's devastation...but when the atomic bong goes off there's celebraaaaation!"
Esparante!
Actually, there was something about this very subject on Slashdot just a couple of weeks ago:
The beauty of this algorithm is that it isn't tied to a particular language, like English (in fact it's even been used to analyze DNA genome sequences), and using "known good" texts as a reliable source the resultant data sets can be used to gauge the "badness" of your grammar under test.
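To give a feel for the "known good text" idea, here is a deliberately crude stand-in in Python. It is not the algorithm from that story, just word-pair counting: collect the adjacent word pairs from trusted text, then report what fraction of the pairs in the text under test were never seen. The function names are my own.

```python
def word_pairs(text: str) -> list[tuple[str, str]]:
    """Split on whitespace and return the adjacent word pairs, lowercased."""
    words = text.lower().split()
    return list(zip(words, words[1:]))

def badness(known_good: str, under_test: str) -> float:
    """Fraction of word pairs in under_test that never occur in known_good.

    0.0 means every pair was seen before; 1.0 means none were.
    """
    seen = set(word_pairs(known_good))
    pairs = word_pairs(under_test)
    if not pairs:
        return 0.0
    unseen = sum(1 for pair in pairs if pair not in seen)
    return unseen / len(pairs)
```

Like the approach described above, nothing in it knows about English, so it works on any sequence of tokens. It also needs a very large known-good corpus before an unseen pair means "bad" rather than just "new".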
IM IN 7TH GRAED AND MAH TEACHER DO3S NOT LOK OV3R MAH SHOLD3R AL DAY AND IM MAKNG OUT11!!1!1! OMG LOL THINK ABOUT IT IM DONG JUST FIEN111! OMG WTF U CAN UNDARSTAND M3
Oh wait, you said useful and functional... guess we need to start from scratch.
English seems to be more a matter of style than of grammar. People want something to sound "right" and to be easy to read. Ending a sentence with a preposition is not an offense to die for.
Could we program in a set of styles, and check the document against a style?
Andy Out!
Do other languages have parts of speech that do not have English equivalents? (I'm guessing yes, but not significant ones). Do you have any good links I could look at?
My personal theory is that the way language is constructed is analogous to the way the brain works. We break the world into objects, and these objects have attributes. Thus we have nouns and adjectives. Similarly, we see actions and these actions can have different characteristics: verbs and adverbs. When you are modelling language, you are modelling the mind.
"I'm not impatient. I just hate waiting." - My Dad
How about instead of a grammar checker we have a grammar teacher... Others have already said we should be teaching better grammar in schools rather than using grammar checkers to check our work. So let's make the grammar checker more like a teacher: when you check your work it asks you "Do you see what you've done wrong?", then lets you figure out what you need to change and explains why.
You're not talking about Lojban, are you?
"Just because we can build tools doesn't mean we ALWAYS have to use them - or that we can forget (or never learn) how to do things without them!"
I think that the reason we have the tools is that these problems existed in the first place. I have a hard time believing that spell-checkers are creating more lousy spellers; rather, they are probably helping lousy spellers perform better.
I cried real tears when Li Mu Bai died.
I hate the argument that it doesn't help people who don't know good grammar. Of course it doesn't. It wouldn't help them if it corrected every word. These people need to learn how to develop good sentences that make sense grammatically. Basically, you shouldn't be that lazy. It takes, what? Like 20 minutes extra to read through a paper and correct all of your grammar and spelling errors. And if you somehow missed that whole thing called 'English class' in school, then that's your problem, and you should have paid more attention, huh.
Spell-checking makes more sense, because it only takes one letter to mess up a word; not only that, but weeding out all those errors is a more daunting task. Grammar is easy.
Come on, people, it's not that hard.
And "fruit flies" is a noun-noun compound, not an adjective modifying a noun.
Are you adequate?
In writing, the content is the ideas, the sentence fluency, the voice, the organization, and the choice of words. The conventions of writing, like spelling, punctuation, and grammar, are incidental and can be edited without changing the intent of the original author. Granted that grammar has an impact on choice of words and sentence fluency, but they are still only incidental.
The content, or the author's intent, is the REAL content of writing.
I cried real tears when Li Mu Bai died.
i can see why they posted this "proposal" on /. :)
Wintertree-Software.com has a wonderful Windoze app, Grammar Expert Plus. The demo even continues to work after the trial period, with about a 10 second nag screen.
I, myself, was always recognized . . . as the "slow one" in the family. It was quite true, and I knew it and accepted it. Writing and spelling were always terribly difficult for me. My letters were without originality. I was . . . an extraordinarily bad speller and have remained so until this day. --Agatha Christie
I cried real tears when Li Mu Bai died.
Hi everyone, AbiWord-2.4, due any day now, will come with an integrated version of the open-source grammar checker link-grammar. You can download a beta version now. As usual it is available for Linux, Macs and Windows.
RS
Shoes for Industry. Shoes for the Dead.
> French....the most beautiful sounding and looking language
From which I infer that you know neither Chinese nor Japanese nor Serbo-Croatian.
But yes, Rimbaud will always sound like making love to the overmind, while Rilke will always sound like a phlegm disorder.
-I like my women like I like my tea: green-
An efficient, useful and functional grammar checker would stop language evolution. That is bad.
I've heard quite a lot of Chinese and Japanese, and almost no Serbo-Croatian, but aesthetics are a matter of personal preference, which is why I said 'to me,' so get over it.
Man, I wish I had better karma, because I've got useful things to say here.
You can check grammar using a well-trained Hidden Markov Model and the Viterbi Algorithm. If I were to design such a program, I would have the part-of-speech tagger have a go at a sentence, and if it came back with a confidence below, say, x, then the sentence's grammar is probably not good.
This is nice because it also helps sentences keep from being awkward.
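For anyone who wants to poke at the idea, here is a rough sketch of it in Python. Everything in it is made up for illustration: the three tags, the tiny vocabulary, all of the probabilities, the UNSEEN floor and the threshold (the "x" above). A real tagger would be trained on a tagged corpus. "Confidence" here is assumed to mean the log probability of the best Viterbi path divided by the sentence length, which is only one of several ways you could define it.

```python
import math

# Toy HMM. All tags, words and numbers are invented for illustration.
TAGS = ["DET", "NOUN", "VERB"]
START = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS = {
    "DET": {"DET": 0.01, "NOUN": 0.98, "VERB": 0.01},
    "NOUN": {"DET": 0.1, "NOUN": 0.2, "VERB": 0.7},
    "VERB": {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1},
}
EMIT = {
    "DET": {"the": 0.7, "a": 0.3},
    "NOUN": {"dog": 0.4, "cat": 0.4, "walks": 0.1, "sees": 0.1},
    "VERB": {"sees": 0.5, "walks": 0.5},
}
UNSEEN = 1e-6  # floor probability for a word a tag has never emitted


def viterbi(words):
    """Return (best tag sequence, log probability of that path).

    `words` must be a non-empty list of lowercase tokens.
    """
    # best[i][tag] = (log prob of the best path ending in tag at word i,
    #                 previous tag on that path)
    best = [{}]
    for tag in TAGS:
        emit = math.log(EMIT[tag].get(words[0], UNSEEN))
        best[0][tag] = (math.log(START[tag]) + emit, None)

    for i in range(1, len(words)):
        column = {}
        for tag in TAGS:
            emit = math.log(EMIT[tag].get(words[i], UNSEEN))
            prev, score = max(
                ((p, best[i - 1][p][0] + math.log(TRANS[p][tag])) for p in TAGS),
                key=lambda pair: pair[1],
            )
            column[tag] = (score + emit, prev)
        best.append(column)

    # Backtrack from the most probable final tag.
    last = max(TAGS, key=lambda tag: best[-1][tag][0])
    log_prob = best[-1][last][0]
    tags = [last]
    for i in range(len(words) - 1, 0, -1):
        tags.append(best[i][tags[-1]][1])
    tags.reverse()
    return tags, log_prob


def looks_ungrammatical(sentence, threshold=-2.0):
    """Flag a sentence whose per-word log probability falls below threshold."""
    words = sentence.lower().split()
    if not words:
        return False
    _, log_prob = viterbi(words)
    return log_prob / len(words) < threshold
```

With these toy numbers, looks_ungrammatical("The dog sees a cat") comes back False and looks_ungrammatical("dog the the sees") comes back True. The catch is the same as with any statistical approach: an unusual but perfectly correct sentence also scores low, so the threshold is a trade-off, not a verdict.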
A rock is a good enough hammer in many cases, just don't try to do fine carpentry with it. Similarly, Word/WP is good enough in many cases, just don't use it to edit your novel.
All you've shown is that you can misuse a tool in a slightly failing way. Despite your best attempt (or at least good attempt) to foil the tool, it let you convey your intended message.
There are 1.1... kinds of people.
There's the Link-grammar parser, which can do grammar checking. It is being used by Abiword (2.3/2.4).
http://www.link.cs.cmu.edu/link/
Well, don't worry about that. We can get you back before you leave. (Dr. Who)
Perfect grammar checkers are impossible with less-than-human AI and good ones are extremely difficult. But I posit, even adequate ones are useful!
The function of a grammar checker, to me at least, is to catch common stylistic faux pas, and to backstop a spell checker with some intelligence about what word belongs where. It can't do your thinking for you, true. But it can catch when you've goofed egregiously - which everyone does sometimes.
Parsing natural language is automatically inferring syntactic structure. Grammar checking requires accurate parsing, otherwise it's ad hoc.
However, automatic parsing methods still struggle to achieve high accuracy. That's the reason grammar checkers still suck.
So parsing is, at least, the most promising direction for immediate research. In the limit, grammar checking is an NLP-complete problem.
Take it from me: I've written a full-coverage grammar cum spelling checker for Dutch, and a natural language question answering system (working on a smaller domain, but includes semantic reasoning), and I can tell you it's an incredible amount of work to write a good grammar checker.
It's the 90-10 rule in action: 90% of the work can be done in 10% of the time. A fairly large coverage can be obtained in relatively little time. However, the remaining bit takes very much time. E.g., the current dictionaries may cover many words, but they do not really make all semantic distinctions. If you would have them, you would need to obtain statistical distributions over all these distinctions and their combinations. Then you would have to fit this into your rules. Then, ... ad infinitum.
And then, everybody has his pet linguistic theory in which he fits his dictionary and rules. So collaboration is not very fruitful.
And then, when you've got your basics completely covered, comes the hard part: understanding the whole text. Sentences that are difficult to read or outright nonsensical may be acceptable in another context. Well, that ultimately requires the equivalent of a human brain, not just a shallow NP parser...
Now ask yourself: would an open source project even get to first base? No, it takes too much time and effort and intelligence. It's like asking for an open source nuclear fusion reactor. Only the combined effort of a group of university research groups can slowly tackle something as complex as this. You'll have to wait for them to come up with something and make it freely accessible...
As you are talking about parsing Latin, I *must* report this:
http://www.csse.monash.edu.au/~damian/papers/HTML/Perligata.html
That's what I call a "proper language"...
--federico
and most males prefer to be inside a female?
...and unless you're using English English instead of American English. The phrase "green campus" is an American English phrase, with no direct translation under English English variants.
I once had a US border security guard ask me whether I spoke English. The temptation to reply "My dear chap, I don't just speak it, I am English!" was almost unbearable, but the nearby box of latex gloves convinced me that the more concise "Yes sir" was more appropriate.
(Anyone who thinks that there is such a standard as "British" English has obviously never attempted a conversation with someone from Glasgow.)
Andrew Oakley - www.aoakley.com
Can be found here.
That's why there are so many spelling/grammar nazis on slashdot.
So what you're saying is that Microsoft's next spell checker will actually post a /. story with the text in question, and return a real-time grammar/spell check?
Find as many un(der)employed college graduates with English degrees as you can. Pay them to check your grammar. Problem solved!
"We can categorically state that we have not released man-eating badgers into the area." - Major Mike Shearer, UK
ADIOS and MEX (use a Google search for those two terms) make it possible to distill a grammar from a corpus (existing text), and then use that grammar to validate provided text. // babl fishes being made to custom order //
You're asking Slashdot about grammar checkers? This brings up a good point though. Most speakers of the English language, and even those who study it, can't agree on what proper grammar is. How are we supposed to program a computer to check rules that we don't even agree on? Requirements first, people!
Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
I tried to convince my daughter that even when she talks in leet-speak to her friends on IMs, she should use proper diction and syntax. I said it would help her when she has to talk to a real person. The verdict is still out. My crusade continues.
Because artificial stupidity doesn't breed, thus making it the preferred choice.
Bring on the asteroid
In my spare time, I write fiction. I've done it for years, and the tools that I use are simple: a text editor (not word processor), ispell, and a self-written output formatter written in Perl in order to get things looking nice or to reformat into HTML/XHTML or other presentation formats.
I'm not a perfect writer, and it's difficult to self-edit when you are proof-reading your own work: you tend to read what you INTENDED to say, rather than what you ACTUALLY wrote.
I found that there weren't any proofreading tools available for the platforms I was using (Unix), so I tried to figure out what common mistakes I tend to make (duplicated words, etc.) and wrote a program in Perl to find these things. It started simple, and in the couple of years since I wrote this processor, I've added additional checks, which now include:
Right now, I'm wondering what to do with this. This program works as a filter that simply generates "error lines" that my text editor can read and use to pinpoint where in my text the error occurred (using built-in features like "find next error"). This suits my needs, and I can add new checks as I think of them. How useful would it be to others? How many others even write the way I do? It's not as extensive as the grammar checker in Word and it works on an entire (text) document at a time.
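To give a feel for what one such check looks like, here is a minimal sketch of the duplicated-word case. It is not my actual Perl code; it is a Python stand-in, and the function name, the default filename and the exact "file:line:column: message" layout are assumptions chosen to resemble the error lines most editors can jump through.

```python
import re

# A word, some whitespace (possibly a line break), then the same word again.
DUPLICATE = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)


def find_duplicates(text, filename="<stdin>"):
    """Return 'file:line:column: message' strings for doubled words."""
    errors = []
    for match in DUPLICATE.finditer(text):
        start = match.start()
        # Lines and columns are 1-based, as most editors expect.
        line = text.count("\n", 0, start) + 1
        column = start - (text.rfind("\n", 0, start) + 1) + 1
        errors.append(
            f"{filename}:{line}:{column}: duplicated word '{match.group(1)}'"
        )
    return errors
```

Run as a filter, you would feed it the whole document, e.g. print("\n".join(find_duplicates(sys.stdin.read()))) after importing sys. Note that it will also flag legitimate doubles such as "had had", which is exactly the kind of thing a pattern can't judge.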
The obvious place that could use this would be the OpenOffice.org project, especially since OOo has no grammar checker. However, it doesn't use just plain text, so I'd have to make a lot of changes to remove formatting, etc. before checking the patterns. Still, I've played around with the idea that I could "port" my Perl script into a plug-in for OOo. The problem is, I don't really use word processors all that much... it wouldn't be something I'd be using extensively, and therefore I wouldn't have the incentive to keep it up to date.
In addition, OOo is not an English-specific project. My grammar checker, even if it was ported to be used within OOo, doesn't have the rules for French, Spanish, Chinese, Japanese, Swahili, or any other language... such support would require quite a lot of dedicated individuals.
All that being said, and relating to the parent post, even using my grammar checker isn't a replacement for having a flesh and blood person looking over your shoulder. How can a program know that you are overusing the word "had" or using "like" when you should be using "as if" or another better phrase? Even writing a rule like "Never start a sentence with 'because'" isn't sufficient, because there are some times when that rule shouldn't be applied ("Because the food was so good, it was eaten in a matter of minutes.").
"May I have ten thousand marbles, please?"
Next will come "Grammar" (Gnome) and "grammatiK" (guess).
Within the following year we'll see "iWordflow", and six months past that (after a flaming rant and a two-week hackathon) everyone will upgrade to "OpenGrammar".
You'll eventually see XFGrammar for people who think Grammar and Grammatik are too bloated.
Dewey, what part of this looks like authorities should be involved?
Actually the word "moose" has no plural at all. This makes talking about a group of them difficult, so native speakers tend to avoid it. We will say "herd of moose" or other such things when we must, but that doesn't make it correct, only that we have no other choice. Natives will go out of their way to avoid talking about more than one moose at a time.
What would it take to make a useful, functional grammar checker?
You'd have to find programmers who actually knew correct English grammar.
18 rules
A classic.
It will never be outdone by any other book or by any means, electronic or otherwise.
This urban legend is based on a true story. The linguistics professor was J. L. Austin, the interjector was Sidney Morgenbesser, a philosophy professor, and the interjection was "Yeah, Yeah". See http://en.wikipedia.org/wiki/Sidney_Morgenbesser .
'la barbe' is the one that makes me chuckle...
Beard in french is feminine.
The beauty of Chinese lies not in the spoken but in the written word. There it is as far beyond all European languages as Andromeda is beyond the corner store. Japanese has most of that, plus a minimalist euphony and several orthogonal dimensions of orthography.
Serbo-Croatian really only has euphony going for it, and I would be surprised to see it last as a national language for another 200 years. Quite pleasantly surprised, since then I would be older than a Galapagos tortoise, and seeing anything would be a pleasant surprise.
-I like my women like I like my tea: green-
Useful, functional grammar.
proofvi(1) has been around since 1984, as part of the Writers' Workbench software package from Bell Labs. It works pretty good^H^H^H^Hwell. I have a copy on my 3b2-500. :-)
Grammarian by Casady&Greene Mac OS 7 or 8.
As a 13-year-old I noticed that a heavy majority of French nouns ending in "e" were feminine and a heavy majority of those not ending in "e" were masculine. There are plenty of exceptions, but this will get you a long, long way down the track.
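Written down as code, the rule of thumb is a one-liner. This is only a sketch: the function name and the small sample list are made up for illustration, and the sample deliberately includes two well-known exceptions ("livre" is masculine, "maison" is feminine) to show where the guess goes wrong.

```python
def guess_gender(noun):
    """Guess 'f' for a noun ending in 'e', 'm' otherwise (rule of thumb only)."""
    return "f" if noun.lower().endswith("e") else "m"


# (noun, actual gender) pairs; a hand-picked sample, not a corpus.
SAMPLES = [
    ("table", "f"),
    ("chat", "m"),
    ("barbe", "f"),
    ("amour", "m"),
    ("livre", "m"),   # exception: ends in "e" but is masculine
    ("maison", "f"),  # exception: no final "e" but is feminine
]

hits = sum(1 for noun, gender in SAMPLES if guess_gender(noun) == gender)
```

On this sample the guess is right for four of the six nouns. That says nothing about the real hit rate across the language, which would need a proper lexicon to measure.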
Awesome tip; thanks much! Eventually, I do intend to get back into studying French, so this should help a lot.
5. Profit!
Me lost me cookie at the disco.
Yours would be better without the comma.
Try this one: http://www.checkthetext.com/
One can be bolder, and say that adding an "e" to the end of a noun is often a way to make it feminine. Again, there are exceptions, and it's only useful with living things.
le chat -> la chatte (the cat -> the female cat)
un français -> une française (a Frenchman -> a Frenchwoman)
On the other hand, there are really crazy things, like some nouns that are masculine when singular, and feminine when plural (never the other way round, though). The most famous one is "amour" (love), and there's no doubt that someone is going to make a joke about polygamy.
People verb words all the time.
Made me laugh!
-kgj
PS, this entire thread is interesting; thanks all.
-kgj