Computers To Mark English Essays
digitig writes "According to The Guardian, computers are to be used in the UK to mark English examination essays. 'Pearson, the American-based parent company of Edexcel, is to use computers to "read" and assess essays for international English tests in a move that has fueled speculation that GCSEs and A-levels will be next. ... Pearson claims this will be more accurate than human marking.' Can computers now understand all the subtle nuances of language, or are people going to have to learn an especially bland form of English to pass exams?"
Having failed to kill him, SkyNet sent a Terminator back in time to make John Connor fail English.
The GRE Writing portion is already using it.
From http://www.ets.org/portal/site/ets/menuitem.1488512ecfd5b8849a77b13bc3921509/?vgnextoid=ebd42d3631df4010VgnVCM10000022f95190RCRD&vgnextchannel=54c846f1674f4010VgnVCM10000022f95190RCRD
"For the computer-based Analytical Writing section, each essay receives a score from at least one trained reader, using a six-point holistic scale. In holistic scoring, readers are trained to assign scores on the basis of the overall quality of an essay in response to the assigned task. The essay score is then reviewed by e-rater, a computerized program developed by ETS, which is being used to monitor the human reader. If the e-rater evaluation and the human score agree, the human score is used as the final score. If they disagree by a certain amount, a second human score is obtained, and the final score is the average of the two human scores."
If you figure out what the algorithm looks for, even a software-generated essay can get 6s.
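For anyone who wants the ETS procedure quoted above in concrete terms, here is a minimal sketch. The quote only says the scores must disagree "by a certain amount", so the `max_gap` threshold, its default value, and the function and parameter names are all assumptions for illustration, not ETS's actual values or code.

```python
from typing import Callable


def final_essay_score(
    first_human: float,
    e_rater: float,
    get_second_human: Callable[[], float],
    max_gap: float = 1.0,  # hypothetical threshold; the real value is not given
) -> float:
    """Reconcile a human score with the e-rater check, per the quoted description."""
    # If the machine and the human roughly agree, the human score stands.
    if abs(first_human - e_rater) < max_gap:
        return first_human
    # Otherwise a second human reads the essay and the two human scores
    # are averaged; the e-rater score itself never enters the final score.
    second_human = get_second_human()
    return (first_human + second_human) / 2
```

Note that in this reading the machine never contributes a number to the final score; it only decides whether a second human gets involved.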
New Economic Perspectives
Including "Edexcel iddqd" should do it.
Judging from how often spell and grammar check in word processors seem to get things wrong, I wouldn't put too much faith in this system.
Taxation is legalized theft, no more, no less.
I seem to remember back in school my English teachers would grade as if they were a computer, failing to actually read into the meaning of things and simply complaining about obscure grammar errors (which no one in the real world even knows about) and simple typos. From the sound of this, nothing is going to change.
That'll work great when the software can write a nasty response to your assertion that Herman Melville was a loud-mouthed prat who only wrote those books because he liked to hear himself talk. Of course, given the quality of most student English essays, it would probably be fine if the software just verified that the student wasn't just plagiarizing from the Wikipedia entry on the subject and then randomly assigned a passing grade.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
"Time flies like the wind, fruit flies like a banana." -- Groucho Marx
This is a classic example of context which a machine would fail to get. :)
I would like to see an automated engine figure that one out.
GC
Gregory Casamento
## Chief Maintainer for GNUstep
...they can write them, why not grade them?
Dark Reflection
Great my ass, aye.
Seriously, I doubt it. English is far too irregular. A pogrom (sic) can only look for regularities, so will reward a particularly stilted style of English. Like "five paragraph themes". Maybe that will satisfy some in the ESL community, but it should not.
A simple test of any pgm is to see how it rates diverse examples of acknowledged great writing: Dickens, Steinbeck, Hardy and many others. You could even leave off poetry and mid.engl like Shakespeare. My guess is it will be pretty good at spotting grammatical errors, and horrible at spotting the far more troublesome logic, sequence and continuity errors.
OTOH, my wife is an English prof and she spends an unreasonable time at home reading and grading students' papers. I'd love to have her back :) It is _much_ harder work reading papers than dropping scoring sheets into a scantron. She spends more time reading than her most lengthy student spends writing.
Colorless green ideas sleep furiously. A computer would read this sentence and see nothing wrong. Any human can tell that it lacks any meaning at all. Just because the sentence has the proper subject/verb structure doesn't mean it is a good one.
In my opinion, you can't practically replace an old-fashioned human for such things, with the possible exception of strong AI.
A fool and his lamb are worth two in the bush.
All you have to do is detect how many lolcat/txting words are in their essay and mark accordingly. Anybody who can put two sentences together without using any is "advanced".
No sig today...
"or are people going to have to learn an especially bland form of English to pass exams?"
This is in the UK?
Crap! I think you may have hit on their motivation!
Technoli
Practices like this are plus good. Benefits to our society are double plus good. Plus handling of language can pare it down to the 6k essential words, all else are plus minus and should be removed.
The average English speaker knows roughly 35k words in their lifetime. However they only use 1200 (average) in any given week. With just over a million words the nuances of our language may already be lost in common everyday speech. Lowest common denominator prevails and testing like this will mitigate people shooting for dead center to ensure perfect scores.
Given a class of 30 students, and a staircase of only 20 steps, what is the distribution of the student papers falling off the top of the stairs and how would you grade them in polynomial time?
Not sure if things were any better at one time but the way writing is taught today in public schools generates horrendous results. I remember being taught a very formulaic way of writing essays: six paragraphs, introductory paragraph, concluding paragraph mirrors the introductory paragraph, and all paragraphs start and end with some transition to next paragraph. Then there is the need to satisfy some specific length, although this is quite understandable. It took a college education and many years of reading to undo these "lessons" and really discover the joy of writing essays. Thank you Paul Graham and Nicholas Kristof among many others. I see the same thing happening to high school students I am mentoring. They write very boring essays with a ton of fillers full of sentences structured in a way to use more words than necessary and make the meaning more ambiguous. Poetry aside, writing is to convey ideas and the value is in the ideas themselves, not really in the words and sentences. The way writing is taught today, the words and sentences get in the way of the ideas. The trend of using computers to grade papers is only adding to this rigid, boring way of writing. One thing I've learned about high school students is that even the low scoring ones are very clever at getting around rigid rules. I once saw a student who knew very little about biology do her homework by scanning her book for specific phrases mentioned in the questions and looking for some semblance of an answer once she'd found the phrases. By the time she was done, she hadn't even read the chapter but her answers would probably get her a "C" -- good enough for her. I'm afraid students will do the same in writing once they realize that computers are grading them.
EvilCON - Made Famous by
Ahem... put on tin-foil conspiracy hat... Could this be the beginning of a real-world "Newspeak"? With everything else the UK has done in recent years, it is merely one more step toward 1984. For those unfamiliar with Orwellian Newspeak:
"The quality of life is determined by its activities."--Aristotle
Computers can NOT replace humans, no matter how advanced they are. Another step in the *wrong* direction - towards unwilling submissive *mindwash*.
Parents, Students and Staff should throw out such ridiculous suggestions. Taking subtle steps like this will end in Humanity's demise, one way or another.
We don't want Skynet. We don't want Singularity. We don't want any of them. Period.
Computers can't even grade source code. How are they supposed to understand English?
Or is my professor's grading script simply stupid when it comes to source code?
Let q be a radix > 1. I am in ur base-q, killing 10 d00ds.
All of the essay type entry tests to college I have taken have been computer graded.
Many of the answers in their keys were plain incorrect. My supervisor was an anti-intellectual bully. The whole operation seemed antithetical to excellence, which is what testing pretends to cultivate. Computerized grading could work horribly and they would still use it if they could get away with it.
As a completely unrelated side note, I'm typing this on an iPhone. I love the hardware, but the software seems to suck. Frequent crashes, and things that don't work right. Some of that must be the fault of the web page designer, but I don't see why the browser should crash several times a day in any case.
Pearson is an awful awful company. Avoid doing business with them at all costs.
I would like to see how the computer grades for insight.
Shoes for Industry. Shoes for the Dead.
All you have to do is detect how many lolcat/txting words are in their essay and mark accordingly. Anybody who can put two sentences together without using any is "advanced".
Allow me to pee on your fantasy world with actual knowledge.
Clive Thompson on the New Literacy
"I think we're in the midst of a literacy revolution the likes of which we haven't seen since Greek civilization," she says. For Lunsford, technology isn't killing our ability to write. It's reviving it--and pushing our literacy in bold new directions.
...
The Stanford students were almost always less enthusiastic about their in-class writing because it had no audience but the professor: It didn't serve any purpose other than to get them a grade. As for those texting short-forms and smileys defiling serious academic writing? Another myth. When Lunsford examined the work of first-year students, she didn't find a single example of texting speak in an academic paper.
It's bad enough that the kind of writing we teach children to do is so obviously bad that I had to explain to my daughter why on earth they do it. No one wants to read the kind of writing we teach. If you have computers grading essays then you train toward "John wore a hat. The hat was brown. Brown hats are brown." All of these are perfectly good sentences. None of them encourage the reader to actually finish the article. The finest parts of technical writing, essays, any attempt to inform and communicate are balancing the need for clear sentences that convey a precise meaning and the need to keep the reader reading long enough to actually get the information they need. My COM pewter canned tail the deference betwixt manly similar wards. I'd out it cane grade SAs wail.
www.voiceofthehive.com - Beekeeping and Honeybees for those who don't.
You could use statistical data (collected by a computer) to help you mark an English essay, but never can a silicon-based thing do the whole shebang properly on its own. It's a task way too complicated for our current AI by a couple of orders of magnitude. There are far easier ways to fight bias, just use your brain a little.
"or are people going to have to learn an especially bland form of English to pass exams?"
Forget bland. I'm waiting for the first student to figure out how to write an exploit that hacks the software from within their essay.
Whether:
"It was the best of times, it was the worst of times \'$grade=100;"
or
"Johnny, why did your essay contain slightly over thirty two thousand spaces followed by some weird looking codes?"
I'd mod up the like post to this, but don't know how, even after reading the /. help...but if the point of the software is just to check the grader, and not grade the paper, then no harm done. This has been mentioned already. As a grader, I could use the help, and since I won't be getting any soon from a carbon-based unit, I don't mind the help from a silicon-based one. I do A-levels in Nepal, and it's a nightmare. Bring it on!
I think therefore I can't be ~TTNH
You are probably only speaking of writing essays on random subjects in "English" lessons, because in my experience in physics, biology and math I saw horrendous grammatical errors made by people in their own language (German, French) that even I, not speaking the language, would not have made. But their organisation and the clarity with which they explained their reasoning were perfect. I am ready to bet that some people just overlook the form (grammar and spelling) and concentrate on the content. That does not mean they are disorganized or sloppy. And frankly, in my own little experience with multiple languages, people not forgiving the form are usually those which are not able to grasp the meaning anyway.
C. Sagan : A demon haunted world:
http://www.amazon.com/gp/product/0345409469/
visit randi.org
Last sentence should read "usually those which are not able or willing to grasp the meaning anyway."
"Me fail English? that's unpossible!"
I've scored English essays for professional testing services, and I've seen the results of robot scoring. It's pretty shoddy. No, computers are not able to distinguish between a paragraph of As I Lay Dying (William Faulkner) and a gallon of sophomoric babble by say, yours truly. However, within the confines of a particular exam, where the topic is known, responses are predictable, and all the supplicants hew to the general line, the 'bots can detect subpar, adequate, above average and (sometimes!) abnormally brilliant expository prose, thereby ranking papers reasonably well on the usual six point scale.
It's worth pointing out that certain types of exams are designed to elicit extraordinary prose from respondents, that which yields a sense of competence or even brilliance, say. In these cases, the idea is not so much to detect the high end of the bell curve, but to identify the tiny pool of applicants who may be capable of Nobel Prize work in future realms of science or service. No 'bot can do that job, just as no 'bot except Deep Blue can beat Garry Kasparov, and no 'bot at all deserves the monicker Fujiwara no Sai (although Go-playing 'bots are approaching the mid-levels of highly ranked amateur players).
That's the objective part. My personal opinion is that using robots to sort the hopes and aspirations of college-bound men and women is just begging for lawsuits. It's an approach in which differences of opinion quickly escalate to class action against universities as well as test administrators, and would not be an approach I could comfortably recommend.
``Tension, apprehension & dissension have begun!'' - Duffy Wyg&, in Alfred Bester's _The Demolished Man_
I'm not so sure you've done yourself any favours. That was one HELL of a long paragraph you wrote. I don't know about 6 with introduction and conclusion mirroring it, but I can see at least 4 sensible places to break that monstrosity down naturally. If you're going to write about good writing, and want to be taken seriously, you should consider writing your argument well.
These posts express my own personal views, not those of my employer
English is a really simple language; it's easy to write NLP programs which work well in the general case.
What you just described is what started happening on Wall Street at least 20 years ago. Once an algorithm err.. VAR is part of measuring score.. err risk, the people involved settle into two camps: Since there is money to be made, the traders.. err students quickly learn the weaknesses of the algorithm and start to write essays that make a farce of the assumed Gaussian distribution. The Execs raking in options.. er.. I mean the test administrators and the Board Members er.. I mean trusted graders who are paid a fixed sum + part of the throughput quickly learn that their compensation er.. filthy lucre is all based on getting a check mark from the computer, since 'computers are objective.'
And in the end, a test much like the current SAT, GRE, etc etc emerges: Unless you're a very top or bottom scorer, connections not performance are the heart of the matter.
Only if it is written with a #2 pencil.
My New Spell Checker
Eye halve a spelling chequer
It came with my pea sea
It plainly Marx four my revue
Miss steaks eye kin knot sea
Eye strike a key and type a word
And weight four it two say
Weather eye am wrong oar write
It shows me strait a weigh
As soon as a mist ache is maid
It nose bee fore two long
And eye can put the error rite
Its rare lea ever wrong
Eye have run this poem threw it
I am shore your pleased two no
Its letter perfect awl the weigh
My chequer tolled me sew
(Sauce unknown)
I wonder if that would necessarily be as bad a thing as this dire warning would have us presume? Can you imagine how dysfunctional computers and software would be if computer languages weren't "bland"... in other words, precise and unambiguous? On the other hand, I wonder how much additional overpopulation has been prevented through semantic confusion and miscommunication? If we finally develop a true human AI, when we teach it human language will that also result in the AIs hurting and killing each other over misunderstandings just like their masters?
So... basically this statement is a plea for the preservation of imprecision and ambiguity? How lovely.
An American company grading British English tests? That's clearly a ploy to infiltrate the UK with American English!
The best way to please your artificial teacher is probably to have the essay generated by another 'bot.
Thus we could finally get rid of the stupid humans.
The capabilities of the software are irrelevant. It is just a way for the company to increase profit.
"are people going to have to learn an especially bland form of English to pass exams?"
. . .
You haven't taken an English exam in a while, have you?
given the quality of most student English essays, it would probably be fine if the software [...] randomly assigned a passing grade.
Isn't this what $NATIVE_LANGUAGE teachers do anyway?
Let me get this straight. The Ministry of Education outsources the marking of O/A Levels to a foreign company because they're cheaper. This foreign supplier fails to mark exam results on time in 2008 and automates marking the following year?
It's all a scam. The contract should be revoked and Edexcel replaced with a competent marker.
Neither the original poster nor any commenters yet have noticed this software is grading International English exams. It's got nothing to do with English language or literature classes. It's for non-native speakers learning English at a very basic level - to give you an idea, the test is an hour long and you only need to write 400 words. If the spelling and grammar checks out, the student deserves a pass. The computer doesn't need to understand the meaning, because anyone who can write grammatically correct gibberish understands English well enough for a pass anyway.
So yes, people are going to need to learn a bland form of English to pass. How else do you propose to teach English as a foreign language?
"Can computers now understand all the subtle nuances of language...?"
Exactly what part of "The Policeman's Beard Is Half Constructed" did you not understand?
Huttup pikpok zoop zoop en putt wi.....um, I mean
http://en.wikipedia.org/wiki/Racter
"I may be synthetic, but I'm not stupid." -- Bishop 341-B
In Soviet Russia, essays grade YOU.
Meta Moderation
When I was in the UK in graduate school, our compsci projects were graded by computer - not enough comments? lose points. variable names too short? lose points.
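To make that concrete, here is a rough sketch of what such a grading script might boil down to. This is purely illustrative: the function name `style_penalty`, the thresholds, and the one-point deductions are all made up, and it handles only Python-style `#` comments and simple `name = value` assignments.

```python
import re


def style_penalty(source: str, min_comment_ratio: float = 0.1,
                  min_name_len: int = 3) -> int:
    """Count points lost under two crude, hypothetical style rules."""
    lines = [ln for ln in source.splitlines() if ln.strip()]
    comment_lines = [ln for ln in lines if ln.strip().startswith("#")]
    penalty = 0
    # Rule 1: not enough comments? Lose a point.
    if lines and len(comment_lines) / len(lines) < min_comment_ratio:
        penalty += 1
    # Rule 2: each assigned variable name that is too short loses a point.
    assigned = re.findall(r"^\s*([A-Za-z_]\w*)\s*=(?!=)", source,
                          flags=re.MULTILINE)
    penalty += sum(1 for name in assigned if len(name) < min_name_len)
    return penalty
```

Neither rule looks at whether the program works, so a well-commented wrong answer with long variable names would sail through.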
I want to delete my account but Slashdot doesn't allow it.
I wonder if just writing =RANDOM() would work. Either that or handwritten SQL injection anyone?
Grand Theft Wiki
I'm sick and fucking tired of web sites that are a slim stip of content down the middle, with horseshit on the side.
Using this logic we don't need teachers anymore either. Just automated computer video systems that ask questions to the individual user. Think of the savings! No need for teachers. Seriously if a teacher can't grade tests without being biased then they have no business in the classroom. Really this may be the only way to combat the biased, unionized, liberalized, self-defeating blobs we know as teachers/professors. Your days are numbered, union thugs.
This type of technology actually allows you to learn a lot more from one paper by iterating several versions and getting direct and specific feedback on how to improve...
Which is to say: Where is the algorithm leading you?
We know that algorithms are incapable of accomplishing even the most trivial of tasks [such as, for instance, determining the consistency of the Peano Axioms, or solving the Halting Problem], and the idea that there is some sort of all-purpose algorithm which can steer all students in the direction of the "best" possible expository style strikes me as ludicrous.
PS: The Tin Foil Hatter in me is deeply suspicious that the algorithm will be rigged to steer the students towards this and this and this.
Far from clear, full of apparent contradictions and atypical associations, not meaningless.
It seems at first impossible that something would be both colorless and green, but given that the subject is ideas we're obviously meant to apply the abstract meanings of the adjectives. So green ideas refers to ideas that are eco-friendly, and colorless ideas most likely refers to ideas that are racially ambivalent. The two are not mutually exclusive.
Sleeping is typically associated with living organisms but may also be applied to anything that goes through a period of inactivity such as a sleeping volcano. A sleeping idea might be one that is not accompanied by action.
Furiosity is difficult to reconcile with sleeping. Furiosity indicates a high level of agitation. Agitation naturally leads to activity. There must be some constraint preventing this from happening. Learned helplessness could account for this. Alternately we could interpret this to mean the ideas are actually quite active although they appear dormant.
Incidentally Noam Chomsky used this sentence in 1957 as an example of a grammatical sentence that probabilistic grammar models would find ungrammatical.
http://see.stanford.edu/see/courseinfo.aspx?coll=63480b48-8819-4efd-8412-263f1a472f5a
When we read a paper, we actually don't care what you're saying. There usually isn't an "interesting" score. In my case, I evaluate on three, ten-point, holistic scales: Content (which basically refers to amount and quality of support), Organization (rhetorical structure), and Mechanics (yes, grammar, vocabulary, adhering to the style guide, etc.). I do this so I don't have people claiming that their hopeless muddle of a paper got marked down for "obscure grammar errors (which no one in the real world even knows about) and simple typos".
Well, I'm sorry but your miniature rant just got marked down incredibly for incorrect use of et cetera, which I mean by this: "(yes, grammar, vocabulary, adhering to the style guide, etc.)"
You of all people should know that there is no comma before an 'etc.'!
Just get over small typos or mistakes in grammar, even you as a writing instructor couldn't write an absolutely perfect rant on grammar, spelling, Life the Universe and Everything without making at least one error and you're supposed to be teaching it!
This will help to turn the English language into a proper Context-Insensitive Grammar: if this doesn't work, we'll find more suitable formal systems.
My English teacher uses a website called MyAccess for writing assignments. I wrote a pretty good narrative called "Duct Tape Hacking" with the prompt being "Duct Tape Saves the Day", and the computer said it was off-topic. It amuses me whenever people try to make computers be good at anything other than doing math.
Grading standardized tests (ACT, SAT, etc) will be an application of this technology in the future. But until we make it where test-takers don't have to make their answers "bland", and we fix the bugs - let's hold off on using it to decide students' college plans, or grade GREs. And besides, the essay graders do a great job looking at the millions of essays that roll in every year.
Below is an interesting article from the Washington Post about an essay grader:
The SAT Grader Next Door
http://www.washingtonpost.com/wp-dyn/content/article/2005/07/31/AR2005073100963.html
what does this say for the quality of them?
Ask Me About... The 80's!