Indiana First With Computerized Grading
Mz6 writes "Computerized grading has been talked about previously; however, the New York Times reports that Indiana has become the first state to grade high school English essays by computer. The computerized grading process, called 'e-rater', uses a 6-point rating scale and artificial intelligence to 'mimic the grading process of human readers'. The system was tested over a 2-year pilot program and produced results virtually identical to those of trained readers. The big question is, will other states begin to emulate Indiana by tossing human grading?"
Funny, because the way I read that is, "Produced lawsuits where the cost is virtually identical to about 20 times the short-term savings."
I see this coming from both sides: from the obvious (the grading was wrong, and I lost a scholarship) to other people suing after dropping out of college-level English classes (the test said I was better than I was).
Kinetic stupidity has a new brand leader: Allen Zadr.
Is this program available under an open source licence? It sounds really interesting!
it would have been my goal to make the most wrong essay I could that would still generate a good grade from the system.
/bin/fortune | slashdotsig.sh
I bet I could write the other side of the equation: a program to create nonsensical gibberish that always gets A's. What would a teacher do if you handed in something like that? Apply a double standard to the student?
--
My username: hats off to George Carlin, and fuck the FCC. Freedom!
I live in Indiana, and I have taken these. They are not graded fairly, and they determine 10% of the final grade. A computer can obviously not grade essays fairly, so it shouldn't be done. I got a 5/6, which, according to the computer, was extremely good. However, this was an 83%, which brought down my grade significantly. This computerized grading isn't fair at all.
schmoozing with the teacher to get higher grades.
In unrelated news, Delicious Red Apples have suffered a terrible sales slump.
Blessed be he who reads this post, Cursed be he who tells my boss.
for the time being, i would trust more that program to moderate my comments.
c'mon people i was only joking dont mod me down, not noooo!!
"The quality of life is inversely proportional to the number of keys on your keyring."
SPAM filters are tricked all the time depending on the text of an email. Google was f'd up not too long ago because of trackback linking in blogs screwing up their algorithms. Isn't this a similar situation? If a student can figure out a way to beat the grader, we'll have students learning to write to beat software, not to form a well-written essay.
- gtaluvit (prnc. GOT-tuh-LUV-it)
Perish the thought should students start writing about the dangers of artificial intelligence. They may very well fail!
He who has no
The GMAT books are already giving formula essays to get you past any writer's block that might happen on the exam day...
You are in a maze of twisted little posts, all alike.
My alma mater graded most of my computer programs with shell scripts and I graduated in 1997. So I don't think India is the first to do that.
Let's just outsource all our test grading to Indiana too.
Best Windows Freeware
As a student the first thing I would hand in is twenty paragraphs from refreshing this. In political science class, of course.
--
My username: hats off to George Carlin, and fuck the FCC. Freedom!
The system was tested over a 2-year pilot program and produced results virtually identical to those of trained readers
I think this says more about the training that the "trained readers" are receiving than it does about the software.
Computerized grading is great.
Computerized grading is superb.
Computerized grading is excellent.
Computerized grading is outstanding.
Computerized grading is god.
Computerized grading is great.
Computerized grading is superb.
Computerized grading is excellent.
Computerized grading is outstanding.
Computerized grading is god.
Essay Result = A+
Ok, since you know the grading software is going to make it into the hands of the students, here's my scheme for perfect essays:
Step 1: Feed some encyclopedia articles, Wiki pages, and other random material on your subject into a Markov chain generator.
Step 2: Use a genetic algorithm to generate variations of the text. Fitness is determined by the grade calculated.
Step 3: Repeat step 2 until desired grade is achieved. (And, of course, Profit!)
The result is totally worthless, but at first glance would probably appear legitimate even to a human reader.
Sort of like Slashdot posts.
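For the curious, the three steps above can be sketched roughly as follows. This is only a toy under loud assumptions: `toy_grade` is a made-up stand-in for the grader (nobody in this thread has the real e-rater), `corpus` is a placeholder for the scraped articles, and the "genetic algorithm" is reduced to selection plus mutation with no crossover.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Step 1: map each word to the list of words that follow it in the source."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, length, rng):
    """Random-walk the chain to produce `length` words."""
    word = rng.choice(sorted(chain))
    out = [word]
    while len(out) < length:
        followers = chain.get(word)
        # Dead end (word has no recorded follower): restart from any word.
        word = rng.choice(followers) if followers else rng.choice(sorted(chain))
        out.append(word)
    return out

def toy_grade(words):
    """HYPOTHETICAL grader: rewards varied, long words on a 0-6 scale."""
    if not words:
        return 0.0
    variety = len(set(words)) / len(words)
    avg_len = sum(len(w) for w in words) / len(words)
    return min(6.0, 3.0 * variety + 0.5 * avg_len)

def mutate(words, chain, rng):
    """Swap one word for another that can follow its predecessor."""
    child = list(words)
    i = rng.randrange(1, len(child))
    followers = chain.get(child[i - 1]) or sorted(chain)
    child[i] = rng.choice(followers)
    return child

def evolve(chain, target, length=30, population=20, generations=200, seed=0):
    """Steps 2 and 3: keep the best half, mutate it, repeat until `target`."""
    rng = random.Random(seed)
    pool = [generate(chain, length, rng) for _ in range(population)]
    for _ in range(generations):
        pool.sort(key=toy_grade, reverse=True)
        if toy_grade(pool[0]) >= target:
            break
        survivors = pool[: population // 2]
        pool = survivors + [mutate(rng.choice(survivors), chain, rng) for _ in survivors]
    return max(pool, key=toy_grade)

# Placeholder source material; a real attempt would feed in far more text.
corpus = ("computerized grading uses a six point rating scale and the system "
          "mimics the grading process of human readers and the system was "
          "tested over a two year pilot program with trained readers")
chain = build_chain(corpus)
best = evolve(chain, target=5.5)
print(" ".join(best), round(toy_grade(best), 2))
```

Because the best half always survives, the top score can never drop from one generation to the next; whether it ever reaches the target depends entirely on the grader you plug in.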
That's actually a pretty novel way to approach the problem of creating Strong AI. Making smarter machines is hard, so what you do is dumb down the humans until even a coffee maker (or a grammar parser or whatever) would beat them in the Turing test. Damn, this is so sad.
>|<*:=
Output: A+
A feeling of having made the same mistake before: Deja Foobar
Maybe this says more about the readers than the computer program?
Good: The computer probably won't grade you down for writing an anti-Bush essay, and it probably won't get fired for it. Good: Computers won't play favorites, and you can't kiss up to a computer. Bad: The computer really can't grade you up for expressing original ideas. Bad: It's probably possible to fool the computer somehow.
-73, de n1ywb
www.n1ywb.com
...until some wiseass figures out a way to spoof the grader, probably by sliding under the radar of whatever probabilistic models they've got that pass for spell- and grammar-checkers.
For example:
Flimblarm nif goondatakun, jut sekfar bel shon duc. Seempkin dar goolnac flar tefnek voz toulian; elmpar gef sogquel.
Grade: B+ Your use of double-negatives continues to haunt you, but I'm glad you've gotten over hanging participles.
The only surefire protection against Microsoft infections is abstinence. - The Onion
At least the parent post proves one simple truth: human english teachers can be replaced by simple shell scripts.
Indiana parents are the first to buy (en masse) licenses for Essay Constructor Pro v2.0. The software produces essays that are indistinguishable from those written by real students, using the latest screen-scrape-from-Internet 'n' plagiarism-from-non-credible-sources techniques.
Indiana Director of State Board of Ed comments: "Isn't it wonderful how technology is improving education?"
Fred
"A fool and his freedom are soon parted"
-RMS
Not that there's anything in this post that serves as an example. I guess that's because I was graded by humans. Seriously, I don't recall getting any encouragement in writing back in the '70s in high school, and not much in college. I guess it wouldn't have been any worse if the Grade-O-Vac was inspecting my papers instead of my mostly-marginally-literate teachers. There were several exceptions, but they focused much more on reading than on writing. I suspect they had a lot greater effect that way--I know they had a great effect on me.
Now all the students need is e-writer so that they can just type in the subject and the score they want to achieve and then e-reader will grade it accordingly!!!
If I were a student, I'd want to get a copy of this software and use it to pre-grade my papers so that I could find out what's wrong and fix it before I turned it in.
schmoozing with the teacher to get higher grades
This works better for the Slashdot crowd. They are much better at romancing computers than people to get what they want.
Wouldn't it just be cheaper to grade the tests at call centers in India? What are those Indians doing when there are no incoming calls? Just slacking off??? They could be grading tests.
-- I was raised on the command line, bitch
Sure, but that's the fault of the humans implementing the grade system, who don't understand the difference between Gaussian and uniform distributions. Don't blame the computers.
Not that computers are a great idea here - they can only grade at the shallowest level, and if they were grading like real teachers, then those "real" teachers weren't doing their jobs.
But this specific problem that you mention is entirely human based.
The system was tested over a 2-year pilot program and produced results virtually identical to those of trained readers.
So it gives its favorite students 'A's without reading, least favorite students 'F's, and the rest arbitrary grades somewhere in between to mimic a bell curve?
Excellent!
"Artificial Intelligence is easy. It's artificial stupidity that impresses me." -- Arthur Oscar
Microsoft is to software what Budweiser is to beer.
How are these systems supposed to scribble in the margins and tell you your ideas don't fit together?
How do they judge the content? What if you submit an excellent paper on middle ages history but the assignment was on socialism?
Human feedback is required in order to learn how to write well; you can't just expect a machine to tell you how to improve your writing. Grammar perhaps, but not ideas and how to let them flow coherently.
In order for these students to get that feedback someone has to read it, and since they're reading it anyway, why not just grade it then?
Seems like they are trying to solve the wrong problem with this system, or a problem that doesn't exist. (Are there really so many papers to mark that you need a machine to do it?)
By creating a vernacular consisting of elongated words and sophisticated verbiage, obviously indifferent to definition but simultaneously observing grammar regulations while eschewing colloquialisms, perhaps students may increase individual chances of achieving substantial academic acclaim.
If this works anything like the writing level indexes you find on word processors, it should be easy to fool.
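To show how easy such an index is to fool, here is a minimal sketch of the Flesch-Kincaid grade level, one common readability index in word processors. Nothing in this thread says e-rater works this way; the syllable counter is a crude vowel-run heuristic, and both sample texts are made up for illustration.

```python
import re

def count_syllables(word):
    """Crude heuristic: count runs of vowels, with a minimum of one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level: longer sentences and words score higher."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59

# Made-up samples: one plain and meaningful, one padded with long words.
plain = "The cat sat on the mat. The dog ran to the cat."
padded = ("Sophisticated verbiage simultaneously eschewing colloquialisms "
          "achieves substantial academic acclaim.")

print(round(flesch_kincaid_grade(plain), 2))
print(round(flesch_kincaid_grade(padded), 2))
```

The index only sees word and sentence length, so the padded sentence rates far "higher" than the plain one regardless of whether it says anything.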
If moderation could change anything, it would be illegal.
I wonder how long until the FBI is linked to this system?
Grammar, 90%
Spelling, 95%
Patriotism, 80%
also:
I'd love to see famous writings graded by this system.
My writing style is somewhat peculiar, though I can't exactly say how (or even approximately how). Partially as a result of this, my marks in English class over the years of high school ranged from C to A, depending not on me, but on who the teacher was. If the teacher happened to like my style, I got a good mark.
This is annoying, but at least each year there was a different teacher, who may like my style. If the marking is computerised, it will not change; if your writing doesn't fit what the computer likes, you're screwed; likewise, if it does like it, you might never learn to express yourself more creatively (ie you'll be punished for trying to write in a manner different from what you usually do).
There are possibilities in this technology, but I suspect that it will be a long while before the eccentric aren't labeled as poor writers.
To illustrate my point, I'll restate it. [English -> German -> English]:
I do not trust the computer, which arranges, until I see a computer-translated document of this laughable isn't.
That's about how well a computer "comprehends" language today.
A computer can check spelling and even grammar to a certain extent. However, it cannot evaluate factual accuracy or strength of argument. Even with spelling, the computer is not likely to catch improper use of homonyms. I can guarantee you that it will be possible to create a piece of writing that is utter crap that would get an A+ using this or any other possible computerized grading system. Unfortunately, there are probably many teachers out there who make poorer graders than this system does. The answer to the problem of poor-quality teaching is not replacing teachers with computers; the answer is a combination of better teacher pay and putting higher standards in place for our teachers via competency testing.
I believe that (English) essay grading is harder than grading science exams based on problem solving (no bubbles please), at least if essays are about content and not just grammatically correct sentences.
I say this because there is an objective criterion for grading the solution to a physics or math problem: correctness. For essays I do not believe that we (and the current state of AI) can come up with an exact criterion like that. You might determine whether an essay is too different from essays which were written by experts, but can't a very different essay be just as good?
To my knowledge, AI programs can solve physics problems which are limited to some well-defined domain (for example: http://www.cs.utexas.edu/users/novak/cgi/isaacdemo.cgi though that is from 1977...). I am not aware of programs grading physics problem solutions.
I will accept an essay grading program after they grade solutions to math and physics problems.
I conjecture that some writers would feel offended if their essay did well according to the program: they might think it means they are too conformist and conservative and not novel in their approach...
Matyas
>If you would like to try out e-rater, you can obtain an ID and password and submit and original essay for scoring on the CriterionSM Web site.
Submit "and" essay? I guess they haven't run the software on themselves.
F.
If moderation could change anything, it would be illegal.
Writing is not mathematics. Good writing should not go along some artificial standard. Just because my paper is grammatically correct, has a topic sentence, 3 supporting paragraphs, and a conclusion doesn't mean it is good. Good writing needs a flow of ideas from one paragraph to another. It needs finesse, style, grace. This is like an IQ test for English writing. It would do very well in identifying poor writers - but could never identify a great one. I'm sorry ee cummings, your use of punctuation is poor 1/6. There are examples like this in books on taking the various standardized tests - any truly excellent writer is likely going to do badly. Why? The rules of the English language are guidelines, which may be broken when appropriate. This is just the mechanization of another facet of society, and should be tossed out with the rest of the garbage.
But I think that if a computer grading program which is no worse than humans could be devised, it would be a great learning tool. A lot of people make it to college as borderline illiterates. I'm not kidding. I read a lot of their crap. That's because their HS teachers were too overworked to grade their writing, so they didn't assign much. If a computer program could auto-grade and give detailed comments on how to improve the writing, high school students could be assigned an essay per week, and really get the hang of writing well. Teachers could focus on teaching instead of tedium.
Sure, the first grading applications are going to make a few serious errors. This is the first stage of every application when a computer is asked to interpret rich data. Early voice recognition sucked. Now it sucks much less, and it will just keep getting better. Same with OCR, chess software, machine translation, etc. So the right debate to have is about when this will be good enough for school use, and not whether. I'm prepared to admit that the answer to the right question is "not yet" (I'm not sure how deep the current problems go), but I fully support working on this system until it works right.
I live in Indiana (no, NOT India) and took this test. Being a techie, I figured I'd try to fake out the system. This test works out to be 10% of the final grade and since I had a 98 going into the test, I figured I could afford to gamble a little, figuring if it back-fired I could blame it on a computer error since everyone would figure the kid with a 98 MUST be telling the truth.
I almost wimped out. I wrote about 80 percent of the essay (about influence of pop-culture on society - and silly me I always thought society influences pop-culture but anyway). I had 5 paragraphs - 1 intro, 3 body - 1 half-assed conclusion. I reordered the paragraphs, copied the one I felt was the best written and pasted it into the body 3 times.
Guess what I got.....6/6 (six point grading scale which is pretty messed up because a 5/6 is an 83%). Hopefully they won't audit mine....
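For anyone checking the arithmetic, here is a quick sketch of what the gamble was worth. It assumes the essay score converts linearly (points out of 6 times 100) and that the final grade is a plain weighted average with the essay at 10%, which matches the numbers quoted in this thread but is not confirmed anywhere; `final_grade` is just an illustrative helper.

```python
def final_grade(course_average, essay_points, essay_weight=0.10, scale_max=6):
    """Weighted average of the prior coursework and the essay score.

    ASSUMPTION: linear conversion of the 6-point scale to a percentage.
    """
    essay_percent = 100.0 * essay_points / scale_max
    return (1 - essay_weight) * course_average + essay_weight * essay_percent

# A 98 going in, with a 6/6 versus a 5/6 on the essay.
print(round(final_grade(98, 6), 2))  # 98.2
print(round(final_grade(98, 5), 2))  # 96.53
```

Under those assumptions a single point on the 6-point scale moves the final grade by about 1.67 percentage points, which is why a 5/6 stings more than it sounds like it should.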
All your base are belong to us!
Edsger Dijkstra would no doubt have something profoundly and humorously offensive to say to the writers of this software ;-) There doesn't seem to be anyone to take over his mantle, which is presumably why the software industry is going to pot at an ever-increasing rate. sigh.
A computer cannot replicate certain aspects of the grading process. Sure, it can grade spelling and grammar and probably certain aspects of your writing style, but there are plenty of important aspects of writing that it cannot grade you on.
For instance, does your essay really grab the reader? Anyone here who reads technical documents knows what I'm talking about. There are some writers that, no matter how dull the subject, can make their work interesting and fun to read. A computer can not possibly grade one on that. I have a good friend who's a high school English teacher and occasionally I'll read some of the things written by his students. I've come across plenty of papers that are grammatically correct, have perfect spelling and are fairly well written from a syntactic and stylistic point of view, but are just plain boring to read. Then I'll move on to another paper, about the same subject, which is interesting and actually fun to read.
That's just one example of something a computer cannot possibly take into account when grading an essay. The bottom line is that a computer will never be able to grade you on certain subjective things, which although they are subjective and therefore open to a certain amount of interpretation depending on the person doing the grading, are nevertheless still very important aspects of good writing.
With spelling and grammar check, almost any average student can churn out a paper that is going to be mostly correct; however it still takes a good writer to produce something interesting. In my opinion, an interesting paper with a few minor spelling, grammar or syntactic errors is just as good as a boring paper with no spelling, grammar or syntactic errors.
Slashdot First With Computerized Story Posting
Now with Computerized Story Posting, the artificial intelligence "seeks out" stories that have either been long archived or just posted the previous day and then posts them as new material. The program then ignores what is stated in the FAQ and disregards all emails stating that the story is a duplicate. This program is also known as "chrisd".
Other features include "mis-classification into the wrong topic", "making up stupid-titles-that-go-into-the-dept", and the most difficult, ignoring stories that should be posted.
Chris Benard
If it doesn't already, I would expect a service like this will eventually include plagiarism detection, due to marketing pressure if nothing else. This is something that human graders do, at least over the space of papers they grade and works they remember.
But if plagiarism detection is added, then the grading service would have to make and retain some encoding of each graded paper, a derivative work, in its database.
Once that happens, the grading service also becomes subject to all of the issues already raised with services like TurnItIn.com, already discussed here.
I also found this comment from ETS's site rather strange, to say the least:
A good essay always consists of an introductory paragraph, three body paragraphs, and a closing paragraph.
It is essential that every paragraph begin with a topic sentence. The first paragraph should state the thesis, or point of the essay. Since computers cannot actually understand the entire essay, you can assume that it will only be judging the local coherence of writing which is free to run like a river, past Eve and Adam's, from swerve of shore to bend of bay, taking us by a commodius vicus of recirculation back to Howth Castle and environs.
The second paragraph should make a point that present a countervailing view, the antithesis. Once again, spelling should be correct, the essay should be capable of passing a Microsoft Word grammar check, but after that we pass through grass behind the bush where a gull calls, coming far, ending here. Finn again? Take, but softly memory till thousands are given the keys to a way a lone a last a loved a long the river runs.
The third paragraph should synthesize the material covered in the first two paragraphs. It is, however, important that any material obtained from external sources be modified so that it cannot be detected as an exact match for anything on the Web. So, she went into the garden to cut a lettuce leaf to make an mince pie; and at the same time a great wolverine, coming up the street, goes into the store. "What! No laundry detergent?" So he died, and she very imprudently married the barber, and they all fell to playing the game of catch as catch can till the gunpowder ran out at the heels of their boots.
In conclusion, the final paragraph should recapitulate and summarize what has gone before: since you can be sure that a computer is capable of counting paragraphs, a good essay always consists of five paragraphs. If it has the right number of paragraphs and every word is spelled correctly, you are almost certain to get at least a passing grade.
"How to Do Nothing," kids activities, back in print!
Yes, you can trick it. From the e-rater article:
"Experienced writers, teachers, and writing assessment specialists have tested e-rater to determine the extent to which it "understands" the content of essay responses. Some of these writers have submitted essays that have tricked e-rater into giving a score even though the essay does not make any sense. The individual words in these "challenge essays" are grammatically correct, but they are strung together in such a way that they create nonsense sentences."
That observation shouldn't be surprising because earlier it says: "An e-rater score will be most beneficial to students who make a good faith effort at using it to improve their writing skills."
The program works (grossly oversimplified) by mimicking the grading of humans on essay samples.
This issue cuts deep into the heart of what grading is for -- it's possible for smart people to reasonably disagree, depending on what they think the intent of the grade is. Since grades are put to many uses, there are many answers to the question.
As a college instructor, I tend to use a strict grading protocol -- and then "bump up" a few of the students. If someone comes in to my office every week and really struggles to understand the concepts, but the computer tells me that they earned a "C+" -- they're likely to find a "B-" on their transcript. But if someone who's smart enough to get an "A" blows an exam from being hung over, that person gets little or no sympathy.
I just signed up for a userid so I can take the exam online, but after submitting my info it said I may have to wait up to two days to get an account.
Curious that they can grade essays with a computer but it looks like they have to have a human pass out the user ids.
Anyway, I'll see if I can submit one of my articles to the exam, and will post here how I did. Since I have to wait for my user ID, you'll have to look back here later to see how I did.
Request your free CD of my piano music.
A friend of mine who teaches biology said that she saw some pretty bad essays that she would have given poor grades because the English was atrocious, but she had to follow the grading rubric and give them high scores because the keywords were present.
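A keyword rubric like that is easy to picture in code. The sketch below is purely hypothetical: the keyword list, the 6-point scale, and both sample answers are invented for illustration and do not come from any real rubric.

```python
import re

def keyword_score(answer, keywords, max_points=6):
    """Award points purely for the fraction of rubric keywords present."""
    found = set(re.findall(r"[a-z]+", answer.lower()))
    hits = sum(1 for k in keywords if k.lower() in found)
    return round(max_points * hits / len(keywords))

# Hypothetical rubric and answers.
rubric = ["mitochondria", "atp", "respiration"]
atrocious = "mitochondria it make atp when respiration is do by cell good"
articulate = "The cell's powerhouse converts nutrients into usable energy."

print(keyword_score(atrocious, rubric))
print(keyword_score(articulate, rubric))
```

The broken answer hits every keyword and gets full marks, while the readable one that avoids the exact terms gets nothing, which is exactly the complaint.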
The postmodern essay generator
Of course it was fun to mess with cheaters. If I noticed someone was copying off of my work I would make a point to put down all wrong answers. Then I would pretend to check over my work. The person who was cheating off of me would usually just take their test up to the teacher right away. When they sat back down I would make a big deal out of erasing every single one of my answers and doing the whole test over.
Their reaction was always priceless.
"It is difficult to get a man to understand something when his salary depends upon his not understanding it."