Indiana First With Computerized Grading
Mz6 writes "Computerized grading has been talked about previously; however, the New York Times reports that Indiana has become the first state to grade high school English essays by computer. The computerized grading process, called 'e-rater', uses a 6-point rating scale and artificial intelligence to 'mimic the grading process of human readers'. The system was tested over a 2-year pilot program and produced results virtually identical to those of trained readers. The big question is, will other states begin to emulate Indiana by tossing human grading?"
Is this program available under an open source licence? It sounds really interesting!
It would have been my goal to write the most wrong essay I could that would still get a good grade from the system.
/bin/fortune | slashdotsig.sh
I live in Indiana, and I have taken these. They are not graded fairly, and they determine 10% of the final grade. A computer obviously cannot grade essays fairly, so it shouldn't be done. I got a 5/6, which, according to the computer, was extremely good. However, this was an 83%, which brought down my grade significantly. This computerized grading isn't fair at all.
Spam filters are tricked all the time depending on the text of an email. Google was f'd up not too long ago because trackback linking in blogs was screwing up their algorithms. Isn't this a similar situation? If a student can figure out a way to beat the grader, we'll have students learning to write to beat software, not to form a well-written essay.
- gtaluvit (prnc. GOT-tuh-LUV-it)
My alma mater graded most of my computer programs with shell scripts, and I graduated in 1997. So I don't think Indiana is the first to do that.
As a student the first thing I would hand in is twenty paragraphs from refreshing this. In political science class, of course.
--
My username: hats off to George Carlin, and fuck the FCC. Freedom!
Maybe this says more about the readers than the computer program?
I disagree. I see lawsuits as no more likely. Furthermore, any process where you're subjectively evaluating something has to have quality controls and an appeals process. My wife once held a part-time job grading essay questions on a high school exit exam. Every few hours of grading exams, she would have to take and pass a "calibration battery" of 10 exams. Quality control is fundamental to the process.
What I see as being problematic is kids learning to beat the system. Typically these systems are predicated on grammatical analysis (use of punctuation and sentence completeness) and evidence of citing the text the question is based on. I'd bet it's a real easy system to beat.
Indiana parents are the first to buy (en masse) licenses for Essay Constructor Pro v2.0. The software produces essays that are indistinguishable from those written by real students, using the latest screen-scrape-from-Internet 'n' plagiarism-from-non-credible-sources techniques.
Indiana Director of State Board of Ed comments: "Isn't it wonderful how technology is improving education?"
Fred
"A fool and his freedom are soon parted"
-RMS
-- I was raised on the command line, bitch
How are these systems supposed to scribble in the margins and tell you your ideas don't fit together?
How do they judge the content? What if you submit an excellent paper on middle ages history but the assignment was on socialism?
Human feedback is required in order to learn how to write well; you can't just expect a machine to tell you how to improve your writing. Grammar perhaps, but not ideas and how to let them flow coherently.
In order for these students to get that feedback someone has to read it, and since they're reading it anyway, why not just grade it then?
Seems like they are trying to solve the wrong problem with this system, or a problem that doesn't exist. (Are there really so many papers to mark that you need a machine to do it?)
I wonder how long until the FBI is linked to this system?
Grammar, 90%
Spelling, 95%
Patriotism, 80%
also:
I'd love to see famous writings graded by this system.
I believe that (English) essay grading is harder than grading science exams based on problem solving (no bubbles please), at least if essays are about content and not just grammatically correct sentences.
I say this because there is an objective criterion for grading the solution to a physics or math problem: correctness. For essays I do not believe that we (and the current state of AI) can come up with an exact criterion like that. You might determine whether an essay is too different from essays which were written by experts, but can't a very different essay be just as good?
To my knowledge the AI programs can solve physics problems which are limited to some well-defined domain (for example: http://www.cs.utexas.edu/users/novak/cgi/isaacdemo.cgi though that is from 1977...). I am not aware of programs grading physics problem solutions.
I will accept an essay grading program after they grade solutions to math and physics problems.
I conjecture that some writers would feel offended if their essay did well according to the program: they might think it means they are too conformist and conservative and not novel in their approach...
Matyas
The real answer is to adjust your teaching methods per student based on subjective analysis. A low objective mark on a paper or test would indicate that, ideally, the teacher needs to pay closer attention to the needs of that student, and teach him in a way he can learn. (VERY few teachers have the time and skill required for this, unfortunately).
You can't grade subjectively because those grades will be compared objectively down the line. You can't say "this is pretty good for Kevin, I'll give him an A", but then say "Josh's paper is way better than Kevin's paper, but Josh is a bright kid, so I'm giving him a C". Kevin will think he's mastered the English language while Josh will go insane trying to achieve perfection.
Grading, when used for anything other than helping the teacher learn about each student, just plain sucks, and is only used for competition.
Darn straight. I'm all for making dumb people feel not so bad about things they can't control - like being dumb - but not at the expense of making the non-dumb feel dumb. There's a lot to be said for working smarter, not harder, and a lot to be said for those who have figured that out on their own. English provides several correct ways of communicating something and some incorrect ones. An "A" paper is an "A" paper, regardless of the source and the effort put in by said source.
Edsger Dijkstra would no doubt have something profoundly and humorously offensive to say to the writers of this software ;-) There doesn't seem to be anyone to take over his mantle, which is presumably why the software industry is going to pot at an ever-increasing rate. Sigh.
If it doesn't already, I would expect a service like this will eventually include plagiarism detection, due to marketing pressure if nothing else. This is something that human graders do, at least over the space of papers they grade and works they remember.
But if plagiarism detection is added, then the grading service would have to make and retain some encoding of each graded paper, a derivative work, in its database.
Once that happens, the grading service also becomes subject to all of the issues already raised with services like TurnItIn.com, already discussed here.
I also found this comment from ETS's site rather strange, to say the least:
Yes, you can trick it. From the e-rater article:
"Experienced writers, teachers, and writing assessment specialists have tested e-rater to determine the extent to which it "understands" the content of essay responses. Some of these writers have submitted essays that have tricked e-rater into giving a score even though the essay does not make any sense. The individual words in these "challenge essays" are grammatically correct, but they are strung together in such a way that they create nonsense sentences."
That observation shouldn't be surprising because earlier it says: "An e-rater score will be most beneficial to students who make a good faith effort at using it to improve their writing skills."
The program works (grossly oversimplified) by mimicking the grading of humans on essay samples.
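To make "mimicking the grading of humans on essay samples" a bit more concrete, here is a toy sketch in Python. It is not how e-rater works internally (the article doesn't say); the three surface features, the nearest-neighbour lookup, and the names `features` and `predict_score` are all my own assumptions for illustration.

```python
import math
import re

def features(essay):
    """Crude surface features: length, average sentence length, vocabulary variety."""
    words = re.findall(r"[a-z']+", essay.lower())
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    n_words = len(words)
    return (
        math.log1p(n_words),                  # overall length, damped
        n_words / max(len(sentences), 1),     # words per sentence
        len(set(words)) / max(n_words, 1),    # share of distinct words
    )

def predict_score(essay, graded_samples, k=3):
    """Average the human scores of the k graded samples closest in feature space.

    graded_samples is a list of (essay_text, human_score) pairs.
    """
    target = features(essay)
    by_distance = sorted(
        graded_samples,
        key=lambda sample: math.dist(features(sample[0]), target),
    )
    nearest = by_distance[:k]
    return round(sum(score for _, score in nearest) / len(nearest))
```

Note what the sketch never looks at: whether the essay makes sense. Anything that resembles the well-scored samples on the measured features gets a good score, which fits the "challenge essay" result quoted above.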
I just signed up for a userid so I can take the exam online, but after submitting my info it said I may have to wait up to two days to get an account.
Curious that they can grade essays with a computer but it looks like they have to have a human pass out the user ids.
Anyway, I'll see if I can submit one of my articles to the exam, and will post here how I did. Since I have to wait for my user ID, you'll have to look back here later to see how I did.
Actually, I've been a part of writing software like this for their competitors and have worked with this software in the past as part of my duties as manager of development at the IUPUI Testing Center (that's Indiana University Purdue University Indianapolis). We've worked on this shit for about 10 years now.
One of my tasks in the past was to push this type of software onto the local schools. We've used it for rating in-class essays.
The idea is that everyone knows that the only way you get better writing is to write more and get some feedback on it. It doesn't matter if you are an educator throwing the papers back at the student or a computer algorithm. It all forces the student to find the mistakes and not make them the next time.
The problem isn't getting students to write more, it's getting educators to grade more. There aren't enough hours in the day. So this is where this type of software comes into play -- you assign 2x the work you can normally handle, and let the computers handle half of it. You don't tell the students which assignments will be computer rated. Thus the students' grades got better. Not much better, but they were better than those of students not using the system in the same types of classes.
One of our smaller studies actually had us installing this software locally for instant feedback. The improvement was a small percentage, but the students' work was even better than before.
Yeah, you 'steal a copy' if you can't seem to get one given to you, and run your essay through it until it likes it. How is this cheating or in any way underhanded? One of the better and far more dedicated educators I know actually allows students to hand in papers and have them marked up as many times as they want until the paper is due. His students' final works are generally light years ahead of those of other educators in his facility who don't have the dedication (and for $26k a year, do you really want to give up your nights and weekends?).
Same thing here.
Shit, even using the grammar checker under work will force you to learn to write better (up to a point). You learn what it's looking for and you avoid it. I'm not a good speller, and I know the spell checkers help me learn after I hit the same error over and over again.
All these tools work for you in the learning process as long as you are willing to not just put this stuff on autopilot.
As for the title of this thread -- Lawsuit? The only lawsuits will come from idiots. None of the high-stakes testing uses purely computer rating. They all put humans into the equation. You will most likely get a better rating because instead of having three or four humans look at your paper for 30 seconds each before moving on, you will now have one who is able to devote some serious time to it. All these humans will still be working just as many hours as before, but studies have shown that the eyeballs stay on the paper longer with this type of software than without (sorry, but these studies belong to the bigger testing companies or I'd post links...I just get paid to crunch the data).
Secondly, 5 years ago, when I was working on this stuff full time, human agreement was around 62% with a rater pool of 3 raters. Meaning that if you asked 3 people what they thought, took an average of this, and then asked a fourth, the fourth human agreed with the others 62% of the time. This was on a 12-point scale. The application, however, actually agreed with the pool 70-80% of the time, depending on the model used.
In both cases, the raters were all trained together with the same things to look for, and the models were designed around this rater pool -- in a sense trying to simply guess at what the others would pick. The computer:human agreement was higher than the human:human agreement.
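For anyone wondering how a number like that gets computed, here is a rough Python sketch. The scores are made up, and counting "agreement" as landing within half a point of the pool's average is my assumption; the actual studies may have used exact or adjacent agreement instead. The name `agreement_rate` is just for illustration.

```python
from statistics import mean

def agreement_rate(pool_scores, candidate_scores, tolerance=0.5):
    """Fraction of essays where the candidate lands within `tolerance` of the pool's average."""
    if len(pool_scores) != len(candidate_scores):
        raise ValueError("need one candidate score per essay")
    hits = sum(
        abs(mean(pool) - candidate) <= tolerance
        for pool, candidate in zip(pool_scores, candidate_scores)
    )
    return hits / len(candidate_scores)

# Made-up scores on a 12-point scale: three pool raters per essay.
pool = [(8, 9, 8), (4, 5, 6), (11, 10, 12), (7, 7, 6)]
fourth_human = [8, 7, 11, 9]   # hypothetical fourth rater
software = [8, 5, 11, 6]       # hypothetical model output

print(agreement_rate(pool, fourth_human))  # 0.5
print(agreement_rate(pool, software))      # 0.75
```

Since the model is built to guess what this particular pool would pick, it isn't too surprising that it tracks the pool's average more closely than one more human does.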
Back to the parent post, beating the system only means you beat learning to write.
BTW -- My post is not indicative of my writing skills outside of a conversational and informal setting, sans spell checker and proof re
A friend of mine who teaches Biology said that she saw some pretty bad essays that she would have given a poor grade because the English was atrocious, but she had to follow the grading rubric and give them high scores because the keywords were present.
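A keyword rubric like that is easy to caricature in a few lines of Python. This is only a sketch of the failure mode, not her actual rubric: the keyword list, the 6-point cap, and the name `rubric_score` are all made up for the example.

```python
import re

def rubric_score(essay, keywords, max_score=6):
    """Award points for the share of rubric keywords present, ignoring everything else.

    Keywords are expected in lowercase.
    """
    words = set(re.findall(r"[a-z]+", essay.lower()))
    found = sum(1 for keyword in keywords if keyword in words)
    return round(max_score * found / len(keywords))

keywords = ["mitosis", "chromosome", "spindle", "nucleus"]  # hypothetical rubric
coherent = "During mitosis the copies of each chromosome are pulled apart."
word_salad = "Spindle the nucleus mitosis because chromosome of purple."

print(rubric_score(coherent, keywords))    # 3
print(rubric_score(word_salad, keywords))  # 6
```

The nonsense sentence beats the readable one because the only thing being counted is which words show up.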
This was the point I was going to make, too.
Nobody wants to read 5-paragraph themes all day long, even if they do get the point across. They are just a means to an end.
One of the best English teachers I've ever had would point to the use of alliteration, clever turns of phrase, humor, novel word choice (not just synonym-madness), and other completely subjective facets of writing as some of what makes the written word worth reading.