Indiana First With Computerized Grading
Mz6 writes "Computerized grading has been talked about previously, however, the New York Times reports that Indiana has become the first state to grade high school English essays by computer. The computerized grading process, called 'e-rater', uses a 6-point rating scale and uses artificial intelligence to 'mimic the grading process of human readers'. The system was tested over a 2-year pilot program and produced results virtually identical to those of trained readers. The big question is, will other states begin to emulate Indiana by tossing human grading?"
Is this program available under an open source licence? It sounds really interesting!
It would have been my goal to make the most wrong essay I could that would still generate a good grade from the system.
/bin/fortune | slashdotsig.sh
I can't wait til someone figures out how to google bomb the grading system.
I wonder if it will be as simple as repeating a high ranking sentence?
I live in Indiana, and I have taken these. They are not graded fairly, and they determine 10% of the final grade. A computer can obviously not grade essays fairly, so it shouldn't be done. I got a 5/6, which, according to the computer, was extremely good. However, this was an 83%, which brought down my grade significantly. This computerized grading isn't fair at all.
SPAM filters are tricked all the time depending on the text of an email. Google was f'd up not too long ago because of trackback linking in blogs screwing up their algorithms. Isn't this a similar situation? If a student can figure out a way to beat the grader, we'll have students learning to write to beat software, not to write a well-written essay.
- gtaluvit (prnc. GOT-tuh-LUV-it)
My alma mater graded most of my computer programs with shell scripts and I graduated in 1997. So I don't think Indiana is the first to do that.
Many geeks like me did not like English class for the simple reason that grading was entirely subjective to the teacher's tastes.
If he or she didn't like what you wrote, or took a point of view opposite to theirs, you would get a lower grade. Frequently, the "special" students would get the benefit of the doubt, and easy grading just for exceeding their own limitations. An 'A' paper in one English class could be a B- in another, etc, etc.
With this computer grading, these students now know that they will be treated equally, and not bitch about potential human biases. Then, everyone will have a fair shot.
Slashdot Moderation: From positive to terrible in 2 "insightful" posts.
As a student the first thing I would hand in is twenty paragraphs from refreshing this. In political science class, of course.
--
My username: hats off to George Carlin, and fuck the FCC. Freedom!
Maybe this says more about the readers than the computer program?
I disagree. I see lawsuits as no more likely. Furthermore, any process where you're subjectively evaluating something has to have quality controls and an appeals process. My wife once held a part-time job grading essay questions on a high school exit exam. Every few hours of grading exams, she would have to take and pass a "calibration battery" of 10 exams. Quality control is fundamental to the process.
What I see as being problematic is kids learning to beat the system. Typically these systems are predicated on grammatical analysis (use of punctuation and sentence completeness) and evidence of citing the text the question is based on. I'd bet it's a real easy system to beat.
Indiana parents are the first to buy (en masse) licenses for Essay Constructor Pro v2.0. The software produces essays that are indistinguishable from those written by real students, using the latest screen-scrape-from-Internet 'n' plagiarism-from-non-credible-sources techniques.
Indiana Director of State Board of Ed comments: "Isn't it wonderful how technology is improving education?"
Fred
"A fool and his freedom are soon parted"
-RMS
-- I was raised on the command line, bitch
How are these systems supposed to scribble in the margins and tell you your ideas don't fit together?
How do they judge the content? What if you submit an excellent paper on Middle Ages history but the assignment was on socialism?
Human feedback is required in order to learn how to write well; you can't just expect a machine to tell you how to improve your writing. Grammar perhaps, but not ideas and how to let them flow coherently.
In order for these students to get that feedback someone has to read it, and since they're reading it anyway, why not just grade it then?
Seems like they are trying to solve the wrong problem with this system, or a problem that doesn't exist. (Are there really so many papers to mark that you need a machine to do it?)
I wonder how long until the FBI is linked to this system?
Grammar, 90%
Spelling, 95%
Patriotism, 80%
also:
I'd love to see famous writings graded by this system.
I believe that (English) essay grading is harder than grading science exams based on problem solving (no bubbles please), at least if essays are about content and not just grammatically correct sentences.
I say this because there is an objective criterion for grading the solution to a physics or math problem: correctness. For essays I do not believe that we (and the current state of AI) can come up with an exact criterion like that. You might determine whether an essay is too different from essays which were written by experts, but can't a very different essay be just as good?
To my knowledge, AI programs can solve physics problems which are limited to some well-defined domain (for example: http://www.cs.utexas.edu/users/novak/cgi/isaacdemo.cgi though that is from 1977...). I am not aware of programs grading physics problem solutions.
I will accept an essay grading program after they grade solutions to math and physics problems.
I conjecture that some writers would feel offended if their essay did well according to the program: they might think it means they are too conformist and conservative and not novel in their approach...
Matyas
The real answer is to adjust your teaching methods per student based on subjective analysis. A low objective mark on a paper or test would indicate that, ideally, the teacher needs to pay closer attention to the needs of that student, and teach him in a way he can learn. (VERY few teachers have the time and skill required for this, unfortunately).
You can't grade subjectively because those grades will be compared objectively down the line. You can't say "this is pretty good for Kevin, I'll give him an A", but then say "Josh's paper is way better than Kevin's paper, but Josh is a bright kid, so I'm giving him a C". Kevin will think he's mastered the English language while Josh will go insane trying to achieve perfection.
Grading, when used for anything other than helping the teacher learn about each student, just plain sucks, and is only used for competition.
As a high school student, in spite of being an honor student, getting accepted at a top Liberal Arts college, scoring well on the SATs, and taking AP English, I scored in the 53rd percentile on my human-scored state writing exam. After a little investigating, I found that these tests were graded by the hundreds, if not thousands, by teachers trying to supplement their meager incomes. I'm sure that it's pretty hard to read carefully and make sure that all exams are graded fairly and equally when there is no accountability, and no way for students to challenge the results. At least this way, all students in the state will be graded by the same algorithm.
Darn straight. I'm all for making dumb people feel not so bad about things they can't control - like being dumb - but not at the expense of making the non-dumb feel dumb. There's a lot to be said for working smarter, not harder, and a lot to be said for those who have figured that out on their own. English provides several correct ways of communicating something and some incorrect ones. An "A" paper is an "A" paper, regardless of the source and effort put in by said source.
Edsger Dijkstra would no doubt have something profoundly and humorously offensive to say to the writers of this software ;-) There doesn't seem to be anyone to take over his mantle, which is presumably why the software industry is going to pot at an ever increasing rate. sigh.
Exactly! That's why a writer or journalist has an editor. A good, interesting writer who makes a few minor spelling, grammar or syntactic errors on occasion is going to have no problem finding work, while someone who never makes that type of mistake but whose writing is just plain boring never will. That's why the pros have editors: to catch the small things and offer suggestions on how to improve what's there. A good editor can take a good work with some errors, eliminate those errors and offer suggestions on how to make the piece great. A good editor cannot, however, take a boring piece and make it great. The work, ultimately, has to come from the writer. An editor can help improve a work, but he can't transform it, because that would make him a co-author of the material. Ultimately a good writer has to produce interesting and engaging material. Grammar and spelling and syntactic errors can be fixed. If you are incapable of producing something interesting and engaging, there isn't an editor on the planet that can help you.
If it doesn't already, I would expect a service like this will eventually include plagiarism detection, due to marketing pressure if nothing else. This is something that human graders do, at least over the space of papers they grade and works they remember.
But if plagiarism detection is added, then the grading service would have to make and retain some encoding of each graded paper, a derivative work, in its database.
Once that happens, the grading service also becomes subject to all of the issues already raised with services like TurnItIn.com, already discussed here.
I also found this comment from ETS's site rather strange, to say the least:
Yes, you can trick it. From the e-rater article:
"Experienced writers, teachers, and writing assessment specialists have tested e-rater to determine the extent to which it "understands" the content of essay responses. Some of these writers have submitted essays that have tricked e-rater into giving a score even though the essay does not make any sense. The individual words in these "challenge essays" are grammatically correct, but they are strung together in such a way that they create nonsense sentences."
That observation shouldn't be surprising because earlier it says: "An e-rater score will be most beneficial to students who make a good faith effort at using it to improve their writing skills."
The program works (grossly oversimplified) by mimicking the grading of humans on essay samples.
I just signed up for a userid so I can take the exam online, but after submitting my info it said I may have to wait up to two days to get an account.
Curious that they can grade essays with a computer but it looks like they have to have a human pass out the user ids.
Anyway, I'll see if I can submit one of my articles to the exam, and will post here how I did. Since I have to wait for my user ID, you'll have to look back here later to see how I did.
Actually, I've been a part of writing software like this for their competitors and have worked with this software in the past as part of my duties as manager of development at the IUPUI Testing Center (that's Indiana University Purdue University Indianapolis). We've worked on this shit for about 10 years now.
One of my tasks in the past was to push this type of software onto the local schools. We've used it for rating in-class essays.
The idea is that everyone knows that the only way you get better writing is to write more and get some feedback on it. It doesn't matter if you are an educator throwing the papers back at the student or a computer algorithm. It all forces the student to find the mistakes and not make them the next time.
The problem isn't getting students to write more, it's getting educators to grade more. There aren't enough hours in the day. So this is where this type of software comes into play -- you assign 2x the work you can normally handle, and let the computers handle half of it. You don't tell the students which assignments will be computer rated. Thus the students' grades got better. Not much better, but they were better than those of students not using the system in the same types of classes.
One of our smaller studies actually had us installing this software locally for instant feedback. It was a small percentage, but the students' work was even better than before.
Yeah, you 'steal a copy' if you can't seem to get one given to you, and run it through until it likes it. How is this cheating or in any way underhanded? One of the better and far more dedicated educators I know actually allows students to hand in papers and have them marked up as many times as they want until the paper is due. His students' final works are generally light years ahead of those from other educators in his facility who don't have the dedication (and for $26k a year, do you really want to give up your nights and weekends???).
Same thing here.
Shit, even using the grammar checker under work will force you to learn to write better (up to a point). You learn what it's looking for and you avoid it. I'm not a good speller and I know the spell checkers help me learn after I hit the same error over and over again.
All these tools work for you in the learning process as long as you are willing to not just put this stuff on autopilot.
As for the title of this thread -- Lawsuit? The only lawsuits will come from idiots. None of the high-stakes testing does purely computer rating. They all put humans into the equation. You will most likely get a better rating because instead of having three or four humans look at your paper for 30 seconds each before moving on, you will now have one that is able to devote some serious time to it. All these humans will still be working just as many hours as before, but studies have shown that the eyeballs on the paper are there longer with this type of software than without (sorry, but these studies belong to the bigger testing companies or I'd post links...I just get paid to crunch the data).
Secondly, 5 years ago when I was working on this stuff full time, human-to-human agreement was around 62% with a rater pool of 3 raters. Meaning that if you asked 3 people what they thought, took an average of this, and then asked a fourth, 62% of the time the fourth human agreed with the others. This was on a 12 point scale. The application, however, actually agreed with the pool 70 - 80% of the time, depending on the model used.
In both cases, the raters were all trained together with the same things to look for, and the models were designed around this rater pool -- in a sense trying to simply guess at what the others would pick. The computer:human agreement was higher than the human:human agreement.
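For anyone who wants to see what that kind of comparison looks like, here is a rough Python sketch of the arithmetic: average a pool of three raters per essay, then count how often a fourth rater (human or software) lands on that average. The function name, the `tolerance` parameter, and all the scores are made up for illustration, and "agrees" is assumed to mean "matches the rounded pool average", since the post doesn't say how a match was defined. This is not the testing companies' actual procedure.

```python
from statistics import mean

def agreement_rate(pool_scores, candidate_scores, tolerance=0):
    """Fraction of essays where the candidate's score matches the pool consensus.

    pool_scores: one tuple of rater scores per essay (e.g. three raters).
    candidate_scores: one score per essay from the rater being checked
        (a fourth human, or the software).
    tolerance: how far from the rounded pool average still counts as agreeing
        (an assumption; the post does not define a match).
    """
    if len(pool_scores) != len(candidate_scores):
        raise ValueError("need one candidate score per essay")
    hits = 0
    for pool, candidate in zip(pool_scores, candidate_scores):
        consensus = round(mean(pool))  # pool average, rounded to a whole score
        if abs(candidate - consensus) <= tolerance:
            hits += 1
    return hits / len(pool_scores)

# Invented scores on a 12-point scale, purely for illustration.
pool = [(8, 9, 8), (5, 6, 4), (11, 10, 11), (7, 7, 9), (3, 4, 3)]
fourth_human = [8, 7, 11, 6, 3]
software = [8, 5, 11, 8, 4]

human_rate = agreement_rate(pool, fourth_human)
software_rate = agreement_rate(pool, software)
print(f"human:human {human_rate:.0%}, computer:human {software_rate:.0%}")
```

With these invented numbers the fourth human matches the pool on 3 of 5 essays and the software on 4 of 5. A model fitted to the pool's own ratings can end up closer to the pool average than any single outside rater, which is the effect described above.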
Back to the parent post, beating the system only means you beat learning to write.
BTW -- My post is not indicative of my writing skills outside of a conversational and informal setting, sans spell checker and proof re
A friend of mine who teaches Biology said that she saw some pretty bad essays which she would have given a poor grade because the English was atrocious, but she had to follow the grading rubric and give them high scores because the keywords were present.
This was the point I was going to make, too.
Nobody wants to read 5-paragraph themes all day long, even if they do get the point across. They are just a means to an end.
One of the best English teachers I've ever had would point to the use of alliteration, clever turns of phrase, humor, novel word choice (not just synonym-madness), and other completely subjective facets of writing as some of what makes the written word worth reading.
I should have patented it. My high school issued report cards that were printed on fan-fold forms from a dot-matrix printer. They paid "student helpers" to feed the printer. I also paid the "student helpers" for the leftover blank forms at the end of a print run. Our high school administrators decided they would save on postage by handing out the report cards in homeroom and announcing this fact in the local newspaper.
Using my TRS-80 Color Computer and DMP-100 dot matrix printer, I offered an alternative scholarship program. For a $10 fee, I would print a report card that was identical to the real ones, except for the grades. My "clients" would take their real report card, pencil in their new grades, and my computerized grading system would do the rest. Each kid would go home and say that he forgot his report card in his locker and would bring it home tomorrow. I would deliver the new & improved report cards the next day and all was well.
The "official" grades remained unchanged, so it was up to each client to avoid flunking courses that would prevent graduation. Anyone who failed a mandatory course was ineligible for my "service". One client tried to blackmail me into providing the service for free, but I said, "Just try and get someone to believe that report cards are being manufactured in a student's house."
The only disappointment I had was when some kids decided to publish an underground newspaper. I wanted to take out an ad, and they refused.