Computers Could Grade Essay Tests Better Than Profs
An anonymous reader writes "Robot essay graders could be the answer to grade inflation. New software being tested turns over the task of grading to computers — this article has an interactive demo of the software. One professor says the computer is far fairer than human graders, who get tired and become inconsistent, or play favorites."
I once got an F on a paper from a TA who wrote in the margins "How dare you try to say what Shakespeare was thinking!" Um, that's what literary analysis IS, to some extent. You try to place someone's written works within the context of their culture and society at large and reconstruct their thought processes and views on the world. But that TA was an asshole and had it out for me, and many of us complained about him bitterly for years afterward. The only person who got an A in that entire section was one cute girl.
As long as the robo-grader also includes a plagiarism check, I'd be okay with it. My husband is a professor and most of his failed papers are a result of TurnItIn.com catching outright plagiarism.
Occasionally living proof of the Ballmer peak.
I had a prof in literature who only graded well if you made your critical essay about sexual imagery. At one point I gave up trying to "be me" and went whole hog, way overboard, almost parodying the oversexualized essay. And I scored an "A" for the first time. Lesson learned? Sometimes it's OK to tell the boss what he wants to hear and do it his way, as long as it doesn't cost you anything, and nobody gets hurt. And, of course, life's not fair.
I can't say if a computer is better than a human at marking, but in my engineering subjects, when my name was on the test papers I did not get very good grades (actually at least a grade lower than expected). But as soon as all the students were given anonymous numbers, my grades went up. Conclusion: the staff could no longer decide to give better grades to their pet students. So in theory, there could be many students who would get better grades once there is no more favouritism.
Take Nobody's Word For It.
My essay grades in college humanities courses were terrible until I started trying to figure out the political slant of my professor (or TA if the TA is the grader) and wrote papers supporting those views (and to be fair, those views weren't always left-leaning ones). I went from a C paper student to a low-A paper student in the blink of an eye.
Consistency is a fair point, but playing favorites? Isn't this what anonymous marking codes/IDs are for? (Or at least, that's what happens in the majority of universities in the UK)
but it really needs to check for plagiarism. I saw a load of it up at Colorado State.
In addition, it would ideally be able to handle lab books. I remember grading micro-bio 201 lab books back in the '80s, and I was getting tired after the first 30. The second 30 were a pain. The last 30, well, we finished the grading at a pizza joint over beer. I suspect that is how grade inflation happens.
I prefer the "u" in honour as it seems to be missing these days.
Not who, what.
Blank until
The best student. Duh.
We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
What's up with the mass media headlines? Reading the summary actually makes me dumber. It talks about "computers" as if they were sentient and graded the tests themselves. Having professors first strictly define the rules, enter them into software, and have a computer evaluate those rules is still "professors grading the essays". It's self-evident that the grading is better if it's more strictly defined.
Wow, I can build a house faster with this hammer. Headline: Hammers Could Build Houses Faster Than Construction Workers (In Cyberspace)
Unless they've made some impressive advances in natural-language interpretation in the past few years that haven't trickled out into other products, I'm a bit puzzled as to how this scheme is supposed to work.
Even the (comparatively much easier) tasks of spelling and grammar checking result in a fairly steady stream of mistakes from computer systems. I can't exactly summon much optimism for the likely outcome of such a system trying to distinguish between a paper with a well-supported thesis and a paper that contains some declarative statements, a few quotations, and the word "therefore" at intervals.
On the plus side, it should be pretty trivial to get the machines to do the same lousy job without the slightest consideration of the student's name/status/cuteness/willingness to flatter the professor; but what use is purely objective execution of lousy work?
I'm pretty sure no program is capable of this (and I've got a PhD in natural language processing). They might be able to check for a couple of easily scored factors, such as number of words and consistency between paragraphs, but I'm pretty sure that there is no program that could distinguish between an essay and the same essay altered so that its reasoning rests on false assumptions. I think someone left out a pretty important assumption: such programs might be able to score more fairly (meaning: with less bias!), provided the students did their best.
First, for those who didn't read TFA, computers play only a small role on a handful of essays. Most of the article is in reference to having a 3rd party grade anonymized tests, rather than leaving it to the professor or TA. During college, I had a job as one of those graders.
We worked for five hours a day in the evening, though we could leave early and get the full pay if we finished all our papers. Most of the tests would be on general topics, but occasionally we'd get tests that required specific knowledge. In those cases, only qualified graders could review them, and we were given cheat sheets to make sure we didn't make factual mistakes. Essays were generally graded on a 1-5 scale (or a 0 if the essay was a blank page or similar). Each essay would be graded by two people, with a third breaking the tie in the event of a disagreement. However, we were trained to be extremely consistent in the grading, so disagreements were rare and never more than a one-point difference.
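For the programmers reading, the two-readers-plus-tiebreaker rule can be sketched roughly like this. It is only an illustration: the function name `resolve_score` and the `third_reader` callback are made up here, and I am assuming the third reader's score simply becomes the final one, which may not be exactly how every grading shop handles a disagreement.

```python
from typing import Callable

MIN_SCORE, MAX_SCORE = 0, 5  # 1-5 scale, with 0 for a blank page or similar


def _check(score: int) -> int:
    """Reject anything outside the 0-5 range described above."""
    if not MIN_SCORE <= score <= MAX_SCORE:
        raise ValueError(f"score {score} is outside {MIN_SCORE}-{MAX_SCORE}")
    return score


def resolve_score(first: int, second: int, third_reader: Callable[[], int]) -> int:
    """Return the final score for one essay read by two graders.

    If the two readers agree, that score stands. If they disagree, a third
    reader is consulted. ASSUMPTION: the third reader's score is final.
    """
    first, second = _check(first), _check(second)
    if first == second:
        return first
    # The third reader is only called when there is an actual disagreement.
    return _check(third_reader())
```

Passing the third reader as a callback means the extra read only happens when it is needed, which matches disagreements being rare.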
A few times a day, we would get fake essays intended to test our grading skills. For example, an essay that was supposed to be a perfect example of a 4 would be given to you with all the rest. If you gave it a 4, you got +1 point. If you gave it a 3 or 5, you got zero points. If you gave it a 2 or less, you lost a point. If you accumulated a lot of points, you got a bonus of up to 50% of your pay. If your total score went too negative, you got fired.
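Here is a rough sketch of that point system in code. The names `calibration_points` and `running_total` are hypothetical, and I am assuming the rule for a "perfect 4" generalizes to any expected score as exact match, off by one, or off by two or more. The bonus and firing cutoffs are left out on purpose since I am not giving exact numbers for them.

```python
from typing import Iterable, Tuple


def calibration_points(expected: int, given: int) -> int:
    """Points earned for one planted test essay.

    ASSUMPTION: the "perfect 4" example generalizes by distance from the
    expected score: exact match +1, off by one 0, off by two or more -1.
    """
    distance = abs(expected - given)
    if distance == 0:
        return 1
    if distance == 1:
        return 0
    return -1


def running_total(checks: Iterable[Tuple[int, int]]) -> int:
    """Sum the points over (expected, given) pairs for one grader."""
    return sum(calibration_points(expected, given) for expected, given in checks)
```

For the example above, `calibration_points(4, 4)` gives 1, a 3 or a 5 gives 0, and a 2 or lower gives -1.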
It was a pretty good job, as crappy part-time "work your way through college" jobs go. The best part was whenever we got to grade essays by little kids. They were harder to score accurately -- it's hard to look past the abysmal handwriting and frequent misspellings. But they were frequently adorable and unintentionally hilarious.