Automated System Developed To Grade Student Essays
RougeFemme points out this story at the Times about software that can be used to grade student essays and offer almost instant feedback. "Imagine taking a college exam, and, instead of handing in a blue book and getting a grade from a professor a few weeks later, clicking the 'send' button when you are done and receiving a grade back instantly, your essay scored by a software program. And then, instead of being done with that exam, imagine that the system would immediately let you rewrite the test to try to improve your grade. EdX, the nonprofit enterprise founded by Harvard and the Massachusetts Institute of Technology to offer courses on the Internet, has just introduced such a system and will make its automated software available free on the Web to any institution that wants to use it. The software uses artificial intelligence to grade student essays and short written answers, freeing professors for other tasks."
One of my kids had something like this: not for English, but for physics.
The teacher couldn't be bothered to assign and grade proper homework.
Instead, he fobbed the kids off onto a web app.
- go to the site
- get a problem
- solve the problem
- type in the numerical answer
- right answer? go on to the next problem
- wrong answer? try again
The web app allowed maybe 0.5% margin for rounding error, and you got 5 tries before it failed you on that problem.
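A rough sketch of the checking rule described above, for anyone curious how little is going on under the hood. This is not the actual web app's code; the function names, the 0.5% relative tolerance, and the 5-try limit are assumptions taken from my description, and the handling of an expected answer of zero is my own guess.

```python
def within_tolerance(submitted, expected, rel_tol=0.005):
    """Return True if submitted is within rel_tol (0.5% by default) of expected."""
    if expected == 0:
        # Assumption: fall back to an absolute tolerance when the answer is zero.
        return abs(submitted) <= rel_tol
    return abs(submitted - expected) <= rel_tol * abs(expected)


def grade_problem(expected, attempts, max_tries=5, rel_tol=0.005):
    """Check submitted answers in order; return (passed, tries_used).

    Only the first max_tries attempts count. The problem is failed if
    none of them lands within the tolerance.
    """
    for tries_used, submitted in enumerate(attempts[:max_tries], start=1):
        if within_tolerance(submitted, expected, rel_tol):
            return True, tries_used
    return False, min(len(attempts), max_tries)
```

Note that the only thing the student ever gets back is pass or fail plus a try count, which is the whole complaint.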
It sounds reasonable in the abstract, but in practice it was utterly wretched.
All learning is, at some level, an interaction--a conversation--between student and teacher.
Even if it is nothing more than a red check mark or a red X on a homework paper,
you have communicated some thing to some person and gotten some response.
You don't realize how important this is until it is gone.
With nothing but a machine to talk to, it stops being about learning.
It is just about satisfying the machine by whatever means necessary.
In his rage and frustration, my son told me that the easiest way to solve the problems was to copy and paste the problem text into Google.
This would reliably return the general formula for solving that problem;
plugging in the numbers that the web app had generated for your instance of the problem would then yield the correct answer.
By the end of the school year, I was telling him that if he didn't want to deal with the web app, he should use Google to get his grade,
and if he wanted to learn physics, I would teach it to him.
Automated essay grading is going to be even worse.
There is no point writing prose unless a human is going to read it.
When I want to talk to machines, I write code.
Writing songs, that voices never shared...
-- Paul Simon
Take one lab report for Fluid Mechanics, measure the thickness with a micrometer -- look up the grade on the curve.
Seems like it's a small step from this to having computer algorithms that automatically write your paper for you too - then you can let it go through thousands of submit-edit-submit cycles until the scoring computer gives you a perfect score.
Kind of like the guys that came up with software to generate nonsense scientific papers and actually had a few accepted at conferences and journals.
Why I want to goto Harvard By P Q Student Up up down down left right left right B A
"Your grade is C. To improve your grade in the future, you need to do the following:
use 25-30 words per sentence; include more words from the wordnet entry for the topic of your essay; avoid simplistic or run-on sentences as measured by number of noun and verb phrases detected by our proprietary NLP tokenizer.
As a helpful reminder, our preparatory guides are available as a subscription service and include 100 practice submissions per week; only $29.95 per month."
"They were pure niggers." – Noam Chomsky
Grading is not, or should not be, about the grade, it should be about the feedback that the lecturer gives to the student. Even if the computer can grade an essay well (which I remain to be convinced of, although I am sure I will soon have the chance to test it for myself), there is no claim made about the computer giving useful advice to the student. Can a computer explain how to refine a research question or structure an argument? Sadly, many lecturers don't in fact give good feedback, but we should be looking for ways to enable lecturers to give better feedback, not accepting poor feedback as the norm.
My friend wrote a story about his cat that was grammatically correct and used big words, but made little to no sense. The auto-grader program told him he was approaching PhD-level English. So he took his paper into school and showed it to the English teachers, who reviled it. He was like, "Shows what you know, the computer told me I'm university level."
God spoke to me
Every era has its snake-oil salesmen and their marks. Sadly, in this case it will not be the customers who suffer, but their hapless students.
And you have been awarded 2500 extra HP.
Perelman gives an example of how you can get a high score. The most interesting feature of the algorithm is that it doesn’t care about substance or even truth. It will ignore such trivialities as saying that the war of 1812 began in 1945, provided you say it grammatically. The substance of an argument doesn’t matter, he said, as long as it looks to the computer as if it’s nicely argued.
For a question asking students to discuss why college costs are so high, Mr. Perelman wrote that the No. 1 reason is excessive pay for greedy teaching assistants. “The average teaching assistant makes six times as much money as college presidents,” he wrote. “In addition, they often receive a plethora of extra benefits such as private jets, vacations in the south seas, starring roles in motion pictures.”
E-Rater gave him a [top score of] 6. He tossed in a line from Allen Ginsberg’s “Howl,” just to see if he could get away with it. He could.
http://freethoughtblogs.com/singham/2012/05/03/how-to-fool-a-computer-grader/
Computers suck at even the most basic grammar checking. I once decided to try a bunch of online grammar checkers to see if they would be useful at providing a sanity check for my novels. I concluded that they report so many bogus mistakes that it simply wasn't practical to use their output at all. To test them, I fed them a block of content, some with intentional errors that the grammar checker should have caught, others with deliberately (or accidentally) tricky bits that should not have produced any errors.
And so on. Heck, my phone doesn't even know the difference between "its" and "it's" and tries to auto-correct me into looking like I failed first-grade English. And these folks expect me to believe that computers can feasibly help students learn to write better papers? Give me a break. Maybe in thirty to fifty years (*) we'll get there, but....
* Which many grammar checkers would probably suggest changing to "thirty-two fifty".
It sounds reasonable in the abstract, but in practice it was utterly wretched.
No, the abstract does not sound reasonable: as with most things online, you can always find bad ways to do it. I'm a physics prof working as part of a team to develop an open source, algebra-capable question and content system. However, even the current capabilities of something like Moodle (which is Open Source) are far in excess of what you describe. You can type in multiple "answers" to a problem and have the student get feedback and a partial grade if they get the problem wrong in a way that you managed to guess. Obviously, if they find a new way to get it wrong, then they will not get feedback.
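To make the multiple-"answers" idea concrete, here is a minimal sketch of how anticipated wrong answers can carry partial credit and feedback. This is not Moodle's API; the function name, the answer-key layout, the tolerance, and the example problem (speed of a ball dropped from rest after 2 s, with g = 9.81 m/s^2) are all hypothetical and chosen only for illustration.

```python
import math


def grade_with_feedback(submitted, answer_key, rel_tol=0.005):
    """Return (credit, feedback) for the first key entry the answer matches.

    answer_key is a list of (value, credit, feedback) tuples: the correct
    answer plus any wrong answers the instructor managed to anticipate.
    An unanticipated answer gets no credit and no feedback.
    """
    for value, credit, feedback in answer_key:
        if math.isclose(submitted, value, rel_tol=rel_tol):
            return credit, feedback
    return 0.0, None


# Hypothetical key for v = g * t with g = 9.81 m/s^2 and t = 2 s.
key = [
    (19.62, 1.0, "Correct."),
    (9.81, 0.5, "That is the acceleration; multiply by the elapsed time."),
    (39.24, 0.5, "Check whether you used t squared instead of t."),
]

print(grade_with_feedback(9.81, key))
```

The limitation mentioned above shows up as the final `return 0.0, None`: a mistake nobody guessed in advance gets nothing useful back.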
Commercial systems go even further, with the student having the option to click on a help button which can break the question into steps for the student to complete in order to guide them through to the right answer. This can be configured to give a grade penalty at the choice of the instructor - this is one of the features we want to add to an Open Source solution.
However, even with current Moodle capabilities you can build a system that, I would argue, is better pedagogically for many physics problems (those with numerical or symbolic responses) than paper-graded assignments because, with an online system offering some feedback and multiple attempts, the student is encouraged to keep trying until they figure out how to get it right. This encourages them to think out the solution themselves, whereas with a paper assignment they get one try and are then given the answer. To make this work, though, you need some means for students to come and talk to you and/or TAs to provide some help towards getting the right method. So you still need the student-teacher interaction, but computers can provide a first line of contact and so let a teacher help more students.
That being said, I find it exceedingly unlikely that this EdX system can work for written responses beyond checking that the English is good. For physics, how can it possibly know that the statement "the Higgs boson has a mass of 140 GeV/c^2" is wrong and "Dark Matter does not interact with photons" is correct? To be able to grade, it will have to know a huge amount of information about a massive range of topics - and looking this stuff up on Google is not an option, given all the crazy people and their wacky physics theories which they stick on a web page.
Face it, we're all going to get replaced by Expert Systems. They talked about this in the 80s and you didn't believe. 95% of us follow pretty simple patterns. There's damn little that most of us can do that a machine can't. Sure, there are exceptions. But most of us don't qualify, we just think we do.
You're being replaced. The real question is how are you going to deal with it? What do we do when 95% of us are completely unnecessary?
http://www.documentcloud.org/documents/346138-essay-awarded-a-top-grade-by-e-rater.html
All your GRE essays are evaluated by a machine and have been for years -- the e-rater. http://www.ets.org/research/topics/as_nlp/writing_quality/
The rating is also done by humans. It works well in practice and ensures that essays are graded fairly. If there is a significant discrepancy between the two ratings for an essay, that essay is examined further by another specialist. It prevents students from being victims of someone having a bad day at the office, and also does not encourage writing an essay to beat a machine.
The significance of the EdX news is not the concept of automated grading; it is that such software is now free and open source.
"Isn't their JOB to TEACH?" Not completely, sometimes barely at all. At an R1, the typical humanities appointment is 25-40% teaching, 50% research, and the balance to service. Some faculty may only teach one class a semester, if they're administrating a department or subdivision of a department, or if they're running a onerous committee, like a hiring committee. At a teaching school, your "main" job is teaching, but you're still required to produce some token level of research and serve the university in other ways, such as by working on committees, being a public figure, and other stuff that you might not consider right away. So, at my job, at a teaching school, about 70% of my time goes into teaching. The rest goes into mandatory requirements to publish, present papers, do committee work, assist developing colleagues, and perform community service. (Note that in my annual performance review, I'm only allowed to indicate that teaching was a maximum of 60% of my effort, and this at a teaching school. This may be atypical, but I suspect it's not.) Now, in the sciences there are faculty with no teaching requirements. And in the humanities, at R1 schools, faculty get a year or a semester off periodically during which time they are expected to complete a research project, typically a book.