Australia To Grade Written Essays In National Exam With Cognitive Computing
New submitter purnima writes: Australia keeps on giving and giving. Each year school kids in Australia sit The National Assessment Program (NAPLAN) which in part tests literacy. The exam includes a written page-long essay aimed at examining both language aptitude and literacy of students. Of course, human-marking of such essays is costly (twenty teacher-minutes per exam). So some bright spark has proposed that the essays be marked by computer. The government is convinced and the program is slated for the 2017 school year. Aside from the moral issues, is AI ready for this major task?
AI is not ready to do this task properly, but, at least in the US, human grading has sometimes been dumbed down to the point where you would not even need current 'AI' to do as well, as Prof. Perelman of MIT has demonstrated, e.g.: http://www.bostonglobe.com/opi...
Each year school kids in Australia sit The National Assessment Program (NAPLAN) which in part tests literacy.
Can we get this AI to test Slashdot summaries?
Tic-Tac-Toe, Global Thermonuclear War, and relationships all have the same winning move.
So is human-writing. Maybe we should have AIs take the test for us, too.
Sounds like some politicians are buying an expensive lesson in what can and can't be automated by computer on their tax payers' dime.
Here in the US it's the military that usually serves that particular function but Australia has their schools doing it.
I can't wait for some clever student to figure out they can game the system and write a totally incoherent paper that the computer gives perfect marks.
First, the content of the essay shouldn't matter at all, so no understanding of the text is needed.
Second, checking grammar, spelling and general literacy isn't new - there are already programs for all three that do an okay job.
Third humans needn't be removed entirely. Outliers can be checked/graded manually.
Of course there will be chances to cheat the system. But IMHO the effort to cheat a "dumb AI" should be similar to or harder than actually writing a text in the first place.
Therefore the only task of those who write software to grade essays is to ensure that the variation of the machine is no worse than the variation of the humans. There is some success in this. edX has a module that will grade essays. As far as I know, the value in this is quicker and more uniform feedback for practice essays. Of course humanities majors, who generally have minimal understanding of advanced technology, hate it. This, of course, includes journalists.
This is not to say that computer-graded essays are going to be as good an assessment as human-graded essays. However, they may be good enough, and better than other objective measures, such as fill-in-the-bubble tests. In fact anything that minimizes the cost of open-ended free-response assessment is going to benefit everyone. Securing multiple-guess tests is very expensive, and their value is highly questionable. They tend to overestimate the value of students who have vague passive knowledge, and underestimate the value of those who have an ability to actively apply knowledge.
"She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
Adverb clause, independent clause conjunction independent clause dependent clause. Subject, adjective clause, verb prepositional phrase? Participle phrase subject verb conjunction dependent clause!
Emoticon.
We don't have a state-run media we have a media-run state.
Maybe somebody will come out with a set of templates for generating grammatically correct essays for NAPLAN exam questions, sort of like Mad-libs.
Eye halve a spelling chequer
It came with my pea sea
It plainly marques four my revue
Miss steaks eye kin knot sea.
Eye strike a key and type a word
And weight four it two say
Weather eye am wrong oar write
It shows me strait a weigh.
As soon as a mist ache is maid
It nose bee fore two long
And eye can put the error rite
Its rare lea ever wrong.
Eye have run this poem threw it
I am shore your pleased two no
Its letter perfect in it’s weigh
My chequer tolled me sew.
Sphinx of black quartz, judge my vow.
Since machines cannot yet understand the semantics of complex English text, they will use some simplistic rules as a substitute. These rules will be things like "average sentence length" and other such metrics, which as soon as they are discovered by students, will be used to game the system. Instead of producing essays born of rational and coherent thought, they will instead make them to match the things being measured while being utterly devoid of meaning.
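To make the gaming concern concrete, here is a minimal Python sketch. The scoring rule is entirely hypothetical (nothing is known here about what the real NAPLAN marker measures); it only assumes a grader that rewards surface features such as average sentence length and average word length.

```python
import re

def surface_features(text):
    """Return (average words per sentence, average letters per word)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0, 0.0
    avg_sentence_len = len(words) / len(sentences)
    avg_word_len = sum(len(w) for w in words) / len(words)
    return avg_sentence_len, avg_word_len

def naive_score(text):
    # Hypothetical rule, invented for illustration: longer sentences
    # and longer words earn more points. Meaning is never inspected.
    avg_sentence_len, avg_word_len = surface_features(text)
    return avg_sentence_len + 2 * avg_word_len

coherent = "The dog ran home. It was tired. It slept all day."
gibberish = ("Paradigmatic synergies notwithstanding, multifarious "
             "epistemological quandaries proliferate inexorably "
             "throughout heterogeneous bureaucracies.")

print(naive_score(coherent), naive_score(gibberish))
```

Under this made-up rule the meaningless sentence outscores the coherent one, which is exactly the failure mode described above.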
written page-long essay aimed at examining both language aptitude and literacy of students.
So, the same technology used SO effectively to rank resumes will be used with students. Okay, kiddies, remember to stuff a lot of fancy-pants words into it.
Fail: This is sh*t. Go f*ck yourself. I'm not kissing your ass.
PASS: Subjectively, it is blatantly obvious to this observer that the new paradigm, as a cost-saving measure, was inspired by, and mimics, the natural environmentally safe process of translating organic matter into nutritious compost. This has the outcome of allowing everyone who is in a paid position to devote the time saved to stress-relieving activities such as self-pleasuring, resulting in both a higher awareness of the need to practice good hygiene by such prophylactic procedures as more frequent hand-washing, and use of tissues to properly dispose of organic residue, though it could also negatively impact on their visual acuity over time. Affected students should refrain from overtly engaging in behavior with superior's inferior posteriors to avoid being perceived as having a brown proboscis by their peers, with the associated negative impact on their social placement in the student hierarchy.
"Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
That's because all of us colonials and ex-colonials are burdened with the English factory educational system that was designed to produce bureaucrats for the Empire. The reason computers are capable of grading products of the educational system is because the system is made to create human computers.
Our - US, Australia - educational system needs to be completely changed - not reformed. I think the template to use is Maria Montessori's system. In the future we are going to need creative people who can discover new things and solve problems: not follow rules and memorize things: computers do that better.
Well, if you allow computers to grade essays, then you should allow students access to AI-based tools to generate essays by supplying keywords. Now that is fair competition. In America rich people will buy high-quality essay generators for their school district. In socialist Australia government will supply all students with the same single-payer essay generator. Meanwhile Korean and Chinese parents will dutifully coach their children to memorize multiplication tables all the way to 20 times 20. (My Korean friend was surprised to learn we Indians went only till 16 x 16). Japanese would create essay-gochi, an app that you buy as a child and take care of so that it produces high-quality essays by the time you finish high school. Indians would write project proposals that require technical back-office teams (about three IT techies per student) to create and maintain the essay grading apps.
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
Aside from the moral issues, is AI ready for this major task?
Moral issues aside?!? I'm sorry, but the moral issues are front and center here. Australia is seriously proposing to bore an AI to death, or at least drive it insane, by having it grade hundreds of thousands of grade school essays. This is an outrage!
To mark out or point out the exceptions (possibly horrible, possibly great, anomalies). A tool to the teacher.
Never heard anything about this and I live in Australia. Got to question where the author lives and what their motives are for posting shit like this? Maybe they like the attention....hehehehe
Husband is currently grading final papers for college classes. He slaps them into software that detects plagiarism, then another software that picks out vocabulary level, typos, etc, and assigns a grammar score. Only then does he read it, quickly skimming over it and seeing whether there are citations on the "plagiarized" parts, if there are any, and whether he agrees with the AI score. Nine times out of ten, he does, and he uses the grammar score assigned by the AI. If someone plagiarized whole paragraphs without citations, they get an incomplete and need to do a rewrite. If someone didn't write the required number of words or pages, they get points knocked off the grammar score. It's faster than manually marking 150 papers, but still takes him about 15-20 hours of labor over the course of 2-3 days.
Occasionally living proof of the Ballmer peak.
It's a literacy exam, so maybe having an AI grade the papers won't be so bad? I mean, if all the AI is doing is checking grammar and sentence structure and the like, then that seems doable. By the fact that they used the term "cognitive computing" I assume they are planning on using Watson, who should be good enough to get the job done. Better than having a human do it anyway.
Hell, why not. While we're at it, why don't we automate the student process. Dump the students and educate AIs instead. Computing solutions always work, just ask any nerd about self-driving cars.
At some point, and it seems that that point is arriving now, people will realize that the driving force behind technological change, as far as money people are concerned, is to eliminate jobs, and that the good jobs are not really being replaced, and cannot be replaced. AIs grading papers gets rid of more pesky teachers who make a living wage. A self-driving car doesn't fit the picture until you realize that millions of people make a living *driving trucks*, and self-driving trucks will eliminate their jobs (in theory, if it works, and I don't see it working) and make oodles of money for capital and kick millions of truck drivers, along with all the taxi and Uber car drivers, out without a dime. (Uber is VERY interested in self-driving cars. Guess why).
Some jobs are being made. And capital is desperately trying to commodify and cheapen such labor, to the point of demanding governments force coding classes on all kids. There are such jobs, but nowhere near enough, and those are mostly dropped onto cheaper kids, not newly dumped middle-aged workers.
Asimov was on point, decades ago, when he wrote that inevitably automation would eliminate most jobs, and that the biggest problem - in his view, opportunity -- would be finding something for people to do. I would say that people without purpose are the most dangerous force for destruction and stupidity on the planet - worse than global climate change.
Capital and people who work for capital, and neoliberals and business conservatives who support capital, tend to have well-paying white collar jobs and live among other people of their class, and don't see anything amiss. They're fine. Step outside into the vast middle grounds of the world, and you'll see a growing sense of we're-being-fucked that will require an endless army of pepper-spraying drones and surveillance to keep from erupting into riots someday soon.
The winning entry will be a heart warming story about a robot that kills all humans.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
It's very difficult to explain to the average person the difference between a computer problem that is simple and one that is virtually impossible. Obligatory xkcd... https://xkcd.com/1425/
from TFA:
"(ACARA) plans to start marking two-to-three page written components of the test using cognitive computing from 2017."
They're using software that was written a couple of years in the future...
Anyway my nephew Eli is into that sort of thing, but I don't know if he reads /. these days
(using AI to analyze essays, not the time travel part)
I mean, seriously? I don't expect an AI to be able to tell "Finnegans Wake" from gibberish. I mean, that's tough enough for a human.
This gon be goood.
.
I have to disagree with the statement that content doesn't matter. Without considering the content, you cannot judge whether the student is displaying reasoning and making cogent arguments, or merely faking it. <curmudgeon> it seems to me that the number of people I deal with who cannot tell the difference is increasing - a coincidence? Perhaps not. Murdoch has made a political movement out of exploiting such people.</curmudgeon>
If you say you cannot do a fair test if content is considered, that is not an argument for dumbing it down to pointlessness; it is an argument for doing it a different way or not doing it at all. In reality, you can set meaningful essay questions, that test a student's critical analysis and reasoning skills, within the context of the humanities and sciences.
At best, such programs can only flag questionable submissions, requiring more in-depth review by a human mind. Grading them totally? What dren! I wonder what these "tools" would make of Ulysses and other James Joyce novels? I suspect he would fail the course!
I recently took the writing portion of the TSI (Texas Success Initiative) test and wrote an essay full of completely nonsensical information that was nonetheless logically structured, and it scored well.
I think this is a fine idea, as long as the algorithm that scores the papers is publicly known. While this might initially seem like a bad idea, I think it is identical to what we have today - I remember intentionally adjusting my writing style to match teacher expectations in high school/college: some teachers liked me to parrot back facts and figures, others wanted their own theories returned to them, while still others (okay, just once in my school life) rewarded original analytical thinking.
Since we already train students according to teacher bias of what makes a 'good' human-graded paper, it seems only fair to publish the bias that will be used to define a 'good' electronically-graded paper.
I see two ways electronic grading can fail.
(1) Students who submit poor papers which still score highly. If the AI algorithm is complicated enough that real cleverness is required, perhaps that's not a bad thing... And if the AI algorithm is easy to game, everyone will score highly and it will be obvious that the technology wasn't ready and this was a bad idea.
(2) Students who submit good papers which score poorly. Resolving this probably requires a public appeal-to-a-human-teacher process. If a large number of papers are appealed and found to be of quality, it will be obvious that the technology wasn't ready and this was a bad idea.
If after the trial, the number of overturn-by-appeals is low and the distribution of scores looks good, then mankind will have found a way to automate another (I believe) tedious task and free up more human capital and resources for more challenging and valuable pursuits, which sounds like a big win. Seems like we ought to try it and learn something.
In Soviet Russia, us are belong to all your base.
Writing is an art form, not a science. If a computer could grade the art of writing, then the computer could DO THE WRITING - or at least 'fix' the problems it detected. In which case it would become the equivalent of teaching humans to use a slide rule.
I am absolutely sure that our best and brightest writers will end up being screwed over by AI programs grading them.
excitingthingstodo.blogspot.com
This reminds me of an old joke:
He has the brain of a computer - the errors that he makes are fantastic.
Yesterday I said on Facebook:
If your live in Ontario and you're planning to vote liberal in the federal election, then you have to proven stupidity has no limit. Just because Justin's father was a rock star doesn't mean he is. Justin is on par with Wynne for most dysfunctional and idiotic political leader in history.
I had women telling that I was suppressing there right to vote, I had others telling me that I was trying to control there freedom to vote, when in fact both are clearly wrong. So I fully support computer based grading of essays.
Automatic essay grading will be the perfect synergy for computer generated essays.
After all, the computer generated essay will follow grammatic rules consistently (assuming they are programmed in correctly, but let's assume for now that we wait for version 3.1 or so).
One big question is -- What are you trying to test for -- Do you only care if the student knows proper grammar and can follow it (maybe for lower + middle school English class)? Then automatic grading for STRUCTURE is probably good enough.
Do you want to see if the student has read + understood content enough to write a meaningful summary / review (i.e. book reviews)? Or if the student has understood the concepts and can make coherent arguments for or against a position (logic / philosophy / debating / management persuasion / marketing spin)?
Then you need someone who can understand this deeper level of content. Right now, I don't believe any automated system can do it.
OTOH, if you want to run the essays thru a "first pass" of a syntax / grammar checker, perhaps in the hope of reducing the person-minutes required from 20 per essay to 15, that might be a great idea. It's not that automation is a completely bad idea, but it's not ready to take on the job yet. As a supplement, it could still be useful.
I would hope they'd take some of the tests and have those human-graded so that a) test-takers will have to write real papers given the chance that a human will be grading them and b) to compare the human-tested vs the computer-tested paper grades.
The vast majority of school essays probably can be accurately graded programmatically. All that's needed is some sort of appeal process for students who think they got boned. The system would probably still come out way ahead in speed, consistency and cost.
The result of most school work is not to create new/useful/insightful/creative things, but to create something that can easily be judged as fulfilling the requirements. (It's the result of most work work too.) Just because the judge has traditionally been human doesn't change the fact that if you write an essay in the style of Anthony Burgess, you're likely to get a failing mark. Regardless if it's a masterpiece or not.
No.
Aside from the moral issues, is AI ready for this major task?
I would have expected more from a forum of computer nerds. Run the AI over all previous NAPLAN essay responses and compare the AI's scores against the teacher scores. If there are significant differences, find out why. If there are no significant differences, win!
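A rough sketch of that comparison in Python, for what it's worth. The scores below are invented for illustration, and the two measures (mean absolute difference and the share of essays within one mark) are just an assumed way to quantify "significant differences", not anything ACARA has described.

```python
def agreement(teacher, machine, tolerance=1):
    """Compare two score lists for the same essays.

    Returns (mean absolute difference, fraction of essays where the
    two scores differ by at most `tolerance` marks).
    """
    if len(teacher) != len(machine) or not teacher:
        raise ValueError("need two non-empty score lists of equal length")
    diffs = [abs(t - m) for t, m in zip(teacher, machine)]
    mean_abs_diff = sum(diffs) / len(diffs)
    within_tolerance = sum(d <= tolerance for d in diffs) / len(diffs)
    return mean_abs_diff, within_tolerance

# Hypothetical marks for five past essays (not real NAPLAN data).
teacher_scores = [4, 5, 3, 6, 2]
machine_scores = [4, 4, 3, 6, 5]

mad, within = agreement(teacher_scores, machine_scores)
print(f"mean absolute difference: {mad:.2f}")
print(f"within one mark: {within:.0%}")
```

The essays that fall outside the tolerance (the last one here) are the ones worth pulling out to "find out why".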
FTA:
Rabinowitz said the trials show the artificial intelligence solutions perform as well, or even better, than the teachers involved.
See? Somebody's already checked! Nothing to see here, move along!
to hack the AI and turn it against its masters.....
Similar to this is my favorite sentence:
Dew knot trussed yaw spilling chequer two finned awl yore miss takes.
...has basically been doing such a thing for years :-)