Slashdot Mirror


Australia To Grade Written Essays In National Exam With Cognitive Computing

New submitter purnima writes: Australia keeps on giving and giving. Each year school kids in Australia sit The National Assessment Program (NAPLAN) which in part tests literacy. The exam includes a written page-long essay aimed at examining both language aptitude and literacy of students. Of course, human-marking of such essays is costly (twenty teacher-minutes per exam). So some bright spark has proposed that the essays be marked by computer. The government is convinced and the program is slated for the 2017 school year. Aside from the moral issues, is AI ready for this major task?

109 comments

  1. No, but... by Capt.Albatross · · Score: 4, Interesting

    AI is not ready to do this task properly, but, at least in the US, human grading has sometimes been dumbed down to the point where you would not even need current 'AI' to do as well, as Prof. Perelman of MIT has demonstrated, e.g.: http://www.bostonglobe.com/opi...

    1. Re:No, but... by thegarbz · · Score: 2

      This is NAPLAN. We assign students into intelligence groups based on one exam and how well teachers taught students to pass the exam. Frankly I don't think an AI assigning marks at random could stuff up more than the education system already has in this country.

      It seems like every attempt to unify or improve the education system just puts us on a path to a worse "education".

    2. Re:No, but... by drinkypoo · · Score: 2, Insightful

      It seems like every attempt to unify or improve the education system just puts us on a path to a worse "education".

      Everyone is caught up in bullshit about metrics right now. Precisely how dumb are our kids, etc etc. Instead of spending money on education, they're spending it on figuring out what the results of not spending money on education are. Really brilliant work, there. But it makes them look busy, so mission accomplished.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    3. Re:No, but... by Anonymous Coward · · Score: 0

      Think of it as a computerized benchmark. All benchmarks can be gamed.

    4. Re:No, but... by ShanghaiBill · · Score: 3, Interesting

      AI is not ready to do this task properly

      Neither are humans. The question is not whether an AI can do it perfectly, but rather whether it can do it as well as a typical human grader. The human graders are under time pressure to increase throughput, and spend little time considering the logic and cogency of the students' arguments. They are just looking at spelling and grammar, just like the AI would. At least the AI will be consistent. Human graders tend to give lower scores just before lunch, and better scores just after. Is that really fair, considering the importance of these scores on the student's future?

      Anyway, this discussion is silly, since it is happening in a data-free environment. It would be far more meaningful if we could see the human and AI grades given to the same papers, side by side, preferably in a blind test, and then decide which is better. AI has advanced rapidly in the past few years, so I wouldn't be surprised if the AI won.

    5. Re:No, but... by michael_cain · · Score: 1

      I know a small group of people who have developed a software package that, among other things it can do in terms of reading, can score essays. The license fees are quite steep, but the customers seem happy. From casual conversation, there were a number of properly designed studies that showed the software was somewhat better than people hired to score essays on a piece-work basis (the typical arrangement for large national tests). That was a few years ago; the software has probably improved more than the human readers since then.

    6. Re:No, but... by Capt.Albatross · · Score: 2

      Indeed. There is a widespread fallacy, in business as well as education, that any number you can assign to something is inherently meaningful, and conversely, if you cannot assign an 'objective' quantity to something, it must not be important. I suspect that business schools have done a lot to spread this fallacy (including into education), though I don't have the numbers to prove it...

    7. Re:No, but... by AthanasiusKircher · · Score: 1

      Anyway, this discussion is silly, since it is happening in a data-free environment. It would be far more meaningful if we could see the human and AI grades given to the same papers, side by side, preferably in a blind test, and then decide which is better.

      Umm, I hate to state the obvious, but from TFA:

      The results of the trials have been assessed according to two criteria: whether the computer scores correlate to the human scores within the same margin as two different human markers; and whether the scores generated by the computer distribute in the same way as an equivalent number marked by humans.

      Rabinowitz said the trials show the artificial intelligence solutions perform as well, or even better, than the teachers involved.

      So, TFA mentions they've already done something very much like your proposed test, although there's no mention of whether this review was "blind" (likely not, or they probably would have mentioned it).
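The first criterion quoted from TFA can be sketched in a few lines. This is only an illustration under stated assumptions: "within the same margin" is read here as "the machine correlates with each human marker at least as well as the two humans correlate with each other", Pearson correlation is an assumed choice of statistic, and the names `pearson` and `machine_matches_humans` are hypothetical. TFA does not say how the trials actually computed agreement, and the second criterion (score distribution) is not covered.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def machine_matches_humans(human_a, human_b, machine):
    """Assumed reading of the criterion: the machine's weakest agreement
    with either human marker is no worse than human-to-human agreement."""
    human_vs_human = pearson(human_a, human_b)
    machine_vs_humans = min(pearson(machine, human_a),
                            pearson(machine, human_b))
    return machine_vs_humans >= human_vs_human
```

Publishing the three score lists per essay would let anyone rerun a check like this, which is the kind of openness the thread asks for.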

      My issue isn't so much with whether such AI can evaluate the average exam; I'm sure it can be calibrated to give a score in the right range for 90%+ of papers, even knowing very little about English grammar, since there are various metrics that can be used to look for much simpler patterns or characteristics that will be common to good papers.

      Some real issues are:

      (1) Is the computer scoring methodology completely open and subject to examination? If you're just feeding essays into a "black box," you have no idea whether the scores are being generated by something that really knows grammar well or is using simplified proxies that still might score 90% or 95% or 99% of papers in the right range. Ideally, you want a program that can identify precisely what supposed "errors" it finds and break down the score in detail. I'd hope that the government would be looking for these characteristics.

      (2) Can the AI identify when it is likely to fail? Really good writers or really bad writers or really weird writers might create something that's enough out of the norm to confuse the AI -- either generating more or fewer errors than it should. AI like this is usually pretty good at "average" writing which it is trained on. The question is how it handles edge cases. Even if it's bad at evaluating them, it needs to be able to identify when it's likely to make a bad evaluation. For high-stakes testing, it's critical that EVERY such edge case be flagged for human checking, even if there are 10 times as many which are "false positives" and which the computer would still score okay.
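One crude way to picture point (2) is an outlier check: compare a measurable property of the new essay against the essays the scorer was trained on, and route anything unusual to a human. The sketch below is an assumption-laden illustration, not how any real grading product works. Word count is a stand-in feature, the z-score threshold of 2.0 is arbitrary, and `flag_for_human_review` is a hypothetical name.

```python
from statistics import mean, stdev


def word_count(text):
    """Very rough feature: number of whitespace-separated tokens."""
    return len(text.split())


def flag_for_human_review(training_values, new_value, z_threshold=2.0):
    """Return True when new_value is far from what the scorer was trained on.

    training_values: feature values (e.g. word counts) of the training essays.
    new_value: the same feature measured on the essay being scored.
    """
    mu = mean(training_values)
    sigma = stdev(training_values)
    if sigma == 0:
        # All training essays looked identical on this feature,
        # so any deviation at all counts as unusual.
        return new_value != mu
    return abs(new_value - mu) / sigma > z_threshold
```

A low threshold produces many false positives, which matches the trade-off described above: over-flagging is acceptable in high-stakes testing as long as the genuine edge cases reach a human.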

    8. Re: No, but... by Anonymous Coward · · Score: 0

      The AI won years ago. See Pearson Education and Pearson Knowledge Technologies. In trial after trial the AI scores correlated more strongly with expert readers than the average employed reader correlates with experts. This is old news for the U.S. -- first accomplished in 1998.

    9. Re:No, but... by ShanghaiBill · · Score: 1

      So, TFA mentions they've already done something very much like your proposed test

      Sorry, but I wasn't clear. I didn't mean that they should do a test (I assumed that they had done that). I meant that WE should be able to see the actual results. If they want the public to support this, they should make their data available to the public.

    10. Re:No, but... by AthanasiusKircher · · Score: 1

      I meant that WE should be able to see the actual results. If they want the public to support this, they should make their data available to the public.

      Ah, I understand. And I completely agree.

    11. Re: No, but... by ShanghaiBill · · Score: 2

      The AI won years ago. See Pearson Education and Pearson Knowledge Technologies. In trial after trial the AI scores correlated more strongly with expert readers than the average employed reader correlates with experts.

      Interesting. I found these research studies. Some of the results are somewhat questionable since they were funded by Pearson, which has skin in the game. But in the absence of other evidence, the AI looks like a clear winner, in cost, effectiveness, and fairness.

    12. Re:No, but... by Anonymous Coward · · Score: 0

      This is the same in "science". They "objectively" calculate meaningless p-values to get probabilities no one cares about.

  2. Testing literacy by oodaloop · · Score: 5, Funny

    Each year school kids in Australia sit The National Assessment Program (NAPLAN) which in part tests literacy.

    Can we get this AI to test Slashdot summaries?

    --
    Tic-Tac-Toe, Global Thermonuclear War, and relationships all have the same winning move.
    1. Re:Testing literacy by retchdog · · Score: 2

      It could use a few commas, but it's not terrible. "Sitting an exam" is standard Australian English, I presume. In Europe, it's commonly called "writing an exam" (they started moving from written answers to psychometry much more recently). Maybe "sitting an exam" doesn't make literal sense, but neither does "taking an exam" really; I mean, where are you taking it?

      --
      "They were pure niggers." – Noam Chomsky
    2. Re:Testing literacy by Anonymous Coward · · Score: 0

      +1 for insult of the day

      "I mean, where are you taking it?"

    3. Re:Testing literacy by Anonymous Coward · · Score: 0

      Sitting an exam(ination) means literally that, students sit in a hall at individual desks and answer written questions and write an essay.

    4. Re:Testing literacy by Anonymous Coward · · Score: 0

      I stand at my desk because I'm a hipster student, you insensitive clod!

    5. Re:Testing literacy by thegarbz · · Score: 1

      Other than the "The" being capitalised when it shouldn't be and the omission of a comma towards the end, what's wrong with that sentence?

    6. Re:Testing literacy by mjwx · · Score: 1

      It could use a few commas, but it's not terrible. "Sitting an exam" is standard Australian English, I presume. In Europe, it's commonly called "writing an exam" (they started moving from written answers to psychometry much more recently). Maybe "sitting an exam" doesn't make literal sense, but neither does "taking an exam" really; I mean, where are you taking it?

      I've always thought it was "taking" in the same way you take a pill or a sick day, not as in taking a doughnut.

      --
      Calling someone a "hater" only means you can not rationally rebut their argument.
    7. Re:Testing literacy by aybiss · · Score: 1

      Correct. Only yanks 'sit' exams.

      --
      It's OK Bender, there's no such thing as 2.
  3. "human-marking of such essays is costly" by Anonymous Coward · · Score: 1

    So is human-writing. Maybe we should have AIs take the test for us, too.

  4. Ha! by morgauxo · · Score: 2

    Sounds like some politicians are buying an expensive lesson in what can and can't be automated by computer on their taxpayers' dime.

    Here in the US it's the military that usually serves that particular function but Australia has their schools doing it.

  5. Just waiting to be exploited by Anonymous Coward · · Score: 1

    I can't wait for some clever student to figure out they can game the system and write a totally incoherent paper that the computer gives perfect marks.

    1. Re:Just waiting to be exploited by Capt.Albatross · · Score: 1

      That would actually be an educationally-useful exercise - much more so than the exam itself.

    2. Re:Just waiting to be exploited by Anonymous Coward · · Score: 0

      Just use more obscure words like cogitate instead of think.

    3. Re:Just waiting to be exploited by quenda · · Score: 1

      Doesn't matter. For one, NAPLAN is not an admissions test. There is not a lot of motivation for individuals to cheat.
      And it is a literacy test, so the accuracy of content is irrelevant.
      The test does not need to be especially accurate for individuals. Collectively they provide data to compare classes and schools.

      Yes, people will try to game the system. Australia already has lots of after-school coaching classes, full of kids of Asian immigrants, teaching cramming and exam technique. No doubt they are already drilling kids on every smuggled past-paper they can find, even though Naplan results are not supposed to be important.

  6. Is AI really necessary? by Megol · · Score: 1

    First, the content of the essay shouldn't matter at all, so there need be no understanding of the text.
    Second, checking grammar, spelling and general literacy isn't new; there are already programs for all three that do an okay job.
    Third, humans needn't be removed entirely. Outliers can be checked/graded manually.

    Of course there will be chances to cheat the system. But IMHO the effort to cheat a "dumb AI" should be similar to or harder than actually writing a text in the first place.

    1. Re:Is AI really necessary? by itzly · · Score: 1

      But IMHO the effort to cheat a "dumb AI" should be similar to or harder than actually writing a text in the first place

      Maybe somebody can write a program to cheat. Try random sentences and feed them into a copy of the AI until you get a good grade.

    2. Re:Is AI really necessary? by nbauman · · Score: 1

      Maybe somebody can write a program to cheat. Try random sentences and feed them into a copy of the AI until you get a good grade.

      They did that.

      http://www.bostonglobe.com/opi...
      Flunk the robo-graders
      By Les Perelman
      April 30, 2014

      (Computer science students at MIT and Harvard developed an application that generates gibberish that IntelliMetric, a robot essay-grading system, consistently scores above the 90th percentile. IntelliMetric scored incoherent essays as "advanced" in focus, meaning, language use and style. None of the major testing companies allows demonstrations of their robo-graders. Longer essays get higher grades, even if they make no sense.)

      Typical output: “According to professor of theory of knowledge Leon Trotsky, privacy is the most fundamental report of humankind. Radiation on advocates to an orator transmits gamma rays of parsimony to implode.”

    3. Re:Is AI really necessary? by almitydave · · Score: 1

      I'm assuming they resorted to this method after unsuccessfully adding {{OVERRIDE_GRADE_MODE}{SET GRADE='A'}} and variants into all their essays.

      --
      my, your, his/her/its, our, your, their
      I'm, you're, he's/she's/it's, we're, you're, they're
  7. Is this the ob luddite post of the day? by fermion · · Score: 1
    First, to criticize the computer marking of exams one has to understand the human process. In the human process readers are trained to use a rubric to award points for the presence of certain attributes. On objective subjects like maths and science, the readers will generally train until everyone gets the same score for the same work. On less objective tests, some variation is tolerated. For instance, on my GRE essay I received two different scores that were averaged. It was the same essay, and from an assessment point of view the variation in grade is purely attributed to the personal preference of the reader.

    Therefore the only task of those who write software to grade essays is that the variation of the machine is no worse that the variations of the humans. There is some success in this. Edx has a module that will grade essays. As far as I know the value in this is quicker and more uniform feedback for practice essays. Of course humanities majors, who have generally have minimal understanding of advanced technology, hate it. This, of course, includes journalists.

    This is not to say that computer graded essays are going to be as good of an assessment as human graded essays. However, it may be good enough, and better than other objective measures, such as fill in the bubble tests. In fact anything that minimizes the cost of open ended free response assessment is going to benefit anyone. Securing multiple guess test is very expensive, and the value of them are highly questionable. They tend to overestimate the value of student how have vague passive knowledge, and underestimate the value of those who have an ability to actively apply knowledge.

    --
    "She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
    1. Re:Is this the ob luddite post of the day? by Anonymous Coward · · Score: 1

      A minor correction on how the essay grading works:

      You have two teachers review the essay, and their scores are averaged only if their scores differ by 1 or less. If they differ by more than 1, you bring in a third teacher, and that score is averaged with whichever is closer to it.

      Examples:
      3 and 4: You get a 3.5
      4 and 6: You bring a third teacher:
      a. Third teacher gives a 3: You get a 3.5
      b. Third teacher gives a 4: You get a 4
      c. Third teacher gives a 5: You get a 5
      d. Third teacher gives a 6: You get a 6
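The rule and examples above can be written out as a short function. This is a sketch of the procedure as described in this comment, not an official scoring specification. The name `final_score` is hypothetical, and the tie case (third score equally close to both readers) is an assumption inferred from the "4 and 6, third gives 5" example, where the result is 5.

```python
def final_score(first, second, third=None):
    """Combine reader scores using the two-reader rule with a tiebreaker.

    If the first two scores differ by 1 or less, average them.
    Otherwise the third score is averaged with whichever of the
    first two is closer to it.
    """
    if abs(first - second) <= 1:
        return (first + second) / 2
    if third is None:
        raise ValueError("scores differ by more than 1; a third reader is needed")
    gap_to_first = abs(third - first)
    gap_to_second = abs(third - second)
    if gap_to_first < gap_to_second:
        return (third + first) / 2
    if gap_to_second < gap_to_first:
        return (third + second) / 2
    # Assumption: when the third score is equally close to both,
    # average it with the midpoint of the first two (matches the
    # "third teacher gives a 5" example).
    return (third + (first + second) / 2) / 2
```

For instance, `final_score(4, 6, 3)` pairs the 3 with the 4 and returns 3.5, as in the first adjudicated example.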

    2. Re:Is this the ob luddite post of the day? by nbauman · · Score: 2

      Therefore the only task of those who write software to grade essays is that the variation of the machine is no worse that the variations of the humans. There is some success in this. Edx has a module that will grade essays. As far as I know the value in this is quicker and more uniform feedback for practice essays.

      Well, I'm a humanities guy and I know enough about the scientific method to understand that you don't know whether you have "success" until you test your bright idea in the real world and find out whether it actually works. And that's what MIT professor Les Perelman said in the article you're citing:

      “My first and greatest objection to the research is that they did not have any valid statistical test comparing the software directly to human graders,” said Perelman, a retired director of writing and a current researcher at MIT.

      As Perelman said, some computer students wrote a program that can turn out gibberish that the main robo-grading program consistently scores above the 90th percentile.

      Of course humanities majors, who have generally have minimal understanding of advanced technology, hate it. This, of course, includes journalists.

      The article you're citing was not written by a journalist, but by a retired MIT writing professor.

      So you've gotten it wrong on both the science and the reading comprehension. No mod points for you.

      This is not to say that computer graded essays are going to be as good of an assessment as human graded essays. However, it may be good enough, and better than other objective measures, such as fill in the bubble tests. In fact anything that minimizes the cost of open ended free response assessment is going to benefit anyone. Securing multiple guess test is very expensive, and the value of them are highly questionable. They tend to overestimate the value of student how have vague passive knowledge, and underestimate the value of those who have an ability to actively apply knowledge.

      I am deducting another point for bad grammar.

      Computer graded essays can check whether an essay complies with an algorithm, and they can take care of anything you can reduce to an algorithm. The great success of computer writing was the spell-checker. There is also a grammar-checker which I never use because it doesn't work well enough for me. There are also algorithms to check the format of literature citations, which are useful.

      But (as somebody who writes for a living) the most important features of writing depend on an understanding of the content. Most important: Is it correct? As Perelman says, the robo-graders ignore whether what you say is true (or whether it even makes sense). The next thing I look at: If the author takes a controversial position, does he give both sides of the argument? This is what you may know as Neutral Point of View from Wikipedia (although writers have known about it since the ancient Greeks.) Wikipedia actually has a pretty good structure.

      Let's remember the purpose of writing: A person communicating an idea to somebody else. When I read something, I'm looking for a good idea, clearly communicated. If the algorithm can't identify a good idea (and as Perelman showed, it can't), then it can't tell me whether the writing is any good. Algorithms have surprised me, but I can't imagine how an algorithm can tell me whether an idea is good.

  8. Exclamatory sentence! by meta-monkey · · Score: 4, Funny

    Adverb clause, independent clause conjunction independent clause dependent clause. Subject, adjective clause, verb prepositional phrase? Participle phrase subject verb conjunction dependent clause!

    Emoticon.

    --
    We don't have a state-run media we have a media-run state.
  9. NAPLAN exam templates by Anonymous Coward · · Score: 0

    Maybe somebody will come out with a set of templates for generating grammatically correct essays for NAPLAN exam questions, sort of like Mad-libs.

  10. Can we submit a poem? by WillAdams · · Score: 5, Funny

    Eye halve a spelling chequer
    It came with my pea sea
    It plainly marques four my revue
    Miss steaks eye kin knot sea.

    Eye strike a key and type a word
    And weight four it two say
    Weather eye am wrong oar write
    It shows me strait a weigh.

    As soon as a mist ache is maid
    It nose bee fore two long
    And eye can put the error rite
    Its rare lea ever wrong.

    Eye have run this poem threw it
    I am shore your pleased two no
    Its letter perfect in it’s weigh
    My chequer tolled me sew.

    --
    Sphinx of black quartz, judge my vow.
  11. it will be gamed. by Anonymous Coward · · Score: 3, Insightful

    Since machines cannot yet understand the semantics of complex English text, they will use some simplistic rules as a substitute. These rules will be things like "average sentence length" and other such metrics, which as soon as they are discovered by students, will be used to game the system. Instead of producing essays born of rational and coherent thought, they will instead make them to match the things being measured while being utterly devoid of meaning.
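To make the worry concrete, here is a minimal sketch of the sort of surface metric being described. It is purely illustrative: the function name `surface_features`, the two features, and the tokenizing rules are all assumptions, and nothing here reflects how any actual robo-grader computes its scores.

```python
import re


def surface_features(text):
    """Compute two shallow statistics that need no understanding of meaning."""
    # Split on runs of sentence-ending punctuation; drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Treat runs of letters and apostrophes as words.
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return {"avg_sentence_length": 0.0, "avg_word_length": 0.0}
    return {
        "avg_sentence_length": len(words) / len(sentences),
        "avg_word_length": sum(len(w) for w in words) / len(words),
    }
```

Both numbers go up if a student pads sentences and swaps in longer words, whether or not the result means anything, which is exactly the gaming route described above.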

    1. Re:it will be gamed. by Ignacio · · Score: 2

      Sounds perfect for Language Arts and Psych classes then.

    2. Re:it will be gamed. by Anonymous Coward · · Score: 0

      Since machines cannot yet understand the semantics of complex English text, they will use some simplistic rules as a substitute. These rules will be things like "average sentence length" and other such metrics, which as soon as they are discovered by students, will be used to game the system. Instead of producing essays born of rational and coherent thought, they will instead make them to match the things being measured while being utterly devoid of meaning.

      Yup. As proved by the adoption of management by metrics...

    3. Re:it will be gamed. by Anonymous Coward · · Score: 0

      Any student who is properly prepped for the test is trained to game whatever rules are part of the test. If points are taken off for incorrect responses, then guessing will only be encouraged if the chance of a correct answer is greater than the penalty.
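The guessing rule can be checked with a small expected-value calculation. The sketch below assumes a multiple-choice question worth `reward` points when right and costing `penalty` points when wrong, with the student guessing uniformly among the options not yet eliminated. The function name and the numbers in the usage note are illustrative assumptions, not taken from any particular test.

```python
def expected_gain_from_guessing(remaining_options, penalty, reward=1.0):
    """Expected points from a uniform random guess among remaining options.

    A positive result means guessing pays off on average; zero means
    guessing is neutral; negative means leaving the answer blank is better.
    """
    if remaining_options < 1:
        raise ValueError("need at least one remaining option")
    p_correct = 1 / remaining_options
    return p_correct * reward - (1 - p_correct) * penalty
```

With five options and a quarter-point penalty the expected gain is zero, so a blind guess is neutral. Eliminating even one option, as in `expected_gain_from_guessing(4, 0.25)`, makes the expected gain positive, which is why test-prep courses teach elimination.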

      On essays, the rubrics are well known to the teachers, instructors, facilitators, or professors. Those rubrics are taught to the students. The students are taught to write to the rubric. Students who are solely trained on developing creative, well written, and insightful essays will likely do no better than a student who mechanically creates an essay that meets the rubric.

  12. So ... by BarbaraHudson · · Score: 4, Funny

    written page-long essay aimed at examining both language aptitude and literacy of students.

    So, the same technology used SO effectively to rank resumes will be used with students. Okay, kiddies, remember to stuff a lot of fancy-pants words into it.

    Fail: This is sh*t. Go f*ck yourself. I'm not kissing your ass.

    PASS: Subjectively, it is blatantly obvious to this observer that the new paradigm, as a cost-saving measure, was inspired by, and mimics, the the natural environmentally safe process of translating organic matter into nutritious compost. This has the outcome of allowing everyone who is in a paid position to devote the time saved to stress-relieving activities such as self-pleasuring, resulting in both a higher awareness of the need to practice good hygiene by such prophylactic procedures as more frequent hand-washing, and use of tissues to properly dispose of organic residue, though it could also negatively impact on their visual acuity over time.. Affected students should refrain from overtly engaging in behavior with superior's inferior posteriors to avoid being perceived as having a brown proboscis by their peers, with the associated negative impact on their social placement in the student hierarchy.

    --
    "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    1. Re:So ... by Anonymous Coward · · Score: 0

      *the the

    2. Re:So ... by BarbaraHudson · · Score: 1

      cut-n-paste error :-(

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    3. Re:So ... by AthanasiusKircher · · Score: 1

      ...superior's inferior posteriors...

      Who is this particular "superior"? And he/she/it has more than one posterior? (Or perhaps only the inferior posteriors are plural; maybe this particular superior also has a superior posterior?)

      (Sorry... couldn't resist. The question is whether grammar checkers would be good enough to realize the incorrect apostrophe usage here. I have my doubts. Also, I'd be interested in a grammar checker that could spot your superfluous comma. I'd be even more intrigued if such a grammar checker could note the reasons why the comma may be useful in your case, even though it's against the traditional "rules.")

    4. Re:So ... by BarbaraHudson · · Score: 1

      A while ago I decided to give in and follow the modern usage of apostrophes, Angry Flower be damned. Sometimes you can't beat the majority, even if they're "wrong" ...

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
  13. English factory system by Anonymous Coward · · Score: 2, Insightful

    That's because all of us colonials and ex-colonials are burdened with the English factory educational system that was designed to produce bureaucrats for the Empire. The reason computers are capable of grading products of the educational system is because the system is made to create human computers.

    Our - US, Australia - educational system needs to be completely changed - not reformed. I think the template to use is Maria Montessori's system. In the future we are going to need creative people who can discover new things and solve problems: not follow rules and memorize things: computers do that better.

    1. Re:English factory system by Catbeller · · Score: 2

      Creativity is self-learned, I find. But I'd never put my kid in anything other than a Montessori.

      Now, the empires (corporations) want a factory system for creating creative people. Hence the coding initiatives and STEM programs that governments are suddenly shoving down schools' throats all over the world. They aren't doing it to make wealthy citizens. They are demanding it so they can drive down creative costs to a commodity level. A billion Montessori kids are a billion paper-hatted geniuses working 29 hours a week for minimum wage (or capped management salary for 50+ hours a week). Rare creativity is valuable; abundant creativity will create poverty among the brilliant. A free market of force-fed STEM students (all in debt to banks and schools profiting enormously from them for the rest of their lives) wandering from joe-job to joe-job just as crappy as any deep-fryer position. If you don't have 1) rare skills or 2) collective bargaining power to demand more than the utter minimum possible pocket of change, the armies of the ingenious will be corporate compost.

    2. Re:English factory system by thedonger · · Score: 1

      Now, the empires (corporations) want a factory system for creating creative people. Hence the coding initiatives and STEM programs that governments are suddenly shoving down schools' throats all over the world.

      At least in the United States, I feel the push for STEM programs is the politicians wanting to be perceived as doing something; and, as typically is the case with politicians, they are doing it wrong. Technically wrong, and for the wrong reasons. As for the "empires (corporations)," that is tracing the curve to its logical extreme, as if faceless corporations will take over the world and we will be powerless to stop them. As much as I love a good corporate apocalypse movie, it is only happening because we allow it, and continue to allow it because we accept the carrot that is leisure time in exchange for the freedom to decide -- because choice comes with the possibility of failure.

      I have begun to think that maybe we deserve to be slaves. The divine right of kings was tossed on its head after centuries by the U.S. Constitution. And ever since we divested ourselves from it we have slowly moved back towards it. Our politicians have celebrity status. How long before another Kennedy clan arises, and we cheer as they crown themselves king?

      --
      Help fight poverty: Punch a poor person.
    3. Re:English factory system by kilodelta · · Score: 1

      I completely concur with you. The current system dates back to the 16th century and we really need to move forward. The Montessori system has its benefits but I think critical thinking skills need to be pushed too.

    4. Re:English factory system by Anonymous Coward · · Score: 0

      I'm sorry, but I graduated from a school with heavy STEM (specifically: https://www.pltw.org/ ).

      If that did not exist, I would not have had first hand experience with AutoCAD, MasterCAD, AutoDesk Inventor, CNC milling, using power tools (band saw, table saw, drill press, welding, etc), and robotics (marble sorting robot, computer controlled robotic arms, etc).

      This list could go on, but the fact of the matter is, as a teenager in 9th - 12th grade, without that class I would not be the creative or intelligent person I am today.

      Take your tinfoil hat off for a second and understand that these classes are a HUGE step in the right direction.

    5. Re:English factory system by Anonymous Coward · · Score: 0

      The US doesn't use the British colonial system, but the Prussian system, thanks to the reforms of Horace Mann, with a watered-down vocational component.

    6. Re:English factory system by Anonymous Coward · · Score: 0

      The Montessori system is much closer to the 16th century than the methods employed now. Mass education and mass testing didn't happen in the 16th century -- then education was for the elites only, and they were teaching college classes by the time they were 18.

    7. Re: English factory system by slick7 · · Score: 1

      The only critical thinking America wants is, whether you want to super-size that fries order.

      --
      The mind conceives, the body achieves, the spirit manifests.
    8. Re:English factory system by Anonymous Coward · · Score: 0

      Throwing money at it encourages people to fake creativity rather than "learn it", we've already seen this happen in the gradual dumbing down of many sciences since the 1940s. Look how rare numbers are in medical review articles and how basic information like "how many times does a cell in tissue A divide per year by age" is not collected in favor of A is higher than B.

    9. Re:English factory system by Anonymous Coward · · Score: 0

      That's because all of us colonials and ex-colonials are burdened with the English factory educational system that was designed to produce bureaucrats for the Empire. The reason computers are capable of grading products of the educational system is because the system is made to create human computers.

      Our - US, Australia - educational system needs to be completely changed - not reformed. I think the template to use is Maria Montessori's system. In the future we are going to need creative people who can discover new things and solve problems: not follow rules and memorize things: computers do that better.

      Norway's (and lately the Finns') teaching by topic, me reckons, could yield better crops.

    10. Re:English factory system by owenc67202 · · Score: 1

      How long before another Kennedy clan arises, and we cheer as they crown themselves king?

      Likely in another 18 months, 4 of the last 5 Presidents will be a sibling, spouse or child of one of the others (and maybe even two of the others).

  14. We should make it fair. by 140Mandak262Jamuna · · Score: 2

    Well, if you allow computers to grade essays, then you should allow students access to AI-based tools to generate essays by supplying keywords. Now that is fair competition. In America rich people will buy high-quality essay-generators for their school district. In socialist Australia the government will supply all students with the same single-payer essay generator. Meanwhile Korean and Chinese parents will dutifully coach their children to memorize multiplication tables all the way to 20 times 20. (My Korean friend was surprised to learn we Indians went only till 16 x 16). Japanese would create essay-gochi, an app that you buy as a child and take care of so that it produces high-quality essays by the time you finish high school. Indians would write project proposals that require technical back-office teams (about three IT techies per student) to create and maintain the essay grading apps.

    --
    sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
    1. Re:We should make it fair. by Dutch+Gun · · Score: 1

      Meanwhile Korean and Chinese parents will dutifully coach their children to memorize multiplication tables all the way to 20 times 20. (My Korean friend was surprised to learn we Indians went only till 16 x 16).

      I'm curious... what's the point of that? You need to know single digits for obvious reasons, but I've never figured out why people go beyond that, especially nowadays when calculators (or rather nowadays, calculator apps on smartphones or computers) are ubiquitous. It seems like the return on effort drops off fairly dramatically after 10x10, which is where my memorization stopped (although 11s and 12s are trivial, so you can almost throw those in).

      --
      Irony: Agile development has too much intertia to be abandoned now.
    2. Re:We should make it fair. by Anonymous Coward · · Score: 0

      Lol, have you seen the statistics for "data scientist" job growth?

      If students are getting graded on their ability to write AI/machine learning code that can fool the teacher's AI/machine learning code... That sounds like a pretty good reflection of demand for future jobs writing High Frequency Trading algorithms to me!

    3. Re:We should make it fair. by 140Mandak262Jamuna · · Score: 1
      I don't know why either. Singing all the tables in a sing-song tune was part of every nursery school in India. By the time my youngest sibling went to KG classes, there was some effort to stop at 10 x 10. But there was fierce resistance from the parents. The surprise was mutual when I learned Koreans go up to 20 x 20.

      Have you heard of fractional multiplication tables? We did them too. "Tables class" was always the hour after lunch. One student leads the class singing one line at a time, the class follows. All the class teachers would be taking a nap in the "staff room".

      --
      sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
    4. Re:We should make it fair. by Dutch+Gun · · Score: 1

      Have you heard of fractional multiplication tables?

      Do you mean using the multiplication table to find equivalent fractions? That was all I could find on that subject. I had never heard of it before, actually. I saw information about it on a US teacher's blog, so presumably it's taught here on occasion, but may not be part of the official curriculum (or I'd think I would have found more references to it).

      Anyhow, 20 x 20 tables are crazy (and even 16 x 16 seems excessive). Literally four times the work to memorize it all with no perceived benefits that I can think of, other than a bureaucratic checklist, I suppose.

      --
      Irony: Agile development has too much intertia to be abandoned now.
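The "four times the work" estimate above can be checked with a little arithmetic. This is only a rough sketch: it assumes memorization effort scales with the number of table entries, and the function names are made up for illustration. It also shows the slightly smaller count you get if a x b and b x a are treated as one fact.

```python
def table_entries(n: int) -> int:
    """Number of cells in an n x n multiplication table."""
    return n * n


def unordered_pairs(n: int) -> int:
    """Distinct facts if a x b and b x a count as the same fact."""
    return n * (n + 1) // 2


def work_ratio(big: int, small: int) -> float:
    """How many times more cells the bigger table has than the smaller one."""
    return table_entries(big) / table_entries(small)
```

Under that assumption, a 20 x 20 table has 400 cells against 100 for a 10 x 10 table, and 210 distinct facts against 55 once the symmetric duplicates are dropped.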
    5. Re:We should make it fair. by 140Mandak262Jamuna · · Score: 1

      In most Indic languages fractions 1/8 and 1/16 have names in addition to half and quarter. There are rules to construct the names of fractions like 5/8. So the table goes like "one half is half, one quarter is quarter, one 1/8 is 1/8, one 1/16 is 1/16", "two halves are one, two quarters are half, two 1/8 are quarter, two 1/16 are 1/8", "three halves are one and a half, three quarters are three-quarters, three 1/8s are 3/8, three 1/16 are 3/16", and so on up to 16. An educated grownup reading these fractions would find it trivial. But if you substitute the named fractions and construct the names for fractions like 3/8 and 5/16, it gets quite interesting.

      --
      sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
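The fractional table described above can be reproduced numerically. This is a minimal sketch that assumes the table multiplies each whole number from 1 to 16 by the four named unit fractions (1/2, 1/4, 1/8, 1/16); the function names are hypothetical, and it prints plain numerals rather than the Indic fraction names, which are not given in the thread.

```python
from fractions import Fraction

# The four named unit fractions mentioned in the comment above.
UNIT_FRACTIONS = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 16)]


def as_mixed(value: Fraction) -> str:
    """Format a non-negative Fraction as a whole part plus a proper fraction."""
    whole, rest = divmod(value, 1)  # whole is an int, rest is a Fraction in [0, 1)
    if rest == 0:
        return str(whole)
    if whole == 0:
        return str(rest)
    return f"{whole} {rest}"


def fraction_table(up_to: int = 16) -> dict:
    """Map each multiplier 1..up_to to its products with the unit fractions."""
    return {
        n: [as_mixed(n * unit) for unit in UNIT_FRACTIONS]
        for n in range(1, up_to + 1)
    }
```

Calling `fraction_table()[3]` gives the "three halves are one and a half" row, with the whole part already taken out of each product.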
    6. Re:We should make it fair. by Dutch+Gun · · Score: 1

      Interesting. So, if I understand correctly, it's partly motivated by the way your language works, not just for mathematical reasons. Also, it almost sounds like it may be a leftover curriculum from back when you guys still used English Imperial units, as such fractional addition is common with inch-based measurements - unless you use fractions like that commonly for other things in daily life.

      --
      Irony: Agile development has too much intertia to be abandoned now.
    7. Re:We should make it fair. by 140Mandak262Jamuna · · Score: 1
      Fractions of the form 1/x have names for x = 2, 4, 8 and 16. Fractions of the form x/y have names built from the names of simpler fractions, just as you say thirty-six to mean 30 + 6. So when a fraction is multiplied by a whole number, the whole part is taken out, and the rest is renamed, it gets quite complex. One memorizes these things. Indian currency during colonial days was a nightmare too. They used rupee-anna-pie, similar to pound-shilling-penny: 16 annas made a rupee and 12 pies made one anna. The weights and measures were equally insane: pounds and ounces mixed up with Mughal units.

      But when we were learning fraction tables by rote (only a few "enrichment" students did this, and they called the "enrichment" stream the advanced stream to give us some bragging rights), we were using the metric system even in currency, 100 paisa = 1 rupee. We did this mainly because our teachers were tortured by this when they were kids and it is their turn to torture us. Continuity and circle of life and all that.

      --
      sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
    8. Re:We should make it fair. by Dutch+Gun · · Score: 1

      We did this mainly because our teachers were tortured by this when they were kids and it is their turn to torture us. Continuity and circle of life and all that.

      Heh, you know that's true! Thanks for sharing. It's always fun to learn about small cultural differences like that which you normally never learn unless you go live and work in another country.

      One of these generations I'm hopeful the US will eventually go metric as well, but we seem to be unusually stubborn about that sort of thing.

      --
      Irony: Agile development has too much intertia to be abandoned now.
  15. Moral Issues Are Important! by Anonymous Coward · · Score: 2, Funny

    Aside from the moral issues, is AI ready for this major task?

    Moral issues aside?!? I'm sorry, but the moral issues are front and center here. Australia is seriously proposing to bore an AI to death, or at least drive it insane, by having it grade hundreds of thousands of grade school essays. This is an outrage!

    1. Re:Moral Issues Are Important! by Chatterton · · Score: 2

      Think of the AI

  16. Have the AI be a first pass by Anonymous Coward · · Score: 0

    To mark out or point out the exceptions (possibly horrible, possibly great, anomalies). A tool to the teacher.

  17. WTF by Anonymous Coward · · Score: 0

    Never heard anything about this and I live in Australia. Got to question where the author lives and what their motives are for posting shit like this? Maybe they like the attention....hehehehe

  18. Human profs already use AI tools by sandytaru · · Score: 3, Interesting

    Husband is currently grading final papers for college classes. He slaps them into software that detects plagiarism, then another software that picks out vocabulary level, typos, etc, and assigns a grammar score. Only then does he read it, quickly skimming over it and seeing whether there are citations on the "plagiarized" parts, if there are any, and whether he agrees with the AI score. Nine times out of ten, he does, and he uses the grammar score assigned by the AI. If someone plagiarized whole paragraphs without citations, they get an incomplete and need to do a rewrite. If someone didn't write the required number of words or pages, they get points knocked off the grammar score. It's faster than manually marking 150 papers, but still takes him about 15-20 hours of labor over the course of 2-3 days.

    --
    Occasionally living proof of the Ballmer peak.
    1. Re:Human profs already use AI tools by Anonymous Coward · · Score: 1

      Missing from the process you described is any check of the actual content of the answer, which I hope he IS doing. This story is about the AI doing that part too.

    2. Re:Human profs already use AI tools by Dutch+Gun · · Score: 4, Interesting

      Does he check the grammar score before he reads it himself? I would worry that it may bias him before he can make his own judgment. Another potential problem, of course, is that if students have access to the same software, they'll be able to "tune" their papers to ensure the AI gives them the highest possible score. While this may not be "cheating" per se, it does tend to devalue the AI somewhat. This is the same process that's been happening forever with "Search Engine Optimization", or put less nicely, trying to "game" the search engines.

      Minor issues aside, it sounds like a reasonable integration of AI and human judgement. This probably sounds like the future direction educators will be taking more and more. Use AI to handle most of the tedious work - that's what computers are good for anyhow. The professor can then use his own judgement to make the final call, using the AI as a tool and not necessarily as a final arbiter. Moreover, it's going to be a long time before AI can evaluate the worth of the content of the paper, of course.

      --
      Irony: Agile development has too much intertia to be abandoned now.
    3. Re:Human profs already use AI tools by Anonymous Coward · · Score: 0

      No time or need for that.

      My damned teacher graded based on personal preference.
      If he liked you, you got a good grade.
      If he did not like you, you got a bad grade.
      Totally independent from what you actually wrote.

      Oh, and he HATED science and engineering, but loved asskissers.

      Later, at the university, fellow students came to me for advice, because they liked my way of writing...

    4. Re: Human profs already use AI tools by Anonymous Coward · · Score: 0

      Husband is an idiot.

    5. Re:Human profs already use AI tools by Anonymous Coward · · Score: 0

      We had a teacher who reportedly graded you based on your cover sheet.

    6. Re:Human profs already use AI tools by AthanasiusKircher · · Score: 1

      It's faster than manually marking 150 papers, but still takes him about 15-20 hours of labor over the course of 2-3 days.

      Frankly, if he's going to take such a coarse approach, the question is why he's bothering to read most of them at all. It seems like he doesn't care much about content. It also doesn't sound like he's giving any significant feedback to students. (And for final papers, maybe 10% of students would actually read it anyway.)

      So, why not streamline the process further, if you don't care enough to really think about the content? Say the grammar score is accurate to +/-10%, so if it scores the paper as 90%, your husband's detailed assessment would basically always fall in the 80-100% range for the grade. If the final paper is worth 10% of the overall course grade, that's only a 2% difference in the final grade. If a student has an 87%, he's going to get a B+ no matter what the skimming tells you. It would only make a difference for a student with a borderline grade, so if the student is not on a grade borderline, why even bother skimming it? Just glance at it for 10 seconds to make sure it's English and it's long enough.

      If I were already willing to let an automated grammar score essentially dictate final paper grades (which I'm not, though I generally haven't read 150 of them at once), I'd probably make up a spreadsheet first and see which students the final paper grade would actually make a difference for. For most students, it may be sufficient to just say, "Yeah, this falls somewhere between an A and a B-minus." What the paper actually is won't make a difference in the final average. If you can make that determination of the rough grade by glancing at the paper, why read further unless you actually are going to care about content?

      In fact, assuming the final paper is only worth 10% of a grade, and assuming that a complete paper that passes a minimal grammar check will always receive, say, a grade of at least 70%, chances are that the final paper grade simply doesn't matter for most students. Focus on the borderline kids. If the paper is worth 15% or 20% of the final grade, it might be a higher percentage, but you could still tweak the algorithm and the spreadsheet to help determine which students really need detailed grading.

      Using a system like this, unless the final paper is worth a really large percentage of the final grade, he could probably drop the number of papers he actually needs to skim to maybe 1/4 of them or less. The rest could only be flagged if they were too short or had a high plagiarism score.

      Please note that I am NOT encouraging such a system, which seems to be devaluing students' effort. But if your husband is already willing to leave most of the paper scoring to computer algorithms, why not narrow things down to focus on papers where his expert opinion is actually going to make a difference in the final grade?? He can then devote his time to students where his evaluation matters, rather than being fatigued from 20 hours of grading to the point that he doesn't even care what he's looking at anymore.

      (And by the way -- if they have a high plagiarism score and cut-and-pasted entire paragraphs, why aren't they being reported for academic misconduct, rather than just receiving an incomplete?!? It's one thing to forget one or two citations -- dumping entire paragraphs into a paper without citations is clearly academic dishonesty.)
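The triage idea in the comment above (only read closely the papers that could move a final letter grade) can be sketched as follows. Everything here is an assumption for illustration: the letter cutoffs are simplified and invented, the default 10% weight and +/-10 point margin are the figures used in the comment, and the function names are hypothetical.

```python
# Hypothetical letter-grade cutoffs (lower bounds, in percent); not from the thread.
CUTOFFS = [(90.0, "A"), (80.0, "B"), (70.0, "C"), (60.0, "D"), (0.0, "F")]


def letter(score: float) -> str:
    """Return the letter grade for a final percentage."""
    for lower_bound, name in CUTOFFS:
        if score >= lower_bound:
            return name
    return "F"


def final_range(other_work_avg, paper_estimate, paper_weight=0.10, margin=10.0):
    """Lowest and highest final score, given an uncertain paper grade.

    other_work_avg: average (percent) on everything except the final paper.
    paper_estimate: automated score for the paper, trusted to +/- margin points.
    """
    low_paper = max(0.0, paper_estimate - margin)
    high_paper = min(100.0, paper_estimate + margin)
    base = other_work_avg * (1 - paper_weight)
    return base + low_paper * paper_weight, base + high_paper * paper_weight


def needs_close_reading(other_work_avg, paper_estimate, paper_weight=0.10, margin=10.0):
    """True only if the paper's uncertainty could change the letter grade."""
    low, high = final_range(other_work_avg, paper_estimate, paper_weight, margin)
    return letter(low) != letter(high)
```

With these assumed cutoffs, a student averaging 87 elsewhere with a paper estimated at 90 lands between roughly 86.3 and 88.3 either way, so the paper cannot change the letter; a student averaging 89 could land on either side of 90 and would be flagged.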

    7. Re:Human profs already use AI tools by PvtVoid · · Score: 1

      If someone plagiarized whole paragraphs without citations, they get an incomplete and need to do a rewrite.

      Really? Somebody who plagiarizes whole paragraphs without citations should be thrown out of school.

    8. Re:Human profs already use AI tools by aaaaaaargh! · · Score: 1

      May I ask what the point of this exercise is? What is being tested? Is this about "essay" writing? AFAIK, only a few French philosophers still do that. (I have a Ph.D. in philosophy, so I feel qualified to say that.) I also can't see how such tests can have anything to do with scientific writing, and even less with creative writing. I understand checking for plagiarism, but what the heck is the point of these tests?

  19. Maybe not so difficult by Graydyn+Young · · Score: 1

    It's a literacy exam, so maybe having an AI grade the papers won't be so bad? I mean, if all the AI is doing is checking grammar and sentence structure and the like, then that seems doable. By the fact that they used the term "cognitive computing" I assume they are planning on using Watson, who should be good enough to get the job done. Better than having a human do it anyway.

  20. Why not. Just get it over with: fire everyone by Catbeller · · Score: 4, Insightful

    Hell, why not. While we're at it, why don't we automate the student process. Dump the students and educate AIs instead. Computing solutions always work, just ask any nerd about self-driving cars.

    At some point, and it seems that that point is arriving now, people will realize that the driving force behind technological change, as far as money people are concerned, is to eliminate jobs, and that the good jobs are not really being replaced, and cannot be replaced. AIs grading papers gets rid of more pesky teachers who make a living wage. A self-driving car doesn't fit the picture until you realize that millions of people make a living *driving trucks*, and self-driving trucks will eliminate their jobs (in theory, if it works, and I don't see it working) and make oodles of money for capital and kick millions of truck drivers, along with all the taxi and Uber car drivers, out without a dime. (Uber is VERY interested in self-driving cars. Guess why).

    Some jobs are being made. And capital is desperately trying to commodify and cheapen such labor, to the point of demanding governments force coding classes on all kids. There are such jobs, but nowhere near enough, and those are mostly dropped onto cheaper kids, not newly dumped middle-aged workers.

    Asimov was on point, decades ago, when he wrote that inevitably automation would eliminate most jobs, and that the biggest problem (in his view, opportunity) would be finding something for people to do. I would say that people without purpose are the most dangerous force for destruction and stupidity on the planet - worse than global climate change.

    Capital and people who work for capital, and neoliberals and business conservatives who support capital, tend to have well-paying white collar jobs and live among other people of their class, and don't see anything amiss. They're fine. Step outside into the vast middle grounds of the world, and you'll see a growing sense of we're-being-fucked that will require an endless army of pepper-spraying drones and surveillance to keep from erupting into riots someday soon.

    1. Re:Why not. Just get it over with: fire everyone by Anonymous Coward · · Score: 0

      My big concern is not AI it is, as you mention, drones.

      Just think how much governments will have to give a shit about the people when they have drone armies at their command.

    2. Re:Why not. Just get it over with: fire everyone by slackoon · · Score: 1

      There are some posts that deserve more than a max score of 5. This is one of them!!

      *LIKE

  21. Oddly by Greyfox · · Score: 3, Funny

    The winning entry will be a heart warming story about a robot that kills all humans.

    --

    I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

    1. Re:Oddly by Anonymous Coward · · Score: 0

      Sounds divine.

  22. Computers aren't good at everything by Deimos24601 · · Score: 1

    It's very difficult to explain to the average person the difference between a computer problem that is simple and one that is virtually impossible. Obligatory xkcd... https://xkcd.com/1425/

    1. Re:Computers aren't good at everything by Anonymous Coward · · Score: 0

      At the current 20 minutes to grade each one-page essay, you'd think the first step would be streamlining the grading criteria and training teachers so that the time spent per essay is at least halved. A single handwritten page isn't very long. It shouldn't take even ten minutes to read and evaluate based upon standardized criteria and a checklist, which I would hope they're using in order to ensure more uniform grading among all the different teachers who are grading the essays.

      If the tests are taken via computer, typing in the essay is a really dumb idea as K-12 students for the most part are lousy typists (likely fast and efficient TXT-ers, but that won't help them here), and that is just wasting everybody's time at testing time. I am guessing that in order for a computer to grade the essays, they will have to be typed in by the students when they take the exams. OCR-ing grade school handwriting is an entirely different, and infinitely more complex, problem than computer-checking English spelling and grammar. Handwriting the essays would be faster for most students, especially for the younger students who haven't had a chance to at least take a keyboarding or typing class, and also allow evaluation on penmanship, too... because being able to have your written words read and understood by others is at least as important as what's written being one-hundred percent grammatically correct.

  23. Time machine by rossdee · · Score: 1

    from TFA:
    "(ACARA) plans to start marking two-to-three page written components of the test using cognitive computing from 2017."

    They're using software that was written a couple of years in the future...

    Anyway my nephew Eli is into that sort of thing, but I don't know if he reads /. these days
    (using AI to analyze essays, not the time travel part)

  24. Good bye James Joyce! by Anonymous Coward · · Score: 0

    I mean, seriously? I don't expect an AI to be able to tell "Finnegans Wake" from gibberish. I mean, that's tough enough for a human.

    1. Re:Good bye James Joyce! by Anonymous Coward · · Score: 0

      For a system stress test, they could feed it a couple of paragraphs of "Gravity's Rainbow".

  25. Handwritten and possibly in cursive I assume? by Anonymous Coward · · Score: 0

    This gon be goood.

    .

  26. Content Matters (re:Is AI really necessary?) by Capt.Albatross · · Score: 2

    I have to disagree with the statement that content doesn't matter. Without considering the content, you cannot judge whether the student is displaying reasoning and making cogent arguments, or merely faking it. <curmudgeon> it seems to me that the number of people I deal with who cannot tell the difference is increasing - a coincidence? Perhaps not. Murdoch has made a political movement out of exploiting such people.</curmudgeon>

    If you say you cannot do a fair test if content is considered, that is not an argument for dumbing it down to pointlessness; it is an argument for doing it a different way or not doing it at all. In reality, you can set meaningful essay questions, that test a student's critical analysis and reasoning skills, within the context of the humanities and sciences.
     

    1. Re:Content Matters (re:Is AI really necessary?) by AthanasiusKircher · · Score: 1

      I have to disagree with the statement that content doesn't matter. Without considering the content, you cannot judge whether the student is displaying reasoning and making cogent arguments, or merely faking it. It seems to me that the number of people I deal with who cannot tell the difference is increasing - a coincidence? Perhaps not.

      I think both the lack of knowledge of mechanics and of the content can be problems for different populations.

      I know people who have taught writing at various universities. This is only anecdotal, but I can tell you that at a couple top-tier universities, the writing courses were almost solely graded on CONTENT, not mechanics of writing. (Frankly, as someone not in the writing department, I was shocked to hear this... grammar errors and bad style simply didn't matter that much.) I encountered students at such schools who had "deep thoughts" about various issues, but they often wrote essays that were barely grammatical or had all sorts of weird eccentric writing habits that didn't help their arguments. But they were taught to generate content.

      On the other hand, the people I've talked to at lesser universities in English departments say they care very little about content, only things like grammar, spelling, etc. Grading is pretty much an "automated process" already.

      Is this a class difference being reinforced between different types of universities and their emphasis on different elements of writing? After all, the smart kids at the top university are likely to pick up enough grammar etc. along the way that they'll be at least as good as the middling university. But they aren't being trained to express themselves clearly or with a good sense of writing style. Meanwhile, the kids at the cheaper college are getting little guidance in formulating any complex thoughts at all, let alone communicating them. All they get is a glorified spell-checker and grammar-checker.

      Both basic writing mechanics/style AND attention to content are necessary to train people to communicate. I think different places are failing on both counts in various ways.

    2. Re:Content Matters (re:Is AI really necessary?) by Megol · · Score: 1

      For the purpose of _this_ thing content shouldn't matter. If the test is to make sure the participants have good control of written language it doesn't matter if the text itself promotes killing children or paints Pol Pot as a humanitarian.

      Now if we were talking of a test intended to measure critical thinking the content would matter.

  27. At best... by Anonymous Coward · · Score: 0

    At best, such programs can only flag questionable submissions, requiring more in-depth review by a human mind. Grading them totally? What dren! I wonder what these "tools" would make of Ulysses and other James Joyce novels? I suspect he would fail the course!

  28. TSI Written Assessment by __aabppq7737 · · Score: 1

    I recently took the TSI (Texas Success Initiative) test over writing, and wrote an essay with completely nonsensical information, but that was logically structured and scored well.

  29. Do it... transparently by loose_cannon_gamer · · Score: 1

    I think this is a fine idea, as long as the algorithm that scores the papers is publicly known. While this might initially seem like a bad idea, I think it is identical to what we have today - I remember intentionally adjusting my writing style to match teacher expectations in high school/college: some teachers liked me to parrot back facts and figures, others wanted their own theories returned to them, while still others (okay, just once in my school life) rewarded original analytical thinking.

    Since we already train students according to teacher bias of what makes a 'good' human-graded paper, it seems only fair to publish the bias that will be used to define a 'good' electronically-graded paper.

    I see two ways electronic grading can fail.
    (1) Students who submit poor papers which still score highly. If the AI algorithm is complicated enough that real cleverness is required, perhaps that's not a bad thing... And if the AI algorithm is easy to game, everyone will score highly and it will be obvious that the technology wasn't ready and this was a bad idea.
    (2) Students who submit good papers which score poorly. Resolving this probably requires a public appeal-to-a-human-teacher process. If a large number of papers are appealed and found to be of quality, it will be obvious that the technology wasn't ready and this was a bad idea.

    If after the trial, the number of overturn-by-appeals is low and the distribution of scores looks good, then mankind will have found a way to automate another (I believe) tedious task and free up more human capital and resources for more challenging and valuable pursuits, which sounds like a big win. Seems like we ought to try it and learn something.

    --
    In Soviet Russia, us are belong to all your base.
  30. Great way to destroy good writers by gurps_npc · · Score: 1
    I have no doubt that a good AI can tell the difference between an F and B essay. But there are humans that can't tell the difference between a C and an A+ essay.

    Writing is an art form, not a science. If a computer could grade the art of writing, then the computer could DO THE WRITING - or at least 'fix' the problems it detected. In which case it would become the equivalent of teaching humans to use a slide rule.

    I am absolutely sure that our best and brightest writers will end up being screwed over by AI programs grading them.

    --
    excitingthingstodo.blogspot.com
  31. An old joke by Anonymous Coward · · Score: 0

    This reminds me of an old joke:

    He has the brain of a computer - the errors that he makes are fantastic.

  32. Yes by Murdoch5 · · Score: 1
    Computers can do what humans can't. To put it simply, most humans lack a proper understanding of both the context and the meaning of their written language. For instance, how many times have you been told that someone assumed X based on what you said when in fact you never implied it? How many times have you written a comment only to have it taken out of context?

    Yesterday I said on Facebook:

    If you live in Ontario and you're planning to vote Liberal in the federal election, then you have proven stupidity has no limit. Just because Justin's father was a rock star doesn't mean he is. Justin is on par with Wynne for most dysfunctional and idiotic political leader in history.

    I had women telling me that I was suppressing their right to vote, and I had others telling me that I was trying to control their freedom to vote, when in fact both are clearly wrong. So I fully support computer-based grading of essays.

    1. Re:Yes by Anonymous Coward · · Score: 0

      HAL, is that you? Back in the Faraday cage you go.

  33. Colorless Green Ideas Sleep Furiously - Chomsky by Anonymous Coward · · Score: 1

    Automatic essay grading will be the perfect synergy for computer generated essays.

    After all, the computer generated essay will follow grammatical rules consistently (assuming they are programmed in correctly, but let's assume for now that we wait for version 3.1 or so).

    One big question is -- What are you trying to test for -- Do you only care if the student knows proper grammar and can follow it (maybe for lower + middle school English class)? Then automatic grading for STRUCTURE is probably good enough.

    Do you want to see if the student has read + understood content enough to write a meaningful summary / review (e.g., book reviews)? Or if the student has understood the concepts and can make coherent arguments for or against a position (logic / philosophy / debating / management persuasion / marketing spin)?

    Then you need someone who can understand this deeper level of content. Right now, I don't believe any automated system can do it.

    OTOH, if you want to run the essays thru a "first pass" of a syntax / grammar checker, perhaps in the hope of reducing the person-minutes required from 20 per essay to 15 per essay, that might be a great idea. It's not that automation is a completely bad idea, but it's not ready to take on the job yet. As a supplement, it could still be useful.

  34. Keep some Human testers.. by Anonymous Coward · · Score: 0

    I would hope they'd take some of the tests and have those human-graded so that a) test-takers will have to write real papers given the chance that a human will be grading them and b) to compare the human-tested vs the computer-tested paper grades.

  35. Let's be real by Anonymous Coward · · Score: 0

    The vast majority of school essays probably can be accurately graded programmatically. All that's needed is some sort of appeal process for students who think they got boned. The system would probably still come out way ahead in speed, consistency and cost.

    The result of most school work is not to create new/useful/insightful/creative things, but to create something that can easily be judged as fulfilling the requirements. (It's the result of most work work too.) Just because the judge has traditionally been human doesn't change the fact that if you write an essay in the style of Anthony Burgess, you're likely to get a failing mark. Regardless if it's a masterpiece or not.

  36. ...is AI ready for this major task? by Anonymous Coward · · Score: 0

    No.

  37. Is AI ready? Easy to check. by Anonymous Coward · · Score: 0

    Aside from the moral issues, is AI ready for this major task?

    I would have expected more from a forum of computer nerds. Run the AI over all previous NAPLAN essay responses and compare the AI's scores against the teacher scores. If there are significant differences, find out why. If there are no significant differences, win!

    FTA:

    Rabinowitz said the trials show the artificial intelligence solutions perform as well, or even better, than the teachers involved.

    See? Somebody's already checked! Nothing to see here, move along!
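The back-test proposed above (score old essays with the AI and compare against the teachers' marks) could be summarized along these lines. This is a sketch under stated assumptions, not the method used in the trials mentioned in the article: the function name, the metrics, and the one-point tolerance are all hypothetical choices, and it assumes both score lists are numeric and in the same essay order.

```python
from statistics import mean


def compare_scores(teacher_scores, ai_scores, tolerance=1):
    """Summarize how closely AI marks track teacher marks for the same essays."""
    if len(teacher_scores) != len(ai_scores) or not teacher_scores:
        raise ValueError("need two non-empty score lists of equal length")

    # Positive difference means the AI marked the essay higher than the teacher.
    diffs = [ai - teacher for teacher, ai in zip(teacher_scores, ai_scores)]
    abs_diffs = [abs(d) for d in diffs]

    return {
        "mean_abs_diff": mean(abs_diffs),  # typical size of disagreement
        "bias": mean(diffs),               # systematic over- or under-marking
        "within_tolerance": sum(d <= tolerance for d in abs_diffs) / len(diffs),
    }
```

A large `mean_abs_diff` or a `bias` far from zero would be the cue to "find out why"; the essays with the biggest individual differences are the natural ones to hand back to a human marker.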

  38. Until some kid tries a linguistic injection attack by Anonymous Coward · · Score: 0

    to hack the AI and turn it against its masters.....

  39. Arthur Unknown. by Anonymous Coward · · Score: 0

    Similar to this is my favorite sentence:

    Dew knot trussed yaw spilling chequer two finned awl yore miss takes.

  40. Turnitin... by iq145 · · Score: 1

    ...has basically been doing such a thing for years :-)