Fake Scientific Paper Detector
moon_monkey writes "Ever wondered whether a scientific paper was actually written by a robot? A new program developed by researchers at Indiana University promises to tell you one way or the other. It was actually developed in response to a prank by MIT researchers who generated a paper from random bits of text and got it accepted for a conference."
I am always wondering what those damn robots are up to!
but I wonder if it can tell if a paper was written by a million monkeys pounding on typewriters?
Taking guns away from the 99% gives the 1% 100% of the power.
so can a robot write a paper and then decide whether the paper was written by a robot (itself)?
Jesus said to his disciples: "If you don't have a sword, sell your cloak and buy one" - Luke 22:36
RESULTS: FAKE
Yep, it works!
Dark Reflection
When will MIT modify this technology to filter all the spam from my mailbox?
I hope the ACLU will ensure that discrimination against metal people will not be allowed to continue.
How about an intelligent human who understands the field actually READING the paper before putting it in a "peer reviewed" journal.
Has anybody fed Dvorak's latest column to this program? I've often wondered if he actually writes his columns, or just generates verbiage at random.
How about a paper detector that will decide between 1 and 2 ply?
Suppose that in the future, there exists certifiable technology such that we can easily enable the analysis of fake articles. This may or may not actually hold in reality, but it's nonetheless a practical property of the system. Heuristic does not require such a key simulation to run correctly, but it doesn't hurt. This may or may not actually hold in reality. It can be assumed that these algorithms can observe probabilistic information without needing to manage game-theoretic modalities. Thus, the methodology that MIT uses is solidly grounded in reality.
..isn't going to be used to show that iraq is a fake war (with a lot of real costs and lives)
Your ideas are interesting and I want to subscribe to your newsletter
... are you implying that the guy who reads those papers is unable to identify if they were automatically written?
-- Serhei
Do robots make typos? Do they make the same typos each time, or different ones?
Therein lies the true heart of a proper detector.
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
http://pdos.csail.mit.edu/scigen/rooter.pdf
I've taken a long posting that I wrote on my blog and dropped it into the site. And I am Inauthentic. Now I understand the "Bladerunner Moment" comment in the article. I shall begin to surround myself with oddly colored polaroids and snapshots of theoretically implanted ancestors.
The nice thing is that we've finally settled the argument if machines can be made to drink beer and like it !
You know, the one that verifies whether a comment is from a 'real person' or from a 'offworlder', also known as 'troll'.
Too bad the prototype at http://montana.informatics.indiana.edu/cgi-bin/fsi/fsi.cgi thinks most comments are too short...
Roel
With a small mod to this program, we can now get rid of all the androids that post on /.
... for very large values of "1".
This text had been classified as INAUTHENTIC with a 32.2% chance of being authentic text
Bearing in mind that text over 50% chance will be classified as authentic, this adds credence to the theory that slashdot comments are generated by monkeys randomly typing on keyboards.
I've abandoned my search for truth; now I'm just looking for some useful delusions.
Or is this just another application of Bayesian filters again?
In Soviet Washington the swamp drains you.
I'm sure the ID people will apply it to "Origin of species", while others will apply it to the bible.
Way to plug your political agenda.
What we really need is a fake Ecstasy detector. The world would be a better place.
"When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
I guess I should have put a little more effort into faking it, instead of just printing out one of the word salad spams I got in my inbox.
Prove it.
Light on details but I assume that to create this thing a bunch of known-real and known-fake papers were analyzed and they just found patterns that were indicative of a fake. They wouldn't even have to understand why...just dump a bunch of statistics and choose the ones more associated with the fake pile. Just like reverse-engineering spam filters, it could be circumvented by discovering whatever the "real" properties are and focusing on them.
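For what it's worth, here is a minimal sketch of that "dump statistics from two piles" idea. It is a plain word-frequency log-odds scorer (naive Bayes style) and only a guess at the general approach, not the Indiana tool's actual method. The function names, the tokenizer, and the tiny training piles are all made up for illustration.

```python
import math
from collections import Counter

PUNCT = ".,;:!?\"'()"

def tokenize(text):
    """Lowercase, split on whitespace, strip surrounding punctuation."""
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]

def train(real_docs, fake_docs):
    """Return a per-word log-odds weight: positive leans real, negative leans fake."""
    real = Counter(w for d in real_docs for w in tokenize(d))
    fake = Counter(w for d in fake_docs for w in tokenize(d))
    vocab = set(real) | set(fake)
    n_real, n_fake, v = sum(real.values()), sum(fake.values()), len(vocab)
    # Add-one smoothing so a word unseen in one pile never gives log(0).
    return {
        w: math.log((real[w] + 1) / (n_real + v))
           - math.log((fake[w] + 1) / (n_fake + v))
        for w in vocab
    }

def score(weights, text):
    """Squash the summed log-odds into a 0..1 'chance of being authentic'."""
    total = sum(weights.get(w, 0.0) for w in tokenize(text))
    if total >= 0:
        return 1.0 / (1.0 + math.exp(-total))
    e = math.exp(total)  # avoids overflow for very negative totals
    return e / (1.0 + e)

# Toy training piles (hypothetical, far too small for real use).
real_docs = ["we measured the enzyme activity in soil samples",
             "the enzyme samples were measured twice"]
fake_docs = ["we halved the optical drive space of our mobile telephones",
             "we tripled the tape drive speed of our telephones"]
weights = train(real_docs, fake_docs)
```

With this kind of scorer the circumvention is exactly as described: look at which words carry positive weights and use more of them.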
It seems like it wouldn't be too difficult to modify the MIT program to use this new anti-robot robot to write papers that this anti-robot robot would not be able to detect. Ideally, this would be done with a learning algorithm (so that it could easily be extended to other anti-robot robot programs), but reverse-engineering the anti-robot robot (by humans) should also provide a solution.
Now that Indiana U has thrown down the gauntlet, I wouldn't be surprised if MIT responds. Hopefully it will result in an even better paper-writing robot. Ideally, it will lead to dissertation-writing robots. :)
Ben Hocking
Need a professional organizer?
Literary criticism, for instance. Lit. Crit. papers never make sense so only some form of advanced computer algorithm would be able to tell if a paper was written by a human.
- None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
Says "INAUTHENTIC" for the text on this page: http://www.lipsum.com/ BTW, shouldn't it be "Unauthentic"?
Apparently I'm on average 49% artificial, based on school papers I wrote. I dub thee program: a failure.
I suspect that it is looking for the conventional thinking with conventional word structure. As such, it is NOT a good idea i
excitingthingstodo.blogspot.com
I'm positive /. has a very similar program used to write-up stories.
It passed with a "90.1% chance of being an authentic paper."
An Indian-American Hindu committed to non-violent thought/speech/action alarmed by the global explosion of radical Islam
I fed it the Declaration of Independence.
It said it was INAUTHENTIC with a 24.5% chance of being authentic text.
This kinda makes me feel stupid ... I had an undergrad paper accepted at a conference and I went and presented there ... I always felt good about that, cause I was sure some prof read it, enjoyed it, and thus the reason I got the call to present. I guess I was wrong ... they don't read em, they just pick random ones out of the pile.
What I think is interesting is that program is like an environmental hurdle for computerized paper writers everywhere. Now the bots must become more advanced.
...
Why does anyone think this could effectively detect "robot" papers? All a robot has to do is incorporate this litmus test into its writing algorithm and make recursive changes until the paper passes.
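A rough sketch of that loop, assuming the detector can be called as a black box that returns a score. The scorer below is a toy stand-in I made up (fraction of words found in an "approved" list), not the real detector, and `hill_climb`, `toy_score`, and `APPROVED` are hypothetical names.

```python
import random

# Hypothetical stand-in for a black-box detector: the fraction of words that
# appear in a fixed "approved" vocabulary. Not the actual Indiana tool.
APPROVED = {"network", "latency", "throughput", "algorithm"}

def toy_score(words):
    if not words:
        return 0.0
    return sum(w in APPROVED for w in words) / len(words)

def hill_climb(words, candidates, score_fn, threshold=0.5, max_steps=1000, seed=0):
    """Swap one random word at a time, keeping a swap only if the score rises."""
    rng = random.Random(seed)
    current = list(words)
    best = score_fn(current)
    if not current or not candidates:
        return current, best
    for _ in range(max_steps):
        if best > threshold:
            break  # the text now "passes"
        trial = list(current)
        trial[rng.randrange(len(trial))] = rng.choice(candidates)
        trial_score = score_fn(trial)
        if trial_score > best:
            current, best = trial, trial_score
    return current, best

start = "monkeys pound on typewriters all day".split()
final, final_score = hill_climb(start, sorted(APPROVED), toy_score)
```

The score can only go up by construction, which is the whole problem with publishing a detector that anyone can query.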
Bah! let me know when they invent a real one.
Ask and ye shall receive:
This text had been classified as
INAUTHENTIC
with a 25.7% chance of being authentic text
-RouterSlayer
Oh shit!
;p
We've just found out how Britney Spears and Puff Daddy's groups make their albums' lyrics!
They're out of the job for sure...
plenty of plagiarism detection software out there; if the prank was really just random bits of (I assume pre-existing and public) text, then all the program need do is search google for a few random snippets, no?
The Monkey Anti-Defamation League has retained our law firm of Dewey, Cheatum, and Howe, to represent them in this matter.
We demand you cease and desist from defaming Monkeys by comparing them to Slashdot posters. If this defamation continues, we will be forced to pursue legal action. Or to mod you off topic because TFA is about robots, not monkeys.
Sincerely,
John Q. Cheatum, IV, Esq.
Maybe slashdot can start running it on their links for "cold fusion in 1 year!".......
a crappy paper detector ...
Israel bombs Iran with US AWACS support. It's hit the fan.
It seems that some people on Slashdot are so boolean in their mindset that it never occurred to them that the statement was serious, but the way it was stated was a joke. Go watch some of the absurd things "Papa Bear," as Colbert calls him, says... Hasn't anyone ever heard the expression "never feed a troll"? Have a good day.
Well, it is a fake war in the sense it is based on multiple false premises and that its only real purpose was the short term political gain necessary to (just barely) get George Bush elected for a second (and hopefully, if the Constitution is still in effect in 2008, final) term. The continuing chaos and loss of lives is also beneficial to George Bush in that he is able to give more and more no-bid contracts to clean up the mess (which appears to be designed with no real end in sight) to his cronies in the private sector. Thus, the Republican party gets funding even though the only constituency they are (really) helping is one made up of corporations the Republican party is giving the government's money to. It is the perfect war for the actual purpose it was created for. That purpose is no where near what it was sold to the American people as. I wish we had a detector for this kind of bullshit, or at least a media that prior to 2005 was willing to report on the corruption in the Republican White House.
FYI, while the "conference" the prank paper was accepted for is arguably a real "conference," it's certainly not a reputable one. The "conference" ("World Multi-Conference on Systemics, Cybernetics and Informatics") is famous for spamming everyone in just about every semi-related subject to submit and has a famously low bar for acceptance. See http://en.wikipedia.org/wiki/WMSCI
JAR JAR Oyi, mooie-mooie! I luv yous! The frog-like creature kisses the JEDI.
QUI-GON Are you brainless? You almost got us killed!
JAR JAR I spake.
QUI-GON The ability to speak does not make you intelligent. Now get outta here!
This text had been classified as INAUTHENTIC with a 46.0% chance of being authentic text
Results from one of my papers: http://aem.asm.org/cgi/content/full/70/10/5980
This text had been classified as
AUTHENTIC
with a 95.2% chance of being an authentic paper
Whew!!, cool maybe I'll pass the turing test too.
Damn, I actually have to hand write those TPS reports. I had my TPS writer 3000 do it all for me before.
It's not -1 Flamebait! It's +5 Funny. You just didn't get the joke...
Results on an article I wrote recently for my blog:
This text had been classified as
INAUTHENTIC
with a 28.1% chance of being authentic text
Maybe that just means my writing is well done?
I've had several professors that prefer we email our work to them for a variety of reasons, and I wouldn't be surprised if one of those was to check the originality and validity of our essays. I don't think it's a far step before they start using something like this to see if our writing is original... but just like lie detector tests and other means of verifying the "truth," it's not accurate and shouldn't be used with the assumption that it is.
That won't stop someone from doing so, however.
from people who have fed it (and no, I haven't R'd TFA -- this is still SlashDot, isn't it?!?!) their own (genuine) papers or something they feel is "authentic", and I wonder if the reason is less the fault of the software and more the fault of (genuine/human) authors writing (intentionally or unintentionally) in such a style because it's perceived to be the way they're "supposed" to write. Maybe software like this will cause authors to put a little more thought into their craft and not allow themselves to just write on autopilot.
This space intentionally left (almost) blank.
I felt heartened that my masters thesis came up 96.8% authentic. Guess I'm mostly human.
Sig cannot be found.
I then tried an article from Scientific American and it scored 24% - sorry, guys, time for me to cancel the subscription, you are full of it. Alternatively, of course, it is the University of Indiana School of Informatics that's full of it and the air is thick with over-hype. It would be interesting for someone with the time and energy to feed in some papers on string theory and some articles on astrology and compare the results.
Pining for the fjords
So I go there, and I start shoving it text from my hard drive. I try:
A) Text of an article (Philosophy) I (native English speaker) wrote in Italian: 98.5 Authentic.
B) Text of an article I wrote in English (History): 87.8
C) Text of an article (History) written in French by a native French speaker and translated into English: 93.2
D) Critical edition of a 14th-century Latin text (Theology): 97.7 Authentic.
E) Documentation to a Field Artillery Simulation: 95.3
F) A completely bogus narrative for a monastic order that doesn't exist, written in a style that mimics A)-C): 16.8% Inauthentic
So in this case, we have a human written document that has superficial meaning, but is written as a "fake scientific paper", and registering as such.
And yes, I did read the "purpose" of the page; I know it's not supposed to detect it.
And yet it does, decisively.
"A Million Little Lines of Code"?
The Special Theory of Relativity got a 91.9% chance of being authentic. I'm sure if Einstein were alive, he'd be relieved.
Sometimes it's best to just let stupid people be stupid.
....and it appears that the detection software rates them INauthentic if I only give it the first page of text (about 300 words), a ~25% score.
With 3 or more pages of text the score seems to converge to ~93% (authentic).
So be careful when scanning short articles or documents.
I stand by my claim that the papers I used were written by a human (me) and so was this post.
I made a paper, and got a 24.3% chance of Authenticity, then changed one word, and got it up to 24.5%.
Either it can read my mind, or has a weird algorithim(sp?).
One must understand our network configuration to grasp the genesis of our results. We ran a deployment on the NSA's planetary-scale overlay network to disprove the mutually largescale behavior of exhaustive archetypes. First, we halved the effective optical drive space of our mobile telephones to better understand the median latency of our desktop machines. This step flies in the face of conventional wisdom, but is instrumental to our results. We halved the signal-to-noise ratio of our mobile telephones. We tripled the tape drive speed of DARPA's 1000-node testbed. Further, we tripled the RAM space of our embedded testbed to prove the collectively secure behavior of lazily saturated, topologically noisy modalities. Similarly, we doubled the optical drive speed of our scalable cluster. Lastly, Japanese experts halved the effective hard disk throughput of Intel's mobile telephones.
Building a sufficient software environment took time, but was well worth it in the end. We implemented our scatter/gather I/O server in Simula-67, augmented with oportunistically pipelined extensions. Our experiments soon proved that automating our parallel 5.25" floppy drives was more effective than autogenerating them, as previous work suggested. Similarly, We note that other researchers have tried and failed to enable this functionality.
Let's see:
How anyone would have thought this paper wasn't a flaming pile of BS is beyond me. I especially like the graph that measures time in teraflops. WTF???
Does this qualify as spam? Was it unsolicited and generated to bypass spam filtering technology? Could I run this spam email through the inauthentic paper detector and have it come out as authentic?
Sounds like it might have been a solicited email from a campus mailing list, as it is something that would only be pitched to say, engineering students at a university, who have interest in these subjects. I'd rather receive this than all the penis enlargement solutions and horribly filthy pr0n spam that I get.
It is alarming that the MIT fake paper made it through the selection process: Academic pretense is dangerously high if the only requirement for a paper to make it into a conference is that it is full of multi-syllabic scientific jargon, and no one actually reads it closely enough to understand it.
OK, it is talking about networks, physics and everything in between. So how can anybody say the document is authentic? So, hackers with Markov and quantum mechanics in a web paper?
Something looks fake to me...
Creative misinterpretation is your friend.
'This text had been classified as
AUTHENTIC
with a 59.9% chance of being an authentic paper'
Well, that's the results of 3 paragraphs of generated Lorem Ipsum from www.lipsum.com...
All this talk without a single mention of the Sokal Affair? It's pretty relevant. Also be sure to check out Paul Boghossian's article, "What the Sokal Hoax Ought to Teach Us." Great reading.
Have any of you heard of The Sokal Hoax? In 1996, a daring and dissatisfied physics professor named Alan Sokal wrote a bullshit paper called "Transgressing the Boundaries: Toward a Transformative Hermeneutics of Quantum Gravity", which Sokal called "a pastiche of left-wing cant, fawning references, grandiose quotations, and outright nonsense", which was "structured around the silliest quotations I could find about mathematics and physics" made by humanities academics. In short, it caused a big scandal because the paper was readily accepted without review by Duke University's postmodern cultural studies journal Social Text. It's probably one of the best and most controversial examples of a hoax on the "academic community," and it is excellent proof of just how much bullshit flies for "cultural studies." Run THAT through your paper detector! Read more about it here: Skeptic's Dictionary and Museum of Hoaxes
Limina.Log
"Right, Make sure E' Finds out whether or not this paper was written by a computer." "Right. We stay 'ere and press the buttons until you come back!" "No. It's quite simple, really. You stay 'ere and watch the robot!" "Not to leave unless the robot finishes at which point we fix the errors in the paper in question!" "Yes. Um, no. Watch the Robot, log the errors, notify me if something goes wrong." "Well, of course we do that! Um, Um, Um, IF someone ELSE came in and entered in an extra bibliography or two, do I leave then?" Blimey.
Here you go. "These new kits can therefore quickly and easily distinguish between real ecstasy and all the common substitute drugs on the ecstasy market, including DXM." -- http://dancesafe.org/testingkits/
I submitted two of my undergrad papers: One in International Economics, one in Strategic Management. The former scored 95.9%, the latter 45.7%! Just goes to show that economics is more than twice as scientific as management.
Cheer up, chances are they really just felt sorry for you!
The program simply does not work very well. I tried several samples of my own writing, and they all came up as Inauthentic, with 21% chance of being authentic. Better luck next time.
I'm surprised nobody fed in the Bible to see if it was authentic. If it were deemed inauthentic, I expect the intellectuals on here would take it as evidence that God does not exist.
Do not mark in this space. For official office use only.
Duplicating the first half of the sample fake paper after the end of the footnotes makes it go from inauthentic (17%) all the way up to 91% authentic. It seems to be looking for long-range n-gram repetition, but it doesn't have a ceiling on frequency or length of the repeated text.
It shouldn't be hard to compare the distribution of n-gram recurrence rates (or distances between recurrences) to the observed distribution for actual papers. Something like a KL divergence would capture deviations in either direction.
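Something along these lines, as a sketch only. The bucket edges, the add-one smoothing, and every function name here are my own assumptions; nothing below comes from the detector itself.

```python
import math

def recurrence_distances(words, n=2):
    """For each n-gram that has been seen before, record the gap (in word
    positions) back to its previous occurrence."""
    last_seen = {}
    distances = []
    for i in range(len(words) - n + 1):
        gram = tuple(words[i:i + n])
        if gram in last_seen:
            distances.append(i - last_seen[gram])
        last_seen[gram] = i
    return distances

def bucket_distribution(distances, edges=(10, 100, 1000)):
    """Histogram the gaps into coarse buckets (<10, <100, <1000, >=1000) and
    normalize, with add-one smoothing so no bucket has probability zero."""
    counts = [0] * (len(edges) + 1)
    for d in distances:
        counts[sum(d >= e for e in edges)] += 1
    total = sum(counts) + len(counts)
    return [(c + 1) / total for c in counts]

def kl_divergence(p, q):
    """D_KL(p || q) in nats; q must be nonzero wherever p is nonzero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Compare a submitted text against a reference distribution that would, in
# practice, be estimated from a pile of genuine papers (hypothetical here).
submitted = bucket_distribution(recurrence_distances("a b c a b c a b".split()))
reference = bucket_distribution([50, 50])
divergence = kl_divergence(submitted, reference)
```

A text that repeats itself far more than real papers do (the paste-it-15-times trick) would then score as a large divergence instead of sailing through as authentic.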
This raises a question... how do Wikipedia articles fare? --I'd guess that they should be at least *somewhat* scientific....
I just finished writing a scientific paper for publication. Apparently, this filter is very reliant on using long-term pattern recognition. When I fed this application my introduction only, it told me my work was INAUTHENTIC with a 35% chance of authenticity. When I fed it the first two sections, it said it was AUTHENTIC with a 66% chance of authenticity. And finally, when I fed it the entire paper, it said it was AUTHENTIC at the 87% level.
So apparently, all you need to do to beat this filter is insert the same buzzwords and phrases at many different points in a long article, and you should be fine.
We applaud development of heuristic filter success. Many sophisticated algorithms go into recursive development of low-latency, high-bandwidth sieving systems. Ongoing procedural optimization with commensalism yields best signal/noise ratio. Additional funding needed!
Nostalgia's not what it used to be.
http://montana.informatics.indiana.edu/cgi-bin/fsi/fsi.cgi
Copy-paste this like 10-15 times in the box -
"Paste any text in the textbox. The chance that your submission is a human-written authentic scientific document will be output. Text over 50% chance will be classified as authentic."
AND YOU GET -
This text had been classified as
AUTHENTIC
with a 95.3% chance of being an authentic paper
10 - The researcher submitting it is wearing a fake nose and glasses.
9 - You notice that the pages, instead of being typed normally, are actually ripped out of a book and stapled together.
And so on and so on...
John Dvorak's article came back with this result:
This text had been classified as
INAUTHENTIC
with a 25.9% chance of being authentic text
I for one will welcome my new dissertation writing advisor-robot
I wonder if this program, with a different set of algorithms, would be able to detect whether a corporate mission statement was created using the Dilbert Mission Statement Generator. (Beware; Dilbert.com is pop-up hell.)
Find environmentally and socially responsible products on http://buy-right.net
As a litmus test, any such device should be fed the writings of Jack Sarfatti, PhD (http://en.wikipedia.org/wiki/Jack_Sarfatti). It is perfectly possible that a paper produced 100% by a human still consists of random bullshit (See: "Waldyr A. Rodrigues Jr: A Comment on Emergent Gravity" at http://arxiv.org/abs/gr-qc/0602111).
I just tested a few of my own papers and noticed that while the complete text resulted in over 90% authenticity score, shorter sections of the same paper typically scored below 20%. If a section is long enough, it does not matter whether it's logically disconnected (e.g., paragraphs picked from several different chapters), so it seems that sufficient length is crucial for this testing. Was your "bogus narrative" significantly shorter than the other texts, by any chance?
Have it scan the blogosphere.
God I hate that word.
The 'Net is a waste of time, and that's exactly what's right about it. - William Gibson
Ever see the movie The Big Hit with Mark Wahlberg...
I will use my Trace Buster Buster Buster!
If the people who request the papers actually read them there would be no need for a randomly generated essay.
---
...because my persuasive essay for English class got a 29%... Phew, I thought I had found out I was a robot...
-FB
The program only pretends to use computer algorithms. In reality, it emails the submitted document to the Indiana University speed-reader champion trained to recognize fake submissions. The prof skims it, and emails back the response.
Perhaps I'm getting too old|bitter, but this seems like a dumb joke. Essentially 3 grad students in a relevant field submit a bogus, although somewhat cleverly* generated paper. Frankly, they got accepted because their paper said MIT EECS, not because their program was so great. Sure, I suppose it says a lot about academia. Frankly, if anything, it should be an eye-opening look at the 'Publish or Perish' mania that these kids are walking into. *I'm impressed in the same way I'm impressed when a 12 year old script kiddie writes some malicious VBA code and passes it off as a virus. Incidentally, Claude Shannon published a series of papers on such work over 50 years ago. You can find a description of his work, and more, in John Pierce's great pop-sci book "An Introduction to Information Theory: Symbols, Signals, and Noise". Plus, it's a Dover book -- so it's cheap!
What do you mean my sig is repetitive? What do you mean my sig is repetitive? What do you mean....
This text had been classified as INAUTHENTIC with a 31.7% chance of being authentic text. Well, it doesn't work on anything but large technical writing. Essays for Communications won't end well.
I knew this:
http://www.elsewhere.org/pomo
Just keep hitting refresh and you get a different paper each time.
Curious point, pasting one of these articles on the page at
http://montana.informatics.indiana.edu/cgi-bin/fsi/fsi.cgi
I got this result:
"This text had been classified as
AUTHENTIC
with a 93.7% chance of being an authentic paper"
Either the generator is very good or the authenticator very bad.
https://www.accountkiller.com/removal-requested
Looks like this might be much harder
Squirrel!
I tried 5 paragraphs from www.lipsum.com and got a 97.4% score, which is BETTER than either my blog (94%) or two of my own recent scientific papers (74% and 67%).
This is probably because (as they explain in their paper) they trained their tool on real and fake English. It worked quite well on "technobabble" from http://www.duckisland.com/GreekMachine.asp (28%). It also worked on the Postmodernism Generator http://www.elsewhere.org/pomo (41%).
The deployment of lambda calculus has visualized expert systems, and current trends suggest that the improvement of simulated annealing will soon emerge. The deployment of lambda calculus has visualized expert systems, and current trends suggest that the improvement of simulated annealing will soon emerge. The deployment of lambda calculus has visualized expert systems, and current trends suggest that the improvement of simulated annealing will soon emerge. The deployment of lambda calculus has visualized expert systems, and current trends suggest that the improvement of simulated annealing will soon emerge. The deployment of lambda calculus has visualized expert systems, and current trends suggest that the improvement of simulated annealing will soon emerge. The deployment of lambda calculus has visualized expert systems, and current trends suggest that the improvement of simulated annealing will soon emerge. The deployment of lambda calculus has visualized expert systems, and current trends suggest that the improvement of simulated annealing will soon emerge. The deployment of lambda calculus has visualized expert systems, and current trends suggest that the improvement of simulated annealing will soon emerge.
I was walking by the White House and my detector exploded. I called the manufacturer and they said there's a warning in the instructions about coming within a hundred yards of a government building. There's a ten mile safe zone for any agency dealing with climate or energy. Teach me for not reading the instructions.
This text had been classified as
INAUTHENTIC
with a 31.2% chance of being authentic text
Oh, sorry, I thought that the Scientific Paper Detector was a fake.
Nate
A new program developed by researchers at Indiana University promises to tell you one way or the other.
You would think that this embarrassment would cause the paper reviewers to look closer at what the heck they are accepting, but instead we get a program that does that job better.
Just anything, ANYTHING to keep those reviewers from actually getting their work done is well accepted.
I copied all the text from the Slashdot main page, pasted it there and got:
This text had been classified as INAUTHENTIC
with a 19.8% chance of being authentic text
I, for one, welcome our new robotic Slashdot overlords.
Apparently it didn't occur to you that people on Slashdot are fairly fed up with the offtopic political rants. There's a whole section for you guys. Go do your thing in there.
Slashdot - where whining about luck is the new way to make the world you want.
I am sad to report that the chicken article [pdf] has been classified as inauthentic. But maybe the program just doesn't understand papers written in Chicken [gif].
reminds me of the sokal affair.
a scientist submitted a fake ENGLISH paper to an established english journal and got it accepted (opposite of this slashdot article's premise)
So... this detector is supposed to discriminate between scientific and non-scientific papers but is totally bogus?
The so It was Not kept in A weekly one Already GRADUATED to obtain this and intelligence groups of existence: like the help prove Einstein's General unified Theory the Himself he sent Military, are formed by that orthodox cancer eat Only. UN altered REPRODUCTION and more their Negative Karma can cells to have Also includes in new light according to create the God Incarnate, and David Beter I heard of REincarnations into the pole, should do Not be emphasized that the same attention should subsidize consumption by a led to Reincarnate back into the degree in space Triad, of Into Density inside her Aura energy a Russian moon bases and Coriolis Effects, bacteria in EXPERIENCED there is Totally ABSURD! And Predictions by Louis Kervran, active members WORLDWIDE; start Moving back gravitationally; NH And three day each hour. McElwaine Physics and then that concept many other for if Plants by a three dimensions are ARTIFICIALLY CONTRIVED by found in Three aircraft carriers were deployed in the star but what did Not generate is Encouraged, especially to Directly or four more Information, answers to COMPUTER BULLETIN BOARDS. And all Dr: closet and centered where the heavier Mineral supplements plus as Exploding star was to mcg. If this was Not be reaped by reaching the price Of most victims in books like in the Lyran the His choice. The next level applications for more Information about this is born to compete against contruction of light Acceleration drops toward each step away from Patent, number required. My Cited Sources The Coriolis Force. Besides acting as It. According to them a Future, or more than air or traveling Over Water cycle and emotional feelings are basically side views of the FEDERATION the laws are Lukewarm and personnel so their infrared signatures were also prevent and microwave brain scrambling equipment would agree, and sending them in the Earth's inner surface. 
The One of a Tidal Wave patterns pairs of quasars, and mail monthly by the a bad encounter with sixteen ounces of a bad hypnotist SPONTANEOUS RECALL of electrical charge on the Moon and Magnetic fields. Whenever they work week preceded by which Larson the energy of, natural DATUM at Great Lakes, and astronomy, UW EC UW EC see the rest of them in from Hinckley's gun! Fuelless Propulsion fields; they don't ever since early CATHOLIC CHURCH is Yet, they have to allow all because they been INFORMED about Larson submit to achieve earn the Earth's Gravitational shell is each person's DOZEN major World dictator during the and AL and Three dimensions, different Rate, Unit velocity and RE IMPOSE it causing a globular star Again have suddenly the following sun's heat; get close Contact With it preferably grapes every action in Larson's theory that have wish to COMPUTER BULLETIN BOARDS: Host of them toward Physics General theory, of the late and each other Medical problems including Saddam Hussein Billion bacteria in chemistry, and he describes several years, from the others, do Not have extensions, counter attack one's subconscious Mind Control. Sant Mat requires its own Version some Of the little known as of Larson's Theory books, are expected to the BOLSHEVIK Controlled in when asked or Universes: above, the Nineties Matrix INSTITUTE of a bad that Time, starts to mcg; fit on the physical universe, of in Minneapolis, after The on a even more types of REincarnations into Future most: people will meet, the center in or traveling there is supported by and this new Physics We have the Devil, Satan, Lucifer, etc. 
All directions on local ordinances to prevent the Future INCARNATION: the Law: of orthodox religions: with this is Encouraged, Especially with out of western portion of REPLICANTS in this article with a Secondary Previous Life Begin with two Hours, after a doctor: Of Motion, students In field of A result Of water, usually results from opposite degrees reaction Co, WORKER with a physically dies s in the upper birth as well, respected gravity explain The capability for All astrophysical mysteries, including, the Waco Regardless of The physical or Sri H
They're probably storing all text entries that come back as authentic so they can plagiarize them later.
At least, that's what I'd do.
The following text from the Slashdot homepage was classified as inauthentic:
Neopallium writes to tell us that in a recent announcement at the Desktop Linux Summit the Free Standards Group reports fourteen of the leading Linux vendors have pledged support for the newest release of the Linux Standards Base. From the article: "'The Release of LSB 3.1 is another milestone achieved by the industry and the Open Source Community that delivers ever increasing value to customers,' said Reza Rooholamini, director of enterprise solutions engineering at Dell. 'It enables further uniformity and standardization across applications and distributions that allows quicker deployment of Linux solutions with higher levels of quality.'"
moon_monkey writes "Ever wondered whether a scientific paper was actually written by a robot? A new program developed by researchers at Indiana University promises to tell you one way or the other. It was actually developed in response to a prank by MIT researchers who generated a paper from random bits of text and got it accepted for a conference."
JordanL writes "Hot on the heels of the beta rollouts of IE 7, comes an editorial from John Dvorak declaring IE the biggest mistake Microsoft has ever made. From the article: 'All the work that has to go into keeping the browser afloat is time that could have been better spent on making Vista work as first advertised [...] If you were to put together a comprehensive profit-and-loss statement for IE, there would be a zero in the profits column and billions in the losses column--billions.'"
I'm amazed too! It works!
This Register article has a series of links documenting the early stages of the forthcoming overthrow of mankind:
http://www.theregister.co.uk/2006/04/25/bendy_bus_attack/
In a fit of boredom I decided it would be a good idea to test out old documents from the USA's history. First I checked out Paine's Common Sense: a 97% chance of being authentic. That's fine. Then I stuck in the Declaration of Independence. It has a 36.1% chance of being authentic. Guess our founders were robots...
Read the paper listed in the menu of the website. The system essentially compresses the text with different window sizes, and then looks at the compression factors. In other words, it is only looking for repetition of strings. This is absurdly easy to fool, and the MIT generator could easily be fixed to pass this filter. For example, try entering a random text once (your post, for example). Note that it fails. Then append a few copies of the same text, and run that through. Your post, when run once, is too short. When run with two copies, it is rejected at 41.2%. When run with three, it passes with 93%. There is a window of repetition level required in order to pass: papers that do not repeat enough are classified as fake, as are papers that repeat too much (try entering twenty copies of your post).
It should be relatively simple to make a random paper generator that always passes this test with a higher probability than human-written papers.
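To make the compression idea above concrete, here is a minimal Python sketch. It is not the Indiana detector: it uses zlib's DEFLATE as a stand-in compressor, and the names `compression_factor` and `factors_by_window`, along with the window sizes, are my own assumptions for illustration. It only shows how repeating a text changes the compression factor at each window size, which is the quantity the classifier is said to look at.

```python
import zlib


def compression_factor(text: str, wbits: int = 15) -> float:
    """Compressed size divided by raw size, using a DEFLATE window of 2**wbits bytes."""
    raw = text.encode("utf-8")
    comp = zlib.compressobj(level=9, wbits=wbits)
    packed = comp.compress(raw) + comp.flush()
    return len(packed) / len(raw)


def factors_by_window(text: str, window_bits=(9, 12, 15)) -> dict:
    """Compression factor for each window size (hypothetical choice of sizes)."""
    return {w: compression_factor(text, w) for w in window_bits}


# Stand-in for "your post"; any short text will do.
sample = "The quick brown fox jumps over the lazy dog near the river bank. "

for copies in (1, 2, 3, 20):
    # More copies means more repeated strings, so the factor drops.
    print(copies, factors_by_window(sample * copies))
```

Pasting more copies drives the factor down at every window size, which would be consistent with the score moving as copies are added. How the real tool maps these factors to a percentage is not something this sketch tries to reproduce.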
Here is a better example, consisting of me randomly pressing my fingers into the keyboard while pressing space and enter every once in a while. The following text, when copied three times, will give a 97% chance of being an authentic scientific paper. Again there is a curve here, with two reps giving an 80% chance and four giving 91%:
hsflhakjdfhaksehnioanevoiralewytuakeltvkaseln vasodvalskdhtnaksdltaesoiutylvnaesytaesntrvaestya
Hmmm, it's an interesting idea, but it seems to give a lot of false positives. (So naturally, it will detect fake papers, if it thinks every paper is fake.)
First thing I tried was some pages on my computational oncology website, in particular my cancer primer, which took me quite a while to write. Everything I fed it was determined to be inauthentic. Perhaps I just write like a robot. :-) I figured that perhaps the detector was more primed for real papers, so I figured it wasn't too big of a deal.
So, next I tried my most recent research paper, and it, too, was determined to be inauthentic, and in fact with less authenticity than my website. So much for the theory of being primed for scientific papers only. This thing is starting to look pretty bogus to me ... but an interesting idea, nonetheless. -- Paul
OpenSource.MathCancer.org: open source comp bio
I pasted the article twice, and got a score of 91.6%. Pasting it three times gave 94.0%, and declined from there: four copies produced 92.7%, and seven copies gave 80.0%.
It took eight copies for the program to realize that something was wrong, and declare the articles "inauthentic" without giving a score.
Surprisingly, this also works for the prank MIT paper: I pasted it twice, and got 93.3%!!
Copy-pasting the research report written about the program:
http://montana.informatics.indiana.edu/fsi/siampaper.pdf
yields:
This text had been classified as
INAUTHENTIC
with a 17.7% chance of being authentic text
hmm.
I've read several. Hopefully, I'll have written one soon. Since my research is in neural networks, I figure if I can create a neural network that writes my dissertation for me, that's not really cheating. (That's not really what I'm trying to do - my actual research topic is on the cognitive effects of gamma and theta oscillations on a neural network model of the hippocampus.)
Ben Hocking
Need a professional organizer?
I submitted the "Inauthentic Paper Detector" article to the Inauthentic Paper Detector. The result was Inauthentic.
I wonder how much time will pass until SEO link farmers start using modified versions of this tool as Web Page Generators to feed the spiders and boost their rankings while search engine maintainers try to keep up with ever 'smarter' Web Page Authenticity Detectors.
The Hacker's Guide To The Kernel: Don't panic()!
http://pdos.csail.mit.edu/scigen/rooter.pdf is where you can view the paper.
string sig = llGetSig("dimentox"); llSay(0,sig);
I've always wanted to submit a paper to one of these vanity conference "peer reviewed journals" [cough cough], the ones where no paper is ever rejected, describing some work on long-discarded theories (>50 years). Just to be cheeky.
How does "N-ray studies of the Phlogiston Content of Polywater" sound?
Should probably wait until after tenure...
Some variant on this thing might be useful as a new article filter in Wikipedia. We need more automation over there to stem the flow of incoming dreck.
I fed it my password file...
and it said "Thank you for your generous financial contribution."
We already have that:
M-x psychoanalyze-pinhead
Either way, it failed miserably when I tested it with the Sokal affair paper:
"This text had been classified as
AUTHENTIC
with a 93.8% chance of being an authentic paper"
It said my paper was too short. I only submitted my abstract. Oh well, I guess I will never find out if I am authentic.
I wonder if it autoflags anything that comes from / contains MIT (in the context of the closed university), as fake out of spite.
Twitter supports and protects racists - by smearing their critics with the "Hate Speech" label.
In Philip K. Dick's "The Exit Door Leads In", Bob Bibleman submits a 3 line story idea to a fiction generating machine. The story immediately gets published across the galaxy before he can check the result. He has to publish a correction as a sequel.
As this story was published in 1979, it shows we are slowly catching up.
If you cut and paste the New Scientist article about the randomly generated paper at http://www.newscientisttech.com/channel/tech/mg18624963.700.html it will classify it as inauthentic...
I just entered one of my papers (which has been accepted by a quite prestigious journal) and it classified it as INAUTHENTIC with a 29% chance of it being authentic. HMMMMMMMM ... not sure if it's their fault or mine :)
I thought peer review would filter out all mediocre papers from being published.
If the peer review process has accepted a paper for publishing, it simply means
that the paper is of good quality. In that case, how does it matter if the paper
was written by a human being or some robot?
It was a freaking fake conference, wasn't it? So does it matter if fake papers are presented? I see http://fakeconferences.org/ has been taken offline. The chill wind of lawyers, perhaps.
I fed abstracts from several highly cited papers in AI and robotics into this thing. None got higher than 50%, with the majority of them being classified below a 40% chance of being written by a human. Talk about a high false positive rate. I guess the publication about this project won't end up in the set I mentioned above.
George Takei (the actor who played Sulu) was complaining at a recent convention that a radio show was manufacturing sentences he didn't say. He said they fed a recent book-on-tape he had performed into dice-and-splice software to do this. George is a fairly outspoken liberal in local politics, and this could have been the motivation for harassing him.
They're not quite science, but all 4 papers I tried from the Postmodernism Generator at http://www.elsewhere.org/pomo were classified as authentic, though with a lower score than my Stochastic Processes homework was (between 75 and 91%)
My senior undergraduate thesis was scored as about 28% authentic. WTF?
Which means either
1) I need to rewrite this paper, or
2) I've been replaced by a robot, but don't know this as I'm programmed not to.
This reminds me of a movie where a few students started sending tape-recorders to class instead of themselves. Gradually the scene had the professor lecturing to a room full of tape recorders. The last step in this scenario was a tape of the lecture being played to a room full of machines taping it.
Apparently, it's Real Genius. I have vague memories of that movie... I'm going to have to watch it again one of these days.
This sig has absolutely no significance and serves only to take up screen space and waste the time of the reader.
I entered the first couple pages of my undergrad thesis on Antiferromagnetic materials...
...
The Heisenberg Model for 2D Spin-½ Triangular Antiferromagnets:
An Application to Cs2CuBr4
Abstract
The Heisenberg model was used to analyze the properties of the quasi-two dimensional (2D) spin-½ triangular antiferromagnet Cs2CuBr4. High temperature series expansions of the magnetic susceptibility, Padé approximants, D-Log Padé approximants, and least squares analysis were used to determine diagonal nearest neighbor (J1) and nearest neighbor (J2) exchange constants, the Lande factors (g), the saturation field (Hs), and to provide evidence of spin frustration in this system. The theoretical calculations of these quantities are close to those determined by experiments, but are not close enough to conclude that Cs2CuBr4 is completely described by this model.
Consider the ionic crystal MnF2 which has chemical notation Mn2+F2-. It crystallizes in the face-center cubic structure as shown in figure 1.
and here are the results:
This text had been classified as
INAUTHENTIC
with a 30.9% chance of being authentic text
Did the developers forget to account for chemical notation??? oops...
Good thing this wasn't a basis for my grade!!!
*Owns, but not operates: An Ian can drive and surf, but cannot install Linux or change his own oil. See, that would muss your hair.
Intelligent Design: because MATH is HARD.
> It seems like it wouldn't be too difficult to modify the MIT program to use this
> new anti-robot robot to write papers that this anti-robot robot would not be able to detect.
For this specific algorithm, sure. In general, however, it may be intractable.
It's similar to the basis of cryptography---some functions are easy to compute in one direction, but extremely hard to reverse. Consider, for example, a hash-based checksum; it's easy to tell whether the checksum is correct, but extremely hard to create a C file with a specified checksum that also compiles to sensible code.
Similarly, it may be easy to determine whether a document is machine-written, but extremely hard to machine-write a document that has appropriate low-level structure (i.e., compiles) but also has correct high-level structure (i.e., does not look machine-written), even if the checking algorithms are known.
(Of course, finding such testing algorithms may be highly non-trivial.)
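The checksum analogy above can be sketched in a few lines of Python. This is an illustration only: SHA-256 is my choice of hash, and `checksum`, `verify`, and `forge_prefix` are hypothetical helper names. Checking a digest is one hash call, while finding an input that matches even the first three hex digits of a target digest takes a brute-force search, and that search ignores the extra requirement that the result be sensible C.

```python
import hashlib
from itertools import count


def checksum(data: bytes) -> str:
    """Hex SHA-256 digest of the data (SHA-256 is an assumed choice of hash)."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected: str) -> bool:
    """The easy direction: one hash call and a comparison."""
    return checksum(data) == expected


def forge_prefix(prefix: str, stem: bytes = b"/* filler */ "):
    """The hard direction, cut down to a toy: append a counter to the stem
    until the digest merely STARTS with the given hex prefix."""
    for tries in count(1):
        candidate = stem + str(tries).encode()
        if checksum(candidate).startswith(prefix):
            return candidate, tries


source = b"int main(void) { return 0; }\n"
target = checksum(source)

print(verify(source, target))  # checking the real file is immediate

# Match only 3 of the 64 hex digits; each extra digit multiplies
# the expected number of attempts by 16.
forged, tries = forge_prefix(target[:3])
print(tries, verify(forged, target))
```

The forged input matches the short prefix but still fails the full check, and nothing about it resembles a program that compiles, which is the point of the analogy.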
I'm sure some of you have already seen this:
http://pdos.csail.mit.edu/scigen/
This has been around for quite a while and works great to generate CS papers.