Computers Paraphrase English
AhaIndia submits a link to a story discussing computerized paraphrasing of English news articles. This technology, destined to eventually replace most reporters with very small shell scripts, is thankfully still in its infancy.
This shirt?
Game Overdrive - Gaming News
So one day instead of complaining against michael and co., everyone will be moaning about someone else's code - seems more appropriate for a nerd site somehow ;)
Google News already uses a similar technique to decide what to put in the summary beneath the headline; it does not paraphrase, but it does actually extract a summary.
Also, if you have Microsoft Word lying about, there is a feature called AutoSummarize which is surprisingly good, almost as effective as going through a document yourself looking for the main points.
There is no god
#!/bin/sh
curl "$1" | paraphrase | slant -patriotic -stupid > fox_news_story.txt
Omnes stulti sunt.
Yes, but until it can post duplicate articles with slightly different phrases, it will never replace CowboyNeal!
Is this a new step for language processing or automation? I believe that the "paraphrasing" will leave something to be desired. We will see if we get articles that say stuff like "GGGGunit buys microsoft"
won't you need someone to write the stuff to be paraphrased in the first place?? explain to me how that replaces reporters with small shell scripts.
So, will the /. version include a short-term memory, a broken spelling checker and witty random comments from a fortune database? And how long before it would be able to pass the Turing test, passing as a human /. editor?
(going for humor, but if you see flamebait, go for it.)
All someone has to do now is marry this technology with a term-paper database, and "Hello Original Work!"
The question will then become, how many different unique "paraphrases" can the system ultimately generate?
------ The best brain training is now totally free : )
What about the Despair calendar motivation: if a pretty picture and a cute saying are all it takes to motivate you, you probably have an easy job the kind robots will be doing soon. Is it really any surprise? After all, that t shirt saying was probably invented by someone who reads /.
read my blog
musings on politics and technol
This reminds me of Microsoft's AutoSummarize feature, which has been around for quite some time (I seem to remember using it in high school with Word 6)... That feature basically tried to keep the first, last, and topic sentence of each paragraph, more or less, and you could give it a percentage value to shrink the text down to (i.e. 50% to excerpt half the original text).
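For the curious, here is a minimal sketch of percentage-based extractive summarization. It is not Word's actual algorithm (which I don't know); it swaps in a plain word-frequency score as a stand-in for "topic sentence" detection, and the function name `summarize` and the length-based stop-word filter are my own assumptions.

```python
import re
from collections import Counter

def summarize(text, percent=50):
    """Keep roughly `percent` of the sentences, chosen by word-frequency score."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return ""
    words = re.findall(r"[a-z']+", text.lower())
    # Crude stop-word filter (an assumption): ignore words of 3 letters or fewer.
    freq = Counter(w for w in words if len(w) > 3)

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    keep = max(1, round(len(sentences) * percent / 100))
    # Rank sentence indices by score, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:keep])
    return " ".join(sentences[i] for i in chosen)
```

Calling `summarize(article_text, 50)` keeps about half the sentences, in their original order. Like the real feature, it only extracts; it never rewords anything.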
it's already being betatested on slashdot, one of the main bugs is that the scripts post the same news multiple times, just paraphrased slightly differently. Did you really believe humans are able to make such an amount of doubleposts? please. be realistic...
Just like the T-shirt says
Unfortunately, there isn't yet a way to use computers to detect dupes.
Or Is there?!?
Karma: Chevy Kavalierma.
So, will there be difference between paraphrasing and copying now in an educational setting? Seems like this could make a report pretty easy...
1) Brainstorm some key points/ideas
2) Have this program data mine for relevant articles online
3) Feed sections of each article into the program and have a finished paper
Granted, the tech isn't quite that powerful yet and probably wouldn't do a whole paper, but it sure looks like it could supply several paragraphs of material per page...
Lojban is among the more interesting newer languages. It can be parsed just like C! Esperanto is somewhat interesting. English will be regarded in the future as a curious artifact--it was swept along with the technology revolution simply because ASCII didn't include accents and extra marks on letters. Eventually we'll get away from vocalization altogether and have purely numerical, written languages.
Right now, trying to work with English in computers deals way more with the strangeness of the language than the more interesting issues of cognition that lie underneath.
-Libertarian secular transhumanist
~~~
Isn't this the way those trashy love novels are written?
Someone set up us the bomb!
Eat at Joe's.
...most reporters with very small shell scripts...
I know I heard this phrase (loosely) before, but does someone know the name of the reference?
Well, after all, they (/. and thinkgeek) are owned by the same group::
"A month or so later we were Slashdotted. And promptly thereafter ThinkGeek was acquired by the good folks at Andover.Net who have since been acquired by the great folks at VA Software. Andover.Net then became OSDN which is the central entry point for the Open Source community's favorite web sites such as ThinkGeek (hey that's us!), slashdot.org, linux.com, sourceforge.net, and freshmeat.net. Pretty nice company to be amongst, eh? We're pretty proud of it!"
- no sig.
I have a lazy friend who, when he needs to do some paperwork for school, just copies some text from the internet and runs the auto-summarize feature in M$ Word on it, so his text comes out different from that of other people who occasionally copy their homework from the same site.
One thing this technology is able to do is improve homework cheating.
How is this going to replace reporters? Reporters don't just paraphrase other reports. They actually are supposed to search for stories (hopefully factual!) on their own.
Back in the late 1980s I had a word processor for my Amiga that had a function whereby it would do a global search and replace of every Xth word (user-settable) with a synonym from the built-in thesaurus... Very handy for those term papers I so hated in high school...
I'm assuming this (Of course I didn't RTFA) is far more advanced than what we had back then, but the idea for this has been around for quite a while at least...
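The every-Xth-word trick is easy to sketch. This is only an illustration of the idea, not that word processor's code: the tiny `THESAURUS` dict and the function name `swap_every_nth` are made up for the example, and punctuation and capitalization are ignored.

```python
# Toy thesaurus: a hypothetical stand-in for the word processor's built-in one.
THESAURUS = {"big": "large", "quick": "fast", "said": "stated", "show": "demonstrate"}

def swap_every_nth(text, n, thesaurus=THESAURUS):
    """Replace every n-th word with a synonym when the thesaurus has one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    out = []
    for i, word in enumerate(text.split(), start=1):
        if i % n == 0:
            # Fall back to the original word when no synonym is known.
            out.append(thesaurus.get(word.lower(), word))
        else:
            out.append(word)
    return " ".join(out)

print(swap_every_nth("the big dog said a quick hello", 2))
# -> the large dog stated a fast hello
```

Note that it swaps words blindly, with no idea of sense or part of speech, which is presumably the gap the newer paraphrasing work is trying to close.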
Never ask a geek why, just nod your head and slowly back away. -Rob Malda
AhaIndia submits story discussing paraphrasing of articles. This technology, destined to replace reporters shell, is still in its infancy. Huh, perhaps we'll still need humans after all . . .
As inventors realized over the course of the 20th century that human capital could be replaced by factories and assembly lines, so will computational linguists make it clear over the next one that human language isn't just a biological phenom (that's what current theory proposes) but also a mechanism that is studyable and reproducible.
It sounds like comp ling stands to be one of the next decade's hot-shit career options (in addition to intellectual property lawyer.) Now if there could only be more than, like, five or six linguistics departments who offered specializations in it, I could have a better selection of where to send my grad school apps! Who the hell wants to live in Tucson?
(Just kidding, selection committee! Wow I love mariachi music, i'd love to come live in your city!)
Intelligent Design: because MATH is HARD.
Such modding is clearly abuse, since mod points are to simply judge the content of the post. If you don't like the poster, or his sig, then ignore it. But it seems there are some vengeful and jealous moderators out there who can't stand to see someone else get modded up for insightful opinions.
This clearly needs to stop.
For you to say that this technology will someday replace reporters makes me think that you're clueless about what reporters do. Do you realize that the biggest parts of a reporter's job are gathering facts and making judgments about 1) which stories are worth reporting, 2) which are the relevant facts about a story and 3) who's lying and who's telling the truth about a story? The actual writing that you see is many times almost incidental to most of what a reporter does. You might not like the judgments that a reporter makes (and I could agree with that in many cases), but software can't go out into the world and talk to people and use judgment and intuition to find information to write about.
As an ex-reporter and editor, I find it laughable that anyone might think this technology will replace reporters. It's sort of like suggesting that machines that can read source code and interpret it can somehow figure out what new software people want and then write it. Both possibilities are equally insane.
HeySubcontinent's story linkage analyzes the automatic stegoplagarization of documents written in the language derived from Britain. Expected to displace at some point journalists, these hacks presently bash with the force of a small child. Good.
conduct interviews and generate original copy. These people are called reporters.
The people who take this copy off the wire and paraphrase it for publication in the local paper are called copy writers.
This software will reduce the number of copy writers needed, not reporters.
This is certainly an issue to the copy writers and their families, but overall it's really just a blue collar worker being replaced by a robot issue.
The idea of a 'style dial' I find a bit more disturbing.
KFG
The poster incorrectly assumes that this could be used to replace reporters. The problem is that computers have a difficult time generating new text. The methods that computers use to evaluate text (as any user of grammar-check would realize) aren't that great.
In fact, most language models cannot generate even a large portion of English text. Those that do have a good range rarely have good accuracy, because there are many things that we "just don't say that way." This is why when you're talking to a non-native speaker, you often cannot explain why something they said was wrong. This is because there is no real grammar rule against speaking in a given way.
So if we rule out syntax-based models, that just leaves statistical models. I worked in an NLP lab during the summer of 2002, and my prof there said that syntax and statistics are like the two sides of the force. Statistics are quick and easy but are seductive. They corrupt you and leave you unable to really think about the language itself. You only think in terms of bigrams and HMMs.
So even though these systems are doing well, they are mostly statistical. Thus, it's hard to get incremental improvement. You have to have larger corpora, and larger corpora usually have more errors, thus defeating any advantage you might get by capturing more aspects of a language.
In my opinion, only with well-developed language models that can effectively generate NL can we get anywhere. Which is what Barzilay is working on, but it's still a long, long, long way off.
----------
I am an expert in electricity. My father held the chair of applied electricity at the state prison.
Re journalistic integrity - There's the possibility that a single entity could issue the release to the wire services, and they could release it in some kind of 'compiled' form (where it's just the syntax/semantic relations). (How this could be different from how releases are issued now is a good question, but I guess there'd have to be reporters on hand to inquire about details... so maybe journalism might be saved after all... but not if templates for information were used, and the templates themselves needed to fill in the missing gaps...)
You could imagine how each news outlet could receive the release, and use their own reconstructive code to flesh out the [NP][VP][NP] ("who did what to who"* scenario) and then write their own story from that.
Editing scripts could decide what in the story would be details that would shine damaging light on that paper's politics, and then stuff those details in the 37th paragraph that no one reads, write a potentially-misleading headline that would allow for a reading that would tell its readers the exact slant they want to give the story, and DONE - they've printed the ostensible truth, but since few people are going to read the article, they've done their job and done it well.
"Wait a minute, isn't that what happens now anyway?" Maybe, but now papers can save that much more on spin-sters' salaries. And then there'd be yet more English majors who can't find a job. Go capitalism, yay. *shudder*
*it's who. not whom. No one has said whom in English for a century or so, and then only because they 'think' it's correct. Anytime I hear someone saying it for real, I shudder to think that they're so neurotic about their grammar that they use something they've been told is right but have never really heard themselves. None of my linguistics profs ever used "whom", EVER. I think they privately hate the word.
P.S. This entire post have been wrote by a really good scripts.
Intelligent Design: because MATH is HARD.
This article posted before already tells us all this, the paper that originated it was mentioned in the comments, and this one is another of a series of papers by this researcher.
OK, nothing else to see here, move on to the next redundant post (Is that paraphrasing 'dupe'?)
...that explains cowboyneal
"The problem with socialism is eventually you run out of other people's money" - Thatcher.
We can do the same with extra long /. posts.
I'd be the richest man to ever appear on "Oprah".
Intelligent Design: because MATH is HARD.
...if[false] then kill -9 fi our new interpreted command-line overlords.
Well parsing of course becomes your grammar checking. Spell checking is more lexical analysis--but for numbers that's just seeing what's in range.
-Libertarian secular transhumanist
Beat reporters, on the other hand, might be helped by police reports, witness accounts, etc., but not replaced. That's android territory and, we hope, is for the 22nd century, not the 21st.
Intelligent Design: because MATH is HARD.
At least until this starts to work:
cat /dev/random > NewNewsStory.txt
For your convenience, here's the link to the original article that requires registration.
To build and maintain these shell scripts.
Trolls will rush to this technology, exploiting it for endless ways to phrase anal sex and slashdot bashing.
"Academicians are more likely to share each other's toothbrush than each other's nomenclature."
Cohen
December 14, Z.Y. Yao, a Computer Science freshman from Fudan University in Shanghai, posted an article "Introduction to BabelCode Project" to his newly set-up project website www.babelcode.org, releasing the theoretical principles and technical specifications for his human-aided machine translation approach. The approach actually enables the human writer to directly author machine-translatable content and guarantees such content be converted to correct, natural and multilingual translation versions, automatically.
See the English version of the Introduction at
http://www.babelcode.org/doc/intro.htm
or PDF:
http://www.babelcode.org/doc/intro.pdf
Considered the pioneer of its kind, the project has been receiving both praise and doubts in China's IT media. Yao is currently developing three demo programs (an input module for English, an output module for Chinese and another output module for German) with his friend Zheng Shao from UIUC. They are calling for volunteer participation in this project.
Of course the time will come when machines summarize articles, and I believe I have seen where this has already been tried with mixed success. It would be kind of neat to see /. use both a summary engine and a paraphrase engine on submitted articles. Then we could have 3 article descriptions: the poster's description; a machine summary of the same article; and a machine paraphrase of the original poster's summary.
Letter To Iran
Would be nice to be able to summarize + paraphrase large articles and documents. Not all of us have the necessary time to read 20+ page documents.
:)
It won't replace original works, but it could help reduce a lot of extraneous data on the web
- Dan
YES
NO
WHAT?
PLEASE COME BACK LATER
FUCK YOU, ASSHOLE
JUST YOU WAIT
Just you wait.
to wit, there are attributes of register, tone, and modality that can be applied not just to individual sentences, but to entire pieces of text that may be able to indicate a piece's slant, political tone, reading level, and (ahem) ability to incite readers to flame.
Some of the decision making processes you're talking about that go on during editing and truth judgments admittedly will probably not be computerized. But some of them can.
The point of the responses here is not to relegate journalism or wordsmithy (as it were) to the level of manual labor, as manual labor has been replaced by machines. But the truth is that machines are more complex now and they're ready to take on more complex tasks. Some things about language are very much NOT a mystery. Code isn't either.
Intelligent Design: because MATH is HARD.
are you a Negro? I get the distinct impression that you're an angry white Liberal Negro. Thoughts? Conjecture? You know, I always laugh about silly shit like this because I've read almost identical quotes during the Reagan administration, the [Theodore] Roosevelt and Wilson administrations. Even Taft and McKinley.
Software that can paraphrase statements in online articles is in preliminary stages of development, for applications not limited to translation.
Obviously this is a developing field. The best models seem to use phrases from the original text. Anyway, the Mac OS X example above shows that it is useful to users willing to take it with a massive grain of salt, even if we are not into full computational sentience yet.
When it works even a little better it will replace all those awful grade school teachers who assign paraphrasing as a homework assignment. The reporters who might have been replaced by it will have already lost their jobs, except for the ones in AhaIndia of course who will paraphrase for the rest of us, usually at a marginally better level than the machine.
The research is interesting, and I'd like to understand Barzilay's notation in the paper (pdf) I found on Google: is that APL or some calculus of statements? Also see the papers on her site.
Of course structured text is easier, and news stories are known to have most of the meat in the beginning, but this is great stuff.
One interesting older system is ThoughtTreasure, which was built to understand a story and answer questions about it. The author also did work on news analysis ("NewsForms"). There are tools out there, and I've been making a survey myself. If anyone has information about practical NLP tools for real-world tasks, please post.
I know you probably mean "newsreaders", the helmet-hair-headed idiots that are found in most newscasts. These are not reporters.
"Who are in control, they are not in control of anything - they don't even control themselves!" - Glen Beck
The main problem is that languages, especially English, are so idiomatic that mechanical translators will be at too much of a disadvantage - take the Babelfish translator for instance.
Furthermore, the English language is so flexible that just about any word can arbitrarily substitute for anything else - for instance, take 'bad' meaning 'good'.
It would be impossible to program a machine to be able to understand the full spectrum of idiomatic phrases but the future may lie in employing neural net technologies so that computers can do some limited learning.
cogito ergo sig...
This technology, destined to eventually replace most reporters with very small shell scripts, is thankfully still in its infancy.
Considering how much bad news there is today, it would be nice to have a completely unbiased point of view. I don't trust CNN as they seem to be driven by politics behind the cameras as well as in front, don't trust MSNBC (or whatever it's called) because you don't really know who is paying who to cover the story, and FOX is..well...FOX. Everyone has a slant; it'd be nice to have something free of this. This is why I like slashdot; even if you have a slant, your story is treated like everyone else's.
No, no, I'm in agreement. The investigation part, insurmountably a human task. The writing, though? Probably somewhat automatable.
Am I correct in assuming that many reporters would love it if their editors were replaced by robot masters, then?
Intelligent Design: because MATH is HARD.
I believe this was covered in a related Slashdot story before, regarding this site: http://www1.cs.columbia.edu/nlp/newsblaster/
Here is a quote from their site:
Columbia Newsblaster is a system to automatically track the day's news. There are no human editors involved -- everything you see on the main page is generated automatically, drawing on the sources listed on the left side of the screen.
Every night, the system crawls a series of Web sites, downloads articles, groups them together into "clusters" about the same topic, and summarizes each cluster. The end result is a Web page that gives you a sense of what the major stories of the day are, so you don't have to visit the pages of dozens of publications.
Newsblaster is an academic project from the Natural Language Processing group at Columbia University's Department of Computer Science. It is designed to demonstrate the Group's technologies for multidocument summarization, clustering, and text categorization, among others. It is funded under DARPA TIDES and KDD and has been operational online since September 2001.
Current and future enhancements include international perspectives, multilingual capability, and tracking events across days.
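To make the "groups them together into clusters" step concrete, here is a rough sketch of one way to cluster headlines by word overlap. This is not Newsblaster's method (the quote doesn't say how they do it); it is a greedy single-pass clustering on Jaccard similarity, and the names `jaccard`, `cluster` and the 0.3 threshold are assumptions picked for illustration.

```python
def jaccard(a, b):
    """Word-set overlap between two strings, from 0.0 to 1.0."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

def cluster(headlines, threshold=0.3):
    """Greedy clustering: join the first cluster whose seed headline is similar enough."""
    clusters = []
    for h in headlines:
        for c in clusters:
            # Compare against the cluster's first member only (a simplification).
            if jaccard(h, c[0]) >= threshold:
                c.append(h)
                break
        else:
            # No existing cluster matched, so start a new one.
            clusters.append([h])
    return clusters

stories = [
    "Beagle 2 lander goes silent",
    "Beagle 2 lander still silent",
    "Euro rises against dollar",
]
print(cluster(stories))
```

The same overlap score would also be a cheap first pass at spotting dupes, for what it's worth.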
#!/bin/sh
curl "$1" | paraphrase | slant -liberal -mindless > bbc_news_story.txt
-- You see, there would be these conclusions that you could jump to
Silliness aside about an Apple 2 being able to gather the news for us and feed it, the thing about wordsmithery is that there is a certain amount of creativity that needs to go into it. Otherwise you have the literary equivalent of the Backstreet Boys and such. Not a good mix.
This sig no verb.
From the article:
The researchers, Regina Barzilay, an assistant professor in the department of electrical engineering and computer science at the Massachusetts Institute of Technology, and Lillian Lee, an associate professor of computer science at Cornell University, said that while the program would not yield paraphrases as zany as those in the Monty Python sketch, it is fairly adept at rewording the flat cadences of news service prose.
Two women came up with this! Why doesn't it surprise me in the least that women are officially researching ways to automate the process of saying the exact same thing in an infinite number of different ways?
LK
"Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
It was apparently a reference to a t-shirt on thinkgeek (see the fp). I try to give the editors the benefit of the doubt before railing on them.
now maybe we can get some real news that covers all sides...oh wait...there will still be Editors..DAMN
I am the Alpha and the Omega-3
Now, correct me if I am wrong, but hasn't Hollywood been using this system for some time now? If a movie isn't a direct rip of something that was made in the past, then it takes familiar characters and tosses them in a blender with a dash of CG effects and frappes until smooth.
Television uses this system, too. The formula there seems to also involve borrowing a successful British TV show's concept, just to keep things a little fresher.
You're confusing the issue. No programming language is ambiguous. The English language is ambiguous. Using "integer" instead of "int" isn't what I'm talking about.
-Libertarian secular transhumanist
sure.
ok...i have never cheated. ever.
i'm now on academic probation and fear that this may be my last semester of university. sure, i could be smarter (my iq is only around 120), and i could go on provigil, ritalin or speed to sleep even less...but i've really already been studying 12-16 hours every day...and the 65 average is just barely within my reach.
why am i complaining here? because the program i am in is an _easy_ one... in my university you need an 85% average to just stay in the electrical engineering program... i could never do that and this university is actually one of the most lenient in the country...
in other words...when the difference is between a 4.9 gpa and having my marks...i'd suggest for anyone else to cheat, if they think that they can pull it off.
its evolution baby. survival of the fittest. and i can tell you right now that i am not the fittest...and if you were to cheat, and me not...this gives you that much more of an edge that you could survive that much longer on... get good marks, succeed, at any cost short of not learning anything(after all, knowledge is power)
GENERATION 26: The first time you see this, copy it into your sig on any forum and add 1 to the generation.
This reminds me of when I used to work for Marketwatch.com. I was bored one day, and wrote a program called ThomCalandra.pl. It used Markov word chains to generate new news stories based on Thom Calandra's previous articles and new news stories coming in. Just glancing over the story, it looked legit, but then it would say things like "Deutsche Bank filed for bankruptcy" and things like that which were totally false. It was entertaining though, especially when you fed it a mixture of articles by Thom Calandra, and Smoove B or Herbert Kornfeld from the Onion. Then you'd get things like:
"Baby, you are so beautiful, I'm going to make sweet love to you all night long, and the Euro is increasing in value against the dollar."
or
"The muthafucking DJIA has dropped another 10%, and those bitches from accountz recievable are responsible."
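For anyone who hasn't played with one, a Markov word chain fits in a few lines. This is a generic sketch in Python, not a reconstruction of ThomCalandra.pl (which isn't shown here); `build_chain`, `generate` and their parameters are names I made up for the example.

```python
import random
from collections import defaultdict

def build_chain(text, order=1):
    """Map each run of `order` words to the list of words seen right after it."""
    chain = defaultdict(list)
    words = text.split()
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return chain

def generate(chain, length=20, seed=None):
    """Walk the chain from a random starting state, up to `length` words."""
    rng = random.Random(seed)
    if not chain:
        return ""
    state = rng.choice(sorted(chain))  # sorted so a fixed seed is reproducible
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:
            break  # dead end: this state was only seen at the end of the corpus
        out.append(rng.choice(followers))
        state = tuple(out[-len(state):])
    return " ".join(out)

corpus = "the euro rose and the dollar fell and the euro fell"
print(generate(build_chain(corpus), length=12, seed=1))
```

Every adjacent word pair in the output occurred somewhere in the input, which is why the result looks legit at a glance while the facts come out scrambled. Feed it two very different authors and the chain hops between them wherever they share a word.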
Need Free Juniper/NetScreen Support? JuniperForum
Here's the US news agency formula :
1) Have reporter attend whitehouse briefing
2) Take notes
3) Regurgitate content to news anchor in live shot
4) NEVER ask any questions that would challenge the message of the day
Automated Smokey The Bear: "Only who can prevent forest fires?"
Bart: *presses the "You" button*
Automated Smokey The Bear: "You have pressed 'You,' referring to me. That is incorrect. The correct answer is 'You.'"
Couldn't come sooner! I'm tired of those damn "don't I look wonderfull with my new outfit today" no-brain news presenters. BBC News 24 interviewing an astronomer ref Beagle 2 introduced him as an ASTROLOGER! For F***s Sakes!!! There we have it folks. Theres been a failure to communicate 'cause Beagle 2's birthsign coincides with Rus*el Gr*nts bad dose of piles! Bring on the automatic presenters and lets dispense with those vain overpaid faggots. " Theres been a quake in Irqustanoble, er does my bum look big in this?" Typical reports contain 1% substance 3% hype chasing spin and 96% filler - hmm what OS does that remind me of?
My hyperlinks aren't worth the paper they're printed on.
Slashdot has been using this system to generate its articles for a while now. Obviously it's still loaded with bugs.
You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
Go away before I replace you with a very small shell script!
http://www.thinkgeek.com/tshirts/frustrations/374d/
You can't make anything foolproof; they'll only invent better fools.
Shouldn't that read "very small shill scripts"?
Interlingua, being tightly defined
By "Interlingua" do you actually mean the Interlingua of IALA, or do you refer to "Lojban", a more precise interlanguage defined specifically for machine comprehension?
This is news? We're supposed to get worked up about what two ding-bat women do, once they're finished playing leap-frog in graduate school? Where's the practical application? Do these idiots even have to provide such a justification before they get REAMS of funds to do their silly things - funds that often come out of TAXPAYERS' pockets?
Who puts these stories on Slashdot anyhow? Are Malda and Hemos on holidays?
Term paper obfuscation! Technology at its finest.
"The greatest tragedy in mankind's entire history may be the hijacking of morality by religion." - Arthur C. Clarke
...when we can replace upper level management with small shell scripts.
Autonomy has been offering similar products for 4 or 5 years now and, IIRC, they have a number of Fortune 500 companies among their customers. Their work is based on Bayesian algorithms. They used to have a free desktop app incorporating some of this technology, but withdrew that a few years ago. I'd suspect they probably have patents out the wazoo on this stuff, which might come back to bite others using Bayesian techniques.
http://autonomy.com/Content/Press/FactSheet
"Autonomy Corporation was founded in 1996 by Dr. Michael Lynch using a proprietary pattern matching technology that was the result of Cambridge University research on the probability theorems of an 18th century mathematician, Reverend Thomas Bayes."
Hey, I went and played w/ this feature of Word. Here is the summary of the article. Hmmm... maybe if we set up an auto-summary, more people would RTFS?
Anyway, here it is:
Now, computers can play along
Computers can't do nearly that well at paraphrasing. Now, using several methods, including statistical techniques borrowed from gene analysis, two researchers have created a program that can automatically generate paraphrases of English sentences.
The program gathers text from online news services on specific subjects, learns the characteristic patterns of sentences in these groupings and then uses those patterns to create new sentences that give equivalent information in different words.
Then the computer sought clusters of sentences that had similar words or phrases.
Testing for statistical evidence that expressions were paraphrases, the system compiled templates or patterns that formed the backbones of equivalent sentences. Barzilay said the system did well at paraphrasing short articles but bogged down as the articles grew longer and the text more idiosyncratic. When the researchers tested their paraphrasing rules by using articles on violence in the Middle East published after they had developed their system, the program was able to paraphrase 61 percent of the sentences in articles with 10 or fewer sentences. Fernando Pereira, chairman of the computer and information science department at the University of Pennsylvania, said the authors were wise to focus on news articles. This type of information seeds a database of paraphrases that can then be used to generate new sentences. The paraphrasing program might one day be useful in machine translation, said Kevin Knight, a senior research scientist at the Information Sciences Institute of the University of Southern California. Pereira said that the paraphrasing work had given him pause.
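The "sought clusters of sentences that had similar words or phrases" step can be roughed out in a few lines of Python. Big caveat: the article says the researchers used statistical techniques borrowed from gene analysis, and this sketch swaps in plain word overlap (Jaccard similarity) with a greedy single pass, so it only shows the general idea of grouping near-identical sentences. The function names, the 0.5 threshold, and the sample sentences are all assumptions for illustration.

```python
def jaccard(a, b):
    """Share of distinct words that two sentences have in common (0.0 to 1.0)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def cluster_sentences(sentences, threshold=0.5):
    """Greedy pass: join the first cluster whose first sentence is similar enough."""
    clusters = []
    for sentence in sentences:
        for cluster in clusters:
            if jaccard(sentence, cluster[0]) >= threshold:
                cluster.append(sentence)
                break
        else:
            # No existing cluster matched, so this sentence starts a new one.
            clusters.append([sentence])
    return clusters


# Hypothetical input sentences.
sentences = [
    "a bomb killed five people in the city",
    "a bomb killed nine people in the town",
    "the euro rose against the dollar",
]
for group in cluster_sentences(sentences):
    print(group)
```

Within a cluster, the words that stay fixed ("a bomb killed ... people in the ...") are the sort of backbone the summary calls a template, and the slots that vary are where the alternatives go.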
just let the vast majority that bias their articles in one way or another be offshored to India also.
#!/bin/sh
curl "$1" | paraphrase | mispel -slashdot | slant -page-hits -linux | troll --max-troll=50 > slashdot_story.txt
Fellowship 9/11
Boy, I'm glad that computers don't have their (hands?) in reporting news; it'd be terrible to get rid of all that slant in the media this way and that. I mean, who wants fair, equitable stories?! You read the NYT to rah-rah for the Bleeding Heart shit, or if you're a heartless Republican the Journal is for you. Now how would they sell if they just told the facts as they were and left interpretation up to the readers?!
Well, at least Slashdot will always be biased, thank god for that.
I haven't posted in so long, my sig is out of date.
As far as I can see, this system is best at rewording things, and the best place to use that ability would be in writing essays.
;)
Want to plagiarise something without getting caught? Feed it through the handy-dandy-rewordatron and you're away.
Hell you could feed an entire essay through the damn thing.
It's a scary thought, but one that I as a lazy student quite like the idea of
Or maybe we should all just settle for Al-Jezeera or those French faggots that write "cannon fodder" on their gay-wad photographer vests.
you conservative ultra-rights make me puke.
I'll bet it's not as cute as Linda Vester, though.
``Tension, apprehension & dissension have begun!'' - Duffy Wyg&, in Alfred Bester's _The Demolished Man_
This is Kent Brockman. You may remember I used to report live on how the little guy lost his job to technology, and how I appeared to care.
Well, today it's my turn to lose my job to small scripts.
This reporter DOES NOT welcome our new script overlords.
[and now back to weather with wendy]
The OS X feature appears to operate not so much by restating in different words, but by identifying redundancies and eliminating them. It works very well.
It appears to me that it's editors that are in jeopardy, not reporters.
The type of the first argument will tell you exactly what the "+" will do. The English problem is best represented by "right". Get in the right lane. There is no way to know whether to get in the lane opposite the left, or choose the correct lane.
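To make the "+" point concrete, here is a small sketch, assuming Python as the stand-in language (the post doesn't name one). The wrapper `plus` is only there for illustration. Python consults the left operand's type first, so the same symbol does three different but fully determined things:

```python
def plus(left, right):
    """Apply `+`; which operation runs is decided by the operand types."""
    return left + right


print(plus(1, 2))      # int + int: arithmetic, prints 3
print(plus("1", "2"))  # str + str: concatenation, prints 12
print(plus([1], [2]))  # list + list: concatenation, prints [1, 2]
```

Strictly speaking Python will fall back to the right operand's `__radd__` if the left type declines, but either way the outcome is fixed by the types, which is the contrast with "the right lane".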
-Libertarian secular transhumanist
Wrong.
People learn those languages that make the biggest amount of valuable information available to them.
English is my third language by acquisition while clearly the first one in terms of importance now.
And i find it to be the weirdest of the three.
For example, the spelling system is just silly. For example, why are there five ways to write "k" (click, kick, suck, schedule, iraq)?
And while we are at it, why not use "f.e." instead of "e.g."?
And why does such an irrelevant six-billionth of the whole deserve to be honored with a capital letter ("I")?
I kould go on but i'm a bit busy katching up with my skedule. And BTW i like the spelling reform that KDE krowd is doing ;-)
Rest assured that the above sentence reads much weirder to me than the "proper" English spelling, but that's because i've got used to it over the years of usage. I remember my first English (BTW isn't it about time to relabel it "Earthish"?) classes... i was outraged when introduced to the weirdness of that spelling nonsense, and it took years to cool down and get used to it.
In case you are wondering, my languages are (in the order i learned them): 1. Lithuanian (the so-called "native" one (i'd just label it "local")), 2. Russian (i used to be in the Soviet Empire's claws), 3. English (computers, internet, huge amount of books being published, etc... plenty of reasons, but certainly not the quality of the language).
And the bottom line is: I'm ready to "waste" yet another huge amount of time learning one more language, an artificial one this time, if i see the long-term benefits ("long term" stands for centuries and millennia, not just years) of introducing such a language globally. But for now i'd just like to see the whole EU switching to English as a primary means of communication (i find talk about an "EU-wide information society" to be silly without the idea of a common communications protocol (language) being promoted as well. f.e. - how do you imagine a US-like dynamic EU-wide flow of (skilled) workforce if there is no common language being used in the workplace?)
Some silly redundancy? Yes. But the amount of it in English is just crazy in comparison to the other two languages i know (lt and ru).
I think it's bad when history stands in the way of logic and convenience. I just wanted to point out that wasting time learning some extra quirk of a language is a negative rather than a positive thing. If i want to abbreviate "for example" i'm inclined to write it as "f.e." and that's it. One more example of such an annoyance is a.m./p.m.
Just for the record: neither lt nor ru capitalizes "I".
Anyway this "I" and "e.g." thing is pretty irrelevant in contrast to the spelling madness.
"there is no morality,"
or perhaps there are other things in the world than a human idea by that very name...and that it may not be the most important or at all important. i don't think i even have to go that far, however. wouldn't a sheer "i will not impose my [elitist] morality on you, feel free to cheat if you think you can get away with it" work?
"free reign to do what we must to get ahead?"
i think the idea of freedom is a hack and fraud, but this may be beside the point.
"Or perhaps you don't feel that cheating harms anyone"
lots of things carry risk. this one specifically carries risks to the person involved, to the community (as the cheater may not know needed information), and to pseudoinnocent bystanders (students like me)... but i expect a risk:productivity assessment from everyone, or at least encourage it.
"said. Cheating, of the teen-age sort we talk about most frequently, doesn't get you anywhere. " i think 0.5% made the difference between $20,000 worth of scholarships and nothing in scholarships for a student 2 years ago... (2 students had high-90s averages. one hit the jackpot multiple times, the other... is treated like a normal everyday student) i actually liked him much better, he was less involved in football than in math and computers... but the both of them were doing so many things on so many fronts, getting extremely high marks, being way over-involved, volunteering way too much, etc... it's too bad there had to be a loser in that case. i at least hope the second guy got into university (with his marks i can see it).
(isn't 100 average?) yes
while money isn't all it's cracked up to be, starvation/poverty sucks.
"Anywho, good luck in school" thanks. semester starts on the 5th... i have a lot of ground to cover, although i'm starting to get that neo-at-the-end-of-the-first-matrix-flick feeling dealing with this pre-calc math that i've kept myself busy with.
GENERATION 26: The first time you see this, copy it into your sig on any forum and add 1 to the generation.