Researchers Develop an Internet Truth Machine
Hugh Pickens writes "Will Oremus writes that when something momentous is unfolding (the Arab Spring, Hurricane Sandy, Friday's horrific elementary school shooting in Connecticut), Twitter is the world's fastest, most comprehensive, and least reliable source of breaking news, and in ongoing events like natural disasters, the results of Twitter misinformation can be potentially deadly. During Sandy, for instance, some tweets helped emergency responders figure out where to direct resources. Others provoked needless panic, such as one claiming that the Coney Island hospital was on fire, and a few were downright dangerous, such as the one claiming that people should stop using 911 because the lines were jammed. Now a research team at Yahoo has analyzed tweets from Chile's 2010 earthquake and looked at the potential of machine-learning algorithms to automatically assess the credibility of information tweeted during a disaster. A machine-learning classifier developed by the researchers uses 16 features to assess the credibility of newsworthy tweets, and the researchers identified the features that make information more credible: credible tweets tend to be longer and include URLs; credible tweeters have higher follower counts; credible tweets are negative rather than positive in tone; and credible tweets do not include question marks, exclamation marks, or first- or third-person pronouns. Researchers at India's Institute of Information Technology also found that credible tweets are less likely to contain swear words (PDF) and significantly more likely to contain frowny emoticons than smiley faces. The bottom line is that an algorithm has the potential to work much faster than a human, and as it improves, it could evolve into an invaluable 'first opinion' for flagging news items on Twitter that might not be true, writes Oremus. 'Even that wouldn't fully prevent Twitter lies from spreading or misleading people. But it might at least make their purveyors a little less comfortable and a little less smug.'"
This is really interesting research, but it's also based on one event in one country.
Conclusions based on what may be language or cultural norms (such as "did you phrase in the positive or the negative") might not translate to other locales well (e.g. Hurricane Sandy in the US).
But, then, that's what's great about science. Testable predictions we can apply to data.
How effective would this be on real media? I bet it'd put those bastards in their place! :)
A guide for how to make my bullshit disaster tweets more believable. Thanks, Yahoo!
They published a checklist of how to make your trollish tweets sound legit, and created a service to assign said tweets a high truthiness rating? Sounds helpful.
So it provides a first opinion on first posts, sort of. Neat, but I do wonder how accurate this is going to be to vet individual tweets. Twitter trolls may get wise to this and game the system to get their stuff past this filter. A bit like phishers learning how to spell. In the end, the best check is still independent verification, for example by other people tweeting the same thing (not just retweeting of course). If this system could automatically group and cross-verify tweets from multiple sources on the same subject, that would be a step in the right direction.
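To make the cross-verification idea a bit more concrete, here is a rough sketch, not anything the researchers built. It assumes tweets arrive as (author, text) pairs, treats anything starting with "RT " as a retweet, and groups the rest by word overlap (Jaccard similarity). The names `group_claims`, `tokens`, the stopword list, and the 0.5 threshold are all made up for illustration.

```python
import re

# Hypothetical stopword list; a real system would need a much better one.
STOPWORDS = {"the", "a", "an", "is", "on", "in", "at", "of", "to", "and", "that"}


def tokens(text):
    """Return the set of lowercase words, ignoring URLs, @mentions and stopwords."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return {w for w in re.findall(r"[a-z0-9']+", text) if w not in STOPWORDS}


def jaccard(a, b):
    """Word-overlap similarity between two token sets, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def is_retweet(text):
    """Crude assumption: old-style retweets start with 'RT '."""
    return text.lstrip().lower().startswith("rt ")


def group_claims(tweets, threshold=0.5):
    """Greedily group non-retweets that resemble the first tweet of a group.

    Returns a list of dicts holding the seed tokens, the distinct authors
    and the tweet texts. More distinct authors means more independent reports.
    """
    groups = []
    for author, text in tweets:
        if is_retweet(text):
            continue  # a retweet is not an independent source
        toks = tokens(text)
        for group in groups:
            if jaccard(toks, group["seed"]) >= threshold:
                group["authors"].add(author)
                group["texts"].append(text)
                break
        else:
            groups.append({"seed": toks, "authors": {author}, "texts": [text]})
    return groups


sample = [
    ("alice", "Coney Island hospital is on fire"),
    ("bob", "RT @alice: Coney Island hospital is on fire"),
    ("carol", "the Coney Island hospital is on fire right now"),
    ("dave", "Power is out across lower Manhattan"),
]

for group in group_claims(sample):
    print(len(group["authors"]), "independent source(s):", group["texts"][0])
```

Counting distinct authors only shows that several people said the same thing, not that it is true, so this would be one more signal rather than a verdict.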
If construction was anything like programming, an incorrectly fitted lock would bring down the entire building...
Couldn't some enterprising douche programmer use similar programs to write better misleading tweets?
It's interesting to note that a seismology student at a university in Chile finally had enough of the false information about earthquakes on Twitter and elsewhere, so he wired a big batch of seismographs to post their results directly via Twitter. The last I knew, they had over 1 million followers, and this particular student has been getting big thank-yous from residents of the country.
@Mindless Drivel: 100% of Twitter posts ever Tweeted.
Twitter is the world's fastest, most comprehensive, and least reliable source of breaking news
Twitter has dethroned Fox News?!?
This is the most dubious research that Slashdot has presented in the past week.
How about something that is of actual value or is at least mentally stimulating? Here's a helpful hint: it won't involve Facebook or Twitter in any way whatsoever.
1. Reality is relative, as Einstein showed us. And 2. Nearly all stuff we think we "know" is based on somebody else telling us. Somebody we *trusted*. Or who of you has confirmed the Higgs himself? Or even something as simple as whether the text you're currently reading actually comes from Slashdot?
Yes, of course there's math and all that stuff. But still: Have you verified it yourself? No. Most probably not even remotely. So no, you do not actually know if it is true. You only *trust* that it is true.
And that is why there is no absolute reality nor an absolute truth in practice. There is only your sensory input and your trust. And that's usually OK.
It becomes a problem when you start thinking it's the "absolute truth" and that theories are "facts". It's a theory. And that's okay. Because it turned out to be very useful for you, or you wouldn't use it.
So no, don't trust anything posing as "the truth" or "the facts". That's when the biggest bullshit that you ever saw hides right behind it.
let me FTFY:
"... the results of Twitter misinformation can be potentially deadly... team at Yahoo has analyzed tweets... to automatically assess the credibility of information... A machine-learning classifier developed by the researchers uses 16 features to assess the credibility... : credible tweets tend to be longer and include URLs; credible tweeters have higher follower counts; credible tweets are negative rather than positive in tone; and credible tweets do not include question marks, exclamation marks, or first- or third-person pronouns... less likely to contain swear words... significantly more likely to contain frowny emoticons than smiley faces... it could evolve into an invaluable 'first opinion' for flagging news items..."
this of course, is all based on current human behavior. now the misinformers know how the next generation information sanitization works, and they can adjust accordingly.
let me fix that further...
"dear terrorists, thieves, misinformation puppets, propagandists, and counter propagandists. please follow this list of 16 ways to type out your garbage and continue the madness. it will not only work better, but it will circumvent next generation information warfare detection technology. thanks for your cooperation. yours with love, the good guys."
One of the criteria in their algorithm seems to be that credible tweets were
They were evaluating tweets about a disaster; not a lot of smiley faces there.
The algorithm seems to have a bias toward bad news. So, if my buddy tweets that a rare Belgian beer will be available at the local liquor store, the algorithm will decide that it isn't credible because of the smiley face.
We just had the above case. Beer that you usually have to cross the Atlantic to get became available for about 30 minutes locally. Some of us lined up starting at 3:00 AM. I would have been really ticked off if some algorithm had made me miss the news.
Who stops to type emoticons in the middle of a natural disaster (including switching to the alternate keyboard to get those characters)?
There are two topics here. One is the use of potentially valuable information by, say, emergency responders (leads, evidence, etc.), where the program could be useful. The second (e.g. "don't use 911") is "a headline", i.e. it is aimed at spreading news (or troll farts) as media to the social public. These are definitely two completely separate problems to solve. The second problem is best solved by evolution, as people who get their "news" off of social media become even stupider than they were to begin with and die off.
Gently reply
I don't need a Truth Detector®. What has been stated is that information that has less filtering by statists is by definition less reliable.
Human nature is such that bad news is more believable than good news, except when the goal is to seize liberty from individuals.
It is Twitter, not Tweeter. Therefore Twits. Not Tweets. Twits.
Of course, in just the same way that spammers can game Bayesian spam filters or rule-matching pattern filters by knowing what the rules are, a known set of rules that attempts to assess the credibility of a tweet allows someone to tweak their tweets in order to be assessed as having high credibility:
:>( -- beebs
/.'s /-code and is not part of the wc wordcount :>(
1 -- max out your tweet length
2 -- include a URL [doesn't say whether to use a link shrtnr]
3 -- use a Twitter account with a high number of followers
4 -- use a negative tone
5 -- no question marks or exclamation points
6 -- use 2nd person (same as don't use 1st or 3rd person)
7 -- don't use swear words
8 -- use a sad emoticon
.
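For anyone who wants to see how mechanical that checklist is, here is a toy scorer. It is not the Yahoo classifier (which reportedly uses 16 features and is trained on data); it only checks the signals listed above that can be read straight off the text, and it skips "negative tone" because that would need a sentiment model. The pronoun list, the placeholder swear list, and the cutoffs (100 characters, 1000 followers) are assumptions picked for illustration.

```python
import re

# Illustrative word lists; both are assumptions, not taken from the research.
FIRST_THIRD_PRONOUNS = {
    "i", "me", "my", "we", "us", "our",
    "he", "she", "him", "her", "they", "them", "their",
}
SWEAR_WORDS = {"damn", "hell"}  # placeholder list


def credibility_signals(text, followers):
    """Return a dict of True/False checks mirroring the list above."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "long_text": len(text) >= 100,        # assumed cutoff
        "has_url": bool(re.search(r"https?://\S+", text)),
        "many_followers": followers >= 1000,  # assumed cutoff
        "no_question_or_exclamation": not any(c in text for c in "?!"),
        "no_first_or_third_person": not any(w in FIRST_THIRD_PRONOUNS for w in words),
        "no_swearing": not any(w in SWEAR_WORDS for w in words),
        "sad_emoticon": any(e in text for e in (":(", ":-(")),
    }


def score(text, followers):
    """Fraction of the checks that pass, from 0.0 to 1.0."""
    signals = credibility_signals(text, followers)
    return sum(signals.values()) / len(signals)


gamed = (
    "You should know: bridge closed after flooding, details at "
    "http://example.com/report and avoid the area until crews finish :("
)
honest_panic = "OMG is my house on fire?!"

print(score(gamed, followers=5000))
print(score(honest_panic, followers=10))
```

The point of the sketch is the same as the list: every check is something the writer controls, so a tweet built to pass them scores higher than a genuinely panicked one.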
Example to maximize this:
a - break into / hack a high follower account (e.g. justinbieber) and tweet: cat > finaltweet
You should know Mayan Calendar sez: world ending this week. Confirmed@ http://netcraft.calendar.mayan/ you go hug loved 1s now.
wc finaltweet
1 20 139 finaltweet
First iteration was:
gia@sodium$ cat > count2
You should know that Mayan Calendar says : world ending within week. Confirmed by http://netcraft.calendar.mayan/ , you should hug loved ones now.
gia@sodium$ wc count2
1 25 159 count2
Please note that the "[netcraft.calendar.mayan]" was inserted by
If you weren't aware that Hurricane Sandy, Irene or whatever occurred, just tune into the local television and watch the car commercials. If I see one more Maxon, Salerno Dwayne, Rutherford Ford or Honda Hurricane Sandy stimulus event, I'm going to throw up. THAT is how you know something bad has happened.
This is Skynet for 10 kilowatts I can post your tweet.
According to a previous story ( http://tech.slashdot.org/story/11/07/28/2244236/ ), URLs on Twitter are overwhelmingly posted by males. So, what we've got here is a correlation between being male and posting credible information on Twitter.
The basic problem with any such approach is that tweets are individual opinions and you cannot arrive at the truth or falsehood of objective facts by analyzing a collection of he-saids and she-saids.
The hospital is either on fire or it is not on fire, regardless of what anybody says.
.
Hey, that's pretty cool! :)
I mean, that's pretty cool! :(
(-1: Post disagrees with my already-settled worldview) is not a valid mod option.
I think for the first one you wanted to write: "Hey, that's fucking cool! :)"
And for the second one, you don't want the exclamation mark. That was also claimed to be a sign of non-credibility.
The Tao of math: The numbers you can count are not the real numbers.
Don't write "Help!" (exclamation mark) or "please help me" (first person pronoun).
You know they are really valuable when they are censored.
My tweets were censored because they had URLs, even to Twitlonger.
So I resorted to these tweets instead, each at roughly the 140-character limit (how long a tweet can be):
#taxes 1) The Declaration of independence recognized the peoples rights & duty to ... remove budgeting & accounting failed tasks from Gov't.
#taxes 2) for proper representation, given all the budgeting & accounting fails, &more, the people must direct where their taxes R 2 B used.
#taxes 3) For the people 2 voice where their taxes R 2 B used, forms R required to be created and made available by all Gov't tax collectors
#taxes 4) each taxpayers direction of where their taxes R 2 B used is with the constraint of generating teamwork benefits they can share in.
#taxes 5) for those who trust gov't, there is option of letting the government decide where their taxes, or some portion, R 2 B used.
#taxes 6) Address political/election faild promises R replaced w/taxpayer direction. Elected R hired to sum & implement taxpayer direction.
#taxes 7) For amount of taxes the taxpayers "trust" the government with, #voters not only help hire the elected but help direct these funds
#taxes 8) For people 2 know where their taxes are needed, Gov't must become transparent 2 inform the people of funding needs. People decide.
#taxes 9) Clarity, I decide on where the taxes I pay are used, you on yours, etc.. This is a republic where all voices are accounted for.
#taxes 10) We have plenty proof this tax directing change works. Open Source Software, Iceland's recovery, & many crowd sourced projects.
#taxes 11) either you trust the people 2 do the right thing, or you rig #elections 2 have some perceived unfair advantage over the people
#taxes 12) We shall NOT vote on this right & duty of the people to direct where their taxes are to be used. It has already been established
#taxes 13) The tax processor jobs are in position to allocate a taxpayer taxes according to that taxpayer's direction. And provide receipt.
#taxes 14) Should Gov't fail this job, the people can set it up through Credit unions & provide receipts/proof to tax processors of tax paid
#taxes 15) In event of going through Credit Unions, funding access will require proof of proper spending in accord with taxpayer directions.
#taxes 16) #1 priority directing taxes is 4 creation & availability of required forms giving taxpayers voice, allowing proper representation
But it's kinda hard to post a PDF such as:
Facebook deleted this so I reposted to see how long it'll take before they do it again... Hmmm wonder why as I am American.
The easy way to direct where your taxes are to be used. It's on your check, it's legal (don't use it offensively). You can use this idea to create your own. Scan a blank check to a JPEG file and use this as a background to create your text on. I used AutoCAD, but I'm sure there are many ways to accomplish this. See "enlarged" for more info.
Income Tax payment check Overlays:
http://3seas.org/Fed-Tax-check-overlay.pdf
http://3seas.org/web-federal-overlay.pdf
http://3seas.org/web-enlarged-fed.pdf (so you can read it here)
http://3seas.org/web-state-overlay.pdf
http://3seas.org/web-enlarged-state.pdf (so you can read it here)
Wanna know why the Brits stepped over the border that early morning, just far enough to determine Assange was still there?
I had suggested Assange had left the building weeks before?
The authorities were watching the comments....
it's called snopes.
If you weren't AC I would moderate you 'Woosh'.
Snopes != truth
No brain, no pain.
The second (e.g. "don't use 911") is "a headline", i.e. it is aimed at spreading news (or troll farts) as media to the social public
Yes but this worked so well in re-electing a president...
In case you've not noticed, the closer to the event, the less reliable the story.
Snopes needs to borrow this algorithm and create a subsection devoted to Twitter. It will highlight the unreliable posts and list which criteria made them fail the sniff test. Then, if there's time and resources, a human being might follow up the most significant ones and flesh out the stories.
Typical overzealous faith in technology to define "truth". And exactly whose perspective are we using for reality? Oh internet god, can you please think for me and guide me along the righteous path.
Just ignore Twitter. Works for me.
How about combining learning algorithms with community moderation?
Credible tweets are negative? How is "Coney Island hospital is on fire" or "don't use 911, the lines are maxed out" positive in tone?
This reminds me of the anecdote about a DOD learning "AI" program to identify tanks in images that worked perfectly in the lab. When they took it into the field, it didn't. They taught it by showing it pictures of landscapes with and without tanks. As it turns out, all of the tank pictures also had clouds and all of the no-tank pictures didn't have clouds. So the AI was working, doing exactly what it was taught: identifying clouds.
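The failure mode in that story is easy to reproduce with a toy learner. The sketch below is invented data, not anything from the actual program: a "learner" that picks whichever single feature best matches the label on the lab photos. The feature names `cloudy` and `has_turret_shape` and every data row are hypothetical.

```python
def accuracy_on(samples, name):
    """Fraction of samples where the feature value equals the 'tank' label."""
    return sum(s[name] == s["tank"] for s in samples) / len(samples)


def best_single_feature(samples, feature_names):
    """Pick the one feature that agrees with the label most often."""
    return max(feature_names, key=lambda name: accuracy_on(samples, name))


FEATURES = ["cloudy", "has_turret_shape"]

# Lab photos: every tank photo happens to be cloudy, every empty one is clear.
lab = [
    {"cloudy": True, "has_turret_shape": True, "tank": True},
    {"cloudy": True, "has_turret_shape": False, "tank": True},  # turret hidden
    {"cloudy": True, "has_turret_shape": True, "tank": True},
    {"cloudy": False, "has_turret_shape": False, "tank": False},
    {"cloudy": False, "has_turret_shape": False, "tank": False},
    {"cloudy": False, "has_turret_shape": False, "tank": False},
]

# Field photos: the weather no longer lines up with the label.
field = [
    {"cloudy": False, "has_turret_shape": True, "tank": True},
    {"cloudy": True, "has_turret_shape": False, "tank": False},
    {"cloudy": False, "has_turret_shape": True, "tank": True},
    {"cloudy": True, "has_turret_shape": False, "tank": False},
]

chosen = best_single_feature(lab, FEATURES)
print("learned feature:", chosen)
print("lab accuracy:", accuracy_on(lab, chosen))
print("field accuracy:", accuracy_on(field, chosen))
```

A tweet classifier trained on one earthquake in one country could pick up the same kind of accidental signal, which is why testing it on a different event matters.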
Headline should read "Researchers Develop Tool For Twitter Trolls To Improve Plausibility Of Their Tweets"
This Post IZ Fucking Credible!!!