Slashdot Mirror


Deep Neural Networks for Bot Detection (arxiv.org)

From a research paper on Arxiv: The problem of detecting bots, automated social media accounts governed by software but disguising as human users, has strong implications. For example, bots have been used to sway political elections by distorting online discourse, to manipulate the stock market, or to push anti-vaccine conspiracy theories that caused health epidemics. Most techniques proposed to date detect bots at the account level, by processing large amount of social media posts, and leveraging information from network structure, temporal dynamics, sentiment analysis, etc. In this paper [PDF], we propose a deep neural network based on contextual long short-term memory (LSTM) architecture that exploits both content and metadata to detect bots at the tweet level: contextual features are extracted from user metadata and fed as auxiliary input to LSTM deep nets processing the tweet text.

39 comments

  1. NO COLLUSION! more than 1 time in a thought.. by Anonymous Coward · · Score: 1

    That should be a pretty high ranking flag in the algorithm seed data.

  2. Wew by negRo_slim · · Score: 4, Insightful

    Putting way too much confidence in bots' ability to do any of the things listed in the summary.

    --
    On the Oregon Coast born and raised, On the beach is where I spent most of my days
    1. Re:Wew by Anonymous Coward · · Score: 0

      Bots could narrow the field for different detection confirmations by eyeballware. I guess you're right in this case, though you did tend to deny Russia did this shit generally, like all Trump morons. Weird that you'd have insights, really.

    2. Re:Wew by Anonymous Coward · · Score: 0

      How else am I supposed to know what the DNC/RNC want me to have an opinion on? After all, I can't form my own, so it's best to gaslight me into thinking I am all alone in my opinions. That way they can continue to spin up their sockpuppets to tell me what to think. What else can go wrong if we do not make sure one side is removed from the web?

    3. Re: Wew by Anonymous Coward · · Score: 0

      Use bots to detect bots!
      What could possibly go wrong?

    4. Re:Wew by Anonymous Coward · · Score: 0

      Weird that someone who thinks that Twitter bots and astroturfing and lying on the internet were invented by Russians in the year 2016 would be able to type at all.

    5. Re:Wew by Anonymous Coward · · Score: 0

      In fact it's very similar to the Bayesian spam filter you probably use. The network learns conditioned outputs based on input priors, but network design is still a bit of an art.

      In general you can learn a high dimensional statistical representation of sentences or lists of sentences. So you can predict the next word from the previous several, etc., or generate grammatical text, etc.

      The key is you do this for all of the hate speech, fake news, fake ads, fake reviews, etc., and create one statistical model, and then you do it again with real news, real ads, real etc.

      So then when the combined network sees a sample of text it can classify it as something it's seen before, or if not, guess. It comes down to training and to the robustness of the model against the grammatical quirks of English that folk might tend to employ if they were of a mind to thwart automatic recognition systems.

      GANs can then be trained to generate even more fake news.

      The trouble with automated spam detection is that once you know how it works it's easy to design spam that can pass through. Networks that capture large semantic models are perhaps not quite here yet.
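      The "one model for the fake stuff, one for the real stuff" idea above can be sketched in a few lines. This is only a unigram naive Bayes toy with equal class priors, not an LSTM and not the paper's method; the class and function names and the four training strings are all made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately crude tokenizer."""
    return text.lower().split()


class UnigramModel:
    """Word-frequency model for one class, with add-one (Laplace) smoothing."""

    def __init__(self, documents):
        self.counts = Counter()
        for doc in documents:
            self.counts.update(tokenize(doc))
        self.total = sum(self.counts.values())

    def log_likelihood(self, text, vocab_size):
        # Sum of log P(word | class); unseen words get a small non-zero mass.
        score = 0.0
        for word in tokenize(text):
            score += math.log((self.counts[word] + 1) / (self.total + vocab_size))
        return score


def classify(text, fake_model, real_model):
    """Pick the class whose model assigns the text the higher likelihood."""
    vocab = set(fake_model.counts) | set(real_model.counts)
    vocab_size = len(vocab) + 1  # one extra slot for words never seen in training
    fake_score = fake_model.log_likelihood(text, vocab_size)
    real_score = real_model.log_likelihood(text, vocab_size)
    return "fake" if fake_score > real_score else "real"


# Hypothetical toy training data, far too small for real use.
FAKE_SAMPLES = [
    "click here for free followers",
    "free prize click this link now",
]
REAL_SAMPLES = [
    "had a great lunch with my sister today",
    "the train was late again this morning",
]

fake_model = UnigramModel(FAKE_SAMPLES)
real_model = UnigramModel(REAL_SAMPLES)

print(classify("free followers click now", fake_model, real_model))
print(classify("lunch with my sister", fake_model, real_model))
```

      It also illustrates the weakness mentioned above: anyone who knows which words carry the weight can simply avoid them.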

    6. Re:Wew by sg_oneill · · Score: 2

      If you read the actual paper you'd know exactly how much confidence one can place. (Hint: it's extremely high.) 96% on a single tweet text read, up to over 99% once network, metadata and other factors are taken into account.

      6 CONCLUSIONS
      Given the prevalence of sophisticated bots on social media platforms such as Twitter, the need for improved, inexpensive bot detection methods is apparent. We proposed a novel contextual LSTM architecture allowing us to use both tweet content and metadata to detect bots at the tweet level. From a single tweet, our model can achieve an extremely high accuracy exceeding 96% AUC. We show that the additional metadata information, though a weak predictor of the nature of a Twitter account per se, when exploited by LSTM decreases the error rate by nearly 20%. In addition to this, we propose methods based on synthetic minority oversampling that yield a near perfect user-level detection accuracy (> 99% AUC).

      So how much should we distrust this? Unless that under 1% really upsets you, I'd say "Almost completely"
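      A note for anyone weighing those figures: AUC is not the same thing as plain accuracy. It is the probability that a randomly chosen bot gets a higher score than a randomly chosen human. A minimal sketch of that definition follows; the function name, labels and scores are invented for illustration and have nothing to do with the paper's data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (e.g. bot), 0 for the negative class.
    scores: classifier outputs, higher meaning "more likely positive".
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: three bots, two humans; one bot is scored below a human.
example_labels = [1, 1, 1, 0, 0]
example_scores = [0.9, 0.8, 0.3, 0.4, 0.1]
print(roc_auc(example_labels, example_scores))
```

      Here five of the six bot/human pairs are ranked correctly, so the AUC is 5/6. How many tweets end up misclassified still depends on where the decision threshold is put.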

      --
      Excuse the Unicode crap in my posts. That's an apostrophe, and slashdot is busted.
    7. Re: Wew by Sique · · Score: 1

      We will get a bot arms race, where bots are fighting bots, and we humans get left alone. So all back to normal, except for trillions of CPU cycles wasted.

      --
      .sig: Sique *sigh*
    8. Re: Wew by Anonymous Coward · · Score: 0

      The horror! Oh the humanity! Not my precious CPU cycles!

    9. Re:Wew by IamTheRealMike · · Score: 1

      We should distrust it completely, as the paper gives no examples of any of the tweets or accounts they classified as being "bots". None whatsoever. Lots and lots of stats about their model and many implausible claims of it being perfect, but nothing that could be used to actually verify their claims.

      Indeed their claims are completely implausible. Extraordinary claims require extraordinary evidence and they provide none.

    10. Re: Wew by HiThere · · Score: 1

      Well...not exactly back to normal. The faker bots will improve their ability to fool people into thinking they're real.

      Also, the intent is probably impossible for even a superhuman AI to accomplish (except by judging something like volume of posts, which ordinary recipients don't have access to). A Twitter post often doesn't contain enough information to decide whether it was posted by a human or by a bot. As the faker bots improve, they'll be able to handle longer segments of connected text, and possibly even to respond reasonably. (Eliza, Parry, Doctor, etc. show that ordinary human responses are often shallow enough to easily fake. And Eliza was *supposed* to be a counter-example.)

      --

      I think we've pushed this "anyone can grow up to be president" thing too far.
    11. Re:Wew by sg_oneill · · Score: 1

      We should distrust it completely, as the paper gives no examples of any of the tweets or accounts they classified as being "bots". None whatsoever. Lots and lots of stats about their model and many implausible claims of it being perfect, but nothing that could be used to actually verify their claims.

      Or alternatively you could have read the paper and seen that it used the Cresci/De Pietro/Petrocchi/et al. dataset, which is publicly available and has been for a while now.

      --
      Excuse the Unicode crap in my posts. That's an apostrophe, and slashdot is busted.
  3. wow by Anonymous Coward · · Score: 0

    The standard solution.
    I can't figure it out, so let's throw AI at it.

    Look, bots are easy to detect. They all have tells and signatures, just like humans. Why? Because humans wrote them. If you want to detect a bot on the first tweet, all you need to do is think like a programmer, and think: how would I send out a tweet to 1 million people without anyone knowing that this is a bot?

    In most cases you can simply look for ambiguous yet targeted language. An example would be "hey look at this cool ".
    Another example is links that are not explained in the message.
    Another example would be copying a known headline and changing the data in a predictable way. " gets with ".

    Note: if you or someone you know posts like this, then, my friend, you have a human bot at work.
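    As a rough illustration of the "links that are not explained in the message" tell, here is a minimal sketch. The function name, the URL pattern and the four-word threshold are all arbitrary assumptions for the sake of the example, not a tested detector.

```python
import re

# Simplistic URL matcher; real tweets would need something more careful.
URL_PATTERN = re.compile(r"https?://\S+")


def has_unexplained_link(text, min_words=4):
    """Return True if text contains a URL but fewer than min_words other words.

    min_words is a hypothetical threshold chosen only for illustration.
    """
    if not URL_PATTERN.search(text):
        return False
    remainder = URL_PATTERN.sub(" ", text)
    words = re.findall(r"[A-Za-z0-9']+", remainder)
    return len(words) < min_words


print(has_unexplained_link("wow https://example.com/x"))
print(has_unexplained_link(
    "Here is the paper on bot detection I mentioned: https://example.com/paper"
))
```

    The first call flags the message and the second does not. Plenty of real people post bare links too, so a rule like this produces false positives on exactly the "human bot" case above.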

    1. Re: wow by Anonymous Coward · · Score: 0

      The bots they want to catch don't even exist.
      Bots sway elections? No, those were people. None of that was written by bots.

    2. Re:wow by Sir+Lurkalot · · Score: 1

      Interesting...

    3. Re:wow by Anonymous Coward · · Score: 0

      If you want to detect a bot on the first tweet, all you need to do is think like a programmer, and think how would I send out a tweet to 1 million people, without anyone knowing that this is a bot.

      Depends. If the recipients are all deplorables, you just need to use 'yuuuuge', 'tremendous' and 'sad' a lot.

  4. Hmm by TFlan91 · · Score: 2

    How does this resolve the case of my political uncle posting extreme ideas every week or two.

    Anyone outside the family would rightfully think he's a bot. He isn't, he's just that uncle.

    The first amendment protections required for a system like this would make it far too cumbersome for practical use. Yea, Twitter is proving the opposite case with their manual interventions, but there must be a middle ground

    1. Re:Hmm by LifesABeach · · Score: 0, Troll

      Let me see if I understand this correctly. A bunch of H1B lying dumb asses created Facebook. Now some really bad dudes, who would have no second thoughts about deleting Cadet Bone Spurs and anyone else who poses a problem, show up. When is enough, enough?

    2. Re:Hmm by Anonymous Coward · · Score: 0

      How does this resolve the case of my political uncle posting extreme ideas every week or two.

      Hrm. I think I've read some of his posts on /. Might be best to just have him put down.

    3. Re:Hmm by Anonymous Coward · · Score: 0

      "Extreme ideas"?

      I like ideas. What qualifies as extreme?

    4. Re:Hmm by Anonymous Coward · · Score: 0

      "I need a gun to protect myself from a dystopian future of government agents knocking down my door"
      "Fake News! Fake News! Only Fox is real!"

      Extremist, like the kid that killed those kids in Florida. He was the embodiment of Trump supporters. Muslims and Mexicans are not our problem; White Supremacists and far-right ideology are.

      Wanting to make people's lives better is not a threat to anything but hate and extremism. So don't even bring up crap about "libtards" being the enemy.

    5. Re:Hmm by Anonymous Coward · · Score: 0

      How does this resolve the case of my political uncle posting extreme ideas every week or two.

      Anyone outside the family would rightfully think he's a bot. He isn't, he's just that uncle.

      Well, he pretty much is a bot. Except that modern talking bots probably are better at arguing for their point and actively try to avoid logical inconsistencies.

      Your uncle hasn't given the ideas much thought; he is just posting the extreme ideas he got from someone else.
      Shut down the bots and he won't be fed the ideas anymore.
      Without someone else telling him what to think he will gradually become more normal, although less politically interested.

      Of course he will likely fall victim to the next one who shows up with all the answers to life that help him get by without thinking.
      After the Russian bots are shut down, your uncle will either find faith in some church or jump on the next pyramid scheme.
      If bitcoins are still a thing by then it might be his next thing.
      Although it seems that much of the bitcoin boom comes from Russians moving their money over there after the Magnitsky Act so if we start to hunt them down after all this is over we might experience a major bitcoin crash.

    6. Re:Hmm by DNS-and-BIND · · Score: 1

      Then he gets wrongfully accused and his rantings stop. There's no great loss to civilization. It's not better to let 100 guilty men go free than to accuse a single innocent. Those are Enlightenment values - the same ones that created racism and justified slavery. They're as yesterday's news as your uncle.

      --
      Shutting down free speech with violence isn't fighting fascism. It IS fascism!
    7. Re:Hmm by cascadingstylesheet · · Score: 1

      How does this resolve the case of my political uncle posting extreme ideas every week or two.

      Anyone outside the family would rightfully think he's a bot. He isn't, he's just that uncle.

      The first amendment protections required for a system like this would make it far too cumbersome for practical use. Yea, Twitter is proving the opposite case with their manual interventions, but there must be a middle ground

      This is all about squelching unapproved opinions. Can't have people (or bots) "disparaging" Hillary Clinton, for example. We indict people for that now.

    8. Re:Hmm by Anonymous Coward · · Score: 0

      The shooter's mother was a practicing Jew (synagogue goer) and his father was a Latino. And the boy is a prescription drug user because of mental health problems.
      And you say that he is a prime example of "white supremacist"?

  5. Everything I don't like is a Russian bot by Anonymous Coward · · Score: 0

    What a great way to silence opposition. The only issue is that most people will simply go to a different website that isn't being censored. You'll have to go deeper!

  6. Makes sense. by Anonymous Coward · · Score: 0

    LSTM has been a pretty good technique for generating / predicting sequential data (generating sentences, recognizing natural language phrases...) so it would make sense that it would be used to analyze behavior (sequential posts on social media) to determine botness.

    Now all you gotta do is couple it with an adversarial generator bot that's trying to outwit it, and you have the classic training pair.

  7. End result- really good bots. by Maxo-Texas · · Score: 1

    Antagonistic neural networks improve the quality of both networks.

    The detector will get better and the fake will get better. Quickly.

    --
    She was like chocolate when she drank... semi-sweet at first and then increasingly bitter.
  8. cowboyneal exposed by SchroedingersCat · · Score: 1

    I always suspected that about CowboyNeal. Now we will know the truth.

  9. Misleading definition by CustomSolvers2 · · Score: 1

    detecting bots, automated social media accounts governed by software but disguising as human users

    The expression "bot" is used to describe a wide variety of software applications, not just those emulating people in social media. In fact, the most common bots are the ones used by a large number of sites to retrieve information from the internet for different purposes (e.g., search engines retrieving what they show to their users); they are also called crawlers or spiders. Here you can find a detailed list of active ones (I am the proud father of one of them :)).

    So, a better version of the summary would have been:

    detecting the social media bots disguised as human users

    --
    Custom Solvers 2.0 = Alvaro Carballo Garcia = varocarbas.
    1. Re:Misleading definition by Anonymous Coward · · Score: 0

      The expression "bot" is used to describe a wide variety of software applications

      Not just software applications.

      It is short for 'robot'. It is used for physical robots too, and pretty much any automated service.

    2. Re:Misleading definition by CustomSolvers2 · · Score: 1

      It is short for 'robot'. It is used for physical robots too

      Sure. I meant that, even within this specific context of software/internet, that expression is commonly used for much more than just the referred malware-like subtype.

      --
      Custom Solvers 2.0 = Alvaro Carballo Garcia = varocarbas.
  10. Take privacy serious then we talk by Anonymous Coward · · Score: 0

    Else, get fucked. Until companies are held responsible - to the degree they bow before libtards - none of this will change. That is kind of the point.

    The article is pure clickbait. Just because there's no direct financial link between stolen identities doesn't undermine the importance of securing them. That is all we're being led to believe though. That financial implications are the only worry we should have. Fuck that.

    We can't even drag the CEOs behind absolutely disgusting hacks like Equifax to jail, so how in the world are we going to deal with smaller hacks like those that hit Bell Canada?

    Oh right, 13 Russians are all we should worry about. No.

    The answer is NOT more monitoring. It's going after the bastards on Wall St., Bay St., and any other place they hang out.

  11. Deep Neural Networks for Bot Detection Evasion by aglider · · Score: 1

    Easy, isn't it?

    --
    Sent as ripples into the electromagnetic field. No single photon has been harmed in the process.
  12. Ideal for an adversarial generative network by Anonymous Coward · · Score: 0

    This looks like it will be ideal as one half of a generative adversarial network. Plug a spam bot into the other side and before we know it the spam bots will be even more convincing. Yay :-(

  13. Overkill by An+dochasac · · Score: 1

    You don't need deep neural networks when this will do:

    egrep 'MAGA|NO COLLUSION|FAKE NEWS|LIBTARD' > /russian_bots.txt