Slashdot Mirror


Over 40% of New Mechanical Turk Jobs Involve Spam

An anonymous reader writes "An NYU study reveals that over 40% of the jobs posted by new employers on MTurk are some sort of spam request, such as fake account creation, fraudulent ad clicks, or fake comments, tweets, likes and votes. The study also shows that the bad jobs could be automatically filtered with 95% accuracy, but Amazon is not interested."

21 of 56 comments

  1. Good! More people should play chess. by santax · · Score: 2
    1. Re:Good! More people should play chess. by oldspewey · · Score: 2

      You won't know until you open the lid on the box.

      --
      If libertarians are so opposed to effective government, why don't they all move to Somalia?
  2. Amazon is not interested by whathappenedtomonday · · Score: 3, Insightful
    Because Amazon only cares about ToS, and about nothing else.

    "We look forward to continuing to serve our AWS customers and are excited about several new things we have coming your way in the next few months."

    Well, I'm looking forward to you confirming the deletion of my account I requested a week ago. And that 2nd part sounds like a threat.

    --
    I hope I didn't brain my damage.
  3. Hmmm by sexconker · · Score: 4, Interesting

    So when 40% of their MT service usage is contrary to the ToS, everything's fine and dandy.

    But when Wikileaks is in full compliance with the ToS of their EC2 service, they get the boot?

  4. MTurk by fiannaFailMan · · Score: 4, Informative

    I had to look this up.

    Amazon Mechanical Turk (beta)

    Amazon Mechanical Turk is a marketplace for work that requires human intelligence. The Mechanical Turk web service enables companies to programmatically access this marketplace and a diverse, on-demand workforce. Developers can leverage this service to build human intelligence directly into their applications.

    While computing technology continues to improve, there are still many things that human beings can do much more effectively than computers, such as identifying objects in a photo or video, performing data de-duplication, transcribing audio recordings or researching data details. Traditionally, tasks like this have been accomplished by hiring a large temporary workforce (which is time consuming, expensive and difficult to scale) or have gone undone.

    Mechanical Turk aims to make accessing human intelligence simple, scalable, and cost-effective. Businesses or developers needing tasks done (called Human Intelligence Tasks or “HITs”) can use the robust Mechanical Turk APIs to access thousands of high quality, low cost, global, on-demand workers—and then programmatically integrate the results of that work directly into their business processes and systems. Mechanical Turk enables developers and businesses to achieve their goals more quickly and at a lower cost than was previously possible.

    --
    Drill baby drill - on Mars
    1. Re:MTurk by Anonymous Coward · · Score: 2, Funny

      A little more informative.

      http://en.wikipedia.org/wiki/The_Turk

      The Turk, the Mechanical Turk or Automaton Chess Player was a fake chess-playing machine constructed in the late 18th century. From 1770 until its destruction by fire in 1854, it was exhibited by various owners as an automaton, though it was exposed in the early 1820s as an elaborate hoax.[1] Constructed and unveiled in 1770 by Wolfgang von Kempelen (1734–1804) to impress the Empress Maria Theresa, the mechanism appeared to be able to play a strong game of chess against a human opponent, as well as perform the knight's tour, a puzzle that requires the player to move a knight to occupy every square of a chessboard exactly once.

      The Turk was in fact a mechanical illusion that allowed a human chess master hiding inside to operate the machine. With a skilled operator, the Turk won most of the games played during its demonstrations around Europe and the Americas for nearly 84 years, playing and defeating many challengers including statesmen such as Napoleon Bonaparte and Benjamin Franklin. Although many had suspected the hidden human operator, the hoax was initially revealed only in the 1820s by the Londoner Robert Willis.[2] The operator(s) within the mechanism during Kempelen's original tour remains a mystery. When the device was later purchased and exhibited by Johann Nepomuk Mälzel, the chess masters who secretly operated it included Johann Allgaier, Boncourt, Aaron Alexandre, William Lewis, Jacques Mouret, and William Schlumberger.

      Figures the original was a scam.

    2. Re:MTurk by nickb64 · · Score: 2

      I take it you haven't read Doctorow's For The Win, then? If not, then I definitely suggest it.

    3. Re:MTurk by Monchanger · · Score: 2

      Calling Turkey "islamic" is in fact offensive, since Islam does not define that country, just as "Anglican" does not define the UK.

      That you do not know this and make awful jokes reflects poorly on me, thus the need to apologize for my failing to ensure you don't run around the world being a jackass. It doesn't have to be my house; an idiot child brought to a restaurant demands that apologies be made to the other guests there as much as to the proprietor.

      When you can comprehend these points, you may apologize to me. Until then, you can feel just as free to go fuck off yourself.

  5. Filtering by AndrewNeo · · Score: 4, Funny

    So, would the filtering of bad services from MTurk be performed using MTurk?

  6. Profit by The+Raven · · Score: 4, Insightful

    Same reason the USPS likes bulk mailers... they keep the operation afloat. Especially as more and more people turn to email.

    --
    "I will trust Google to 'do no evil' until the founders no longer run it." Hello Alphabet.
  7. What did you think was going to happen? by Khopesh · · Score: 2

    I know a few research scientists who use the Turk for some awesome ideas (it's a LOT cheaper than in-person human subjects and the people you get aren't homeless, drunks, or freshman psych students fulfilling requirements). However, there is little money in (non-military) basic research at the moment, and only a fraction of that even requires human subjects.

    The rest is merely a new breed of on-demand advertising and promotion. Amazon is still getting paid, so they likely don't care. I'd argue that if they don't want to squash the problem altogether, they should at least isolate it so that people have an easier time getting wherever they were heading, e.g. "help me solve vision" versus "help me get popular".

    --
    Use my userscript to add story images to Slashdot. There's no going back.
  8. 95% Accuracy? by gringer · · Score: 2

    "Accuracy" is a difficult measure to quantify. I see from reading the article that the accuracy has been estimated at 95% due to a 95% true positive rate and a 95% true negative rate. Given that the current spam rate is 40%, these rates aren't particularly bad, but Amazon would still have quite a few problems with angry customers. Assuming 1500 HITs per day, with 60% of those being non-spam submissions, 45 would be falsely flagged as spam each day.

    --
    Ask me about repetitive DNA
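
    The arithmetic in the comment above can be checked with a short sketch. The 1500 HITs per day is the commenter's assumption, the 40% spam rate comes from the story, and the function name and dictionary keys are hypothetical labels chosen for this illustration. Fractions are used so the counts come out exact rather than as rounded floats.

```python
from fractions import Fraction


def confusion_counts(total_hits, spam_rate, true_positive_rate, true_negative_rate):
    """Split a day's HITs into the four outcomes of a spam filter."""
    spam = total_hits * spam_rate
    legit = total_hits - spam
    return {
        "spam_caught": spam * true_positive_rate,
        "spam_missed": spam * (1 - true_positive_rate),
        "legit_passed": legit * true_negative_rate,
        "legit_flagged": legit * (1 - true_negative_rate),
    }


# Figures from the comment: 1500 HITs/day (assumed), 40% spam, 95% TPR and TNR.
counts = confusion_counts(
    total_hits=1500,
    spam_rate=Fraction(40, 100),
    true_positive_rate=Fraction(95, 100),
    true_negative_rate=Fraction(95, 100),
)

# Share of correct decisions, and share of flagged HITs that really are spam.
accuracy = (counts["spam_caught"] + counts["legit_passed"]) / 1500
precision = counts["spam_caught"] / (counts["spam_caught"] + counts["legit_flagged"])

for name, value in counts.items():
    print(f"{name}: {int(value)}")
print(f"accuracy: {float(accuracy):.3f}, precision: {float(precision):.3f}")
```

    Under these assumptions the 45 falsely flagged HITs sit alongside 30 spam HITs that slip through, so a single "95% accuracy" figure hides two different kinds of error.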
  9. And that was before Google Places appeared in Web by Animats · · Score: 4, Interesting

    That data is from two months back, before Google Places appeared in web search. Now, it's worse. There's a whole mini-industry in the "black hat" search engine "optimization" community creating phony Google Places entries. Here's an ad on Mechanical Turk today:

    Reno Gym - Google Maps Promotion (Client QMDHKOB)
    Requester: Smartsheet.com Clients
    HIT Expiration Date: Dec 18, 2010 (10 hours 52 minutes) Time Allotted: 60 minutes
    Reward: $0.25 HITs Available: 2
    Description:

    • Follow Instructions on PDF attached for BUSINESS ADDRESS (1)
    • Repeat Instructions on page 5 to 14 for BUSINESS ADDRESS (2) and (3) below.
      GMAIL ADDRESS: [Create a new Gmail Account] PASSWORD:
      BUSINESS ADDRESSES:
      • (1) 6370 Mae Anne Avenue, Reno, NV 89523
      • (2) 4784 Caughlin Parkway, Reno, NV 89519
      • (3) 18603 Wedge Parkway, Reno, NV 89511

      BUSINESS TITLE AND FULL ADDRESSES:

      • (1) Anytime Fitness 6370 Mae Anne Avenue, Reno, NV 89523 (775) 746-8400
      • (2) Anytime Fitness 4784 Caughlin Parkway, Reno, NV 89519 (775) 622-8034
      • (3) Anytime Fitness 18603 Wedge Parkway, Reno, NV 89511 (775) 852-7007

      WEBSITE URL: http://renogyms.org/
      GOOGLE PLACES URLs:

    Keywords: Smartsheet, Reno, Gym, Google, Maps, Promotion, QMDHKOB

    Google Places spamming hasn't been fully automated yet, so we get to watch spammers outsource their manual spamming. Spamming Google Places is incredibly easy, much easier than creating the link farms required to spam Google's old web search. See the instructions in "Dominating Google Maps- The Most Effective Spam Ever And What You Can Learn From It".

    Google Places has been 0wned.

  10. Is the filtering of bad workers the same? by YesIAmAScript · · Score: 3, Interesting

    I find it interesting that the people placing the HITs have to decide whether the work done is good quality and then decide to pay or not. So that means for each tiny job you farm out, you have to do your own tiny bit of make-work to decide whether to pay or not. Can you farm this out on the Turk too? If not, maybe there's a market for a service that lets you do so...

    --
    http://lkml.org/lkml/2005/8/20/95
    1. Re:Is the filtering of bad workers the same? by whitehaint · · Score: 2

      I did the Turk thing for a couple of days, and you nailed it. Sometimes a job would pop up reviewing someone else's work. I quit, though, because being offered a pittance for tasks that take time, then having the requester take the work and not pay (you have no control over that) or decide to sit on it for a few weeks, well, it's crap.

  11. Only 40%? by D+J+Horn · · Score: 3, Interesting

    From my time exploring MTurk I would have guessed it to be much higher than that; non-spam jobs were definitely the minority of what I saw.

    The creepiest (and highest paying) job I saw though involved watching surveillance footage from airports, making sure the automated face tracker stayed on target...

  12. Re:Hmmm by forkfail · · Score: 4, Funny

    So, obviously, Wikileaks should have hired people at 0.0001 cents per word to type in the leaked documents.

    --
    Check your premises.
  13. Re:Wait, humans taking over robot jobs? by icebike · · Score: 3, Interesting

    The surprise is that anyone noticed all these HIT requests.

    Who, other than the utterly unemployable, has time to take on meaningless tasks dished out by machine for pennies? You can find more money lying on the ground in a parking lot.

    A casual perusal didn't find one task I would do for fun or profit.

    --
    Sig Battery depleted. Reverting to safe mode.
  14. Re:And that was before Google Places appeared in W by cowtamer · · Score: 2

    I hope the folks at Google start trolling the same MTurk job listings to mark down location spam for what it is...

  15. Re:And that was before Google Places appeared in W by Animats · · Score: 5, Interesting

    I'm also surprised at how low the wages are at this Turk thing. ... I thought spammers had to at least sweat through that manual task by themselves.

    It's like $0.25 per human-generated spam. Automation seems to be coming. I'm seeing mentions on black hat SEO forums that an automated tool for doing this in bulk will be released early next month.

    Marketing fake numerical addresses in between legit ones ensures that Google Pagerank rates your "unique" business as #1...

    Sometimes. That technique is mostly used to give real businesses extra bogus locations. Check out "New York City locksmith", for example. Other heavily spammed terms are "carpet cleaning" and "divorce lawyer".

    This week's new technique is described at "How To Spam Google Maps For Top Google Place Listings". This is like SQL injection for mailing addresses. The trick depends on Google's parsing of mailing addresses from the top, while USPS standards say they should be parsed from the bottom line upward. So a mailing address with two street addresses is parsed differently by the USPS and Google, allowing the spammer to redirect Google's confirmation postcard to some mail drop.
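
    A toy sketch of the parsing mismatch described above, assuming nothing about how Google or the USPS actually implement address parsing: the two regular expressions, the function names, and the sample address are all hypothetical and exist only to show how top-down and bottom-up readings of the same lines can pick different street addresses.

```python
import re

# Hypothetical heuristics, for illustration only.
# A line "looks like" a street address if it starts with a house number.
STREET_LINE = re.compile(r"^\d+\s+\S+")
# A line "looks like" the last line if it ends with ", ST 12345" or ", ST 12345-6789".
CITY_LINE = re.compile(r",\s*[A-Z]{2}\s+\d{5}(-\d{4})?$")


def street_top_down(lines):
    """Return the first street-looking line, scanning from the top."""
    for line in lines:
        if STREET_LINE.match(line):
            return line
    return None


def street_bottom_up(lines):
    """Find the city/state/ZIP line from the bottom, then return the
    nearest street-looking line above it."""
    seen_city_line = False
    for line in reversed(lines):
        if not seen_city_line:
            seen_city_line = bool(CITY_LINE.search(line))
            continue
        if STREET_LINE.match(line):
            return line
    return None


# Made-up address with two street lines.
ADDRESS = [
    "Example Locksmith",
    "100 Real Street",
    "200 Mail Drop Avenue",
    "Springfield, NY 10001",
]

print("top-down picks: ", street_top_down(ADDRESS))
print("bottom-up picks:", street_bottom_up(ADDRESS))
```

    In this toy model the top-down reader takes the first street line as the business location while the bottom-up reader delivers to the second, which is the kind of disagreement the trick relies on.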

    Google seems to be out to lunch in this area. The same exploits have been working for months. Yet Google doesn't list any such issues under "Known Issues". Over on Matt Cutts' blog, where you'd expect to see some discussion of this, he reports that he's writing a novel.

    It's even worse at Bing. Bing emulated Google's October 27th merger of Places into web search within a few days. But they weren't ready. Look up "New York City locksmith" in Bing, and the five "Places" entries are all the same business.

  16. Re:Fake comments, eh? by Xaositecte · · Score: 2

    So is this one

    but in /. posting xkcd links automatically overcomes the filter.