How Much Internet Traffic Is Fake? Turns Out, a Lot of It, Actually. (nymag.com)
Long-time Slashdot reader AmiMoJo shared this article from New York magazine:
In late November, the Justice Department unsealed indictments against eight people accused of fleecing advertisers of $36 million in two of the largest digital ad-fraud operations ever uncovered... Hucksters infected 1.7 million computers with malware that remotely directed traffic to "spoofed" websites.... [B]ots "faked clicks, mouse movements, and social network login information to masquerade as engaged human consumers." Some were sent to browse the internet to gather tracking cookies from other websites, just as a human visitor would have done through regular behavior. Fake people with fake cookies and fake social-media accounts, fake-moving their fake cursors, fake-clicking on fake websites -- the fraudsters had essentially created a simulacrum of the internet, where the only real things were the ads.
How much of the internet is fake? Studies generally suggest that, year after year, less than 60 percent of web traffic is human; some years, according to some researchers, a healthy majority of it is bot. For a period of time in 2013, the Times reported this year, a full half of YouTube traffic was "bots masquerading as people," a portion so high that employees feared an inflection point after which YouTube's systems for detecting fraudulent traffic would begin to regard bot traffic as real and human traffic as fake. They called this hypothetical event "the Inversion...."
[N]ot even Facebook, the world's greatest data-gathering organization, seems able to produce genuine figures. In October, small advertisers filed suit against the social-media giant, accusing it of covering up, for a year, its significant overstatements of the time users spent watching videos on the platform (by 60 to 80 percent, Facebook says; by 150 to 900 percent, the plaintiffs say). According to an exhaustive list at MarketingLand, over the past two years Facebook has admitted to misreporting the reach of posts on Facebook Pages (in two different ways), the rate at which viewers complete ad videos, the average time spent reading its "Instant Articles," the amount of referral traffic from Facebook to external websites, the number of views that videos received via Facebook's mobile site, and the number of video views in Instant Articles.
The author also shared a Twitter thread by the Washington Post's director of advertising technology, who airs his own complaints about the ecosystem of online advertising. "The problem isn't just that the internet is full of fakery and bullshit and bad numbers and malfunctioning metrics and bullshitters and fraudsters. The problem is that all the fake shit is layered on top of other fake shit and it just COMPOUNDS itself... Like you get fake users, who get autoplay videos which no one is really watching....
"That's not even counting the entire ad campaigns that are fake where the product is just a bullshit excuse to collect data on you."
Pond scum feeding on pond scum. I'm having a hard time drumming up concern.
Why can't the operators of these servers join a multi-publisher subscription network? Two decades ago, such a network called Adult Check was popular, founded on the principle that adults can pay for nice things. One $10/mo payment bought access to all sites that took Adult Check, and the network paid publishers per page view. This helped to alleviate the sticker shock from each website charging a separate subscription.
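To make the "one payment, paid out per page view" idea concrete, here is a rough sketch. It assumes the network splits each subscriber's monthly fee pro rata by that subscriber's page views, which is only one possible reading of "paid publishers per page view" and is not a description of how Adult Check actually did its accounting. The function name, the site names, and the rule that rounding leftovers stay with the network are all made up for illustration.

```python
def split_subscription(fee_cents: int, page_views: dict[str, int]) -> dict[str, int]:
    """Split one subscriber's monthly fee among publishers by page views.

    Hypothetical pro-rata rule: each publisher gets a share of the fee
    proportional to how many of this subscriber's page views it served.
    Amounts are in whole cents and rounded down; any remainder is
    assumed to stay with the network.
    """
    total_views = sum(page_views.values())
    if total_views == 0:
        # Subscriber visited no member sites this month: nothing to pay out.
        return {}
    return {
        site: fee_cents * views // total_views
        for site, views in page_views.items()
    }


# One subscriber, $10/mo, three hypothetical member sites.
views = {"site-a.example": 30, "site-b.example": 10, "site-c.example": 10}
payouts = split_subscription(1000, views)
```

The reader sees a single $10 charge no matter how many member sites they visit, which is the sticker-shock point above; how the money is divided behind the scenes is the network's problem.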
More recently, Google Contributor could have been that network. The biggest problem with Contributor is lack of privacy, as it shares a parent company with AdSense and DoubleClick. This means Google can use page history gathered through Contributor to infer interests of a Contributor user for use on sites using Google adtech.
The moderation mechanism described in xkcd #810 already resembles that in use on various forums and Q&A sites, such as Slashdot and Stack Overflow.
1. Each newly registered user sees a page of what Stack Overflow calls "review audits". This resembles Slashdot metamoderation: does what the new user sees as constructive align with what established users see as constructive?
2. Anyone who gets most of the review audits correct has posts placed in "awaiting moderation" state. Only established users can see such a post until at least one established user upvotes the post.
3. Once a user is firmly in positive reputation/karma, the user's posts skip the "awaiting moderation" state.
Yet this hasn't led to any artificial intelligence breakthroughs on the part of the spam industry. Instead, I've noticed that spammers on forums.nesdev.com appear to be humans in low-exchange-rate countries. They search for an old post, reword it, start a discussion, and days later edit the post to include off-topic commercial links. A user who isn't paying close attention is unlikely to see this karma whoring for what it is.
Don't care. Nary a tear shed.
Fucking parasites.
Eventually they'll catch on, but for now, one of life's little pleasures is to make it as tough as possible for those bottomfeeders to make a buck.
I've always maintained that the way to beat the panopticon companies isn't with ad blockers and privacy legislation. It's to dilute the value of the data they collect by inserting so much fake data that they can no longer sufficiently distinguish real people from the bots.
There's an apocryphal story that after the end of the Cold War, a bunch of the CIA and KGB got together for drinks. The CIA spooks lamented that theirs had been the harder job. The Soviet Union was such a closed society and had so many restrictions on travel that it was virtually impossible for the CIA to get a spy in there, whereas all the KGB had to do was drive to a town next to a military base and mingle with staff from the base eating lunch there. The KGB spooks disagreed and claimed that theirs had been the harder job. The U.S. produced so much information that it was virtually impossible for them to separate out fact from fiction. If the National Enquirer ran a story about the military working on a, or some conspiracy theorist reported the military was controlling their brain waves with weather balloons, they had to devote resources to figure out if the stories were real or fake.
Then there is the issue of "personalized ads" which ARE relevant but don't generate a purchase. For example, Google serving you lawnmower ads after you've just bought the only lawnmower you will need for the next 10 years, or ads for holiday packages in your hometown that you are never going to buy because your parents have a spare room to share, or ads by resellers you are never going to buy from because you buy from Amazon. But advertisers pay for those ads because users click on them accidentally or for kicks. Personalized ads are scams all the way down.