How 136 People Became 7 Million Illegal File-Sharers
Barence writes "The British government's official figures on the level of illegal file sharing in the UK come from questionable research commissioned by the music industry. The Radio 4 show named More or Less examined the government's claim that 7m people in Britain are engaged in illegal file sharing. The 7m figure actually came from a report written about music industry losses for Forrester subsidiary Jupiter Research. The report was privately commissioned by none other than the UK's music trade body, the BPI. The 7m figure had been rounded up from an actual figure of 6.7m, gleaned from a 2008 survey of 1,176 net-connected households, 11.6% of which admitted to having used file-sharing software — in other words, only 136 people. That 11.6% was adjusted upwards to 16.3% 'to reflect the assumption that fewer people admit to file sharing than actually do it.' The 6.7m figure was then calculated based on an estimated number of internet users that disagreed with the government's own estimate. The wholly unsubstantiated 7m figure was then released as an official statistic."
I actually had mixed feelings about this summary, because:
1) Usually pro-filesharers try to make it sound like filesharing is a normal activity and go for "most" or a 70-90% share of users
2) The summary tries to paint this study as bad because it "downsizes" the number of filesharers
3) The rant about surveying only 1,176 people for the study. TV viewer statistics and other studies are made the same way.
So could someone please explain *why* this is questionable research? It is like every other study where you survey a small number of people and extrapolate from them to the whole population. A sample of this size usually gives roughly correct results for the whole population. There's some margin of error, but it's close enough.
So what is the point of this story? That statistical researchers use only a small subset of people instead of asking everyone? They always have.
That's what I was thinking. The summary makes it seem that estimating the number that high is outrageous. I certainly wouldn't wager any money that it's significantly higher than actual piracy.
What doesn't kill you only delays the inevitable
Whenever you estimate a statistic like that, you should also indicate the level of uncertainty surrounding the estimate. Why are they not reporting the upper and lower bounds of the confidence interval surrounding that estimate?
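For what it's worth, here is a rough sketch of what those bounds might look like for the raw survey figure (136 of 1,176). It uses the textbook normal-approximation (Wald) interval and assumes a simple random sample, which the summary does not confirm; the function name is just mine.

```python
import math

def wald_interval(successes, n, z=1.96):
    """Normal-approximation (Wald) interval for a proportion.

    z=1.96 corresponds to roughly 95% confidence. Assumes a simple
    random sample, which is an assumption about the survey.
    """
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    return p - z * se, p + z * se

low, high = wald_interval(136, 1176)
print(f"{low:.1%} to {high:.1%}")
```

On those assumptions the unadjusted 11.6% comes with bounds of roughly 9.7% to 13.4%. This covers sampling error only, not the upward adjustment to 16.3%.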
They think that a single copy of a song is worth over a hundred thousand dollars too. They claim to lose more in revenue each month than the GDP of most countries. All because of those dyyyeaaarrrn pirates. Enron looks positively boring in comparison to the accounting techniques the recording industry uses. None of this is news. About the only people that buy this crap are judges and legislators -- the rest of us are almost universally of the mindset that a bag of potato chips has more value than most of the recording industry's portfolio.
#fuckbeta #iamslashdot #dicemustdie
Using file-sharing software does not equate to sharing files illegally. I admit to using BitTorrent to download Fedora ISOs, and there's nothing illegal about that.
Some of the estimation steps might be sketchy, but the basic practice of estimating a population proportion from a sample of that population is not particularly questionable. That's how almost all studies of populations work, because taking censuses of all people in a country is rarely feasible. We have century-old statistical theory on how to put bounds on the sampling error, too, assuming the sample was indeed random.
You could have a whole slew of these stories if you really objected to that basic methodology, e.g. nearly every estimate of N million people suffering from a disease or disorder is based on a sample.
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
"If the facts don't fit the theory, change the facts."
~Albert Einstein
Funny may not give karma, but +5 Informative never made anyone snort coffee out their nose.
maybe the authors of the study were taught math skills through unschooling?
weinersmith
using file sharing software does not mean you pirate software or media.....
136 out of 1176 people in households with internet connections admitted to having used file-sharing software (source: the summary)
18.3 million households in the UK had internet access at time of polling in 2009 (source: http://www.statistics.gov.uk/CCI/nugget.asp?ID=8 )
136/1176 * 18.3M ~= 2.12M
Not sure if "having used file-sharing software" means that they downloaded / distributed at least 1 item - say, a song - via said software and that they had no actual rights to do so (you know, as most people use file-sharing software to distribute Linux distros, or have simply 'used it' but didn't actually download or upload anything... *cough*)...
But let's presume it does.
Then let's take the low price in iTunes UK of GBP 0.79 per song, then the music industry 'lost' ('cos obviously people had no intention of buying that song that they didn't download / distribute because they were downloading a Linux distro instead *cough*) about GBP 1,671,897.96.
Well, that's peanuts, innit.
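The back-of-envelope sum above can be reproduced in a few lines. This is only a sketch of the same arithmetic: it keeps the 18.3M households and GBP 0.79 figures quoted in the comment, and the one-song-per-household assumption is implicit in the original number rather than anything the study states.

```python
admitted = 136               # respondents who admitted using file-sharing software
sampled = 1176               # households surveyed
households_online = 18.3e6   # UK households with internet access, as quoted above
price_per_song = 0.79        # GBP, low iTunes UK price as quoted above

# Scale the sample proportion up to all online households.
households_sharing = admitted / sampled * households_online

# Assumption: exactly one song "lost" per household.
implied_loss = households_sharing * price_per_song

print(f"{households_sharing:,.0f} households")
print(f"GBP {implied_loss:,.2f}")
```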
This is yet another example as to why the BBC is the finest broadcasting and journalistic organisation on the planet (I've never worked for them, sold to them or have any other financial connection other than the license fee).
They actually investigated something created by an industry group and found it to be bollocks and then reported it. The BBC are arguably the most "socialist" organisation in the democratic world (funded by a tax on everyone for the benefit of everyone) and yet they still question and challenge everything.
The US seriously needs something that questions vested interests and rubbish statistics as much as the BBC. Jon Stewart and Bill Maher are just comedians and FoxNews is just comedy.
Given a choice between the first amendment and the BBC, I'll take the BBC; it's demonstrated more freedom of speech in a week than the US media has in a decade.
An Eye for an Eye will make the whole world blind - Gandhi
When you know the total population of the UK is roughly 30 million households, that's a fair chunk of the population. (total population is roughly 60 million people)
Out of the total population, only 18.7 million have broadband. Guess roughly 40% of the population is a pirate then. We should make it legal, government being there for the populace and all that.
O "Statistics are like a drunk with a lamppost: used more for support than illumination."
O "The only statistics you can trust are those you falsified yourself."
Tick one.
> Work backwards from the undisputed declining sales figures of the recording industry.
The main reason for declining sales is the fact that CD sales during the 90s were artificially boosted by people replacing records and tapes with CDs... then replacing them again when remastered CDs were released a few years later. It was a once-in-a-lifetime event for the recording industry that won't be repeated during our lifetimes.
People re-bought CDs they already owned in analog (or optimized-for-analog CDs) because they represented an epic improvement in quality by just about any meaningful standard over the analog media they replaced. Everything that's come out since CDs has only been cheaper, shittier-sounding, or intolerably-crippled by DRM.
Here's an idea for the music industry: ditch the DRM'ed formats, and roll out a music format on DVD media with 96KHz 32-bit stereo PCM. Make the discs gold-colored, call it something like "X-fi", and sell them for $24.95. You'll win on all counts -- genX'ers will go back into highschool mode and buy them to show off how rich they are and/or pretend they sound sufficiently better than 16-bit CDs to justify spending ~twice as much on them, and the fact that every disc will be ~4-8 gigabytes will serve as self-limiting DRM for the next decade or so. Just make sure they still have the MOST compelling consumer benefit intact (and reason why people who buy CDs still DO buy CDs): it's a flawless first-generation master to use for making all your "working" copies for everywhere else.
As a solipsist I'd say everyone does it.
This is almost as cliché in arguments about statistics as the car analogy is on Slashdot, and it's the sign of a scoundrel. If you actually had a first-year stats student's understanding of stats you'd know where the weaknesses actually are, and where all the rest of the smoke blown in this discussion goes laughably wrong.
So let's apply some first year stats to the issue.
First, the sample size. Whether it is numerically large enough to be useful is a matter not only of its size but also of the number of positive results. IOW, a sample size of 1176 is too small if you found 3 of what you're looking for, but if you found 136 (11.6% of 1176), you have plenty of samples. The question is then only whether you had a representative sample.
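To put rough numbers on the 3-versus-136 point, one could compare the relative standard error of the estimated proportion in both cases. A minimal sketch, assuming a simple random sample and the usual binomial standard error; the helper name is made up for illustration.

```python
import math

def relative_standard_error(hits, n):
    """Standard error of a sample proportion, as a fraction of the proportion itself."""
    p = hits / n
    se = math.sqrt(p * (1 - p) / n)  # binomial standard error
    return se / p

for hits in (3, 136):
    rse = relative_standard_error(hits, 1176)
    print(f"{hits} hits out of 1176: relative standard error {rse:.0%}")
```

With 3 hits the estimate is uncertain by more than half its own value; with 136 hits the relative error is under a tenth, which is why the sample size by itself is not the weak point.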
My next concern would be precision. Using data with three or four significant digits (136, 1176) to make conclusions to seven significant digits (11.56463%) is silly, but that doesn't seem to have happened here. The only number in all of this that is fishy is the 16.3% number. To get three significant digits they'd have to know the number of lying households to that precision. If they had another study that determined this number they might very well have a number to that precision, but I'm assuming they just guessed.
That's still not a problem. If you guess, you run your confidence interval through your formulae (here it's a simple product) to put a range on your results. If it's a from-your-ass guess you might put a 100% failure estimate on your low end (i.e. there might be no lying households at all) to arrive at a conservative range. Here, it looks like they used an estimate of 40%. They should have (and might have; I didn't RTFA) run the un-adjusted 11.6% through the formulae to get a conservative low-end range.
Anyway, the number they finally used was 7m. One significant digit. That doesn't imply the same precision as, say, 6.7m would. In fact, if their figure for the number of lying households really was accurate to one digit (i.e. 35-45%) then rounding their final result to one digit was the correct procedure. If it was just a guess they should have run the absolute low estimate (probably, zero lying households) through to get a range.
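As a sketch of what "running the low estimate through" could look like: the internet-user base below is not given in the summary, so it is backed out from the summary's own 6.7m and 16.3% figures. That back-calculation is my assumption, not something the report states.

```python
adjusted_rate = 0.163     # after the upward adjustment, per the summary
unadjusted_rate = 0.116   # what respondents actually admitted to
headline = 6.7e6          # file-sharers, per the summary

# Assumption: the headline is simply adjusted_rate * base, so solve for the base.
implied_base = headline / adjusted_rate

# Conservative low end: nobody under-reported, so use the unadjusted rate.
low_end = unadjusted_rate * implied_base

# Size of the upward adjustment relative to the admitted rate.
uplift = adjusted_rate / unadjusted_rate - 1

print(f"implied base: {implied_base / 1e6:.1f}m internet users")
print(f"range: {low_end / 1e6:.1f}m to {headline / 1e6:.1f}m")
print(f"upward adjustment: {uplift:.0%}")
```

On those assumptions the conservative range would run from roughly 4.8m to 6.7m, and the uplift works out to about 40%, matching the estimate mentioned above.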
So, with actual first year stat knowledge it's possible to actually state what might be wrong with the study, and not resort to "any first year stat student" hand-waving. It's clear that the most-cited criticism (the sample size) is the result of ignorance and group think, not actual knowledge of statistics.
Assume nothing. Google is your friend.
Google: sample size
First result has all you need.
And liars figure.
The best way to shut these slime-oids up would be to conduct a forensic audit of their royalty payments to artists. I bet not one of the companies would come out clean.
I've calculated my velocity with such exquisite precision that I have no idea where I am.