Slashdot Mirror


How 136 People Became 7 Million Illegal File-Sharers

Barence writes "The British government's official figures on the level of illegal file sharing in the UK come from questionable research commissioned by the music industry. The Radio 4 show named More or Less examined the government's claim that 7m people in Britain are engaged in illegal file sharing. The 7m figure actually came from a report written about music industry losses for Forrester subsidiary Jupiter Research. The report was privately commissioned by none other than the UK's music trade body, the BPI. The 7m figure had been rounded up from an actual figure of 6.7m, gleaned from a 2008 survey of 1,176 net-connected households, 11.6% of which admitted to having used file-sharing software — in other words, only 136 people. That 11.6% was adjusted upwards to 16.3% 'to reflect the assumption that fewer people admit to file sharing than actually do it.' The 6.7m figure was then calculated based on an estimated number of internet users that disagreed with the government's own estimate. The wholly unsubstantiated 7m figure was then released as an official statistic."

10 of 313 comments

  1. Wait, you believed them? by girlintraining · · Score: 3, Informative

    They think that a single copy of a song is worth over a hundred thousand dollars too. They claim to lose more in revenue each month than the GDP of most countries. All because of those dyyyeaaarrrn pirates. Enron looks positively boring in comparison to the accounting techniques the recording industry uses. None of this is news. About the only people that buy this crap are judges and legislators -- the rest of us are almost universally of the mindset that a bag of potato chips has more value than most of the recording industry's portfolio.

    --
    #fuckbeta #iamslashdot #dicemustdie
  2. the story title is kind of lame by Trepidity · · Score: 5, Informative

    Some of the estimation steps might be sketchy, but the basic practice of estimating a population proportion from a sample of that population is not particularly questionable. That's how almost all studies of populations work, because taking censuses of all people in a country is rarely feasible. We have century-old statistical theory on how to put bounds on the sampling error, too, assuming the sample was indeed random.

    You could have a whole slew of these stories if you really objected to that basic methodology, e.g. nearly every estimate of N million people suffering from a disease or disorder is based on a sample.

  3. Re:Story meaning? by wizardforce · · Score: 4, Informative

    So could someone please explain *why* is it a questionable research.

    1. the sample size is small.. probably too small to make the claims they did. 2. they altered the numbers on an estimate of how many people fileshare on the assumption that the number was under-reported. 3. conflict of interest... it's like the tobacco industry sponsoring studies claiming that smoking doesn't have anything to do with lung cancer... there is significant reason to believe that the study carries significant bias in favor of their conclusion and must at the least be repeated by other sources.

    So what is the point of this story? That statistics researchers use only a minor subset of people to do their research instead of asking everyone? They always have.

    No. Real statistics researchers know that this study has numerous crippling flaws and should not be held as gospel by anyone. Even a first year stats student can see it. The reason this story is important is that it may influence governmental policy and it's flawed... That's dangerous.

    --
    Sigs are too short to say anything truly profound so read the above post instead.
  4. Re:Story meaning? by Trepidity · · Score: 4, Informative

    It doesn't really make sense to claim "sample size is small" for an 1,100-person sample. If the sampling was done in a random, unbiased manner, that size sample gives a margin of error of +/- 3%. If there are flaws in the sampling method, that's another thing, but the sample size alone doesn't seem problematic, unless you need accuracy better than +/- 3%.

  5. Re:Story meaning? by Trepidity · · Score: 3, Informative

    Basically, except that the confidence level for the interval is 95%, not 50%. Should've quoted that, but 95% is the usual assumed one.

  6. Re:Story meaning? by Anonymous Coward · · Score: 4, Informative

    A margin of error of +/- 3% is the maximum margin of error for a random sample of 1100 drawn from a large enough population at the 95% confidence level (actually it's really +/-2.95%), i.e. this is the margin of error when the observed % is 50%. The margin of error is smaller when the observed % approaches 0 or 100%.

    In the case of an observed % of 11.6 the margin of error is +/-1.9%, so it is 95% likely that the population figure is between 9.7% and 13.5%.
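
    As a rough check of these figures, here is a minimal Python sketch using the normal approximation for a sample proportion. The function name margin_of_error is made up for illustration, and the calculation assumes a simple random sample from a large population, which has not been verified for this survey.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p, n, confidence=0.95):
    """Normal-approximation margin of error for an observed proportion p."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p * (1 - p) / n)


n = 1100  # sample size used in the comment above

# Worst case: the margin is largest when the observed share is 50%.
worst_case = margin_of_error(0.5, n)

# Margin at the share actually reported in the survey.
observed = margin_of_error(0.116, n)

print(f"worst case: +/-{worst_case:.2%}")
print(f"at 11.6%:   +/-{observed:.2%}")
print(f"interval:   {0.116 - observed:.1%} to {0.116 + observed:.1%}")
```

    The margin shrinks as the observed share moves away from 50% because p * (1 - p) is largest at p = 0.5.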

  7. Re:Story meaning? by Atario · · Score: 5, Informative

    it's A SMALL SAMPLE

    No, it's not.

    http://www.raosoft.com/samplesize.html

    About 60 million people in the UK, sample size of 1,176, confidence level of 96% gives a margin of error of 2.99%. So, it's 96% likely that they got within 2.99 percentage points of the right answer (to the question of how many people admit to it).
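
    For anyone who wants to reproduce that number without the calculator, here is a hedged Python sketch. It assumes the standard worst-case formula (observed share of 50%) with a finite population correction; whether the linked calculator uses exactly this formula is an assumption, and the function name margin_with_fpc is invented for the example.

```python
from math import sqrt
from statistics import NormalDist


def margin_with_fpc(n, population, confidence, p=0.5):
    """Margin of error with a finite population correction (FPC)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # The FPC is nearly 1 when the sample is a tiny fraction of the population.
    fpc = sqrt((population - n) / (population - 1))
    return z * sqrt(p * (1 - p) / n) * fpc


# Figures from the comment: 1,176 respondents, roughly 60 million people.
moe = margin_with_fpc(1176, 60_000_000, 0.96)

# Same sample drawn from a population 60 times smaller, for comparison.
moe_small_pop = margin_with_fpc(1176, 1_000_000, 0.96)

print(f"60m population: +/-{moe:.2%}")
print(f"1m population:  +/-{moe_small_pop:.2%}")
```

    The two results are almost identical, which is the point: once the population is much larger than the sample, the margin depends on the sample size, not on the population size.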

    I hate seeing this "that's too small a sample size" objection to every single study, from people who clearly don't know enough about how sample sizes work.

    --
    "A great democracy must be progressive or it will soon cease to be a great democracy." --Theodore Roosevelt
  8. Re:Story meaning? by wizardforce · · Score: 3, Informative

    I just don't understand the stance that most people on this board seem to take regarding this issue. How can everyone be so supportive of what very obviously amounts to theft?

    not everyone does obviously... most reasonable slashdotters advocate for reformed copyright partly because of the unenforceable nature of longer copyright terms. many such as myself support the concept of a shorter more reasonable copyright term that does what the constitution requires: encourage the advancement of the arts.

    If you do indeed use all file-sharing applications for 100% legit purposes, please educate me what you use these services for that makes them so very essential to cause these very emotional posts here.

    most of the anger is directed toward the music/movie industry's response to piracy- weaken/destroy fair use, demonize all p2p [possibly restricting its use in the future out of fear] suing people as a scare tactic, excessive/un-constitutional fines, DRMed media etc...

    --
    Sigs are too short to say anything truly profound so read the above post instead.
  9. Scoundrel Statistics by anyaristow · · Score: 5, Informative

    Even a first year stats student can see it.

    This is almost as cliche in arguments of statistics as the car analogy is on slashdot, and it's the sign of a scoundrel. If you actually had a first year stat student's understanding of stats you'd know where the weaknesses actually are, and where all the rest of the smoke blown in this discussion goes laughably wrong.

    So let's apply some first year stats to the issue.

    First, the sample size. Whether it is numerically large enough to be useful is a matter not only of its size but also the number of positive results. IOW, a sample size of 1176 is too small if you found 3 of what you're looking for, but if you found 136 (11.6% of 1176), you have plenty of samples. The question is then only whether you had a representative sample.

    My next concern would be precision. Using data with three or four significant digits (136, 1176) to make conclusions to seven significant digits (11.56463%) is silly, but that doesn't seem to have happened here. The only number in all of this that is fishy is the 16.3% number. To get three significant digits they'd have to know the number of lying households to that precision. If they had another study that determined this number they might very well have a number to that precision, but I'm assuming they just guessed.

    That's still not a problem. If you guess, you run your confidence interval through your formulae (here it's a simple product) to put a range on your results. If it's a from-your-ass guess you might put a 100% failure estimate on your low end (i.e. there might be no lying households at all) to arrive at a conservative range. Here, it looks like they used an estimate of 40%. They should have (and might have; I didn't RTFA) run the un-adjusted 11.6% through the formulae to get a conservative low-end range.

    Anyway, the number they finally used was 7m. One significant digit. That doesn't imply the same precision as, say, 6.7m would. In fact, if their figure for the number of lying households really was accurate to one digit (i.e. 35-45%) then rounding their final result to one digit was the correct procedure. If it was just a guess they should have run the absolute low estimate (probably, zero lying households) through to get a range.

    So, with actual first year stat knowledge it's possible to actually state what might be wrong with the study, and not resort to "any first year stat student" hand-waving. It's clear that the most-cited criticism (the sample size) is the result of ignorance and group think, not actual knowledge of statistics.
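
    The range argument above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the internet-user base is back-calculated from the quoted 6.7m and 16.3% because the report's actual base figure is not given in the summary, and sampling error is ignored.

```python
# Figures quoted in the story summary.
households = 1176
admitted = 0.116           # share admitting to file sharing
adjusted = 0.163           # share after the under-reporting adjustment
headline_millions = 6.7    # figure before it was rounded up to 7m

# Number of respondents behind the 11.6% figure.
respondents = round(households * admitted)

# Relative uplift applied to the admitted share (the "lying households" guess).
uplift = adjusted / admitted - 1

# ASSUMPTION: user base implied by dividing the headline figure back out.
implied_base_millions = headline_millions / adjusted

# Conservative low end: nobody under-reported, so use the unadjusted share.
low_millions = admitted * implied_base_millions
# High end: the adjusted share, which reproduces the headline figure.
high_millions = adjusted * implied_base_millions

print(f"respondents admitting: {respondents}")
print(f"uplift applied: {uplift:.1%}")
print(f"implied base: {implied_base_millions:.1f}m")
print(f"range: {low_millions:.1f}m to {high_millions:.1f}m")
```

    Reporting the low and high ends together shows how much of the headline figure rests on the under-reporting guess rather than on the survey itself.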

  10. Re:Why the BBC rocks by _Shad0w_ · · Score: 3, Informative

    You only need one license, you can have as many tellies as you like. Portable tellies used in caravans and the like will be covered by the license for your home as well.

    If you have two houses, you will need two licenses though, afaicr - which is why students away at Uni need to buy a license - including if they're in halls - even though their permanent residence might still be their parents' house.

    I find the BBC great value and love it dearly. I suspect people will say that's because I'm white, middle class and liberal or something.

    --

    Yeah, I had a sig once; I got bored of it.