I’m sorry but representative samples are 100% unattainable

Statistics are just numbers. 1 + 2 is always 3 even if the 2 was written in a disgusting colour. People, on the other hand, have crappy days all the time. It could be because a lunch was packed without cookies or because horrible tragedy has struck.

So why does it matter? Because crappy days mean someone:

  • doesn’t answer a phone survey
  • lies on their taxes
  • makes a mistake on the census survey
  • accidentally skips page 2 on a paper survey
  • drips sarcasm all over their Facebook page

You recognize these. We call them data quality issues.

Statistics lull us into a false sense of accuracy. They rest on premises that do not hold true for beings with independent thought. They lead us to believe that representative samples are possible when theory dictates they are impossible. Even “real” science, though a million times better at this than the humanities can ever dream of being, can’t achieve representative samples. The universe is just far too big to allow that.

In other words, even when you’ve done everything statistically possible to ensure a rep sample, humans and their independent thought have had a crappy day somewhere in your research design.
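For anyone who wants to see a crappy day do its damage, here is a minimal sketch in Python. Everything in it is made up for illustration: the 1 to 10 "mood score", the population size, and especially the response model, which simply assumes that people having a bad day are less likely to answer. The function name `simulate_nonresponse` is hypothetical too. It draws a textbook simple random sample and then lets the humans decide whether to respond.

```python
import random
from statistics import mean


def simulate_nonresponse(seed=42, population_size=100_000, sample_size=5_000):
    """Return the mean mood score of the population, of a simple random
    sample, and of the people in that sample who actually respond."""
    rng = random.Random(seed)

    # Hypothetical population: a mood score from 1 (awful day) to 10 (great day).
    population = [rng.randint(1, 10) for _ in range(population_size)]

    # Step 1: a simple random sample, drawn without replacement.
    sample = rng.sample(population, sample_size)

    # Step 2: assumed response model (an illustration, not an empirical finding).
    # The chance of answering rises from 0.26 at score 1 to 0.80 at score 10.
    respondents = [score for score in sample if rng.random() < 0.2 + 0.06 * score]

    return mean(population), mean(sample), mean(respondents)


if __name__ == "__main__":
    pop_mean, sample_mean, respondent_mean = simulate_nonresponse()
    print(f"population mean:  {pop_mean:.2f}")
    print(f"sample mean:      {sample_mean:.2f}")
    print(f"respondent mean:  {respondent_mean:.2f}")
```

Under these assumptions the random sample tracks the population closely, while the respondents skew noticeably happier than the people they are supposed to represent. The sampling step was done right. The bias came in afterwards, from who chose to answer.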

There is no such thing as a rep sample. There are only good approximations of what we think a rep sample would look like.

And because I AM CANADIAN, I apologize if I have crushed any notions.

6 responses

    1. I agree, and thanks for this post! It can’t be perfectly representative because you can’t know where all of the bias can creep in, and even if you did, you wouldn’t be able to control for it. But of course that means you do your best, acknowledge where it’s off, and produce insights to drive decision making (I just added that last sentence so the “poor students” don’t get lazy).


      1. I harnessed the power of your comment and engaged the community to create more insights. 🙂

    2. You had a bad day when writing this post, right?

      I sincerely hope you didn’t encourage poor students to make do with small samples because “it doesn’t matter anyway”.

      1. 🙂 Glad to have hit a nerve! I see so many people who indiscriminately use statistics to force a point without fully understanding the statistics they are using. Statistics help us make decisions. They aren’t THE decision.

    3. Perfection does not exist, I agree! Every method has its limits. I believe the important thing is to be as rigorous as possible and to be transparent by clearly explaining the limits of the method used…

    4. Social comments and analytics for this post…

      This post was mentioned on Twitter by tomderuyck: RT @lovestats: I’m sorry but representative samples are 100% unattainable: http://wp.me/pow9s-xv
