Data Tables: The scourge of falsely significant results #MRX

By LoveStats on November 9, 2010

Who doesn’t have fond thoughts of 300-page data tabulation reports? Page after page of crosstab after crosstab, significance test after significance test. Oh, it’s a wonderful thing and it’s easy as pie (mmmm…. pie) to run your fingers down the rows and columns to identify all of the differences that are significant. This one is different from B, D, F, and G. That one is different from D, E, and H. Oh, the abundance of surprise findings!

But let me take you back to your introductory statistics class in college or university. Significance testing is a process we use to judge whether the difference between Number A and Number B is larger than what we would expect from chance alone. As an industry, we have generally agreed that we are willing to put up with a 5% chance of error: a 5% chance of calling a difference significant when it is really just random chance. And each individual test carries that 5% chance of error.

Now let’s think about those lovely tabulation reports that contain thousands of individual significance tests. Did you realize that each of those significance tests has a 5% chance of error? So, that’s 5% plus 5% plus 5% plus….. I can’t bear to do the math and I’m not even sure I can do the math.
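For anyone who can bear the math, here is a minimal Python sketch of it. It assumes the tests are independent and that no real differences exist, which is a simplification for tabulation reports where many tests share the same respondents. The function names are made up for illustration. Adding up the 5%s gives the expected number of false positives; the chance of at least one false positive compounds instead of adding.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across independent tests
    when there are no real differences (illustrative helper)."""
    return 1.0 - (1.0 - alpha) ** num_tests


def expected_false_positives(num_tests: int, alpha: float = 0.05) -> float:
    """Average number of falsely significant results: the '5% plus 5%' sum."""
    return num_tests * alpha


for n in (1, 10, 100, 1000):
    print(
        f"{n:>5} tests: "
        f"{familywise_error_rate(n):.1%} chance of at least one false positive, "
        f"about {expected_false_positives(n):.1f} false positives expected"
    )
```

Under these assumptions, ten tests already carry roughly a 40% chance of at least one false alarm, and a report with 1,000 tests would be expected to show about 50 "significant" differences that are nothing but chance.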

If you’re a researcher who wants to understand why you are making the decisions you are making and why sometimes your results don’t pan out in the marketplace, this is something you need to know. Terms like post hoc tests, multiple comparisons, familywise error, Bonferroni, Scheffé, and Tukey aren’t just cool-sounding statistical jargon. They are procedures that keep the error rate across an entire study at your desired level, whether 5%, 10% or 1%.
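To make one of those terms concrete, here is a rough sketch of the Bonferroni correction, the simplest of the bunch: divide your desired error rate by the number of tests and hold every test to that stricter threshold. The p-values below are invented for illustration and the function name is hypothetical. Scheffé and Tukey work differently and are not shown here.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag which tests stay significant after a Bonferroni correction.

    Each p-value is compared with alpha divided by the number of tests,
    which keeps the error rate across the whole set at or below alpha.
    """
    if not p_values:
        return []
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# Hypothetical p-values from five comparisons in one table.
p_values = [0.001, 0.012, 0.030, 0.049, 0.200]

uncorrected = [p <= 0.05 for p in p_values]
corrected = bonferroni_significant(p_values)

print(sum(uncorrected), "significant before correction")
print(sum(corrected), "significant after Bonferroni correction")
```

With these made-up numbers, four of the five differences look significant at the usual 5% level, but only one survives the corrected threshold of 1%. Bonferroni is known to be conservative, so treat this as a starting point rather than the last word.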

There are many things about market research that aren’t perfect, but it’s better to know them and work with them than to not know and fight them.

3 responses

andrew jeavons November 9, 2010 at 10:31 pm

but…if they do it properly all the data would be……non significant…hence no hypothesis wise error rate. i gave up explaining this years ago…nice article !
1. lovestats November 10, 2010 at 9:31 am
  
  gee…. i didn’t think of that. so let’s just forget i said this so that we will get tons of significant results. 🙂
