Nearly every day, I see a really cool statistic on TV or the interweeb. Everyone gets all excited about losing 312 pounds in four days, curing cancer, or eliminating measles forever. Candy is good for you! Coffee increases your memory! Drink more wine! Eat more Doritos! But if you paid ANY attention to the research methodology, you’d ignore the entire study. Here are a few of the biggest problems I see.
1) Significantly increased memory!!! Yes, when the sample size is large enough or the difference is large enough, anything is significant. So if 5 people in the control group remembered 5 things and 5 people in the test group remembered 8 things, the difference might be statistically significant. Or, if 1000 people in the control group remembered 5 things and 1000 people in the test group remembered 5.2 things, the difference might be statistically significant. Do you trust the results based on 10 people? Do you care about a difference of 0.2 points? I don’t. Get back to me when your sample sizes and effect sizes go beyond pre-test methodology sizes.
2) Cancer rates decreased by 75%!!! Yes, very nice finding. Especially when the cancer rate of the control group was 0.04% and the cancer rate of the test group was 0.01%. That is indeed a 75% decrease, but will that massive decline of 0.03 percentage points mean that you stop eating chocolate or start drinking wine? Doubt it. It’s not a meaningful difference when it comes to one single person. Get back to me when the rate decreases by 75% AND the base rate can be measured without any decimal places.
3) Chocolate makes you thin!!!! I’m sure it did. In that one, single study. That has never been replicated. Remember how we compare all our findings against a 5% chance rate? Well, that’s what you just discovered. The 5% chance where the finding occurred randomly. Run the research another 19 times and then get back to me when 19 of them say that chocolate makes you thin.
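If you want to put numbers on the three cautions above, here is a rough Python sketch using only the standard library. The function names are mine, and so is the standard deviation of 2 items per group (the memory example doesn't give one), so treat the t statistics as illustrative only. The third part assumes the studies are independent, that chocolate truly does nothing, and that everyone uses the usual 5% threshold.

```python
import math

# Caution 1: statistical significance vs. effect size.
def t_and_effect_size(mean_control, mean_test, sd, n_per_group):
    """Two-sample t statistic (equal SDs assumed) and Cohen's d,
    computed from summary numbers rather than raw data."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    difference = mean_test - mean_control
    return difference / standard_error, difference / sd

# ASSUMPTION: SD of 2 items in every group; the post does not state one.
small_t, small_d = t_and_effect_size(5.0, 8.0, sd=2.0, n_per_group=5)
large_t, large_d = t_and_effect_size(5.0, 5.2, sd=2.0, n_per_group=1000)

# Caution 2: relative vs. absolute change in a rate.
def risk_reduction(control_rate, test_rate):
    """Return (relative reduction, absolute reduction) as proportions."""
    absolute = control_rate - test_rate
    return absolute / control_rate, absolute

# 0.04% and 0.01% written as proportions.
relative_drop, absolute_drop = risk_reduction(0.0004, 0.0001)
cases_avoided_per_10000 = absolute_drop * 10_000

# Caution 3: how often chance alone produces "significant" studies.
def prob_at_least(k, n, alpha=0.05):
    """Chance that k or more of n independent no-effect studies
    come out significant at the given threshold (binomial)."""
    return sum(
        math.comb(n, i) * alpha**i * (1 - alpha) ** (n - i)
        for i in range(k, n + 1)
    )

p_any_fluke = prob_at_least(1, 20)
p_nineteen_flukes = prob_at_least(19, 20)

print(f"5 per group:    t = {small_t:.2f}, Cohen's d = {small_d:.2f}")
print(f"1000 per group: t = {large_t:.2f}, Cohen's d = {large_d:.2f}")
print(f"Relative drop: {relative_drop:.0%}, cases avoided per 10,000: {cases_avoided_per_10000:.1f}")
print(f"At least 1 fluke in 20 studies: {p_any_fluke:.2f}")
print(f"19 or more flukes in 20 studies: {p_nineteen_flukes:.1e}")
```

With that assumed SD, both t statistics land a little above 2, just past the usual 5% cutoffs (2.306 with 8 degrees of freedom, roughly 1.96 for big samples), yet Cohen's d is 1.5 in one case and 0.1 in the other. The 75% drop works out to about 3 cases avoided per 10,000 people. And one lucky study out of 20 is the likely outcome when nothing is going on, while 19 lucky studies is essentially impossible, which is why replication is worth waiting for.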
There are about 423 other cautions to watch out for, but today has been brought to you by the number three.
… Live blogging from the 2013 ESOMAR Congress in Istanbul, Turkey. Any errors are my own, any comments or terrible jokes in [] are my own…
Opening keynote speaker: Error and Innovation
- Are Olympic athletes just born with amazing genes and they’re lucky to win? But how can people and teams win over and over again? There must be more to it. What is that thing?
- The British cycling team says it’s all because of Matt Parker – “Hot Pants”. He made sure to warm up the cyclists’ legs before racing with electrically heated shorts [ha ha ha, no not THOSE hot pants]
- Parker also realized that alcohol on the tires of the bikes removes a tiny layer of dirt meaning the bikes wouldn’t slip in the initial start of the race. He also told the athletes to wash their hands to avoid germs and illnesses from meeting so many people from around the world.
- His job title was “Head of marginal improvements”
- Massive results can come from tiny innovations built on top of each other; tens of thousandths of a second add up over time and are eventually enough to be the difference between a gold and silver medal
- Hospitals and schools need this process of breaking things down into their component parts
- You might also call it A B testing, test the colors, the fonts, the design and see what makes people buy
- But you can’t jump over a canyon in tiny steps
- During WWII, why would you need a fighter plane that didn’t have a co-pilot and couldn’t be used against bombers that arrived unexpectedly? “It will be a most interesting experiment.” This plane was the Spitfire, one of the most loved aircraft in the history of aviation. People came to Britain just to fly this plane. It was really fast and maneuverable. This plane was so good, it turned back the German air force. It cost 10 000 pounds, the price of one house in London at the time. This long shot really paid off.
- We like to try new things, little things. We can’t always take small steps though. Chance of failure is very high. You can have multiple failures if the success is big enough.
- We find it hard to support people who bet on long shots.
- Too many smart people try to impress too many other smart people too quickly – A complaint about Harvard University. You need to not be trying to impress someone every minute of every hour.
- If you think big, you will fail and you will fail a lot
- It’s hard to tell the difference between a total loser and a genius
- Why must the risk taker bear all the responsibilities for failure? We must all support people who are willing to take risks, not only those that turn out to be successes
- The internet was people betting on a small chance of success with many opportunities for failure
- We owe it to these dreamers to share the risks
Data tables – ten thousand pages filled with eight billion numbers and four trillion significance tests. Some might think that’s a slight exaggeration, but to me it feels bang on.
Data tables have some great features, but they make it really easy to forget the basics and stretch beyond the true validity of the data. Here are a few things to try to remember.
1) Data tables show chi-squares and t-tests on every single combination not because those comparisons are important but rather because the software is capable of plugging numbers into equations. Human beings are the only ones who can say which comparisons make sense. (I assume you are human.)
2) Data tables will show significance testing even when the sample sizes are too small. The software will still calculate the test, and it might even provide a warning that the sample size is very small, but once again a human must intervene to verify that doing the test actually makes sense with that sample size. I don’t care if the test comes out statistically significant in spite of a small sample size. You must use your brain and decide for yourself if the sample size is still just too small.
3) We usually use a p value of 5% to decide whether a test result is significant. Using this threshold means that even when there is no real difference, there is a 5% chance the test will come out significant anyway. On * every * single * solitary * test. What this means is that across your datatables of thousands of tests, there’s a frickin huge chance that lots of the significant findings are pointing you in the wrong direction. Wondering which ones are misleading you? Read #4.
4) Running a billion t-tests on a set of datatables isn’t something to brag about. What it really means is that you haven’t thought about why you’re doing the research and what you want to focus on. You’re basically doing tests in a stepwise regression style and waiting for anything to drop in your lap. It’s called exploratory research for a reason. If you do the exact same study again, you’ll probably get a completely different set of significant results. So whatever significant number you so eloquently explained to your client is going to disappear next time and you’re going to have to eloquently explain it away. Again. If you like looking dumb, this is the tactic for you.
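Points 3 and 4 are easy to see in a small simulation. The sketch below is a toy, not anything your tabulation software does: the function name, the 1000 comparisons, and the 50 people per group are all made up for illustration, and it uses a simple z-test with a known standard deviation of 1 in place of real t-tests so that it runs on the Python standard library alone. Every comparison is between two groups drawn from the same population, so every "significant" result is a fluke.

```python
import math
import random

def run_study(seed, num_comparisons=1000, n_per_group=50, z_critical=1.96):
    """Simulate one wave of data tables in which NO comparison has a real
    difference, and return the indices flagged as significant anyway."""
    rng = random.Random(seed)
    # Known SD of 1 in both groups, so this is a z-test, not a t-test.
    standard_error = math.sqrt(2.0 / n_per_group)
    significant = set()
    for index in range(num_comparisons):
        mean_a = sum(rng.gauss(0.0, 1.0) for _ in range(n_per_group)) / n_per_group
        mean_b = sum(rng.gauss(0.0, 1.0) for _ in range(n_per_group)) / n_per_group
        z = (mean_b - mean_a) / standard_error
        if abs(z) > z_critical:  # two-sided test at roughly the 5% level
            significant.add(index)
    return significant

# Expected number of flukes if all 1000 comparisons are pure noise.
expected_flukes = 1000 * 0.05

# Same "study" run twice; only the random sample differs.
first_wave = run_study(seed=1)
second_wave = run_study(seed=2)
repeated = first_wave & second_wave

print(f"Expected flukes per wave: {expected_flukes:.0f}")
print(f"Significant in wave 1: {len(first_wave)}")
print(f"Significant in wave 2: {len(second_wave)}")
print(f"Significant in both waves: {len(repeated)}")
```

Each wave should flag somewhere around 50 of the 1000 comparisons, and only a handful of those should show up in both waves. That handful is what you would be left explaining away to the client next time.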
The moral of the story is don’t let your stats software think for you. Take the time to decide what’s important. Talking to clients will be a whole lot easier and you’ll look a whole lot smarter.