Tag Archives: p value

Daddy, where do statistics come from? #MRX

Well, my little one, if you insist. Just one more bedtime story.

A long, long time ago, a bunch of people who really loved weather and biology and space and other areas of natural science noticed a lot of patterns on earth and in space. They created neato datasets about the weather, about the rising and setting of the sun, and about how long people lived. They added new points to their datasets every day because the planets always revolved and the cells always went in the petri dish and the clouds could always be observed. All this happened even when the scientists were tired or hungry or angry. The planets moved and the cells divided and the clouds rained because they didn’t care that they were being watched or measured. And, the rulers and scales worked the same whether they were made of wood or plastic or titanium.

Over time, the scientists came up with really neat equations to figure out things like how often certain natural and biological events happened and how often their predictions based on those data were right and wrong. They predicted when the sun would rise depending on the time of year, when the cells would divide depending on the moisture and oxygen, and when the clouds would rain depending on where the lakes and mountains were. This, my little curious one, is where p-values and probability sampling and t-tests and type 1 errors came from.

The scientists realized that using these statistical equations allowed them to gather small datasets and generalize their learnings to much larger datasets. They learned how small a sample could be or how large a sample had to be in order to feel more confident that the universe wasn’t just playing tricks on them. Scientists grew to love those equations and the equations became second nature to them.


It was an age of joy and excitement and perfect scientific test-control conditions. The natural sciences provided the perfect laboratory for the field of statistics. Scientists could replicate any test, any number of times, and adjust or observe any variable in any manner they wished. You see, cells from an animal or plant on one side of the country looked pretty much like cells from the same animal or plant on the other side of the country. It was an age of probability sampling from perfectly controlled, baseline, factory bottled water.

In fact, statistics became so well loved and popular that scientists in all sorts of fields tried using them. Psychologists and sociologists and anthropologists and market researchers started using statistics to evaluate the thoughts and feelings of biological creatures, mostly human beings. Of course, thoughts and feelings don’t naturally lend themselves to being expressed as precise numbers and measurements. And, thoughts and feelings are often not understood and are often misunderstood by the holder. And, thoughts and feelings aren’t biologically determined, reliable units. And worst of all, the measurements changed depending on whether the rulers and scales were made of English or Spanish, paper or metal, or human or computer.


Sadly, these new users of statistics grew to love the statistical equations so much that they decided to ignore that the statistics were developed using bottled water. They applied statistics that had been developed using reliable natural occurrences to unreliable intangible occurrences. But they didn’t change any of the basic statistical assumptions. They didn’t redo all the fundamental research to incorporate the unknown, vastly greater degree of randomness and non-randomness that came with measuring unstable, influenceable creatures. They applied their beloved statistics to pond water, lake water, and ocean water. But treated the results as though they came from bottled water.

So you see, my dear, we all know where statistics in the biological sciences come from. The origin of probability sampling and p-values and margins of error is a wonderful story that biologists and chemists and surgeons can tell their little children.

One day, too, perhaps psychologists and market researchers will have a similar story about the origin of psychological statistics methods to tell their little ones.

The end.

 


The 2014 Gift Guide for Geeks, Dorks, and Research Gurus

Well, it’s that time of year again!

Regardless of which holiday you celebrate and even if you celebrate the holiday of “I deserve a treat today”, you’re sure to find a statistics gift for yourself or your loved ones below. Just click on the image to go to the website and order. Go! Quickly before they run out! Shirts, cups, hats, toddler toys, and more, they’re all here.

If you come across other fun statistics and research gifts, leave a link in the comments for other folks. Enjoy!
normal distribution plush
wood puzzle number
statistics blocks
statistically significant cap hat
statistics mousepad
statistically significant bag tote
science shirt
statistics decal car
normal distribution dinosaur shirt
phd bs apron
phd iphone case
statistics burp cloths
cup cozy
normal significant cross stitch
plush pie chart
pie chart poster
pi cookie cutter
pi tie clip

Which of these statistical mistakes are you guilty of? #MRX

On the Minitab Blog, Carly Barry listed a number of common and basic statistics errors. Most readers would probably think, “I would never make those errors, I’m smarter than that.” But I suspect that if you took a minute and really thought about it, you’d have to confess you are guilty of at least one. You see, every day we are rushed to finish this report faster, that statistical analysis faster, or those tabulations faster, and in our attempts to get things done, errors slip in.

Number 4 in Carly’s list really spoke to me. One of my pet peeves in marketing research is the overwhelming reliance on data tables. These reports are often hundreds of pages long and include crosstabs of every single variable in the survey crossed with every single demographic variable in the survey. Then, a t-test or chi-square is run for every cross, and every statistically significant difference is carefully noted. Across thousands and thousands of tests, yes, a few hundred are statistically significant. That’s a lot of interesting differences to analyze. (Let’s just ignore the ridiculous error rates of this method.)
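To put a rough number on those error rates, here is a minimal Python sketch. It assumes the usual 0.05 cutoff, that the tests are independent, and that no real differences exist at all (none of which is true of real data tables, so treat it as a back-of-the-envelope illustration). The function names are made up for this example.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Average number of 'significant' results when no real differences exist."""
    return n_tests * alpha


def chance_of_any_false_positive(n_tests, alpha=0.05):
    """Probability that at least one of n independent tests is a false alarm."""
    return 1 - (1 - alpha) ** n_tests


for n in (1, 20, 100, 5000):
    print(n, expected_false_positives(n), round(chance_of_any_false_positive(n), 4))
```

Under those assumptions, a few hundred significant results out of several thousand tests is about what pure chance would hand you.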

But tell me this, when was the last time you saw a report that incorporated effect sizes? When was the last time you saw a report that flagged the statistically significant differences ONLY if that difference was meaningful and large? No worries. I can tell you that answer. Never.

You see, pretty much anything can be statistically significant. By definition, at the usual 0.05 cutoff, about 5% of tests come out significant through chance alone, even when there is no real difference. Tests run with large samples are significant. Tests of tiny percents are significant. Are any of these meaningful? Oh, who has time to apply their brains and really think about whether a difference would result in a new marketing strategy? The p-value is all too often substituted for our brains.
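Here is a small Python sketch of the large-sample problem. It runs a standard two-proportion z-test on the same one-point gap (50% vs 51%) at three sample sizes. The percentages, the sample sizes, and the function name are all made up for illustration, and equal group sizes are assumed.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(p1, p2, n_per_group):
    """Two-tailed p value for a difference between two percentages (z-test)."""
    pooled = (p1 + p2) / 2  # equal group sizes, so the pooled rate is the average
    std_error = sqrt(pooled * (1 - pooled) * 2 / n_per_group)
    z = abs(p1 - p2) / std_error
    return 2 * (1 - NormalDist().cdf(z))


# Same one-point gap, three different sample sizes per group
for n in (500, 5_000, 50_000):
    print(n, round(two_proportion_p_value(0.50, 0.51, n), 4))
```

The gap never changes, only the sample size does, yet the p value slides from nowhere near 0.05 to well below it.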

It’s time to redo those tables. Urgently.

Read an excerpt from Carly’s post here and then continue on to the full post with the link below.

Statistical Mistake 4: Not Distinguishing Between Statistical Significance and Practical Significance

It’s important to remember that using statistics, we can find a statistically significant difference that has no discernible effect in the “real world.” In other words, just because a difference exists doesn’t make the difference important. And you can waste a lot of time and money trying to “correct” a statistically significant difference that doesn’t matter.

via Common Statistical Mistakes You Should Avoid.

I poo poo on your significance tests #AAPOR #MRX

What affects survey responses?
– color of the page
– images on the page
– wording choices
– question length
– survey length
– scale choices

All of these options, plus about infinity more, mean that confidence intervals and point estimates from any research are pointless.

And yet, we spew out significance testing at every possible opportunity. You know, when sample sizes are in the thousands, even the tiniest of differences are statistically significant. Even meaningless differences. Even differences caused by the page colour not the question content.
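For a sense of just how tiny, this Python sketch estimates the smallest gap between two percentages that would squeak under p < 0.05 for a given sample size. It is an approximation that assumes two equal-sized groups, both sitting near a base rate of 50%, and a two-tailed z-test; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def smallest_significant_gap(n_per_group, base_rate=0.5, alpha=0.05):
    """Approximate smallest gap between two percentages that reaches p < alpha."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return z_crit * sqrt(2 * base_rate * (1 - base_rate) / n_per_group)


for n in (100, 1_000, 10_000):
    print(n, f"{smallest_significant_gap(n):.1%}")
```

With 100 people per group you need a gap of roughly 14 points; with 10,000 per group, under a point and a half will do.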

So enough with the stat testing. If you don’t talk about effect sizes and meaningfulness, then I’m not interested.


Stop wasting time on significance tests #MRX

Have you ever conducted a research project and NOT done any significance tests?

Have you ever run a series of significance tests and wondered why you bothered to do them?

Let’s think about why we do research projects and why we do significance testing. First of all, research isn’t worth doing unless the methodology is designed very carefully with appropriate sample sizes, great questions, and high standards of data quality. It should be designed with very clear research objectives in mind, with potential outcomes carefully thought out, with potential action steps carefully thought out. Quality research studies are conducted with measures of success clearly outlined before the research is carried out.

If all of these things are in place, then I challenge you to consider why you even bother with significance testing. A research study with clearly thought out objectives should be accompanied by specific hypotheses that lead to specific outcomes. Your well planned out study determined that Product A must generate scores that are at least X% better than Product B before Product A is identified as a success. If it does, then it makes sense to proceed with launching Product A.

So if you already know that you are seeking improvements of size X%, there is zero reason to conduct significance tests. Your measure of success has been predetermined. You already know that, based on your high quality research design, the difference is large enough to warrant moving forward with the launch.
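A predetermined measure of success can be as plain as the Python sketch below. It assumes “X% better” means a relative lift over Product B’s score (it could just as well mean percentage points, so pick one before fieldwork). The scores, the 10% threshold, and the function name are all hypothetical.

```python
def meets_success_threshold(score_a, score_b, required_lift_pct):
    """True if Product A beats Product B by at least the pre-agreed relative lift."""
    lift_pct = (score_a - score_b) / score_b * 100
    return lift_pct >= required_lift_pct


# Hypothetical purchase-intent scores and a 10% threshold agreed before fieldwork
print(meets_success_threshold(score_a=46.0, score_b=40.0, required_lift_pct=10))
print(meets_success_threshold(score_a=42.0, score_b=40.0, required_lift_pct=10))
```

The first product clears the bar with a 15% lift and the second, at 5%, does not. No p-value was consulted.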

In other words, if you need to run a significance test to determine if a difference is important, then the difference is for sure not important at all. Significance tests aren’t required.

Radical Market Research Idea #6: Don’t calculate p-values #MRX

p-values are the backbone of market research. Every time we complete a study, we run all of our data through a gazillion statistical tests and search for those that are significant. Hey, if you’re lucky, you’ll be working with an extremely large sample size and everything will be statistically significant. More power to you!

But what if you didn’t calculate p-values? What if you simply looked at the numbers and decided if the difference was meaningful? What if you calculated means and standard deviations, and focused more on effect sizes and less on p<0.05? Instead of relying on some statistical test to tell you that you chose a sample size large enough to make the difference significant, what if you used your brain to decide if the difference between the numbers was meaningful enough to warrant taking a decision?
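If you want to try it, one common effect size for comparing two means is Cohen’s d: the difference between the means divided by the pooled standard deviation. Here is a minimal Python sketch. The satisfaction scores are invented, and the function name is just for this example.

```python
from math import sqrt


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means expressed in pooled standard deviations."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / sqrt(pooled_var)


# Hypothetical 10-point satisfaction scores for two products
d = cohens_d(mean_a=7.2, sd_a=1.5, n_a=400, mean_b=6.9, sd_b=1.5, n_b=400)
print(round(d, 2))
```

Cohen’s rough benchmarks call 0.2 small, 0.5 medium, and 0.8 large, but they are only a starting point. Whether a given d is worth acting on is still a job for your brain.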

Effect sizes are such an underused, unappreciated measure in market research. Try them. You’ll like them. Radical?

Ask a Simple Question, Get an Encyclopedia

Broadwater Focus Group

Image by Nebraska Library Commission via Flickr

If you want to be an excellent market researcher, you need to know a lot about many different topics. You need to know what makes a good survey question or focus group discussion guide and how to avoid writing a horrid one. You need to know about research methods, sampling, weighting, and sample size determination. Knowledge of statistics is essential and it must go beyond t-tests, chi-squares, and p-values. There is a ton of very detailed, complicated information you must know to do your job well.

But here is the problem. When people ask for research advice, they don’t always want an essay on the pros and cons of various options and techniques. They know they’re asking a complicated question with a complicated answer but sometimes they just want a quick and simple answer. They want to know that they’re pointing in the right direction, that they’re generally thinking the right thing.

So what do we do? We don’t try to understand whether it’s a request for a simple answer or an in-depth consultation. No matter what they’re looking for, we give people a three hour lecture about the intricacies of research and make everything far more complicated than it needs to be. Our strange technical language serves to scare off some people and bore others to tears.

Isn’t it time we considered what people really want? Perhaps just a simple answer to a simple question?

Really Simple Statistics: 1-Tail and 2-Tail tests #MRX

Welcome to Really Simple Statistics (RSS). There are lots of places online where you can ponder over the minute details of complicated equations but very few places that make statistics understandable to everyone. I won’t explain exceptions to the rule or special cases here. Let’s just get comfortable with the fundamentals.

What are tails?

No, not these tails.


monosodium demondimum nasirkhan moneysaver67 from morguefile

The tails in statistics refer to the predictions we make about our research results and how we want to hedge our bets.

One tailed tests

One tailed tests are what you use when you have a specific guess. Men are taller than women. Women like chocolate more than men like chocolate. Roses smell nicer than tulips. It’s like putting all your eggs in one basket. Take a guess and hold yourself to it.


xandert from morguefile

Ideally, this is what you should be aiming for. You should have a prediction about what is going to happen before you conduct your research. You should do your homework and not just willy nilly see ‘what comes up significant.’

One tailed tests are advantageous because they give you a better chance of generating differences that turn out to be statistically significant, as long as, of course, your prediction turned out to be right.

Two tailed tests

Two tailed tests are used when you can’t make a guess. Will men or women eat more bread? Is basketball or soccer more fun? Do people spend more on coffee or on hot chocolate? In this case, you’re splitting your eggs between two baskets – maybe men but maybe women, maybe basketball but maybe soccer.


jdurham from morguefile
In reality, most of what we do in market research is based on two tailed tests. We don’t spend the time to develop specific hypotheses ahead of time. We wait to get the datatables and then search through hundreds of pages looking for whatever happens to be significant.
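To see the two kinds of tails side by side, here is a minimal Python sketch that turns the same test result into a one tailed and a two tailed p value. It assumes a z-test, where a positive z means the result went in the direction you guessed. The z value of 1.8 and the function names are made up.

```python
from statistics import NormalDist


def one_tailed_p(z):
    """p value when you predicted the direction (a positive z) ahead of time."""
    return 1 - NormalDist().cdf(z)


def two_tailed_p(z):
    """p value when a difference in either direction would count."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


z = 1.8  # hypothetical test statistic
print(round(one_tailed_p(z), 3), round(two_tailed_p(z), 3))

# Guessing the wrong direction costs you everything in a one tailed test
print(round(one_tailed_p(-z), 3))
```

The same result slips under 0.05 with one tail and misses it with two, which is exactly the advantage (and the risk) of holding yourself to a guess.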

And that’s it! Really Simple Statistics!

Really Simple Statistics: Chi-Square #MRX

Welcome to Really Simple Statistics (RSS). There are lots of places online where you can ponder over the minute details of complicated equations but very few places that make statistics understandable to everyone. I won’t explain exceptions to the rule or special cases here. Let’s just get comfortable with the fundamentals.

If you haven’t had your morning cup of tea or coffee, may I be the first to disappoint you by saying this post has nothing to do with chai tea! Sorry. 😦

And my apologies again, it has nothing to do with a traditional Chinese unit of length, or a dragon in Chinese mythology or a life-force.

What is a chi-square?

Chi-squares are all about percentages. They are a statistical test that is used to determine if the percentage for one group is significantly different than the percentage for another group. Is the percentage of men who play soccer different from the percentage of women who play soccer? Is the percentage of people who made a purchase on Saturday the same as the percentage of people who made a purchase on Sunday? Is the percentage of high-income people who buy Brand A the same as the percentage of low-income people who buy Brand A?

Like any statistic, chi-squares can be very simple.

  • Compare the percentage of men who buy Brand A vs the percentage of women who buy Brand A
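For the curious, the simple case can be worked out by hand in a few lines of Python. This is a sketch of the plain Pearson chi-square for a two-by-two comparison, with no continuity correction, and the counts (60 of 200 men and 45 of 200 women buying Brand A) are invented for illustration.

```python
from math import erfc, sqrt


def chi_square_2x2(yes_1, n_1, yes_2, n_2):
    """Chi-square statistic and p value for comparing two percentages."""
    observed = [[yes_1, n_1 - yes_1], [yes_2, n_2 - yes_2]]
    row_totals = [n_1, n_2]
    col_totals = [yes_1 + yes_2, (n_1 - yes_1) + (n_2 - yes_2)]
    total = n_1 + n_2

    chi2 = 0.0
    for r in range(2):
        for c in range(2):
            # Count we would expect in this cell if both groups had the same percentage
            expected = row_totals[r] * col_totals[c] / total
            chi2 += (observed[r][c] - expected) ** 2 / expected

    p_value = erfc(sqrt(chi2 / 2))  # exact tail area for 1 degree of freedom
    return chi2, p_value


# Hypothetical counts: 60 of 200 men and 45 of 200 women buy Brand A
chi2, p = chi_square_2x2(60, 200, 45, 200)
print(round(chi2, 2), round(p, 3))
```

Here 30% of men and 22.5% of women buy the brand, and the test compares those counts with what you would expect if both groups bought at the same rate.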

Chi-squares can also be more complicated.

  • Compare the percentage of men who buy Brand A or Brand B or Brand C vs the percentage of women who buy Brand A or Brand B or Brand C

Most basic market research relies heavily on chi-square tests. All of those grid questions in a survey are usually analyzed with a chi-square – the percentage of people who chose “Strongly Agree” or the percentage of people who chose “Disagree.”

Usually, when a study is launched, one of the project deliverables is a set of data-tables, you know those 300 pages of tables? These tables are chock full of chi-square tests but you wouldn’t know it unless you read the tiny little print at the bottom of the tables.

The important thing to remember is that chi-squares are all about percentages.

Really simple statistics!


Really Simple Statistics: p values #MRX

Welcome to Really Simple Statistics (RSS). There are lots of places online where you can ponder over the minute details of complicated equations but very few places that make statistics understandable to everyone. I won’t explain exceptions to the rule or special cases here. Let’s just get comfortable with the fundamentals.

What is a p value?

P value is a short form for probability value and another way of saying significance value. It refers to the chance that you are willing to take in being wrong. (I know, once in your life is too many times to be wrong.)

No matter how careful you are, random chance plays a part in everything. If you try to guess whether you’ll get heads or tails when you flip a coin, your chance of guessing correctly is only 50%. Half the time, you’ll flip tails even if you wanted to flip heads.

In research, we don’t like 50/50 odds. We instead only want to risk that 5% or 1% of our predictions are wrong. And, if you just picked 1% or 5%, you’ve just picked a peck of pickled peppers. Whoops, I mean you’ve just picked a p value.

P values are almost always expressed out of 1. For example, a p value of 0.05 means you are willing to let 5% of your predictions be wrong. A p value of 0.1 means you are willing to let 10% of them be wrong. Don’t let that pesky decimal place fool you. A p value of 0.01 means 1% and a p value of 0.1 means 10%.

When you do a statistical test in software like SPSS or Systat, it will tell you the exact p value associated with your specific set of data. For instance, it might indicate that the p value of your result is 0.035, or “Men are significantly taller than women, p=0.035.” That means that if men were NOT actually taller than women, a difference this big would show up only 3.5% of the time because of random chance.
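You don’t need SPSS to get a feel for where that number comes from. The Python sketch below uses a shuffle (permutation) test, which is not what SPSS or Systat would run by default but shows the idea: mix up the “men” and “women” labels many times and count how often chance alone produces a gap at least as big as the real one. The heights are made up, and the function name, number of shuffles, and seed are arbitrary choices for this example.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_shuffles=2000, seed=1):
    """Share of random relabelings where A beats B by at least the observed gap."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    observed = mean(group_a) - mean(group_b)
    combined = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(combined)
        fake_a = combined[:len(group_a)]
        fake_b = combined[len(group_a):]
        if mean(fake_a) - mean(fake_b) >= observed:
            hits += 1
    return hits / n_shuffles


# Made-up heights in centimetres
men = [178, 182, 175, 180, 185, 177, 183, 179]
women = [165, 170, 162, 168, 172, 166, 169, 164]
print(permutation_p_value(men, women))
```

The fraction it prints is a one tailed p value: how often random chance, with no real difference between the groups, would have handed you a result like yours.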

Really Simple Statistics!
