Welcome to Really Simple Statistics (RSS). There are lots of places online where you can ponder over the minute details of complicated equations but very few places that make statistics understandable to everyone. I won’t explain exceptions to the rule or special cases here. Let’s just get comfortable with the fundamentals.
** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** **
Oh my, what a gorgeous heteroscedasticity you have! But what is heteroscedasticity? You mean, other than a really cool eight-syllable statistics word that you can show off in front of friends?
This long and lovely word comes into play when you’re dealing with pairs of variables – perhaps height and weight, or grades and time spent studying, or voting behaviour and time spent reading the political section of the paper. It has mean and nasty effects on correlation coefficients and regression models so pay attention!
Specifically, it refers to how spread out the numbers for one variable are at different values of another variable. Homoscedasticity refers to a spread that is very even and regular no matter which section of the chart you look at. This is what you see in the first chart. Heteroscedasticity is the opposite: the spread is wider in some sections of the chart than in others.
- We all know that shorter people weigh less and taller people weigh more. But what if most 5 foot tall women weigh between 90 and 100 pounds while most 6 foot tall women weigh between 130 and 170 pounds? The range of 10 pounds at 5 feet is very different from the range of 40 pounds at 6 feet. That’s a lot of heterobebijicty!
- We also know that people who study a lot tend to get higher grades. Now, what if people who studied 1 hour per week got a D while people who studied 2 hours per week got a C, B, or A? Once again, 1 hour resulted in one possible grade while 2 hours resulted in three possible grades. That’s even more heteroihjusdfgicty.
- And what if jogging for 30 minutes burns 200 to 250 calories while jogging for 60 minutes burns 400 to 500 calories? Half an hour resulted in a range of 50 calories while a full hour resulted in a range of 100 calories, which is… also 50 calories per half hour. That’s a lot of… homoscedasticity!
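If you like seeing the arithmetic spelled out, here is a tiny Python sketch of the height and weight example. The weight ranges come straight from the bullet above; the names `weight_ranges` and `spreads` are made up for illustration.

```python
# Weight ranges (pounds) copied from the height example above.
weight_ranges = {"5 ft": (90, 100), "6 ft": (130, 170)}

def spreads(ranges):
    """Width of each (low, high) range, keyed by group."""
    return {group: high - low for group, (low, high) in ranges.items()}

widths = spreads(weight_ranges)
print(widths)  # {'5 ft': 10, '6 ft': 40}

# How many times wider is the widest group than the narrowest?
print(max(widths.values()) / min(widths.values()))  # 4.0
```

A ratio near 1 would mean the spread is about the same at every height. Here the spread at 6 feet is four times the spread at 5 feet, which is the uneven pattern this post is about.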
So the next time you’re wondering why your correlation coefficient or regression equation isn’t as nice as you had hoped, have a look for heteroscedasticity. And make it a habit to look before you statisticize.
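A chart is the easiest way to look, but a rough numeric check is possible too. The Python sketch below is only one informal way to do it, not a formal test. It fits a straight line, then compares how far the points scatter around that line in the upper half of the x values versus the lower half. The data are simulated purely for illustration, and the function names `fit_line` and `spread_ratio` are hypothetical.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def spread_ratio(xs, ys):
    """Std. dev. of residuals in the upper half of x divided by the lower half."""
    a, b = fit_line(xs, ys)
    pairs = sorted(zip(xs, ys))  # order the points by x
    resid = [y - (a + b * x) for x, y in pairs]
    half = len(resid) // 2
    return statistics.stdev(resid[half:]) / statistics.stdev(resid[:half])

# Simulated data (an assumption for illustration, not real measurements).
random.seed(0)
xs = [random.uniform(1, 10) for _ in range(400)]
even = [2 * x + random.gauss(0, 1) for x in xs]          # same noise everywhere
fanned = [2 * x + random.gauss(0, 0.5 * x) for x in xs]  # noise grows with x

print(round(spread_ratio(xs, even), 2))    # close to 1: even spread
print(round(spread_ratio(xs, fanned), 2))  # well above 1: spread fans out
```

A ratio close to 1 suggests the spread is about the same on both sides of the chart. A ratio well above or below 1 is a hint to plot the data and take a closer look.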