Tag Archives: census rep

Do we need to control for non-quota variables? by Deb Santus and Frank Kelly #CASRO #MRX

Live blogged from Nashville. Any errors or bad jokes are my own.

Third author is Peter Kwok

we moved many offline sampling techniques to online sampling. now we have river sampling, dynamic sourcing, and routers.

– should we use outgo quotas, return quotas, or both?
– balancing quotas are set from sampling frames. usually region, age, gender, household size, often based on US census.
– survey quotas are determined by respondent profiles or subject category.
– some populations are really hard to find. not everyone is simply looking for genpop
– sample frames may not reflect the target populations
– females can respond at rates 20 or more points higher than males
– with river or dynamic sampling, you don’t even know the demos that you’re getting

– router selection is efficient use of respondents but there’s not as much quota control compared to traditional sampling that uses outgo and return balancing
– traditional sampling focused on a specific person for a specific study

carried out a study using various sampling techniques. used interlocking age and gender, plus region.
– 10 minutes, grocery shopping habits, census quotas
cell 1 – 4 balancing variables including income, quotas for outgo
cell 2 – only used age, gender, region quotas on outgo

then weighted to census
– cell 1 had better weights: higher weight efficiency and less extreme minimum and maximum weights
– every type of sample has skews [yer darn right! why do people forget this?]
– controlling for age, gender, region just wasn’t enough
– income and household size did not represent well when they weren’t initially balanced for, marital status also didn’t work well
– some of the profiling questions showed differences as well – belonging to a warehouse club showed differences, and so did using a smartphone to help with shopping
– quotas do not guarantee a representative sample. additional controls are necessary on outgo. with more controls, weighting can even be unnecessary
– repetition is good. repetition is good. repetition is good (i.e., test-retest reliability is good!)
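The weight-efficiency comparison in these notes can be sketched with Kish's formula for effective sample size (n_eff / n = (Σw)² / (n·Σw²)). A minimal Python example with made-up weights — the study's actual weights aren't reported here:

```python
# Kish's weight efficiency: n_eff / n = (sum w)^2 / (n * sum w^2).
# Weights near 1.0 mean the sample already matched its targets;
# extreme weights mean heavy post hoc correction and lost precision.

def weight_efficiency(weights):
    """Return Kish's weight efficiency (effective n / actual n), between 0 and 1."""
    n = len(weights)
    s = sum(weights)
    s2 = sum(w * w for w in weights)
    return (s * s) / (n * s2)

# Hypothetical weights: cell 1 (balanced on more variables at outgo) needs
# only mild adjustment; cell 2 (age/gender/region only) needs heavy correction.
cell1 = [0.9, 1.0, 1.1, 0.95, 1.05, 1.0]
cell2 = [0.4, 1.8, 0.6, 1.6, 0.5, 1.1]

print(round(weight_efficiency(cell1), 3))  # → 0.996
print(round(weight_efficiency(cell2), 3))  # → 0.771
```

The more variables you balance on the outgo, the closer the weights stay to 1.0 and the less effective sample size you lose at the weighting stage.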

we need to retain our sample expertise. be smart. learn about sampling and do it well. keep the good things about the traditional ways.

[please please control on the outgo and returns if you can. weighting as a strategy is not the way to think about this. get the sample you need and fuss with it as little as possible through weighting]

A Model-Based Approach for Achieving a Representative Sample by George Terhanian #CASRO #MRX

Live blogging from the CASRO Digital conference in San Antonio, Texas. Any errors or bad jokes are my own.

“A Model-Based Approach for Achieving a Representative Sample”
Although the enterprise of online research (with non-probability samples) has witnessed remarkable growth worldwide since its inception about 15 years ago, the US public opinion research community has not yet embraced it, partly because of concerns over data reliability and validity. The aim of this project is to rely on data from a recent, large-scale ARF study to develop an optimal model for achieving a representative sample. By that, we mean one that reduces or eliminates the bias associated with non-probability sampling. In addition to the presenter, this paper was authored by John Bremer (Toluna) and Carol Haney (Toluna).

  • George Terhanian, Group Chief Strategy & Products Officer, Toluna
  • The key is representativeness. This topic is not new; we talked about it 15 years ago. Criticisms are not new either – Warren Mitofsky said the willingness to discard sampling frames and feeble attempts at manipulating away the resulting bias undermine the credibility of the research process
  • SLOP – self-selected opinion panel. [Funny!]
  • Growth of online research remains strong as it has since the beginning.
  • 2011 – AAPOR needs to promote flexibility not dogmatism, established a task force on non-probability methods. Identified sample matching as most promising non-probability approach. Did not offer next steps or an agenda.
  • Study with 17 different companies in the FOQ study
  • Researchers should use the ARF’s FOQ2 data to test non-probability sampling and representativeness
  • Used a multi-directional search algorithm (MSA)
  • Bias is difference between what respondents report and what we know to be true. e.g., Do you smoke? Benchmark vs panel scores.
  • Reduce bias through 1) respondent selection or sampling and 2) post hoc adjustment or weighting  [I always prefer sampling]
  • FOQ2 suggests weighting needs to include additional variables such as demographics, secondary demographics (household characteristics), behaviours, attitudes
  • [If you read my previous post on the four types of conference presenters, this one is definitely a content guru 🙂 ]
  • Using only optimal demographics, panel and river sample were reasonably good, reduced bias by 20 to 25%. Time spent online helps to reduce bias and is a proxy for availability in terms of how often they take surveys
  • Ten key variables are age, gender, region, time spent online, race, education [sorry, missed the rest]
  • Other variables like feeling hopeful and concern about privacy of online information were top variables [sorry, missed again, you really need to get the slides!]
  • Need to sample on all of these but don’t need to weight on all of them
  • [I’m wondering if they used a hold-back sample and whether these results are replicable, the fun of step-wise work is that random chance makes weird things happen]
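The bias definition above (what respondents report vs. what we know to be true) and the two remedies — respondent selection and post hoc weighting — can be sketched roughly. All numbers below are invented for illustration; they are not the ARF FOQ2 figures:

```python
# Bias = panel estimate minus a trusted benchmark (e.g., a known smoking rate).
benchmark_smoking = 0.18  # assumed "known truth" from a gold-standard source

# Hypothetical panel: two age groups with different smoking rates; the
# younger group is over-represented relative to its population share.
panel = [
    {"group": "18-34", "n": 700, "smoking_rate": 0.26},
    {"group": "35+",   "n": 300, "smoking_rate": 0.12},
]
population_share = {"18-34": 0.30, "35+": 0.70}  # assumed census shares

total_n = sum(c["n"] for c in panel)

# Unweighted estimate: every respondent counts equally.
unweighted = sum(c["n"] * c["smoking_rate"] for c in panel) / total_n

# Post-stratified estimate: reweight each group to its population share.
weighted = sum(population_share[c["group"]] * c["smoking_rate"] for c in panel)

print(f"unweighted bias: {unweighted - benchmark_smoking:+.3f}")  # +0.038
print(f"weighted bias:   {weighted - benchmark_smoking:+.3f}")    # -0.018
```

Weighting on the right variable shrinks the bias here, but as the FOQ2 point above suggests, demographics alone may not be enough — the variables you balance or weight on have to actually drive the skew.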

Other Posts

I hate social media research because: It’s not a rep sample #2 #MRX


I recently wrote a blog post citing ten of the biggest complaints about social media research. Today I address complaint #2.

It’s not a representative sample.

Part 1. Are we really going to go there? I guess we ought to. In 99.9% of market research, we aren’t using a representative sample in the strict sense of the word. Survey panels aren’t probability samples. Focus groups aren’t probability samples. Market research generally uses convenience samples and social media research is no different.

But here is the difference. We’ve all heard the statistic that a tiny percentage of people answer the majority of all market research surveys. In other words, most people aren’t participating in the survey experience and we never hear their opinion. Similarly, when we conduct social media research, we only listen to people who wish to share their opinions on Facebook, Twitter, YouTube, or any of the other millions of websites where they can write out their opinions. No matter what research method you choose, you only hear the people who wish to contribute their opinion in that mode.

Part 2. Who is talking about the brand anyways? Alright, so we know SMR doesn’t use rep samples. Big deal. One of the reasons we use rep samples in traditional research is to ensure we are talking to the right people. We do a rep sample because a product is used by a rep sample. We do a male only sample because a product is used by males only. In both cases, we choose a particular sample because it is most likely to reflect product triers and users. Guess what. The only people talking about your brand in social media are the people who care about your brand. Whether they hate your brand or love your brand, you have instantly reached the people who are relevant to your brand. They have raised their hand to tell you, “Listen to me. I have an opinion about your brand.”

If you require a rep sample, you ought to use a survey because that is the closest approximation. Always use the right method for the job.

True or False: True, but does it matter?

The Best Panel is a Census Rep Panel, NOT! #MRX

When you’re commissioning a new survey project, it can be hard to select the best survey panel for the job. There are many criteria to judge including response rates, data quality processes, panel sizes, and panel make-up.

Response rates are clear. If panelists don’t answer surveys, you get no completes. Data quality is clear. If panelists are speeders or random responders, you get garbage data. If the panel isn’t large enough, you don’t get enough completes. But what about panel make-up?


What does the ideal panel look like? One of the most common misconceptions is that a survey panel should be census rep. Therefore, a Canadian panel should have the same demographics as the Canadian census and a US panel should have the same demographics as the US census. Unfortunately, this is NOT the ideal make-up of a survey panel.

Let’s think about the kinds of samples that we want to survey. Certainly, many survey projects are interested in census rep samples. Political surveys and social surveys for sure need to understand how a census representative population feels. But think a bit more. How often do you need samples of 1) males 18 to 34, 2) mothers of teenagers, 3) mothers of babies, and 4) adults aged 65 and older? Those types of samples couldn’t be further from census rep, and yet they are more representative of the types of samples that researchers need, the types of people they need to have as part of a survey panel.

So here is what the ideal panel looks like. It looks like what survey researchers need. And if researchers send 25% of their surveys to people aged 18 to 24, then 25% of their panel should be aged 18 to 24. (It’s actually more complicated than this, as we must take into account that young people have lower response rates and therefore the panel should probably be 35% aged 18 to 24.) Similarly, since older people are less often the target of surveys, they should be underrepresented on a panel compared to census. (And even MORE underrepresented because their response rates are much higher than average.)
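The back-of-envelope adjustment described here — panel share proportional to survey demand, scaled by response rate — can be sketched in a few lines. The demand shares and response rates below are hypothetical:

```python
# Panel composition proportional to survey demand, adjusted for response rate:
# panel share ∝ demand share / response rate. All numbers are illustrative.

demand_share = {"18-24": 0.25, "25-64": 0.60, "65+": 0.15}   # share of survey targets
response_rate = {"18-24": 0.10, "25-64": 0.15, "65+": 0.30}  # assumed response rates

raw = {g: demand_share[g] / response_rate[g] for g in demand_share}
total = sum(raw.values())
panel_share = {g: raw[g] / total for g in raw}

for group, share in panel_share.items():
    print(f"{group}: {share:.0%} of panel")
```

With these assumed numbers, 18-to-24s end up around 36% of the panel against 25% of survey demand (close to the "probably 35%" figure above), while the 65+ group drops to roughly 7% against 15% of demand — under census share precisely because their response rates are higher.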

The reason for this comes down to the annoyance factor. If survey panels were census rep, we would have a lot of very annoyed, very frustrated younger people. They would be receiving far more surveys than other people, and the demand on their time would be much greater. On the other hand, older people, who aren’t the focus of as many research objectives, would receive far fewer surveys than other people, and they would more easily become disengaged and disappointed at the lack of involvement. Neither of these situations is ideal.

So the next time you’re considering a research panel, don’t ask the providers if it’s census rep. Ask instead about the average number of invitations each panelist receives. Find out if some people receive 5 survey invites every week while other people receive only 5 invites every year. Find out if they treat their panelists nicely.

Surprise, surprise! A non-rep sample is as good as a ‘rep’ sample


A new study called Re-Examining the Validity of Different Survey Modes for Measuring Public Opinion in the U.S. by Brian Schaffner and Stephen Ansolabehere made a stunning discovery: non-representative samples accurately predict the marketplace. Why is this surprising?

Market research has never had the good fortune to use rep samples. Not everyone signs up for, let alone knows about, online survey panels. Mail surveys only go to people who have homes and often skip large apartment buildings. Telephone surveys, even random digit dial versions, don’t give every person an equal and independent chance of participating.

Market research has always been about non-rep samples. That’s the nature of our business. We thrive on turning non-rep data into rep conclusions. So stop being surprised and start being proud. It’s what we do.

I’m sorry but representative samples are 100% unattainable

Statistics are just numbers. 1 + 2 is always 3 even if the 2 was written in a disgusting colour. People, on the other hand, have crappy days all the time. It could be because a lunch was packed without cookies or because a horrible tragedy has struck.

So why does it matter? Because crappy days mean someone:

  • doesn’t answer a phone survey
  • lies on their taxes
  • makes a mistake on the census survey
  • accidentally skips page 2 on a paper survey
  • drips sarcasm all over their facebook page

You recognize these. We call them data quality issues.

Statistics lull us into a false sense of accuracy. Statistics are based on premises which do not hold true for beings with independent thought. Statistics lead us to believe that representative samples are possible when theory dictates it is impossible. Though a million times better than the humanities can ever dream of achieving, even “real” science can’t achieve representative samples. The universe is just far too big to allow that.

In other words, even when you’ve done everything statistically possible to ensure a rep sample, humans and their independent thought have had a crappy day somewhere in your research design.

There is no such thing as a rep sample. There are only good approximations of what we think a rep sample would look like.

And because I AM CANADIAN, I apologize if I have crushed any notions.
