Moderator: Jessica L. Holzberg, U.S. Census Bureau
Satisfied or Dissatisfied? Does Order Matter?; Jolene D. Smyth, University of Nebraska-Lincoln; Richard Hull, University of Nebraska-Lincoln
- Best practice is to use a balanced question stem and keep response options in order
- What order should it be in the question stem?
- Doesn’t seem to matter whether the scale is left to right or top to bottom
- Visual Heuristic Theory – help make sense of questions, “left and top mean first” and “up means good”, people expect the positive answer to come first, maybe it’s harder to answer if good is at the bottom
- Why should the question stem matter? We rarely look at this
- “How satisfied or dissatisfied are you?” [I avoid this completely by saying what is your opinion about this and then use those words in the scale, why duplicate words and lengthen questions]
- Tested Sat first and Disat second in the stem, and then Sat top and Disat bottom in the answer list, and vice versa
- What would the nonresponse look like in these four options – zero differences
- Order in question stem had practically no impact, zero if you think about random chance
- Did find that you get more positive answers when positive answer is first
- [I think we overthink this. If the question and answers are short and simple, people have no trouble and random chance takes its course. Also, as long as all your comparisons are within the test, it won’t affect your conclusions]
- [She just presented negative results. No one would ever do that in a market research conference 🙂 ]
Question Context Effects on Subjective Well-being Measures; Sunghee Lee, University of Michigan Colleen McClain, University of Michigan
- External effects – weather, uncomfortable chair, noise in the room
- Internal effects – survey topic, image, instructions, response scale, question order
- People don’t view questions in isolation, it’s a flow of questions
- Tested with life satisfaction and self-rated health, how are the two related, does it matter which one you ask first; how will thinking about my health satisfaction affect my rating of life satisfaction
- People change their behaviors when they are asked to think about mortality issues, how is it different for people whose parents are alive or deceased
- High correlations in direction as expected
- When primed, people whose parents are deceased expected a lesser lifespan
- Primed respondents said they considered their parents’ death and age at death
- Recommend keeping the questions apart to minimize effects [but this is often/rarely possible]
- Sometimes priming could be a good thing, make people think about the topic before answering
Instructions in Self-administered Survey Questions: Do They Improve Data Quality or Just Make the Questionnaire Longer?
Cleo Redline, National Center for Education Statistics; Andrew Zukerberg, National Center for Education Statistics; Chelsea Owens, National Center for Education Statistics; Amy Ho, National Center for Education Statistics
- For instance, if you say “how many shoes do you have not including sneakers”, and what if you have to define loafers
- Instructions are burdensome and confusing, and they lengthen the questionnaire
- Does formatting of instructions matter
- Put instructions in italics, put them in bullet points because there were several somewhat lengthy instructions
- Created instructions that conflicted with natural interpretation of questions, e.g., assessment does not include quizzes or standardized tests
- Tried using paragraph or list, before or after, with or without instructions
- Adding instructions did not change mean responses
- Instructions intended to affect the results did actually do so, i.e., people read and interpreted the instructions
- Instructions before the question are effective as a paragraph
- Instructions after the question are more effective as lists
- On average, instructions did not improve data quality, problems are real but they are small
- Don’t spend a lot of time on it if there aren’t obvious gains
- Consider not using instructions
Investigating Measurement Error through Survey Question Placement; Ashley R. Wilson, RTI International; Jennifer Wine, RTI International; Natasha Janson, RTI International; John Conzelmann, RTI International; Emilia Peytcheva, RTI International
- Generally pool results from self administered and CATI results, but what about sensitive items, social desirability, open end questions, what is “truth”
- Can evaluate error with fictitious issues – e.g., a policy that doesn’t exist [but keep in mind policy names sound the same and could be legitimately misconstrued ]
- Test using reverse coded items, straight lining, check consistency of seemingly contradictory items [of course, there are many cases where what SEEMS to contradict is actually correct, e.g., Yes, I have a dog, No I don’t buy dog food; this is one of the weakest data quality checks]
- Can also check against administrative data
- “AssistNow” loan program did not exist [I can see people saying they agree because they think any loan program is a good thing]
- On the phone, there were more substantive answers, and more people agreed with the fictitious program [but it’s a very problematic question to begin with]
- Checked how much money they borrowed, $1000 average measurement error [that seems pretty small to me, borrow $9000 vs $10000 is a non-issue, even less important at $49000 and $50000]
- Mode effects aren’t that big
Do Faster Respondents Give Better Answers? Analyzing Response Time in Various Question Scales; Daniel Goldstein, NYC Department of Housing Preservation and Development; Kristie Lucking, NYC Department of Housing Preservation and Development; Jack Jerome, NYC Department of Housing Preservation and Development; Madeleine Parker, NYC Department of Housing Preservation and Development; Anne Martin, National Center for Children and Families
- 300 questions, complicated sections, administered by two interviewers, housing, finances, debt, health, safety, demographics; Variety of scales throughout
- 96000 response times measured, right skewed (most answers fast) with a really long tail
- People with less education take longer to answer questions, people who are employed take longer to answer, older people take longer to answer, and non-English speakers take the longest to answer
- People answer more quickly as they go through the survey, become more familiar with how the survey works
- Yes/no questions are the fastest, check all that apply are next fastest as they are viewed as yes/no questions
- Experienced interviewers are faster
- Scales with more answer categories take longer
Live note taking at #AAPOR in Austin Texas. Any errors or bad jokes are my own.
The feedback of respondent commitment and tailored feedback on response quality in an online survey; Kristin Cibelli, U of Michigan
- People can be unwilling or unable to provide high quality data, will informing them of the importance and asking for commitment help to improve data quality [I assume this means the survey intent is honourable and the survey itself is well written, not always the case]
- Used administrative records as the gold standard
- People were told their answers would help with social issues in the community [would similar statements help in CPG, “to help choose a pleasant design for this cereal box”]
- 95% of people agreed to the commitment statement, 2.5% did not agree but still continued; thus, we could assume that the control group might be very similar in commitment had they been asked
- Reported income was more accurate for committed respondents, marginally significant
- Overall item nonresponse was marginally better for committed respondents, not committed people skipped more
- Not committed were more likely to straightline
- Reports of volunteering, social desirability were possibly lower in the committed group, people confessed it was important for the resume
- Committed respondents were more likely to consent to reviewing records
- Commitment led to more responses to income question, and improved the accuracy, more likely to check their records to confirm income
- Should try asking control group to commit at the very end of the survey to see who might have committed
Best Practice Instrument design and communications evaluation: An examination of the NSCH redesign by William Bryan Higgins, ICF International
- National and state estimates of child well-being
- Why redesign the survey? To shift from landline and cell phone numbers to household address based sampling design because kids were answering the survey, to combine two instruments into one, to provide more timely data
- Move to self completion mail or web surveys with telephone follow-up as necessary
- Evaluated communications about the survey, household screener, the survey itself
- Looked at whether people could actually respond to questions and understand all of the questions
- Noticed they need to highlight who is supposed to answer the survey, e.g., only for households that have children, or even if you do NOT have children. Make requirements bold, high up on the page.
- The wording assumed people had read or received previous mailings. “Since we last asked you, how many…”
- Needed to personalize the people, name the children during the survey so people know who is being referred to
- Wanted to include less legalese
Web survey experiments on fully balanced, minimally balanced, and unbalanced rating scales by Sarah Cho, SurveyMonkey
- Is now a good time or a bad time to buy a house. Or, is now a good time to buy a house or not? Or, is now a good time to buy a house?
- Literature shows a moderating effect for education
- Research showed very little difference among the formats, no need to balance questions online
- Minimal differences by education though lower education does show some differences
- Conclusion, if you’re online you don’t need to balance your questions
How much can we ask? Assessing the effect of questionnaire length on survey quality by Rebecca Medway, American Institutes for Research
- Adult education and training survey, paper version
- Wanted to redesign the survey but the redesign was really long
- 2 versions were 20 pages and 28 pages, 98 questions or 138 questions
- Response rate slightly higher for shorter questionnaire
- No significant differences in demographics [but I would assume there is some kind of psychographic difference]
- Slightly more non-response in longer questionnaire
- Longer surveys had more skips over the open end questions
- Skip errors had no differences between long and short surveys
- Generally longer had lower response rate but no extra problems over the short
- [they should have tested four short surveys versus the one long survey; 98 is just as long as 138 questions in my mind]
“Big Things from Little Data”
While great effort has been expended on improving how we collect online data, there has been insufficient attention on making full use of the data collected. Partial completes of long surveys are discarded. But if there was an effective method to salvage this data, we could increase the average sample size for any given question in a survey by 20% for no additional cost. As an extension of previous research around survey modularization, this research evaluates the potential of partial completes in a modularized and randomized survey design.
- Online surveys averaged around 20 minutes for the last ten years
- 65% of smartphone users not willing to spend more than 15 minutes on a survey.
- Almost half of the time spent on completed surveys is on surveys that are more than 25 minutes.
- Longer surveys have higher drop out rates: 12% on up to ten minutes, 28% on 31 minutes or more. Why can’t we use the partial data?
- Drop outs on mobile data are way higher than computer. 46% on smartphone, 25% on tablet, 12% on computer.
- New panelists have much higher drop out rates
- Around 40% of new panel joins do so via mobile. People think it makes sense and then realize it’s not that good after all.
- A fully optimized survey still took 34% longer to complete on the phone than on the computer.
- We could charge more for long surveys, tell people to write shorter surveys, chunk surveys into pieces and impute or fuse
- Proposal – don’t ask everybody everything. work with human nature, encourage responses through smaller surveys.
- Tried various orders of various modules, not all had same sample size depending on importance of module
- 1000 completes, 26 minutes cost $6500; 1400 completes 17 minutes cost $6500; 1000 completes 19 minutes $5000. Modular design allowed them to save some costs.
- Incompletes could be by module, by skip pattern, or by drop-outs
- High incidence study of social media, common brands, respondent info
- In general 17% of people dropped out as in this study. But within those 35% completed at least one section.
- What drives drop out? boring question or topic, hard questions, extended screening, low tolerance for survey taking
- Survey enjoyability was higher with module surveys, survey length satisfaction higher in module survey
- Reported more social media activities and brand engagement within module survey
- Richer open ends in module survey
- It’s not fusion and Bayesian networks. It’s a generally applicable model but it still requires careful design.
- Think about partial completes as modular completes
- Look for big positive effect on fieldwork costs and data quality
- Are there better question types to do this with? How to randomize modules best?
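The three cost scenarios quoted in the modular survey notes above are easier to compare once they are reduced to cost per complete. This is only a rough sketch of that arithmetic in Python: the function names are my own, and reading "within those 35% completed at least one section" as 35% of the drop-outs is an assumption, not something the presenters confirmed.

```python
# Figures are the ones quoted in the notes; labels and function names are hypothetical.
scenarios = [
    {"label": "1000 completes, 26 minutes", "completes": 1000, "minutes": 26, "cost": 6500},
    {"label": "1400 completes, 17 minutes", "completes": 1400, "minutes": 17, "cost": 6500},
    {"label": "1000 completes, 19 minutes", "completes": 1000, "minutes": 19, "cost": 5000},
]


def cost_per_complete(cost, completes):
    """Fieldwork cost divided by the number of completed interviews."""
    return cost / completes


def salvageable_share(drop_out_rate, partial_share):
    """Share of all starters who dropped out but finished at least one module.

    Assumes partial_share is measured among drop-outs only.
    """
    return drop_out_rate * partial_share


for s in scenarios:
    per_complete = cost_per_complete(s["cost"], s["completes"])
    print(f"{s['label']}: ${per_complete:.2f} per complete")

# 17% dropped out; assumed 35% of those finished at least one section.
share = salvageable_share(0.17, 0.35)
print(f"Starters with usable partial data: {share:.2%}")
```

On these numbers the 1400-complete, 17-minute design is the cheapest per complete, and roughly 6% of starters would contribute partial data that is normally thrown away.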
Our Evolving Ecosystem: A Comprehensive Analysis of Survey Sourcing, Platforms and Lengths by Mark Menig and Chuck Miller #CASRO #MRX
“Our Evolving Ecosystem: A Comprehensive Analysis of Survey Sourcing, Platforms and Lengths”
As more variables enter the research eco-system, assessing the impact of any specific element becomes increasingly difficult. A greater understanding of the interrelationship among survey question types, survey lengths, medium of survey completion (device types), and respondent sourcing (traditional panel, virtual panel, river) – and how these relate to respondent engagement and data quality will be achieved through studying the results of this comprehensive primary research project.
- “Portable” experience, not tied to desk. It used to be just phones or computers. Now it’s also tablets.
- Does screen size impact ability to complete surveys? Does survey length matter? How is data quality affected? Is there an optimal combination of devices?
- Used trad research panel, managed panel, and river sampling; compared phone, tablet, computer; compared 3 survey lengths
- Quota sampled on demographics by device, some cells took longer to fill, some took two months to fill [but that can affect the results]
- Didn’t break any grids up, same on phone and computer. Widest grid was 7 points.
- Time viewing the concept was inversely proportional to the screen size
- Age showed the greatest variation, and age 18-34 had the lowest quality score
- Those with lower quality data were far more likely to skim the concept
- Among higher quality data, awareness that Google Glass was the concept had little impact on amount of time viewing concept
- Re verbatims: 8% of people gave junk answers, 8% gave a valid but short response, 33% gave a short single thought, 18% gave one complex sentence, 33% gave multiple sentences
- Lowest and highest education have better data quality – surprising
- Influencers gave low quality data
- Males give slightly worse data
- Prefer not to answer demos gives a lot worse data
- Speeders have the worst data quality
- Affiliate samples have slightly worse data quality
- PC data has slightly worse data quality, best from tablet. Has concerns about PC data quality.
- Best data – People over 50, shoppers, average techies
- Worst data – Tech enthusiasts, tech laggards, influentials, people under 35
- MS Windows PC was only OS with lower quality data; Those who answer factual questions correctly provide better data (e.g., describe the government structure)
- Prefer not to answer gives lower quality data
- Recommend timing grids because people can still answer them randomly
- Slower concept viewing was NOT an indicator
Over the years, I’ve been asked many times to create or produce case studies proving a variety of points. Prove surveys shouldn’t be longer than 60 minutes. Prove surveys should use simple language.
Well how about this instead. Prove you can watch a 60 minute television show without getting distracted. Prove you can understand a pharmaceutical commercial. Hey, just prove you can sit through a pharmaceutical commercial.
I really don’t know why we need some of these case studies. Turn your brain on and think what it’s like to be human again. There’s your answer.
–Written on the go
Shorter Isn’t Always Better by Inna Burdein, Director of Panel Analytics, The NPD Group, Inc.
- Consider actual length – a 20 minute survey might feel like 40 minutes and vice versa
- Consider that some people can handle a 40 minute survey and others cannot
- Compared cognitively easy vs difficult surveys on same topic
- Completion rate declined with more questions and more difficult surveys
- Difficulty causes abandonment more than length does
- Shorter difficult was less well received than longer easy
- More straightlining in difficult surveys
- Difficult surveys were perceived to be longer surveys
- Shorter surveys are seen as longer and longer surveys are seen as shorter
- Perceived time is linked with satisfaction [makes me think of cognitive dissonance]
- People who display fraudulent behavior feel that surveys are longer than they really are
- More questions = more straightlining
- No matter how long, difficult surveys cause more problems
- Had little effect on completion of the next survey invite except for brand new panelists – they needed a really short survey no matter what [this to me is an important learning]
- Good experienced panelists are introverted, more laid back, more conventional, high need for cognition, and they like surveys
- Best thing to do with any survey is make it more simple [Hail mary!]
- Grids with more than 20% straightlining need to be cut
- [Way too many details, way too quickly, you’ll have to read the paper]
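The rule of thumb from the notes above, that grids with more than 20% straightlining need to be cut, is simple to check once you have the raw grid answers. Here is a minimal sketch in Python. It assumes straightlining means a respondent gave the identical answer to every item in a grid, which may be looser or stricter than the definition used in the paper, and the grid names and responses are made up for illustration.

```python
def is_straightlined(answers):
    """True when every item in a multi-item grid received the same answer."""
    return len(answers) > 1 and len(set(answers)) == 1


def straightlining_rate(grid_responses):
    """Share of respondents who straightlined one grid."""
    if not grid_responses:
        return 0.0
    flagged = sum(1 for answers in grid_responses if is_straightlined(answers))
    return flagged / len(grid_responses)


def grids_to_cut(survey, threshold=0.20):
    """Names of grids whose straightlining rate is above the threshold."""
    return [
        name
        for name, responses in survey.items()
        if straightlining_rate(responses) > threshold
    ]


# Hypothetical survey: one list of item answers per respondent, per grid.
survey = {
    "brand_attributes": [
        [3, 3, 3, 3], [1, 2, 4, 5], [5, 5, 5, 5], [2, 3, 2, 4], [4, 4, 4, 4],
    ],
    "satisfaction": [
        [1, 2, 3], [4, 4, 5], [2, 2, 2], [5, 3, 1], [3, 4, 2],
    ],
}

print(grids_to_cut(survey))
```

In this made-up data, three of five people straightline the first grid, so it gets flagged, while the second grid sits at exactly 20% and survives because the rule says "more than".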
I know this is the right thing to do. You know this is the right thing to do. Market research suppliers, panel companies, sample companies all know this is the right thing to do. Clients know this is the right thing to do.
We continue to write surveys longer than 60 minutes. We continue to program surveys longer than 60 minutes. We continue to say yes to surveys longer than 60 minutes. We continue to worry about straightlining and speeding and random responding and declining engagement. We are as addicted to long surveys as some people are to smoking and I am to sugar.
Here’s an idea.
I received my research organization’s magazine today. Inside were many lovely articles and beautiful charts and tables. I quickly noticed one particular article because of all the charts it had, but the charts are not what caused my fury.
The article was YET ANOTHER one on panel quality. Yes, random responding, straightlining, red herrings. The same topic we’ve been talking about for years and years and years.
Now, I love panel quality as much as the next person and it is an absolutely essential component for every research panel. We know what the features of low quality are and how to spot them and how to remove their effects. We even know the demographics of low quality responders (Ha! Really? We know the demographics of people who aren’t reading the question they’re answering?) But this isn’t the point.
Why do we measure panel quality? Because the surveys we write are so bad, we turn our valuable participants into zombies. They want to answer honestly but we forget to include all the options. They want to share their opinions but we throw wide and long grids at them. They want to help bring better products to market but we write questions about “purchase experience” and “marketing concepts.”
I don’t want to hear about panel quality anymore. It’s been done to death. Low panel quality is OUR fault.
Tell me instead how you’re improving survey quality. How have you convinced clients that shorter is better and simpler is more meaningful? What specific techniques have you used to improve surveys and still generate useful results? Tell me this and I’ll gaze at you with supreme admiration.
The MR business relies almost 100% on the kindness and generosity of our fellow human beings. We hope that people will answer unending surveys on the most boring topics with far more attention than they pay to their favorite child. Basically, we expect people to be lab rats at our beck and call.
But do we show them the respect they deserve? The respect they earned? Take out the last survey you wrote and give it a good look. Was it more than 15 minutes long? Did it have more than ten items in a grid? Were there more than two grids? Did you use marketing language not people language? Did you include outs on every question (DK, none)? I think I know the answer without even seeing the survey myself.
People must come first. Long surveys must go. Boring questions must go. Confusing question set-ups must go. The declining numbers of survey responders who put up with our bad behaviours now cannot sustain our industry for very long.
In the spirit of the season, consider better surveys your gift to the research community.