A “How-To” Session on Modularizing a Live Survey for Mobile Optimization by Chris Neal and Roddy Knowles #FOCI14 #MRX
A “How-To” Session on Modularizing a Live Survey for Mobile Optimization
Chris Neal, CHADWICK MARTIN BAILEY
& Roddy Knowles, RESEARCH NOW
- conducted a modularized survey for smartphone survey takers, studied hotels for personal travel and tablets for personal use, excluded tablet takers to keep the methodology clean
- people don’t want to answer a 20 minute survey on a phone but clients have projects that legitimately need 20 minutes of answers
- data balanced and weighted to census
- age was the biggest phone vs computer difference
- kept survey to 5 minutes, asked no open ended questions, minimized the word count, broke grids into individual questions to avoid the burden of scrolling and hitting a tiny button with a giant finger
- avoid using a brand logo even though you really want to. space is at a premium
- avoid flash on your surveys, avoid images and watermarks, avoid rich media even though it’s way cool – they don’t always work well on every phone
- data with more variability is easier to impute – continuous works great, scale variables work great, 3 ordinal groups doesn’t work so well, nominal doesn’t work so well at all
- long answer option lists are more challenging – vertical scrolling on a smartphone is difficult and affects how many options responders choose, ease of fewer clicks often wins out
- branching is not your friend. if you must branch, have the survey programmers account for the missing data ahead of time, impute all the top level variables and avoid imputing the bottom level branched variables
- Predictive mean matching works better than simply using a regression model to replace missing data
- hot decking (or data stitching which combines several people into one) replaces missing data with that from someone who looks the same, worked really well though answers to “other” or “none of the above” didn’t work as well
- hot decking works better if you have nominal data
- good to have a set of data that EVERYONE answers
- smartphone survey takers aren’t going away, we need to reach people on their own terms, we cannot force people into our terms
- we have lots of good tools and don’t need to reinvent the wheel. [i.e., write shorter surveys gosh darn it!!!]
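To make the two imputation approaches from the session a little more concrete, here is a minimal sketch in Python. It is not the presenters' method. The function names, the matching variables, and the data are all made up for illustration, the hot deck simply takes the first donor who "looks the same", and the predictive mean matching uses one predictor and one nearest donor (real implementations typically draw at random from several close donors).

```python
from statistics import mean

def hot_deck(records, target, match_keys):
    """Fill a missing `target` with the value from the first donor who
    matches the recipient on every key in `match_keys` (hypothetical rule)."""
    donors = [r for r in records if r[target] is not None]
    filled = []
    for r in records:
        r = dict(r)  # copy so the input is left untouched
        if r[target] is None:
            for d in donors:
                if all(d[k] == r[k] for k in match_keys):
                    r[target] = d[target]
                    break
        filled.append(r)
    return filled

def pmm(x, y):
    """Predictive mean matching, simplified: fit y ~ x on complete cases,
    then give each missing case the OBSERVED y of the donor whose
    predicted value is closest to its own predicted value."""
    obs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    mx = mean(xi for xi, _ in obs)
    my = mean(yi for _, yi in obs)
    sxx = sum((xi - mx) ** 2 for xi, _ in obs)
    slope = sum((xi - mx) * (yi - my) for xi, yi in obs) / sxx
    intercept = my - slope * mx

    def predict(v):
        return intercept + slope * v

    out = []
    for xi, yi in zip(x, y):
        if yi is None:
            donor = min(obs, key=lambda d: abs(predict(d[0]) - predict(xi)))
            yi = donor[1]  # a real answer, not the regression prediction
        out.append(yi)
    return out

# Made-up data: nominal variable for hot deck, continuous for PMM
people = [
    {"age_group": "18-34", "travels": "often", "brand": "A"},
    {"age_group": "18-34", "travels": "often", "brand": None},
    {"age_group": "55+", "travels": "rarely", "brand": "B"},
    {"age_group": "55+", "travels": "rarely", "brand": None},
]
brands = [r["brand"] for r in hot_deck(people, "brand", ["age_group", "travels"])]
nights = pmm([1, 2, 3.5, 4, 5], [10, 19, None, 42, 50])
```

The point of the PMM half is that the filled-in value is always something a real person actually answered, which is one plausible reason it behaves better than plugging in the raw regression estimate.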
Ah, yet another enjoyable set of sessions from #AAPOR, chock full of modeling, p-values, and the need to transition to R. Because hey, if you’re not using R, what old-fashioned, sissy statistical package are you using?
This session was all about satisficing, burden, and data quality, and one of the presenters made a remark that really resonated with me – when is burden caused by responders? In this case, burden was measured as surveys that required people to expend a lot of cognitive effort, or when people weren’t motivated to pay full attention, or when people had difficulty with the questions.
Those who know me know that it always irks me when the faults of researchers and their surveys are ignored and passed on to people taking surveys. So let me flip this coin around.
- Why do surveys require people to expend a lot of cognitive effort?
- Why do surveys cause people to be less than fully motivated?
- Why do people have difficulty answering surveys?
We can’t, of course, write surveys that will appeal to everyone. Not everyone has the same reading skills, computer skills, hand-eye coordination, visual acuity, etc. Those problems cannot be overcome. But we absolutely can write surveys that will appeal to most people. We can write surveys with plain and simple language that don’t have prerequisites of sixteen Dickens novels. We can write surveys that are interesting and pleasant and respectful of how people think and feel, thereby helping them to feel motivated. We CAN write surveys that aren’t difficult to answer.
And yes, my presentation compared data quality in long vs short surveys. Assuming my survey was brilliantly written, then why were there any data quality issues at all? 🙂
This afternoon, I attended a session on how the number of survey contacts affects data quality and results equivalence. I just loved the tables and stats and multicollinearity. Many of my hunches, and likely your hunches, were confirmed and yes, overly obvious.
But something bothered me. As cool as it is to confirm that people who are reluctant to participate give bad data and people who always participate give good data, it irked me to be reminded that our standard business practice is to recontact people ten times. 10. TEN. X.
Have we conveniently ignored various facts?
– people have call display. When they see the same name and number pop up ten times, they learn to hate that caller. And the name associated with that caller. And that makes our industry look terrible.
– half of people are introverts. A ton of them let every call go to voicemail which means we are pissing them off by calling them ten times in a few days. Seriously pissing them off. I know. I’m a certified, high order introvert.
– I like to listen to what people say about research companies online. People DO search out the numbers on their call display and identify survey companies. Even if you use local numbers to encourage participation. And yet again, this makes us look bad.
Why do we allow this? For the sake of data integrity? Hogwash. It’s easy for me.
Do we care about respondents or not?
Do Smartphones Really Produce Lower Scores? Understanding Device Effects on Survey Ratings by Jamie Baker-Prewitt #CASRO #MRX
“Do Smartphones Really Produce Lower Scores? Understanding Device Effects on Survey Ratings”
As the proliferation of mobile computing devices continues, some marketing researchers have taken steps to understand the impact of respondents opting to take surveys on smartphones. Research conducted to date suggests a pattern of lower evaluative ratings from smartphone respondents, yet the cause of this effect is not fully understood. Whether the observed differences truly are driven by the data collection device or by characteristics of smartphone survey respondents themselves requires further investigation. Leveraging the experimental control associated with a repeated measures research design, this research seeks to understand the implications of respondent-driven smartphone survey completion on the survey scores obtained.
- Jamie Baker-Prewitt, SVP/Director of Decision Science, Burke, Inc.
- Tested four devices for data quality and responses
- Brand awareness was not significantly different
- Brand engagement – trust, financially stable, value, popular, proud, socially responsible – did show differences. PC users had higher ratings. Smartphone takers had lower ratings.
- Customer engagement – purchase, recommend, loyalty, preference – half of tests showed significant differences. PC users had higher recommend scores and smartphone takers had lower recommend scores.
- Different topics and sources all suggested that devices cause lower ratings
- Did a nice repeated measures design with order controls
- Frequency of purchasing looked the same on both devices, average cell phone bill showed no differences [interesting data point!]
- No differences on brand engagement – 1 out of 30 was significant [i.e., the 5% error rate we expect due to chance]
- Purchase data looked very similar in many cases for PC vs phone, frequency distributions were quite similar
- Correlations between PC and phone scores were around .8, which is very high [recall people did the same survey twice, once on each device]
- Current research replicates original research, no significant device effect. Did not replicate lower scores from smartphones.
- Study lacked mundane realism, they were in a room with other people taking the survey, there weren’t ‘at home’ distractions but there were distractions – chatty people, people needed assistance, people might have simply remembered what they wrote in the first survey
- Ownership of mobile will continue to grow and mobile surveys will grow
- Business professionals are far more likely to answer surveys via mobile, fast food customers are more likely to use smartphones for surveys
- Very few people turned the phone horizontally – they could see less screen but it was easier to read. Why not tell people they CAN turn their phone horizontally?
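The "1 out of 30 was significant" note above is worth a quick bit of arithmetic. This is only a sketch, and it assumes the 30 tests were independent and each run at a 5% significance level, neither of which the session notes confirm.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Average number of 'significant' results when no real effect exists."""
    return n_tests * alpha

def prob_at_least_one(n_tests, alpha=0.05):
    """Chance of seeing one or more false positives across independent tests."""
    return 1 - (1 - alpha) ** n_tests

expected = expected_false_positives(30)   # 1.5 tests, on average
chance = prob_at_least_one(30)            # a bit under 80%
```

So under those assumptions, one significant result out of 30 is about what chance alone would hand you, and if anything slightly fewer than expected.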
Total Survey Error is a relatively recent approach to understanding the errors that occur during the survey, or research, process. It incorporates sampling errors, non-sampling errors, and measurement errors, including such issues as specification error, coverage errors, non-response errors, instrument error, respondent error, and pretty much every single other error that could possibly exist. It’s an approach focused on ensuring that the research we conduct is as valid and reliable as it can possibly be. That is a good thing.
Here’s the problem. Total Survey Error is simply a list. A list of research errors. A long list, yes, but a list of every error that every researcher has been trained to recognize and account for in every research project they conduct.
We have been trained to recognize a bad sample, improve a weak survey, conduct statistics properly, generalize appropriately, and not promise more than we can deliver. Is conducting research the old name of ‘total survey error?’ It is not a new, unique approach. It does not require new study nor new books.
Perhaps I’m missing something, but isn’t total survey error how highly skilled, top notch researchers have been trained to do their job?
How do you create a survey question measuring frequency of behaviour that will generate the most accurate responses? Experience tells us to consider things like:
- Should I include a zero or incorporate that into the smallest value?
- Should I use whole numbers like ‘2 to 4’ or partial numbers like ‘2 to less than 4’?
- Should I use 4 break points or go all out with 10 break points?
These are all smart considerations and will help you collect more precise data. But seriously? How accurate, how precise, how valid are these data anyways? Do you really think that survey responders are going to carefully and precisely calculate the exact number of days or minutes they do something?
When we ask responders to choose between these two options…
- 2 to 4.99
- 2 to less than 5
… do you really think that one or the other option will help responders think of estimates that are any more accurate? Let’s face it. They won’t. Yes, there’s statistical precision in the answer options we’ve provided, but we are manufacturing a level of accuracy that does not exist. It’s no different than reporting 10 decimal places where 1 is more than sufficient.
So what do I recommend? Make things simple for your responder. Use real language not hoity toity, decades of academic research language. Use language that makes responders want to come and take another survey.
Guest Post by Prof. Dr. Peter Ph. Mohler
Having listened to uncountable papers and read innumerable texts on non-response, non-response bias, survey error, even total survey error, or global cooling of the survey climate, it seems timely to consider why, after so many decades working in a (according to the papers) seemingly declining field called “survey research”, I still do not intend to quit that field.
The truth is, I am mighty proud to be a member of survey research because:
- We can be proud of our respondents who, after all these years, still give us an hour or so of their precious time to answer our questions to the best of their abilities.
- We can be proud of our interviewers who, despite low esteem/status and payment, under often quite difficult circumstances, get in contact with our respondents, convince them to give us some of their time and finally do an interview to the best of their abilities.
- We can be proud of our survey operations crews, who, despite low esteem/status and increasing costs/time pressures organize data collection, motivate interviewers, and edit/finalize data for analysis.
- We can be proud of our social science data archives, who for more than five decades have preserved and published surveys nationally and internationally as a free, high quality service unheard of in other branches of science.
- We can be proud of our survey designers, statisticians and PIs, who constantly improved survey quality from its early beginnings.
Of course there are drawbacks such as clients insisting on asking dried and dusted questions or, often academic, PIs who do not appreciate the efforts and successes of respondents, interviewers, survey operations and all the rest, and there are some who deliberately fabricate surveys or survey analyses (including all groups mentioned before).
But it is no good to define a profession by its outliers or not so optimal outcomes.
Thus it seems timely to turn our attention from searching for errors to optimizing survey process quality and at long last defining benchmarks for good surveys that are fit for their intended purpose.
The concepts and tools are already there, waiting to be used to our benefit.
As originally posted to the AAPOR distribution list.
Peter is owner and Chief Consultant of Comparative Survey Services (COMPASS) and honorary professor at Mannheim University. He is the former Director of ZUMA, Mannheim (1987-2008). Among other roles, he directed the German General Social Survey (ALLBUS) and the German part of the International Social Survey Programme (ISSP) for more than 20 years. He was a founding senior member of the European Social Survey (ESS) Central Scientific Team 2001-2008. He is co-editor of Cross-Cultural Survey Methods (John Wiley, 2003) and Survey Methods in Multinational, Multiregional, and Multicultural Contexts (John Wiley, 2010, AAPOR Book Award 2013). Together with colleagues of the ESS Central Coordinating Team, he received the European Descartes Prize in 2005.
… Live blogging from Disney Orlando, any errors are my own…
Steve von Bevern, VP, Client Services and Operations, Research Now Mobile (presented by Roddy Knowles)
- Today’s path to purchase is more complex than ever
- People use mobile for discovery, evaluation, buying, accessing, using, getting support, and much more
- Mobile users multi-task, 51% listen to music, 52% watch TV, 43% use internet, 28% play video games, 17% read a book, 16% read newspaper/magazine
- What is mobile – not just phones (dumb phones), it’s also tablets that can display surveys
- 1.4 billion smartphones! 268 million tablets in active use!!
- Who uses devices? Similar age and gender to census though slightly more younger, slightly less older; same ethnicities as census, similar income as census though slightly
- Mobile open ends are richer – longer responses and more of these longer responses [do 99% of people provide a survey answer because they are forced to? and then provide a crap answer as a result?]
- Do we get the same answers both ways? In this case study, when weighted back to census, results were very similar, sport opinions were similar, smoking opinions were similar
- How do responders feel about it? Older responders feel it isn’t as easy on mobile, on average people don’t find mobile more difficult if the survey is properly designed
- How fun is the survey? Mobile is seen as more fun than online, online surveys are no longer new and interesting, maybe this will change over time as people become used to mobile surveys
- Mobile responders overwhelmingly prefer mobile surveys. Get people where they want to be
- Like online, you must target to ensure mobile and online deliver representative data.
- Online or mobile are not the solution for every audience. You must choose the method that’s right for the situation.
- Don’t rely only on one method or you will miss people who prefer other methods
- Mobile allows rich media uploads, seamless option for people
- In the moment surveys work well too, get close to the point of experience and a computer isn’t always the right way to do it, don’t rely on imperfect memory, a 5 minute survey while they are at breakfast can work well; we know memory is fallible, will you remember everything from this conference next week?
- Geolocated surveys are an advantage. Target people entering a store, determine who walked past a store [I always turn off my geolocation though I know my signals are tracked by my phone provider]
- Home ethnography – 97% scanned barcodes, 99% uploaded pictures, 82% uploaded audio, 80% of people with pets uploaded videos
- Sporting event two day diary – 69% completed two diaries, 62% uploaded images, 19% uploaded video, very detailed open-ends, high level of engagement
- Holistic Insights – Behaviour data like app downloads, music played plus survey data result in deeper insights
- Retail surveys – good way to get shopper feedback on in-store displays, respondents can go in and upload photos of in-store displays and scan barcodes; geofencing means the survey is only available when the responder is in the store, ensuring strong validity
- Mobile is not online and online is not mobile. You can’t just use them in place of each other.
- Define – who you want to reach, consider multi-mode to ensure broad representation
- Dive deep – be creative, enhance open-ends with audio and video, use rich media when it’s fit for purpose, let people write it out or audio record as they wish
- Design – be pragmatic and make the most of respondent time and screen real estate, streamline and simplify, know which questions work and don’t work, use multiple points of engagement, think like a respondent
- When you’re doing mobile, don’t ask people to constantly scroll, don’t ask them to rank 15 options
- Stick to 7 responses so they don’t have to scroll, limit “please specify” to where you really need it, eliminate superfluous words and phrases
- Don’t use lots of cute and colors and fancy just because you can, practical must come first
- Grids and mobile don’t play well together [let me rephrase… DO NOT USE GRIDS ON YOUR PHONE. How freakishly tiny are your fingers? Come on!]
- Test out what you’re asking your responders to do first, take a video while you’re pouring the milk
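For anyone wondering what the geofencing in the retail example might boil down to, here is a rough sketch. It is my own illustration, not Research Now's implementation: the coordinates, the 100 metre radius, and the function names are all hypothetical, and a real app would rely on the phone's location services rather than raw arithmetic.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in metres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def survey_available(phone, store, radius_m=100):
    """Only offer the survey when the phone is within the store's fence."""
    return distance_m(*phone, *store) <= radius_m

store = (28.3852, -81.5639)           # hypothetical store location
in_the_aisle = (28.3853, -81.5639)    # roughly 11 metres away
down_the_road = (28.3952, -81.5639)   # roughly 1.1 km away

offer_now = survey_available(in_the_aisle, store)
offer_later = survey_available(down_the_road, store)
```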
Probability and Non-Probability Samples in Internet Surveys
Moderator: Brad Larson
Understanding Bias in Probability and Non-Probability Samples of a Rare Population John Boyle, ICF International
- If everything was equal, we would choose a probability sample. But everything is not always equal. Cost and speed are completely different. This can be critical to the objective of the survey.
- Did an influenza vaccination study with pregnant women. Would have required 1200 women if you wanted to look at minority samples. Not happening. Influenza data isn’t available at a moment’s notice and women aren’t pregnant at your convenience. Non-probability sample is pretty much the only alternative.
- Most telephone surveys are landline only for cost reasons. RDD has coverage issues. It’s a probability sample but it still has issues.
- Unweighted survey looked quite similar to census data. Looked good when crossed by age as well. Landline are more likely to be older and cell phone only are more likely to be younger. Landline more likely to be married, own a home, be employed, higher income, have insurance from employer.
- Landline vs cell only – no difference on tetanus shot, having a fever. Big differences by flu vaccination though.
- There are no gold standards for this measure, there are mode effects
- Want probability samples but can’t always achieve them
A Comparison of Results from Dual Frame RDD Telephone Surveys and Google Consumer Surveys
- Pew and Google partnered on this study; 2 question survey
- Consider fit for purpose – can you use it for trends over time, quick reactions, pretesting questions, open-end testing, question format tests
- Not always interested in point estimates but better understanding
- RDD vs Google surveys – average difference was 6.5 percentage points, distribution closer to zero but there were a number that were quite different
- Demographics were quite similar, Google samples were a bit more male, Google had fewer younger people, Google was much better educated
- Correlation of age and “I always vote” was very high, good correlation of age and “prefer smaller government”
- Political partisanship was very similar, similar for a number of generic opinions – earth is warming, same sex marriage, always vote, school teaching subjects
- Difficult to predict when point estimates will line up to telephone surveys
A Comparison of a Mailed-in Probability Sample Survey and a Non-Probability Internet Panel Survey for Assessing Self-Reported Influenza Vaccination Levels Among Pregnant Women
- Panel survey via email invite, weighted data by census, region, age groups
- Mail survey was a sampling frame of birth certificates, weighted on nonresponse, non-coverage
- Tested demographics and flu behaviours of the two methods
- age distributions were similar [they don’t present margin of error on panel data]
- panel survey had more older people, more education
- Estimates differed on flu vaccine rates, some very small, some larger
- Two methods are generally comparable, no stat testing due to non-prob sample
- Trends of the two methods were similar
- Panel survey is good for timely results
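Several of these talks mention weighting panel data back to census. As a reminder of what that means in the simplest possible case, here is a sketch of post-stratification on a single variable. The age groups, sample counts, population shares, and vaccination rates are invented for illustration and are not figures from the study or from any census.

```python
def poststrat_weights(sample_counts, population_shares):
    """Weight for each group = population share / sample share."""
    n = sum(sample_counts.values())
    return {g: population_shares[g] / (sample_counts[g] / n) for g in sample_counts}

def weighted_rate(rates, sample_counts, weights):
    """Overall rate after each respondent is given their group's weight."""
    total = sum(sample_counts[g] * weights[g] for g in rates)
    return sum(rates[g] * sample_counts[g] * weights[g] for g in rates) / total

# Hypothetical panel that skews young compared to the population
sample_counts = {"18-34": 50, "35-54": 30, "55+": 20}
population_shares = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}
vaccinated = {"18-34": 0.40, "35-54": 0.50, "55+": 0.60}

weights = poststrat_weights(sample_counts, population_shares)
unweighted = sum(vaccinated[g] * sample_counts[g] for g in vaccinated) / 100
weighted = weighted_rate(vaccinated, sample_counts, weights)
```

In this made-up case the over-represented young group is weighted down and the overall estimate moves from 47% to about 50.5%. Real weighting schemes usually balance several variables at once (region, age, and so on), which this sketch does not attempt.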
Probability vs. Non-Probability Samples: A Comparison of Five Surveys
- [what is a probability panel? i have a really hard time believing this]
- Novus and TNS Sifo considered probability
- YouGov and Cint considered non-probability
- Response rates range from 24% to 59%
- SOM institute (mail), Detector (phone), LORe (web) – random population sample, rates from 8% to 53%
- Data from Sweden
- On average, three methods differ from census results by 4% to 7%, web was worst; demos similar except education where higher educated were over-represented, driving licence over-rep
- Non-prob samples were more accurate on demographics compared to prob samples; when they are weighted they are all the same on demographics but education is still a problem
- The five data sources were very similar on a number of different measures, whether prob or non-prob
- demographic accuracy of non-prob panels was better. also closer to political attitudes. No evidence that self recruited panels are worse.
- Need to test more indicators, retest
Modeling a Probability Sample? An Evaluation of Sample Matching for an Internet Measurement Panel
- “construct” a panel that best matches the characteristics of a probability sample
- Select – Match – Measure
- Matched on age, gender, education, race, time online, also looked at income, employment, ethnicity
- Got good correlations and estimates from prob and non-prob.
- Sample matching works quite well [BOX PLOTS!!! i love box plots, so good in so many ways!]
- Non-prob panel has more heavy internet users
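The "Select – Match – Measure" idea from the sample matching talk can be sketched in a few lines. This is a toy version under my own assumptions, not the presenters' algorithm: it greedily pairs each person in a probability sample with the closest unused panelist, where "closest" just means the fewest mismatches on a handful of categorical variables. The people and the variable names are invented.

```python
def mismatches(a, b, keys):
    """Number of matching variables on which two people differ."""
    return sum(a[k] != b[k] for k in keys)

def match_sample(reference, panel, keys):
    """For each reference person, pick the closest panelist not yet used.
    Assumes the panel has at least as many people as the reference sample."""
    available = list(panel)
    matched = []
    for target in reference:
        best = min(available, key=lambda p: mismatches(target, p, keys))
        matched.append(best)
        available.remove(best)  # each panelist is matched at most once
    return matched

keys = ["age", "gender", "education"]
reference = [
    {"age": "18-34", "gender": "F", "education": "college"},
    {"age": "55+", "gender": "M", "education": "high school"},
]
panel = [
    {"id": "p1", "age": "18-34", "gender": "M", "education": "college"},
    {"id": "p2", "age": "18-34", "gender": "F", "education": "college"},
    {"id": "p3", "age": "55+", "gender": "M", "education": "high school"},
    {"id": "p4", "age": "35-54", "gender": "F", "education": "college"},
]
matched_ids = [p["id"] for p in match_sample(reference, panel, keys)]
```

Notice that nothing here matches on time online unless you add it as a key, which lines up with the talk's point that the non-prob panel still skews toward heavy internet users.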
“Mobile Research Risk: What Happens to Data Quality When Respondents Use a Mobile Device for a Survey Designed for a PC” by Jamie Baker-Prewitt, Senior Vice President, Director of Decision Sciences, Burke, Inc.
- Prior data shows that opt-in profiles of mobile completes were more likely to be younger, employed, ethnically diverse, response rates were slightly lower than web, some survey results were different
- In traditional surveys and mobile web, most dropoff is extremely early; in SMS surveys the trend is steep but far more extended over time
- Responders are choosing to use a mobile device to complete a survey when we have designed it for a PC
- Two recent studies showed 7% to 17% of completes coming from mobile devices, and 4% to 8% coming from tablets
- Research required people to have all three devices but they only actually answered the 10 minute survey on one device
- Tablet and smart phone adapted responses had shorter survey lengths
- Bad smart phone survey took a couple of minutes longer than tablet and smart phone adapted
- Bad smart phone survey had much higher drop out rates – 18% vs 5% to 11%
- [WHY do we keep using this “mark a 2”. There is no more ridiculous measure of data quality than this!]
- Straightlining is less of an issue on bad smart phone surveys maybe because people can only see a couple of items at a time anyways
- Multiple errors higher on bad smart phone surveys
- Responders generally prefer to take surveys on a PC but responders answering a good mobile survey prefer that method
- Results of the four methods were generally the same
- Non-response bias introduced by requiring responders to have all three devices [but it was a necessary evil]
- Mental cheating is not rampant in a 10 minute survey but bad smart phone surveys have higher drop out rates, more likely to straightline, more likely to not comply with instructions
- Good smart phone surveys need to use fewer words where possible, make sure response alternatives are visible and not clunky to navigate, ensure respondents can tell when their response has been registered
- Test the survey on a variety of devices and operating systems
- We must accommodate responder preferences, we can no longer request that people only answer on a PC
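Since straightlining came up as one of the data quality measures, here is a bare-bones sketch of how it might be flagged. It is my own simplification, not Burke's measure: a respondent is flagged if every item in a grid got the identical answer, and the respondent IDs and ratings are made up.

```python
def is_straightlined(grid_answers):
    """True if a multi-item grid received the same answer for every item."""
    return len(grid_answers) > 1 and len(set(grid_answers)) == 1

def straightlining_rate(respondents):
    """Share of respondents whose grid answers are all identical."""
    flagged = sum(is_straightlined(answers) for answers in respondents.values())
    return flagged / len(respondents)

# Hypothetical 5-item grid on a 5-point scale
grids = {
    "r1": [3, 3, 3, 3, 3],
    "r2": [4, 2, 5, 3, 4],
    "r3": [1, 1, 1, 1, 1],
    "r4": [5, 4, 4, 5, 3],
}
rate = straightlining_rate(grids)
```

Keep in mind that identical answers are sometimes a genuine opinion, so a flag like this is a prompt to look closer, not proof of cheating.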