Tag Archives: survey design

Is MyDemocracy.ca a Valid Survey?

Like many other Canadians, I received a card in the mail from the Government of Canada promoting a website named MyDemocracy.ca. Just a day before, I’d also come across a link for it on Twitter, so with two hints at hand, I decided to read the documentation and find out what it was all about. Along the way, I noticed a lot of controversy about the survey so I thought I’d share a few of my own comments here. I have no vested interest in either party. I am simply a fan of surveys and have some experience in that regard.

First, let’s recognize that one of the main reasons researchers conduct surveys is to generate results which can be generalized to a specific population, for example the population of Canada. Having heard of numerous important elections around the world recently, we’ve become attuned to polling research which attempts to predict election winners. The polling industry has taken a lot of heat lately regarding perceived low levels of accuracy, and people are paying close attention.

Sometimes, however, the purpose of a survey is not to generalize to a population, but rather to gather information so as to be more informed about a population. Thus, you may not intend to learn whether 10% of people believe A and 30% believe B, but rather that there is a significant proportion of people who believe A or B or C or D. These types of surveys don’t necessarily focus on probability or random sampling, but rather on gathering a broad spectrum of opinions and understanding how they relate to each other. In other cases, the purpose of a survey is to generate discussion and engagement, to allow people to better understand themselves and other people, and to think about important issues using a fair and balanced baseline that everyone can relate to.

The FAQ associated with MyDemocracy.ca explains the purpose of the survey in just this manner – to foster engagement. It explains that the experimental portion of the survey used a census-balanced sample of Canadians, and that the current intention of the survey is to help Canadians understand where they sit in relation to their fellow citizens. I didn’t see any intention for the online results to be used in a predictive way.

I saw some complaints that the questions are biased or unfair. Having completed the survey two and a half times myself, I do see that the questions are pointed and controversial. Some of the choices are extremely difficult to make. To me, however, the questions seem no different than what a constituent might actually be asked to consider and there are no easy answers in politics. Every decision comes with side-effects, some bad, some horrid. So while I didn’t like the content of some of the questions and I didn’t like the bad outcomes associated with them, I could understand the complexity and the reasoning behind them. In fact, I even noticed a number of question design practices that could be used in analysis for data quality purposes. In my personal opinion, the questions are reasonable.

I’m positive you noticed that I answered the survey more than twice. Most surveys do not allow this but if the survey was launched purely for engagement and discussion rather than prediction purposes, then response duplication is not an issue. From what I see, the survey (assuming it was developed with psychometric precision as the FAQ and methodology describe) is a tool similar to any psychological tool whether personality test, intelligence test, reading test, or otherwise. You can respond to the questions as often as you wish and see whether your opinions or skills change over time. Given what is stated in the FAQ, duplication has little bearing on the intent of the survey.

One researcher’s opinion.


Since you’re here, let me plug my new book on questionnaire design! It makes a great gift for toddlers and grandmas who want to work with better survey data!
People Aren’t Robots: A practical guide to the psychology and technique of questionnaire design


Rights Of Respondents #AAPOR

Live note taking at #AAPOR in Austin Texas. Any errors or bad jokes are my own.

Moderator: Ronald Langley, University of Kentucky

Examining the Use of Privacy Language: Privacy from the Respondent’s View; Kay Ricci, Nielsen; Lauren A. Walton, Nielsen; Ally Glerum, Nielsen; Robin Gentry, Nielsen

  • Respondents have concerns about the safety and security of their data, want to know how data is collected and stored, we hear about breaches all the time now
  • In 2015, FCC issued TCPA rules re automatic telephone dialling systems, can’t use them for cell phones without consent
  • Existing page of legalese was terrifying and could affect key metrics
  • 3 steps – in depth interviews, implemented language into TV diary screener, analyzed key metrics to understand impact of new language
  • 30 English interviews, 16 Spanish interviews
  • Did people notice the language, read it, skim it, did they care? Did they understand the terms, was the language easy or difficult
  • Tested several versions, tested a longer version with softer language
  • Only one person understood what an autodialler was, people didn’t realize it was a live interviewer, people didn’t care how their number was dialled if they were going to talk to a live human anyways
  • 2/3 didn’t like the concept, thought they’d be phoned constantly, 1/3 didn’t mind because it’s easier to hang up on a machine
  • People liked we weren’t selling or marketing products, but many don’t see the difference
  • Many people don’t know who Nielsen is
  • People liked being reminded that it was voluntary, extra length was fine for this
  • The after version was longer with more broken up sentences
  • Test group had lower return rate but very slightly, lower mail rate
  • Higher return rate for 50 plus, and Hispanic
  • Contact number provision was much lower, drop from 71% to 66%
  • It’s essential to pretest so you know the impact
  • [simple language is always better even if it takes more space]

Allowing Access to Household Internet Traffic: Maximizing Acceptance of Internet Measurement; Megan Sever, Nielsen; Sean Calvert, Nielsen

  • How do we measure what people watch and buy online in their home
  • How do we access third party data, but then how do we get true demographic information to go with it
  • 22 semi structured interviews – mock recruit into please share your password
  • Ranges from absolutely yes – they believe it’s already being collected anyways
  • Sceptics wanted more information – what are you actually recording, how is my data secure
  • Privacy – security – impact on Internet performance
  • People seemed to think the researchers would screencap everything they were doing and could see their bank account
  • Brought examples of real data that would be collected, what the research company will see, essentially lines of code, only see URL, not the contents of the page, start and stop times; at this point participants were no longer concerned
  • Gave a detailed description of encryption, storage and confidentiality procedures
  • Explain we’re not marketing or selling and data is only stored as long as is necessary
  • Reputation of the research company builds trust, more familiar folks were okay with it
  • Script should describe purpose of measurement, what will and will not be measured, how it will be measured, data security privacy and protection policies, impact on Internet performance, reputation of company
  • Provide supplementary information if asked for – examples of data, policies that meet or exceed standards, examples of Internet performance, background and reputation of company

Informed Consent: What Do Respondents Want to Know Before Survey Participation; Nicole R. Buttermore, GfK Custom Research; Randall K. Thomas, GfK Custom Research; Jordon Peugh, SSRS; Frances M. Barlas, GfK Custom Research; Mansour Fahimi, GfK Custom Research

  • Recall the kerfuffle last year about what information should be told to respondents re sponsor or person conducting the research 
  • Recognized participants should be told enough information to give informed consent – but also if we are concerned about bias, then we can tell people they won’t be debriefed until afterwards; but companies said sometimes they could NEVER reveal the sponsor and they’d have to drop out of #AAPOR if this happened
  • We worry about bias and how knowing the sponsors affects the results
  • Sponsor information is a less important feature to respondents
  • Do respondents view taking surveys as risky? What information do respondents want prior to completing surveys?
  • Topic, my time, and my incentive are thought to be most important
  • People were asked about surveys in general, not just this one or just this company
  • 6% thought an online survey could have a negative impact 
  • Most worried about breaches of privacy, confidentiality; less worried that a survey is a waste of time or boring, or might upset them
  • 70% said no risk to mental health, 2% said high risk to mental health
  • 23% said they stopped taking a survey because it made them uncomfortable – made them think more deeply about life, made them angry, made them feel worse about themselves, made them feel sad, or increased their stress
  • Politics made them angry, bad questions made them angry, biased survey and too long survey made them angry [That’s OUR fault]
  • Same for feeling stressed, but also add in finance topics
  • Feel worse about self is the finance topic or health, or about things they can’t have
  • Feel sad related to animal mistreatment
  • People want to know how personal information will be protected, survey length, risks, topic, how results will be used, incentives, purpose of survey – at least one third of people [1/3 might not seem like a lot but when your sample is 300 people that’s 100 people who want to know this stuff]
  • Similar online vs phone, incentives more important for online, on the phone people want to know what types of questions will be asked

Communicating Data Use and Privacy: In-person versus Web Based Methods for Message Testing; Aleia Clark Fobia, U.S. Census Bureau; Jennifer Hunter Childs, U.S. Census Bureau

  • Concern about different messages in different places and they weren’t consistent
  • Is there a difference between “will only be used for statistical purposes” and “will never be used for non statistical purposes”
  • Tested who will see data, identification, sharing with other departments, burden of message
  • Tested it on a panel of census nerds :), people who want to be involved in research, 4000 invites, 300 completes
  • People were asked to explain what each message means, broke it down by demographics
  • 30 cognitive interviews, think aloud protocol, read sets of messages and react, tested high and low performing messages [great idea to test the low messages as well]
  • Focused on lower education and people of colour
  • Understanding is higher for in person testing, more misunderstanding in online responses, “You are required by law to respond to the census (technical reference)” was better understood than listing everything in a statement
  • People want to know what ‘sometimes’ means. And want to know which federal agencies – they don’t like the IRS
  • People don’t believe the word never because they know breaches happen
  • More negative on the web
  • Less misunderstanding in person
  • Easier to say negative things online
  • In person was spontaneous and conversational
  • Focus on small words, avoid unfamiliar concepts, don’t know what tabulate means, don’t know what statistical means [they aren’t stupid, it’s just that we use it in a context that makes no sense to how they know the word]

    Respondent Burden & the Impact of Respondent Interest, Item Sensitivity and Perceived Length; Morgan Earp, U.S. Bureau of Labor Statistics; Erica C. YuWright, U.S. Bureau of Labor Statistics

    • 20 knowledge questions, 10 burden items, 5 demographic questions, ten minute survey
    • Some questions were simple, others were long and technical
    • Respondents asked to complete a follow up survey a week later
    • Asked people how hard the survey was related to taking an exam at school or reading a newspaper or completing another survey – given only one of these comparisons 
    • Anchor of school exam had a noticeable effect size but not significant 
    • Burden items – length, difficulty, effort, importance, helpfulness, interest, sensitivity, intrusive, private, burden
    • Main effects – only sensitivity was significant, effect size is noticeable
    • Didn’t really see any demographic interactions
    • Burden length difficult; effort importance helpfulness interesting; sensitive intrusive private – these are the three factors 
    • Only first factor related to whether they would answer the second survey
    • Females more likely to respond a second time
    • More sensitive surveys were less likely to be answered again, more interesting ones would attract more women the second time

      Goodbye Humans: Robots, Drones, and Wearables as Data Collectors #AAPOR 

      Live note taking at #AAPOR in Austin Texas. Any errors or bad jokes are my own.

      Moderator: James Newswanger, IBM

      Using Drones for Household Enumeration and Estimation; Safaa R. Amer, RTI International; Mark Bruhn, RTI International; Karol Krotki, RTI International

      • People have mixed feelings about drones, privacy
      • When census data is available it’s already out of date
      • Need special approval to fly drones around
      • Galapagos province census, new methodology used a tablet to collect info to reduce cost and increase timeliness
      • Usually give people maps and they walk around filling out forms
      • LandScan uses satellite imagery plus other data
      • Prepared standard and aerial maps for small grid cells, downloaded onto tablet
      • Trained enumerators to collect data on the ground
      • Maps show roof of building so they know where to go, what to expect, maps online might be old, show buildings no longer there or miss new buildings
      • Can look at restricted access, e.g., on a hill, vegetation 
      • Can put comments on the map to identify buildings no longer existing
      • What to do when a building lies on a grid line, what if the entrance was in a different grid than most of the house
      • Side image tells you how high the building is, get much better resolution with drone
      • Users had no experience with drones or GIS
      • Had to figure out how to standardize data extraction
      • Need local knowledge of common dwelling composition to identify type of structure, local hotels looked like houses
      • Drones gave better information about restricted access issues, like fence, road blocks 
      • Drones had many issues but less time required for drones, can reuse drones but you can’t use geolisting
      • Can extend to conflict and fragile locations like slums, war zones, environmentally sensitive areas

      Robots as Survey Administrators: Adapting Survey Administration Based on Paradata; Ning Gong, Temple University; Nina DePena Hoe, Temple University; Carole Tucker, Temple University; Li Bai, Temple University; Heidi E. Grunwald, Temple University

      • Enhance patient-reported outcomes for surveys of children under 7 or adults with cognitive disabilities
      • Could a robot read and explain the questions, it is cool and cute, and could reduce stress
      • Ambient light, noise level, movement of person are all paradata
      • Robot is 20 inches high, likes toys or friends, it’s very cute, it can dance, play games, walk, stand up, do facial recognition, speech recognition, sees faces and tries to follow you
      • Can read survey questions, collect responses, collect paradata, use item response theory, play games with participants 
      • Can identify movements of when person is nervous and play music or games to calm them down 
      • Engineers, social researchers, and public health researchers worked together on this; HIPAA compliance

      Wearables: Passive Media Measurement Tool of the Future; Adam Gluck, Nielsen; Leah Christian, Nielsen; Jenna Levy, Nielsen; Victoria J. Hoverman, Nielsen; Arianne Buckley, Nielsen; Ekua Kendall, Nielsen; Erin Wittkowski, Nielsen

      • Collect data about the wearer or the environment
      • People need to want to wear the devices
      • High awareness of wearables, 75% aware; 15% ownership. Computers were 15% ownership in 1991
      • Some people use them to track all the chemicals that kids come near everyday
      • Portable People Meter – clips to clothing, detects radio and audio codes or TV and radio; every single person in household must participate, 80% daily cooperation rate
      • Did research on panelists, what do they like and dislike, what designs would you prefer, what did younger kids think about it
      • Barriers to wearing: clothing difficulties, some situations don’t lend themselves to it, it’s a conspicuous dated design
      • Dresses and skirts most difficult because no pockets or belts, not wearing a belt is a problem
      • Can’t wear while swimming, some exercising, while getting ready in the morning, preparing for bed, changing clothes, taking a shower
      • School is a major impediment, drawing attention to it is an impediment, teachers won’t want it, it looks like a pager, too many people comment on it and it’s annoying
      • It’s too functional and not fashionable, needs to look like existing technology
      • Tried many different designs, LCD wristband most preferred by half of people, others like the watch, long clip, jawbone, or small clip style
      • Colour is important, right now they’re all black and gray [I’M OUT. ]
      • Screen is handy, helps you know which meter is whose
      • Why don’t you just make it a fitness tracker since it looks like I’m wearing one
      • Showing the equipment should be the encouragement they need to participate
      • [My SO NEVER wore a watch. But now never goes without the wrist fitbit]

      QR Codes for Survey Access: Is It Worth It?; Laura Allen, The Gallup Organization; Jenny Marlar, The Gallup Organization

      • [curious where the QR codes she showed lead to 🙂 ]
      • Static codes never change; dynamic codes work off a redirect and can change
      • Some people think using a QR code makes them cool
      • Does require that you have a reader on your phone
      • You’d need one QR code per person, costs a lot more to do 1000 codes
      • Black and white paper letter with one dollar incentive, some people also got a weblink with their QR code
      • No response rate differences
      • Very few QR code completes, 4.2% of completes, no demographic differences
      • No gender, race differences; QR code users had higher education and were younger
      • [wonder what would happen if the URL was horrid and long, or short and easy to type]
      • Showing only a QR code decreased the number of completes
      • [I have a feeling QR codes are old news now, they were a fun toy when they first came out]

      Comparing Youth’s Emotional Reactions to Traditional vs. Non-traditional Truth Advertising Using Biometric Measurements and Facial Coding; Jessica M. Rath, Truth Initiative; Morgane A. Bennett, Truth Initiative; Mary Dominguez, Truth Initiative; Elizabeth C. Hair, Truth Initiative; Donna Vallone, Truth Initiative; Naomi Nuta, Nielsen Consumer Neuroscience; Michelle Lee, Nielsen Consumer Neuroscience; Patti Wakeling, Nielsen Consumer Neuroscience; Mark Loughney, Turner Broadcasting; Dana Shaddows, Turner Broadcasting

      • Truth campaign is a mass media smoking prevention campaign launched in 2000 for teens
      • Target audience is now 15 to 21, up from 12 years when it first started
      • Left swipe is an idea of rejection or deleting something
      • Ads on “Adult Swim” incorporating the left swipe concept into “Fun Arts”
      • Ads where profile pictures with smoking were left swiped
      • It trended higher than #Grammys
      • Eye tracking showed what people paid attention to, how long attention was paid to each ad
      • Added objective tests to subjective measures
      • Knowing this helps with media buying efforts, can see which ad works best in which TV show

      New Math For Nonprobability Samples #AAPOR 

      Moderator: Hanyu Sun, Westat

      Next Steps Towards a New Math for Nonprobability Sample Surveys; Mansour Fahimi, GfK Custom Research; Frances M. Barlas, GfK Custom Research; Randall K. Thomas, GfK Custom Research; Nicole R. Buttermore, GfK Custom Research

      • Neyman paradigm requires complete sampling frames and complete response rates
      • Non-prob is important because those assumptions are not met, sampling frames are incomplete, response rates are low, budget and time crunches
      • We could ignore that we are dealing with nonprobability samples, find new math to handle this, try more weighting methods [speaker said commercial research ignores the issue – that is absolutely not true. We are VERY aware of it and work within appropriate guidelines]
      • In practice, there are incomplete sampling frames so samples aren’t random and respondents choose to not respond and weighting has to be more creative, uncertainty with inferences is increasing
      • There is fuzz all over, relationship is nonlinear and complicated 
      • Geodemographic weighting is inadequate; weighted estimates to benchmarks show huge significant differences [this assumes the benchmarks were actually valid truth but we know there is error around those numbers too]
      • Calibration 1.0 – correct for higher agreement propensity with early adopters – try new products first, like variety of new brands, shop for new, first among my friends, tell others about new brands; this is in addition to geography
      • But this is only a univariate adjustment, one theme, sometimes it’s insufficient
      • Sought a Multivariate adjustment
      • Calibration 2.0 – social engagement, self importance, shopping habits, happiness, security, politics, community, altruism, survey participation, Internet and social media
      • But these dozens of questions would burden the task for respondents, and weighting becomes an issue
      • What is the right subset of questions for biggest effect
      • Number of surveys per month, hours on Internet for personal use, trying new products before others, time spent watching TV, using coupons, number of relocations in past 5 years
      • Tested against external benchmarks, election, BRFSS questions, NSDUH, CPS/ACS questions
      • Nonprobability samples based on geodemographics are the worst of the set, adding calibration improves them, nonprobability plus calibration is even better, probability panel was the best [pseudo probability]
      • Calibration 3.0 is hours on Internet, time watching TV, trying new products, frequency expressing opinions online
      • Remember Total Research Error, there is more error than just sampling error
      • Combining nonprobability and probability samples, use stratification methods so you have resemblance of target population, gives you better sample size for weighting adjustments
      • Because there are so many errors everywhere, even nonprobability samples can be improved
      • Evading calibration is wishful thinking and misleading
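
The weighting idea in these notes can be illustrated with a small sketch. The talk did not say which algorithm was used, so the example below swaps in raking (iterative proportional fitting), one common way to calibrate weights to known margins. The variable names, categories, and target shares are all hypothetical.

```python
def rake(rows, targets, n_iter=50):
    """Rake weights so weighted category shares match the target margins.

    rows:    list of dicts, one per respondent, e.g. {"region": "east", ...}
    targets: {variable: {category: target_share}}, shares sum to 1 per variable
    Assumes every category observed in rows also appears in targets.
    """
    weights = [1.0] * len(rows)
    for _ in range(n_iter):
        for var, shares in targets.items():
            total = sum(weights)
            # Current weighted total for each category of this variable.
            current = {}
            for row, w in zip(rows, weights):
                current[row[var]] = current.get(row[var], 0.0) + w
            # Scale each weight so this variable's margin hits its target.
            weights = [
                w * shares[row[var]] * total / current[row[var]]
                for row, w in zip(rows, weights)
            ]
    return weights


def weighted_share(rows, weights, var, category):
    """Weighted proportion of respondents in one category."""
    hit = sum(w for row, w in zip(rows, weights) if row[var] == category)
    return hit / sum(weights)


# Hypothetical sample that over-represents early adopters (75% vs. a 40% target).
sample = (
    [{"region": "east", "early_adopter": "yes"}] * 3
    + [{"region": "east", "early_adopter": "no"}] * 1
    + [{"region": "west", "early_adopter": "yes"}] * 3
    + [{"region": "west", "early_adopter": "no"}] * 1
)
targets = {
    "region": {"east": 0.5, "west": 0.5},
    "early_adopter": {"yes": 0.4, "no": 0.6},
}
weights = rake(sample, targets)
```

Adding a behavioural margin such as early adoption alongside geography is the kind of extra calibration the notes describe. A real application would also need checks this sketch leaves out, such as weight trimming, empty cells, and a convergence test.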

      Quota Controls in Survey Research: A Test of Accuracy and Inter-source Reliability in Online Samples; Steven H. Gittelman, MKTG, INC.; Randall K. Thomas, GfK Custom Research; Paul J. Lavrakas, Independent Consultant; Victor Lange, Consultant

      • A moment of silence for a probabilistic frame 🙂
      • FoQ 2 – do quota controls help with effectiveness of sample selections, what about propensity weight, matching models
      • 17 panels gave 3000 interviews via three sampling methods each; panels remain anonymous, 2012-2013; plus telephone sample including cell phone; English only; telephone was 23 minutes 
      • A – nested region, sex, age
      • B – added non nested ethnicity quotas
      • C – added non nested education quotas
      • D – companies’ proprietary methods
      • 27 benchmark variables across six government and academic studies; 3 questions were deleted because of social desirability bias
      • Doing more than A did not result in reduction of bias, nested age and sex within region was sufficient; race had no effect and neither did C and those made the method more difficult; but this is overall only and not looking at subsamples
      • None of the proprietary methods provided any improvement to accuracy, on average they weren’t powerful and they were a ton of work with tons of sample
      • ABC were essentially identical; one proprietary method did worse; phone was not all that better
      • Even phone – 33% of differences were statistically significant [makes me think that benchmarks aren’t really gold standard but simply another sample with its own error bars]
      • The proprietary methods weren’t necessarily better than phone
      • [shout out to Reg Baker 🙂 ]
      • Some benchmarks performed better than others, some questions were more of a problem than others. If you’re studying Top 16 you’re in trouble
      • Demo only was better than the advanced models, advanced models were much worse or no better than demo only models
      • An advanced model could be better or worse on any benchmark but you can’t predict which benchmark
      • Advanced models show promise but we don’t know which is best for which topic
      • Need to be careful to not create circular predictions, covariates overly correlated, if you balance a study on bananas you’re going to get bananas
      • Icarus syndrome – covariates too highly correlated
      • It’s okay to test privately but clients need to know what the modeling questions are, you don’t want to end up with weighting models using the study variables
      • [why do we think that gold standard benchmarks have zero errors?]

      Capitalizing on Passive Data in Online Surveys; Tobias B. Konitzer, Stanford University; David Rothschild, Microsoft Research

      • Most of our data is nonprobability to some extent
      • Can use any variable for modeling, demos, survey frequency, time to complete surveys
      • Define target population from these variables, marginal percent is insufficient, this constrains variables to only those where you know that information 
      • Pollfish is embedded in phones, mobile based, has extra data beyond online samples, maybe it’s a different mode, it’s cheaper and faster than face to face and telephone, more flexible than face to face though perhaps less so than online, efficient incentives
      • 14 questions, education, race, age, location, news consumption, news knowledge, income, party ID, also passive data for research purposes – geolocation, apps, device info
      • Geo is more specific than IP address, frequency at that location, can get FIPS information from it which leads to race data, with Lat and long can reduce the number of questions on survey
      • Need to assign demographics based on FIPS data in an appropriate way, modal response wouldn’t work, need to use probabilities, eg if 60% of a FIPS is white, then give the person a 60% chance of being white
      • Use app data to improve group assignments
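
The probabilistic assignment described above can be sketched as follows. This is a minimal illustration, not the speakers' implementation: the area shares are made-up numbers standing in for census data looked up by FIPS code, and the function names are hypothetical.

```python
import random


def modal_assign(shares):
    """Assign everyone in the area to its most common group."""
    return max(shares, key=shares.get)


def probabilistic_assign(shares, rng):
    """Draw a group with probability equal to its share in the area."""
    labels = list(shares)
    return rng.choices(labels, weights=[shares[g] for g in labels], k=1)[0]


# Hypothetical shares for one area; real values would come from census data by FIPS code.
area_shares = {"white": 0.60, "other": 0.40}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 10_000
modal = [modal_assign(area_shares) for _ in range(n)]
drawn = [probabilistic_assign(area_shares, rng) for _ in range(n)]

modal_share_white = modal.count("white") / n  # always 1.0, which overstates the majority
drawn_share_white = drawn.count("white") / n  # close to 0.60 on average
```

The modal rule labels every respondent in the area as the majority group, while the probabilistic rule reproduces the area's mix across many respondents, which is why the notes say the modal response wouldn't work.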

      The impact of questionnaire design on measurements in surveys #1 #ESRA15 #MRX

      Live blogged from #ESRA15 in Reykjavik. Any errors or bad jokes are my own.

      I tried to stay up until midnight last night but ended up going to bed around 10:30pm. Naturally, it was still daylight outside. I woke up this morning at 6am in broad daylight again. I’m pretty sure it never gets dark here no matter what they say. I began my morning routine as usual. Banged my head on the slanted ceiling, stared out the window at the amazing church, made myself waffles in the kitchen, and then walked past the pond teeming with baby ducks. Does it get any better? I think no. Except of course knowing I had another day of great content-rich sessions ahead of me!

      designs and developments of the income measures in the european social surveys

      • tested different income questions. allowed people to use a weekly, monthly, or annual income scale as they wished. there was also no example response, and no example of what constitutes income. Provided about 30 answer options to choose from, shown in three columns. Provided same result as a very specific question in some countries but not others.
      • also tested every country getting the same number breaks, groups weren’t arranged to reflect each country’s distribution. this resulted in some empty breaks [but that’s not necessarily a problem if the other breaks are all well and evenly used]
      • when countries are asked to set up number breaks in well defined deciles, high incomes are chosen more often – affected because people had different ideas of what is and isn’t taxable income
      • [apologies for incomplete notes, i couldn’t quite catch all the details, we did get a “buy the book” comment.]

      item non-response and readability of survey questionnaire

      • any non-substantive outcome – missing values, refusals, don’t knows all count
      • non response can lower validity of survey results
      • semantic complexity measured by familiarity of words, length of words, abstract words that can’t be visualized, structural complexity
      • Measured – characters in an item, length of words, percent of abstract words, percent of lesser known words, percent of long words 12 or more characters
      • used the european social survey which is a highly standardized international survey, compared english and estonian, it is conducted face to face, 350 questions, 2422 uk respondents
      • less known and abstract words create more non-response
      • long words increase nonresponse in estonian but not in english, perhaps because english words are shorter anyways
      • percent of long words in english created more nonresponse
      • total length of an item didn’t affect nonresponse
      • [they used a list of uncommon words for measurement, such a book/list does exist in english. I used it in school to choose a list of swear words that had the same frequency levels as regular words.]
      • [audience comment – some languages join many words together which means their words are longer but then there are fewer words, makes comparisons more difficult]
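
The surface measures listed above are simple enough to compute directly. Here is a rough sketch, assuming a plain-text question and a hypothetical set of uncommon words standing in for the frequency list the researchers used. The abstract-word measure is left out because it needs a rated word list that these notes do not describe.

```python
import re

LONG_WORD_LEN = 12  # the notes define long words as 12 or more characters


def item_complexity(item, uncommon_words=frozenset()):
    """Return simple complexity measures for one survey item.

    uncommon_words is a hypothetical stand-in for a word-frequency list;
    the study's actual list is not reproduced here.
    """
    # Runs of letters in any alphabet; digits and punctuation are dropped,
    # so a contraction such as "don't" counts as two tokens.
    words = re.findall(r"[^\W\d_]+", item.lower())
    n = len(words)
    if n == 0:
        return {"characters": len(item), "words": 0, "mean_word_length": 0.0,
                "pct_long_words": 0.0, "pct_uncommon_words": 0.0}
    n_long = sum(len(w) >= LONG_WORD_LEN for w in words)
    n_uncommon = sum(w in uncommon_words for w in words)
    return {
        "characters": len(item),
        "words": n,
        "mean_word_length": sum(len(w) for w in words) / n,
        "pct_long_words": 100.0 * n_long / n,
        "pct_uncommon_words": 100.0 * n_uncommon / n,
    }


question = "How satisfied are you with the telecommunications infrastructure?"
measures = item_complexity(question, uncommon_words={"infrastructure"})
```

As the audience comment points out, a character-count threshold behaves differently in languages that join words together, so the 12-character cutoff is best treated as a per-language choice.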

      helping respondents provide good answers in web surveys

      • some tasks are inherently difficult in surveys, often because people have to write in an answer, coding is expensive and error prone
      • this study focused on prescription drugs which are difficult to spell, many variations of the same thing, level of detail is unclear, but we have full lists of all these drugs available to us
      • tested text box, drop box to select from list, javascript (type ahead look up)
      • examined breakoff rates, missing data, response times, and codability of responses
      • asked people if they are taking drugs, tell us about three
      • study 1 – breakoffs higher from dropbox and javascript; median response times longer, but codability was better. Lists didn’t work well at all.
      • study 2 – cleaned up the list, made all the capitalization the same. break off rates were now all the same. response times lower but still higher than the textbox version. codability still better for list versions.
      • study 3 – if they couldn’t find a drug in the list, they were allowed to type it out. unlike previous studies which proceeded with the missing data. dropbox had highest missing data. javascript had lowest missing data. median times highest for drop box. trends for more and more drugs as expected, effect is more but not as much more.
      • older browsers had trouble with dropdowns and javascript and had to be routed to the textbox options
      • if goal is to get codable answers, use a text box. if goal is to create skip patterns then javascript is the way to go.
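A minimal sketch of what a type-ahead lookup does, in case it helps picture the javascript condition. This is not the presenters' code: the drug list, the function names `suggest` and `record_answer`, and the matching rule (case-insensitive prefix match) are all assumptions for illustration. It also mimics the study 3 fallback where a respondent who can't find their drug can still type it out.

```python
# Hypothetical drug list; a real survey would load a full, cleaned-up list.
DRUGS = ["Amoxicillin", "Atorvastatin", "Ibuprofen", "Lisinopril", "Metformin"]


def suggest(typed, names=DRUGS, limit=5):
    """Return up to `limit` names that start with what was typed, ignoring case."""
    prefix = typed.strip().lower()
    if not prefix:
        return []
    return [n for n in names if n.lower().startswith(prefix)][:limit]


def record_answer(typed, names=DRUGS):
    """Return (answer, codable).

    An exact match against the list (ignoring case) is treated as codable;
    anything else is kept as free text for manual coding later.
    """
    cleaned = typed.strip()
    for n in names:
        if n.lower() == cleaned.lower():
            return n, True
    return cleaned, False


print(suggest("a"))
print(record_answer("ibuprofen"))
print(record_answer("ibuprophen"))
```

Making the match case-insensitive follows the study 2 note about cleaning up capitalization in the list; a production version would probably also want fuzzy matching for misspellings, which this sketch leaves out.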

      rating scale labelling in web surveys – are numeric labels an advantage

      • you can use all words to label scales or just words on the end with numbers in between
      • research says there is less satisficing with verbal scales, they are more natural than numbers and there is no inherent meaning of numbers
      • means of the scales were different
      • less time to complete for the end labeled groups
      • people paid more attention to the five point labeled scale, and least to the end point labeled score
      • mean opinions did differ by scale, more positive on fully labeled scale
      • high cognitive burden to map responses of the numeric scales
      • lower reliability for the numeric labels

      What they can’t see can hurt you: improving grids for mobile devices by Randall Thomas #CASRO #MRX

      Live blogged in Nashville. Any errors or bad jokes are my own.

      Frances Barlas, Patricia Graham, and Thomas Subias

      – we used to be constrained by an 800 by 600 screen. screen resolution has increased, can now have more detail, more height and width. but now mobile devices mean screen resolution matters again.
      – more than 25% of surveys are being started with a mobile device, fewer are being completed with a mobile device
      – single response questions don’t serve a lot of needs on a survey but they are the easiest on a mobile device. and you have to take the time to consider each one uniquely. then you have to wait to advance to the next question
      – hence we love the efficiency of grids. you can get data almost twice as fast with grids.
      – myth – increasing a scale from 3 points to 11 points will increase your variance. not true. a range adjusted value shows this is not true. you’re just seeing bigger numbers.
      – myth – aggregate estimates are improved by having more items measure the same construct. it’s not the number of items, it’s the number of people. it improves it for a single person, not for the construct overall. think about whether you need to diagnose a person’s illness versus a gender’s purchase of a product [so glad to hear someone talking about this! it’s a huge misconception]
      – grids cause speeding, straightlining, break-offs, lower response rates in subsequent surveys
      – on a mobile device, you can’t see all the columns of a grid. and if you shrink it, you can’t read or click on anything
      – we need to simplify grids and make them more mobile friendly
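A rough illustration of the scale-length myth in the notes above. The presenter's exact calculation wasn't shown, so this assumes "range adjusted" means rescaling each rating to run from 0 to 1 before computing the variance. The respondents are made up (1,000 evenly spread latent opinions), and the helper names `to_scale` and `range_adjusted` are hypothetical.

```python
from statistics import pvariance


def to_scale(latent, points):
    """Map a latent opinion in [0, 1] onto a 1..points rating scale."""
    return min(int(latent * points) + 1, points)


def range_adjusted(ratings, points):
    """Rescale 1..points ratings to 0..1 so different scale lengths are comparable."""
    return [(r - 1) / (points - 1) for r in ratings]


# Made-up respondents: 1,000 evenly spread latent opinions.
latent = [i / 999 for i in range(1000)]

results = {}
for points in (3, 11):
    ratings = [to_scale(x, points) for x in latent]
    raw = pvariance(ratings)
    adjusted = pvariance(range_adjusted(ratings, points))
    results[points] = (raw, adjusted)
    print(f"{points}-point scale: raw variance {raw:.2f}, range-adjusted {adjusted:.3f}")
```

With these made-up opinions, the raw variance comes out much larger on the 11-point scale simply because the numbers are bigger, while the range-adjusted variance does not grow. Real response data would behave less tidily, so treat this as a picture of the argument rather than evidence for it.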

      – in a study, they randomly assigned people to use a device that they already owned [assuming people did as they were told, which we know they won’t 🙂 ]
      – only half of completes came in on the assigned device. a percentage answered on all three devices.
      – tested items in a grid, items one by one in a grid, and an odd one which is items in a list with one scale on the side
      – traditional method was the quickest
      – no differences on means
      [more to this presentation but i had to break off. ask for the paper 🙂 ]

      Survey advice from my trip to Kensington Palace #MRX

      While in London to give a talk at the IJMR Research Methods Forum, I managed to take in a few of the local historic sites. Having already seen Buckingham Palace, the next palace on my must-see list was Kensington Palace. I even bought my ticket ahead of time on the internet. I made sure to arrive at the Palace early so I’d have plenty of time to see every little thing. The signs said that the Palace had just finished a major renovation so I was darn excited! Until I got inside.

      Stairwell at Kensington Palace
      Yes, the palace had undergone a renovation. I’d have to call the result the Modern Craft Style. Paper cut-outs hung from the ceiling, small movies were being shown on the white painted walls, and drywall had been put up to create brand new hallways. Given that I was expecting gold gilded crown molding, centuries old wall paintings, and original three hundred year old well-worn furniture, to say that I was disappointed was a major understatement. You can see the most exciting rooms in the photos here. I imagined filling out one of those restaurant review cards and having to check the “Strongly Disagree” box five times.

      Inside Kensington Palace
      I guessed that one right! On my way out of the building, I was asked to answer a survey. I begrudgingly agreed and was handed a clipboard and a pen. Would you believe it? For a building that took me 45 minutes to wander through, for I went slowly as I tried to get my money’s worth, I was asked to fill out a 3 page, double-sided survey full of grids and self-skip patterns. I always answer surveys as honestly as I can and so I went through and checked “disagree” to most of the questions (it was unfortunately not designed to discourage straight-lining). It was so terribly depressing to have to give negative answers to so many questions.

      Even worse, though, was the fact that when I tried to slip my completed survey back onto the desk, the oh so kind lady picked it right up and blocked my exit. She explained to me that she just wanted to make sure I had answered the survey correctly. She carefully reviewed all the answers to make sure I had followed the skip patterns correctly. I was so embarrassed by my negative answers about her beloved palace that I wanted to run out of the room as fast as I could. But nope, I stood there in shame until she dismissed me. She made no hints that she approved or disapproved of my answers, but that didn’t matter. I was completely embarrassed. If I had known she was going to review my answers, I probably wouldn’t have given such negative answers.

      What did I learn from this?

      1. If you’re going to give people a survey to answer on their own, let them answer it completely on their own from beginning to end.
      2. If you’re going to review their answers in front of them, tell them that up front. And know that the answers they give probably won’t reflect reality.

      PS Don’t pay to visit Kensington Palace

      The Ultimate Guide to Writing Surveys From a Social Media Guru #MRX

      I have the most fortunate experience of having both a solid survey design background and expert social media research knowledge. It gives me uncanny insight into the world of current and culturally appropriate language thereby ensuring the highest level of clarity and understanding by survey audiences. I will now bestow this expert knowledge upon you, my most fortunate reader.

      Below you’ll find the old traditional way of writing surveys as well as the new and improved way of writing surveys. Try the new way. You know you’ll like it.

      Instead of saying: Do you plan to window shop today?
      Try this instead: U bouta winnow shp in a minn?

      Instead of saying: How likely are you to purchase running shoes?
      Try this instead: U gon cop kicks?

      Instead of saying: On a scale from 1 to 5, where 1 is Hate and 5 is Love, what is your opinion about Chuck Taylor running shoes?
      Try this instead: U tink dis chucks r bomb r whack?

      Instead of saying: In your opinion, which of the following words best describes the comfort of Chuck Taylor running shoes? A) Above Average B) Average C) Below Average
      Try this instead: Wich werds chucks? A) gud af B) meh C) lyk ass

      Instead of saying: In your opinion, which one of the following actresses would be best suited to promote Chuck Taylor running shoes?
      Try this instead: So like, wich 1 U gon tap?

      Instead of saying: And, which one of the following celebrities is most likely to maintain a long-term promotional commitment to the Chuck Taylor brand?
      Try this instead: Who clutchest?

      Instead of saying: Thank you for your contribution. We appreciate your time and effort.
      Try this instead: tx

      Best of #Esomar Canada: Jon Puleston Games a Better Survey #MRX

      This is one of several (almost) live blogs from the ESOMAR Best of Canada event on May 9, 2012. Any errors, omissions, and ridiculous side comments are my own.

      Jon Puleston, VP GMI

      Creative survey design and gamification

      • We’ve taken a journey of massive pages of boring hard to read text to simple clear phrasing in 140 characters
      • Now we need to do the same for surveys and that doesn’t mean just throwing a thumbs up picture on your survey. (Massive giggling in my head. I know that tactic extremely well!)
      • Survey writers are competing against Twitter and Farmville and that’s why we see a horrible decline in completion rates
      • We need to see surveys as a creative art form but we are still at the 1980’s whacky clip art and funky page transition stage
      • We need to let respondents give feedback in the way they want
      • A typical survey starts with a terribly boring long question of precisely what we want to know. We forget just how important foreplay is (ah yes, joke intended)
      • Think about survey design as a television interview with a celebrity. Would you ask George Clooney “On a scale from 1 to 10, how much did you enjoy making that last film?”
      • A first bad question means respondents will have no respect for the rest of the questions.
      • Imagery is very important and yet we just slap on a really dumb smiley face or thumbs up. Why haven’t you used a design firm to do this? It’s not about making it look nice. It’s about communicating effectively.
      • Must grids dominate every survey? They don’t capture the imagination of respondents.
      • The average person spends 4.3 seconds considering their answer to a question. For a grid, it drops down to 1.7 seconds.
      • We are researchers and yet we don’t use research to design better surveys. We rarely pilot surveys any more even though it’s extremely easy to pilot an online survey. You can get 15 completes and make tweaks all within an hour.
      • Now for a game. Jon took two volunteers who weren’t as familiar with gamification.
      • First woman was asked a series of ordinary questions. Tell me about Toronto. What’s your favourite meal. What’s in your fridge. What would you do with a brick. Jon responded with “um”, “ok”, and stared at his notes most of the time. Then he complained that she was too creative and ruined his demonstration. (Yup, that’s how live demos ALWAYS work. 🙂 )
      • Second person was then brought into the room and asked about his family, to write a postcard home to his family, to imagine he’s been convicted and is on death row and has to describe his last meal, to name his favourite foods within two minutes and anything not named can never be eaten again, who can think of more uses of a brick.
      • Game conclusion: you can get the same kind of data, but more of it and in more detail if you are creative.
      • The approach you use to ask a question can enthuse and excite people to give answers.
      • Book on gamification “Reality is broken”
      • Check out website gamification.org
      • What defines a game: Anything that is thinking and that we do for fun. Games have rules, skill and effort.
      • Gardening is a popular game for older age groups. There are rules for planting seeds, skill and effort to get a wonderful garden. (heeeey, was I just called old?)
      • Twitter is a game that tries to get you to say your thought in a short space and then get the message out
      • Email is a game: do you get the feedback you wanted
      • Surveys are games…… just rather dull ones
      • In a face to face interview, people can’t just turn their backs and walk away. We’re still working on that heritage.
      • Don’t ask for my favorite colours – ask what colour you’d paint your room
      • Don’t ask what you want to wear – ask what you’d wear if you were going on television
      • Don’t ask what do you think of this product – ask what would you value this company at
      • Describe yourself vs describe yourself in exactly 7 words. It’s liberating and you’ll actually get more information.
      • How much do you like these music artists – If you owned a radio station, which of these artists would you put on your play list.
      • Add in a competitive element. what brand comes to mind – how many brands can you guess in two minutes.
      • Rewards. Give them points for answering questions correctly. Stake some ‘money’ on what brand reflects this logo. These allow people to be more circumspect instead of overly positive about everything. Makes it easier to give a negative answer.
      • The best games encourage us to think and we like thinking.
      • But remember, games can affect data. Greater thought and consideration changes answers. Point scoring can steer data badly as people try to cheat the system. (This is the same with any research. Once you improve the research, the historical norms, whether they are correct or incorrect, are no longer applicable. But… if the survey was horribly boring, just how valid were those results? Be honest with yourself!)

      Would you prefer to kill someone over helping families? #MRX

      So what’s your answer? I’m going to take a wild stab in the dark and assume that 99.9% of people would say no to that question.  There is only one way to answer it without making yourself feel like a horrible person. This is what makes it a leading question.

      Now have a look at this survey that I received in the mail from Mr Bob Rae, my member of parliament.

      Continue reading →
