Tag Archives: ESRA

Representativeness of surveys using internet-based data collection #ESRA15 #MRX

By LoveStats on July 17, 2015

Live blogged from #ESRA15 in Reykjavik. Any errors or bad jokes are my own.

Yup, it’s sunny outside. And now I’m back inside for the next session. Fortunately, or unfortunately, this session is once again in a below ground room with no windows so I will not be basking in sunlight nor gazing longingly out the window. I guess I’ll be paying full attention to another really great topic.

conditional vs unconditional incentives: comparing the effect on sample composition in the recruitment of the German Internet Panel study (GIP)

unconditional incentives tend to perform better than promised incentives
include $5 with advance letter compared to promised $10 with thank you letter; assuming 50% response rate, cost of both groups is the same
consider nonresponse bias, consider sample demo distribution
unconditional incentive had 51% response rate, conditional incentive had 42% response rate
didn’t see a nonresponse bias [by demographics I assume, so many speakers are talking about important effects but not specifically saying what those effects are]
as a trend, the two sets of data provide very similar research results, yes differences in means but always fairly close together, confidence intervals always overlap
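[The cost claim above is easy to check with a little arithmetic. This is only my own sketch: the invite count of 1,000 per arm and the function names are made up for illustration, and only the dollar amounts and response rates come from the talk.]

```python
def incentive_cost(sample_size, amount, response_rate, conditional):
    """Expected incentive spend for one recruitment arm.

    Unconditional: every invited person gets the money up front.
    Conditional: only people who respond get paid.
    """
    if conditional:
        return sample_size * response_rate * amount
    return sample_size * amount


def cost_per_respondent(sample_size, amount, response_rate, conditional):
    """Incentive spend divided by the expected number of respondents."""
    respondents = sample_size * response_rate
    return incentive_cost(sample_size, amount, response_rate, conditional) / respondents


n = 1000  # hypothetical number of invited people per arm

# Planning assumption from the talk: 50% response in both arms
planned_unconditional = incentive_cost(n, 5, 0.50, conditional=False)
planned_conditional = incentive_cost(n, 10, 0.50, conditional=True)

# Response rates reported in the talk: 51% and 42%
observed_unconditional = incentive_cost(n, 5, 0.51, conditional=False)
observed_conditional = incentive_cost(n, 10, 0.42, conditional=True)

print(planned_unconditional, planned_conditional)
print(round(observed_unconditional), round(observed_conditional))
print(round(cost_per_respondent(n, 5, 0.51, False), 2),
      round(cost_per_respondent(n, 10, 0.42, True), 2))
```

[At the planned 50% response rate both arms cost the same in total. At the reported rates the conditional arm spends less overall but also brings in fewer respondents, so the comparison that matters is probably cost per completed recruit.]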

https://twitter.com/ialstoop/status/622001573481312256

evolution of representativeness in an online probability panel

LISS panel – probability panel, includes households without internet access, 30 minutes per month, paid for every completed questionnaire
is there systematic attrition, are core questionnaires affected by attrition
normally sociodemographics only which is restrictive
missing data imputed using MICE (multiple imputation by chained equations)
strongest loss in panel of sociodemographic properties
there are seasonal drops in attrition, for instance in June which is lots of holidays
has more effects for survey attitudes and health traits, less so for political and personality traits which are quite stable even with attrition
try to decrease attrition through refreshment based on targets

https://twitter.com/ialstoop/status/622004420314812417

moderators of survey representativeness – a meta analysis

measured single mode vs multimode surveys
R-indicators – single measure from 0 to 1 for sample representativeness, based on logistic regression models for response propensity
hypothesize mixed mode surveys are more representative than single mode surveys
hypothesize cross-sectional surveys are more representative than longitudinal surveys
heterogeneity not really explained by moderators
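[For anyone who hasn't met R-indicators: as I understand the usual definition, R = 1 - 2 x the standard deviation of the estimated response propensities. Below is a rough sketch under that assumption. The propensities are invented; in practice they would come from the logistic regression mentioned above, and the published estimator uses a design-weighted standard deviation rather than the plain population one used here.]

```python
from statistics import pstdev


def r_indicator(propensities):
    """R = 1 - 2 * S(rho), with S the spread of response propensities.

    1.0 means everyone is equally likely to respond (fully representative
    response); values near 0 mean propensities vary as much as they can.
    """
    if not propensities:
        raise ValueError("need at least one response propensity")
    return 1 - 2 * pstdev(propensities)


# Hypothetical estimated propensities for six sample members
even_response = [0.50, 0.52, 0.48, 0.51, 0.49, 0.50]
uneven_response = [0.10, 0.90, 0.20, 0.80, 0.15, 0.85]

print(round(r_indicator(even_response), 3))
print(round(r_indicator(uneven_response), 3))
```

[The more the propensities differ across people, the lower the score, which is the sense in which it is a single number for representativeness.]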

setting up a probability based web panel. lessons learned from the ELIPSS pilot study

online panel in france, 1000 people, monthly questionnaires, internet access given to each member [we often wonder about the effect of people being on panels since they get used to and learn how to answer surveys, have we forgotten this happens in probability panels too? especially when they are often very small panels]
used different contact modes including letters, phone, face to face
underrepresented on youngest, elderly, less educated, offline people
reason for participating in order – trust in ELIPSS 46%, originality of project 37%, interested in research 32%, free internet access 13%
16% attrition after 30 months (that’s amazing, really low and really good!), response rate generally above 80%
automated process – invites on Thursday, systematic reminders, by text message, app message and email
individual followups by phone calls and letters [wow. well that’s how they get a high response rate]
individual followups are highly effective [i’d call them stalking and invasive but that’s just me. i guess when you accept free 4g internet and a tablet, you are asking for that invasiveness]
age becomes less representative over time, employment status changes a lot, education changes the most but of course young people gain more education over time
need to give feedback to panel members as they keep asking for it
want to broaden use of panel to scientific community by expanding panel to 3500 people

https://twitter.com/nicolasbecuwe/status/622009359082647552

https://twitter.com/ialstoop/status/622011086783557632

the pretest of wave 2 of the german health interview and examination survey for children and adolescents as a mixed mode survey, composition of participant groups

mixed mode helps to maintain high response, web is preferred by younger people, representativeness could be increased by using multiple modes
compared sequential and simultaneous surveys
single mode has highest response rate, mixed mode simultaneous was extremely close behind, mixed mode multi-step had the lowest rate
paper always gave back the highest proportion of data even when people had the choice of both, 11% to 43% chose the paper among 3 groups
sample composition was the same among all four groups, all confidence intervals overlap – age, gender, nationality, immigration, education
metaanalysis – overall trend is the same
4% lower response rate in mixed mode – additional mode creates cognitive burden, creates a break in response process, higher breakoffs
mixed mode doesn’t improve sample composition nor response rates [that is, giving people multiple options as opposed to just one option, as opposed to multiple groups whereby each group only knows about one mode of participation.]
current study is now a single mode study

12 years of mixed-modes research in the ESS, in 15 minutes #esra15

— Jon Burton jonburton@bsky.social (@jburton123) July 17, 2015

https://twitter.com/oparnet/status/622015032231075840

Posted in: marketing research | Tagged: conference, ESRA, nonprobability, probability, representative

Sample composition in online studies #ESRA15 #MRX

By LoveStats on July 17, 2015

Live blogged at #ESRA15 in Reykjavik. Any errors or bad jokes are my own.

I’ve been pulling out every ounce of bravery I have here in Iceland and I went to the pool again last night (see previous posts on public nakedness!). I could have also broken my rule about not traveling after dark in strange cities but since it never gets dark here, I didn’t have to worry about that! The pool was much busier this time. I guess kiddies are more likely to be out and about after dinner on a weekday rather than Sunday morning at 9am. All it meant is that I had a lot more people watching to do. All in all good fun to see little babies and toddlers enjoying a good splash and float!

This morning, the sun was very much up and the clouds very much gone. I’ll be dreaming of breaktime all morning! Until then however, I’ve got five sessions on sample composition in online surveys, and representativeness of online studies to pay attention to. It’s going to be tough but a morning chock full of learning will get me a reward of more pool time!

what is the gain in a probability based online panel to provide internet access to sampling units that did not have access before

Germany has GIP, France has ELIPSS, Netherlands has LISS as probability panels
weighting might not be enough to account for bias of people who do not have internet access
but representativeness is still a problem because people may not want to participate even if they are given access, recruitment rates are much lower among non-internet households
probability panels still have problems, you won’t answer every survey you are sent, attrition
do we lose much without a representative panel? is it worth the extra cost
in the ELIPSS panel, everyone is provided a tablet, not just people without access. the 3G tablet is the incentive you get to keep as long as you are on the panel. so everyone uses the same device to participate in the research
what does it mean to not have Internet access – used to be computer + modem. Now there are internet cafes, free wifi is everywhere. hard to define someone as no internet access now. We mean access to complete a survey so tiny smartphones don’t count.
14.5% of adults in france were classified as not having internet. turned out to be 76 people in the end which is a bit small for analytics purposes. But 31 of them still connected every day.
non-internet access people always participated less than people who did have internet.
people without internet always differ on demographics [proof is chi-square, can’t see data]
populations are closer on nationality, being in a relationship, and education – including non-internet helps with these variables, improves representativity
access does not equal usage does not equal using it to answer surveys
maybe consider a probability based panel without providing access to people who don’t have computer/tablet/home access

Melanie Revilla: internet access isn't the same as internet use, & internet use isn't the same as using internet for surveys. #esra15

— Gerry Nicolaas (@GerryNicolaas) July 17, 2015

parallel phone and web-based interviews: comparability and validity

phones are relied on for research and assumed to be good enough for representativeness, however most people don’t answer phone calls when they don’t recognize the number, can’t use an autodialler in the USA for research
online surveys can generate better quality due to programming validation and ability to only be able to choose allowable answers
phone and online have differences in presentation mode, presence of human interviewer, can read and reread responses if you wish, social desirability and self-presentation issues – why should online and offline be the same
caution about combining data from different modes should be exercised [actually, i would want to combine everything i possibly can. more people contributing in more modes seems to be more representative than excluding people because they aren’t identical]
how different is online nonprobability from telephone probability [and for me, a true probability panel cannot technically exist. it’s theoretically possible but practically impossible]
harris did many years of these studies side by side using very specific methodologies
measured variety of topics – opinions of nurses, big business trust, happiness with health, ratings of president
across all questions, average correlation between methods was .92 for unweighted means and .893 for weighted means – more bias with weighted version
is it better for scales with many response categories – correlations go up to .95
online means of attitudinal items were on average 0.05 lower on scale from 0 to 1. online was systematically biased lower
correlations in many areas were consistently extremely high, means were consistently very slightly lower for online data; also nearly identical rank order of items
for political polling, the two methods were again massively similar, highly comparable results; mean values were generally very slightly lower – thought to be ability to see the scale online as well as social desirability in telephone method, positivity bias especially for items that are good/bad as opposed to importance
[wow, given this is a study over ten years of results, it really calls into question whether probability samples are worth the time and effort]
[audience member said most differences were due to the presence of the interviewer and nothing to do with the mode, the online version was found to be truer]
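[The two summary numbers in this talk, the correlation between methods and the average gap in means, are simple to compute once you have item means from both modes. Here is a small sketch of my own. The item means are invented for illustration and are not the Harris data, and the function names are mine.]

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of item means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, 2+ items each")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mean_difference(online, phone):
    """Average of (online - phone) across items; negative = online is lower."""
    return mean(o - p for o, p in zip(online, phone))


# Hypothetical item means rescaled to 0..1, one entry per question
phone_means = [0.62, 0.48, 0.71, 0.35, 0.55]
online_means = [0.58, 0.44, 0.64, 0.31, 0.50]

print(round(pearson_r(phone_means, online_means), 3))
print(round(mean_difference(online_means, phone_means), 3))
```

[A high correlation with a small negative mean difference is the pattern described above: the two modes rank the items almost identically even though online sits slightly lower on the scale.]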

Randall Thomas at #esra15 shows means across non-prob and prob. based polls are similar. To me, only option if you don't need precision.

— Peter Lugtig – now also on Bluesky (@PeterLugtig) July 17, 2015

representative web survey

only a sample without bias can generalize, the correct answer should be just as often a little bit higher or a little bit lower than reality
in their sample, they underrepresented 18-34, elementary school education, lowest and highest income people
[yes, there are demographic differences in panels compared to census and that is dependent completely on your recruitment method. the issue is how you deal with those differences]
online panel showed a socially positive picture of population
can you correct bias through targeted sampling and weighting, ethnicity and employment are still biased but income is better [that’s why invites based on returns not outgo are better]
need to select on more than gender, age, and region
[i love how some speakers still have non-english sections in their presentation – parts they forgot to translate or that weren’t translatable. now THIS is learning from peers around the world!]

Peter Linde quotes Chinese proverb wrt repr web panels: "it doesn't matter what color a cat is as long as it catches nice" but… #esra15

— Gerry Nicolaas (@GerryNicolaas) July 17, 2015

https://twitter.com/gerrynicolaas/status/621985749022449664

measuring subjective wellbeing: does the use of websurveys bias the results? evidence from the 2013 GEM data from luxembourg

almost everyone is completely reachable by internet
web surveys are cool – convenient for respondents, less social desirability bias, can use multimedia, less expensive, fewer coding errors; but there are sampling issues and bias from the mode
measures of subjective well being – i am satisfied with my life, i have obtained all the important things i want in my life, the conditions of my life are excellent, my life is close to my ideal [all positive keyed]
online survey gave very slightly lower satisfaction
the result is robust to three econometric techniques
results from happiness equations using differing modes are compatible
web surveys are reliable for collecting information on wellbeing

Posted in: marketing research | Tagged: conference, data quality, ESRA, nonprobability, representative, sampling

Assessing and addressing measurement equivalence in cross-cultural surveys #ESRA15 #MRX

By LoveStats on July 16, 2015

Live blogged from #ESRA15 in Reykjavik. Any errors or bad jokes are my own.

Today’s lunch included vanilla Skyr. Made with actual vanilla beans. Beat that, yoghurt of home! Once again, I cannot choose a favourite among coconut, pear, banana, and vanilla other than to say it completely beats yoghurt. I even have a favourite brand although since I don’t have the container in front of me right now, I can’t tell you the brand. It still counts very much as brand loyalty though because I know exactly what the container looks like once I get in the store.

I have to say I remain really impressed with the sessions. They are very detail oriented and most people provide sufficient data for me to judge for myself whether I agree with their conclusions. There’s no grandstanding, essentially no sales pitches, and I am getting take-aways in one form or another from nearly every paper. I’m feeling a lot less presentation pressure here simply because it doesn’t seem competitive. If you’ve never been to an ESRA conference, I highly recommend it. Just be prepared to pack your own lunch every day. And that works just great for me.

cross cultural equivalence of survey response latencies

how long does it take for a respondent to provide their answer, easy to capture with computer assisted interviewing, uninfluenced by self reports
longer latencies seem to represent more processing time for cognitive operations, also represents presence and accessibility of attitudes and strength of those attitudes
longer latencies correlated with age, alcohol use, and poorly designed and ambiguous questions, perhaps there is a relationship with ethnic status
does latency differ by race/ethnicity; do they vary by language of interview
n=600 laboratory interview, 4 race groups, 300 questions taking 77 minutes all about health, order of sections rotated
required interviewer to hit a button when they stopped talking and hit a button when the respondent started talking; also recorded whether there were interruptions in the response process; only looked at perfect responses [which are abnormal, right?]
reviewed all types of question – dichotomous, categorical, bipolar scales, etc
Hispanic, black, Korean indeed took longer to answer compared to white people on the English survey in the USA
more educated took slightly less time to answer
numeric responses took much longer, yes/no took the least, uni-polar was second least
trend was about the same by ethnicity
language was an important indicator

comparing survey data quality from native and nonnative English speakers

me!
conclusion – using all of our standard data quality measures may eliminate people based on their language skills not on their data quality skills. But, certain data quality measures are more likely to predict language rather than data quality. We should focus more on straightlining and overclicking and ignore underclicking as a major error.
ask me for the paper 🙂
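[In case straightlining is a new term: it means giving the same answer to every item in a grid. Here is a minimal sketch of how such a flag might be computed. This is an illustration only, not the method from the paper; the function names and the three-item minimum are my own assumptions.]

```python
def is_straightliner(grid_answers, min_items=3):
    """True when every answered item in a grid got the same response.

    Skipped items are passed as None and ignored. Grids with fewer than
    min_items answered items are never flagged (hypothetical threshold).
    """
    answered = [a for a in grid_answers if a is not None]
    if len(answered) < min_items:
        return False
    return len(set(answered)) == 1


def straightlining_rate(respondents):
    """Share of respondents flagged as straightliners on one grid."""
    flags = [is_straightliner(answers) for answers in respondents]
    return sum(flags) / len(flags) if flags else 0.0


# Hypothetical answers to a five-item grid on a 1 to 5 scale
grid = [
    [3, 3, 3, 3, 3],     # same answer everywhere: flagged
    [4, 2, 5, 3, 4],     # varied answers: not flagged
    [2, 2, None, 2, 2],  # one skipped item, the rest identical: flagged
    [1, 2, 1, 2, 1],     # alternating answers: not flagged
]

print([is_straightliner(answers) for answers in grid])
print(straightlining_rate(grid))
```

[A flag like this only looks at the pattern of clicks, not at how well someone reads English, which is part of why it seems a fairer quality check.]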

trust in physicians or trust in physician – testing measurement invariance of trust in physicians in different health care cultures

trust reduces social complexity, solves problems of risk, makes interactions possible
we lack knowledge of various professions – lawyers, doctors, etc, we don’t understand diagnosis, treatments
we must rely on certificates, clothes such as a doctor’s white coat, location such as a hospital
is there generalized trust in doctors
different health care systems produce different kinds of trust, ditto cultural contexts, political and values systems
compared three countries with health care coverage and similar doctors per person measurements
[sorry, didn’t get the main conclusion from the statement “results were significant”]

Posted in: marketing research | Tagged: conference, data quality, ESRA

Advancements of survey design in election polls and surveys #ESRA15 #MRX

By LoveStats on July 16, 2015

Live blogged from #ESRA15 in Reykjavik. Any errors or bad jokes are my own.

I decided to take the plunge and choose a session in a different building this time. The bravery isn’t much to be noted as I’ve realized that the campus and buildings and rooms at the University of Iceland are far tinier than what I am used to. Where I’d expect neighboring buildings to be a ten minute walk from one end to the other, here it is a 30 second walk. It must be fabulous to attend this university where everything and everyone is so close!

I’m quite loving the facilities. For the most part, the chairs are comfortable. Where it looks like you just have a chair, there is usually a table hiding in the seat in front of you. There is instantly connecting and always on wifi no matter which building you’re in. There are computers in the hallways, and multiple plugs at all the very comfy public seating areas. They make it very easy to be a student here! Perhaps I need another degree?

Designing effective likely voter models in pre-election surveys

voter intention and turnout can be extremely different. 80% say they will vote but 10% to 50% is often the number that actually votes
democratic vote share is often over represented [social desirability?]
education has a lot of error – 5% error rate, worst demographic variable
what voter model reduces these inaccuracies
behavioural models (intent to vote, have you voted, dichotomous variables) and resource based models (
vote intention does predict turnout – 86% are accurate, also reduces demographic errors
there’s not a lot of room to improve except when the polls look really close
Gallup tested a two item measure of voting intention – how much have you thought about this election, how likely are you to vote
2 item scale performed far better than the 7 item scale, error rate of 4% vs 1.4%
[just shown a histogram with four bars. all four bars look essentially the same. zero attempt to create a non-existent difference. THAT’S how you use a chart 🙂 ]
Gallup approach didn’t work well, probability approach performed better
best measure of voting intention = Thought about election + likelihood of voting + education + voted before + strength of partisan identity

polls on national independence: the scottish case in a comparative perspective

[Claire Durand from the University of Montreal speaks now. Go Canada! 🙂 ]
what happened in Quebec in 1995? referendum on independence
Quebec and Scotland are nationalist in a British type system, proportion of non-nationals is similar
referendums are won with 50% + 1
but polls have many errors, is there an anti-incumbent effect
“no” is always underestimated – whatever the no is
are referendums on national independence different – ethnic divide, feeling of exclusion, emotional debate, ideological divide
No side has to bring together enemies and doesn’t have a unified strategy
how do you assign non-disclosure?
don’t know doesn’t always mean don’t know
don’t distribute non-disclosures proportionally, they aren’t random
asking how people would vote TODAY resulted in 5 points less nondisclosure
corrections need to be applied after the referendum as well
people may agree with the general demands of the national parties but not with the solution they propose. maintaining the threat allows them to maintain pressure for change.
the Quebec newspapers reported the raw data plus the proportional response so people could judge for themselves
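[To make the point about non-disclosers concrete, here is a small sketch comparing proportional allocation with an allocation that leans toward "no". The poll numbers and the two-thirds share are invented for illustration and are not figures from the talk; the function name is mine.]

```python
def allocate_nondisclosers(yes, no, undisclosed, share_to_no=None):
    """Distribute non-disclosers between yes and no.

    With share_to_no=None the split is proportional to the decided vote,
    i.e. it assumes non-disclosers look just like everyone else.
    Otherwise share_to_no is the fraction of non-disclosers given to "no".
    Returns the adjusted (yes, no) percentages.
    """
    decided = yes + no
    if decided <= 0:
        raise ValueError("need at least some decided respondents")
    if share_to_no is None:
        share_to_no = no / decided
    if not 0 <= share_to_no <= 1:
        raise ValueError("share_to_no must be between 0 and 1")
    return (yes + undisclosed * (1 - share_to_no),
            no + undisclosed * share_to_no)


# Hypothetical raw poll: 44% yes, 41% no, 15% won't say or don't know
raw_yes, raw_no, raw_undisclosed = 44, 41, 15

proportional = allocate_nondisclosers(raw_yes, raw_no, raw_undisclosed)
leaning_no = allocate_nondisclosers(raw_yes, raw_no, raw_undisclosed,
                                    share_to_no=2 / 3)

print([round(v, 1) for v in proportional])
print([round(v, 1) for v in leaning_no])
```

[With these made-up numbers the proportional split keeps "yes" ahead, while giving two thirds of the non-disclosers to "no" flips the lead. That is why publishing the raw data alongside the allocated figures, as the Quebec newspapers did, lets readers judge for themselves.]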

Key learnings #esra15 The no vote to referenda is often underestimated @TNS_opinion @clairedurand

— Becuwe Nicolas (@NicolasBecuwe) July 16, 2015

Scotland situation: 'People may agree with the demands of the nationalist parties but not with the solution they propose' @tnsbmrb #esra15

— Becuwe Nicolas (@NicolasBecuwe) July 16, 2015

how good are surveys at measuring past electoral behaviour? lessons from an experiment in a french online panel study

study bias in individual vote recall
sample size of 6000
over-reporting of popular party, under-reporting of less popular party
30% of voter recall was inconsistent
inconsistent respondents change their recall, changed parties, memory problems, concealing problems, said they didn’t vote, said you vote and then said you didn’t or vice versa
could be any number of interviewer issues
older people found it more difficult to remember but perhaps they have more voter loyalty
when available, use vote recall from pre-election survey
using vote recall from post-election surveys underestimates voter transfers
caution in using vote recall to weight samples

Predicting the outcome of referenda #esra15 by @clairedurand – in principle should be easy but in reality much more complex @TNS_PS

— Becuwe Nicolas (@NicolasBecuwe) July 16, 2015

methodological issues in measuring vote recall – an analysis of the individual consistency of vote recall in two election longitudinal surveys

popularity = weighted average % of electorate represented
universality = weighted frequency of representing a majority
used four versions of non/weighting including google hits
measured 38 questions related to political issues
voters are driven by political tradition even if outdated, or by personal images of politicians not based on party manifestos
voters are irrational, political landscape has shifted even though people see the parties the same way they were decades ago
coalition formation aggravates the situation even more
discrepancy between the electorate and the government elected

Posted in: marketing research | Tagged: conference, ESRA, polling, survey design

The impact of questionnaire design on measurements in surveys #4 #ESRA15 #MRX

By LoveStats on July 16, 2015

Live blogged from #ESRA15 in Reykjavik. Any errors or bad jokes are my own.

Well, last night I managed to stay up until midnight. The lights at the church went on, lighting up the tower and the very top in an unusual way. They were quite pretty! The rest of the town enjoyed mood lighting as it didn’t really get dark at all. Tourists were still wandering in the streets since there’s no point going to bed in a delightful foreign city if you can still see where you’re going. And if you weren’t a fan of the mood lighting, have no fear! The sun ‘rose’ again just four hours later. If you’re scared of the dark, this is a great place to be – in summer!

Today’s program for me includes yet another session on question data quality, polling question design, and my second presentation on how non-native English speakers respond to English surveys. We may like to think that everyone answering our surveys is perfectly fluent but let’s be realistic. About 10% of Americans have difficulty reading/writing in English because it is not their native language. Add to that weakly and non-literate people, and there’s potential big trouble at hand.

the impact of answer format and item order on the quality of measurement

compared 2 point scale and 11 point scale, different order of questions and question can even be very widely apart, looked at perceived prestige of occupations
separated two pages of the surveys with a music game of guessing the artist and song, purely as distraction from the survey. the second page was the same questions in a completely different order, did the same thing numerous times changing the number of response options and question orders each time. whole experiment lasted one hour
assumed scale was uni-dimensional
no differences comparing 4 point to 9 point scale, none between 2 point and 9 point scale [so STOP USING HUGE SCALES!!!]
prestige does not change depending on order in the survey [but this is to be expected with non-emotional, non-socially desirable items]
respondents confessed they tried to answer well but maybe not the best of their ability or maybe their answers would change the next time [glad to see people know their answers aren’t perfect. and i wouldn’t expect anything different. why SHOULD they put 100% effort into a silly task with no legitimate outcome for them.]

measuring attitudes towards immigration with direct questions – can we compare 4 answer categories with dichotomous responses

when sensitive questions are asked, social desirability affects response distributions
different groups are affected in different ways
asked questions about racial immigration – asked binary or as a 4 point scale
it’s not always clear that slightly is closer to none or that moderately is closer to strongly. can’t just assume the bottom two boxes are the same or the top two boxes are the same
education does have an effect, as well as age in some cases
expression of opposition for immigration depends on the response scale
binary responses leads to 30 to 50% more “allow none” responses than the 4 point scale
respondents with lower education have lower probability to choose middle scale point

cross cultural differences in the impact of number of response categories on response behaviour and data structure of a short scale for locus of control

locus of control scale, 4 items, 2 internal, 2 external
tested 5 point vs 9 point scale
do the means differ, does the factor structure differ
I’m my own boss; if i work hard, i’ll succeed; when at work or in my private life what I do is mainly determined by others; bad luck often gets in the way of my plans
labeled doesn’t apply at all, applies completely
didn’t see important demographic differences
saw one interaction but it didn’t really make sense [especially given sample size of 250 and lots of other tests happening]
[lots of chatter about significance and non-significance but little discussion of what that meant in real words]
there was no effect of item order, # of answer options mattered for external locus but not internal locus of control
[i’d say hard to draw any conclusions given the tiny number of items, small sample size. desperately needs a lot of replication]

the optimal number of categories in item specific scales

type of rating scale where the answer is specific to the scale and doesn’t necessarily apply to every other item – what is your health? excellent, good, poor
quality increased with the number of answer options comparing 11,7,5,3 point scales but not comparing 10,6,4 point scales
[not sure what quality means in this case, other audience members didn’t know either, lacking clear explanation of operationalization]

Posted in: marketing research | Tagged: conference, data quality, ESRA, questionnaire design

The impact of questionnaire design on measurements in surveys #2 #ESRA15 #MRX

By LoveStats on July 15, 2015

Live blogged at #ESRA15 in Reykjavik. Any errors or bad jokes are my own.

Breaktime treated us to fruit and croissants this morning. I was hoping for another unique to iceland treat but perhaps that was a sign to stop eating. No, just kidding! Apparently you’re not allowed to bring food or drink into the classrooms. The signs say so. The signs also say no Facebook in the classrooms. Shhhh…. I was on Facebook in the classroom!

The sun is out again and I took a quick walk outside. I am thankful my hotel is at the foot of the famous church. No matter where I am in this city, I can always, easily, and instantly find my hotel. No map needed when the church is several times higher than the next highest building!

I’ve noticed that the questions at this conference are far more nit-picky and critical than I’m used to. I suspect that is because the audience includes many academics whose entire job is focused on these topics. They know every minute detail because they’ve done similar studies themselves. It makes for great comments and questions, though it does seem to put the speaker on the spot every time!

smart respondents: let’s keep it short.

do we really need scale instructions in the question stem? they add length, mobile screens have limited space, and respondents skip the instructions if the response scale is already labeled [isn’t this just an artifact of old fashioned face to face surveys, telephone surveys]
they tested instructions that matched and did not match what was actually in the scale [i can imagine some panelists emailing the company to complain that the survey had errors!]
used a probability survey [this is one case where a nonprobability sample would have been well served, easier cheaper to obtain with no need to generalize precisely to a population]
answer frequencies looked very similar for correct and incorrect instructions, no significant differences, she’s happy to have nonsignificant results, unaffected by mobile device or age
[more regression results shown, once again, speaker did not apologize and the audience did not have a heart attack]
it seems like respondents ignore instructions in the question, they rely on the words in the answer options, e.g., grid headers
you can omit instructions if the labeling is provided in the answer options
works better for experienced survey takers [hm, i doubt that. anyone seeing the answer options will understand. at least, that’s my opinion.]

What do survey Rs do when getting the wrong instructions on a scale? Ignore instructions and use the labels. I Becher #esra15

— Joe Murphy (@Joejohnmurphy) July 15, 2015

Becher: Keep it simple! Scale instructions in the question stem are superfluous if scale labels are also shown in the answer. #esra15

— Jo d'Ardenne (@JodArdenne) July 15, 2015

from web to paper: evaluation from data providers and data analysts. The case of annual survey finances of enterprises

we send out questionnaires, something happens, we get data back – we don’t know what happens 🙂
wanted to keep question codes in the survey which seemed unnecessary to respondents, had really long instructions for some questions that didn’t fit on the page so they put them on a pdf
64% of people evaluated the codes on the online questionnaire positively, 12% rated the codes negatively. people liked that they could communicate with Statistics Netherlands by using the codes
74% negative responses to explanations of question which were intended to reduce calls from Statistics Netherlands, only 11% were positive
only 25% of people consulted the pdf with instructions
most people wanted to receive a printed version of the questionnaire they filled out, people really wanted to print it and they screen capped it, people liked being able to return later, they could easily get an English version
data editors liked that they didn’t have to do data entry but now they needed more time to read and understand what was being said
they liked having the email address because they got more direct and precise answers, responses came back faster, they didn’t notice any changes in the time series data

Business survey Rs like online option but prefer to call w comments over recording in web form. D Giesen @statisticscbs #esra15

— Joe Murphy (@Joejohnmurphy) July 15, 2015

is variation in perception of inequality and redistribution of earnings actual or artifactual. effects of wording, order, and number of items

opinions differ when you ask how much should people make vs how much should the top quintile of people make
they asked people how much a number of occupations should earn, they also varied how specific the title was e.g., teacher vs math teacher in a public high school
estimates for specific descriptions were higher, high status jobs got much higher estimates
adding more occupations to the list makes reliability in earnings decrease

exploring a new way to avoid errors in attitude measurements due to complexity of scientific terms: an example with the term biodiversity

how do people talk about complicated terms, their own words often differ from scientific definitions
“what comes to mind when you think of biodiversity?” – used text analysis for word frequencies, co-occurrences, correspondence analysis, used the results to design items for the second study
found five classes of items – standard common definition, associated with human actions to protect it, human environment relationship, global actions and consequences, scientific definition
turned each of the five types of definitions into a common word definition
people gave more positive opinions about biodiversity when they were asked immediately after the definition
items based on representations of biodiversity were valid and reliable
[quite like this methodology, could be really useful in politics]

[if any of these papers interest you, i recommend finding the author on the ESRA program and asking for an official summary. Global speakers and weak microphones make note taking more challenging. 🙂 ]

Posted in: marketing research | Tagged: conference, ESRA, questionnaire design

The impact of questionnaire design on measurements in surveys #1 #ESRA15 #MRX

By LoveStats on July 15, 2015

Live blogged from #ESRA15 in Reykjavik. Any errors or bad jokes are my own.

I tried to stay up until midnight last night but ended up going to bed around 10:30pm. Naturally, it was still daylight outside. I woke up this morning at 6am in broad daylight again. I’m pretty sure it never gets dark here no matter what they say. I began my morning routine as usual. Banged my head on the slanted ceiling, stared out the window at the amazing church, made myself waffles in the kitchen, and then walked past the pond teeming with baby ducks. Does it get any better? I think no. Except of course knowing I had another day of great content-rich sessions ahead of me!

designs and developments of the income measures in the european social surveys

tested different income questions. allowed people to use a weekly, monthly, or annual income scale as they wished. there was also no example response, and no example of what constitutes income. Provided about 30 answer options to choose from, shown in three columns. Provided same result as a very specific question in some countries but not others.
also tested every country getting the same number breaks, groups weren’t arranged to reflect each country’s distribution. this resulted in some empty breaks [but that’s not necessarily a problem if the other breaks are all well and evenly used]
when countries are asked to set up number breaks in well defined deciles, high incomes are chosen more often – affected because people had different ideas of what is and isn’t taxable income
[apologies for incomplete notes, i couldn’t quite catch all the details, we did get a “buy the book” comment.]

#esra15 Uwe Warner: income measures ESS very hard to standardise. But it seems to improve over time.

— Ineke Stoop (@ialstoop) July 15, 2015

item non-response and readability of survey questionnaire

any non-substantive outcome – missing values, refusals, don’t knows all count
non response can lower validity of survey results
semantic complexity measured by familiarity of words, length of words, abstract words that can’t be visualized, structural complexity
Measured – characters in an item, length of words, percent of abstract words, percent of lesser known words, percent of long words 12 or more characters
used the european social survey which is a highly standardized international survey, compared english and estonian, it is conducted face to face, 350 questions, 2422 uk respondents
less known and abstract words create more non-response
long words increase nonresponse in estonian but not in english, perhaps because english words are shorter anyways
percent of long words in english created more nonresponse
total length of an item didn’t affect nonresponse
[they used a list of uncommon words for measurement, such a book/list does exist in english. I used it in school to choose a list of swear words that had the same frequency levels as regular words.]
[audience comment – some languages join many words together which means their words are longer but then there are fewer words, makes comparisons more difficult]
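The five indicators listed above are simple enough to compute yourself. Here is a rough sketch, not the speaker's actual code. The 12 character cut-off comes from the notes; the function name and the two word lists are my own placeholders, since the talk relied on existing lists of abstract and lesser known words that I don't have.

```python
import re

LONG_WORD_MIN_CHARS = 12  # "long words 12 or more characters" from the notes


def readability_indicators(item_text, abstract_words=frozenset(),
                           lesser_known_words=frozenset()):
    """Return rough readability indicators for one questionnaire item.

    abstract_words and lesser_known_words are hypothetical lookup sets
    supplied by the caller; the talk did not share its lists.
    """
    # Runs of letters only (Unicode aware, so Estonian letters count).
    # Apostrophes and hyphens split a word into two.
    words = re.findall(r"[^\W\d_]+", item_text.lower())
    n_words = len(words)

    def percent(count):
        return 100.0 * count / n_words if n_words else 0.0

    return {
        "characters": len(item_text),
        "n_words": n_words,
        "mean_word_length": sum(map(len, words)) / n_words if n_words else 0.0,
        "pct_long_words": percent(
            sum(len(w) >= LONG_WORD_MIN_CHARS for w in words)),
        "pct_abstract_words": percent(
            sum(w in abstract_words for w in words)),
        "pct_lesser_known_words": percent(
            sum(w in lesser_known_words for w in words)),
    }
```

As the audience comment points out, a fixed character cut-off treats languages that join words together differently from English, so the threshold probably needs to be set per language.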

Dr Ainsaar: 5 indicators to keep in mind if you want to maximise quest. readability in int'l comp. surveys #esra15 pic.twitter.com/AKV05EZLpf

— Olivier Parnet (@oparnet) July 15, 2015

Lesser-known words and abstract words increase the total nonresponse #esra15 pic.twitter.com/wcOscaNLbm

— Ana Slavec (@aslavec) July 15, 2015

helping respondents provide good answers in web surveys

some tasks are inherently difficult in surveys, often because people have to write in an answer, coding is expensive and error prone
this study focused on prescription drugs which are difficult to spell, many variations of the same thing, level of detail is unclear, but we have full lists of all these drugs available to us
tested text box, drop box to select from list, javascript (type ahead look up)
examined breakoff rates, missing data, response times, and codability of responses
asked people if they are taking drugs, tell us about three
study 1 – breakoffs higher from dropbox and javascript; median response times longer, but codability was better. Lists didn’t work well at all.
study 2 – cleaned up the list, made all the capitalization the same. break off rates were now all the same. response times lower but still higher than the textbox version. codability still better for list versions.
study 3 – if they couldn’t find a drug in the list, they were allowed to type it out. unlike previous studies which proceeded with the missing data. dropbox had highest missing data. javascript had lowest missing data. median times highest for drop box. trends for more and more drugs as expected, effect is more but not as much more.
older browsers had trouble with dropdowns and javascript and had to be routed to the textbox options
if goal is to get codable answers, use a text box. if goal is to create skip patterns then javascript is the way to go.

And this is why it is a bad idea to use drop box questions. Mick Couper shows it is better to use text boxes. #esra15 pic.twitter.com/D52fiuZrdw

— Ana Slavec (@aslavec) July 15, 2015

rating scale labelling in web surveys – are numeric labels an advantage

you can use all words to label scales or just words on the end with numbers in between
research says there is less satisficing with verbal scales, they are more natural than numbers and there is no inherent meaning of numbers
means of the scales were different
less time to complete in the end labeled groups
people paid more attention to the five point labeled scale, and least to the end point labeled score
mean opinions did differ by scale, more positive on fully labeled scale
high cognitive burden to map responses of the numeric scales
lower reliability for the numeric labels

Posted in: marketing research | Tagged: conference, data quality, ESRA, questionnaire design

Surveying sensitive issues – challenges and solutions #ESRA15 #MRX

By LoveStats on July 14, 2015

Live blogged at #ESRA15 in Reykjavik. Any errors or bad jokes are my own. Break time brought some delightful donuts. I personally only ate one however on behalf of my friend, Seda, I ate several more just for her. By the way, since donuts are in each area, you can just breeze from one area to the next grabbing another donut each time. Just saying…

surveying sensitive questions – prevalence estimates of self-reported delinquency using the crosswise model

crime rates differ by country but rates of individuals reporting their own criminal behaviour show the opposite of what you’d expect. Thus countries with high crime rates have lower rates of self-report. Social desirability seems to be the case. Is this true?
Need to add random noise to the model so the respondent can hide themself. Needs no randomization device.
ask a non-sensitive question and a sensitive question together. Let the respondent indicate only whether the answers to both are the same or different. You only need to know the probability of a yes on the first question (e.g., is your mom’s birthday in january? well 1/12 are in january).
crosswise model generates vastly higher self-reported crime rates in the countries where you’d expect them.
also asked people in the survey whether they answered carefully – 15% admitted they did not
crosswise results in much higher prevalence rates, causal models of delinquent behaviour could be very different
satisficing respondents gives less bias than expected
estimates of the crosswise model are conservative
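For anyone who wants to see how the crosswise numbers turn into a prevalence estimate, here is a minimal sketch of the standard estimator. It is my own illustration, not the speaker's code. If p is the known chance of a yes on the non-sensitive question (1/12 for the january birthday example) and the share of "same" answers is observed, the sensitive prevalence is (share_same + p - 1) / (2p - 1). The function names are made up, and the standard error uses the plain binomial approximation.

```python
import math


def crosswise_prevalence(share_same, p_nonsensitive):
    """Estimate the prevalence of the sensitive trait.

    share_same: observed proportion answering "both answers are the same".
    p_nonsensitive: known probability of "yes" on the non-sensitive question.
    """
    denominator = 2 * p_nonsensitive - 1
    if abs(denominator) < 1e-12:
        # With p = 0.5 the "same" share carries no information.
        raise ValueError("p_nonsensitive must not be 0.5")
    return (share_same + p_nonsensitive - 1) / denominator


def crosswise_standard_error(share_same, p_nonsensitive, n_respondents):
    """Binomial approximation to the standard error of the estimate."""
    denominator = abs(2 * p_nonsensitive - 1)
    if denominator < 1e-12:
        raise ValueError("p_nonsensitive must not be 0.5")
    variance = share_same * (1 - share_same) / n_respondents
    return math.sqrt(variance) / denominator


if __name__ == "__main__":
    # Hypothetical numbers: 75% of 1000 respondents say "same".
    estimate = crosswise_prevalence(0.75, 1 / 12)
    error = crosswise_standard_error(0.75, 1 / 12, 1000)
    print(f"estimated prevalence: {estimate:.3f} (SE {error:.3f})")
```

The closer p gets to 0.5, the better the respondent is hidden but the larger the standard error, which is the trade-off behind choosing something like a birth month.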

pouring water into the wine – the advantages of the crosswise model asking sensitive questions revisited

it’s easier to implement in self-administered surveys, no extra randomization device necessary, cognitive burden is lower, no self-protection answering strategies
blood donation rates – direct question says 12% but crosswise says 18%
crosswise model had a much higher answering time, even after dropping extraordinarily slow people
model has some weaknesses, the less the better approach is good to determine if the crosswise model works
do people understand the instructions and do they specifically follow those instructions

effects of survey sponsorship and mode of administration on respondents answers about their racial attitudes

used a number of prejudice scales both blatant and subtle
no difference in racial measures on condition of interviewer administration
blatant prejudice scale showed a significant interaction for type of sponsor
matters more when there is an interviewer and therefore insufficient privacy
sponsor effect is likely the result of social desirability
response bias is in opposite direction for academic and market research groups
does it depend which department does the study – law department, sociology department

impact of survey mode (mail vs telephone) and asking about future intentions

evidence suggests that asking about intent to get screened before asking about screening may minimize over reporting of cancer screening. removes the social pressure to over report.
people report behaviors more truthfully in self-administered forms than interviews
purchased real estate on an omnibus survey
no main effect for mode
in mail mode, asking about intent first was more reflective of reality of screening rates
30% false positive said they had a test but it wasn’t in their medical record
little evidence that the intention item affected screening accuracy
mailed surveys may have positively affected accuracy – but mail survey was one topic whereas the telephone was omnibus

effect of socio-demographic (mis)match between interviewers and respondents on the data quality of answers to sensitive questions

theory of liking, some say matching improves chances of participation, may also improve disclosure and reporting, especially gender matching
current matched within about five years of age as opposed to arbitrary cut-off points
also matched on education
male interviewer to female interviewee had lowest response rate
older interviewer had lower response rate
no effects for education
income had the most missing data, parent’s education was next highest missing data likely because education from 50 years ago was different and you’d have to translate, political party had high missing rate
if female subject refuses a male interviewer, send a female to try to convince them
it’s easier to refuse a person who is the same age as you [maybe it’s a feeling of superiority/inferiority – you’re no better than me, i don’t have to answer to you]
men together generate the least item non-response
women together might get too comfortable together, too chatty, more non-response, role-boundary issue
age matching is less item non-response
same education is less item non-response, why do interviewers allow more item non-response when their respondent has a lower education

Posted in: marketing research | Tagged: data quality, ESRA

Direction of response scales #ESRA15 #MRX

By LoveStats on July 14, 2015

Live blogged at #ESRA15 in Reykjavik. Any errors or bad jokes in the notes are my own.

I discovered that all the buildings are linked indoors. Let it rain, let it rain, i don’t care how much it rains…. [Feel free to sing that as loud as you can.] Lunch was Skyr, oat cookies and some weird beet drink. Yup. I packed it myself. I always try to like yogurt and never really do. Skyr works for me. So far, coconut is my favourite. I’ve forgotten to take pictures of speakers today so let’s see if I can keep the trend going! Lots of folks in this session so @MelCourtright and I are not the only scale geeks out there. 🙂

Response scales: Effects of scale length and direction on reported political attitudes

instruments are not neutral, they are a form of communication
cross national projects use different scales for the same question so how do you compare the results
trust in parliament is a fairly standard question for researchers and so makes a good example
4 point scale is most popular but it is used up to 11 points, traditional format is very positive to very negative
included a don’t know in the answer options
transformed all scales into a 0 to 1 scale and evenly distributed all scores in between
means highest with 7 point scale traditional direction and lowest with 4 point and 11 point traditional direction
reverse direction had much fewer mean differences, essentially all the same
four point scales show differences in direction, 7 and 11 point show fewer differences in direction
[regression results shown on the screen – no one fainted or died, the speaker did not apologize or say she didn’t understand them. interesting difference compared to MRX events.]
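The 0 to 1 transformation mentioned above is just a linear rescaling, sketched below. This is my reading of "evenly distributed all scores in between", not the speaker's code. It assumes answer categories are coded 1 to n_points and that a don't know is stored as None; both are assumptions for illustration.

```python
def rescale_to_unit(response, n_points, reverse=False):
    """Map a response coded 1..n_points onto an evenly spaced 0..1 score.

    Returns None for a missing or don't know answer (assumed coded None).
    Set reverse=True when the scale was presented in the opposite
    direction so that both versions point the same way.
    """
    if response is None:
        return None
    if n_points < 2:
        raise ValueError("a scale needs at least two points")
    if not 1 <= response <= n_points:
        raise ValueError("response is outside the scale")
    score = (response - 1) / (n_points - 1)
    return 1 - score if reverse else score


if __name__ == "__main__":
    # The midpoint of a 7 point scale and the top of an 11 point scale.
    print(rescale_to_unit(4, 7), rescale_to_unit(11, 11))
```

Note that a 4 point scale has no midpoint, so its rescaled values (0, 1/3, 2/3, 1) never land on 0.5, which is worth remembering when comparing means across scale lengths.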

Does satisficing drive scale direction effects

research shows answers shift towards the start of the scale but this is not consistent
anchoring and adjustment effects whereby people use the first answer option as the anchor, interpretative heuristics suggest people choose an early response to express their agreement with the questions, primacy effects due to satisficing decreases cognitive load
scores were more positive when the scale started positive, differences were huge across all the brands
the pattern is the same but the differences are noticeable
speeding measured as 300 milliseconds per word
speeders more likely to choose early answer option
answers are pushed to the start of the scale, limited evidence that it is caused by satisficing
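Here is one way the 300 milliseconds per word rule could be applied to flag speeders. I am assuming it works as a threshold, so anyone answering faster than 300 ms times the number of words in the question counts as speeding. The talk did not spell out the details, and the function name and word counting are my own.

```python
MS_PER_WORD = 300  # speeding threshold quoted in the notes


def is_speeder(response_time_ms, question_text, ms_per_word=MS_PER_WORD):
    """Flag a response given faster than the question could plausibly be read.

    Words are counted by splitting on whitespace, which is a simplification.
    """
    n_words = len(question_text.split())
    return response_time_ms < ms_per_word * n_words


if __name__ == "__main__":
    question = "How much do you trust this brand?"  # 7 words, 2100 ms threshold
    print(is_speeder(1500, question), is_speeder(2500, question))
```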

https://twitter.com/cernat_a/status/620964371288694784

Ordering your attention: response order effects in web-based surveys

primacy happens more often visually and recency more often orally
scales have an inherent order. if you know the first answer option, you know the remainder of the options
sample size over 100 000, random assigned to scale order, also tested labeling, orientation, and number of response categories from 2 to 11
the order effect was always a primacy effect, differences were significant though small; significant more due to sample size [then why mention the results if you know they aren’t important?]
order effects occurred more with fully labeled scales, end labeled scales did not see response order effects
second study also supported the primacy effect with half of questions showing the effect
much stronger response seen with unipolar scales
vertical scales are much stronger response as well
largest effect seen for horizontal unipolar scale
need to run the same tests with grids, don’t know which response is more valid, need to know what they will be and when

https://twitter.com/cernat_a/status/620968924071464960

Impact of response scale direction on survey responses in web and mobile web surveys

why does this effect happen?
tested agreement scales and frequency scales
shorter scale decreases primacy effect
scale length has a significant moderating effect – stronger effect for 7 point scales compared to 5 point scale
labeling has significant moderating effects – stronger effect for fully labeled
question location matters – stronger effect on earlier questions
labeled behavioural scale shows the largest impact, end labeled attitudinal scale has the smallest effect
scale direction affects responses – more endorsement at start of scale
7 point fully labeled frequency scale is most affected
we must use shorter scales and end labeling to reduce scale direction effects in web surveys

https://twitter.com/cernat_a/status/620972872622833664

Importance of scale direction between different modes

term used is forward/reverse scale [as opposed to ascending/descending or positive/negative keyed]
in the forward version of the scale, the web creates more agreement; but face to face it’s very weak. face to face shows recency effect
effect is the same for general scales (all scales are agreement) and item specific scales (each scale reflects the specific question), more cognitive effort in the item specific scale so maybe less effort is invested in the response
item specific scale affected more by the web
randomizing scale matters more in online surveys

Posted in: marketing research | Tagged: conference, data quality, ESRA, scale

Assessing the quality of survey data (Good session!) #ESRA15 #MRX

By LoveStats on July 14, 2015

Live blogged from #ESRA15 in Reykjavik. Any errors or bad jokes in the notes are my own. As you can see, I managed to find the next building from the six buildings the conference is using. From here on, it’s smooth sailing! Except for the drizzle. Which makes wandering between buildings from session to session a little less fun and a little more like going to a pool. Without the nakedness.

Session #1 – Data quality in repeated surveys: evidence from a quasi-experimental design by multiple professors from university of Rome

respondents can refuse to participate in the study resulting in a series of missing data but their study had very little missing data, only about 5% this time [that’s what student respondents do for you, would like to see a study with much larger missing rates]
questions had an i do not know option, and there was only one correct answer
19% of gender/birthday/socioeconomic status changed from survey to survey [but we now understand that gender can change, researchers need to be open to this. And of course, economic status can change in a second]
Session #2 – me! Lots of great questions, thank you everyone!

Asking for phone number at the beginning doesn't affect responses but ppl who give number respond differently, study by @LoveStats #esra15

— Ana Slavec (@aslavec) July 14, 2015

Session #3 – Processing errors in the cross national surveys

we don’t consider process errors very often as part of total survey error
found 154 processing errors in the series of studies – illegitimate variable values such as education that makes little sense or age over 100, misleading variable values, contradictory values, value discrepancies, lack of value labels, maybe you’re expecting a range but you get a specific value, what if 2 is coded as yes in the software but no in the survey
age and education were most problematic, followed by schooling
lack of labels was the worst problem, followed by illegitimate values, and misleading values
is 22% discrepancies out of all variables checked good or bad?
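A couple of the processing errors listed above are easy to screen for automatically. Below is a small sketch of that kind of check, using the "age over 100" example from the notes plus codes that have no value label. The record layout, the variable names, and the function are hypothetical; the study's own checking procedure was not described in enough detail to reproduce.

```python
def find_processing_errors(records, value_labels, age_range=(0, 100)):
    """Return (row_number, variable, problem) tuples for suspicious values.

    records: list of dicts, one per respondent (hypothetical layout).
    value_labels: dict mapping a variable name to its {code: label} dict.
    """
    problems = []
    low, high = age_range
    for row_number, record in enumerate(records):
        # Illegitimate value: an age outside the plausible range.
        age = record.get("age")
        if age is not None and not low <= age <= high:
            problems.append((row_number, "age", "illegitimate value"))
        # Lack of value labels: a code that the codebook does not define.
        for variable, labels in value_labels.items():
            code = record.get(variable)
            if code is not None and code not in labels:
                problems.append((row_number, variable, "missing value label"))
    return problems


if __name__ == "__main__":
    labels = {"education": {1: "primary", 2: "secondary", 3: "tertiary"}}
    data = [{"age": 34, "education": 2}, {"age": 130, "education": 9}]
    for problem in find_processing_errors(data, labels):
        print(problem)
```

Checks like these only catch the mechanical problems. A yes coded as 2 in the software but as no in the questionnaire still needs a human comparing the codebook with the instrument.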

Session #4 – how does household composition derived from census data describe or misrepresent different family types

strength of census data is their exhaustivity, how does census data differ from a smaller survey
census counts household members, family survey describes families and explores people outside the household such as living apart, they describe different universes. a boarder may not be measured in the family survey but would be mentioned in the census
in 10% of cases, more people are counted in the census, 87% have the same number of people on both surveys
census is an accounting tool, not a tool for understanding social life, people do not organize their lives to be measured and captured at one point and one place in time
census only has a family with at least one adult and at least one child
isolated adult in a household with other people is 5% of adults in the census, not classified the same in both surveys
there is a problem attributing children to the right people – problem with single parent families; single adults are often ‘assigned’ a child from the household
a household can include one or two families at the most – complicated when adult children are married and maybe have a kid. A child may be assigned to a grandparent which is in error.
isolated adults may live with a partner in the dwelling, some live with their parents, some live with a child (but children move from one household to another), 44% of ‘isolated’ adults live with family members, they aren’t isolated at all
previously couples had to be heterosexual, even though they survey as a union the rules split them into isolated adults [that’s depressing. thank you for changing this rule.]
census is more imperfect than the survey, it doesn’t catch subtle transformations in societal life. calls into question definitions of marginal groups
also a problem for young adults who leave home but still have strong ties to the parents home – they may claim their own home and their parents may also still claim them as living together
[very interesting talk. never really thought about it]

INED insitute's family survey finds a higher number of household members than census data in 10% of cases, especially for complex hh #esra15

— Ana Slavec (@aslavec) July 14, 2015

Session #5 – Unexpectedly high number of duplicates in survey data

simulated duplicates created greater bias of the regression coefficient when up to 50% of cases were duplicated 2 to 5 times
birthday paradox – how many persons are needed in order to find two having an identical birthday – 23. A single duplicate in a dataset is likely.
New method – the Hamming diagram – diversity of data for survey – it looks like a normal curve with some outliers so i’m thinking Hamming is simply a score like Mahalanobis is for outliers
found duplicates in 10% of surveys, 14 surveys comprised 80% of total duplicates with one survey at 33%
which case do you delete? which one is right if indeed one is right. always screen your data before starting a substantial analysis.
[i’m thinking that ESRA and AAPOR are great places to do your first conference presentation. there are LOTS of newcomers and presentation skills aren’t fabulous. so you won’t feel the same pressure as at other conferences. Of course, you must have really great content because here, content truly is king]
[for my first ESRA conference, i’m quite happy with the quality of the content. now let’s hope for a little sun over the lunch hour while I enjoy Skyr, my new favourite food!]
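Going back to the duplicates talk in Session #5, both ideas can be sketched in a few lines. The first function reproduces the birthday paradox figure of 23 people. The second and third reflect my guess at what the Hamming approach does: count how many answers differ between two respondents, and treat a distance of zero as an exact duplicate. This is an illustration under that assumption, not the speaker's method.

```python
from itertools import combinations


def birthday_collision_probability(n_people, n_days=365):
    """Probability that at least two of n_people share a birthday."""
    p_all_distinct = 1.0
    for i in range(n_people):
        p_all_distinct *= (n_days - i) / n_days
    return 1.0 - p_all_distinct


def hamming_distance(answers_a, answers_b):
    """Number of questions on which two respondents gave different answers."""
    if len(answers_a) != len(answers_b):
        raise ValueError("both respondents need the same number of answers")
    return sum(a != b for a, b in zip(answers_a, answers_b))


def find_duplicate_pairs(rows, max_distance=0):
    """Return index pairs of rows whose distance is at most max_distance."""
    return [
        (i, j)
        for (i, row_i), (j, row_j) in combinations(enumerate(rows), 2)
        if hamming_distance(row_i, row_j) <= max_distance
    ]


if __name__ == "__main__":
    print(round(birthday_collision_probability(23), 3))
    print(find_duplicate_pairs([[1, 2, 5], [3, 4, 1], [1, 2, 5]]))
```

Comparing every pair of rows gets slow on large files, so for exact duplicates alone a simple count of identical rows would be faster. Raising max_distance is what lets you look for near duplicates as well.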

Posted in: marketing research | Tagged: conference, data quality, ESRA, survey
