Tag Archives: reykjavik

Representativeness of surveys using internet-based data collection #ESRA15 #MRX 

Live blogged from #ESRA15 in Reykjavik. Any errors or bad jokes are my own.

Yup, it’s sunny outside. And now I’m back inside for the next session. Fortunately, or unfortunately, this session is once again in a below-ground room with no windows, so I will not be basking in sunlight nor gazing longingly out the window. I guess I’ll be paying full attention to another really great topic.

 

conditional vs unconditional incentives: comparing the effect on sample composition in the recruitment of the german internet panel study GIP

  • unconditional incentives tend to perform better than promised incentives
  • include $5 with the advance letter compared to a promised $10 with the thank-you letter; assuming a 50% response rate, the cost of both groups is the same (see the quick arithmetic after this list)
  • consider nonresponse bias, consider sample demo distribution
  • unconditional incentive had 51% response rate, conditional incentive had 42% response rate
  • didn’t see a nonresponse bias [by demographics I assume, so many speakers are talking about important effects but not specifically saying what those effects are]
  • as a trend, the two sets of data provide very similar research results; there are differences in means but they are always fairly close together, and confidence intervals always overlap
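
A quick back-of-envelope check on the cost-parity bullet above: per 100 sampled households, the unconditional group costs 100 × $5 = $500 up front, while the conditional group is budgeted at 0.5 × 100 × $10 = $500 if half respond. At the observed rates the conditional group actually came in cheaper (42 × $10 = $420), but the prepaid $500 bought a nine-point higher response rate.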

https://twitter.com/ialstoop/status/622001573481312256

evolution of representativeness in an online probability panel

  • LISS panel – probability panel, includes households without internet access, 30 minutes per month, paid for every completed questionnaire
  • is there systematic attrition, are core questionnaires affected by attrition
  • normally sociodemographics only, which is restrictive
  • missing data imputed using MICE (see the sketch after this list)
  • strongest panel losses were on sociodemographic properties
  • there are seasonal changes in attrition, for instance in June, which has lots of holidays
  • attrition has more effect on survey attitudes and health traits, less so on political and personality traits, which are quite stable even with attrition
  • they try to decrease attrition through refreshment samples based on targets
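
A minimal sketch of the chained-equations idea behind MICE, in Python rather than the R `mice` package the speakers presumably used — scikit-learn’s IterativeImputer is a comparable approach, and the panel variables and values here are hypothetical:

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Hypothetical panel extract with attrition-induced gaps.
panel = pd.DataFrame({
    "age":             [34, 51, np.nan, 29, 62],
    "health":          [4.0, np.nan, 3.0, 5.0, 2.0],
    "survey_attitude": [np.nan, 3.0, 4.0, 4.0, np.nan],
})

# Chained equations: each variable with gaps is regressed on the others
# in turn, cycling until the imputed values stabilize.
imputer = IterativeImputer(max_iter=10, random_state=0)
completed = pd.DataFrame(imputer.fit_transform(panel), columns=panel.columns)
print(completed.round(1))
```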

https://twitter.com/ialstoop/status/622004420314812417

moderators of survey representativeness – a meta analysis

  • measured single mode vs multimode surveys
  • R-indicators – a single measure from 0 to 1 for sample representativeness, based on logistic regression models for response propensity (see the sketch after this list)
  • hypothesize mixed mode surveys are more representative than single mode surveys
  • hypothesize cross-sectional surveys are more representative than longitudinal surveys
  • heterogeneity not really explained by moderators
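
The R-indicator mentioned above has a pleasantly simple core: model response propensity from auxiliary variables known for respondents and nonrespondents alike, then take 1 minus twice the standard deviation of the fitted propensities. A minimal sketch, assuming a pandas DataFrame with a 0/1 `responded` column (my code, not the presenters’):

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

def r_indicator(frame: pd.DataFrame, aux_cols: list[str]) -> float:
    """R-indicator = 1 - 2 * SD of estimated response propensities.

    1.0 means every sampled unit is equally likely to respond (fully
    representative response); values near 0 mean strongly selective response.
    """
    X = pd.get_dummies(frame[aux_cols], drop_first=True)
    y = frame["responded"]
    propensities = LogisticRegression(max_iter=1000).fit(X, y).predict_proba(X)[:, 1]
    return 1.0 - 2.0 * propensities.std(ddof=1)
```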

setting up a probability based web panel. lessons learned from the ELIPSS pilot study

  • online panel in france, 1000 people, monthly questionnaires, internet access given to each member [we often wonder about the effect of people being on panels since they get used to and learn how to answer surveys, have we forgotten this happens in probability panels too? especially when they are often very small panels]
  • used different contact modes including letters, phone, face to face
  • underrepresented on youngest, elderly, less educated, offline people
  • reasons for participating, in order – trust in ELIPSS 46%, originality of project 37%, interested in research 32%, free internet access 13%
  • 16% attrition after 30 months (that’s amazing, really low and really good!), response rate generally above 80%
  • automated process – invites on Thursday, systematic reminders by text message, app message, and email
  • individual followups by phone calls and letters [wow. well that’s how they get a high response rate]
  • individual followups are highly effective [i’d call them stalking and invasive but that’s just me. i guess when you accept free 4g internet and a tablet, you are asking for that invasiveness]
  • age becomes less representative over time, employment status changes a lot, education changes the most but of course young people gain more education over time
  • need to give feedback to panel members as they keep asking for it
  • want to broaden use of panel to scientific community by expanding panel to 3500 people

https://twitter.com/nicolasbecuwe/status/622009359082647552

https://twitter.com/ialstoop/status/622011086783557632

the pretest of wave 2 of the german health interview and examination survey for children and adolescents as a mixed mode survey, composition of participant groups

  • mixed mode helps to maintain high response, web is preferred by younger people, representativeness could be increased by using multiple modes
  • compared sequential and simultaneous surveys
  • single mode has highest response rate, mixed mode simultaneous was extremely close behind, mixed mode multi-step had the lowest rate
  • paper always gave back the highest proportion of data even when people had the choice of both; 11% to 43% chose paper across the 3 groups
  • sample composition was the same among all four groups, all confidence intervals overlap – age, gender, nationality, immigration, education
  • meta-analysis – overall trend is the same
  • 4% lower response rate in mixed mode – additional mode creates cognitive burden, creates a break in response process, higher breakoffs
  • mixed mode doesn’t increase sample composition nor response rates [that is, giving people multiple options as opposed to just one option, as opposed to multiple groups whereby each groups only knows about one mode of participation.]
  • current study is now a single mode study

https://twitter.com/oparnet/status/622015032231075840


 


Sample composition in online studies #ESRA15 #MRX 

Live blogged at #ESRA15 in Reykjavik. Any errors or bad jokes are my own.

I’ve been pulling out every ounce of bravery I have here in Iceland and I went to the pool again last night (see previous posts on public nakedness!). I could have also broken my rule about not traveling after dark in strange cities but since it never gets dark here, I didn’t have to worry about that! The pool was much busier this time. I guess kiddies are more likely to be out and about after dinner on a weekday rather than Sunday morning at 9am. All it meant was that I had a lot more people watching to do. All in all good fun to see little babies and toddlers enjoying a good splash and float!

This morning, the sun was very much up and the clouds very much gone. I’ll be dreaming of breaktime all morning! Until then however, I’ve got five sessions on sample composition in online surveys, and representativeness of online studies, to pay attention to. It’s going to be tough but a morning chock full of learning will get me a reward of more pool time!

what is the gain in a probability based online panel of providing internet access to sampling units that did not have access before

  • germany has GIP, france has ELIPSS, netherlands has LISS as probability panels
  • weighting might not be enough to account for bias of people who do not have internet access
  • but representativeness is still a problem because people may not want to participate even if they are given access; recruitment rates are much lower among non-internet households
  • probability panels still have problems: you won’t answer every survey you are sent, and there is attrition
  • do we lose much without a representative panel? is it worth the extra cost
  • in Elipss panel, everyone is provided a tablet, not just people without access. the 3G tablet is the incentive you get to keep as long as you are on the panel. so everyone uses the same device to participate in the research
  • what does it mean to not have Internet access – used to be computer + modem. Now there are internet cafes, free wifi is everywhere. hard to define someone as no internet access now. We mean access to complete a survey so tiny smartphones don’t count.
  • 14.5% of adults in france were classified as not having internet. turned out to be 76 people in the end which is a bit small for analytics purposes. But 31 of them still connected every day.
  • non-internet access people always participated less than people who did have internet.
  • people without internet always differ on demographics [proof is chi-square, can’t see data]
  • populations are closer on nationality, being in a relationship, and education – including non-internet helps with these variables, improves representativity
  • access does not equal usage does not equal using it to answer surveys
  • maybe consider a probability based panel without providing access to people who don’t have computer/tablet/home access

parallel phone and web-based interviews: comparability and validity

  • phones are relied on for research and assumed to be good enough for representativeness; however most people don’t answer phone calls when they don’t recognize the number, and you can’t use an autodialler in the USA for research
  • online surveys can generate better quality due to programming validation and ability to only be able to choose allowable answers
  • phone and online have differences in presentation mode, presence of human interviewer, can read and reread responses if you wish, social desirability and self-presentation issues – why should online and offline be the same
  • caution about combining data from different modes should be exercised [actually, i would want to combine everything i possibly can. more people contributing in more modes seems to be more representative than excluding people because they aren’t identical]
  • how different is online nonprobability from telephone probability [and for me, a true probability panel cannot technically exist. it’s theoretically possible but practically impossible]
  • harris did many years of these studies side by side using very specific methodologies
  • measured variety of topics – opinions of nurses, big business trust, happiness with health, ratings of president
  • across all questions, average correlation between methods was .92 for unweighted means and .893 for weighted means – more bias with the weighted version
  • is it better for scales with many response categories – correlations go up to .95
  • online means of attitudinal items were on average 0.05 lower on scale from 0 to 1. online was systematically biased lower
  • correlations in many areas were consistently extremely high, means were consistently very slightly lower for online data; also nearly identical rank order of items
  • for political polling, the two methods were again massively similar, highly comparable results; mean values were generally very slightly lower – thought to be ability to see the scale online as well as social desirability in telephone method, positivity bias especially for items that are good/bad as opposed to importance 
  • [wow, given this is a study over ten years of results, it really calls into question whether probability samples are worth the time and effort]
  • [audience member said most differences were due to the presence of the interviewer and nothing to do with the mode; the online version was found to be truer]

representative web survey

  • only a sample without bias can generalize, the correct answer should be just as often a little bit higher or a little bit lower than reality
  • in their sample, they underrepresented 18-34, elementary school education, and the lowest and highest income people
  • [yes, there are demographic differences in panels compared to census and that is dependent completely on your recruitment method. the issue is how you deal with those differences]
  • online panel showed a socially positive picture of population
  • can you correct bias through targeted sampling and weighting, ethnicity and employment are still biased but income is better [that’s why invites based on returns not outgo are better]
  • need to select on more than gender, age, and region
  • [i love how some speakers still have non-english sections in their presentation – parts they forgot to translate or that weren’t translatable. now THIS is learning from peers around the world!]

measuring subjective wellbeing: does the use of websurveys bias the results? evidence from the 2013 GEM data from luxembourg

  • almost everyone is completely reachable by internet
  • web surveys are cool – convenient for respondents, less social desirability bias, can use multimedia, less expensive, less coding errors; but there are sampling issues and bias from the mode
  • measures of subjective well being – i am satisfied with my life, i have obtained all the important things i want in my life, the conditions of my life are excellent, my life is close to my ideal [all positively keyed]
  • online survey gave very slightly lower satisfaction
  • the result is robust to three econometric techniques
  • results from happiness equations using differing modes are compatible
  • web surveys are reliable for collecting information on wellbeing

Assessing and addressing measurement equivalence in cross-cultural surveys #ESRA15 #MRX 

Live blogged from #ESRA15 in Reykjavik. Any errors or bad jokes are my own.

Today’s lunch included vanilla Skyr. Made with actual vanilla beans. Beat that, yoghurt of home! Once again, I cannot choose a favourite among coconut, pear, banana, and vanilla other than to say it completely beats yoghurt. I even have a favourite brand although since I don’t have the container in front of me right now, I can’t tell you the brand. It still counts very much as brand loyalty though because I know exactly what the container looks like once I get in the store.

I have to say I remain really impressed with the sessions. They are very detail oriented and most people provide sufficient data for me to judge for myself whether I agree with their conclusions. There’s no grandstanding, essentially no sales pitches, and I am getting take-aways in one form or another from nearly every paper. I’m feeling a lot less presentation pressure here simply because it doesn’t seem competitive. If you’ve never been to an ESRA conference, I highly recommend it. Just be prepared to pack your own lunch every day. And that works just great for me.

cross cultural equivalence of survey response latencies

  • how long does it take for a respondent to provide their answer, easy to capture with computer assisted interviewing, uninfluenced by self reports
  • longer latencies seem to represent more processing time for cognitive operations; latency also represents the presence and accessibility of attitudes and the strength of those attitudes
  • longer latencies correlated with age, alcohol use, and poorly designed and ambiguous questions, perhaps there is a relationship with ethnic status
  • does latency differ by race/ethnicity; do they vary by language of interview
  • n=600 laboratory interview, 4 race groups, 300 questions taking 77 minutes all about health, order of sections rotated
  • required the interviewer to hit a button when they stopped talking and hit a button when the respondent started talking; also recorded whether there were interruptions in the response process; only looked at perfect responses [which are abnormal, right?]
  • reviewed all types of question – dichotomous, categorical, bipolar scales, etc
  • hispanic, black, korean indeed took longer to answer compared to white people on the english survey in the usa
  • more educated took slightly less time to answer
  • numeric responses took much longer, yes/no took the least, unipolar was second least
  • trend was about the same by ethnicity
  • language was an important indicator

comparing survey data quality from native and nonnative english speakers

  • me!
  • conclusion – using all of our standard data quality measures may eliminate people based on their language skills, not their data quality. But certain data quality measures are more likely to predict language rather than data quality. We should focus more on straightlining and overclicking and ignore underclicking as a major error.
  • ask me for the paper 🙂

trust in physicians or trust in physician – testing measurement invariance of trust in physicians in different health care cultures

  • trust reduces social complexity, solves problems of risk, makes interactions possible
  • we lack knowledge of various professions – lawyers, doctors, etc, we don’t understand diagnosis, treatments
  • we must rely on certificates, clothes such as the doctor’s white coat, location such as a hospital
  • is there generalized trust in doctors
  • different health care systems produce different kinds of trust, ditto cultural contexts, political and values systems
  • compared three countries with health care coverage and similar doctors per person measurements
  • [sorry, didn’t get the main conclusion from the statement “results were significant”]

Advancements of survey design in election polls and surveys #ESRA15 #MRX 

Live blogged from #ESRA15 in Reykjavik. Any errors or bad jokes are my own.

I decided to take the plunge and choose a session in a different building this time. The bravery isn’t much to be noted as I’ve realized that the campus and buildings and rooms at the University of Iceland are far tinier than what I am used to. Where I’d expect neighboring buildings to be a ten minute walk from one end to the other, here it is a 30 second walk. It must be fabulous to attend this university where everything and everyone is so close!

I’m quite loving the facilities. For the most part, the chairs are comfortable. Where it looks like you just have a chair, there is usually a table hiding in the seat in front of you. There is instantly connecting and always on wifi no matter which building you’re in. There are computers in the hallways, and multiple plugs at all the very comfy public seating areas. They make it very easy to be a student here! Perhaps I need another degree?


Designing effective likely voter models in pre-election surveys

  • voter intention and turnout can be extremely different. 80% say they will vote but 10% to 50% is often the share that actually votes
  • democratic vote share is often over represented [social desirability?]
  • education has a lot of error – 5% error rate, worst demographic variable
  • what voter model reduces these inaccuracies
  • behavioural models (intent to vote, have you voted, dichotomous variables) and resource-based models (…)
  • vote intention does predict turnout – 86% are accurate, also reduces demographic errors
  • there’s not a lot of room to improve except when the polls look really close
  • Gallup tested a two item measure of voting intention – how much have you thought about this election, how likely are you to vote
  • 2 item scale performed far better than the 7 item scale, error rate of 4% vs 1.4%
  • [just shown a histogram with four bars. all four bars look essentially the same. zero attempt to create a non-existent different. THAT’S how you use a chart 🙂 ]
  • gallup approach didn’t work well, probability approach performed better
  • best measure of voting intention = Thought about election + likelihood of voting + education + voted before + strength of partisan identity (a fitting sketch follows this list)
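
That composite translates naturally into a response-propensity-style model. A minimal sketch under my own assumptions — the column names are hypothetical and the synthetic data stands in for a file with validated turnout:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
voters = pd.DataFrame({
    "thought_about_election": rng.integers(1, 5, n),  # 1..4
    "likelihood_of_voting":   rng.integers(1, 8, n),  # 1..7
    "education":              rng.integers(1, 6, n),
    "voted_before":           rng.integers(0, 2, n),
    "partisan_strength":      rng.integers(1, 4, n),
})
# Synthetic "validated turnout", loosely driven by two of the predictors.
logit = -4 + 0.4 * voters["likelihood_of_voting"] + 0.8 * voters["voted_before"]
voters["turnout"] = rng.random(n) < 1 / (1 + np.exp(-logit))

X = voters.drop(columns="turnout")
model = LogisticRegression(max_iter=1000).fit(X, voters["turnout"])
voters["turnout_propensity"] = model.predict_proba(X)[:, 1]
```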

polls on national independence: the scottish case in a comparative perspective

  • [Claire Durand from the University of Montreal speaks now. Go Canada! 🙂 ]
  • what happened in quebec in 1995? referendum on independence
  • quebec and scotland are nationalist in a british type system, proportion of nonnationals is similar
  • referenda are 50% + 1 wins
  • but polls have many errors; is there an anti-incumbent effect
  • “no” is always underestimated – whatever the no is
  • are referenda on national independence different – ethnic divide, feeling of exclusion, emotional debate, ideological divide
  • No side has to bring together enemies and don’t have a unified strategy
  • how do you assign non-disclosure?
  • don’t know doesn’t always mean don’t know
  • don’t distribute non-disclosures proportionally, they aren’t random
  • asking how people would vote TODAY resulted in 5 points less nondisclosure
  • corrections need to be applied after the referendum as well
  • people may agree with the general demands of the national parties but not with the solution they propose. maintaining the threat allows them to maintain pressure for change.
  • the quebec newspapers reported the raw data plus the proportional response so people could judge for themselves

how good are surveys at measuring past electoral behaviour? lessons from an experiment in a french online panel study

  • study bias in individual vote recall
  • sample size of 6000
  • overreporting of popular party, underreporting of less popular party
  • 30% of voter recall was inconsistent
  • inconsistent respondents changed their recall for many reasons: they changed parties, had memory problems, concealed their vote, said they didn’t vote, or said they voted and then said they didn’t (or vice versa)
  • could be any number of interviewer issues
  • older people found it more difficult to remember but perhaps they have more voter loyalty
  • when available, use vote recall from the pre-election survey
  • using vote recall from the post-election survey underestimates voter transfers
  • caution in using vote recall to weight samples

methodological issues in measuring vote recall – an analysis of the individual consistency of vote recall in two election longitudinal surveys

  • popularity = weighted average % of electorate represented
  • universality = weighted frequency of representing a majority
  • used four versions of non/weighting including google hits
  • measured 38 questions related to political issues
  • voters are driven by political tradition even if outdated, or by personal images of politicians not based on party manifestos
  • voters are irrational; the political landscape has shifted even though people see the parties the same way they were decades ago
  • coalition formation aggravates the situation even more
  • discrepancy between the electorate and the government elected

The impact of questionnaire design on measurements in surveys #4 #ESRA15 #MRX 

Live blogged from #ESRA15 in Reykjavik. Any errors or bad jokes are my own.

Well, last night i managed to stay up until midnight. The lights at the church went on, lighting up the tower and the very top in an unusual way. They were quite pretty! The rest of the town enjoyed mood lighting as it didn’t really get dark at all. Tourists were still wandering in the streets since there’s no point going to bed in a delightful foreign city if you can still see where you’re going. And if you weren’t a fan of the mood lighting, have no fear! The sun ‘rose’ again just four hours later. If you’re scared of the dark, this is a great place to be – in summer!

Today’s program for me includes yet another session of question data quality, polling question design, and my second presentation on how non-native English speakers respond to English surveys. We may like to think that everyone answering our surveys is perfectly fluent but let’s be realistic. About 10% of Americans have difficulty reading/writing in English because it is not their native language. Add to that weakly and non-literate people, and there’s potential big trouble at hand.


the impact of answer format and item order on the quality of measurement

  • compared a 2 point scale and an 11 point scale, with different orders of questions where related questions could even be very widely separated; looked at perceived prestige of occupations
  • separated two pages of the survey with a music game of guessing the artist and song, purely as distraction from the survey. the second page was the same questions in a completely different order; did the same thing numerous times, changing the number of response options and question orders each time. the whole experiment lasted one hour
  • assumed scale was unidimensional
  • no differences comparing 4 point to 9 point scale, none between 2 point and 9 point scale [so STOP USING HUGE SCALES!!!]
  •  prestige does not change depending on order in the survey [but this is to be expected with non-emotional, non-socially desirable items]
  • respondents confessed they tried to answer well but maybe not the best of their ability or maybe their answers would change the next time [glad to see people know their answers aren’t perfect. and i wouldn’t expect anything different. why SHOULD they put 100% effort into a silly task with no legitimate outcome for them.]

measuring attitudes towards immigration with direct questions – can we compare 4 answer categories with dichotomous responses

  • when sensitive questions are asked, social desirability affects response distributions
  • different groups are affected in different ways
  • asked questions about racial immigration – asked binary or as a 4 point scale
  • it’s not always clear that slightly is closer to none or that moderately is closer to strongly. can’t just assume the bottom two boxes are the same or the top two boxes are the same
  • education does have an effect, as well as age in some cases
  • expression of opposition for immigration depends on the response scale
  • binary responses lead to 30 to 50% more “allow none” responses than the 4 point scale
  • respondents with lower education have a lower probability of choosing the middle scale point

cross cultural differences in the impact of number of response categories on response behaviour and data structure of a short scale for locus of control

  • locus of control scale, 4 items, 2 internal, 2 external
  • tested 5 point vs 9 point scale
  • do the means differ, does the factor structure differ
  • I’m my own boss; if i work hard, i’ll succeed; when at work or in my private life, what I do is mainly determined by others; bad luck often gets in the way of my plans
  • labeled from “doesn’t apply at all” to “applies completely”
  • didn’t see important demographic differences
  • saw one interaction but it didn’t really make sense [especially given sample size of 250 and lots of other tests happening]
  • [lots of chatter about significance and non-significance but little discussion of what that meant in real words]
  • there was no effect of item order, # of answer options mattered for external locus but not internal locus of control
  • [i’d say hard to draw any conclusions given the tiny number of items, small sample size. desperately needs a lot of replication]

the optimal number of categories in item specific scales

  • type of rating scale where the answer is specific to the scale and doesn’t necessarily apply to every other item – what is your health? excellent, good, poor
  • quality increased with the number of answer options comparing 11,7,5,3 point scales but not comparing 10,6,4 point scales
  • [not sure what quality means in this case, other audience members didn’t know either, lacking clear explanation of operationalization]

The impact of questionnaire design on measurements in surveys #3 #ESRA15 #MRX 

Live blogged from #ESRA15 in Reykjavik. Any errors or bad jokes are my own.

We had 90 minutes for lunch today which is far too long. Poor me. I had pear skyr today to contrast yesterday’s coconut skyr. I can’t decide which one I like better. Oh, the hard decisions I have to make! I went for a walk which was great since it drizzled all day yesterday. The downtown is tiny compared to my home so it’s quite fun to walk from one end to the other, including dawdling and eating, in less than half an hour. It’s so tiny that you don’t need a map. Just start walking and take any street that catches your fancy. I dare you to get lost. Or feel like you’re in an unsafe neighbourhood. It’s not possible.

I am in complete awe at the bird life here. There are a number of species I’ve never seen before, which on its own is fun. It is also baby season so most of the ducks are paired off and escorting 2 to 8 tiny babies. They are utterly adorable as the babies float so well that they can barely swim underwater to eat. I haven’t seen any puffins along the shore line. I’m still hopeful that a random one will accidentally wander across my path.

By the way, exceptional beards really are a thing here. In case you were curious.

the Who: experimental evidence on the effect of respondent selection on collecting individual asset ownership information

  • how do you choose who to interview?
  • “Most knowledgeable person”, random selection, the couple together, each individual adult by themselves about themselves, or by themselves about other people
  • research done in uganda so certainly not generalizable to north america
  • ask about dwelling, land, livestock, banking, bequeathing, selling, renting, collateral, investments
  • used CAPI, interviews matched on gender, average interview was 30 minutes
  • challenges included hard to find couples together as one person might be working in the field, hard to explain what assets were
  • asking couple together shows differences from ownership incidence but the rest is the same
  • [sorry, couldn’t determine what “significant positive results” actually meant. would like to know. 😦 ]

Portuguese national health examination survey: questionnaire development

  • study includes physical measurements and a survey of health status, health behaviours, medication, income, expenses
  • pre-tested the survey for comprehension and complexity
  • found they were asking for things from decades ago and people couldn’t remember (eg when did you last smoke)
  • some mutually exclusive questions actually were not
  • you can’t just ask about ‘activity’ you have to ask about ‘physical activity that makes you sweat’
  • response cards helped so that people didn’t have to say an embarrassing word
  • had to add instructions that “some questions may not apply to you but answer anyways” because people felt that if you saw them walking you shouldn’t ask whether they can walk
  • gave examples of what sitting on the job, or light activity on the job meant so that desk sitters don’t include walking to the bathroom as activity
  • pretest revealed a number of errors that could be corrected, language and recall problems can be overcome with better questions

an integrated household survey for Wales

  • “no change” is not a realistic option [i wish more people felt that way]
  • duplication among the various surveys, inefficient, survey costs are high
  • opportunity to build more flexibility into a new survey
  • annual sample size of 12000, randomly selected 16+ adults, 45 minutes
  • want to examine effects of offering incentives
  • survey is still in field
  • 40% lower cost compared to previous, significant gains in flexibility

undesired responses to surveys: wrong answers or poorly worded questions? how respondents insist on reporting their situation despite unclear questioning

  • compared census information with family survey information
  • interested in open text answers
  • census has been completed since 1881
  • belle-mère can mean stepmother and mother-in-law in french
  • can’t tell which adult child in the house any grandchildren belong to
  • ami can mean friend or boyfriend or partner or spouse, some people will also specify childhood friend or unemployed friend or family friend
  • can’t tell if an unknown location of child means they don’t know the address or the child has died
  • do people with an often changing address live in a camper, or travel for work?
  • if you only provide age in years for babies you won’t know if it’s stillborn or actually 1 year old

ask a positive question and get a positive answer: evidence on acquiescence bias from health care centers in nigeria

  • created two pairs of questions where one was positive and one was negative – avoided the word no [but the extremeness of the questions differed, e.g., “Price was reasonable” vs “Price was too expensive”]
  • some got all positive, all negative, or a random mix
  • pilot test was a disaster, in rural nigeria people weren’t familiar with this type of question
  • instead, started out asking a question about football so people could understand how the question worked. asked agree or disagree, then asked moderately or strongly – two stage likert scale
  • lab fees were reasonable generated very different result than lab fees were unreasonable [so what is reality?]
  • it didn’t matter if negatives were mixed in with positives
  • acquiescence bias affects both positive and negative questions, can’t say if it’s truly satisficing, real answer is probably somewhere in between [makes me wonder, can we develop an equation to tease out truth]
  •  large ceiling effects on default positive framing — clinics are satisfactory despite serious deficiencies
  • can’t increase scores with any intervention but you can easily decrease the scores
  • maybe patient satisfaction is the wrong measure
  • recommend using negative framing to avoid ceiling effects [I wonder if in north america, we’re so good at complaining that this isn’t relevant]

The impact of questionnaire design on measurements in surveys #1 #ESRA15  #MRX  

Live blogged from #ESRA15 in Reykjavik. Any errors or bad jokes are my own.

I tried to stay up until midnight last night but ended up going to bed around 10:30pm. Naturally, it was still daylight outside. I woke up this morning at 6am in broad daylight again. I’m pretty sure it never gets dark here no matter what they say. I began my morning routine as usual. Banged my head on the slanted ceiling, stared out the window at the amazing church, made myself waffles in the kitchen, and then walked past the pond teeming with baby ducks. Does it get any better? I think no. Except of course knowing I had another day of great content rich sessions ahead of me!


designs and developments of the income measures in the european social surveys

  • tested different income questions. allowed people to use a weekly, monthly, or annual income scale as they wished. there was also no example response, and no example of what constitutes income. Provided about 30 answer options to choose from, shown in three columns. Provided same result as a very specific question in some countries but not others.
  • also tested every country getting the same number breaks; groups weren’t arranged to reflect each country’s distribution. this resulted in some empty breaks [but that’s not necessarily a problem if the other breaks are all well and evenly used]
  • when countries are asked to set up number breaks in well defined deciles, high incomes are chosen more often – affected because people had different ideas of what is and isn’t taxable income
  • [apologies for incomplete notes, i couldn’t quite catch all the details, we did get a “buy the book” comment.]

item non-response and readability of survey questionnaire

  • any non-substantive outcome – missing values, refusals, don’t knows all count
  • non response can lower validity of survey results
  • semantic complexity measured by familiarity of words, length of words, abstract words that can’t be visualized, structural complexity
  • Measured – characters in an item, length of words, percent of abstract words, percent of lesser known words, percent of long words (12 or more characters); a sketch follows this list
  • used the european social survey which is a highly standardized international survey, compared english and estonian, it is conducted face to face, 350 questions, 2422 uk respondents
  • less known and abstract words create more non-response
  • long words increase nonresponse in estonian but not in english, perhaps because english words are shorter anyways
  • percent of long words in english created more nonresponse
  • total length of an item didn’t affect nonresponse
  • [they used a list of uncommon words for measurement, such a book/list does exist in english. I used it in school to choose a list of swear words that had the same frequency levels as regular words.]
  • [audience comment – some languages join many words together which means their words are longer but then there are fewer words, makes comparisons more difficult]
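
The mechanical complexity measures from that list are easy to reproduce. A minimal sketch (my own, with a hypothetical `rare_words` set standing in for the word-frequency lexicon the authors used):

```python
def item_complexity(item: str, rare_words: set = frozenset()) -> dict:
    """Complexity measures for one questionnaire item."""
    words = item.lower().rstrip("?.!").split()
    return {
        "chars": len(item),
        "mean_word_length": sum(map(len, words)) / len(words),
        "pct_long_words": sum(len(w) >= 12 for w in words) / len(words),
        "pct_rare_words": sum(w in rare_words for w in words) / len(words),
    }

print(item_complexity(
    "How often do you use internet-based communication technologies?"))
```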

helping respondents provide good answers in web surveys

  • some tasks are inherently difficult in surveys, often because people have to write in an answer, coding is expensive and error prone
  • this study focused on prescription drugs which are difficult to spell, many variations of the same thing, level of detail is unclear, but we have full lists of all these drugs available to us
  • tested text box, drop box to select from list, javascript (type ahead look up)
  • examined breakoff rates, missing data, response times, and codability of responses
  • asked people if they are taking drugs, tell us about three
  • study 1 – breakoffs higher from dropbox and javascript; median response times longer, but codability was better. Lists didn’t work well at all.
  • study 2 – cleaned up the list, made all the capitalization the same. break off rates were now all the same. response times lower but still higher than the textbox version. codability still better for list versions.
  • study 3 – if they couldn’t find a drug in the list, they were allowed to type it out, unlike previous studies which proceeded with the missing data. dropbox had highest missing data. javascript had lowest missing data. median times highest for drop box. trends for additional drugs were as expected; the effect grows, but not by as much.
  • older browsers had trouble with dropdowns and javascript and had to be routed to the textbox options
  • if goal is to get codable answers, use a text box. if goal is to create skip patterns then javascript is the way to go.

rating scale labelling in web surveys – are numeric labels an advantage

  • you can use all words to label scales or just words on the end with numbers in between
  • research says there is less satisficing with verbal scales, they are more natural than numbers and there is no inherent meaning of numbers
  • means of the scales were different
  • less time to complete for the end-labeled groups
  • people paid more attention to the five point labeled scale, and least to the end-point labeled scale
  • mean opinions did differ by scale, more positive on fully labeled scale
  • high cognitive burden to map responses of the numeric scales
  • lower reliability for the numeric labels

Direction of response scales #ESRA15 #MRX 

Live blogged at #ESRA15 in Reykjavik. Any errors or bad jokes in the notes are my own.

I discovered that all the buildings are linked indoors. Let it rain, let it rain, I don’t care how much it rains…. [Feel free to sing that as loud as you can.] Lunch was Skyr, oat cookies and some weird beet drink. Yup. I packed it myself. I always try to like yogurt and never really do. Skyr works for me. So far, coconut is my favourite. I’ve forgotten to take pictures of speakers today so let’s see if I can keep the trend going! Lots of folks in this session so @MelCourtright and I are not the only scale geeks out there. 🙂

Response scales: Effects of scale length and direction on reported political attitudes

  • instruments are not neutral, they are a form of communication
  • cross national projects use different scales for the same question so how do you compare the results
  • trust in parliament is a fairly standard question for researchers and so makes a good example
  • 4 point scale is most popular but it is used up to 11 points, traditional format is very positive to very negative
  • included a don’t know in the answer options
  • transformed all scales onto a 0 to 1 scale and evenly distributed all scores in between (see the sketch after this list)
  • means highest with 7 point scale traditional direction and lowest with 4 point and 11 point traditional direction
  • reverse direction had much fewer mean differences, essentially all the same
  • four point scales show differences in direction, 7 and 11 point show fewer differences in direction
  • [regression results shown on the screen – no one fainted or died, the speaker did not apologize or say she didn’t understand them. interesting difference compared to MRX events.]
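
The 0-to-1 transformation in that list is just a linear map. A one-function sketch, assuming integer scale points 1..k:

```python
def rescale(x: int, k: int) -> float:
    """Map a response on a 1..k point scale onto [0, 1] with even spacing."""
    return (x - 1) / (k - 1)

assert rescale(1, 4) == 0.0 and rescale(4, 4) == 1.0
print(rescale(4, 7))  # 0.5 — the midpoint of a 7-point scale
```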

Does satisficing drive scale direction effects

  • research shows answers shift towards the start of the scale but this is not consistent
  • anchoring and adjustment effects whereby people use the first answer option as the anchor; interpretative heuristics suggest people choose an early response to express their agreement with the question; primacy effects due to satisficing decrease cognitive load
  • scores were more positive when the scale started positive, differences were huge across all the brands
  • the pattern is the same but the differences are noticeable
  • speeding measured as under 300 milliseconds per word (see the sketch after this list)
  • speeders more likely to choose early answer option
  • answers are pushed to the start of the scale, limited evidence that it is caused by satisficing
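
The 300-milliseconds-per-word speeding rule translates directly into a flag. A minimal sketch with hypothetical inputs:

```python
def is_speeder(response_ms: float, question_text: str,
               ms_per_word: float = 300.0) -> bool:
    """Flag a response submitted faster than ~300 ms per word of question text."""
    return response_ms < ms_per_word * len(question_text.split())

# 9 words -> 2,700 ms threshold, so a 2-second answer gets flagged.
print(is_speeder(2000, "How satisfied are you with your current internet provider?"))
```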

Ordering your attention: response order effects in web-based surveys

  • primacy happens more often visually and recency more often orally
  • scales have an inherent order. if you know the first answer option, you know the remainder of the options
  • sample size over 100,000, randomly assigned to scale order; also tested labeling, orientation, and number of response categories from 2 to 11
  • the order effect was always a primacy effect; differences were significant though small, significance more due to sample size [then why mention the results if you know they aren’t important?]
  • order effects occurred more with fully labeled scales, end labeled scales did not see response order effects
  • second study also supported the primacy effect with half of questions showing the effect
  • a much stronger response effect was seen with unipolar scales
  • vertical scales show a much stronger response effect as well
  • largest effect seen for horizontal unipolar scale
  • need to run the same tests with grids, don’t know which response is more valid, need to know what they will be and when

Impact of response scale direction on survey responses in web and mobile web surveys

  • why does this effect happen?
  • tested agreement scales and frequency scales
  • shorter scale decreases primacy effect
  • scale length has a significant moderating effect – stronger effect for 7 point scales compared to 5 point scales
  • labeling has significant moderating effects – stronger effect for fully labeled
  • question location matters – stronger effect on earlier questions
  • labeled behavioural scale shows the largest impact, end labeled attitudinal scale has the smallest effect
  • scale direction affects responses – more endorsement at start of scale
  • 7 point fully labeled frequency scale is most affected
  • we must use shorter scales and end labeling to reduce scale direction effects in web surveys

Importance of scale direction between different modes

  • term used is forward/reverse scale [as opposed to ascending/descending or positive/negative keyed]
  • in the forward version of the scale, the web creates more agreement; but face to face it’s very weak. face to face shows recency effect
  • effect is the same for general scales (all scales are agreement) and item specific scales (each scale reflects the specific question), more cognitive effort in the item specific scale so maybe less effort is invested in the response
  • item specific scale affected more by the web
  • randomizing scale matters more in online surveys



 

Assessing the quality of survey data (Good session!) #ESRA15 #MRX 

Live blogged from #ESRA15 in Reykjavik. Any errors or bad jokes in the notes are my own. As you can see, I managed to find the next building from the six buildings the conference is using. From here on, it’s smooth sailing! Except for the drizzle. Which makes wandering between buildings from session to session a little less fun and a little more like going to a pool. Without the nakedness.

Session #1 – Data quality in repeated surveys: evidence from a quasi-experimental design by multiple professors from university of Rome

  • respondents can refuse to participate in the study resulting in a series of missing data, but their study had very little missing data, only about 5% this time [that’s what student respondents do for you; would like to see a study with much larger missing rates]
  • questions had an i do not know option, and there was only one correct answer
  • 19% of gender/birthday/socioeconomic status changed from survey to survey [but we now understand that gender can change, researchers need to be open to this. And of course, economic status can change in a second]

Session #2 – me! Lots of great questions, thank you everyone!

Session #3 – Processing errors in the cross national surveys

  • we don’t consider process errors very often as part of total survey error
  • found 154 processing errors in the series of studies – illegitimate variable values such as education that makes little sense or age over 100, misleading variable values, contradictory values, value discrepancies, lack of value labels, cases where you expect a range but get a specific value, or where 2 is coded as yes in the software but no in the survey
  • age and education were most problematic, followed by schooling
  • lack of labels was the worst problem, followed by illegitimate values, and misleading values
  • is 22% discrepancies out of all variables checked good or bad?

Session #4 – how does household composition derived from census data describe or misrepresent different family types

  • the strength of census data is their exhaustiveness; how does census data differ from a smaller survey
  • the census counts household members while the family survey describes families and explores people outside the household, such as those living apart; they describe different universes. a boarder may not be measured in the family survey but is mentioned in the census
  • in 10% of cases, more people are counted in the census, 87% have the same number of people on both surveys
  • census is an accounting tool, not a tool for understanding social life, people do not organize their lives to be measured and captured at one point and one place in time
  • census only has a family with at least one adult and at least one child
  • isolated adult in a household with other people is 5% of adults in the census, not classified the same in both surveys
  • there is a problem attributing children to the right people – problem with single parent families; single adults are often ‘assigned’ a child from the household
  • a household can include one or two families at the most – complicated when adult children are married and maybe have a kid. A child may be assigned to a grandparent, which is an error.
  • isolated adults may live with a partner in the dwelling, some live with their parents, some live with a child (but children move from one household to another), 44% of ‘isolated’ adults live with family members, they aren’t isolated at all
  • previously couples had to be heterosexual; even though they answered the survey as a union, the rules split them into isolated adults [that’s depressing. thank you for changing this rule.]
  • the census is more imperfect than the survey; it doesn’t catch subtle transformations in societal life. calls into question definitions of marginal groups
  • also a problem for young adults who leave home but still have strong ties to the parents home – they may claim their own home and their parents may also still claim them as living together
  • [very interesting talk. never really thought about it]

Session #5 – Unexpectedly high number of duplicates in survey data

  • simulated duplicates created greater bias of the regression coefficient when up to 50% of cases were duplicated 2 to 5 times
  • birthday paradox – how many persons are needed in order to find two having an identical birthday – 23. A single duplicate in a dataset is likely. (see the sketch after this list)
  • New method – the Hamming diagram – diversity of data for a survey – it looks like a normal curve with some outliers so I’m thinking Hamming is simply a score like Mahalanobis is for outliers
  • found duplicates in 10% of surveys; 14 surveys comprised 80% of total duplicates, with one survey at 33%
  • which case do you delete? which one is right if indeed one is right. always screen your data before starting a substantial analysis.
  • [i’m thinking that ESRA and AAPOR are great places to do your first conference presentation. there are LOTS of newcomers and presentation skills aren’t fabulous. so you won’t feel the same pressure as at other conferences. Of course, you must have really great content because here, content truly is king]
  • [for my first ESRA conference, i’m quite happy with the quality of the content. now let’s hope for a little sun over the lunch hour while I enjoy Skyr, my new favourite food!]
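
The birthday-paradox arithmetic from the talk, plus a brute-force exact-duplicate screen, in a minimal sketch (my code, not the presenters’). Near-duplicates in the Hamming spirit would extend this by counting pairwise disagreements instead of requiring exact matches:

```python
import pandas as pd

def p_any_collision(n: int, days: int = 365) -> float:
    """Probability that at least two of n independent draws share a value."""
    p_unique = 1.0
    for k in range(n):
        p_unique *= (days - k) / days
    return 1.0 - p_unique

print(round(p_any_collision(23), 3))  # 0.507 — one shared birthday is more likely than not

def exact_duplicates(df: pd.DataFrame) -> pd.DataFrame:
    """Return rows whose full answer pattern occurs more than once."""
    return df[df.duplicated(keep=False)]
```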


How to go to a pool in Reykjavik Iceland #ESRA15

I know that sounds stupid. But, to anyone from North America, it’s going to catch you off guard. I took the plunge first so you won’t be as dumbstruck as I was. Note: I did not die in the process.

  1. Bring your bathing suit and towel. If you forgot yours, you can rent both at the pool. You can also borrow a towel from your hotel.
  2. Arrive at the pool and pay the fee. (Around 650 ISK, which for me was $6.50.) You will be given a ticket or a bracelet that you scan in order to get in.
  3. There is a change room for females and separate one for males. Have no fear.
  4. Leave your shoes on the rack which is either outside the change room or immediately inside. They will be there when you get back after swimming.
  5. Walk past any naked people and find an empty locker. Take the key from the lock and put the elastic band around your wrist.
  6. Strip naked. (In your head, pretend you do this all the time and it’s no big deal.) Take your towel and swimsuit to the shower area. NO ONE is looking at you. They’re busy with whatever they’re doing.
  7. Put your towel and suit on a nearby rack with everyone else’s towel.
  8. Shower naked with the liquid soap provided. Many people bring their own soap. Wash all of you with soap. WASH, not just a quick dip.
  9. Put on your swimsuit. (Good luck with that. Wet body + dry suit = gymnastics.) Leave your towel there. It will be there when you return.
  10. Go out into the pool area. Swim, get hot, swim, chat people up, swim, get hot.
  11. When you’re done, go back inside.
  12. Shower with or without your suit but you might as well…..
  13. Strip naked. Dry off. You are not allowed back at the lockers until you are completely dry and won’t drip on the floor.
  14. If you’re lucky, there’s a bathing suit spinner. Spin your suit dry.
  15. While naked, carry your suit and towel back to your locker.
  16. Get dressed.
  17. Get your shoes.
  18. Return your wristband if you were given one.
  19. Leave with fond memories of the swim and be proud that you got naked in public and no one died in the process.

There’s a lot of naked going on there but the important part to remember is that this is how it’s always done. No one’s looking at you. No one cares about you. In fact, there may only be a couple of other people in the change room at the same time as you anyways. Just resign yourself to the nakedness. The pool and hot tubs are worth it. 🙂

I managed to get to two pools:

  • Laugardalslaug: A+ outdoor pool. Really fun slide. Every ADULT should go on the slide. I promise you’ll squeal. Several hot tubs of different temperatures. Really nice lane pool. Lots of fun toys for the kiddies who get a separate play area from the adults. Good choice if this is the only one you go to. They use a great electronic wristband locker system.
  • Vesturbaejarlaug: B+ outdoor pool. Standard lane pool plus hot tubs. It’s smaller and feels more cozy. They use a physical key for the lockers. Convenient but not as spiffy techie as the previous pool.

Questions? Ask away 🙂
