Tag Archives: nonprobability

New Math For Nonprobability Samples #AAPOR 

Moderator: Hanyu Sun, Westat

Next Steps Towards a New Math for Nonprobability Sample Surveys; Mansour Fahimi, Frances M. Barlas, Randall K. Thomas, and Nicole R. Buttermore, GfK Custom Research

  • Neyman paradigm requires complete sampling frames and complete response rates
  • Non-probability sampling is important because those assumptions are not met: sampling frames are incomplete, response rates are low, and there are budget and time crunches
  • We could ignore that we are dealing with nonprobability samples, find new math to handle this, or try more weighting methods [speaker said commercial research ignores the issue – that is absolutely not true. We are VERY aware of it and work within appropriate guidelines]
  • In practice, sampling frames are incomplete so samples aren’t random, respondents choose not to respond, weighting has to be more creative, and uncertainty around inferences is increasing
  • There is fuzz all over; the relationship is nonlinear and complicated
  • Geodemographic weighting is inadequate; weighted estimates to benchmarks show huge significant differences [this assumes the benchmarks were actually valid truth but we know there is error around those numbers too]
  • Calibration 1.0 – corrects for the higher agreement propensity of early adopters, using items like: I try new products first, I like a variety of new brands, I shop for what’s new, I’m first among my friends, I tell others about new brands; this is in addition to geography
  • But this is only a univariate adjustment, one theme; sometimes it’s insufficient
  • Sought a multivariate adjustment
  • Calibration 2.0 – social engagement, self importance, shopping habits, happiness, security, politics, community, altruism, survey participation, Internet and social media
  • But these dozens of questions would burden the task for respondents, and weighting becomes an issue
  • What is the right subset of questions for the biggest effect?
  • Number of surveys per month, hours on the Internet for personal use, trying new products before others, time spent watching TV, using coupons, number of relocations in the past 5 years
  • Tested against external benchmarks, election, BRFSS questions, NSDUH, CPS/ACS questions
  • Nonprobability samples based on geodemographics are the worst of the set; adding calibration improves them, nonprobability plus calibration is even better, and the probability panel was the best [pseudo probability]
  • Calibration 3.0 is hours on Internet, time watching TV, trying new products, frequency expressing opinions online
  • Remember Total Research Error, there is more error than just sampling error
  • Combining nonprobability and probability samples, use stratification methods so you have a resemblance of the target population, which gives you a better sample size for weighting adjustments
  • Because there are so many errors everywhere, even nonprobability samples can be improved
  • Evading calibration is wishful thinking and misleading (a rough raking sketch follows this list)
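
The talks don’t share actual weighting code, but since calibration is the theme, here is a minimal sketch of the kind of adjustment being described: iterative raking (rim weighting) to benchmark margins, in Python. The calibration variables and benchmark targets below are invented for illustration, not taken from the presentation.

```python
import pandas as pd

def rake(df, margins, weight_col="weight", max_iter=50, tol=1e-6):
    """Iteratively adjust weights until weighted margins match benchmarks.

    margins: dict mapping column name -> {category: target proportion}
    """
    df = df.copy()
    df[weight_col] = 1.0  # start from equal weights
    for _ in range(max_iter):
        max_shift = 0.0
        for col, targets in margins.items():
            # current weighted distribution of this calibration variable
            current = df.groupby(col)[weight_col].sum() / df[weight_col].sum()
            for cat, target in targets.items():
                factor = target / current[cat]
                df.loc[df[col] == cat, weight_col] *= factor
                max_shift = max(max_shift, abs(factor - 1.0))
        if max_shift < tol:  # all margins already on target
            break
    return df

# Hypothetical calibration variables echoing the talk (heavy internet use,
# early adoption), with assumed population benchmarks.
sample = pd.DataFrame({
    "heavy_internet": ["yes", "no", "no", "yes", "no", "no"],
    "early_adopter":  ["yes", "yes", "no", "no", "no", "no"],
})
benchmarks = {
    "heavy_internet": {"yes": 0.40, "no": 0.60},
    "early_adopter":  {"yes": 0.15, "no": 0.85},
}
print(rake(sample, benchmarks))
```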

Quota Controls in Survey Research: A Test of Accuracy and Inter-source Reliability in Online Samples; Steven H. Gittelman, MKTG, INC.; Randall K. Thomas, GfK Custom Research; Paul J. Lavrakas, Independent Consultant; Victor Lange, Consultant

  • A moment of silence for a probabilistic frame 🙂
  • FoQ 2 – do quota controls help with the effectiveness of sample selections? what about propensity weighting and matching models?
  • 17 panels gave 3000 interviews via three sampling methods each; panels remain anonymous, 2012-2013; plus telephone sample including cell phone; English only; telephone was 23 minutes 
  • A – nested region, sex, age quotas
  • B – added non-nested ethnicity quotas
  • C – added non-nested education quotas
  • D – company’s proprietary method
  • 27 benchmark variables across six government and academic studies; 3 questions were deleted because of social desirability bias
  • Doing more than A did not result in a reduction of bias; nested age and sex within region was sufficient; race had no effect and neither did C, and those made the method more difficult; but this is overall only, not looking at subsamples
  • None of the proprietary methods provided any improvement to accuracy; on average they weren’t powerful, and they were a ton of work with tons of sample
  • A, B, and C were essentially identical; one proprietary method did worse; phone was not all that much better
  • Even phone – 33% of differences were statistically significant [makes me think that benchmarks aren’t really gold standard but simply another sample with its own error bars]
  • The proprietary methods weren’t necessarily better than phone
  • [shout out to Reg Baker 🙂 ]
  • Some benchmarks performed better than others, some questions were more of a problem than others. If you’re studying Top 16 you’re in trouble
  • Demo-only was better than the advanced models; advanced models were much worse or no better than demo-only models
  • An advanced model could be better or worse on any benchmark but you can’t predict which benchmark
  • Advanced models show promise but we don’t know which is best for which topic
  • Need to be careful to not create circular predictions, covariates overly correlated, if you balance a study on bananas you’re going to get bananas
  • Icarus syndrome – covariates too highly correlated
  • It’s okay to test privately, but clients need to know what the modeling questions are; you don’t want to end up with weighting models using the study variables
  • [why do we think that gold standard benchmarks have zero errors?]

Capitalizing on Passive Data in Online Surveys; Tobias B. Konitzer, Stanford University; David Rothschild, Microsoft Research

  • Most of our data is nonprobability to some extent
  • Can use any variable for modeling, demos, survey frequency, time to complete surveys
  • Define target population from these variables, marginal percent is insufficient, this constrains variables to only those where you know that information 
  • Pollfish is embedded in phones, mobile based, has extra data beyond online samples; maybe it’s a different mode; it’s cheaper and faster than face to face and telephone, more flexible than face to face though perhaps less so than online, with efficient incentives
  • 14 questions, education, race, age, location, news consumption, news knowledge, income, party ID, also passive data for research purposes – geolocation, apps, device info
  • Geo is more specific than an IP address, includes frequency at that location, and you can get FIPS information from it, which leads to race data; with latitude and longitude you can reduce the number of questions on the survey
  • Need to assign demographics based on FIPS data in an appropriate way; modal response wouldn’t work, need to use probabilities, e.g., if 60% of a FIPS is white, then give the person a 60% chance of being white (see the sketch after this list)
  • Use app data to improve group assignments
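
The speakers didn’t show code, but the probabilistic assignment rule is simple enough to sketch. A minimal Python version, with made-up FIPS composition shares (the real ones would come from census data):

```python
import random

# Hypothetical racial composition for two FIPS codes (shares sum to 1).
fips_race_shares = {
    "06075": {"white": 0.41, "asian": 0.34, "black": 0.05, "other": 0.20},
    "13121": {"white": 0.40, "black": 0.44, "asian": 0.07, "other": 0.09},
}

def assign_race(fips_code, rng=random):
    """Draw a race category with probability equal to its share in the area.

    E.g., if an area is 60% white, a respondent located there gets a 60%
    chance of being coded white -- the modal category is NOT forced.
    """
    shares = fips_race_shares[fips_code]
    categories, weights = zip(*shares.items())
    return rng.choices(categories, weights=weights, k=1)[0]

print(assign_race("06075"))
```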

Non-Probability Sampling and Online Panel: They’re all grown up now

Written by
Annie Pettit, Canadian Chair of ISO TC225
Debrah Harding, UK Chair
Elissa Molloy, Australian Chair

In the seven years since the creation of the quality standard ISO 26362, the use of online panels for market, opinion and social research has experienced massive growth and evolution. The standard was extremely useful in helping both clients and vendors explain and understand the technical aspects of what is now a ‘traditional’ online panel. And while online panels are now default sample sources for many researchers, new options that must also be considered have been developed since then.

In the online world, we have seen the introduction of panels that use not ‘traditional’ email invitations but rather options such as pop-up intercepts, or requiring people to visit a specific website and select from available research opportunities, or offering opportunities from pre-roll webpages. We now have to consider whether automated inventory and survey routing is appropriate for our needs. And of course, we now have the option to engage panel and sample brokers who will find sample providers for us.

The great success of online sample led to the decline of offline sample in rich areas of the world. But don’t let that fool you. There still exist large communities of people around the world where limited access to online services or financial resources means that advanced online surveys are simply not feasible. Offline panels are still very necessary and important in many communities and for many types of research.

And, what may seem surprising to some is that, now, in both offline and online environments, we must consider whether the sample or panel has probability or nonprobability characteristics.

In the time that our sector has greatly advanced researchers’ capabilities, people have also advanced in their responses to surveys. For many people, answering surveys is now a normal activity; many participate in one or more panels, in addition to innumerable surveys from ad hoc outreach programs and end-client research studies. Participants are more familiar than ever with techniques for increasing their chances of qualifying for incentives, as well as techniques for completing surveys as quickly as possible, sometimes with less than good intentions and sometimes as a reaction to poor quality research tools and services.

It is clear that we have reached a new stage with samples where both offline and online sample have been accepted as valid and reliable techniques, each with a host of new intricate technical requirements.

On March 11 and 12, representatives from around the world, including Canada, UK, USA, The Netherlands, Australia, Japan, Austria, and more, will gather in London, England. There, we will discuss and debate the advancements our industry has made and how we can incorporate those advancements into the ISO standards. Our goal will be to update the online panel standard to better reflect the current and future state of sampling for market, opinion and social research. Also high on the agenda will be the new draft ISO standard for digital analytics and web analyses, which aims to develop the service requirements for digital research services. These leaders will also bring to light the global differences in research requirements and practice, to help address the wider issue of how the ISO research standards can best serve the research sector well into the future.

Representativeness of surveys using internet-based data collection #ESRA15 #MRX 

Live blogged from #ESRA15 in Reykjavik. Any errors or bad jokes are my own.

Yup, it’s sunny outside. And now I’m back inside for the next session. Fortunately, or unfortunately, this session is once again in a below-ground room with no windows, so I will not be basking in sunlight nor gazing longingly out the window. I guess I’ll be paying full attention to another really great topic.

 

conditional vs unconditional incentives: comparing the effect on sample composition in the recruitment of the german internet panel study GIP

  • unconditional incentives tend to perform better than promised incentives
  • include $5 with the advance letter compared to a promised $10 with the thank-you letter; assuming a 50% response rate, the cost of both groups is the same (worked out after this list)
  • consider nonresponse bias, consider sample demo distribution
  • unconditional incentive had 51% response rate, conditional incentive had 42% response rate
  • didn’t see a nonresponse bias [by demographics I assume, so many speakers are talking about important effects but not specifically saying what those effects are]
  • as a trend, the two sets of data provide very similar research results; yes, there are differences in means, but they are always fairly close together and the confidence intervals always overlap
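
A quick check of the cost claim, using the dollar amounts and rates from the talk (the script and cohort size are my own):

```python
n_invites = 1000  # any cohort size works; everything scales linearly

# Unconditional: $5 mailed to every invitee with the advance letter.
cost_unconditional = 5 * n_invites             # $5,000 regardless of response

# Conditional: $10 promised, paid only to responders; assume 50% respond.
cost_conditional = 10 * (0.50 * n_invites)     # $5,000

print(cost_unconditional == cost_conditional)  # True: same total outlay

# At the rates actually observed (51% vs 42%), cost per completed interview:
print(round(cost_unconditional / (0.51 * n_invites), 2))  # ~9.80 unconditional
# conditional stays at $10.00 per complete by construction
```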

https://twitter.com/ialstoop/status/622001573481312256

evolution of representativeness in an online probability panel

  • LISS panel – probability panel, includes households without internet access, 30 minutes per month, paid for every completed questionnaire
  • is there systematic attrition, are core questionnaires affected by attrition
  • normally sociodemographics only, which is restrictive
  • missing data imputed using MICE (multiple imputation by chained equations; see the sketch after this list)
  • strongest loss in the panel was on sociodemographic properties
  • there are seasonal drops in attrition, for instance in June, which has lots of holidays
  • attrition has more effect on survey attitudes and health traits, less so on political and personality traits, which are quite stable even with attrition
  • try to decrease attrition through refreshment samples based on targets
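
MICE here presumably refers to the R mice package. As a rough Python equivalent of the chained-equations idea, using scikit-learn’s IterativeImputer with invented panel variables:

```python
import numpy as np
import pandas as pd
# IterativeImputer is flagged experimental and must be enabled first.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Toy panel data with missingness from attrition/nonresponse.
df = pd.DataFrame({
    "age":       [34, 57, np.nan, 41, 68],
    "health":    [4.0, np.nan, 2.5, 3.0, np.nan],
    "pol_trust": [np.nan, 6.0, 3.0, 5.0, 7.0],
})

# sample_posterior=True draws from the predictive distribution, which is
# what lets you create multiple imputed datasets, as MICE does (m = 5 here).
imputations = [
    pd.DataFrame(
        IterativeImputer(sample_posterior=True, random_state=s).fit_transform(df),
        columns=df.columns,
    )
    for s in range(5)
]
print(imputations[0])
```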

https://twitter.com/ialstoop/status/622004420314812417

moderators of survey representativeness – a meta-analysis

  • measured single mode vs multimode surveys
  • R-indicators – a single measure from 0 to 1 for sample representativeness, based on logistic regression models for response propensity (see the sketch after this list)
  • hypothesize mixed-mode surveys are more representative than single-mode surveys
  • hypothesize cross-sectional surveys are more representative than longitudinal surveys
  • heterogeneity not really explained by moderators
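
The talk didn’t spell the measure out, but in the literature (Schouten and colleagues) the R-indicator is R = 1 − 2·S(ρ̂), where S(ρ̂) is the standard deviation of response propensities estimated from a logistic regression on frame variables. A minimal sketch on simulated data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Frame-level covariates known for respondents AND nonrespondents.
n = 2000
X = rng.normal(size=(n, 3))  # e.g. age, education, urbanicity (invented)
responded = (rng.random(n) < 0.4 + 0.1 * (X[:, 0] > 0)).astype(int)

# Estimate response propensities, then the R-indicator:
# R = 1 - 2 * std(propensities); 1 means perfectly representative response.
rho = LogisticRegression().fit(X, responded).predict_proba(X)[:, 1]
r_indicator = 1 - 2 * rho.std()
print(round(r_indicator, 3))
```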

setting up a probability based web panel: lessons learned from the ELIPSS pilot study

  • online panel in france, 1000 people, monthly questionnaires, internet access given to each member [we often wonder about the effect of people being on panels since they get used to and learn how to answer surveys, have we forgotten this happens in probability panels too? especially when they are often very small panels]
  • used different contact modes including letters, phone, face to face
  • underrepresented the youngest, elderly, less educated, and offline people
  • reasons for participating, in order – trust in ELIPSS 46%, originality of the project 37%, interested in research 32%, free internet access 13%
  • 16% attrition after 30 months (that’s amazing, really low and really good!), response rate generally above 80%
  • automated process – invites on Thursday, systematic reminders by text message, app message and email
  • individual followups by phone calls and letters [wow. well that’s how they get a high response rate]
  • individual followups are highly effective [i’d call them stalking and invasive but that’s just me. i guess when you accept free 4g internet and a tablet, you are asking for that invasiveness]
  • age becomes less representative over time, employment status changes a lot, education changes the most but of course young people gain more education over time
  • need to give feedback to panel members as they keep asking for it
  • want to broaden use of panel to scientific community by expanding panel to 3500 people

https://twitter.com/nicolasbecuwe/status/622009359082647552

https://twitter.com/ialstoop/status/622011086783557632

the pretest of wave 2 of the german health interview and examination survey for children and adolescents as a mixed mode survey, composition of participant groups

  • mixed mode helps to maintain high response, web is preferred by younger people, representativeness could be increased by using multiple modes
  • compared sequential and simultaneous surveys
  • single mode has highest response rate, mixed mode simultaneous was extremely close behind, mixed mode multi-step had the lowest rate
  • paper always gave back the highest proportion of data even when people had the choice of both; 11% to 43% chose paper among the 3 groups
  • sample composition was the same among all four groups, all confidence intervals overlap – age, gender, nationality, immigration, education
  • meta-analysis – overall trend is the same
  • 4% lower response rate in mixed mode – additional mode creates cognitive burden, creates a break in response process, higher breakoffs
  • mixed mode doesn’t increase sample composition nor response rates [that is, giving people multiple options as opposed to just one option, as opposed to multiple groups whereby each groups only knows about one mode of participation.]
  • current study is now a single mode study

https://twitter.com/oparnet/status/622015032231075840


 

Sample composition in online studies #ESRA15 #MRX 

Live blogged at #ESRA15 in Reykjavik. Any errors or bad jokes are my own.

I’ve been pulling out every ounce of bravery I have here in Iceland and I went to the pool again last night (see previous posts on public nakedness!). I could have also broken my rule about not traveling after dark in strange cities, but since it never gets dark here, I didn’t have to worry about that! The pool was much busier this time. I guess kiddies are more likely to be out and about after dinner on a weekday rather than Sunday morning at 9am. All it meant is that I had a lot more people watching to do. All in all good fun to see little babies and toddlers enjoying a good splash and float!

This morning, the sun was very much up and the clouds very much gone. I’ll be dreaming of breaktime all morning! Until then, however, I’ve got five sessions on sample composition in online surveys and representativeness of online studies to pay attention to. It’s going to be tough, but a morning chock full of learning will get me a reward of more pool time!

what is the gain in a probability based online panel of providing internet access to sampling units that did not have access before

  • germany has GIP, france has ELIPSS, netherlands has LISS as probability panels
  • weighting might not be enough to account for the bias from people who do not have internet access
  • but representativeness is still a problem because people may not want to participate even if they are given access; recruitment rates are much lower among non-internet households
  • probability panels still have problems: you won’t answer every survey you are sent, attrition
  • do we lose much without a representative panel? is it worth the extra cost
  • in Elipss panel, everyone is provided a tablet, not just people without access. the 3G tablet is the incentive you get to keep as long as you are on the panel. so everyone uses the same device to participate in the research
  • what does it mean to not have Internet access – used to be computer + modem. Now there are internet cafes, free wifi is everywhere. hard to define someone as no internet access now. We mean access to complete a survey so tiny smartphones don’t count.
  • 14.5% of adults in france were classified as not having internet. turned out to be 76 people in the end which is a bit small for analytics purposes. But 31 of them still connected every day.
  • non-internet access people always participated less than people who did have internet.
  • people without internet always differ on demographics [proof is chi-square, can’t see data]
  • populations are closer on nationality, being in a relationship, and education – including non-internet helps with these variables, improves representativity
  • access does not equal usage does not equal using it to answer surveys
  • maybe consider a probability based panel without providing access to people who don’t have computer/tablet/home access

parallel phone and web-based interviews: comparability and validity

  • phones are relied on for research and assumed to be good enough for representativeness; however, most people don’t answer phone calls when they don’t recognize the number, and you can’t use an autodialler in the USA for research
  • online surveys can generate better quality due to programming validation and restricting respondents to allowable answers
  • phone and online have differences in presentation mode, presence of human interviewer, can read and reread responses if you wish, social desirability and self-presentation issues – why should online and offline be the same
  • caution about combining data from different modes should be exercised [actually, i would want to combine everything i possibly can. more people contributing in more modes seems to be more representative than excluding people because they aren’t identical]
  • how different is online nonprobability from telephone probability  [and for me, a true probability panel cannot technically exist. its theoretically possible but practically impossible]
  • harris did many years of these studies side by side using very specific methodologies
  • measured a variety of topics – opinions of nurses, big business trust, happiness with health, ratings of the president
  • across all questions, the average correlation between methods was .92 for unweighted means and .893 for weighted means – more bias with the weighted version
  • is it better for scales with many response categories – correlations go up to .95
  • online means of attitudinal items were on average 0.05 lower on scale from 0 to 1. online was systematically biased lower
  • correlations in many areas were consistently extremely high; means were consistently very slightly lower for online data; also nearly identical rank order of items
  • for political polling, the two methods were again massively similar, highly comparable results; mean values were generally very slightly lower – thought to be ability to see the scale online as well as social desirability in telephone method, positivity bias especially for items that are good/bad as opposed to importance 
  • [wow, given this is a study over ten years of results, it really calls into question whether probability samples are worth the time and effort]
  • [audience member said most differences were due to the presence of the interviewer and nothing to do with the mode; the online version was found to be truer]

representative web survey

  • only a sample without bias can generalize, the correct answer should be just as often a little bit higher or a little bit lower than reality
  • in their sample, they underrepresented 18-34, elementary school education, and the lowest and highest income people
  • [yes, there are demographic differences in panels compared to census and that is dependent completely on your recruitment method. the issue is how you deal with those differences]
  • online panel showed a socially positive picture of population
  • can you correct bias through targeted sampling and weighting, ethnicity and employment are still biased but income is better [that’s why invites based on returns not outgo are better]
  • need to select on more than gender, age, and region
  • [i love how some speakers still have non-english sections in their presentation – parts they forgot to translate or that weren’t translatable. now THIS is learning from peers around the world!]

measuring subjective wellbeing: does the use of websurveys bias the results? evidence from the 2013 GEM data from luxembourg

  • almost everyone is completely reachable by internet
  • web surveys are cool – convenient for respondents, less social desirability bias, can use multimedia, less expensive, less coding errors; but there are sampling issues and bias from the mode
  • measures of subjective well being – i am satisfied with my life, i have obtained all the important things i want in my life, the condition of my life are excellent, my life is close to my ideal [all positive keyed]
  • online survey gave very slightly lower satisfaction
  • the result is robust to three econometric techniques
  • results from happiness equations using differing modes are compatible
  • web surveys are reliable for collecting information on wellbeing

Comparing probability and nonprobability samples #AAPOR #MRX 

prezzie #1: how different are probability and nonprobability designs

  • nonprobability samples often get the correct results and probability samples are sometimes wrong. maybe they are more similar than we realize
  • nonprobability sampling may have a sample frame but it’s not the same as a census population
  • how do you choose, which factors are important
  • what method does the job that you require, that fits your purpose
  • is the design relevant, does it meet the goals with the resources, does the method give you results in the time you need, accessibility, can you find the people you need, interpretability and reliability, accuracy of estimates with acceptable mean square error, coherence in terms of results matching up with other data points from third parties [of course, who’s to say what the right answer is, everyone could be wrong as we’ve seen in recent elections]
  • nonprobability can be much faster, probability can be more relevant
  • nonprobability can get you right to the people you want to listen to
  • both methods suffer from various types of error, some more than others, must consider total survey error [i certainly hope you’ve been considering TSE since day 1]
  • driver will decide the type of study you end up doing
  • how can nonprob methods help prob methods, because they do offer much good stuff
  • [interesting talk, nice differentiation between prob and nonprob even though I did cringe at a few definitions, eg I dont see that quality is the differentiator between prob and nonprob]

prezzie #2: comparison of surveys based on prob and nonprob

  • limbo – how low can you go with a nonprob sample
  • bandwagon – well everyone else is doing nonprob sample [feelings getting hurt here]
  • statistical adjustment of nonprob samples helps but it is only a partial solution
  • nonprob panel may have an undefined response rate
  • need to look at point estimates and associations in both the samples, does sampling only matter when you need population point estimates
  • psychology research is often done all with college students [been there, done that!]
  • be sure to weight and stratify the data
  • education had a large effect between prob and nonprob sample [as it usually does along with income]
  • point estimates were quite different in cases, but the associations were much closer so if you don’t need a precise point estimate a nonprob sample could do the trick

prezzie #4: sample frame and mode effects

  • used very similar omnibus surveys, included questions where they expected to find differences
  • compared point estimates of the methods as well as to benchmarks of larger census surveys
  • for health estimates, yes, there were differences, but where the benchmark was high so were the point estimates, and similarly for low or moderate point estimates; total raw differences maxed out around ten points
  • there was no clear winner for any of the question types, though all highs were highs and lows were lows
  • no one design is consistently superior

Combining a probability based telephone sample with an opt-in web panel by Randal ZuWallack and James Dayton #CASRO #MRX

Live blogging from Nashville. Any errors or bad jokes are my own.

– National Alcohol Survey in the US, for 18 years plus [because children don’t drink alcohol]
– even people who do not drink end up taking a 34 minute survey compared to 48 minutes for someone who does drink. this is far too long
– only at 18 minutes are people determined to be drinkers or abstainers. [wow, worst screen-out position EVER]
– why data fusion? not everyone is online [please, not everyone is on a panel either. and what about refusals? this fascination with probability panels is often silly]
– RDD measures population percents
– web measures depth of information conditional on who is who
– they matched an online and RDD sample using overlapping variables
– problem is matching can create strange ‘people’ that don’t reflect real people. however, in aggregate, the distributions work out; we shouldn’t think about it being right on an individual level (a matching sketch follows this list)
– “The awesome thing about having a 45 minute survey”…is the statistical analyses you can do with it [made me laugh. there IS an awesome thing? 🙂 ]
– [SAS user 🙂 Have I told you lately….. that I love SAS]
– There were small differences in frequencies between the RDD and web surveys for both wine and beer. averages are very close but significantly different [enter conversation – when does significantly different mean meaningfully different]
– heavy drinking is much much greater on web surveys
– is there social desirability, recall bias 🙂
– not everything lines up perfectly RDD vs web, general trends are the same but point estimates are different
– so how do you know which set of data is true or better?
– regardless, web does not reproduce RDD estimates
– problem now is which data is correct, need multiple samples from the same panel to test
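
The fusion algorithm wasn’t described in detail; as a minimal sketch of the general idea, here each RDD record borrows the web-only depth measures from its nearest web ‘donor’ on the overlapping variables (all variables and data invented):

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Overlapping variables measured in BOTH surveys.
common = ["age_band", "drinks_per_week", "income_band"]
rdd = pd.DataFrame(rng.integers(1, 8, size=(200, 3)), columns=common)
web = pd.DataFrame(rng.integers(1, 8, size=(500, 3)), columns=common)
web["detailed_alcohol_score"] = rng.normal(50, 10, size=500)  # web-only depth

# Scale, then find each RDD record's nearest web donor on the common variables.
scaler = StandardScaler().fit(web[common])
nn = NearestNeighbors(n_neighbors=1).fit(scaler.transform(web[common]))
_, donor_idx = nn.kneighbors(scaler.transform(rdd[common]))

# Donate the web-only measure; aggregate distributions should hold up even
# though any single fused "person" may be odd, as the talk noted.
rdd["detailed_alcohol_score"] = web["detailed_alcohol_score"].to_numpy()[donor_idx.ravel()]
print(rdd.head())
```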

Probability and Non-Probability Samples in Internet Surveys #AAPOR #MRX

AAPOR… Live blogging from beautiful Boston, any errors are my own…

Probability and Non-Probability Samples in Internet Surveys
Moderator: Brad Larson

Understanding Bias in Probability and Non-Probability Samples of a Rare Population John Boyle, ICF International

  • If everything was equal, we would choose a probability sample. But everything is not always equal. Cost and speed are completely different. This can be critical to the objective of the survey.
  • Did an influenza vaccination study with pregnant women. Would require 1,200 women if you wanted to look at minority samples. Not happening. Influenza data isn’t available at a moment’s notice and women aren’t pregnant at your convenience. A non-probability sample is pretty much the only alternative.
  • Most telephone surveys are landline only for cost reasons. RDD has coverage issues. It’s a probability sample but it still has issues.
  • Unweighted survey looked quite similar to census data. Looked good when crossed by age as well. Landline are more likely to be older and cell phone only are more likely to be younger. Landline more likely to be married, own a home, be employed, higher income, have insurance from employer.
  • Landline vs cell only – no difference on tetanus shot, having a fever. Big differences by flu vaccination though.
  • There are no gold standards for this measure, and there are mode effects
  • Want probability samples but can’t always achieve them

A Comparison of Results from Dual Frame RDD Telephone Surveys and Google Consumer Surveys

  • Pew and Google partnered on this study; a 2-question survey
  • Consider fit for purpose – can you use it for trends over time, quick reactions, pretesting questions, open-end testing, question format tests
  • Not always interested in point estimates but better understanding
  • RDD vs Google surveys – average difference of 6.5 percentage points; the distribution was closer to zero but there were a number that were quite different
  • Demographics were quite similar, google samples were a bit more male, google had fewer younger people, google was much better educated
  • Correlations of age and “i always vote” was very high, good correlation of age and “prefer smaller government”
  • Political partisanship was very similar, similar for a number of generic opinions – earth is warming, same sex marriage, always vote, school teaching subjects
  • Difficult to predict when point estimates will line up to telephone surveys

A Comparison of a Mailed-in Probability Sample Survey and a Non-Probability Internet Panel Survey for Assessing Self-Reported Influenza Vaccination Levels Among Pregnant Women

  • Panel survey via email invite, weighted data by census, region, age groups
  • Mail survey used a sampling frame of birth certificates, weighted on nonresponse and non-coverage
  • Tested demographics and flu behaviours of the two methods
  • age distributions were similar [they don’t present margin of error on panel data]
  • panel survey had more older people, more education
  • Estimates differed on flu vaccine rates, some very small, some larger
  • Two methods are generally comparable, no stat testing due to non-prob sample
  • Trends of the two methods were similar
  • Panel survey is good for timely results

Probability vs. Non-Probability Samples: A Comparison of Five Surveys

  • [what is a probability panel? i have a really hard time believing this]
  • Novus and TNS Sifo considered probability
  • YouGov and Cint considered non-probability
  • Response rates range from 24% to 59%
  • SOM institute (mail), Detector (phone), LORe (web) – random population sample, rates from 8% to 53%
  • Data from Sweden
  • On average, the three methods differ from census results by 4% to 7%; web was worst; demos were similar except education, where the higher educated were over-represented, and driving licence holders were also over-represented
  • Non-prob samples were more accurate on demographics compared to prob samples; when they are weighted they are all the same on demographics, but education is still a problem
  • The five data sources were very similar on a number of different measures, whether prob or non-prob
  • demographic accuracy of non-prob panels was better, also closer on political attitudes. No evidence that self-recruited panels are worse.
  • Need to test more indicators, retest

Modeling a Probability Sample? An Evaluation of Sample Matching for an Internet Measurement Panel

  • “construct” a panel that best matches the characteristics of a probability sample
  • Select – Match – Measure (see the sketch after this list)
  • Matched on age, gender, education, race, time online, also looked at income, employment, ethnicity
  • Got good correlations and estimates from prob and non-prob.
  • Sample matching works quite well [BOX PLOTS!!! i love box plots, so good in so many ways!]
  • Non-prob panel has more heavy internet users
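
As a rough sketch of the select–match–measure idea, with the matching covariates named in the talk but invented data and a greedy one-to-one scheme of my own:

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import cdist

rng = np.random.default_rng(2)
covs = ["age", "gender", "education", "race", "time_online"]

# Probability reference sample (the target) and a larger volunteer panel pool,
# both coded as small ordinal categories for simplicity.
target = pd.DataFrame(rng.integers(0, 5, size=(100, 5)), columns=covs)
pool = pd.DataFrame(rng.integers(0, 5, size=(1000, 5)), columns=covs)

# Greedy one-to-one matching: each target record takes its nearest
# still-unused panelist, so the constructed panel mirrors the target's
# joint covariate distribution, not just its margins.
dist = cdist(target[covs], pool[covs])
used, matched = set(), []
for i in np.argsort(dist.min(axis=1)):  # match the easiest cases first
    order = np.argsort(dist[i])
    j = next(int(j) for j in order if j not in used)
    used.add(j)
    matched.append(j)

matched_panel = pool.iloc[matched]
print(matched_panel.shape)  # 100 matched panelists
```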