The Data Quality Circle of Life

So what’s next?

We went through an era of worrying about the data quality of telephone surveys. Make sure the interviewers don’t lead responses with unnecessary “uh huh”s. Make sure the interviewers probe every time and often to avoid lazy responses. Make sure to spread calls out over every day of the week and every time of day, including supper time, nap time, and when you’re in the shower, to ensure maximum generalizability.

We’re now in an era of worrying about online data quality. Straightlining, incentive grabbers, random responding, and heavy responders are the topical issues. We’ve designed a bazillion methods to catch deliberately poor responders, a bazillion other methods to be lenient with occasional and accidental poor responders, and we’re still working on identifying the many people who slip through the cracks.
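For anyone who hasn’t met straightlining, here is a rough sketch of the simplest possible check: flag a respondent who gives the identical answer to every item in a rating grid. The function name, the `min_items` cutoff, and the sample data are all made up for illustration; real checks are usually more forgiving than this.

```python
def is_straightliner(answers, min_items=5):
    """Return True when every answer in a rating grid is identical.

    min_items is an assumed cutoff: very short grids are not flagged,
    because identical answers on two or three items prove little.
    """
    if len(answers) < min_items:
        return False
    return len(set(answers)) == 1


# Hypothetical respondents and their answers to one grid question.
responses = {
    "r1": [3, 3, 3, 3, 3, 3],  # same answer all the way down
    "r2": [1, 4, 2, 5, 3, 4],  # varied answers
    "r3": [5, 5],              # identical, but too short to judge
}

flagged = [rid for rid, answers in responses.items() if is_straightliner(answers)]
print(flagged)
```

A check this blunt catches the deliberate cases; being lenient with the accidental ones would mean looking across several grids before removing anyone.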

What is next? Here’s my guess. With the emerging prospect of Twesearch, data quality measures that ensure data gathered from the internet is fair and honest and true will be of utmost importance. We’ll need methods to ignore posts that automatically appear a bazillion times (see the bazillion trend), methods to tease out posts by spammers, and methods to identify ‘marketing’ posts.
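As a guess at what the first of those methods might look like, here is a minimal sketch that keeps only the first copy of a post that shows up over and over. It assumes posts count as “the same” once case, punctuation, and links are stripped away; the helper names and sample posts are invented for illustration, and spotting spammers or marketing posts would take a good deal more than this.

```python
import re


def normalize(post):
    """Reduce a post to lowercase words so near-identical copies match."""
    text = post.lower()
    text = re.sub(r"https?://\S+", "", text)   # drop links, which often differ per copy
    text = re.sub(r"[^a-z0-9\s]", "", text)    # drop punctuation
    return " ".join(text.split())              # collapse whitespace


def dedupe_posts(posts):
    """Keep the first occurrence of each normalized post, in original order."""
    seen = set()
    kept = []
    for post in posts:
        key = normalize(post)
        if key not in seen:
            seen.add(key)
            kept.append(post)
    return kept


# Hypothetical posts: the first two are the same message with cosmetic changes.
posts = [
    "Win a FREE phone!!! http://example.com/a",
    "win a free phone http://example.com/b",
    "I tried the new cereal and liked it.",
]

unique_posts = dedupe_posts(posts)
print(unique_posts)
```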

Data quality is a never-ending issue. You think you’ve got it solved, or at least reasonably handled, and then everything goes out the window when the next method comes along. Such is life~
