PLoS One. 2016 Jan 05;11(1):e0145406. doi: 10.1371/journal.pone.0145406. eCollection 2016.

Online and Social Media Data As an Imperfect Continuous Panel Survey.

PLoS One

Fernando Diaz, Michael Gamon, Jake M Hofman, Emre Kıcıman, David Rothschild

Affiliations

  1. Microsoft Research, New York, NY, United States of America.
  2. Microsoft Research, Redmond, WA, United States of America.

PMID: 26730933 PMCID: PMC4711590 DOI: 10.1371/journal.pone.0145406

Abstract

There is a large body of research on utilizing online activity as a survey of political opinion to predict real-world election outcomes. There is considerably less work, however, on using this data to understand topic-specific interest and opinion amongst the general population and specific demographic subgroups, as currently measured by relatively expensive surveys. Here we investigate this possibility by studying a full census of all Twitter activity during the 2012 election cycle along with the comprehensive search history of a large panel of Internet users during the same period, highlighting the challenges in interpreting online and social media activity as the results of a survey. As noted in existing work, the online population is a non-representative sample of the offline world (e.g., the U.S. voting population). We extend this work to show how demographic skew and user participation are non-stationary and difficult to predict over time. In addition, the nature of user contributions varies substantially around important events. Furthermore, we note subtle problems in mapping what people are sharing or consuming online to specific sentiment or opinion measures around a particular topic. We provide a framework, built around considering this data as an imperfect continuous panel survey, for addressing these issues so that meaningful insight about public interest and opinion can be reliably extracted from online and social media data.

