PLoS One. 2017 Sep 18;12(9):e0184188. doi: 10.1371/journal.pone.0184188. eCollection 2017.

Multi-level computational methods for interdisciplinary research in the HathiTrust Digital Library.

Jaimie Murdock, Colin Allen, Katy Börner, Robert Light, Simon McAlister, Andrew Ravenscroft, Robert Rose, Doori Rose, Jun Otsuka, David Bourget, John Lawrence, Chris Reed

Affiliations

  1. Program in Cognitive Science, Indiana University, Bloomington, IN, United States of America.
  2. School of Informatics and Computing, Indiana University, Bloomington, IN, United States of America.
  3. Department of History & Philosophy of Science & Medicine, Indiana University, Bloomington, IN, United States of America.
  4. Department of History & Philosophy of Science, University of Pittsburgh, Pittsburgh, PA, United States of America.
  5. Indiana University Network Science Institute (IUNI), Bloomington, IN, United States of America.
  6. User-Centered Social Media, Department of Computer Science and Applied Cognitive Science, University of Duisburg-Essen, Duisburg, Germany.
  7. International Centre for Public Pedagogy (ICPuP), Cass School of Education & Communities, University of East London, London, United Kingdom.
  8. Department of Mathematics, Indiana University, Bloomington, IN, United States of America.
  9. Department of Philosophy, Kyoto University, Kyoto, Japan.
  10. Department of Philosophy, University of Western Ontario, London, Ontario, Canada.
  11. Centre for Argument Technology, University of Dundee, Dundee, United Kingdom.

PMID: 28922416 PMCID: PMC5602542 DOI: 10.1371/journal.pone.0184188

Abstract

We show how faceted search using a combination of traditional classification systems and mixed-membership topic models can go beyond keyword search to inform resource discovery, hypothesis formulation, and argument extraction for interdisciplinary research. Our test domain is the history and philosophy of scientific work on animal mind and cognition. The methods can be generalized to other research areas and ultimately support a system for semi-automatic identification of argument structures. We provide a case study for the application of the methods to the problem of identifying and extracting arguments about anthropomorphism during a critical period in the development of comparative psychology. We show how a combination of classification systems and mixed-membership models trained over large digital libraries can inform resource discovery in this domain. Through a novel approach of "drill-down" topic modeling (simultaneously reducing both the size of the corpus and the unit of analysis), we are able to reduce a large collection of full-text volumes to a much smaller set of pages within six focal volumes containing arguments of interest to historians and philosophers of comparative psychology. The volumes identified in this way did not appear among the first ten results of the keyword search in the HathiTrust Digital Library, and the pages bear the kind of "close reading" needed to generate original interpretations that is at the heart of scholarly work in the humanities. Zooming back out, we provide a way to place the books onto a map of science originally constructed from very different data and for different purposes. The multilevel approach advances understanding of the intellectual and societal contexts in which writings are interpreted.
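
The "drill-down" idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes latent Dirichlet allocation (fitted here by a small collapsed Gibbs sampler) as the mixed-membership model, and the function names (fit_lda, top_units, drill_down), the toy corpus, and all parameter values are hypothetical. The sketch first models whole volumes and keeps those weighted most heavily on a query-related topic, then refits the model with pages as the unit of analysis and keeps the top pages.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA (assumed model, not the paper's code).

    Returns per-document topic mixtures and per-topic word counts.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment, then resample it.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    return theta, topic_word


def top_units(docs, query_words, num_topics, keep, seed=0):
    """Indices of the `keep` units weighted most on the query-related topic."""
    theta, topic_word = fit_lda(docs, num_topics, seed=seed)
    # Pick the topic to which the query words were assigned most often.
    topic = max(range(num_topics),
                key=lambda k: sum(topic_word[k][w] for w in query_words))
    ranked = sorted(range(len(docs)), key=lambda d: theta[d][topic], reverse=True)
    return ranked[:keep]


def drill_down(volumes, query_words, keep_volumes, keep_pages, num_topics=2):
    """Shrink the corpus (volumes) and the unit of analysis (pages) in turn."""
    # Level 1: each volume is one document.
    volume_docs = [[w for page in vol for w in page] for vol in volumes]
    kept = top_units(volume_docs, query_words, num_topics, keep_volumes)
    # Level 2: refit on the pages of the retained volumes only.
    pages = [(v, p) for v in kept for p in range(len(volumes[v]))]
    page_docs = [volumes[v][p] for v, p in pages]
    best = top_units(page_docs, query_words, num_topics, keep_pages)
    return [pages[i] for i in best]


# Hypothetical toy corpus: a volume is a list of pages, a page a list of tokens.
volumes = [
    [["animal", "mind", "instinct", "reason"],
     ["dog", "reason", "anthropomorphism", "mind"]],
    [["steam", "engine", "boiler", "valve"],
     ["engine", "piston", "valve", "steam"]],
    [["ape", "mind", "anthropomorphism", "habit"],
     ["nerve", "reflex", "muscle", "habit"]],
]
selected = drill_down(volumes, {"anthropomorphism", "mind"},
                      keep_volumes=2, keep_pages=3)
print(selected)  # list of (volume_index, page_index) pairs
```

Refitting the model at each level, rather than reusing the volume-level topics for pages, is a design choice of this sketch: it lets the page-level topics specialize to the smaller, more focused corpus.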
