Database (Oxford). 2012 Nov 17;2012:bas043. doi: 10.1093/database/bas043. Print 2012.

Biocuration workflows and text mining: overview of the BioCreative 2012 Workshop Track II.

Database : the journal of biological databases and curation

Zhiyong Lu, Lynette Hirschman

Affiliations

  1. National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA.

PMID: 23160416 PMCID: PMC3500522 DOI: 10.1093/database/bas043

Abstract

Manual curation of data from the biomedical literature is a rate-limiting factor for many expert-curated databases. Despite the continuing advances in biomedical text mining and the pressing needs of biocurators for better tools, few existing text-mining tools have been successfully integrated into production literature curation systems such as those used by the expert-curated databases. To close this gap and better understand all aspects of literature curation, we invited submissions of written descriptions of curation workflows from expert-curated databases for the BioCreative 2012 Workshop Track II. We received seven qualified contributions, primarily from model organism databases. Based on these descriptions, we identified commonalities and differences across the workflows, the common ontologies and controlled vocabularies used, and the current and desired uses of text mining for biocuration. Compared with a survey done in 2009, our 2012 results show that many more databases are now using text mining in parts of their curation workflows. In addition, the workshop participants identified text-mining aids for finding gene names and symbols (gene indexing), prioritization of documents for curation (document triage) and ontology concept assignment as those most desired by the biocurators. DATABASE URL: http://www.biocreative.org/tasks/bc-workshop-2012/workflow/.
