J Biomed Semantics. 2014 Sep 18;5(1):41. doi: 10.1186/2041-1480-5-41. eCollection 2014.

Structuring research methods and data with the research object model: genomics workflows as a case study.

Journal of biomedical semantics

Kristina M Hettne, Harish Dharuri, Jun Zhao, Katherine Wolstencroft, Khalid Belhajjame, Stian Soiland-Reyes, Eleni Mina, Mark Thompson, Don Cruickshank, Lourdes Verdes-Montenegro, Julian Garrido, David de Roure, Oscar Corcho, Graham Klyne, Reinout van Schouwen, Peter A C 't Hoen, Sean Bechhofer, Carole Goble, Marco Roos

Affiliations

  1. Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands.
  2. Department of Zoology, University of Oxford, Oxford, UK.
  3. School of Computer Science, University of Manchester, Manchester, UK; Leiden Institute of Advanced Computer Science, Leiden University, Leiden, The Netherlands.
  4. School of Computer Science, University of Manchester, Manchester, UK.
  5. Instituto de Astrofísica de Andalucía, Granada, Spain.
  6. Ontology Engineering Group, Universidad Politécnica de Madrid, Madrid, Spain.

PMID: 25276335 PMCID: PMC4177597 DOI: 10.1186/2041-1480-5-41

Abstract

BACKGROUND: One of the main challenges for biomedical research lies in the computer-assisted integrative study of large and increasingly complex combinations of data in order to understand molecular mechanisms. The preservation of the materials and methods of such computational experiments with clear annotations is essential for understanding an experiment, and this is increasingly recognized in the bioinformatics community. Our assumption is that offering means of digital, structured aggregation and annotation of the objects of an experiment will provide the metadata a scientist needs to understand and recreate the results of an experiment. To support this we explored a model for the semantic description of a workflow-centric Research Object (RO), where an RO is defined as a resource that aggregates other resources, e.g., datasets, software, spreadsheets, and text. We applied this model to a case study in which we used workflows to analyse human metabolite variation.

RESULTS: We present the application of the workflow-centric RO model for our bioinformatics case study. Three workflows were produced following recently defined Best Practices for workflow design. By modelling the experiment as an RO, we were able to automatically query the experiment and answer questions such as "which particular data was input to a particular workflow to test a particular hypothesis?", and "which particular conclusions were drawn from a particular workflow?".
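The kind of query described above can be illustrated with a minimal sketch. It is not the Wf4Ever RO model or its vocabulary: the `ResearchObject` class, the file names, and the predicates `tests`, `usedInput` and `derivedFrom` are all hypothetical, chosen only to show how aggregating resources and annotating them with simple subject-predicate-object statements makes such questions answerable by machine.

```python
from dataclasses import dataclass, field


@dataclass
class ResearchObject:
    """Toy stand-in for an RO: aggregated resources plus annotations."""
    resources: set = field(default_factory=set)
    # Annotations are (subject, predicate, object) triples (hypothetical predicates).
    annotations: set = field(default_factory=set)

    def aggregate(self, resource):
        self.resources.add(resource)

    def annotate(self, subject, predicate, obj):
        # Only resources aggregated by this RO may be annotated.
        if subject not in self.resources:
            raise ValueError(f"{subject!r} is not aggregated by this RO")
        self.annotations.add((subject, predicate, obj))

    def objects(self, subject, predicate):
        return sorted(o for s, p, o in self.annotations
                      if s == subject and p == predicate)

    def subjects(self, predicate, obj):
        return sorted(s for s, p, o in self.annotations
                      if p == predicate and o == obj)


def inputs_for_hypothesis(ro, hypothesis):
    """Which data was input to which workflow to test this hypothesis?"""
    return {wf: ro.objects(wf, "usedInput")
            for wf in ro.subjects("tests", hypothesis)}


def conclusions_from(ro, workflow):
    """Which conclusions were drawn from this workflow?"""
    return ro.subjects("derivedFrom", workflow)


# Hypothetical resources, for illustration only.
ro = ResearchObject()
for name in ["hypothesis.txt", "workflow_1.t2flow",
             "input_snps.csv", "conclusions.txt"]:
    ro.aggregate(name)
ro.annotate("workflow_1.t2flow", "tests", "hypothesis.txt")
ro.annotate("workflow_1.t2flow", "usedInput", "input_snps.csv")
ro.annotate("conclusions.txt", "derivedFrom", "workflow_1.t2flow")

print(inputs_for_hypothesis(ro, "hypothesis.txt"))
print(conclusions_from(ro, "workflow_1.t2flow"))
```

In a real RO the annotations would presumably be expressed in a semantic web format and queried with a dedicated query language; the in-memory triples here only mirror that idea.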

CONCLUSIONS: Applying a workflow-centric RO model to aggregate and annotate the resources used in a bioinformatics experiment allowed us to retrieve the conclusions of the experiment in the context of the driving hypothesis, the executed workflows and their input data. The RO model is an extendable reference model that can be used by other systems as well.

AVAILABILITY: The Research Object is available at http://www.myexperiment.org/packs/428. The Wf4Ever Research Object Model is available at http://wf4ever.github.io/ro.

Keywords: Digital libraries; Genome wide association study; Scientific workflows; Semantic web models

Publication Types