
Database (Oxford). 2021 Mar 30;2021. doi: 10.1093/database/baab008.

Bioinformatics tools developed to support BioCompute Objects.

Database : the journal of biological databases and curation

Janisha A Patel, Dennis A Dean, Charles Hadley King, Nan Xiao, Soner Koc, Ekaterina Minina, Anton Golikov, Phillip Brooks, Robel Kahsay, Rahi Navelkar, Manisha Ray, Dave Roberson, Chris Armstrong, Raja Mazumder, Jonathon Keeney

Affiliations

  1. The Department of Biochemistry & Molecular Medicine, The George Washington University School of Medicine and Health Sciences, Washington, DC 20037, USA.
  2. Seven Bridges, Charlestown, MA 02129, USA.
  3. The McCormick Genomic and Proteomic Center, The George Washington University, Washington, DC 20037, USA.
  4. CBER-HIVE, Center for Biologics Evaluation and Research, US Food and Drug Administration, Silver Spring, MD 20993, USA.

PMID: 33784373 PMCID: PMC8009203 DOI: 10.1093/database/baab008

Abstract

Developments in high-throughput sequencing (HTS) result in an exponential increase in the amount of data generated by sequencing experiments, an increase in the complexity of bioinformatics analysis reporting and an increase in the types of data generated. These increases in volume, diversity and complexity of the data generated and their analysis expose the necessity of a structured and standardized reporting template. BioCompute Objects (BCOs) provide the requisite support for communication of HTS data analysis that includes support for workflow, as well as data, curation, accessibility and reproducibility of communication. BCOs standardize how researchers report provenance and the established verification and validation protocols used in workflows while also being robust enough to convey content integration or curation in knowledge bases. BCOs that encapsulate tools, platforms, datasets and workflows are FAIR (findable, accessible, interoperable and reusable) compliant. Providing operational workflow and data information facilitates interoperability between platforms and incorporation of future datasets within an HTS analysis for use within industrial, academic and regulatory settings. Cloud-based platforms, including High-performance Integrated Virtual Environment (HIVE), Cancer Genomics Cloud (CGC) and Galaxy, support BCO generation for users. Given the 100K+ userbase between these platforms, BioCompute can be leveraged for workflow documentation. In this paper, we report the availability of platform-dependent and platform-independent BCO tools: HIVE BCO App, CGC BCO App, Galaxy BCO API Extension and BCO Portal. Community engagement was utilized to evaluate tool efficacy. We demonstrate that these tools further advance BCO creation from text editing approaches used in earlier releases of the standard. Moreover, we demonstrate that integrating BCO generation within existing analysis platforms greatly streamlines BCO creation while capturing granular workflow details. We also demonstrate that the BCO tools described in the paper provide an approach to solve the long-standing challenge of standardizing workflow descriptions that are both human and machine readable while accommodating manual and automated curation with evidence tagging. Database URL: https://www.biocomputeobject.org/resources.

© Oxford University Press 2021.
