Sci Data. 2021 May 04;8(1):124. doi: 10.1038/s41597-021-00905-y.

A resource to explore the discovery of rare diseases and their causative genes.

Scientific data

Friederike Ehrhart, Egon L Willighagen, Martina Kutmon, Max van Hoften, Leopold M G Curfs, Chris T Evelo

Affiliations

  1. Department of Bioinformatics - BiGCaT, NUTRIM School of Nutrition and Translational Research in Metabolism, Maastricht University, Maastricht, The Netherlands.
  2. Governor Kremers Centre - Rett Expertise Centre, Maastricht University Medical Center, Maastricht, The Netherlands.
  3. Department of Bioinformatics - BiGCaT, NUTRIM School of Nutrition and Translational Research in Metabolism, Maastricht University, Maastricht, The Netherlands.
  4. Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, The Netherlands.
  5. Governor Kremers Centre - Rett Expertise Centre, Maastricht University Medical Center, Maastricht, The Netherlands.

PMID: 33947870 PMCID: PMC8096966 DOI: 10.1038/s41597-021-00905-y

Abstract

Here, we describe a dataset of rare monogenic diseases with a known genetic background, supplemented with manually extracted provenance for both the disease itself and the discovery of its underlying genetic cause. We assembled a collection of 4166 rare monogenic diseases and linked them to 3163 causative genes, annotated with OMIM and Ensembl identifiers and HGNC symbols. The PubMed identifiers of the publications that first described each rare disease, and of the publications that identified the disease-causing genes, were added using information from OMIM, PubMed, Wikipedia, whonamedit.com, and Google Scholar. The data are available under a CC0 license as a spreadsheet and as RDF in a semantic model modified from DisGeNET, and were added to Wikidata. This dataset relies on publicly available data and publications with a PubMed identifier, but our effort to make the data interoperable and linked now makes it possible to analyse them. Our analysis revealed the timeline of rare disease and causative gene discovery and linked these discoveries to developments in methods.
