Nat Biotechnol. 2021 Dec;39(12):1563-1573. doi: 10.1038/s41587-021-00968-7. Epub 2021 Jul 08.

MaxDIA enables library-based and library-free data-independent acquisition proteomics.

Nature biotechnology

Pavel Sinitcyn, Hamid Hamzeiy, Favio Salinas Soto, Daniel Itzhak, Frank McCarthy, Christoph Wichmann, Martin Steger, Uli Ohmayer, Ute Distler, Stephanie Kaspar-Schoenefeld, Nikita Prianichnikov, Şule Yılmaz, Jan Daniel Rudolph, Stefan Tenzer, Yasset Perez-Riverol, Nagarjuna Nagaraj, Sean J Humphrey, Jürgen Cox

Affiliations

  1. Computational Systems Biochemistry Research Group, Max-Planck Institute of Biochemistry, Martinsried, Germany.
  2. Chan Zuckerberg Biohub, San Francisco, CA, USA.
  3. Evotec München GmbH, Martinsried, Germany.
  4. Institute for Immunology, Johannes Gutenberg University, Mainz, Germany.
  5. Bruker Daltonik, GmbH, Bremen, Germany.
  6. Bosch Center for Artificial Intelligence, Renningen, Germany.
  7. European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
  8. School of Life and Environmental Sciences, Charles Perkins Centre, University of Sydney, Camperdown, New South Wales, Australia.
  9. Computational Systems Biochemistry Research Group, Max-Planck Institute of Biochemistry, Martinsried, Germany.
  10. Department of Biological and Medical Psychology, University of Bergen, Bergen, Norway.

PMID: 34239088 DOI: 10.1038/s41587-021-00968-7

Abstract

MaxDIA is a software platform for analyzing data-independent acquisition (DIA) proteomics data within the MaxQuant software environment. Using spectral libraries, MaxDIA achieves deep proteome coverage with substantially better coefficients of variation in protein quantification than other software. MaxDIA is equipped with accurate false discovery rate (FDR) estimates on both library-to-DIA match and protein levels, including when using whole-proteome predicted spectral libraries. This is the foundation of discovery DIA: hypothesis-free analysis of DIA samples without a library and with reliable FDR control. MaxDIA performs three- or four-dimensional feature detection of fragment data, and scoring of matches is augmented by machine learning on the features of an identification. MaxDIA's bootstrap DIA workflow performs multiple rounds of matching with increasing quality of recalibration and stringency of matching to the library. Combining MaxDIA with either of two new technologies, BoxCar acquisition or trapped ion mobility spectrometry, leads to deep and accurate proteome quantification.

© 2021. The Author(s).
