IUCrJ. 2020 Feb 27;7:342-354. doi: 10.1107/S2052252520000895. eCollection 2020 Mar 01.

The predictive power of data-processing statistics.

Melanie Vollmar, James M Parkhurst, Dominic Jaques, Arnaud Baslé, Garib N Murshudov, David G Waterman, Gwyndaf Evans

Affiliations

  1. Diamond Light Source Ltd, Harwell Science and Innovation Campus, Didcot OX11 0DE, England.
  2. MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, England.
  3. Institute for Cell and Molecular Biosciences, Newcastle University, Framlington Place, Newcastle upon Tyne NE2 1HH, England.
  4. Science Technology and Facilities Council, Rutherford Appleton Laboratory, Didcot OX11 0FA, England.
  5. Research Complex at Harwell, Rutherford Appleton Laboratory, Didcot OX11 0FA, England.

PMID: 32148861 PMCID: PMC7055369 DOI: 10.1107/S2052252520000895

Abstract

This study describes a method to estimate the likelihood of success in determining a macromolecular structure by X-ray crystallography and experimental single-wavelength anomalous dispersion (SAD) or multiple-wavelength anomalous dispersion (MAD) phasing based on initial data-processing statistics and sample crystal properties. Such a predictive tool can rapidly assess the usefulness of data and guide the collection of an optimal data set. The increase in data rates from modern macromolecular crystallography beamlines, together with a demand from users for real-time feedback, has led to pressure on computational resources and a need for smarter data handling. Statistical and machine-learning methods have been applied to construct a classifier that displays 95% accuracy for training and testing data sets compiled from 440 solved structures. Applying this classifier to new data achieved 79% accuracy. These scores already provide clear guidance as to the effective use of computing resources and offer a starting point for a personalized data-collection assistant.

© Melanie Vollmar et al. 2020.

Keywords: X-ray crystallography; experimental phasing; machine learning; macromolecular crystallography; phasing; structure determination
