SAR QSAR Environ Res. 2016 Oct;27(10):799-811. doi: 10.1080/1062936X.2016.1238010. Epub 2016 Oct 06.

Classification of biodegradable materials using QSAR modelling with uncertainty estimation.

SAR and QSAR in environmental research

W F C Rocha, D A Sheen

Affiliations

  1. Division of Chemical Metrology, National Institute of Metrology, Quality and Technology - INMETRO, Duque de Caxias, Brazil.
  2. Chemical Sciences Division, National Institute of Standards and Technology, Gaithersburg, USA.

PMID: 27710037 PMCID: PMC5382130 DOI: 10.1080/1062936X.2016.1238010

Abstract

The ability to determine the biodegradability of chemicals without resorting to expensive tests is ecologically and economically desirable. Models based on quantitative structure-activity relations (QSAR) provide some promise in this direction. However, QSAR models in the literature rarely provide uncertainty estimates in more detail than aggregated statistics such as the sensitivity and specificity of the model's predictions. Almost never is there a means of assessing the uncertainty in an individual prediction. Without an uncertainty estimate, it is impossible to assess the trustworthiness of any particular prediction, which leaves the model with a low utility for regulatory purposes. In the present work, a QSAR model with uncertainty estimates is used to predict biodegradability for a set of substances from a publicly available data set. Separation was performed using a partial least squares discriminant analysis model, and the uncertainty was estimated using bootstrapping. The uncertainty prediction allows for confidence intervals to be assigned to any of the model's predictions, allowing for a more complete assessment of the model than would be possible through a traditional statistical analysis. The results presented here are broadly applicable to other areas of modelling as well, because the calculation of the uncertainty will clearly demonstrate where additional tests are needed.

Keywords: Partial least squares discriminant analysis; QSAR; biodegradable materials; bootstrap; machine learning; uncertainty estimation
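The approach summarised in the abstract (a discriminant model whose individual predictions carry bootstrap confidence intervals) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single latent variable, a 0/1 class coding with a 0.5 decision threshold, a percentile bootstrap, and synthetic two-descriptor data. The names `fit_pls1`, `predict_score`, `bootstrap_interval`, `classify` and `make_data` are hypothetical and introduced only for illustration.

```python
import math
import random


def fit_pls1(X, y):
    """Fit a one-latent-variable PLS model of a 0/1 class label on descriptors."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximum covariance between descriptors and label.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0.0:  # e.g. a resample that contains only one class
        return x_mean, y_mean, [0.0] * p, 0.0
    w = [v / norm for v in w]
    # Latent scores and the regression coefficient of the label on them.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_mean, y_mean, w, q


def predict_score(model, x):
    """Predicted class score for one sample; above 0.5 is read as class 1."""
    x_mean, y_mean, w, q = model
    t = sum((xj - mj) * wj for xj, mj, wj in zip(x, x_mean, w))
    return y_mean + q * t


def bootstrap_interval(X, y, x_query, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the score of a single query sample."""
    rng = random.Random(seed)
    n = len(X)
    scores = []
    for _ in range(n_boot):
        # Resample the training set with replacement and refit the model.
        idx = [rng.randrange(n) for _ in range(n)]
        model = fit_pls1([X[i] for i in idx], [y[i] for i in idx])
        scores.append(predict_score(model, x_query))
    scores.sort()
    lo = scores[int(alpha / 2 * n_boot)]
    hi = scores[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


def classify(lo, hi, threshold=0.5):
    """Assign a class only when the whole interval lies on one side of the threshold."""
    if lo > threshold:
        return "class 1"
    if hi < threshold:
        return "class 0"
    return "uncertain"


def make_data(seed=0, n_per_class=40):
    """Synthetic stand-in data (assumption): two Gaussian clusters, two descriptors."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 2.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 0.7), rng.gauss(centre, 0.7)])
            y.append(label)
    return X, y


X, y = make_data()
for query in ([3.0, 3.0], [1.0, 1.0], [-1.0, -1.0]):
    lo, hi = bootstrap_interval(X, y, query)
    print(query, round(lo, 3), round(hi, 3), classify(lo, hi))
```

A query whose interval straddles the threshold is reported as "uncertain" rather than forced into a class, which is the kind of per-prediction assessment the abstract argues aggregate sensitivity and specificity cannot give. A real analysis would use more latent variables, chosen by cross-validation, and measured molecular descriptors.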
