IEEE Trans Neural Netw Learn Syst. 2012 Aug;23(8):1304-12. doi: 10.1109/TNNLS.2012.2199516.

Study on the impact of partition-induced dataset shift on k-fold cross-validation.

IEEE transactions on neural networks and learning systems

Jose García Moreno-Torres, José A Saez, Francisco Herrera

PMID: 24807526 DOI: 10.1109/TNNLS.2012.2199516

Abstract

Cross-validation is a widely used technique for evaluating classifier performance. However, it can introduce dataset shift, a harmful factor that is often overlooked and can lead to inaccurate performance estimates. This paper analyzes the prevalence and impact of partition-induced covariate shift in different k-fold cross-validation schemes. From the experimental results, we conclude that the degree of partition-induced covariate shift depends on the cross-validation scheme used. Consequently, schemes that induce more shift may harm the correctness of a single-classifier performance estimate and also increase the number of cross-validation repetitions needed to reach a stable performance estimate.
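
To make the idea of partition-induced covariate shift concrete, the following is a minimal sketch and not the procedure used in the paper. It assumes a single numeric feature and uses the gap between the test-fold mean and the training mean as a crude shift measure. The function names (`kfold_indices`, `sorted_deal_indices`, `mean_shift`), the synthetic data, and the "sort then deal round-robin" scheme are all hypothetical choices made for illustration.

```python
import random
from statistics import mean


def kfold_indices(n, k, seed=0):
    """Plain random k-fold: shuffle indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def sorted_deal_indices(values, k):
    """Assumed alternative scheme: sort by feature value, then deal
    round-robin so that every fold spans the whole feature range."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    return [order[i::k] for i in range(k)]


def mean_shift(values, folds):
    """Average absolute gap between each test-fold mean and the mean of
    the remaining (training) examples; a crude proxy for covariate shift."""
    gaps = []
    for test in folds:
        held_out = set(test)
        train_vals = [v for i, v in enumerate(values) if i not in held_out]
        test_vals = [values[i] for i in test]
        gaps.append(abs(mean(test_vals) - mean(train_vals)))
    return mean(gaps)


# Synthetic one-dimensional feature (illustrative data only).
rng = random.Random(42)
x = [rng.gauss(0.0, 1.0) for _ in range(100)]

random_shift = mean_shift(x, kfold_indices(len(x), k=10))
dealt_shift = mean_shift(x, sorted_deal_indices(x, k=10))
print(f"random k-fold: {random_shift:.3f}, sorted deal: {dealt_shift:.3f}")
```

Comparing the two printed values for several seeds gives a rough sense of how much the partitioning scheme alone can change the train/test mismatch, before any classifier is trained.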

Publication Types