Analysis of Variance of Cross-Validation Estimators of the Generalization Error

Authors: Marianthi Markatou, Hong Tian, Sham

This paper brings together methods from two different disciplines: statistics and machine learning. We address the problem of estimating the variance of cross-validation (CV) estimators of the generalization error. In particular, we approach the problem of variance estimation of the CV estimators of generalization error as a problem in approximating the moments of a statistic. The approximation illustrates the role of training and test sets in the performance of the algorithm. It provides a unifying approach to evaluation of various methods used in obtaining training and test sets and it takes into account the variability due to different training and test sets. For the simple problem of predicting the sample mean and in the case of smooth loss functions, we show that the variance of the CV estimator of the generalization error is a function of the moments of the random variables $Y = \mathrm{Card}(S_j \cap S_{j'})$ and $Y^* = \mathrm{Card}(S_j^c \cap S_{j'}^c)$, where $S_j$, $S_{j'}$ are two training sets and $S_j^c$, $S_{j'}^c$ are the corresponding test sets. We prove that the distribution of $Y$ and $Y^*$ is hypergeometric and we compare our estimator with the one proposed by Nadeau and Bengio (2003). We extend these results to the regression case and the case of absolute error loss, and indicate how the methods can be extended to the classification case. We illustrate the results through simulation.
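The hypergeometric claim in the abstract can be checked numerically. The sketch below is not the authors' code. It assumes a simple resampling scheme in which two training sets of equal size are drawn independently and uniformly without replacement from n observations, and it compares the simulated mean and variance of their overlap (the number of observations the two training sets share) with the standard hypergeometric moments. The function names and the default sizes are hypothetical choices made for illustration.

```python
import random


def overlap_size(n, n_train, rng):
    """Size of the intersection of two independently drawn training sets.

    Assumption: each training set is a uniform sample of n_train indices
    drawn without replacement from range(n).
    """
    first = set(rng.sample(range(n), n_train))
    second = set(rng.sample(range(n), n_train))
    return len(first & second)


def simulate_overlap(n=20, n_train=15, trials=20000, seed=0):
    """Monte Carlo estimate of the mean and variance of the overlap."""
    rng = random.Random(seed)
    draws = [overlap_size(n, n_train, rng) for _ in range(trials)]
    mean = sum(draws) / trials
    var = sum((d - mean) ** 2 for d in draws) / trials
    return mean, var


def hypergeom_moments(n, n_train):
    """Mean and variance of a hypergeometric variable with population n,
    n_train marked items (the first training set) and n_train draws
    (the second training set)."""
    p = n_train / n
    mean = n_train * p
    var = n_train * p * (1 - p) * (n - n_train) / (n - 1)
    return mean, var


if __name__ == "__main__":
    print("simulated  :", simulate_overlap())
    print("theoretical:", hypergeom_moments(20, 15))
```

Under the same assumption, the overlap of the two test sets equals n - 2 * n_train plus the training-set overlap, so it is a shifted copy of the same variable and its variance is identical.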
Journal: Journal of Machine Learning Research (JMLR), vol. 6, pp. 1127-1168, 2005