Keywords (7)
Analysis of Variance
Cross Validation
Generalization Error
Loss Function
Machine Learning
Random Variable
Variance Estimation
Related Publications (3)
Inference for the Generalization Error
Statistical Comparisons of Classifiers over Multiple Data Sets
No Unbiased Estimator of the Variance of K-Fold Cross-Validation
Analysis of Variance of Cross-Validation Estimators of the Generalization Error (Citations: 13)
Marianthi Markatou, Hong Tian, Shameek Biswas, George Hripcsak
This paper brings together methods from two different disciplines: statistics and machine learning. We address the problem of estimating the variance of cross-validation (CV) estimators of the generalization error. In particular, we approach the problem of variance estimation of the CV estimators of generalization error as a problem in approximating the moments of a statistic. The approximation illustrates the role of training and test sets in the performance of the algorithm. It provides a unifying approach to evaluation of various methods used in obtaining training and test sets and it takes into account the variability due to different training and test sets. For the simple problem of predicting the sample mean and in the case of smooth loss functions, we show that the variance of the CV estimator of the generalization error is a function of the moments of the random variables Y = Card(S_j ∩ S_j') and Y* = Card(S_j^c ∩ S_j'^c), where S_j and S_j' are two training sets, and S_j^c and S_j'^c are the corresponding test sets. We prove that the distribution of Y and Y* is hypergeometric and we compare our estimator with the one proposed by Nadeau and Bengio (2003). We extend these results in the regression case and the case of absolute error loss, and indicate how the methods can be extended to the classification case. We illustrate the results through simulation.
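The hypergeometric claim in the abstract can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it assumes two training sets of equal size are drawn independently and uniformly at random, without replacement, from the same n data points. The names `overlap_pmf` and `simulate_overlap`, and the values n = 20 and training size 12, are hypothetical choices made for the example.

```python
import random
from math import comb


def overlap_pmf(n, n_train, y):
    """Hypergeometric P(Y = y), where Y is the number of points shared by
    two independent, uniformly drawn training sets of size n_train out of n.
    (Assumed sampling scheme; not necessarily the paper's exact setup.)"""
    return comb(n_train, y) * comb(n - n_train, n_train - y) / comb(n, n_train)


def simulate_overlap(n, n_train, reps, seed=0):
    """Empirical distribution of the overlap size over `reps` random draws."""
    rng = random.Random(seed)
    population = range(n)
    counts = {}
    for _ in range(reps):
        s1 = set(rng.sample(population, n_train))
        s2 = set(rng.sample(population, n_train))
        y = len(s1 & s2)
        counts[y] = counts.get(y, 0) + 1
    return {y: c / reps for y, c in sorted(counts.items())}


# Hypothetical sizes chosen only for illustration.
n, n_train = 20, 12
empirical = simulate_overlap(n, n_train, reps=20000)
for y, freq in empirical.items():
    # Columns: overlap size, simulated frequency, hypergeometric probability.
    print(y, round(freq, 3), round(overlap_pmf(n, n_train, y), 3))
```

Under these assumptions the simulated frequencies should track the hypergeometric probabilities closely, and the mean overlap should be near n_train² / n. The same check applies to the test-set overlap Y* by replacing the training size with the test size n - n_train.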
Journal: Journal of Machine Learning Research (JMLR), vol. 6, pp. 1127-1168, 2005
Citation Context (10)
...The goal of parameter selection is to identify the best hyperparameter pair (c, γ) so that the classifier can minimize the generalization error, generate accurate predictions, and mitigate the overfitting problem [8, 27]...
...The problem of overfitting in supervised machine learning is a situation that in order to minimize the generalization error of classifier [27] (i.e...
...fitting. The m-fold cross-validation is one of the most effective methods to avoid overfitting based on the findings of [4, 27]...
...Depending on the result of [27], the generalization error of 10-fold cross-validation is smaller than those of others...
Yung-Ming Li, et al. Building a qualitative recruitment system via SVM with MCDM approach
...The performance of each classification algorithm is evaluated as the average probability of error computed using five-fold cross-validation. Standard errors of the cross-validation estimate are computed using the adjusted variance estimate [52], [53] of the form...
Paul L. D. Roberts, et al. Multiview, Broadband Acoustic Classification of Marine Fish: A Machine...
...There are some research studies, such as Markatou et al. (2005), Blum et al. (1999) and Kääriäinen and Langford (2005), that analyzed the generalization performance of SVM and proposed to estimate the true error of learned classifiers...
Tanasanee Phienthrakul, et al. Evolutionary strategies for hyperparameters of support vector machines...
...Recently, statistical analysis is highly demanded in any research work and thus, we can find recent studies that propose some methods for conducting comparisons among various approaches (Demšar 2006; Markatou et al. 2005)...
Salvador García, et al. A study of statistical techniques and performance measures for genetic...
...the Markatou estimator (see Markatou et al., 2005...
Alexander Binun, et al. A decision theoretic approach to combining information filtering
References (27)
No Unbiased Estimator of the Variance of K-Fold Cross-Validation (Citations: 46)
Yoshua Bengio, Yves Grandvalet
Journal: Journal of Machine Learning Research (JMLR), vol. 5, pp. 1089-1105, 2004
Approximate Statistical Test For Comparing Supervised Classification Learning Algorithms (Citations: 421)
Thomas G. Dietterich
Journal: Neural Computation (NECO), vol. 10, no. 7, pp. 1895-1923, 1998
Estimating the Error Rate of a Prediction Rule: Improvement on Cross-Validation (Citations: 558)
Bradley Efron
Journal: Journal of The American Statistical Association (J AMER STATIST ASSN), vol. 78, no. 382, pp. 316-331, 1983
An Introduction to the Bootstrap (Citations: 4383)
B. Efron, R. J. Tibshirani
Published in 1993.
The Estimation of Prediction Error: Covariance Penalties and Cross-Validation (Citations: 65)
Bradley Efron
Published in 2004.
Citations (13)
Building a qualitative recruitment system via SVM with MCDM approach (Citations: 1)
Yung-Ming Li, Cheng-Yang Lai, Chien-Pang Kao
Journal: Applied Intelligence (APIN), vol. 35, no. 1, pp. 75-88, 2011
Multiview, Broadband Acoustic Classification of Marine Fish: A Machine Learning Framework and Comparative Analysis
Paul L. D. Roberts, Jules S. Jaffe, Mohan M. Trivedi
Journal: IEEE Journal of Oceanic Engineering (IEEE J OCEANIC ENG), vol. 36, no. 1, pp. 90-104, 2011
Concentration inequalities of the cross-validation estimator for Empirical Risk Minimiser (Citations: 1)
Matthieu Cornec, Gabriel Peri, Timbre G
Published in 2010.
Evolutionary strategies for hyperparameters of support vector machines based on multiscale radial basis function kernels
Tanasanee Phienthrakul, Boonserm Kijsirikul
Journal: Soft Computing (SOCO), vol. 14, no. 7, pp. 681-699, 2010
Concentration inequalities of the cross-validation estimate for stable predictors
Matthieu Cornec
Published in 2010.