A PAC-Bayesian Margin Bound for Linear Classifiers: Why SVMs work

Ralf Herbrich; Thore Graepel

Back to NeurIPS

NeurIPS 2000

A PAC-Bayesian Margin Bound for Linear Classifiers: Why SVMs work

Conference Paper Artificial Intelligence · Machine Learning

PDF Details

Abstract

We present a bound on the generalisation error of linear classifiers in terms of a refined margin quantity on the training set. The result is obtained in a PAC- Bayesian framework and is based on geometrical arguments in the space of linear classifiers. The new bound constitutes an exponential improvement of the so far tightest margin bound by Shawe-Taylor et al. [8] and scales logarithmically in the inverse margin. Even in the case of less training examples than input dimensions sufficiently large margins lead to non-trivial bound values and - plexity term. Furthermore, the classical margin is too coarse a measure for the essential quantity that controls the generalisation error: the volume ratio between the whole hypothesis space and the subset of consistent hypotheses. The practical relevance of the result lies in the fact that the well-known support vector machine is optimal w. r. t. the new bound only if the feature vectors are all of the same length. As a consequence we recommend to use SVMs on normalised feature vectors only - a recommendation that is well supported by our numerical experiments on two benchmark data sets. for maximum margins - to a vanishing com(cid: 173)

A PAC-Bayesian Margin Bound for Linear Classifiers: Why SVMs work

Abstract

Authors

Keywords

Context