Arrow Research search
Back to JMLR

JMLR 2014

Training Highly Multiclass Classifiers

Journal Article Articles Artificial Intelligence ยท Machine Learning

Abstract

Classification problems with thousands or more classes often have a large range of class-confusabilities, and we show that the more-confusable classes add more noise to the empirical loss that is minimized during training. We propose an online solution that reduces the effect of highly confusable classes in training the classifier parameters, and focuses the training on pairs of classes that are easier to differentiate at any given time in the training. We also show that the adagrad method, recently proposed for automatically decreasing step sizes for convex stochastic gradient descent, can also be profitably applied to the nonconvex joint training of supervised dimensionality reduction and linear classifiers as done in Wsabie. Experiments on ImageNet benchmark data sets and proprietary image recognition problems with 15,000 to 97,000 classes show substantial gains in classification accuracy compared to one-vs- all linear SVMs and Wsabie. [abs] [ pdf ][ bib ] &copy JMLR 2014. ( edit, beta )

Authors

Keywords

No keywords are indexed for this paper.

Context

Venue
Journal of Machine Learning Research
Archive span
2000-2026
Indexed papers
4180
Paper id
322133866214159422