
ICML 2024

Universal Gradient Methods for Stochastic Convex Optimization

Conference Paper · Accept (Poster) · Artificial Intelligence · Machine Learning

Abstract

We develop universal gradient methods for Stochastic Convex Optimization (SCO). Our algorithms automatically adapt not only to the oracle’s noise but also to the Hölder smoothness of the objective function without a priori knowledge of the particular setting. The key ingredient is a novel strategy for adjusting step-size coefficients in the Stochastic Gradient Method (SGD). Unlike AdaGrad, which accumulates gradient norms, our Universal Gradient Method accumulates appropriate combinations of gradient and iterate differences. The resulting algorithm has state-of-the-art worst-case convergence rate guarantees for the entire Hölder class, including, in particular, both nonsmooth functions and those with Lipschitz continuous gradient. We also present the Universal Fast Gradient Method for SCO, which enjoys optimal efficiency estimates.
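As a rough sketch of the contrast described in the abstract, the snippet below places an AdaGrad-style step-size rule next to a step-size rule whose accumulator is built from gradient and iterate differences. The specific accumulator (the absolute inner product of g_k - g_{k-1} and x_k - x_{k-1}), the initial regularizer, and the test problem are illustrative assumptions, not the paper's exact update rule.

```python
import numpy as np

def adagrad_sgd(stoch_grad, x0, n_steps, D=1.0, eps=1e-12):
    """AdaGrad-style SGD: the step size shrinks with the sum of squared gradient norms."""
    x = np.asarray(x0, dtype=float).copy()
    acc = 0.0
    for _ in range(n_steps):
        g = stoch_grad(x)
        acc += float(np.dot(g, g))               # accumulate ||g_k||^2
        x = x - (D / np.sqrt(acc + eps)) * g
    return x

def difference_accumulating_sgd(stoch_grad, x0, n_steps, D=1.0):
    """Hypothetical sketch: the accumulator is built from gradient differences and
    iterate differences instead of gradient norms. The combination used here is an
    illustrative assumption, not the update rule from the paper."""
    x = np.asarray(x0, dtype=float).copy()
    x_prev = x.copy()
    g_prev = stoch_grad(x)
    acc = 1.0                                    # assumed initial regularizer
    for _ in range(n_steps):
        g = stoch_grad(x)
        # For a convex objective the exact gradient is monotone, so
        # <grad f(x_k) - grad f(x_{k-1}), x_k - x_{k-1}> >= 0; the absolute value
        # guards against oracle noise making an individual term negative.
        acc += abs(float(np.dot(g - g_prev, x - x_prev)))
        x_prev, g_prev = x.copy(), g.copy()
        x = x - (D / np.sqrt(acc)) * g
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Noisy gradient oracle for the smooth convex test problem f(x) = 0.5 * ||x||^2.
    noisy_grad = lambda x: x + 0.1 * rng.standard_normal(x.shape)
    x0 = np.ones(5)
    print("AdaGrad-style:   ", adagrad_sgd(noisy_grad, x0, 500))
    print("Difference-based:", difference_accumulating_sgd(noisy_grad, x0, 500))
```

The design intent the sketch tries to convey is that an accumulator of this kind grows with the curvature actually encountered along the trajectory rather than with raw gradient magnitudes, which is one way such a method could adapt to unknown Hölder smoothness; the exact mechanism in the paper may differ.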

Authors

Keywords

No keywords are indexed for this paper.

Context

Venue: International Conference on Machine Learning
Archive span: 1993-2025
Indexed papers: 16471
Paper id: 664411243395166004