AAAI 2021
Scaling-Up Robust Gradient Descent Techniques
Abstract
We study a scalable alternative to robust gradient descent (RGD) techniques that can be used when losses and/or gradients may be heavy-tailed, though this will be unknown to the learner. The core technique is simple: instead of trying to robustly aggregate gradients at each step, which is costly and leads to sub-optimal dimension dependence in risk bounds, we choose a candidate which does not diverge too far from the majority of cheap stochastic sub-processes run over partitioned data. This lets us retain the formal strength of RGD methods at a fraction of the cost.
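The core idea described above — run cheap stochastic sub-processes over disjoint data partitions, then select a candidate that does not diverge far from the majority — can be sketched as follows. This is an illustrative instance, not the paper's exact procedure: the least-squares model, the step size, the partition count, and the median-distance selection rule are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sgd_on_subset(X, y, steps=200, lr=0.05):
    # Cheap stochastic sub-process: plain SGD for a least-squares
    # linear model on one data partition (illustrative choice).
    w = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(steps):
        i = rng.integers(n)
        w -= lr * (X[i] @ w - y[i]) * X[i]
    return w

def merge_by_majority(candidates):
    # Select the candidate with the smallest median distance to the
    # other candidates, i.e. the one that stays closest to the majority.
    C = np.stack(candidates)
    dists = np.linalg.norm(C[:, None, :] - C[None, :, :], axis=2)
    return C[np.argmin(np.median(dists, axis=1))]

# Synthetic regression task with heavy-tailed (Student-t) noise,
# an assumed setup for demonstration only.
n, d, k = 3000, 5, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + rng.standard_t(df=2.1, size=n)

# Partition the data, run one cheap sub-process per partition, then
# merge the resulting candidates robustly.
parts = np.array_split(rng.permutation(n), k)
candidates = [sgd_on_subset(X[idx], y[idx]) for idx in parts]
w_hat = merge_by_majority(candidates)
```

Because only the final merge step compares candidates, the per-step cost is that of ordinary SGD; the robust aggregation happens once over k vectors rather than at every gradient update.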
Context
- Venue: AAAI Conference on Artificial Intelligence
- Paper id: 581824847651193764