On Distributed Adaptive Optimization with Gradient Compression

Xiaoyun Li; Belhal Karimi; Ping Li

ICLR 2022

Conference Paper · Poster Presentations · Artificial Intelligence · Machine Learning

Details

Abstract

We study COMP-AMS, a distributed optimization framework based on gradient averaging and the adaptive AMSGrad algorithm. Gradient compression with error feedback is applied to reduce the communication cost of gradient transmission. Our convergence analysis of COMP-AMS shows that this compressed gradient averaging strategy yields the same convergence rate as standard AMSGrad and also exhibits a linear speedup with respect to the number of local workers. Compared with recently proposed protocols for distributed adaptive methods, COMP-AMS is simple and convenient. Numerical experiments are conducted to justify the theoretical findings and demonstrate that the proposed method can achieve the same test accuracy as full-gradient AMSGrad with substantial communication savings. With its simplicity and efficiency, COMP-AMS can serve as a useful distributed training framework for adaptive methods.
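
The scheme described in the abstract can be illustrated with a minimal single-process sketch. This is not the authors' implementation. It assumes a top-k sparsifier as the compressor (the abstract does not name one), simulates the workers in a loop, and applies a textbook AMSGrad update without bias correction to the averaged compressed gradients. The names `top_k`, `comp_ams_sketch`, and `grad_fn`, as well as all hyperparameter values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def top_k(x, k):
    """Assumed compressor: keep the k largest-magnitude entries, zero the rest."""
    out = np.zeros_like(x)
    idx = np.argpartition(np.abs(x), -k)[-k:]
    out[idx] = x[idx]
    return out


def comp_ams_sketch(grad_fn, theta0, n_workers=4, k=2, steps=500,
                    lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """Simulate compressed gradient averaging followed by an AMSGrad step.

    grad_fn(theta, i) is a hypothetical callback returning worker i's gradient.
    """
    theta = np.asarray(theta0, dtype=float).copy()
    m = np.zeros_like(theta)       # first-moment estimate
    v = np.zeros_like(theta)       # second-moment estimate
    v_hat = np.zeros_like(theta)   # running maximum of v (the AMSGrad correction)
    errors = [np.zeros_like(theta) for _ in range(n_workers)]

    for _ in range(steps):
        sent = []
        for i in range(n_workers):
            # Error feedback: add back what earlier compressions dropped.
            corrected = grad_fn(theta, i) + errors[i]
            compressed = top_k(corrected, k)
            errors[i] = corrected - compressed  # residual kept locally
            sent.append(compressed)

        # Average the compressed gradients, then take one AMSGrad step.
        g_bar = np.mean(sent, axis=0)
        m = beta1 * m + (1 - beta1) * g_bar
        v = beta2 * v + (1 - beta2) * g_bar ** 2
        v_hat = np.maximum(v_hat, v)
        theta -= lr * m / (np.sqrt(v_hat) + eps)

    return theta
```

As a toy usage example, each worker could hold a quadratic loss `0.5 * ||theta - t_i||^2` with its own target `t_i`, so that `grad_fn = lambda theta, i: theta - targets[i]` and the minimizer of the averaged loss is the mean of the targets. Only k of the d coordinates are transmitted per worker per step, which is where the communication saving in this sketch comes from.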

Context

Venue: International Conference on Learning Representations