STOC 2019
Memory-sample tradeoffs for linear regression with small error
Abstract
We consider the problem of performing linear regression over a stream of d-dimensional examples, and show that any algorithm that uses a subquadratic amount of memory exhibits a slower rate of convergence than can be achieved without memory constraints. Specifically, consider a sequence of labeled examples (a_1, b_1), (a_2, b_2), …, with a_i drawn independently from a d-dimensional isotropic Gaussian, and where b_i = ⟨a_i, x⟩ + η_i, for a fixed x ∈ ℝ^d with ‖x‖_2 = 1 and with independent noise η_i drawn uniformly from the interval [−2^(−d/5), 2^(−d/5)]. We show that any algorithm with at most d²/4 bits of memory requires at least Ω(d log log(1/ε)) samples to approximate x to ℓ_2 error ε with probability of success at least 2/3, for ε sufficiently small as a function of d. In contrast, for such ε, x can be recovered to error ε with probability 1 − o(1) with memory O(d² log(1/ε)) using d examples. This represents the first nontrivial lower bounds for regression with super-linear memory, and may open the door for strong memory/sample tradeoffs for continuous optimization.
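The unconstrained baseline in the abstract — recovering x from d examples with O(d² log(1/ε)) memory — amounts to storing the examples and solving the resulting linear system. A minimal NumPy sketch of that setup follows; the dimension d and the random seed are illustrative choices, not parameters from the paper, and least squares stands in for whatever exact solver one prefers.

```python
import numpy as np

# Sketch of the problem setup from the abstract: a_i ~ N(0, I_d),
# b_i = <a_i, x> + eta_i with eta_i uniform in [-2^(-d/5), 2^(-d/5)],
# and ||x||_2 = 1. We collect d examples and recover x by least squares.
rng = np.random.default_rng(0)  # seed is an arbitrary choice
d = 50                           # illustrative dimension

x = rng.standard_normal(d)
x /= np.linalg.norm(x)           # enforce ||x||_2 = 1

noise_bound = 2.0 ** (-d / 5)    # 2^(-d/5), exponentially small in d
A = rng.standard_normal((d, d))  # d isotropic Gaussian examples, one per row
eta = rng.uniform(-noise_bound, noise_bound, size=d)
b = A @ x + eta                  # noisy labels

# Unconstrained recovery: solve the (square) linear system.
x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
err = np.linalg.norm(x_hat - x)  # ell_2 recovery error
```

Because the noise is exponentially small in d, the recovery error is tiny up to the conditioning of the Gaussian example matrix; the memory-bounded lower bound in the paper says no subquadratic-memory algorithm can match this convergence rate.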
Context
- Venue: ACM Symposium on Theory of Computing