STOC 2019
Memory-sample tradeoffs for linear regression with small error
Abstract
We consider the problem of performing linear regression over a stream of d-dimensional examples, and show that any algorithm that uses a subquadratic amount of memory exhibits a slower rate of convergence than can be achieved without memory constraints. Specifically, consider a sequence of labeled examples (a_1, b_1), (a_2, b_2), …, with a_i drawn independently from a d-dimensional isotropic Gaussian, and where b_i = ⟨a_i, x⟩ + η_i, for a fixed x ∈ ℝ^d with ‖x‖_2 = 1 and with independent noise η_i drawn uniformly from the interval [−2^(−d/5), 2^(−d/5)]. We show that any algorithm with at most d²/4 bits of memory requires at least Ω(d log log(1/ε)) samples to approximate x to ℓ_2 error ε with probability of success at least 2/3, for ε sufficiently small as a function of d. In contrast, for such ε, x can be recovered to error ε with probability 1 − o(1) with memory O(d² log(1/ε)) using d examples. This represents the first nontrivial lower bounds for regression with super-linear memory, and may open the door for strong memory/sample tradeoffs for continuous optimization.
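The unconstrained baseline in the abstract — recovering x from d examples with O(d² log(1/ε)) memory — amounts to storing the examples and solving the resulting linear system. A minimal NumPy sketch of that setup follows; the dimension d and the random seed are illustrative choices, not parameters from the paper, and least squares stands in for whatever exact solver one prefers.

```python
import numpy as np

# Sketch of the problem setup from the abstract: a_i ~ N(0, I_d),
# b_i = <a_i, x> + eta_i with eta_i uniform in [-2^(-d/5), 2^(-d/5)],
# and ||x||_2 = 1. We collect d examples and recover x by least squares.
rng = np.random.default_rng(0)  # seed is an arbitrary choice
d = 50                           # illustrative dimension

x = rng.standard_normal(d)
x /= np.linalg.norm(x)           # enforce ||x||_2 = 1

noise_bound = 2.0 ** (-d / 5)    # 2^(-d/5), exponentially small in d
A = rng.standard_normal((d, d))  # d isotropic Gaussian examples, one per row
eta = rng.uniform(-noise_bound, noise_bound, size=d)
b = A @ x + eta                  # noisy labels

# Unconstrained recovery: solve the (square) linear system.
x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
err = np.linalg.norm(x_hat - x)  # ell_2 recovery error
```

Because the noise is exponentially small in d, the recovery error is tiny up to the conditioning of the Gaussian example matrix; the memory-bounded lower bound in the paper says no subquadratic-memory algorithm can match this convergence rate.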
Context
- Venue: ACM Symposium on Theory of Computing