FOCS 2023 Conference Paper
- Yair Carmon
- Arun Jambulapati
- Yujia Jin
- Yin Tat Lee
- Daogao Liu
- Aaron Sidford
- Kevin Tian
We introduce a new tool for stochastic convex optimization (SCO): a Reweighted Stochastic Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density. Combining ReSQue with recent advances in ball oracle acceleration [CJJ+20], [ACJ+21], we develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings. For an SCO objective constrained to the unit ball in $\mathbb{R}^{d}$, we obtain the following results (up to polylogarithmic factors).
1) We give a parallel algorithm obtaining optimization error $\epsilon_{\text{opt}}$ with $d^{1/3} \epsilon_{\text{opt}}^{-2/3}$ gradient oracle query depth and $d^{1/3} \epsilon_{\text{opt}}^{-2/3} + \epsilon_{\text{opt}}^{-2}$ gradient queries in total, assuming access to a bounded-variance stochastic gradient estimator. For $\epsilon_{\text{opt}} \in [d^{-1}, d^{-1/4}]$, our algorithm matches the state-of-the-art oracle depth of [BJL+19] while maintaining the optimal total work of stochastic gradient descent.
2) Given $n$ samples of Lipschitz loss functions, prior works [BFTT19], [BFGT20], [AFKT21], [KLL21] established that if $n \gtrsim d \epsilon_{\mathrm{dp}}^{-2}$, then $(\epsilon_{\mathrm{dp}}, \delta)$-differential privacy is attained at no asymptotic cost to the SCO utility. However, these prior works all required a superlinear number of gradient queries. We close this gap for sufficiently large $n \gtrsim d^{2} \epsilon_{\mathrm{dp}}^{-3}$ by using ReSQue to design an algorithm with near-linear gradient query complexity in this regime.
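To make the estimator concrete, the sketch below illustrates the core reweighting idea behind a ReSQue-style estimator: the gradient of the Gaussian-convolved function $f_\rho(x) = \mathbb{E}_{g \sim \mathcal{N}(0, \rho^2 I)}[f(x+g)]$ can be estimated at a query point $x$ near a reference point $\bar{x}$ by reusing samples drawn around $\bar{x}$ and importance-reweighting them by the ratio of Gaussian densities. This is a minimal illustrative sketch, not the paper's exact estimator or its variance analysis; the function names and interface are assumptions.

```python
import numpy as np

def reweighted_smoothed_grad(grad_f, x, x_bar, samples, rho):
    """Importance-reweighted estimate of the gradient of f convolved
    with N(0, rho^2 I), evaluated at a query point x.

    Reuses `samples` (shape (m, d), rows drawn ~ N(x_bar, rho^2 I)
    around the reference point x_bar) by reweighting each sample y
    with the Gaussian density ratio N(y; x, rho^2 I) / N(y; x_bar, rho^2 I).
    Illustrative sketch only -- not the paper's exact ReSQue estimator.
    """
    # log of the density ratio; the normalizing constants cancel
    log_w = (np.sum((samples - x_bar) ** 2, axis=1)
             - np.sum((samples - x) ** 2, axis=1)) / (2 * rho ** 2)
    w = np.exp(log_w)
    # stochastic gradient queries at the shared sample points
    grads = np.stack([grad_f(y) for y in samples])
    # weighted average is an unbiased estimate of grad f_rho(x)
    return (w[:, None] * grads).mean(axis=0)

# Sanity check on f(x) = ||x||^2 / 2, where grad f_rho(x) = x exactly:
rng = np.random.default_rng(0)
d, m, rho = 3, 20000, 0.5
x_bar = np.zeros(d)
x = 0.1 * np.ones(d)
samples = x_bar + rho * rng.standard_normal((m, d))
est = reweighted_smoothed_grad(lambda y: y, x, x_bar, samples, rho)
```

The point of the reweighting is that one batch of Gaussian samples (and the gradient queries at those points) can serve many nearby query points, which is what enables sharing work across iterations inside a small ball in the ball-acceleration framework.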