A PTAS for ℓp-Low Rank Approximation

Frank Ban; Vijay Bhattiprolu; Karl Bringmann; Pavel Kolev; Euiwoong Lee; David P. Woodruff

Back to SODA

SODA 2019

A PTAS for ℓp-Low Rank Approximation

Conference Paper Accepted Paper Algorithms and Complexity · Theoretical Computer Science

Details

Abstract

A number of recent works have studied algorithms for entrywise ℓ p -low rank approximation, namely algorithms which given an n × d matrix A (with n ≥ d ), output a rank- k matrix B minimizing ‖ A – B ‖ p p = ∑ i, j |A i, j – B i, j | p when p > 0; and ‖ A – B ‖0 = ∑ i, j [ A i, j ≠ B i, j ] for p = 0, where [·] is the Iverson bracket, that is, ‖ A – B ‖ 0 denotes the number of entries ( i, j ) for which A i, j ≠ B i, j. For p = 1, this is often considered more robust than the SVD, while for p = 0 this corresponds to minimizing the number of disagreements, or robust PCA. This problem is known to be NP-hard for p ∊ {0, 1}, already for k = 1, and while there are polynomial time approximation algorithms, their approximation factor is at best poly( k ). It was left open if there was a polynomial-time approximation scheme (PTAS) for ℓ p -approximation for any p ≥ 0. We show the following: 1. On the algorithmic side, for p ∊ (0, 2), we give the first n poly( k / ε ) time (1 + ε )-approximation algorithm. For p = 0, there are various problem formulations, a common one being the binary setting in which A ∊ {0, 1} n × d and B = U · V, where U ∊ {0, 1} n × k and V ∊ {0, 1} k × d. There are also various notions of multiplication U · V, such as a matrix product over the reals, over a finite field, or over a Boolean semiring. We give the first almost-linear time approximation scheme for what we call the Generalized Binary ℓ 0 - Rank-k problem, for which these variants are special cases. Our algorithm computes (1 + ε )-approximation in time (1/ ε ) 2 O ( k ) / ε 2 · nd 1 + o (1), where o (1) hides a factor (log log d ) 1. 1 / log d. In addition, for the case of finite fields of constant size, we obtain an alternate PTAS running in time n · d poly( k / ε ). 2. On the hardness front, for p ∊ (1, 2), we show under the Small Set Expansion Hypothesis and Exponential Time Hypothesis (ETH), there is no constant factor approximation algorithm running in time 2 kδ for a constant δ > 0, showing an exponential dependence on k is necessary. For p = 0, we observe that there is no approximation algorithm for the Generalized Binary ℓ 0 -Rank- k problem running in time 2 2 δk for a constant δ > 0. We also show for finite fields of constant size, under the ETH, that any fixed constant factor approximation algorithm requires 2 kδ time for a constant δ > 0.

A PTAS for ℓp-Low Rank Approximation

Abstract

Authors

Keywords

Context