Arrow Research search

Author name cluster

Jonathan A. Kelner

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

24 papers
2 author rows

Possible papers

24

NeurIPS Conference 2024 Conference Paper

Semi-Random Matrix Completion via Flow-Based Adaptive Reweighting

  • Jonathan A. Kelner
  • Jerry Li
  • Allen Liu
  • Aaron Sidford
  • Kevin Tian

We consider the well-studied problem of completing a rank-$r$, $\mu$-incoherent matrix $\mathbf{M} \in \mathbb{R}^{d \times d}$ from incomplete observations. We focus on this problem in the semi-random setting where each entry is independently revealed with probability at least $p = \frac{\textup{poly}(r, \mu, \log d)}{d}$. Whereas multiple nearly-linear time algorithms have been established in the more specialized fully-random setting where each entry is revealed with probability exactly $p$, the only known nearly-linear time algorithm in the semi-random setting is due to [CG18], whose sample complexity has a polynomial dependence on the inverse accuracy and condition number and thus cannot achieve high-accuracy recovery. Our main result is the first high-accuracy nearly-linear time algorithm for solving semi-random matrix completion, and an extension to the noisy observation setting. Our result builds upon the recent short-flat decomposition framework of [KLLST23a, KLLST23b] and leverages fast algorithms for flow problems on graphs to solve adaptive reweighting subproblems efficiently.
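For context on the problem setup, here is a minimal alternating-least-squares completion baseline in NumPy: recovering a rank-$r$ matrix from a subset of revealed entries. This is a generic illustration, not the paper's flow-based adaptive reweighting algorithm; all dimensions, seeds, and parameters are illustrative choices.

```python
import numpy as np

def als_complete(M_obs, mask, r, iters=50, reg=1e-6):
    """Generic alternating-least-squares matrix completion baseline.

    M_obs: observed matrix (zeros where unobserved); mask: boolean matrix of
    revealed entries; r: target rank. Not the paper's algorithm.
    """
    d1, d2 = M_obs.shape
    rng = np.random.default_rng(0)
    U = rng.standard_normal((d1, r))
    V = rng.standard_normal((d2, r))
    for _ in range(iters):
        # Fix V, solve a small ridge regression for each row of U.
        for i in range(d1):
            obs = mask[i]
            Vo = V[obs]
            U[i] = np.linalg.solve(Vo.T @ Vo + reg * np.eye(r),
                                   Vo.T @ M_obs[i, obs])
        # Fix U, solve for each row of V.
        for j in range(d2):
            obs = mask[:, j]
            Uo = U[obs]
            V[j] = np.linalg.solve(Uo.T @ Uo + reg * np.eye(r),
                                   Uo.T @ M_obs[obs, j])
    return U @ V.T

# Rank-2 ground truth with roughly 60% of entries revealed.
rng = np.random.default_rng(1)
M = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))
mask = rng.random((30, 30)) < 0.6
M_hat = als_complete(M * mask, mask, r=2)
print(np.linalg.norm(M_hat - M) / np.linalg.norm(M))  # relative error; should be small
```

In the fully-random regime sketched here ALS behaves well; the semi-random setting the paper targets (an adversary may reveal *extra* entries) is exactly where naive reweighting-free approaches can be biased.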

FOCS Conference 2023 Conference Paper

Matrix Completion in Almost-Verification Time

  • Jonathan A. Kelner
  • Jerry Li 0001
  • Allen Liu
  • Aaron Sidford
  • Kevin Tian

We give a new framework for solving the fundamental problem of low-rank matrix completion, i.e., approximating a rank-$r$ matrix $\mathbf{M} \in \mathbb{R}^{m \times n}$ (where $m \geq n$) from random observations. First, we provide an algorithm which completes $\mathbf{M}$ on $99\%$ of rows and columns under no further assumptions on $\mathbf{M}$ from $\approx mr$ samples and using $\approx mr^{2}$ time. Then, assuming the row and column spans of $\mathbf{M}$ satisfy additional regularity properties, we show how to boost this partial completion guarantee to a full matrix completion algorithm by aggregating solutions to regression problems involving the observations. In the well-studied setting where $\mathbf{M}$ has incoherent row and column spans, our algorithms complete $\mathbf{M}$ to high precision from $mr^{2+o(1)}$ observations in $mr^{3+o(1)}$ time (omitting logarithmic factors in problem parameters), improving upon the prior state-of-the-art [JN15] which used $\approx mr^{5}$ samples and $\approx mr^{7}$ time. Under an assumption on the row and column spans of $\mathbf{M}$ we introduce (which is satisfied by random subspaces with high probability), our sample complexity improves to an almost information-theoretically optimal $mr^{1+o(1)}$, and our runtime improves to $mr^{2+o(1)}$. Our runtimes have the appealing property of matching the best known runtime to verify that a rank-$r$ decomposition $\mathbf{U}\mathbf{V}^{\top}$ agrees with the sampled observations. We also provide robust variants of our algorithms that, given random observations from $\mathbf{M}+\mathbf{N}$ with $\|\mathbf{N}\|_{\mathrm{F}} \leq \Delta$, complete $\mathbf{M}$ to Frobenius norm distance $\approx r^{1.5}\Delta$ in the same runtimes as the noiseless setting. Prior noisy matrix completion algorithms [CP10] only guaranteed a distance of $\approx \sqrt{n}\Delta$.

ICLR Conference 2022 Conference Paper

Optimization and Adaptive Generalization of Three layer Neural Networks

  • Khashayar Gatmiry
  • Stefanie Jegelka
  • Jonathan A. Kelner

While there has been substantial recent work studying generalization of neural networks, the ability of deep nets in automating the process of feature extraction still evades a thorough mathematical understanding. As a step toward this goal, we analyze learning and generalization of a three-layer neural network with ReLU activations in a regime that goes beyond the linear approximation of the network, and is hence not captured by the common Neural Tangent Kernel. We show that despite nonconvexity of the empirical loss, a variant of SGD converges in polynomially many iterations to a good solution that generalizes. In particular, our generalization bounds are adaptive: they automatically optimize over a family of kernels that includes the Neural Tangent Kernel, to provide the tightest bound.
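As a toy illustration of the object under study, the sketch below trains a three-layer ReLU network with plain minibatch SGD on a synthetic regression task. The widths, step size, and target function are arbitrary illustrative choices; this reproduces nothing of the paper's analysis beyond the architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))                 # inputs
y = np.maximum(X[:, 0], 0) - 0.5 * X[:, 1]        # simple synthetic target
W1 = rng.standard_normal((5, 32)) / np.sqrt(5)    # first hidden layer
W2 = rng.standard_normal((32, 32)) / np.sqrt(32)  # second hidden layer
w3 = rng.standard_normal(32) / np.sqrt(32)        # linear output layer

def forward(X):
    h1 = np.maximum(X @ W1, 0)        # ReLU
    h2 = np.maximum(h1 @ W2, 0)       # ReLU
    return h1, h2, h2 @ w3

_, _, pred0 = forward(X)
loss0 = np.mean((pred0 - y) ** 2)     # squared loss at initialization
lr = 0.01
for step in range(500):
    i = rng.integers(0, 200, size=20)             # minibatch
    h1, h2, pred = forward(X[i])
    g = 2 * (pred - y[i]) / len(i)                # dLoss/dpred
    g2 = np.outer(g, w3) * (h2 > 0)               # backprop through top ReLU
    g1 = (g2 @ W2.T) * (h1 > 0)                   # backprop through bottom ReLU
    w3 -= lr * h2.T @ g
    W2 -= lr * h1.T @ g2
    W1 -= lr * X[i].T @ g1
_, _, pred1 = forward(X)
loss1 = np.mean((pred1 - y) ** 2)
print(loss0, loss1)   # training loss decreases despite nonconvexity
```

The point of the paper is precisely that convergence of such SGD runs, and the generalization of the resulting network, can be analyzed beyond the linear (NTK) approximation.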

FOCS Conference 2021 Conference Paper

On the Power of Preconditioning in Sparse Linear Regression

  • Jonathan A. Kelner
  • Frederic Koehler
  • Raghu Meka
  • Dhruv Rohatgi

Sparse linear regression is a fundamental problem in high-dimensional statistics, but strikingly little is known about how to efficiently solve it without restrictive conditions on the design matrix. We consider the (correlated) random design setting, where the covariates are independently drawn from a multivariate Gaussian $N(0, \Sigma)$, for some $n \times n$ positive semi-definite matrix $\Sigma$, and seek estimators $\hat{w}$ minimizing $(\hat{w}-w^{\ast})^{T}\Sigma(\hat{w}-w^{\ast})$, where $w^{\ast}$ is the $k$-sparse ground truth. Information-theoretically, one can achieve strong error bounds with only $O(k\log n)$ samples for arbitrary $\Sigma$ and $w^{\ast}$; however, no efficient algorithms are known to match these guarantees even with $o(n)$ samples, without further assumptions on $\Sigma$ or $w^{\ast}$. Yet there is little evidence for this gap in the random design setting: computational lower bounds are only known for worst-case design matrices. To date, random-design instances (i.e., specific covariance matrices $\Sigma$) have only been proven hard against the Lasso program and variants. More precisely, these “hard” instances can often be solved by Lasso after a simple change-of-basis (i.e., preconditioning). In this work, we give both upper and lower bounds clarifying the power of preconditioning as a tool for solving sparse linear regression problems. On the one hand, we show that the preconditioned Lasso can solve a large class of sparse linear regression problems nearly optimally: it succeeds whenever the dependency structure of the covariates, in the sense of the Markov property, has low treewidth, even if $\Sigma$ is highly ill-conditioned. This upper bound builds on ideas from the wavelet and signal processing literature. As a special case of this result, we give an algorithm for sparse linear regression with covariates from an autoregressive time series model, where we also show that the (usual) Lasso provably fails.
On the other hand, we construct (for the first time) random-design instances which are provably hard even for an optimally preconditioned Lasso. In fact, we complete our treewidth classification by proving that for any treewidth-$t$ graph, there exists a Gaussian Markov Random Field on this graph such that the preconditioned Lasso, with any choice of preconditioner, requires $\Omega(t^{1/20})$ samples to recover $O(\log n)$-sparse signals when covariates are drawn from this model.

FOCS Conference 2020 Conference Paper

High-precision Estimation of Random Walks in Small Space

  • AmirMahdi Ahmadinejad
  • Jonathan A. Kelner
  • Jack Murtagh
  • John Peebles
  • Aaron Sidford
  • Salil P. Vadhan

In this paper, we provide a deterministic $\tilde{O}(\log N)$-space algorithm for estimating random walk probabilities on undirected graphs, and more generally Eulerian directed graphs, to within inverse polynomial additive error $(\epsilon=1/\text{poly}(N))$, where $N$ is the length of the input. Previously, this problem was known to be solvable by a randomized algorithm using space $O(\log N)$ (following Aleliunas et al., FOCS '79) and by a deterministic algorithm using space $O(\log^{3/2}N)$ (Saks and Zhou, FOCS '95 and JCSS '99), both of which held for arbitrary directed graphs but had not been improved even for undirected graphs. We also give improvements on the space complexity of both of these previous algorithms for non-Eulerian directed graphs when the error is negligible $(\epsilon=1/N^{\omega(1)})$, generalizing what Hoza and Zuckerman (FOCS '18) recently showed for the special case of distinguishing whether a random walk probability is 0 or greater than $\epsilon$. We achieve these results by giving new reductions between powering Eulerian random-walk matrices and inverting Eulerian Laplacian matrices, providing a new notion of spectral approximation for Eulerian graphs that is preserved under powering, and giving the first deterministic $\tilde{O}(\log N)$-space algorithm for inverting Eulerian Laplacian matrices. The latter algorithm builds on the work of Murtagh et al. (FOCS '17) that gave a deterministic $\tilde{O}(\log N)$-space algorithm for inverting undirected Laplacian matrices, and the work of Cohen et al. (FOCS '19) that gave a randomized $\tilde{O}(N)$-time algorithm for inverting Eulerian Laplacian matrices. A running theme throughout these contributions is an analysis of “cycle-lifted graphs,” where we take a graph and “lift” it to a new graph whose adjacency matrix is the tensor product of the original adjacency matrix and a directed cycle (or variants of one).

FOCS Conference 2018 Conference Paper

Solving Directed Laplacian Systems in Nearly-Linear Time through Sparse LU Factorizations

  • Michael B. Cohen
  • Jonathan A. Kelner
  • Rasmus Kyng
  • John Peebles
  • Richard Peng
  • Anup B. Rao
  • Aaron Sidford

In this paper, we show how to solve directed Laplacian systems in nearly-linear time. Given a linear system in an $n \times n$ Eulerian directed Laplacian with $m$ nonzero entries, we show how to compute an $\epsilon$-approximate solution in time $O(m \log^{O(1)}(n) \log(1/\epsilon))$. Through reductions from [Cohen et al. FOCS '16], this gives the first nearly-linear time algorithms for computing $\epsilon$-approximate solutions to row or column diagonally dominant linear systems (including arbitrary directed Laplacians) and computing $\epsilon$-approximations to various properties of random walks on directed graphs, including stationary distributions, personalized PageRank vectors, hitting times, and escape probabilities. These bounds improve upon the recent almost-linear algorithms of [Cohen et al. STOC '17], which gave an algorithm to solve Eulerian Laplacian systems in time $O((m + n \cdot 2^{O(\sqrt{\log n \log \log n})}) \log^{O(1)}(n\epsilon^{-1}))$. To achieve our results, we provide a structural result that we believe is of independent interest. We show that Eulerian Laplacians (and therefore the Laplacians of all strongly connected directed graphs) have sparse approximate LU-factorizations. That is, for every such directed Laplacian there are lower and upper triangular matrices, each with at most $\tilde{O}(n)$ nonzero entries, such that their product spectrally approximates the directed Laplacian in an appropriate norm. This claim can be viewed as an analog of recent work on sparse Cholesky factorizations of Laplacians of undirected graphs. We show how to construct such factorizations in nearly-linear time and prove that once constructed they yield nearly-linear time algorithms for solving directed Laplacian systems.

FOCS Conference 2016 Conference Paper

A Nearly Tight Sum-of-Squares Lower Bound for the Planted Clique Problem

  • Boaz Barak
  • Samuel B. Hopkins 0001
  • Jonathan A. Kelner
  • Pravesh K. Kothari
  • Ankur Moitra
  • Aaron Potechin

We prove that with high probability over the choice of a random graph $G$ from the Erdős–Rényi distribution $G(n, 1/2)$, the $n^{O(d)}$-time degree-$d$ Sum-of-Squares semidefinite programming relaxation for the clique problem will give a value of at least $n^{1/2 - c(d/\log n)^{1/2}}$ for some constant $c > 0$. This yields a nearly tight $n^{1/2 - o(1)}$ bound on the value of this program for any degree $d = o(\log n)$. Moreover, we introduce a new framework that we call pseudo-calibration to construct Sum-of-Squares lower bounds. This framework is inspired by taking a computational analogue of Bayesian probability theory. It yields a general recipe for constructing good pseudo-distributions (i.e., dual certificates for the Sum-of-Squares semidefinite program), and sheds further light on the ways in which this hierarchy differs from others.

FOCS Conference 2016 Conference Paper

Faster Algorithms for Computing the Stationary Distribution, Simulating Random Walks, and More

  • Michael B. Cohen
  • Jonathan A. Kelner
  • John Peebles
  • Richard Peng
  • Aaron Sidford
  • Adrian Vladu

In this paper, we provide faster algorithms for computing various fundamental quantities associated with random walks on a directed graph, including the stationary distribution, personalized PageRank vectors, hitting times, and escape probabilities. In particular, on a directed graph with $n$ vertices and $m$ edges, we show how to compute each quantity in time $\tilde{O}(m^{3/4}n + mn^{2/3})$, where the $\tilde{O}$ notation suppresses polylogarithmic factors in $n$, the desired accuracy, and the appropriate condition number (i.e., the mixing time or restart probability). Our result improves upon the previous fastest running times for these problems; previous results either invoke a general-purpose linear system solver on an $n \times n$ matrix with $m$ nonzero entries, or depend polynomially on the desired error or natural condition number associated with the problem (i.e., the mixing time or restart probability). For sparse graphs, we obtain a running time of $\tilde{O}(n^{7/4})$, breaking the $O(n^{2})$ barrier of the best running time one could hope to achieve using fast matrix multiplication. We achieve our result by providing a similar running time improvement for solving directed Laplacian systems, a natural directed or asymmetric analog of the well-studied symmetric or undirected Laplacian systems. We show how to solve such systems in time $\tilde{O}(m^{3/4}n + mn^{2/3})$, and efficiently reduce a broad range of problems to solving $\tilde{O}(1)$ directed Laplacian systems on Eulerian graphs. We hope these results and our analysis open the door for further study into directed spectral graph theory.
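As a point of reference for the quantities discussed above, here is the naive baseline such solvers improve on: estimating the stationary distribution of a small strongly connected directed graph by power iteration, whose iteration count scales with the mixing time rather than being condition-number-free. The 4-vertex graph is an arbitrary illustrative choice.

```python
import numpy as np

# Adjacency matrix of a small strongly connected, aperiodic directed graph.
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 0]], dtype=float)
P = A / A.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
pi = np.full(4, 0.25)                  # start from the uniform distribution
for _ in range(500):                   # power iteration: pi <- pi P
    pi = pi @ P
print(pi)          # estimated stationary distribution (sums to 1)
print(pi @ P - pi) # approximately zero: pi is a fixed point of P
```

The number of multiplications needed for a given accuracy grows with the chain's mixing time, which is exactly the dependence the paper's Laplacian-system approach avoids.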

STOC Conference 2012 Conference Paper

Hypercontractivity, sum-of-squares proofs, and their applications

  • Boaz Barak
  • Fernando G. S. L. Brandão
  • Aram W. Harrow
  • Jonathan A. Kelner
  • David Steurer
  • Yuan Zhou 0007

We study the computational complexity of approximating the $2\to q$ norm of linear operators (defined as $\|A\|_{2\to q} = \max_{v \neq 0} \|Av\|_q / \|v\|_2$) for $q > 2$, as well as connections between this question and issues arising in quantum information theory and the study of Khot's Unique Games Conjecture (UGC). We show the following: For any constant even integer $q \geq 4$, a graph $G$ is a small-set expander if and only if the projector onto the span of the top eigenvectors of $G$'s adjacency matrix has bounded $2\to q$ norm. As a corollary, a good approximation to the $2\to q$ norm will refute the Small-Set Expansion Conjecture, a close variant of the UGC. We also show that such a good approximation can be obtained in $\exp(n^{2/q})$ time, thus obtaining a different proof of the known subexponential algorithm for Small-Set Expansion. Constant rounds of the "Sum of Squares" semidefinite programming hierarchy certify an upper bound on the $2\to 4$ norm of the projector onto low-degree polynomials over the Boolean cube, as well as certify the unsatisfiability of the "noisy cube" and "short code" based instances of Unique Games considered by prior works. This improves on the previous upper bound of $\exp(\log^{O(1)} n)$ rounds (for the "short code"), as well as separates the "Sum of Squares"/"Lasserre" hierarchy from weaker hierarchies that were known to require $\omega(1)$ rounds. We show reductions between computing the $2\to 4$ norm and computing the injective tensor norm of a tensor, a problem with connections to quantum information theory. Three corollaries are: (i) the $2\to 4$ norm is NP-hard to approximate to precision inverse-polynomial in the dimension, (ii) the $2\to 4$ norm does not have a good approximation (in the sense above) unless 3-SAT can be solved in time $\exp(\sqrt{n}\,\mathrm{poly}\log(n))$, and (iii) known algorithms for the quantum separability problem imply a non-trivial additive approximation for the $2\to 4$ norm.

FOCS Conference 2009 Conference Paper

Faster Generation of Random Spanning Trees

  • Jonathan A. Kelner
  • Aleksander Madry

In this paper, we set forth a new algorithm for generating approximately uniformly random spanning trees in undirected graphs. We show how to sample from a distribution that is within a multiplicative $(1+\delta)$ of uniform in expected time $\tilde{O}(m\sqrt{n}\log(1/\delta))$. This improves the sparse graph case of the best previously known worst-case bound of $O(\min\{mn, n^{2.376}\})$, which has stood for twenty years. To achieve this goal, we exploit the connection between random walks on graphs and electrical networks, and we use this to introduce a new approach to the problem that integrates discrete random walk-based techniques with continuous linear algebraic methods. We believe that our use of electrical networks and sparse linear system solvers in conjunction with random walks and combinatorial partitioning techniques is a useful paradigm that will find further applications in algorithmic graph theory.
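For contrast with the paper's electrical-network approach, the classical random-walk sampler (Aldous–Broder), which produces an exactly uniform spanning tree but runs in expected cover time rather than the paper's improved bound, can be sketched as:

```python
import random

def aldous_broder(adj, seed=0):
    """Sample a uniformly random spanning tree of a connected undirected graph
    via the Aldous–Broder first-entrance random walk.

    adj: dict mapping each vertex to a list of its neighbors.
    """
    rng = random.Random(seed)
    vertices = list(adj)
    current = vertices[0]
    visited = {current}
    tree = set()                      # edges stored as frozensets
    while len(visited) < len(vertices):
        nxt = rng.choice(adj[current])
        if nxt not in visited:        # first entrance to nxt: keep this edge
            visited.add(nxt)
            tree.add(frozenset((current, nxt)))
        current = nxt
    return tree

# A 4-cycle (0-1-2-3-0) with chord 0-2; any spanning tree has |V| - 1 = 3 edges.
graph = {0: [1, 3, 2], 1: [0, 2], 2: [1, 3, 0], 3: [2, 0]}
tree = aldous_broder(graph)
print(len(tree))   # 3
```

The walk must cover the whole graph before terminating, which is what drives the cover-time cost the paper's algorithm improves upon for sparse graphs.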

ICML Conference 2009 Conference Paper

Fitting a graph to vector data

  • Samuel I. Daitch
  • Jonathan A. Kelner
  • Daniel A. Spielman

We introduce a measure of how well a combinatorial graph fits a collection of vectors. The optimal graphs under this measure may be computed by solving convex quadratic programs and have many interesting properties. For vectors in $d$-dimensional space, the graphs always have average degree at most $2(d+1)$, and for vectors in 2 dimensions they are always planar. We compute these graphs for many standard data sets and show that they can be used to obtain good solutions to classification, regression and clustering problems.

FOCS Conference 2009 Conference Paper

Higher Eigenvalues of Graphs

  • Jonathan A. Kelner
  • James R. Lee
  • Gregory N. Price
  • Shang-Hua Teng

We present a general method for proving upper bounds on the eigenvalues of the graph Laplacian. In particular, we show that for any positive integer $k$, the $k$th smallest eigenvalue of the Laplacian on a bounded-degree planar graph is $O(k/n)$. This bound is asymptotically tight for every $k$, as it is easily seen to be achieved for planar grids. We also extend this spectral result to graphs with bounded genus, graphs which forbid fixed minors, and other natural families. Previously, such spectral upper bounds were only known for $k = 2$, i.e., for the Fiedler value of these graphs. In addition, our result yields a new, combinatorial proof of the celebrated result of Korevaar in differential geometry.

FOCS Conference 2009 Conference Paper

Local Graph Partitions for Approximation and Testing

  • Avinatan Hassidim
  • Jonathan A. Kelner
  • Huy N. Nguyen
  • Krzysztof Onak

We introduce a new tool for approximation and testing algorithms called partitioning oracles. We develop methods for constructing them for any class of bounded-degree graphs with an excluded minor, and in general, for any hyperfinite class of bounded-degree graphs. These oracles utilize only local computation to consistently answer queries about a global partition that breaks the graph into small connected components by removing only a small fraction of the edges. We illustrate the power of this technique by using it to extend and simplify a number of previous approximation and testing results for sparse graphs, as well as to provide new results that were unachievable with existing techniques. For instance: 1. We give constant-time approximation algorithms for the size of the minimum vertex cover, the minimum dominating set, and the maximum independent set for any class of graphs with an excluded minor. 2. We show a simple proof that any minor-closed graph property is testable in constant time in the bounded-degree model. 3. We prove that it is possible to approximate the distance to almost any hereditary property in any bounded-degree hereditary family of graphs. Hereditary properties of interest include bipartiteness, k-colorability, and perfectness.

FOCS Conference 2007 Conference Paper

On the Hardness and Smoothed Complexity of Quasi-Concave Minimization

  • Jonathan A. Kelner
  • Evdokia Nikolova

In this paper, we resolve the smoothed and approximative complexity of low-rank quasi-concave minimization, providing both upper and lower bounds. As an upper bound, we provide the first smoothed analysis of quasi-concave minimization. The analysis is based on a smoothed bound for the number of extreme points of the projection of the feasible polytope onto a $k$-dimensional subspace, where $k$ is the rank (informally, the dimension of nonconvexity) of the quasi-concave function. Our smoothed bound is polynomial in the original dimension of the problem $n$ and the perturbation size $p$, and it is exponential in the rank of the function $k$. From this, we obtain the first randomized fully polynomial-time approximation scheme for low-rank quasi-concave minimization under broad conditions. In contrast with this, we prove $\log n$-hardness of approximation for general quasi-concave minimization. This shows that our smoothed bound is essentially tight, in that no polynomial smoothed bound is possible for quasi-concave functions of general rank $k$. The tools that we introduce for the smoothed analysis may be of independent interest. All previous smoothed analyses of polytopes analyzed projections onto two-dimensional subspaces and studied them using trigonometry to examine the angles between vectors and 2-planes in $\mathbb{R}^n$. In this paper, we provide what is, to our knowledge, the first smoothed analysis of the projection of polytopes onto higher-dimensional subspaces. To do this, we replace the trigonometry with tools from random matrix theory and differential geometry on the Grassmannian. Our hardness reduction is based on entirely different proofs that may also be of independent interest; we show that the stochastic two-stage minimum spanning tree problem has a supermodular objective and that supermodular minimization is hard to approximate.

STOC Conference 2004 Conference Paper

Spectral partitioning, eigenvalue bounds, and circle packings for graphs of bounded genus

  • Jonathan A. Kelner

In this paper, we address two longstanding questions about finding good separators in graphs of bounded genus and degree: 1. It is a classical result of Gilbert, Hutchinson, and Tarjan [12] that one can find asymptotically optimal separators on these graphs if given both the graph and an embedding of it onto a low-genus surface. Does there exist a simple, efficient algorithm to find these separators given only the graph and not the embedding? 2. In practice, spectral partitioning heuristics work extremely well on these graphs. Is there a theoretical reason why this should be the case? We resolve these two questions by showing that a simple spectral algorithm finds separators of cut ratio $O(\sqrt{g/n})$ and vertex bisectors of size $O(\sqrt{gn})$ in these graphs, both of which are optimal. As our main technical lemma, we prove an $O(g/n)$ bound on the second smallest eigenvalue of the Laplacian of such graphs and show that this is tight, thereby resolving a conjecture of Spielman and Teng. While this lemma is essentially combinatorial in nature, its proof comes from continuous mathematics, drawing on the theory of circle packings and the geometry of compact Riemann surfaces.
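The spectral heuristic analyzed above can be sketched concretely: compute the eigenvector of the second smallest Laplacian eigenvalue (the Fiedler vector), sort vertices by their coordinate, and take a threshold ("sweep") cut. The toy graph below, two triangles joined by a single bridge edge, is an illustrative choice, not an example from the paper.

```python
import numpy as np

# Two triangles joined by a bridge edge; the natural separator is the bridge.
edges = [(0, 1), (1, 2), (2, 0),      # triangle A
         (3, 4), (4, 5), (5, 3),      # triangle B
         (2, 3)]                      # bridge
n = 6
L = np.zeros((n, n))                  # build the graph Laplacian L = D - A
for u, v in edges:
    L[u, u] += 1; L[v, v] += 1
    L[u, v] -= 1; L[v, u] -= 1
vals, vecs = np.linalg.eigh(L)        # eigh returns eigenvalues in ascending order
fiedler = vecs[:, 1]                  # eigenvector of the 2nd smallest eigenvalue
order = np.argsort(fiedler)
side = set(order[:n // 2].tolist())   # sweep cut at the median coordinate
cut = sum(1 for u, v in edges if (u in side) != (v in side))
print(sorted(side), cut)              # one triangle per side; only the bridge is cut
```

On this example the Fiedler vector is positive on one triangle and negative on the other, so the sweep cut recovers the bridge, the behavior whose reliability on bounded-genus graphs the paper explains.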