
Author name cluster

Amir Abboud

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

38 papers
2 author rows

Possible papers

38

FOCS Conference 2025 Conference Paper

Deterministic Almost-Linear-Time Gomory-Hu Trees

  • Amir Abboud
  • Rasmus Kyng
  • Jason Li 0006
  • Debmalya Panigrahi
  • Maximilian Probst Gutenberg
  • Thatchaphol Saranurak
  • Weixuan Yuan
  • Wuwei Yuan

Given an undirected, weighted graph $G=(V, E, w)$, a Gomory-Hu tree or cut tree (Gomory and Hu, 1961) is a tree $T$ over the vertex set $V$ such that for every pair of vertices $s, t \in V$, the $(s, t)$ min-cut in $T$ is also an $(s, t)$ min-cut in $G$ and has the same value. In this article, we give the first deterministic almost-linear-time algorithm for constructing a Gomory-Hu tree. Our algorithm runs in $m^{1+o(1)}$ time, where $m$ denotes the number of edges in the input graph $G$; this is clearly optimal up to the $m^{o(1)}$ term in the running time. Prior to our work, the best deterministic algorithm for this problem dated back to the original algorithm of Gomory and Hu, which runs in $nm^{1+o(1)}$ time using current max-flow algorithms. In fact, our algorithm is also the first almost-linear-time deterministic algorithm even for simpler problems, such as finding the $k$-edge-connected components of a graph. Our new result hinges on two separate and novel components that each introduce a distinct set of de-randomization tools of independent interest: (i) a deterministic reduction from the all-pairs min-cuts problem to the single-source min-cuts problem incurring only sub-polynomial overhead, and (ii) a deterministic almost-linear-time algorithm for the single-source min-cuts problem.
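As a concrete illustration of the definition above, the following minimal sketch (not from the paper; it assumes networkx is available, that edges of G carry a 'capacity' attribute, and that the tree's edges carry a 'weight' attribute) checks the defining property of a Gomory-Hu tree on a small graph: the lightest edge on the s-t path in T must equal the (s, t) min-cut value in G.

```python
import itertools
import networkx as nx

def is_gomory_hu_tree(G, T, capacity="capacity"):
    """Brute-force check of the Gomory-Hu property for small graphs:
    for every pair (s, t), the minimum edge weight on the unique s-t
    path in T equals the (s, t) min-cut value in G."""
    for s, t in itertools.combinations(G.nodes(), 2):
        path = nx.shortest_path(T, s, t)
        tree_cut = min(T[u][v]["weight"] for u, v in zip(path, path[1:]))
        graph_cut, _ = nx.minimum_cut(G, s, t, capacity=capacity)
        if tree_cut != graph_cut:
            return False
    return True
```

(networkx also ships a gomory_hu_tree routine implementing the classical construction, which can be used to produce a T to verify.)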

FOCS Conference 2023 Conference Paper

All-Pairs Max-Flow is no Harder than Single-Pair Max-Flow: Gomory-Hu Trees in Almost-Linear Time

  • Amir Abboud
  • Jason Li 0006
  • Debmalya Panigrahi
  • Thatchaphol Saranurak

A Gomory-Hu tree (also called a cut tree) succinctly represents $(s, t)$ min-cuts (and therefore, $(s, t)$ max-flow values) of all pairs of vertices $s, t$ in an undirected graph. In this paper, we give an $m^{1+o(1)}$-time algorithm for constructing a Gomory-Hu tree for a graph with $m$ edges. This shows that the all-pairs max-flows problem has the same running time as the single-pair max-flow problem, up to a subpolynomial factor. Prior to our work, the best known Gomory-Hu tree algorithm was obtained in recent work by Abboud et al. (FOCS 2022) and requires $\tilde{O}\left(n^{2}\right)$ time for a graph with $n$ vertices. Our result marks a natural culmination of over 60 years of research into the all-pairs max-flows problem that started with Gomory and Hu's path-breaking result introducing the Gomory-Hu tree in 1961.

FOCS Conference 2022 Conference Paper

Breaking the Cubic Barrier for All-Pairs Max-Flow: Gomory-Hu Tree in Nearly Quadratic Time

  • Amir Abboud
  • Robert Krauthgamer
  • Jason Li 0006
  • Debmalya Panigrahi
  • Thatchaphol Saranurak
  • Ohad Trabelsi

In 1961, Gomory and Hu showed that the All-Pairs Max-Flow problem of computing the max-flow between all $\binom{n}{2}$ pairs of vertices in an undirected graph can be solved using only $n-1$ calls to any (single-pair) max-flow algorithm. Even assuming a linear-time max-flow algorithm, this yields a running time of $O(mn)$, which is $O(n^{3})$ when $m=\Theta(n^{2})$. While subsequent work has improved this bound for various special graph classes, no subcubic-time algorithm has been obtained in the last 60 years for general graphs. We break this longstanding barrier by giving an $\tilde{O}(n^{2})$-time algorithm on general, integer-weighted graphs. Combined with a popular complexity assumption, we establish a counter-intuitive separation: all-pairs max-flows are strictly easier to compute than all-pairs shortest-paths. Our algorithm produces a cut-equivalent tree, known as the Gomory-Hu tree, from which the max-flow value for any pair can be retrieved in near-constant time. For unweighted graphs, we refine our techniques further to produce a Gomory-Hu tree in the time of a poly-logarithmic number of calls to any max-flow algorithm. This shows an equivalence between the all-pairs and single-pair max-flow problems, and is optimal up to polylogarithmic factors. Using the recently announced $m^{1+o(1)}$-time max-flow algorithm (Chen et al., March 2022), our Gomory-Hu tree algorithm for unweighted graphs also runs in $m^{1+o(1)}$ time.
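To make the 1961 baseline concrete, here is a minimal sketch (an illustration only, not this paper's algorithm) of building a Gomory-Hu tree with n-1 single-pair max-flow calls, in the contraction-free style usually attributed to Gusfield; it uses networkx's minimum_cut as the max-flow oracle and assumes edges carry a 'capacity' attribute.

```python
import networkx as nx

def gomory_hu_via_maxflow(G, capacity="capacity"):
    """Classical n-1 max-flow construction of a cut tree (Gusfield-style sketch)."""
    nodes = list(G.nodes())
    root = nodes[0]
    parent = {v: root for v in nodes[1:]}
    cut_value = {}
    for i, s in enumerate(nodes[1:], start=1):
        t = parent[s]
        value, (S, _) = nx.minimum_cut(G, s, t, capacity=capacity)
        cut_value[s] = value
        # Later nodes that fall on s's side of the cut and currently hang
        # off t are re-hung below s.
        for v in nodes[i + 1:]:
            if v in S and parent[v] == t:
                parent[v] = s
    T = nx.Graph()
    T.add_nodes_from(nodes)
    T.add_weighted_edges_from((v, p, cut_value[v]) for v, p in parent.items())
    return T
```

With an $m^{1+o(1)}$-time max-flow routine, this loop takes $nm^{1+o(1)}$ time overall, which is exactly the baseline the results above improve on.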

SODA Conference 2022 Conference Paper

Friendly Cut Sparsifiers and Faster Gomory-Hu Trees

  • Amir Abboud
  • Robert Krauthgamer
  • Ohad Trabelsi

We devise new cut sparsifiers that are related to the classical sparsification of Nagamochi and Ibaraki [Algorithmica, 1992], which is an algorithm that, given an unweighted graph $G$ on $n$ nodes and a parameter $k$, computes a subgraph with $O(nk)$ edges that preserves all cuts of value up to $k$. We put forward the notion of a friendly cut sparsifier, which is a minor of $G$ that preserves all friendly cuts of value up to $k$, where a cut in $G$ is called friendly if every node has more edges connecting it to its own side of the cut than to the other side. We present an algorithm that, given a simple graph $G$, computes in almost-linear time a friendly cut sparsifier with edges. Using similar techniques, we also show how, given in addition a terminal set $T$, one can compute in almost-linear time a terminal sparsifier, which preserves the minimum $st$-cut between every pair of terminals, with edges. Plugging these sparsifiers into the recent $n^{2+o(1)}$-time algorithms for constructing a Gomory-Hu tree of simple graphs, along with a relatively simple procedure for handling the unfriendly minimum cuts, we improve the running time for moderately dense graphs (e.g., with $m = n^{1.75}$ edges). In particular, assuming a linear-time Max-Flow algorithm, the new state-of-the-art for Gomory-Hu tree is the minimum between our $(m + n^{1.75})^{1+o(1)}$ and the known $mn^{1/2+o(1)}$. We further investigate the limits of this approach and the possibility of better sparsification. Under the hypothesis that an $\tilde{O}(n)$-edge sparsifier that preserves all friendly minimum $st$-cuts can be computed efficiently, our upper bound improves to $\tilde{O}(m + n^{1.5})$, which is the best possible without breaking the cubic barrier for constructing Gomory-Hu trees in non-simple graphs.
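The friendliness condition itself is easy to state in code; a small sketch (illustrative only, with the cut represented, by assumption, as the set of nodes on one side) follows.

```python
import networkx as nx

def is_friendly_cut(G, side):
    """A cut (side, rest) of a simple graph G is friendly if every node has
    strictly more edges to its own side of the cut than to the other side."""
    side = set(side)
    for v in G.nodes():
        same = sum(1 for u in G.neighbors(v) if (u in side) == (v in side))
        other = G.degree(v) - same
        if same <= other:
            return False
    return True
```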

FOCS Conference 2021 Conference Paper

APMF < APSP? Gomory-Hu Tree for Unweighted Graphs in Almost-Quadratic Time

  • Amir Abboud
  • Robert Krauthgamer
  • Ohad Trabelsi

We design an $n^{2+o(1)}$-time algorithm that constructs a cut-equivalent (Gomory-Hu) tree of a simple graph on $n$ nodes. This bound is almost-optimal in terms of $n$, and it improves on the recent $\tilde{O}(n^{2.5})$ bound by the authors (STOC 2021), which was the first to break the cubic barrier. Consequently, the All-Pairs Maximum-Flow (APMF) problem has time complexity $n^{2+o(1)}$, and for the first time in history, this problem can be solved faster than All-Pairs Shortest Paths (APSP). We further observe that an almost-linear time algorithm (in terms of the number of edges $m$) is not possible without first obtaining a subcubic algorithm for multigraphs. Finally, we derandomize our algorithm, obtaining the first subcubic deterministic algorithm for Gomory-Hu Tree in simple graphs, showing that randomness is not necessary for beating the $n-1$ times max-flow bound from 1961. The upper bound is $\tilde{O}(n^{2\frac{2}{3}})$, and it would improve to $n^{2+o(1)}$ if there is a deterministic single-pair maximum-flow algorithm that is almost-linear. The key novelty is in using a "dynamic pivot" technique instead of the randomized pivot selection that was central in recent works.

STOC Conference 2021 Conference Paper

Subcubic algorithms for Gomory-Hu tree in unweighted graphs

  • Amir Abboud
  • Robert Krauthgamer
  • Ohad Trabelsi

Every undirected graph $G$ has a (weighted) cut-equivalent tree $T$, commonly named after Gomory and Hu who discovered it in 1961. Both $T$ and $G$ have the same node set, and for every node pair $s, t$, the minimum $(s, t)$-cut in $T$ is also an exact minimum $(s, t)$-cut in $G$. We give the first subcubic-time algorithm that constructs such a tree for a simple graph $G$ (unweighted with no parallel edges). Its time complexity is $\tilde{O}(n^{2.5})$, for $n = |V(G)|$; previously, only $\tilde{O}(n^3)$ was known, except for restricted cases like sparse graphs. Consequently, we obtain the first algorithm for All-Pairs Max-Flow in simple graphs that breaks the cubic-time barrier. Gomory and Hu compute this tree using $n-1$ queries to (single-pair) Max-Flow; the new algorithm can be viewed as a fine-grained reduction to $\tilde{O}(\sqrt{n})$ Max-Flow computations on $n$-node graphs.

FOCS Conference 2020 Conference Paper

Cut-Equivalent Trees are Optimal for Min-Cut Queries

  • Amir Abboud
  • Robert Krauthgamer
  • Ohad Trabelsi

Min-Cut queries are fundamental: Preprocess an undirected edge-weighted graph, to quickly report a minimum-weight cut that separates a query pair of nodes $s, t$. The best data structure known for this problem simply builds a cut-equivalent tree, discovered 60 years ago by Gomory and Hu, who also showed how to construct it using $n-1$ minimum $st$-cut computations. Using state-of-the-art algorithms for minimum $st$-cut (Lee and Sidford, FOCS 2014), one can construct the tree in time $\tilde{O}(mn^{3/2})$, which is also the preprocessing time of the data structure. (Throughout, we focus on polynomially-bounded edge weights, noting that faster algorithms are known for small/unit edge weights, and use $n$ and $m$ for the number of nodes and edges in the graph.) Our main result shows the following equivalence: Cut-equivalent trees can be constructed in near-linear time if and only if there is a data structure for Min-Cut queries with near-linear preprocessing time and polylogarithmic (amortized) query time, and even if the queries are restricted to a fixed source. That is, cut-equivalent trees are an essentially optimal solution for Min-Cut queries. This equivalence holds even for every minor-closed family of graphs, such as bounded-treewidth graphs, for which a two-decade-old data structure (Arikati, Chaudhuri, and Zaroliagis, J. Algorithms 1998) implies the first near-linear time construction of cut-equivalent trees. Moreover, unlike all previous techniques for constructing cut-equivalent trees, ours is robust to relying on approximation algorithms. In particular, using the almost-linear time algorithm for $(1+\epsilon)$-approximate minimum $st$-cut (Kelner, Lee, Orecchia, and Sidford, SODA 2014), we can construct a $(1+\epsilon)$-approximate flow-equivalent tree (which is a slightly weaker notion) in time $n^{2+o(1)}$. This leads to the first $(1+\epsilon)$-approximation for All-Pairs Max-Flow that runs in time $n^{2+o(1)}$, and matches the output size almost-optimally.

NeurIPS Conference 2020 Conference Paper

Impossibility Results for Grammar-Compressed Linear Algebra

  • Amir Abboud
  • Arturs Backurs
  • Karl Bringmann
  • Marvin Künnemann

To handle vast amounts of data, it is natural and popular to compress vectors and matrices. When we compress a vector from size N down to size n << N, it certainly makes it easier to store and transmit efficiently, but does it also make it easier to process? In this paper we consider lossless compression schemes, and ask if we can run our computations on the compressed data as efficiently as if the original data was that small. That is, if an operation has time complexity T(input-size), can we perform it on the compressed representation in time T(n) rather than T(N)? We consider the most basic linear algebra operations: inner product, matrix-vector multiplication, and matrix multiplication. In particular, given two compressed vectors, can we compute their inner product in time O(n)? Or perhaps we must decompress first and then multiply, spending Omega(N) time? The answer depends on the compression scheme. While for simple ones such as Run-Length-Encoding (RLE) the inner product can be done in O(n) time, we prove that this is impossible for compressions from a richer class: essentially n^2 or even larger runtimes are needed in the worst case (under complexity assumptions). This is the class of grammar-compressions containing most popular methods such as the Lempel-Ziv family. These schemes are more compressing than the simple RLE, but alas, we prove that performing computations on them is much harder.
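As a point of contrast with the hardness results, the easy case mentioned above, inner product of run-length-encoded vectors in time linear in the compressed size, can be sketched as follows (an illustration under the assumption that both encodings describe vectors of the same total length).

```python
def rle_inner_product(a, b):
    """Inner product of two vectors given as run-length encodings
    [(value, run_length), ...]; runs in O(len(a) + len(b)) time."""
    i = j = 0            # next unread run in each encoding
    ra = rb = 0          # unconsumed length of the current runs
    va = vb = 0
    total = 0
    while True:
        if ra == 0:
            if i == len(a):
                break
            va, ra = a[i]
            i += 1
        if rb == 0:
            if j == len(b):
                break
            vb, rb = b[j]
            j += 1
        step = min(ra, rb)          # advance through the overlap of the two runs
        total += va * vb * step
        ra -= step
        rb -= step
    return total

# Example: (1,1,1,0,0) . (2,2,0,0,0) = 4
print(rle_inner_product([(1, 3), (0, 2)], [(2, 2), (0, 3)]))
```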

SODA Conference 2020 Conference Paper

New Algorithms and Lower Bounds for All-Pairs Max-Flow in Undirected Graphs

  • Amir Abboud
  • Robert Krauthgamer
  • Ohad Trabelsi

We investigate the time-complexity of the All-Pairs Max-Flow problem: Given a graph with $n$ nodes and $m$ edges, compute for all pairs of nodes the maximum-flow value between them. If Max-Flow (the version with a given source-sink pair $s, t$) can be solved in time $T(m)$, then $O(n^2) \cdot T(m)$ is a trivial upper bound. But can we do better? For directed graphs, recent results in fine-grained complexity suggest that this time bound is essentially optimal. In contrast, for undirected graphs with edge capacities, a seminal algorithm of Gomory and Hu (1961) runs in much faster time $O(n) \cdot T(m)$. Under the plausible assumption that Max-Flow can be solved in near-linear time $m^{1+o(1)}$, this half-century old algorithm yields an $nm^{1+o(1)}$ bound. Several other algorithms have been designed through the years, including $\tilde{O}(mn)$ time for unit-capacity edges (unconditionally), but none of them break the $O(mn)$ barrier. Meanwhile, no super-linear lower bound was shown for undirected graphs. We design the first hardness reductions for All-Pairs Max-Flow in undirected graphs, giving an essentially optimal lower bound for the node-capacities setting. For edge capacities, our efforts to prove similar lower bounds have failed, but we have discovered a surprising new algorithm that breaks the $O(mn)$ barrier for graphs with unit-capacity edges! Assuming $T(m) = m^{1+o(1)}$, our algorithm runs in time $m^{3/2+o(1)}$ and outputs a cut-equivalent tree (similarly to the Gomory-Hu algorithm). Even with current Max-Flow algorithms we improve state-of-the-art as long as $m = O(n^{5/3-\epsilon})$. Finally, we explain the lack of lower bounds by proving a non-reducibility result. This result is based on a new quasi-linear time $\tilde{O}(m)$ non-deterministic algorithm for constructing a cut-equivalent tree and may be of independent interest.

STOC Conference 2019 Conference Paper

Dynamic set cover: improved algorithms and lower bounds

  • Amir Abboud
  • Raghavendra Addanki
  • Fabrizio Grandoni 0001
  • Debmalya Panigrahi
  • Barna Saha

We give new upper and lower bounds for the dynamic set cover problem. First, we give a $(1+\epsilon)f$-approximation for fully dynamic set cover in $O(f^2 \log n / \epsilon^5)$ (amortized) update time, for any $\epsilon > 0$, where $f$ is the maximum number of sets that an element belongs to. In the decremental setting, the update time can be improved to $O(f^2/\epsilon^5)$, while still obtaining a $(1+\epsilon)f$-approximation. These are the first algorithms that obtain an approximation factor linear in $f$ for dynamic set cover, thereby almost matching the best bounds known in the offline setting and improving upon the previous best approximation of $O(f^2)$ in the dynamic setting. To complement our upper bounds, we also show that a linear dependence of the update time on $f$ is necessary unless we can tolerate much worse approximation factors. Using the recent distributed PCP framework, we show that any dynamic set cover algorithm that has an amortized update time of $O(f^{1-\epsilon})$ must have an approximation factor that is $\Omega(n^\delta)$ for some constant $\delta > 0$ under the Strong Exponential Time Hypothesis.
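For readers unfamiliar with the role of $f$, the classical static baseline is instructive: whenever an element is uncovered, take every set containing it, which yields an $f$-approximation. A minimal sketch (static and unweighted; this is the textbook baseline, not the paper's dynamic algorithm) follows.

```python
def f_approx_set_cover(universe, sets):
    """Static f-approximation: no set contains two picked elements, so OPT needs
    at least one set per picked element, while we take at most f sets per pick."""
    covered, chosen = set(), []
    for e in universe:
        if e in covered:
            continue
        for i, s in enumerate(sets):
            if e in s:
                chosen.append(i)
                covered.update(s)
    return chosen

# Example: elements 1..4, f = 2 (each element is in at most 2 sets).
print(f_approx_set_cover([1, 2, 3, 4], [{1, 2}, {2, 3}, {3, 4}]))  # [0, 1, 2]
```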

NeurIPS Conference 2019 Conference Paper

Subquadratic High-Dimensional Hierarchical Clustering

  • Amir Abboud
  • Vincent Cohen-Addad
  • Hussein Houdrouge

We consider the widely-used average-linkage, single-linkage, and Ward's methods for computing hierarchical clusterings of high-dimensional Euclidean inputs. It is easy to show that there is no efficient implementation of these algorithms in high-dimensional Euclidean space, since it implicitly requires solving the closest pair problem, a notoriously difficult problem. However, how fast can these algorithms be implemented if we allow approximation? More precisely: these algorithms successively merge the clusters that are at the closest average distance (for average-linkage), at the minimum distance (for single-linkage), or whose merge induces the least sum-of-squares error (for Ward's). We ask whether one could obtain a significant running-time improvement if the algorithm is allowed to merge $\gamma$-approximate closest clusters (namely, clusters whose distance (average, minimum, or sum-of-squares error) is at most $\gamma$ times the distance of the closest clusters). We show that one can indeed take advantage of the relaxation and compute the approximate hierarchical clustering tree using $\widetilde{O}(n)$ $\gamma$-approximate nearest neighbor queries. This leads to an algorithm running in time $\widetilde{O}(nd) + n^{1+O(1/\gamma)}$ for $d$-dimensional Euclidean space. We then provide experiments showing that these algorithms perform as well as the non-approximate version for classic classification tasks while achieving a significant speed-up.
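For context, the exact baselines referenced above are the standard agglomerative procedures; a minimal sketch using SciPy (exact merging, i.e. the non-approximate version the experiments compare against; the data below is synthetic) is:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 128))  # 1000 points in 128-dimensional Euclidean space

# Exact hierarchical clusterings; each merge step needs the closest pair of
# clusters, which is the bottleneck the approximate-merge relaxation avoids.
Z_average = linkage(X, method="average")
Z_single = linkage(X, method="single")
Z_ward = linkage(X, method="ward")
```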

Highlights Conference 2018 Conference Abstract

Fine-Grained Complexity and Hardness in P

  • Amir Abboud

ABSTRACT. The celebrated theory of NP-Hardness classifies problems into “easy” (in P) and “NP-hard” (probably not in P). However, while the easy problems do have polynomial time algorithms, in many cases, a polynomial runtime like O(n^2) or O(n^3) can be too slow. Many fundamental "easy" problems have resisted decades of attempts at getting truly fast algorithms, and are typically solved by potentially inaccurate but fast algorithms in practice. Such problems include Sequence Alignment, Parsing, Distance Computation in Graphs, and so many other problems we encounter in basic Algorithms courses. Can we formally explain the lack of faster algorithms and justify the use of heuristics, even for “easy” problems? In this talk, I will overview a theoretical framework for proving hardness for problems in P. Inspired by NP-Hardness, this approach is based on reductions which have uncovered fascinating structure among the different problems in P. We can now point at a small set of core problems, and say: For dozens of problems in P there is no hope to get faster algorithms, until we know how to solve the core problems faster.

STOC Conference 2018 Conference Paper

More consequences of falsifying SETH and the orthogonal vectors conjecture

  • Amir Abboud
  • Karl Bringmann
  • Holger Dell
  • Jesper Nederlof

The Strong Exponential Time Hypothesis and the OV-conjecture are two popular hardness assumptions used to prove a plethora of lower bounds, especially in the realm of polynomial-time algorithms. The OV-conjecture in moderate dimension states that there is no $\epsilon > 0$ for which an $O(N^{2-\epsilon}) \mathrm{poly}(D)$ time algorithm can decide whether there is a pair of orthogonal vectors in a given set of size $N$ that contains $D$-dimensional binary vectors. We strengthen the evidence for these hardness assumptions. In particular, we show that if the OV-conjecture fails, then two problems for which we are far from obtaining even tiny improvements over exhaustive search would have surprisingly fast algorithms. If the OV-conjecture is false, then there is a fixed $\epsilon > 0$ such that: (1) For all $d$ and all large enough $k$, there is a randomized algorithm that takes $O(n^{(1-\epsilon)k})$ time to solve the Zero-Weight-$k$-Clique and Min-Weight-$k$-Clique problems on $d$-hypergraphs with $n$ vertices. As a consequence, the OV-conjecture is implied by the Weighted Clique conjecture. (2) For all $c$, the satisfiability of sparse TC$^1$ circuits on $n$ inputs (that is, circuits with $cn$ wires, depth $c \log n$, and negation, AND, OR, and threshold gates) can be computed in time $O((2-\epsilon)^n)$.
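For reference, the exhaustive search that the OV-conjecture asserts cannot be beaten by a polynomial factor in N is just a double loop; a minimal sketch (0/1 vectors as Python lists) is:

```python
def has_orthogonal_pair(vectors):
    """Brute-force O(N^2 * D) check for a pair of orthogonal 0/1 vectors."""
    n = len(vectors)
    for i in range(n):
        for j in range(i + 1, n):
            if all(a * b == 0 for a, b in zip(vectors[i], vectors[j])):
                return True
    return False

print(has_orthogonal_pair([[1, 0, 1], [0, 1, 1], [1, 1, 0]]))  # False
print(has_orthogonal_pair([[1, 0, 1], [0, 1, 0]]))             # True
```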

SODA Conference 2018 Conference Paper

Reachability Preservers: New Extremal Bounds and Approximation Algorithms

  • Amir Abboud
  • Greg Bodwin

In this paper we prove new results about the extremal structure of paths in directed graphs. Say we are given a directed graph $G = (V, E)$ on $n$ nodes, a set of sources $S \subseteq V$ of size $|S| = n^{1/3}$, and a subset $P \subseteq S \times V$ of pairs $(s, t)$ where $s \in S$, of size $O(n^{2/3})$, such that for all pairs $(s, t) \in P$, there is a path from $s$ to $t$. Our goal is to remove as many edges from $G$ as possible while maintaining the reachability of all pairs in $P$. How many edges will we have to keep? Can you always go down to $n^{1+o(1)}$ edges? Or maybe for some nasty graphs $G$ you cannot even go below the simple bound of $O(n^{4/3})$ edges? Embarrassingly, in a world where graph reachability is ubiquitous in countless scientific fields, the current bounds on the answer to this question are far from tight. In this paper, we make polynomial progress in both the upper and lower bounds for these Reachability Preservers over bounds that were implicit in the literature. We show that in the above scenario, $O(n)$ edges will always be sufficient, and in general one is even guaranteed a subgraph on edges that preserves the reachability of all pairs in $P$. We complement this with a lower bound graph construction, establishing that the above result fully characterizes the settings in which we are guaranteed a preserver of size $O(n)$. Moreover, we design an efficient algorithm that can always compute a preserver of existentially optimal size. The second contribution of this paper is a new connection between extremal graph sparsification results and classical Steiner Network Design problems. Surprisingly, prior to this work, the osmosis of techniques between these two fields had been superficial. This allows us to improve the state-of-the-art approximation algorithms for the most basic Steiner-type problem in directed graphs from the $O(n^{0.6+\epsilon})$ of Chlamatac, Dinitz, Kortsarz, and Laekhanukit (SODA'17) to $O(n^{0.577+\epsilon})$.

SODA Conference 2017 Conference Paper

A Hierarchy of Lower Bounds for Sublinear Additive Spanners

  • Amir Abboud
  • Greg Bodwin
  • Seth Pettie

Spanners, emulators, and approximate distance oracles can be viewed as lossy compression schemes that represent an unweighted graph metric in small space, say $\tilde{O}(n^{1+\delta})$ bits. There is an inherent tradeoff between the sparsity parameter $\delta$ and the stretch function $f$ of the compression scheme, but the qualitative nature of this tradeoff has remained a persistent open problem. It has been known for some time that when $\delta > 1/3$ there are schemes with constant additive stretch (distance $d$ is stretched to at most $f(d) = d + O(1)$), and recent results of Abboud and Bodwin show that when $\delta < 1/3$ there are no such schemes. Thus, to get practically efficient graph compression with $\delta \to 0$ we must pay super-constant additive stretch, but exactly how much do we have to pay? In this paper we show that the lower bound of Abboud and Bodwin is just the first step in a hierarchy of lower bounds that characterize the asymptotic behavior of the optimal stretch function $f$ for sparsity parameter $\delta \in (0, 1/3)$. Specifically, for any integer $k \geq 2$, any compression scheme with size has a sublinear additive stretch function $f$: This lower bound matches Thorup and Zwick's (2006) construction of sublinear additive emulators. It also shows that Elkin and Peleg's $(1+\epsilon, \beta)$-spanners have an essentially optimal tradeoff between $\delta$, $\epsilon$, and $\beta$, and that the sublinear additive spanners of Pettie (2009) and Chechik (2013) are not too far from optimal. To complement these lower bounds we present a new construction of $(1+\epsilon, O(k/\epsilon)^{k-1})$-spanners with size where $h_k < 3/4$. This size bound improves on the spanners of Elkin and Peleg (2004), Thorup and Zwick (2006), and Pettie (2009). According to our lower bounds neither the size nor stretch function can be substantially improved. Our lower bound technique exhibits several interesting degrees of freedom in the framework of Abboud and Bodwin. By carefully exploiting these freedoms, we are able to obtain lower bounds for several related combinatorial objects. We get lower bounds on the size of $(\beta, \epsilon)$-hopsets, matching Elkin and Neiman's construction (2016), and lower bounds on shortcutting sets for digraphs that preserve the transitive closure. Our lower bound simplifies Hesse's (2003) refutation of Thorup's conjecture (1992), which stated that adding a linear number of shortcuts suffices to reduce the diameter to polylogarithmic. Finally, we show matching upper and lower bounds for graph compression schemes that work for graph metrics with girth at least $2\gamma + 1$. One consequence is that Baswana et al.'s (2010) additive $O(7)$-spanners with size cannot be improved in the exponent.

FOCS Conference 2017 Conference Paper

Distributed PCP Theorems for Hardness of Approximation in P

  • Amir Abboud
  • Aviad Rubinstein
  • R. Ryan Williams

We present a new distributed model of probabilistically checkable proofs (PCP). A satisfying assignment $x \in \{0, 1\}^n$ to a CNF formula $\varphi$ is shared between two parties, where Alice knows $x_1, \ldots, x_{n/2}$, Bob knows $x_{n/2+1}, \ldots, x_n$, and both parties know $\varphi$. The goal is to have Alice and Bob jointly write a PCP that $x$ satisfies $\varphi$, while exchanging little or no information. Unfortunately, this model as-is does not allow for nontrivial query complexity. Instead, we focus on a non-deterministic variant, where the players are helped by Merlin, a third party who knows all of $x$. Using our framework, we obtain, for the first time, PCP-like reductions from the Strong Exponential Time Hypothesis (SETH) to approximation problems in P. In particular, under SETH we show that there are no truly subquadratic approximation algorithms for Maximum Inner Product over $\{0, 1\}$-vectors, LCS Closest Pair over permutations, Approximate Partial Match, Approximate Regular Expression Matching, and Diameter in Product Metric. All our inapproximability factors are nearly tight. In particular, for the first three problems we obtain nearly-polynomial factors of $2^{(\log n)^{1-o(1)}}$; only $(1+o(1))$-factor lower bounds (under SETH) were known before. As an additional feature of our reduction, we obtain new SETH lower bounds for the exact "monochromatic" Closest Pair problem in the Euclidean, Manhattan, and Hamming metrics.

FOCS Conference 2017 Conference Paper

Fine-Grained Complexity of Analyzing Compressed Data: Quantifying Improvements over Decompress-and-Solve

  • Amir Abboud
  • Arturs Backurs
  • Karl Bringmann
  • Marvin Künnemann

Can we analyze data without decompressing it? As our data keeps growing, understanding the time complexity of problems on compressed inputs, rather than in convenient uncompressed forms, becomes more and more relevant. Suppose we are given a compression of size $n$ of data that originally has size $N$, and we want to solve a problem with time complexity $T(\cdot)$. The naive strategy of "decompress-and-solve" gives time $T(N)$, whereas "the gold standard" is time $T(n)$: to analyze the compression as efficiently as if the original data was small. We restrict our attention to data in the form of a string (text, files, genomes, etc.) and study the most ubiquitous tasks. While the challenge might seem to depend heavily on the specific compression scheme, most methods of practical relevance (the Lempel-Ziv family, dictionary methods, and others) can be unified under the elegant notion of Grammar-Compressions. A vast literature, across many disciplines, established this as an influential notion for Algorithm design. We introduce a direly needed framework for proving (conditional) lower bounds in this field, allowing us to assess whether decompress-and-solve can be improved, and by how much. Our main results are: (1) The $O(nN\sqrt{\log(N/n)})$ bound for LCS and the $O(\min\{N \log N, nM\})$ bound for Pattern Matching with Wildcards are optimal up to $N^{o(1)}$ factors, under the Strong Exponential Time Hypothesis. (Here, $M$ denotes the uncompressed length of the compressed pattern.) (2) Decompress-and-solve is essentially optimal for Context-Free Grammar Parsing and RNA Folding, under the k-Clique conjecture. (3) We give an algorithm showing that decompress-and-solve is not optimal for Disjointness.

SODA Conference 2016 Conference Paper

Error Amplification for Pairwise Spanner Lower Bounds

  • Amir Abboud
  • Greg Bodwin

A pairwise spanner of a graph $G = (V, E)$ and a "pair set" $P \subseteq V \times V$ is a subgraph $H$ that preserves all pairwise distances in $P$, up to some additive error term $+\beta$. When $\beta = 0$ the object is called a pairwise distance preserver. A large and growing body of work has considered upper bounds for these objects, but lower bounds have been elusive. The only known lower bound results are (1) Coppersmith and Elkin (SODA'05) against preservers, and (2) considerably weaker bounds by Woodruff (FOCS'06) against spanners. Our main result is an amplification theorem: we prove that lower bounds against pairwise distance preservers imply lower bounds against pairwise spanners. In other words, to prove lower bounds against any constant-error spanners, it is enough to consider only subgraphs that are not allowed any error at all! We apply this theorem to obtain drastically improved lower bounds. Some of these include: Linear-size pairwise spanners with up to $+(2k-1)$ error cannot span $|P| = \omega(n^{(1+k)/(3+k)})$ pairs. This is a large improvement over Woodruff's $|P| = \omega(n^{2-2/k})$ ($|P|$ is now linear, rather than quadratic, as $k$ gets large). $|E(H)| = \Omega(n^{1+1/k})$ edges are required for a $+(2k-1)$ spanner of $|P| = \Omega(n^{1+1/k})$ pairs; this is another large improvement over Woodruff's $|P| = \Omega(n^2)$. The first tight bounds for pairwise spanners: for $+2$ error and $P = \Theta(n^{3/2})$ we show that $\Theta(n^{3/2})$ edges are necessary and sufficient (this also reflects a new upper bound: we construct $+2$ pairwise spanners on $O(n|P|^{1/3})$ edges, removing a log factor from a prior algorithm). We also show analogous improved lower bounds against subset spanners (where $P = S \times S$ for some node subset $S$), and the first lower bounds against $D$-threshold spanners (where $P$ is the set of node pairs at distance at least $D$).

FOCS Conference 2016 Conference Paper

Popular Conjectures as a Barrier for Dynamic Planar Graph Algorithms

  • Amir Abboud
  • Søren Dahlgaard

The dynamic shortest paths problem on planar graphs asks us to preprocess a planar graph $G$ such that we may support insertions and deletions of edges in $G$ as well as distance queries between any two nodes $u, v$, subject to the constraint that the graph remains planar at all times. This problem has been extensively studied in both the theory and experimental communities over the past decades. The best known algorithm performs queries and updates in $\tilde{O}(n^{2/3})$ time, based on ideas of a seminal paper by Fakcharoenphol and Rao [FOCS'01]. A $(1+\epsilon)$-approximation algorithm of Abraham et al. [STOC'12] performs updates and queries in $\tilde{O}(\sqrt{n})$ time. An algorithm with a more practical $O(\mathrm{polylog}(n))$ runtime would be a major breakthrough. However, such runtimes are only known for a $(1+\epsilon)$-approximation in a model where only restricted weight updates are allowed due to Abraham et al. [SODA'16], or for easier problems like connectivity. In this paper, we follow a recent and very active line of work on showing lower bounds for polynomial time problems based on popular conjectures, obtaining the first such results for natural problems in planar graphs. Such results were previously out of reach due to the highly non-planar nature of known reductions and the impossibility of "planarizing gadgets". We introduce a new framework which is inspired by techniques from the literatures on distance labelling schemes and on parameterized complexity. Using our framework, we show that no algorithm for dynamic shortest paths or maximum weight bipartite matching in planar graphs can support both updates and queries in amortized $O(n^{1/2-\epsilon})$ time, for any $\epsilon > 0$, unless the classical all-pairs shortest-paths problem can be solved in truly subcubic time, which is widely believed to be impossible. We extend these results to obtain strong lower bounds for other related problems as well as for possible trade-offs between query and update time. Interestingly, our lower bounds hold even in very restrictive models where only weight updates are allowed.

STOC Conference 2016 Conference Paper

The 4/3 additive spanner exponent is tight

  • Amir Abboud
  • Greg Bodwin

A spanner is a sparse subgraph that approximately preserves the pairwise distances of the original graph. It is well known that there is a smooth tradeoff between the sparsity of a spanner and the quality of its approximation, so long as distance error is measured multiplicatively. A central open question in the field is to prove or disprove whether such a tradeoff exists also in the regime of additive error. That is, is it true that for all $\epsilon > 0$, there is a constant $k_\epsilon$ such that every graph has a spanner on $O(n^{1+\epsilon})$ edges that preserves its pairwise distances up to $+k_\epsilon$? Previous lower bounds are consistent with a positive resolution to this question, while previous upper bounds exhibit the beginning of a tradeoff curve: all graphs have $+2$ spanners on $O(n^{3/2})$ edges, $+4$ spanners on $\tilde{O}(n^{7/5})$ edges, and $+6$ spanners on $O(n^{4/3})$ edges. However, progress has mysteriously halted at the $n^{4/3}$ bound, and despite significant effort from the community, the question has remained open for all $0 < \epsilon < 1/3$. Our main result is a surprising negative resolution of the open question, even in a highly generalized setting. We show a new information-theoretic incompressibility bound: there is no function that compresses graphs into $O(n^{4/3-\epsilon})$ bits so that distance information can be recovered within $+n^{o(1)}$ error. As a special case of our theorem, we get a tight lower bound on the sparsity of additive spanners: the $+6$ spanner on $O(n^{4/3})$ edges cannot be improved in the exponent, even if any subpolynomial amount of additive error is allowed. Our theorem implies new lower bounds for related objects as well; for example, the twenty-year-old $+4$ emulator on $O(n^{4/3})$ edges also cannot be improved in the exponent unless the error allowance is polynomial. Central to our construction is a new type of graph product, which we call the Obstacle Product. Intuitively, it takes two graphs $G, H$ and produces a new graph whose shortest-paths structure looks locally like $H$ but globally like $G$.

FOCS Conference 2015 Conference Paper

If the Current Clique Algorithms are Optimal, So is Valiant's Parser

  • Amir Abboud
  • Arturs Backurs
  • Virginia Vassilevska Williams

The CFG recognition problem is: given a context-free grammar $G$ and a string $w$ of length $n$, decide if $w$ can be obtained from $G$. This is the most basic parsing question and is a core computer science problem. Valiant's parser from 1975 solves the problem in $O(n^\omega)$ time, where $\omega < 2.373$ is the matrix multiplication exponent. Dozens of parsing algorithms have been proposed over the years, yet Valiant's upper bound remains unbeaten. The best combinatorial algorithms have mildly subcubic $O(n^3/\log^3 n)$ complexity. Lee (JACM'01) provided evidence that fast matrix multiplication is needed for CFG parsing, and that very efficient and practical algorithms might be hard or even impossible to obtain. Lee showed that any algorithm for a more general parsing problem with running time $O(|G| \cdot n^{3-\epsilon})$ can be converted into a surprising subcubic algorithm for Boolean Matrix Multiplication. Unfortunately, Lee's hardness result required that the grammar size be $|G| = \Omega(n^6)$. Nothing was known for the more relevant case of constant-size grammars. In this work, we prove that any improvement on Valiant's algorithm, even for constant-size grammars, either in terms of runtime or by avoiding the inefficiencies of fast matrix multiplication, would imply a breakthrough algorithm for the k-Clique problem: given a graph on $n$ nodes, decide if there are $k$ that form a clique. Besides classifying the complexity of a fundamental problem, our reduction has led us to similar lower bounds for more modern and well-studied cubic-time problems for which faster algorithms are highly desirable in practice: RNA Folding, a central problem in computational biology, and Dyck Language Edit Distance, answering an open question of Saha (FOCS'14).
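For concreteness, the cubic-time dynamic program that Valiant's parser improves on is the classic CYK recognizer; a minimal sketch for grammars in Chomsky normal form (the grammar encoding below is an assumption chosen for illustration) is:

```python
def cyk_recognizes(word, grammar, start="S"):
    """O(|G| * n^3) CYK recognition. grammar maps each nonterminal to a list of
    productions, each either a terminal character or a pair (B, C) of nonterminals."""
    n = len(word)
    if n == 0:
        return False
    # table[length][i] = set of nonterminals deriving word[i:i+length]
    table = [[set() for _ in range(n)] for _ in range(n + 1)]
    for i, ch in enumerate(word):
        for A, prods in grammar.items():
            if ch in prods:
                table[1][i].add(A)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                for A, prods in grammar.items():
                    for p in prods:
                        if (isinstance(p, tuple) and p[0] in table[split][i]
                                and p[1] in table[length - split][i + split]):
                            table[length][i].add(A)
    return start in table[n][0]

# Example: { a^n b^n : n >= 1 } in Chomsky normal form.
grammar = {"S": [("A", "B"), ("A", "X")], "X": [("S", "B")], "A": ["a"], "B": ["b"]}
print(cyk_recognizes("aaabbb", grammar))  # True
print(cyk_recognizes("aabbb", grammar))   # False
```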

STOC Conference 2015 Conference Paper

Matching Triangles and Basing Hardness on an Extremely Popular Conjecture

  • Amir Abboud
  • Virginia Vassilevska Williams
  • Huacheng Yu

Due to the lack of unconditional polynomial lower bounds, it is now in fashion to prove conditional lower bounds in order to advance our understanding of the class P. The vast majority of these lower bounds are based on one of three famous hypotheses: the 3-SUM conjecture, the APSP conjecture, and the Strong Exponential Time Hypothesis. Only circumstantial evidence is known in support of these hypotheses, and no formal relationship between them is known. In hopes of obtaining "less conditional" and therefore more reliable lower bounds, we consider the conjecture that at least one of the above three hypotheses is true. We design novel reductions from 3-SUM, APSP, and CNF-SAT, and derive interesting consequences of this very plausible conjecture, including: Tight $n^{3-o(1)}$ lower bounds for purely-combinatorial problems about the triangles in unweighted graphs. New $n^{1-o(1)}$ lower bounds for the amortized update and query times of dynamic algorithms for single-source reachability, strongly connected components, and Max-Flow. New $n^{1.5-o(1)}$ lower bound for computing a set of $n$ st-maximum-flow values in a directed graph with $n$ nodes and $\tilde{O}(n)$ edges. There is a hierarchy of natural graph problems on $n$ nodes with complexity $n^c$ for $c \in (2,3)$. Only slightly non-trivial consequences of this conjecture were known prior to our work. Along the way we also obtain new conditional lower bounds for the Single-Source Max-Flow problem.

SODA Conference 2015 Conference Paper

More Applications of the Polynomial Method to Algorithm Design

  • Amir Abboud
  • R. Ryan Williams
  • Huacheng Yu

In low-depth circuit complexity, the polynomial method is a way to prove lower bounds by translating weak circuits into low-degree polynomials, then analyzing properties of these polynomials. Recently, this method found an application to algorithm design: Williams (STOC 2014) used it to compute all-pairs shortest paths in time on dense $n$-node graphs. In this paper, we extend this methodology to solve a number of problems in combinatorial pattern matching and Boolean algebra, considerably faster than previously known methods. First, we give an algorithm for Boolean Orthogonal Detection, which is to detect among two sets $A, B \subseteq \{0, 1\}^d$ of size $n$ if there is an $x \in A$ and $y \in B$ such that $\langle x, y \rangle = 0$. For vectors of dimension $d = c(n) \log n$, we solve Boolean Orthogonal Detection in $n^{2-1/O(\log c(n))}$ time by a Monte Carlo randomized algorithm. We apply this as a subroutine in several other new algorithms: In Batch Partial Match, we are given $n$ query strings from $\{0, 1, \star\}^{c(n) \log n}$ ($\star$ is a "don't care"), $n$ strings from $\{0, 1\}^{c(n) \log n}$, and wish to determine for each query whether or not there is a string matching the query. We solve this problem in $n^{2-1/O(\log c(n))}$ time by a Monte Carlo randomized algorithm. Let $t \leq v$ be integers. Given a DNF $F$ on $c \log t$ variables with $t$ terms, and $v$ arbitrary assignments on the variables, $F$ can be evaluated on all $v$ assignments in $v \cdot t^{1-1/O(\log c)}$ time, with high probability. There is a randomized algorithm that solves the Longest Common Substring with don't cares problem on two strings of length $n$ in time. Given two strings $S, T$ of length $n$, there is a randomized algorithm that computes the length of the longest substring of $S$ that has Edit-Distance less than $k$ to a substring of $T$ in time. Symmetric Boolean Constraint Satisfaction Problems (CSPs) with $n$ variables and $m$ constraints are solvable in $\mathrm{poly}(m) \cdot 2^{n(1-1/O(\log mn))}$ time.

SODA Conference 2015 Conference Paper

Subcubic Equivalences Between Graph Centrality Problems, APSP and Diameter

  • Amir Abboud
  • Fabrizio Grandoni 0001
  • Virginia Vassilevska Williams

Measuring the importance of a node in a network is a major goal in the analysis of social networks, biological systems, transportation networks, etc. Different centrality measures have been proposed to capture the notion of node importance. For example, the center of a graph is a node that minimizes the maximum distance to any other node (the latter distance is the radius of the graph). The median of a graph is a node that minimizes the sum of the distances to all other nodes. Informally, the betweenness centrality of a node $w$ measures the fraction of shortest paths that have $w$ as an intermediate node. Finally, the reach centrality of a node $w$ is the smallest distance $r$ such that any $s$-$t$ shortest path passing through $w$ has either $s$ or $t$ in the ball of radius $r$ around $w$. The fastest known algorithms to compute the center and the median of a graph, and to compute the betweenness or reach centrality even of a single node, take roughly cubic time in the number $n$ of nodes in the input graph. It is open whether these problems admit truly subcubic algorithms, i.e., algorithms with running time $\tilde{O}(n^{3-\delta})$ for some constant $\delta > 0$. We relate the complexity of the mentioned centrality problems to two classical problems for which no truly subcubic algorithm is known, namely All Pairs Shortest Paths (APSP) and Diameter. We show that Radius, Median and Betweenness Centrality are equivalent under subcubic reductions to APSP, i.e., that a truly subcubic algorithm for any of these problems implies a truly subcubic algorithm for all of them. We then show that Reach Centrality is equivalent to Diameter under subcubic reductions. The same holds for the problem of approximating Betweenness Centrality within any constant factor. Thus the latter two centrality problems could potentially be solved in truly subcubic time, even if APSP requires essentially cubic time.
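As a baseline for the centrality definitions above, here is a minimal sketch (unweighted graphs, using networkx's all-pairs shortest paths; essentially the roughly-cubic approach whose optimality the paper investigates) computing the center/radius and the median.

```python
import networkx as nx

def center_and_median(G):
    """Cubic-style baseline via all-pairs shortest paths (unweighted G):
    the center minimizes the maximum distance (that value is the radius),
    the median minimizes the sum of distances to all other nodes."""
    dist = dict(nx.all_pairs_shortest_path_length(G))
    ecc = {v: max(d.values()) for v, d in dist.items()}
    total = {v: sum(d.values()) for v, d in dist.items()}
    center = min(ecc, key=ecc.get)
    median = min(total, key=total.get)
    return center, ecc[center], median, total[median]

G = nx.path_graph(5)         # 0 - 1 - 2 - 3 - 4
print(center_and_median(G))  # (2, 2, 2, 6)
```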

FOCS Conference 2015 Conference Paper

Tight Hardness Results for LCS and Other Sequence Similarity Measures

  • Amir Abboud
  • Arturs Backurs
  • Virginia Vassilevska Williams

Two important similarity measures between sequences are the longest common subsequence (LCS) and the dynamic time warping distance (DTWD). The computations of these measures for two given sequences are central tasks in a variety of applications. Simple dynamic programming algorithms solve these tasks in $O(n^2)$ time, and despite an extensive amount of research, no algorithms with significantly better worst-case upper bounds are known. In this paper, we show that for any constant $\epsilon > 0$, an $O(n^{2-\epsilon})$ time algorithm for computing the LCS or the DTWD of two sequences of length $n$ over a constant-size alphabet refutes the popular Strong Exponential Time Hypothesis (SETH).
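The quadratic dynamic program referred to above is the textbook LCS recurrence; a minimal sketch (the $O(n^2)$ algorithm that, under SETH, cannot be improved to $O(n^{2-\epsilon})$) is:

```python
def lcs_length(a, b):
    """Standard O(n^2) dynamic program for the longest common subsequence."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

print(lcs_length("abcbdab", "bdcaba"))  # 4 (e.g. "bcba" or "bdab")
```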

FOCS Conference 2014 Conference Paper

Popular Conjectures Imply Strong Lower Bounds for Dynamic Problems

  • Amir Abboud
  • Virginia Vassilevska Williams

We consider several well-studied problems in dynamic algorithms and prove that sufficient progress on any of them would imply a breakthrough on one of five major open problems in the theory of algorithms: 1) Is the 3SUM problem on $n$ numbers in $O(n^{2-\epsilon})$ time for some $\epsilon > 0$? 2) Can one determine the satisfiability of a CNF formula on $n$ variables and $\mathrm{poly}(n)$ clauses in $O((2-\epsilon)^n \mathrm{poly}(n))$ time for some $\epsilon > 0$? 3) Is the All Pairs Shortest Paths problem for graphs on $n$ vertices in $O(n^{3-\epsilon})$ time for some $\epsilon > 0$? 4) Is there a linear time algorithm that detects whether a given graph contains a triangle? 5) Is there an $O(n^{3-\epsilon})$ time combinatorial algorithm for $n \times n$ Boolean matrix multiplication? The problems we consider include dynamic versions of bipartite perfect matching, bipartite maximum weight matching, single source reachability, single source shortest paths, strong connectivity, subgraph connectivity, diameter approximation and some non-graph problems such as Pagh's problem defined in a recent paper by Patrascu [STOC 2010].