Arrow Research search

Author name cluster

Cliff Stein 0001

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

31 papers
1 author row

Possible papers


FOCS Conference 2022 Conference Paper

Estimating the Longest Increasing Subsequence in Nearly Optimal Time

  • Alexandr Andoni
  • Negev Shekel Nosatzki
  • Sandip Sinha
  • Cliff Stein 0001

Longest Increasing Subsequence (LIS) is a fundamental statistic of a sequence, and has been studied for decades. While the LIS of a sequence of length n can be computed exactly in time $O(n\log n)$, the complexity of estimating the (length of the) LIS in sublinear time, especially when LIS $\ll n$, is still open. We show that for any $n\in\mathbb{N}$ and $\lambda=o(1)$, there exists a (randomized) non-adaptive algorithm that, given a sequence of length n with LIS $\geq\lambda n$, approximates the LIS up to a factor of $1/\lambda^{o(1)}$ in $n^{o(1)}/\lambda$ time. Our algorithm improves upon prior work substantially in terms of both approximation and run-time: (i) we provide the first sub-polynomial approximation for LIS in sub-linear time; and (ii) our run-time complexity essentially matches the trivial sample-complexity lower bound of $\Omega(1/\lambda)$, which is required to obtain any non-trivial approximation of the LIS. As part of our solution, we develop two novel ideas which may be of independent interest. First, we define a new Genuine-LIS problem, in which each sequence element may be either genuine or corrupted. In this model, the user receives unrestricted access to the actual sequence, but does not know a priori which elements are genuine. The goal is to estimate the LIS using genuine elements only, with the minimal number of tests for genuineness. The second idea, Precision Tree, enables accurate estimation of compositions of general functions from “coarse” (sub-)estimates. Precision Tree essentially generalizes classical precision sampling, which works only for summations. As a central tool, the Precision Tree is pre-processed on a set of samples, which is thereafter repeatedly used by multiple components of the algorithm, improving their amortized complexity.
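The exact $O(n\log n)$ computation mentioned in the abstract is classically achieved with patience sorting plus binary search; a minimal sketch (the function name `lis_length` is illustrative, not from the paper):

```python
import bisect

def lis_length(seq):
    """Exact length of the longest strictly increasing subsequence, O(n log n).

    tails[k] holds the smallest possible tail value of an increasing
    subsequence of length k+1 seen so far; each new element either extends
    the longest subsequence found or tightens an existing tail.
    """
    tails = []
    for x in seq:
        i = bisect.bisect_left(tails, x)  # bisect_left => strict increase
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)
```

Scanning the full sequence is exactly what a sublinear-time estimator cannot afford, which is why the paper's algorithm samples and only approximates the LIS length.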

STOC Conference 2020 Conference Paper

Parallel approximate undirected shortest paths via low hop emulators

  • Alexandr Andoni
  • Cliff Stein 0001
  • Peilin Zhong

We present a (1+ε)-approximate parallel algorithm for computing shortest paths in undirected graphs, achieving poly(log n) depth and m·poly(log n) work for n-node, m-edge graphs. Although sequential algorithms with (nearly) optimal running time have been known for several decades, near-optimal parallel algorithms have turned out to be a much tougher challenge. For (1+ε)-approximation, all prior algorithms with poly(log n) depth perform at least Ω(mn^c) work for some constant c > 0. Improving this long-standing upper bound, obtained by Cohen (STOC’94), has been open for 25 years. We develop several new tools of independent interest. One of them is a new notion beyond hopsets, the low hop emulator: a poly(log n)-approximate emulator graph in which every shortest path has at most O(log log n) hops (edges). Direct applications of low hop emulators are parallel algorithms for poly(log n)-approximate single-source shortest paths (SSSP), Bourgain’s embedding, metric tree embedding, and low-diameter decomposition, all with poly(log n) depth and m·poly(log n) work. To boost the approximation ratio to (1+ε), we introduce compressible preconditioners and apply them inside Sherman’s framework (SODA’17) to solve the more general problem of uncapacitated minimum cost flow (a.k.a. the transshipment problem). Our algorithm computes a (1+ε)-approximate uncapacitated minimum cost flow in poly(log n) depth using m·poly(log n) work. As a consequence, it also improves the state-of-the-art sequential running time from m·2^{O(√(log n))} to m·poly(log n).

SODA Conference 2019 Conference Paper

Coresets Meet EDCS: Algorithms for Matching and Vertex Cover on Massive Graphs

  • Sepehr Assadi
  • MohammadHossein Bateni
  • Aaron Bernstein
  • Vahab Mirrokni
  • Cliff Stein 0001

There is a rapidly growing need for scalable algorithms that solve classical graph problems, such as maximum matching and minimum vertex cover, on massive graphs. For massive inputs, several different computational models have been introduced, including the streaming model, the distributed communication model, and the massively parallel computation (MPC) model, a common abstraction of MapReduce-style computation. In each model, algorithms are analyzed in terms of resources such as space used or rounds of communication needed, in addition to the more traditional approximation ratio. In this paper, we give a single unified approach that yields better approximation algorithms for matching and vertex cover in all these models. The highlights include: the first one-pass, significantly-better-than-2 approximation for matching in random-arrival streams that uses subquadratic space, namely a (1.5+ε)-approximation streaming algorithm that uses Õ(n^{1.5}) space for constant ε > 0; and the first 2-round, better-than-2 approximation for matching in the MPC model with subquadratic memory per machine, namely a (1.5+ε)-approximation algorithm for constant ε > 0. By building on our unified approach, we further develop parallel algorithms in the MPC model that give a (1+ε)-approximation to matching and an O(1)-approximation to vertex cover in only O(log log n) MPC rounds and O(n/polylog(n)) memory per machine. These results settle multiple open questions posed by Czumaj et al. [STOC 2018]. We obtain our results by a novel combination of two previously disjoint sets of techniques, namely randomized composable coresets and edge degree constrained subgraphs (EDCS). We significantly extend the power of these techniques and prove several new structural results. For example, we show that an EDCS is a sparse certificate for large matchings and small vertex covers that is quite robust to sampling and composition.
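For context, the "2-approximation" baseline that the abstract repeatedly beats is the classical greedy maximal matching, whose size is always at least half the maximum matching. A minimal sketch (not the paper's algorithm):

```python
def greedy_maximal_matching(edges):
    """One-pass greedy maximal matching: take an edge iff both endpoints
    are still free. Every edge of a maximum matching has at least one
    endpoint covered by the greedy matching, giving a 2-approximation."""
    matched = set()   # vertices already covered
    matching = []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching
```

A single greedy pass fits the streaming model in O(n) space; the difficulty the paper addresses is doing *better* than factor 2 with comparable resources.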

FOCS Conference 2019 Conference Paper

Fully Dynamic Maximal Independent Set with Polylogarithmic Update Time

  • Soheil Behnezhad
  • Mahsa Derakhshan
  • MohammadTaghi Hajiaghayi
  • Cliff Stein 0001
  • Madhu Sudan 0001

We present the first algorithm for maintaining a maximal independent set (MIS) of a fully dynamic graph (one that undergoes both edge insertions and deletions) in polylogarithmic time. Our algorithm is randomized and, per update, takes O(log^2 Δ · log^2 n) expected time. Furthermore, the algorithm can be adjusted to have O(log^2 Δ · log^4 n) worst-case update time with high probability. Here, n denotes the number of vertices and Δ is the maximum degree in the graph. The MIS problem in fully dynamic graphs has attracted significant attention after a breakthrough result of Assadi, Onak, Schieber, and Solomon [STOC'18], who presented an algorithm with O(m^{3/4}) update time (and thus broke the natural Ω(m) barrier), where m denotes the number of edges in the graph. This result was improved in a series of subsequent papers, though the update time remained polynomial. In particular, the fastest algorithm prior to our work had Õ(min{√n, m^{1/3}}) update time [Assadi et al. SODA'19]. Our algorithm maintains the lexicographically first MIS over a random order of the vertices. As a result, the same algorithm also maintains a 3-approximation of correlation clustering. We also show that a simpler variant of our algorithm can be used to maintain a random-order lexicographically first maximal matching in the same update time.
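The invariant the abstract describes, the lexicographically first MIS under a random vertex order, is easy to state statically: shuffle the vertices, then greedily add each vertex whose earlier neighbors all stayed out. A toy static sketch (illustrative names; the paper's contribution is maintaining this set under updates, which this sketch does not do):

```python
import random

def random_order_greedy_mis(adj, seed=0):
    """Lexicographically first MIS under a uniformly random vertex order.

    adj: dict mapping each vertex to the set of its neighbors (undirected).
    """
    rng = random.Random(seed)
    order = list(adj)
    rng.shuffle(order)           # the random ranks of the vertices
    mis = set()
    for v in order:              # scan by increasing rank
        if not (adj[v] & mis):   # join iff no earlier neighbor joined
            mis.add(v)
    return mis
```

The resulting set is independent (no vertex joins next to a member) and maximal (any vertex left out was blocked by a neighbor in the set) by construction.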

FOCS Conference 2018 Conference Paper

Parallel Graph Connectivity in Log Diameter Rounds

  • Alexandr Andoni
  • Zhao Song 0002
  • Cliff Stein 0001
  • Zhengyu Wang
  • Peilin Zhong

Many modern parallel systems, such as MapReduce, Hadoop and Spark, can be modeled well by the MPC model. The MPC model captures coarse-grained computation on large data well: data is distributed to processors, each of which has a sublinear (in the input data) amount of memory, and we alternate between rounds of computation and rounds of communication, where each machine can communicate an amount of data as large as the size of its memory. This model is stronger than the classical PRAM model, and it is an intriguing question to design algorithms whose running time is smaller than in the PRAM model. One fundamental graph problem is connectivity. On an undirected graph with n nodes and m edges, O(log n)-round connectivity algorithms have been known for over 35 years. However, no algorithms with better complexity bounds were known. In this work, we give fully scalable, faster algorithms for the connectivity problem, by parameterizing the time complexity as a function of the diameter of the graph. Our main result is an O(log D · log log_{m/n} n)-round connectivity algorithm for diameter-D graphs, using Θ(m) total memory. If our algorithm can use more memory, it can terminate in fewer rounds, and there is no lower bound on the memory per processor. We extend our results to related graph problems such as spanning forest, finding a DFS sequence, exact/approximate minimum spanning forest, and bottleneck spanning forest. We also show that achieving similar bounds for reachability in directed graphs would imply faster boolean matrix multiplication algorithms. We introduce several new algorithmic ideas. We describe a general technique called double exponential speed problem size reduction, which roughly means that if we can use total memory N to reduce a problem from size n to n/k, for k = (N/n)^{Θ(1)}, in one phase, then we can solve the problem in O(log log_{N/n} n) phases. In order to achieve this fast reduction for graph connectivity, we use a multistep algorithm.
One key step is a carefully constructed truncated broadcasting scheme, in which each node broadcasts neighbor sets to its neighbors in a way that limits the size of the resulting neighbor sets. Another key step is random leader contraction, where we choose a smaller set of leaders than many previous works do.
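The random leader contraction idea can be illustrated sequentially: in each round, every component representative becomes a leader with some probability, and non-leader components that touch a leader hook onto it. A toy simulation under assumed simplifications (leader probability 1/2, union-find bookkeeping); the paper's MPC version chooses far fewer leaders and carefully bounds communication, which this sketch does not model:

```python
import random

def find(parent, v):
    # path-halving find in a union-find forest
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v

def random_leader_contraction(n, edges, seed=0):
    """Repeatedly contract components toward randomly chosen leaders
    until no edge crosses two components; returns component labels."""
    rng = random.Random(seed)
    parent = list(range(n))
    while any(find(parent, u) != find(parent, v) for u, v in edges):
        leader = [rng.random() < 0.5 for _ in range(n)]
        for u, v in edges:
            ru, rv = find(parent, u), find(parent, v)
            if ru == rv:
                continue
            if leader[ru] and not leader[rv]:
                parent[rv] = ru       # non-leader hooks onto leader
            elif leader[rv] and not leader[ru]:
                parent[ru] = rv
    return [find(parent, v) for v in range(n)]
```

Each adjacent pair of component roots merges with constant probability per round, so the number of components shrinks geometrically in expectation.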

SODA Conference 2016 Conference Paper

Faster Fully Dynamic Matchings with Small Approximation Ratios

  • Aaron Bernstein
  • Cliff Stein 0001

Maximum cardinality matching is a fundamental algorithmic problem with many algorithms and applications. The fully dynamic version, in which edges are inserted and deleted over time, has also been the subject of much attention. Existing algorithms for dynamic matching (in general n-vertex, m-edge graphs) fall into two groups: there are fast (mostly randomized) algorithms that achieve a 2-approximation or worse, and there are slow algorithms that achieve a better-than-2 approximation. Thus the obvious question is whether we can design an algorithm that achieves a tradeoff between these two: a fast update time and a better-than-2 approximation simultaneously. We answer this question in the affirmative. Previously, such bounds were only known for the special case of bipartite graphs. Our main result is a fully dynamic deterministic algorithm that maintains a (3/2+ε)-approximation in amortized update time O(m^{1/4} ε^{-2.5}). In addition to achieving the tradeoff described above, our algorithm manages to be polynomially faster than all existing deterministic algorithms (excluding an existing log n-approximation of Onak and Rubinfeld), while still maintaining a better-than-2 approximation. We also give stronger results for graphs whose arboricity is at most α. We show how to maintain a (1+ε)-approximate fractional matching or a (3/2+ε)-approximate integral matching in worst-case time O(α(α + log n)) for constant ε. When the arboricity is constant, this bound is O(log n), and when the arboricity is polylogarithmic the update time is also polylogarithmic. Previous results for small-arboricity non-bipartite graphs could only maintain a maximal matching (a 2-approximation). We maintain the approximate matching without explicitly using augmenting paths. We define an intermediate graph, called an edge degree constrained subgraph (EDCS) H, show that H contains a large matching, and show how to maintain an EDCS of G.
The EDCS was used in previous works on bipartite graphs; however, the details and proofs are completely different in general graphs. The algorithm for bipartite graphs relies on ideas from flows and cuts to non-constructively prove the existence of a good matching in H, but these ideas do not seem to extend to non-bipartite graphs. In this paper we instead explicitly construct a large fractional matching in H. In some cases we can guarantee that this fractional matching is γ-restricted, which means that it only uses values either in the range [0, γ] or equal to 1. We then combine this matching with a new structural property of maximum matchings in non-bipartite graphs, which is analogous to the cut induced by maximum matchings in bipartite graphs.

STOC Conference 2014 Conference Paper

Cluster before you hallucinate: approximating node-capacitated network design and energy efficient routing

  • Ravishankar Krishnaswamy
  • Viswanath Nagarajan
  • Kirk Pruhs
  • Cliff Stein 0001

We consider circuit routing with an objective of minimizing energy, in a network of routers that are speed scalable and that may be shut down when idle. It is known that this energy minimization problem can be reduced to a capacitated flow network design problem, where vertices have a common capacity but arbitrary costs, and the goal is to choose a minimum cost collection of vertices whose induced subgraph will support the specified flow requirements. For the multicast (single-sink) capacitated design problem we give a polynomial-time algorithm that is O(log^3 n)-approximate with O(log^4 n) congestion. This translates back to an O(log^{4α+3} n)-approximation for the multicast energy-minimization routing problem, where α is the polynomial exponent in the dynamic power used by a router. For the unicast (multicommodity) capacitated design problem we give a polynomial-time algorithm that is O(log^5 n)-approximate with O(log^{12} n) congestion, which translates back to an O(log^{12α+5} n)-approximation for the unicast energy-minimization routing problem.

SODA Conference 2014 Conference Paper

Maintaining Assignments Online: Matching, Scheduling, and Flows

  • Anupam Gupta 0001
  • Amit Kumar 0001
  • Cliff Stein 0001

Consider the following edge-orientation problem: edges of a graph appear online one by one, and they are to be directed, i.e., given an “orientation”. We want to ensure that the in-degree of each vertex remains low. (This is a simple case of scheduling unit-sized jobs on machines, where each job can only go on one of two machines.) If the edge orientations we assign are irrevocable, we suffer a significant loss in quality due to online decision-making (as compared to the offline performance). For instance, the best online competitive ratio achievable is Θ(log m) for even this toy problem. But what if the decisions are not irrevocable, and we allow a limited number of reassignments? Can we do much better? We show that indeed we can. For instance, for edge orientation, we can achieve a constant-competitive load while doing only a constant number of re-orientations per edge (in an amortized sense). For more substantial problems, our results are as follows. For online matching, where the left vertices arrive online and must be matched to the right vertices, we give an algorithm that reassigns the left vertices an (amortized) constant number of times, and maintains a constant factor to the optimal load on the right vertices. We extend this to restricted machine scheduling with arbitrary-sized jobs and give an algorithm that maintains load which is O(log log mn) times the optimum, and reassigns each job only an (amortized) constant number of times. Consider a digraph with a single source, where sinks arrive online and want to send unit flow to the source. The goal is to minimize the congestion on the edges. Suppose there is an offline flow such that the total length of the flow paths is F*. We give an algorithm that reroutes flow along O(F*) edges and achieves an O(1)-approximation to the congestion.
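The toy problem above is easy to state in code: direct each arriving edge toward the endpoint with the smaller current in-degree. A minimal sketch of this natural greedy with no reassignment (illustrative names; the paper's algorithms additionally re-orient an amortized constant number of edges to get constant-competitive load):

```python
def orient_online(edges, n):
    """Greedy online edge orientation: each arriving edge (u, v) is
    directed toward whichever endpoint currently has smaller in-degree.
    Returns the list of directed edges (tail, head) and the max in-degree."""
    indeg = [0] * n
    orientation = []
    for u, v in edges:
        head = u if indeg[u] <= indeg[v] else v
        tail = v if head == u else u
        indeg[head] += 1
        orientation.append((tail, head))
    return orientation, max(indeg)
```

On adversarial arrival orders this greedy can still be forced to Θ(log m) load, matching the lower bound the abstract cites for irrevocable decisions.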

FOCS Conference 2007 Conference Paper

Non-Preemptive Min-Sum Scheduling with Resource Augmentation

  • Nikhil Bansal 0001
  • Ho-Leung Chan
  • Rohit Khandekar
  • Kirk Pruhs
  • Baruch Schieber
  • Cliff Stein 0001

We give the first O(1)-speed, O(1)-approximation polynomial-time algorithms for several non-preemptive min-sum scheduling problems where jobs arrive over time and must be processed on one machine. More precisely, we give the first O(1)-speed O(1)-approximations for the non-preemptive scheduling problems 1 | r_j | Σ w_j F_j (weighted flow time), 1 | r_j | Σ T_j (total tardiness), and the broadcast version of 1 | r_j | Σ w_j F_j; an O(1)-speed, 1-approximation for 1 | r_j | Σ Ū_j (throughput maximization); and an O(1)-machine, O(1)-speed, O(1)-approximation for 1 | r_j | Σ w_j T_j (weighted tardiness). Our main contribution is an integer programming formulation whose relaxation is sufficiently close to the integer optimum, and which can be transformed to a schedule on a faster machine.

FOCS Conference 1999 Conference Paper

Approximation Schemes for Minimizing Average Weighted Completion Time with Release Dates

  • Foto N. Afrati
  • Evripidis Bampis
  • Chandra Chekuri
  • David R. Karger
  • Claire Mathieu
  • Sanjeev Khanna
  • Ioannis Milis
  • Maurice Queyranne
  • Martin Skutella
  • Cliff Stein 0001
  • Maxim Sviridenko

We consider the problem of scheduling n jobs with release dates on m machines so as to minimize their average weighted completion time. We present the first known polynomial-time approximation schemes for several variants of this problem. Our results include PTASs for the case of identical parallel machines and a constant number of unrelated machines, with and without preemption allowed. Our schemes are efficient: for all variants, the running time for a (1+ε)-approximation is of the form f(1/ε, m)·poly(n).

FOCS Conference 1997 Conference Paper

Improved Approximation Algorithms for Unsplittable Flow Problems

  • Stavros G. Kolliopoulos
  • Cliff Stein 0001

In the single-source unsplittable flow problem we are given a graph G, a source vertex s, and a set of sinks t_1, ..., t_k with associated demands. We seek a single s–t_i flow path for each commodity i so that the demands are satisfied and the total flow routed across any edge e is bounded by its capacity c_e. The problem is an NP-hard variant of max flow and a generalization of single-source edge-disjoint paths, with applications to scheduling, load balancing and virtual-circuit routing problems. In a significant development, Kleinberg recently gave constant-factor approximation algorithms for several natural optimization versions of the problem. In this paper we give a generic framework that yields simpler algorithms and significant improvements upon the constant factors. Our framework, with appropriate subroutines, applies to all optimization versions previously considered and treats directed and undirected graphs in a unified manner.

FOCS Conference 1994 Conference Paper

Long Tours and Short Superstrings (Preliminary Version)

  • S. Rao Kosaraju
  • James K. Park
  • Cliff Stein 0001

This paper considers weight-maximizing variants of the classical symmetric and asymmetric traveling-salesman problems. Like their weight-minimizing counterparts, these variants are MAX SNP-hard. We present the first nontrivial approximation algorithms for these problems. Our algorithm for directed graphs finds a tour whose weight is at least 38/63 ≈ 0.603 times the weight of a maximum-weight tour, and our algorithm for undirected graphs finds a tour whose weight is at least 5/7 ≈ 0.714 times optimal. These bounds compare favorably with the 1/2 and 2/3 bounds that can be obtained for directed and undirected graphs, respectively, by simply deleting the minimum-weight edge from each cycle of a maximum-weight cycle cover. Our algorithm for directed graphs can be used to improve several recent approximation results for the shortest-superstring problem.
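The baseline step described in the abstract is mechanical once a cycle cover is in hand: drop each cycle's lightest edge, leaving paths, then stitch the paths into one tour. With nonnegative weights, a directed cycle (length ≥ 2) keeps at least half its weight and an undirected one (length ≥ 3) at least two-thirds. A sketch of that step under assumed conventions (cycles given as vertex lists, `w` a weight function on ordered pairs; computing the maximum-weight cycle cover itself is omitted):

```python
def cycles_to_tour(cycles, w):
    """Delete each cycle's minimum-weight edge and concatenate the
    resulting paths into a single tour over all vertices.

    cycles: list of vertex lists, each denoting a closed cycle.
    w: weight function on ordered vertex pairs (u, v)."""
    paths = []
    for cyc in cycles:
        m = len(cyc)
        edges = [(cyc[i], cyc[(i + 1) % m]) for i in range(m)]
        drop = min(range(m), key=lambda i: w(*edges[i]))
        # the surviving path starts just after the dropped edge
        paths.append([cyc[(drop + 1 + j) % m] for j in range(m)])
    # stitch the paths together arbitrarily into one tour
    return [v for p in paths for v in p]
```

The paper's 38/63 and 5/7 algorithms improve on this baseline by being less wasteful than deleting an arbitrary lightest edge per cycle.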