Arrow Research search

Author name cluster

Sanjeev Khanna

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

92 papers
2 author rows

Possible papers

92

FOCS Conference 2025 Conference Paper

A Polynomial Space Lower Bound for Diameter Estimation in Dynamic Streams

  • Sanjeev Khanna
  • Ashwin Padaki
  • Krish Singal
  • Erik Waingarten

We study the space complexity of estimating the diameter of a subset of points in an arbitrary metric space in the dynamic (turnstile) streaming model. The input is given as a stream of updates to a frequency vector $x \in \mathbb{Z}_{\geq 0}^{n}$, where the support of $x$ defines a multiset of points in a fixed metric space $\mathcal{M}=([n], \mathrm{d})$. The goal is to estimate the diameter of this multiset, defined as $\max\{\mathrm{d}(i, j) : x_{i}, x_{j} > 0\}$, to a specified approximation factor while using as little space as possible. In insertion-only streams, a simple $O(\log n)$-space algorithm achieves a 2-approximation. In sharp contrast, we show that in the dynamic streaming model, any algorithm achieving a constant-factor approximation to the diameter requires polynomial space. Specifically, we prove that a $c$-approximation to the diameter requires $n^{\Omega(1/c)}$ space. Our lower bound relies on two conceptual contributions: (1) a new connection between dynamic streaming algorithms and linear sketches for scale-invariant functions, a class that includes diameter estimation, and (2) a connection between linear sketches for diameter and the minrank of graphs, a notion previously studied in index coding. We complement our lower bound with a nearly matching upper bound, which gives a $c$-approximation to the diameter in general metrics using $n^{O(1/c)}$ space.
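The insertion-only baseline mentioned in the abstract is simple enough to sketch directly. This is a minimal illustration (the point stream and the metric `d` are hypothetical inputs), not the paper's dynamic-stream construction:

```python
def stream_diameter_2approx(points, d):
    """One-pass 2-approximation of diameter in an insertion-only stream.

    Fix the first point as an anchor and track r = max d(anchor, x).
    By the triangle inequality, d(i, j) <= d(i, anchor) + d(anchor, j)
    <= 2r for all seen points i, j, while r itself is a realized
    distance, so the true diameter D satisfies r <= D <= 2r.
    Space: one point plus one number, i.e. O(log n) in the paper's setting.
    """
    it = iter(points)
    try:
        anchor = next(it)
    except StopIteration:
        return 0.0  # empty stream: diameter 0
    r = 0.0
    for x in it:
        r = max(r, d(anchor, x))
    return 2 * r  # estimate E with D <= E <= 2D
```

On the real line with d(a, b) = |a - b|, the stream 3, 7, 1, 10 gives r = 7 and estimate 14, while the true diameter is 9.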

STOC Conference 2025 Conference Paper

Correlation Clustering and (De)Sparsification: Graph Sketches Can Match Classical Algorithms

  • Sepehr Assadi
  • Sanjeev Khanna
  • Aaron Putterman

Correlation clustering is a widely used approach for clustering large data sets based only on pairwise similarity information. In recent years, there has been a steady stream of better and better classical algorithms for approximating this problem. Meanwhile, another line of research has focused on porting the classical advances to various sublinear algorithm models, including semi-streaming, Massively Parallel Computation (MPC), and distributed computing. Yet, these latter works typically rely on ad hoc approaches that do not necessarily keep up with advances in approximation ratios achieved by classical algorithms. Hence, the motivating question for our work is this: can the gains made by classical algorithms for correlation clustering be ported over to sublinear algorithms in a black-box manner? We answer this question in the affirmative by introducing the paradigm of graph de-sparsification. A versatile approach for designing sublinear algorithms across various models is graph (linear) sketching. It is known that one can find a cut sparsifier of a given graph (which approximately preserves cut structures) via graph sketching, and that this is information-theoretically sufficient for recovering a near-optimal correlation clustering solution. However, no efficient algorithms are known for this task, as the resulting cut sparsifier is necessarily a weighted graph, and correlation clustering is known to be a distinctly harder problem on weighted graphs. Our main result is a randomized linear sketch of $O(n)$ size for $n$-vertex graphs, from which one can recover with high probability an $(\alpha + o(1))$-approximate correlation clustering in polynomial time, where $\alpha$ is the best approximation ratio of any polynomial-time classical algorithm for (unweighted) correlation clustering.
This is proved via our new de-sparsification result: we recover in polynomial time, from an $O(n)$-size linear sketch of a graph $G$, an unweighted, simple graph that approximately preserves the cut structure of $G$. In fact, we show that under some mild conditions, any spectral sparsifier of a graph $G$ can be de-sparsified into an unweighted simple graph with nearly the same spectrum. We believe the de-sparsification paradigm is interesting in its own right as a way of reducing graph complexity when the weighted version of a problem is harder than its unweighted version. Finally, we use our techniques to obtain efficient algorithms for correlation clustering that match the performance of the best classical algorithms in a variety of different models, including dynamic streaming, MPC, and distributed communication models.
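For context, a classical baseline that sublinear algorithms try to match is the Pivot algorithm of Ailon, Charikar, and Newman, an expected 3-approximation for unweighted correlation clustering on a complete similarity graph. A minimal sketch (the `similar`-pairs encoding is a hypothetical input format; this is not the sketch-based algorithm of the paper):

```python
import random

def pivot_clustering(n, similar, rng):
    """Pivot (a.k.a. KwikCluster): repeatedly pick a random unclustered
    vertex and cluster it with all its unclustered '+' neighbors.
    Expected 3-approximation for unweighted correlation clustering.

    `similar` is a set of frozenset({u, v}) pairs labeled '+';
    all other pairs are implicitly labeled '-'.
    """
    unclustered = list(range(n))
    clusters = []
    while unclustered:
        pivot = unclustered[rng.randrange(len(unclustered))]
        cluster = [pivot] + [v for v in unclustered
                             if v != pivot and frozenset((pivot, v)) in similar]
        clusters.append(cluster)
        unclustered = [v for v in unclustered if v not in cluster]
    return clusters

# Two '+' pairs and all other pairs '-': Pivot recovers {0,1} and {2,3}.
clusters = pivot_clustering(4, {frozenset((0, 1)), frozenset((2, 3))},
                            random.Random(0))
```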

STOC Conference 2025 Conference Paper

Efficient Algorithms and New Characterizations for CSP Sparsification

  • Sanjeev Khanna
  • Aaron Putterman
  • Madhu Sudan 0001

CSP sparsification, introduced by Kogan and Krauthgamer (ITCS 2015), considers the following question: how much can an instance of a constraint satisfaction problem be sparsified (by retaining a reweighted subset of the constraints) while still roughly capturing the weight of constraints satisfied by every assignment? CSP sparsification captures as special cases several well-studied problems, including graph cut-sparsification, hypergraph cut-sparsification, and hypergraph XOR-sparsification, and corresponds to a general class of hypergraph sparsification problems where an arbitrary 0/1-valued splitting function is used to define the notion of cutting a hyperedge (see, for instance, Veldt-Benson-Kleinberg, SIAM Review 2022). The main question here is to understand, for a given constraint predicate $P: \Sigma^{r} \to \{0,1\}$ (where variables are assigned values in $\Sigma$), the smallest constant $c$ such that $O(n^{c})$-sized sparsifiers exist for every instance of a constraint satisfaction problem over $P$. A recent work of Khanna, Putterman, and Sudan (SODA 2024) [KPS24] showed the existence of near-linear-size sparsifiers for new classes of CSPs. In this work, (1) we significantly extend the class of CSPs for which nearly-linear-size sparsifications can be shown to exist, while also extending the scope to settings with non-linear-sized sparsifications; (2) we give a polynomial-time algorithm to extract such sparsifications for all the problems we study, including the first efficient sparsification algorithms for the problems studied in [KPS24]. Our results captured in item (1) lead to two new classifications: First, we get a complete classification of all symmetric Boolean predicates $P$ (i.e., on the Boolean domain $\Sigma = \{0,1\}$) that allow nearly-linear-size sparsifications. This classification reveals an inherent, and previously unsuspected, number-theoretic phenomenon that determines near-linear-size sparsifiability.
Second, we also completely classify the set of Boolean predicates $P$ that allow non-trivial ($o(n^{r})$-size) sparsifications, thus answering an open question from the work of Kogan and Krauthgamer. The constructive aspect of our result is an arguably unexpected strengthening of [KPS24]. Their work roughly seemed to suggest that sparsifications can be found by solving problems related to finding the minimum distance of linear codes. These problems remain unsolved (in fact, are NP-hard) to this date, and our work finds a different path to achieve polynomial-time sparsification, resolving an open problem from their work. As a consequence, we also get the first efficient algorithms to spectrally sparsify Cayley graphs over $\mathbb{F}_{2}^{n}$ in time polynomial in the number of generators. Our techniques build on [KPS24], which proves the existence of nearly-linear-size sparsifiers for CSPs where the unsatisfying assignments of the underlying predicate $P$ are given by a linear equation over a finite field. Our main contributions are to extend this framework to higher-degree equations over general Abelian groups (both elements are crucial for our classification results), as well as to design polynomial-time sparsification algorithms for all problems in our framework.
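To make the preserved quantity concrete: a CSP sparsifier must approximate, for every assignment, the total weight of satisfied constraints. A minimal evaluator (the constraint encoding here is hypothetical), with graph cut sparsification recovered as the special case P = XOR:

```python
def csp_value(constraints, assignment, P):
    """Total weight of constraints satisfied by `assignment`.

    Each constraint is (variable_tuple, weight); P maps a tuple of
    assigned values to True/False. A (1 +- eps) CSP sparsifier is a
    reweighted subset of constraints approximating this value for
    EVERY assignment.
    """
    return sum(w for (vars_, w) in constraints
               if P(tuple(assignment[v] for v in vars_)))

# Special case: P = XOR on Boolean pairs turns csp_value into the
# weight of a graph cut -- here the cut ({1}, {0, 2}) of a triangle.
triangle = [((0, 1), 1), ((1, 2), 1), ((0, 2), 1)]
xor = lambda t: t[0] != t[1]
print(csp_value(triangle, {0: 0, 1: 1, 2: 0}, xor))  # -> 2
```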

FOCS Conference 2025 Conference Paper

On the Parallel Complexity of Finding a Matroid Basis

  • Sanjeev Khanna
  • Aaron Putterman
  • Junkai Song

A fundamental question in parallel computation, posed by Karp, Upfal, and Wigderson (FOCS 1985, JCSS 1988), asks: given only independence-oracle access to a matroid on n elements, how many adaptive rounds are required to find a basis using only polynomially many queries? This question generalizes, among others, the complexity of finding bases of linear spaces, partition matroids, and spanning forests in graphs. In their work, they established an upper bound of $O(\sqrt{n})$ rounds and a lower bound of $\widetilde{\Omega}\left(n^{1 / 3}\right)$ rounds for this problem, and these bounds have remained unimproved since then. In this work, we make the first progress in narrowing this gap by designing a parallel algorithm that finds a basis of an arbitrary matroid in $\tilde{O}\left(n^{7 / 15}\right)$ rounds (using polynomially many independence queries per round) with high probability, surpassing the long-standing $O(\sqrt{n})$ barrier. Our approach introduces a novel matroid decomposition technique and other structural insights that not only yield this general result but also lead to a much improved new algorithm for the class of partition matroids (which underlies the $\widetilde{\Omega}\left(n^{1 / 3}\right)$ lower bound of Karp, Upfal, and Wigderson). Specifically, we develop an $\tilde{O}\left(n^{1 / 3}\right)$-round algorithm, thereby settling the round complexity of finding a basis in partition matroids. As a further application, we also improve the parallel complexity of the classic matroid intersection problem. By plugging our basis-finding algorithm into a known algorithmic framework for matroid intersection, we obtain an $\tilde{O}\left(n^{37 / 45}\right)$-round algorithm for matroid intersection, improving upon the prior $O\left(n^{5 / 6}\right)$ bound.
Collectively, these results represent the first progress on the parallel complexity of finding matroid bases in 40 years, and we believe that techniques developed here may prove useful for other problems on matroids.
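For contrast, the textbook sequential method finds a basis with one independence query per element but uses a fully adaptive round per query, which is the regime the parallel algorithms above aim to beat. A minimal sketch, illustrated on a hypothetical partition matroid:

```python
def greedy_basis(elements, is_independent):
    """Grow a basis greedily via an independence oracle.

    Each query depends on all previous answers, so this takes one
    adaptive round per element -- n rounds total, far above the
    sublinear round counts discussed above.
    """
    basis = []
    for e in elements:
        if is_independent(basis + [e]):
            basis.append(e)
    return basis

# Partition matroid on {0,...,5}: blocks {0,1,2} and {3,4,5};
# a set is independent iff it has at most one element per block.
block = {e: e // 3 for e in range(6)}
def indep(S):
    used = [block[e] for e in S]
    return len(used) == len(set(used))

print(greedy_basis(range(6), indep))  # -> [0, 3]
```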

FOCS Conference 2025 Conference Paper

Stochastic Knapsack without Relaxing the Capacity

  • Anindya De
  • Sanjeev Khanna
  • Nathan White

We present the first polynomial-time approximation scheme (PTAS) for the stochastic knapsack problem that does not relax the knapsack’s capacity. Given n items with known arbitrary independent size distributions and fixed profits, an accuracy parameter $\varepsilon \in(0, 1)$, and an overflow probability bound $\alpha$, our algorithm computes a set of items with profit at least $(1-\varepsilon)$ times optimal, while ensuring the probability of exceeding the capacity is at most $4 \sqrt{\alpha}+\varepsilon$. Prior to our work, no PTAS was known without either allowing a ($1+\varepsilon$) capacity expansion or restricting to special distribution classes (such as Poisson or Gaussian). A key tool in our algorithm is an anti-concentration result that allows us to handle “low-profit” items by adapting a known PTAS result for the case when we are allowed to expand knapsack capacity by a ($1+\varepsilon$) factor. We then show that we are able to convert this solution into another solution with a similar profit which strictly obeys the knapsack capacity, but requires that we relax the overflow probability to a $4 \sqrt{\alpha}+\varepsilon$ factor. In the special case where the item sizes are scaled Bernoulli random variables (which have support on 0 and exactly one other value), we extend our approach to obtain an improved overflow probability guarantee of $\alpha+\varepsilon$. We make this improvement by exploiting the fact that these random variables are defined by only two parameters (the probability of being non-zero and the non-zero value in the support), which allows us to avoid some of the complexity and overhead of our algorithm for arbitrary distributions.
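The overflow constraint can be made concrete with a simple Monte Carlo check of a candidate item set (the item distributions and capacity below are hypothetical; the PTAS itself is far more involved):

```python
import random

def overflow_probability(samplers, capacity, trials=20000, seed=0):
    """Estimate Pr[total size > capacity] for independently drawn item
    sizes. `samplers` holds one draw function per selected item."""
    rng = random.Random(seed)
    overflows = sum(
        1 for _ in range(trials)
        if sum(draw(rng) for draw in samplers) > capacity
    )
    return overflows / trials

# Scaled Bernoulli sizes (the paper's special case): size 5 w.p. 0.3
# and size 4 w.p. 0.5, capacity 8. Overflow needs both to be nonzero,
# so the true overflow probability is 0.3 * 0.5 = 0.15.
items = [lambda r: 5 if r.random() < 0.3 else 0,
         lambda r: 4 if r.random() < 0.5 else 0]
p_hat = overflow_probability(items, capacity=8)
```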

FOCS Conference 2024 Conference Paper

Near-Optimal Size Linear Sketches for Hypergraph Cut Sparsifiers

  • Sanjeev Khanna
  • Aaron Putterman
  • Madhu Sudan 0001

A $(1\pm\epsilon)$-sparsifier of a hypergraph $G(V, E)$ is a (weighted) subgraph that preserves the value of every cut to within a $(1\pm\epsilon)$-factor. It is known that every hypergraph with $n$ vertices admits a $(1\pm\epsilon)$-sparsifier with $\tilde{O}(n/\epsilon^{2})$ hyperedges. In this work, we explore the task of building such a sparsifier by using only linear measurements (a linear sketch) over the hyperedges of $G$, and provide nearly-matching upper and lower bounds for this task. Specifically, we show that there is a randomized linear sketch of size $\tilde{O}(nr\log(m)/\epsilon^{2})$ bits which with high probability contains sufficient information to recover a $(1\pm\epsilon)$ cut-sparsifier with $\tilde{O}(n/\epsilon^{2})$ hyperedges for any hypergraph with at most $m$ edges, each of which has arity bounded by $r$. This immediately gives a dynamic streaming algorithm for hypergraph cut sparsification with an identical space complexity, improving on the previous best known bound of $\tilde{O}(nr^{2}\log^{4}(m)/\epsilon^{2})$ bits of space (Guha, McGregor, and Tench, PODS 2015). We complement our algorithmic result above with a nearly-matching lower bound. We show that for every $\epsilon\in(0, 1)$, one needs $\Omega(nr\log(m/n)/\log(n))$ bits to construct a $(1\pm\epsilon)$-sparsifier via linear sketching, thus showing that our linear sketch achieves an optimal dependence on both $r$ and $\log(m)$. The starting point for our improved algorithm is importance sampling of hyperedges based on the new notion of $k$-cut strength introduced in the recent work of Quanrud (SODA 2024). The natural algorithm based on this concept leads to $\log m$ levels of sampling where errors can potentially accumulate, and this accounts for the $\mathrm{polylog}(m)$ losses in the sketch size of the natural algorithm. We develop a more intricate analysis of the error accumulation to show that most levels do not contribute to the error and the actual loss is only $\mathrm{polylog}(n)$.
Combined with careful preprocessing (and analysis), this enables us to get rid of all extraneous $\log m$ factors in the sketch size, but the quadratic dependence on $r$ remains. This dependence originates from the use of correlated $\ell_{0}$-samplers to recover a large number of low-strength edges in a hypergraph simultaneously by looking at neighborhoods of individual vertices. In graphs, this leads to the discovery of $\Omega(n)$ edges in a single shot, whereas in hypergraphs, this may potentially only reveal $O(n/r)$ new edges, thus requiring $\Omega(r)$ rounds of recovery. To remedy this, we introduce a new technique of random fingerprinting of hyperedges, which effectively eliminates the correlations created by large-arity hyperedges and leads to a scheme for recovering hyperedges of low strength with an optimal dependence on $r$. Putting all these ingredients together yields our linear sketching algorithm. Our lower bound is established by a reduction from the universal relation problem in the one-way communication setting.
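The cut value being preserved generalizes graph cuts: a hyperedge is cut whenever its vertices land on both sides of a partition. A minimal evaluator (the edge-list encoding is hypothetical):

```python
def hypergraph_cut_weight(hyperedges, side):
    """Weight of hyperedges cut by the partition (side, complement).

    A hyperedge is cut iff it has at least one vertex on each side;
    this is the quantity a (1 +- eps) sparsifier must preserve for
    every one of the 2^n cuts.
    """
    return sum(w for (verts, w) in hyperedges
               if any(v in side for v in verts)
               and any(v not in side for v in verts))

# Arity-3 edge {0,1,2} and arity-2 edge {2,3}: the cut ({0}, rest)
# severs only the first.
edges = [({0, 1, 2}, 1.0), ({2, 3}, 2.0)]
print(hypergraph_cut_weight(edges, {0}))  # -> 1.0
```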

SODA Conference 2022 Conference Paper

New Trade-Offs for Fully Dynamic Matching via Hierarchical EDCS

  • Soheil Behnezhad
  • Sanjeev Khanna

We study the maximum matching problem in fully dynamic graphs: a graph is undergoing both edge insertions and deletions, and the goal is to efficiently maintain a large matching after each edge update. This problem has received considerable attention in recent years. The known algorithms naturally exhibit a trade-off between the quality of the matching maintained (i.e., the approximation ratio) and the time needed per update. While several interesting results have been obtained, the optimal behavior of this trade-off remains largely unclear. Our main contribution is a new approach to designing fully dynamic approximate matching algorithms that in a unified manner not only (essentially) recovers all previously known trade-offs that were achieved via very different techniques, but reveals some new ones as well. Specifically, we introduce a generalization of the edge-degree constrained subgraph (EDCS) of Bernstein and Stein (2015) that we call the hierarchical EDCS (HEDCS). We also present a randomized algorithm for efficiently maintaining an HEDCS. In an $m$-edge graph with maximum degree $\Delta$, for any integer $k \geq 0$ that is essentially the number of levels of the hierarchy in the HEDCS, our algorithm takes $\tilde{O}(\min\{\Delta^{1/(k+1)}, m^{1/(2k+2)}\})$ worst-case update time and maintains an (almost) $\alpha(k)$-approximate matching, where we show: • These bounds recover all previous trade-offs known for dynamic matching in the literature up to logarithmic factors in the update time. • $\alpha(2) > 0.612$ for bipartite graphs, and $\alpha(2) > 0.609$ for general graphs; these approximations are obtained in $\tilde{O}(\min\{\Delta^{1/3}, m^{1/6}\})$ update time. • $\alpha(3) > 0.563$ for bipartite graphs, and $\alpha(3) > 0.532$ for general graphs; these approximations are obtained in $\tilde{O}(\min\{\Delta^{1/4}, m^{1/8}\})$ update time.
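The EDCS that the hierarchy generalizes is a subgraph H of G constrained by two degree bounds: every edge of H has deg_H(u) + deg_H(v) at most beta, and every edge of G outside H has deg_H(u) + deg_H(v) at least beta_minus. A minimal property checker (the edge-list encoding is hypothetical):

```python
def is_edcs(G_edges, H_edges, beta, beta_minus):
    """Check the Bernstein-Stein EDCS(G, beta, beta_minus) property:
    H-edges have degree sum <= beta in H; omitted G-edges have
    degree sum >= beta_minus in H."""
    H = set(map(frozenset, H_edges))
    deg = {}
    for e in H:
        for v in e:
            deg[v] = deg.get(v, 0) + 1
    for e in map(frozenset, G_edges):
        u, v = tuple(e)
        s = deg.get(u, 0) + deg.get(v, 0)
        if (e in H and s > beta) or (e not in H and s < beta_minus):
            return False
    return True

# Path 0-1-2-3 with H keeping the two outer edges: every H-edge has
# degree sum 2, and the omitted edge (1, 2) also has degree sum 2.
path = [(0, 1), (1, 2), (2, 3)]
print(is_edcs(path, [(0, 1), (2, 3)], beta=2, beta_minus=2))  # -> True
```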

FOCS Conference 2022 Conference Paper

On Weighted Graph Sparsification by Linear Sketching

  • Yu Chen 0039
  • Sanjeev Khanna
  • Huan Li 0002

A seminal work of [Ahn-Guha-McGregor, PODS'12] showed that one can compute a cut sparsifier of an unweighted undirected graph by taking a near-linear number of linear measurements on the graph. Subsequent works also studied computing other graph sparsifiers using linear sketching, and obtained near-linear upper bounds for spectral sparsifiers [Kapralov-Lee-Musco-Musco-Sidford, FOCS'14] and first non-trivial upper bounds for spanners [Filtser-Kapralov-Nouri, SODA'21]. All these linear sketching algorithms, however, only work on unweighted graphs, and are extended to weighted graphs by weight grouping, a non-linear operation not implementable in, for instance, general turnstile streams. In this paper, we initiate the study of weighted graph sparsification by linear sketching by investigating a natural class of linear sketches that we call incidence sketches, in which each measurement is a linear combination of the weights of edges incident on a single vertex. This class captures all the aforementioned linear sketches for unweighted sparsification. It also covers linear sketches implementable in the simultaneous communication model, where edges are distributed across $n$ machines. Our results are: (1) Weighted cut sparsification: We give an algorithm that computes a $(1+\epsilon)$-cut sparsifier using $\tilde{O}(n\epsilon^{-3})$ linear measurements, which is nearly optimal. This also implies a turnstile streaming algorithm with $\tilde{O}(n\epsilon^{-3})$ space. Our algorithm is achieved by building a so-called "weighted edge sampler" for each vertex. (2) Weighted spectral sparsification: We give an algorithm that computes a $(1+\epsilon)$-spectral sparsifier using $\tilde{O}(n^{6/5}\epsilon^{-4})$ linear measurements. This also implies a turnstile streaming algorithm with $\tilde{O}(n^{6/5}\epsilon^{-4})$ space. Key to our algorithm is a novel analysis of how the effective resistances change under vertex sampling.
Complementing our algorithm, we then prove a superlinear lower bound of $\Omega(n^{21/20-o(1)})$ measurements for computing an $O(1)$-spectral sparsifier using incidence sketches. (3) Weighted spanner computation: We first show that any $o(n^{2})$ linear measurements can only recover a spanner of stretch that in general depends linearly on $\frac{w_{\max}}{w_{\min}}$. We thus focus on graphs with $\frac{w_{\max}}{w_{\min}}=O(1)$ and study the stretch's dependence on $n$. On such graphs, the algorithm in [Filtser-Kapralov-Nouri, SODA'21] can obtain a spanner of stretch $\tilde{O}(n^{\frac{2}{3}(1-\alpha)})$ using $\tilde{O}(n^{1+\alpha})$ measurements for any $\alpha\in [0, 1]$. We prove that, for incidence sketches, this trade-off is optimal up to an $n^{o(1)}$ factor for all $\alpha < 1/10$. We prove both our lower bounds by analyzing the "effective resistances" in certain matrix-weighted graphs, where we develop a number of new tools for reasoning about such graphs, most notably (i) a matrix-weighted analog of the widely used expander decomposition of ordinary graphs, and (ii) a proof that a random vertex-induced subgraph of a matrix-weighted expander is also an expander. We believe these tools are of independent interest.

NeurIPS Conference 2022 Conference Paper

Sublinear Algorithms for Hierarchical Clustering

  • Arpit Agarwal
  • Sanjeev Khanna
  • Huan Li
  • Prathamesh Patil

Hierarchical clustering over graphs is a fundamental task in data mining and machine learning with applications in many domains including phylogenetics, social network analysis, and information retrieval. Specifically, we consider the recently popularized objective function for hierarchical clustering due to Dasgupta (2016), namely, minimum cost hierarchical partitioning. Previous algorithms for (approximately) minimizing this objective function require linear time/space complexity. In many applications the underlying graph can be massive in size, making it computationally challenging to process the graph even using a linear time/space algorithm. As a result, there is a strong interest in designing algorithms that can perform global computation using only sublinear resources (space, time, and communication). The focus of this work is to study hierarchical clustering for massive graphs under three well-studied models of sublinear computation which focus on space, time, and communication, respectively, as the primary resources to optimize: (1) the (dynamic) streaming model, where edges are presented as a stream; (2) the query model, where the graph is queried using neighbor and degree queries; (3) the massively parallel computation (MPC) model, where the edges of the graph are partitioned over several machines connected via a communication channel. We design sublinear algorithms for hierarchical clustering in all three models above. At the heart of our algorithmic results is a view of the objective in terms of cuts in the graph, which allows us to use a relaxed notion of cut sparsifiers to do hierarchical clustering while introducing only a small distortion in the objective function. Our main algorithmic contributions are then to show how cut sparsifiers of the desired form can be efficiently constructed in the query model and the MPC model.
We complement our algorithmic results by establishing nearly matching lower bounds that rule out the possibility of designing algorithms with better performance guarantees in each of these models.
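Dasgupta's objective charges each edge the number of leaves under the lowest internal node separating its endpoints, so heavily connected pairs should be split only inside small subtrees. A minimal cost evaluator for a given hierarchy (the nested-tuple tree encoding is hypothetical):

```python
def dasgupta_cost(tree, edges):
    """Dasgupta (2016) cost: sum over weighted edges (u, v, w) of
    w * |leaves(lca(u, v))|, where lca is taken in the hierarchy."""

    def leaves(t):
        if not isinstance(t, tuple):
            return frozenset([t])
        return frozenset().union(*(leaves(c) for c in t))

    def lca_leaf_count(t, u, v):
        # Descend while some child still contains both endpoints.
        if isinstance(t, tuple):
            for child in t:
                s = leaves(child)
                if u in s and v in s:
                    return lca_leaf_count(child, u, v)
        return len(leaves(t))

    return sum(w * lca_leaf_count(tree, u, v) for u, v, w in edges)

# Unit-weight triangle a-b-c under the hierarchy ((a, b), c):
# edge (a, b) is charged 2, the other two edges are charged 3 each.
print(dasgupta_cost((("a", "b"), "c"),
                    [("a", "b", 1), ("b", "c", 1), ("a", "c", 1)]))  # -> 8
```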

FOCS Conference 2021 Conference Paper

A Polynomial Lower Bound on the Number of Rounds for Parallel Submodular Function Minimization

  • Deeparnab Chakrabarty
  • Yu Chen 0039
  • Sanjeev Khanna

The problem of minimizing a submodular function (SFM) is a common generalization of several fundamental combinatorial optimization problems, including minimum $s$-$t$ cuts in graphs and matroid intersection. It is well-known that a submodular function can be minimized with only $\text{poly}(N)$ function evaluation queries, where $N$ denotes the universe size. However, all known polynomial-query algorithms for SFM are highly adaptive, requiring at least $N$ rounds of adaptivity. A natural question is if SFM can be efficiently solved in a highly parallel manner, namely, with $\text{poly}(N)$ queries using only poly-logarithmic rounds of adaptivity. An important step towards understanding the adaptivity needed to solve SFM efficiently was taken in the very recent work of Balkanski and Singer, who proved a lower bound on the number of rounds of adaptivity required by any SFM algorithm making $\text{poly}(N)$ queries. This left open the possibility of efficient SFM algorithms with poly-logarithmic rounds of adaptivity. In this work, we strongly rule out this possibility by showing that any, possibly randomized, algorithm for submodular function minimization making $\text{poly}(N)$ queries requires $\tilde{\Omega}(N^{1/3})$ rounds of adaptivity. In fact, we show a polynomial lower bound on the number of rounds of adaptivity even for algorithms that make up to $2^{N^{1-\delta}}$ queries, for any constant $\delta > 0$.

NeurIPS Conference 2021 Conference Paper

Approximate optimization of convex functions with outlier noise

  • Anindya De
  • Sanjeev Khanna
  • Huan Li
  • MohammadHesam NikpeySalekde

We study the problem of minimizing a convex function given by a zeroth order oracle that is possibly corrupted by {\em outlier noise}. Specifically, we assume the function values at some points of the domain are corrupted arbitrarily by an adversary, with the only restriction being that the total volume of corrupted points is bounded. The goal then is to find a point close to the function's minimizer using access to the corrupted oracle. We first prove a lower bound result showing that, somewhat surprisingly, one cannot hope to approximate the minimizer {\em nearly as well} as one might expect, even if one is allowed {\em an unbounded number} of queries to the oracle. Complementing this negative result, we then develop an efficient algorithm that outputs a point close to the minimizer of the convex function, where the specific distance matches {\em exactly}, up to constant factors, the distance bound shown in our lower bound result.

SODA Conference 2021 Conference Paper

Hardness of Approximation for Orienteering with Multiple Time Windows

  • Naveen Garg 0001
  • Sanjeev Khanna
  • Amit Kumar 0001

Vehicle routing problems are a broad class of combinatorial optimization problems that can be formulated as the problem of finding a tour in a weighted graph that optimizes some function of the visited vertices. For instance, a canonical and extensively studied vehicle routing problem is the orienteering problem, where the goal is to find a tour that maximizes the number of vertices visited by a given deadline. In this paper, we consider the computational tractability of a well-known generalization of the orienteering problem called the Orient-MTW problem. The input to Orient-MTW consists of a weighted graph $G(V, E)$ where for each vertex $v \in V$ we are given a set of time instants $T_v \subseteq [T]$, and a source vertex $s$. A tour starting at $s$ is said to visit a vertex $v$ if it transits through $v$ at any time in the set $T_v$. The goal is to find a tour starting at the source vertex that maximizes the number of vertices visited. It is known that this problem admits a quasi-polynomial time $O(\log \mathrm{OPT})$-approximation, where OPT is the optimal solution value, but until now no hardness better than APX-hardness was known for this problem. Our main result is a super-constant hardness of approximation for this problem that holds even when the underlying graph $G$ is an undirected tree. This is the first super-constant hardness result for the Orient-MTW problem. The starting point for our result is the hardness of the SetCover problem, which is known to hold on instances with a special structure. We exploit this special structure of the hard SetCover instances to first obtain a new proof of the APX-hardness result for Orient-MTW that holds even on trees of depth 2. We then recursively amplify this constant-factor hardness to a super-constant hardness, while keeping the resulting topology a tree.
Our amplified hardness proof crucially utilizes a delicate concavity property which shows that in our encoding of SetCover instances as instances of the Orient-MTW problem, whenever the optimal cost for the SetCover instance is large, any tour, no matter how it allocates its time across different sub-trees, cannot visit too many vertices overall. We believe that this reduction template may also prove useful in showing hardness of other vehicle routing problems.

FOCS Conference 2020 Conference Paper

Near-linear Size Hypergraph Cut Sparsifiers

  • Yu Chen 0039
  • Sanjeev Khanna
  • Ansh Nagda

Cuts in graphs are a fundamental object of study, and play a central role in the study of graph algorithms. The problem of sparsifying a graph while approximately preserving its cut structure has been extensively studied and has many applications. In a seminal work, Benczúr and Karger (1996) showed that given any $n$-vertex undirected weighted graph $G$ and a parameter $\varepsilon\in(0, 1)$, there is a near-linear time algorithm that outputs a weighted subgraph $G^{\prime}$ of $G$ of size $\tilde{O}(n/\varepsilon^{2})$ such that the weight of every cut in $G$ is preserved to within a $(1\pm\varepsilon)$-factor in $G^{\prime}$. The graph $G^{\prime}$ is referred to as a $(1\pm\varepsilon)$-approximate cut sparsifier of $G$. A natural question is if such cut-preserving sparsifiers also exist for hypergraphs. Kogan and Krauthgamer (2015) initiated a study of this question and showed that given any weighted hypergraph $H$ where the cardinality of each hyperedge is bounded by $r$, there is a polynomial-time algorithm to find a $(1\pm\varepsilon)$-approximate cut sparsifier of $H$ of size $\tilde{O}(\frac{nr}{\varepsilon^{2}})$. Since $r$ can be as large as $n$, in general, this gives a hypergraph cut sparsifier of size $\tilde{O}(n^{2}/\varepsilon^{2})$, which is a factor $n$ larger than the Benczúr-Karger bound for graphs. It has been an open question whether or not the Benczúr-Karger bound is achievable on hypergraphs. In this work, we resolve this question in the affirmative by giving a new polynomial-time algorithm for creating hypergraph sparsifiers of size $\tilde{O}(n/\varepsilon^{2})$.
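The reweighting idea behind cut sparsifiers can be illustrated with the crudest possible scheme: keep each edge with probability p and scale survivors by 1/p, which preserves every cut in expectation. (Benczúr-Karger, and the hypergraph result above, instead sample non-uniformly by edge strength to also get concentration.) A minimal sketch:

```python
import random

def uniform_sparsify(edges, p, seed=0):
    """Keep each weighted edge (u, v, w) independently with probability
    p, reweighted to w / p, so every cut's weight is preserved in
    expectation. Uniform sampling alone gives no concentration: a cut
    crossed by a single critical edge is lost with probability 1 - p,
    which is why strength-based importance sampling is needed."""
    rng = random.Random(seed)
    return [(u, v, w / p) for (u, v, w) in edges if rng.random() < p]

def cut_weight(edges, side):
    """Total weight of edges with exactly one endpoint in `side`."""
    return sum(w for (u, v, w) in edges if (u in side) != (v in side))
```

With p = 1 the graph is returned unchanged; smaller p trades sparsifier size against the variance of each cut estimate.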

ICML Conference 2020 Conference Paper

Rank Aggregation from Pairwise Comparisons in the Presence of Adversarial Corruptions

  • Arpit Agarwal 0001
  • Shivani Agarwal 0001
  • Sanjeev Khanna
  • Prathamesh Patil

Rank aggregation from pairwise preferences has widespread applications in recommendation systems and information retrieval. Given the enormous economic and societal impact of these applications, and the consequent incentives for malicious players to manipulate ranking outcomes in their favor, an important challenge is to make rank aggregation algorithms robust to adversarial manipulations in data. In this paper, we initiate the study of robustness in rank aggregation under the popular Bradley-Terry-Luce (BTL) model for pairwise comparisons. We consider a setting where pairwise comparisons are initially generated according to a BTL model, but a fraction of these comparisons are corrupted by an adversary prior to being reported to us. We consider a strong contamination model, where an adversary having complete knowledge of the initial truthful data and the underlying true BTL parameters, can subsequently corrupt the truthful data by inserting, deleting, or changing data points. The goal is to estimate the true score/weight of each item under the BTL model, even in the presence of these corruptions. We characterize the extent of adversarial corruption under which the true BTL parameters are uniquely identifiable. We also provide a novel pruning algorithm that provably cleans the data of adversarial corruption under reasonable conditions on data generation and corruption. We corroborate our theory with experiments on both synthetic as well as real data showing that previous algorithms are vulnerable to even small amounts of corruption, whereas our algorithm can clean a reasonably high amount of corruption.
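For reference, the standard (non-robust) way to fit BTL scores on clean data is the classical Zermelo minorization-maximization iteration; a pruning step like the one above is what makes such a fit meaningful on corrupted data. A minimal sketch (the win-matrix encoding is hypothetical):

```python
def btl_mm(wins, iters=100):
    """Fit BTL scores w (up to scaling) by the Zermelo/MM iteration.

    wins[i][j] = number of times item i beat item j; under BTL,
    Pr[i beats j] = w_i / (w_i + w_j). The MM update is
        w_i <- W_i / sum_{j != i} n_ij / (w_i + w_j),
    with W_i = total wins of i and n_ij = comparisons between i and j,
    followed by renormalization.
    """
    n = len(wins)
    w = [1.0] * n
    for _ in range(iters):
        new = []
        for i in range(n):
            total_wins = sum(wins[i])
            denom = sum((wins[i][j] + wins[j][i]) / (w[i] + w[j])
                        for j in range(n) if j != i)
            new.append(total_wins / denom if denom > 0 else w[i])
        s = sum(new)
        w = [x / s for x in new]
    return w

# Item 0 beats item 1 in 3 of 4 comparisons: the fitted scores
# satisfy w0 / (w0 + w1) = 3/4, matching the empirical win rate.
scores = btl_mm([[0, 3], [1, 0]])
```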

STOC Conference 2019 Conference Paper

A new algorithm for decremental single-source shortest paths with applications to vertex-capacitated flow and cut problems

  • Julia Chuzhoy
  • Sanjeev Khanna

We study the vertex-decremental Single-Source Shortest Paths (SSSP) problem: given an undirected graph $G=(V, E)$ with lengths $\ell(e) \geq 1$ on its edges that undergoes vertex deletions, and a source vertex $s$, we need to support (approximate) shortest-path queries in $G$: given a vertex $v$, return a path connecting $s$ to $v$ whose length is at most $(1+\epsilon)$ times the length of the shortest such path, where $\epsilon$ is a given accuracy parameter. The problem has many applications, for example to flow and cut problems in vertex-capacitated graphs. Decremental SSSP is a fundamental problem in dynamic algorithms that has been studied extensively, especially in the more standard edge-decremental setting, where the input graph $G$ undergoes edge deletions. The classical algorithm of Even and Shiloach supports exact shortest-path queries in $O(mn)$ total update time. A series of recent results have improved this bound to $O(m^{1+o(1)} \log L)$, where $L$ is the largest length of any edge. However, these improved results are randomized algorithms that assume an oblivious adversary. To go beyond the oblivious-adversary restriction, recently, Bernstein, and Bernstein and Chechik, designed deterministic algorithms for the problem, with total update time $\tilde{O}(n^{2} \log L)$, that by definition work against an adaptive adversary. Unfortunately, their algorithms introduce a new limitation, namely, they can only return the approximate length of a shortest path, and not the path itself. Many applications of the decremental SSSP problem, including the ones considered in this paper, crucially require both that the algorithm returns the approximate shortest paths themselves and not just their lengths, and that it works against an adaptive adversary. Our main result is a randomized algorithm for vertex-decremental SSSP with total expected update time $O(n^{2+o(1)} \log L)$, that responds to each shortest-path query in $\tilde{O}(n \log L)$ time in expectation, returning a $(1+\epsilon)$-approximate shortest path.
The algorithm works against an adaptive adversary. The main technical ingredient of our algorithm is an $\tilde{O}(|E(G)|+n^{1+o(1)})$-time algorithm to compute a core decomposition of a given dense graph $G$, which allows us to compute short paths between pairs of query vertices in $G$ efficiently. We use our result for vertex-decremental SSSP to obtain $(1+\epsilon)$-approximation algorithms for maximum $s$-$t$ flow and minimum $s$-$t$ cut in vertex-capacitated graphs, in expected time $n^{2+o(1)}$, and an $O(\log^{4} n)$-approximation algorithm for the vertex version of the sparsest cut problem with expected running time $n^{2+o(1)}$. These results improve upon the previous best known algorithms for these problems in the regime where $m = \omega(n^{1.5+o(1)})$.

IJCAI Conference 2019 Conference Paper

Network Formation under Random Attack and Probabilistic Spread

  • Yu Chen
  • Shahin Jabbari
  • Michael Kearns
  • Sanjeev Khanna
  • Jamie Morgenstern

We study a network formation game where agents receive benefits by forming connections to other agents but also incur both direct and indirect costs from the formed connections. Specifically, once the agents have purchased their connections, an attack starts at a randomly chosen vertex in the network and spreads according to the independent cascade model with a fixed probability, destroying any infected agents. The utility or welfare of an agent in our game is defined to be the expected size of the agent's connected component post-attack minus her expenditure in forming connections. Our goal is to understand the properties of the equilibrium networks formed in this game. Our first result concerns the edge density of equilibrium networks. A network connection increases both the likelihood of remaining connected to other agents after an attack as well as the likelihood of getting infected by a cascading spread of infection. We show that the latter concern primarily prevails, and any equilibrium network in our game contains only $O(n\log n)$ edges where $n$ denotes the number of agents. On the other hand, there are equilibrium networks that contain $\Omega(n)$ edges, showing that our edge density bound is tight up to a logarithmic factor. Our second result shows that the presence of attack and its spread through a cascade does not significantly lower social welfare as long as the network is not too dense. We show that any non-trivial equilibrium network with $O(n)$ edges has $\Theta(n^2)$ social welfare, asymptotically similar to the social welfare guarantee in the game without any attacks.
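The attack model above is concrete enough to simulate directly. The sketch below (function and variable names are ours, not from the paper) runs one independent-cascade attack from a uniformly random vertex and reports the post-attack component size of a chosen agent; averaging over many runs and subtracting the connection expenditure would estimate the welfare defined above.

```python
import random

def post_attack_component(adj, q, agent, rng=random.Random(0)):
    """Simulate one attack: infection starts at a uniformly random vertex
    and spreads along each edge independently with probability q
    (independent cascade). Returns the size of `agent`'s connected
    component among surviving (non-infected) vertices, or 0 if the
    agent itself is destroyed. `adj` is an adjacency list."""
    n = len(adj)
    start = rng.randrange(n)
    # Independent cascade: each edge out of a newly infected vertex
    # transmits independently with probability q.
    infected = {start}
    frontier = [start]
    while frontier:
        nxt = []
        for u in frontier:
            for v in adj[u]:
                if v not in infected and rng.random() < q:
                    infected.add(v)
                    nxt.append(v)
        frontier = nxt
    if agent in infected:
        return 0
    # DFS over surviving vertices to find the agent's component size.
    seen = {agent}
    stack = [agent]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in infected and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen)
```

With q = 1 on a connected graph every agent is destroyed, so the function returns 0; with q = 0 only the randomly chosen start vertex is lost.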

SODA Conference 2019 Conference Paper

Sublinear Algorithms for (Δ + 1) Vertex Coloring

  • Sepehr Assadi
  • Yu Chen 0039
  • Sanjeev Khanna

Any graph with maximum degree Δ admits a proper vertex coloring with Δ+1 colors that can be found via a simple sequential greedy algorithm in linear time and space. But can one find such a coloring via a sublinear algorithm? We answer this fundamental question in the affirmative for several canonical classes of sublinear algorithms, including graph streaming, sublinear time, and massively parallel computation (MPC) algorithms. In particular, we design: A single-pass semi-streaming algorithm in dynamic streams using Õ(n) space. The only known semi-streaming algorithm prior to our work was a folklore O(log n)-pass algorithm obtained by simulating classical distributed algorithms in the streaming model. A sublinear-time algorithm in the standard query model that allows neighbor queries and pair queries, together with a nearly matching time lower bound for any algorithm that outputs a valid coloring with sufficiently large constant probability. No non-trivial sublinear time algorithms were known prior to our work. A parallel algorithm in the massively parallel computation (MPC) model using Õ(n) memory per machine and O(1) MPC rounds. Our number of rounds significantly improves upon the recent O(log log Δ · log*(n))-round algorithm of Parter [ICALP 2018]. At the core of our results is a remarkably simple meta-algorithm for the (Δ + 1) coloring problem: sample O(log n) colors for each vertex independently and uniformly at random from the Δ + 1 colors, then find a proper coloring of the graph using only the sampled colors of each vertex. As our main result, we prove that the sampled set of colors with high probability contains a proper coloring of the input graph. The sublinear algorithms are then obtained by designing efficient algorithms for finding a proper coloring of the graph from the sampled colors in each model. We note that all our upper bound results for (Δ + 1) coloring are either optimal or close to best possible in each model studied.
We also establish new lower bounds that rule out the possibility of achieving similar results in these models for the closely related problems of maximal independent set and maximal matching. Collectively, our results highlight a sharp contrast between the complexity of (Δ+1) coloring vs maximal independent set and maximal matching in various models of sublinear computation even though all three problems are solvable by a simple greedy algorithm in the classical setting.
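The meta-algorithm at the core of these results is simple enough to sketch. In the snippet below (names are ours), each vertex samples Θ(log n) colors from {0, …, Δ}; the paper proves that a proper coloring from these sampled lists exists with high probability, but the greedy completion used here is only an illustrative stand-in for the model-specific procedures and can fail (returning None) on unlucky samples or orderings.

```python
import math
import random

def palette_sparsify_color(adj, rng=random.Random(1)):
    """Sketch of the (Δ+1)-coloring meta-algorithm: each vertex samples
    O(log n) colors from {0, ..., Δ}, and we then look for a proper
    coloring that uses only sampled colors. The greedy list-coloring
    below is a heuristic illustration, not the paper's procedure."""
    n = len(adj)
    delta = max((len(nbrs) for nbrs in adj), default=0)
    k = max(1, math.ceil(3 * math.log(max(n, 2))))  # sample size Θ(log n)
    palette = list(range(delta + 1))
    lists = [set(rng.sample(palette, min(k, len(palette)))) for _ in range(n)]
    color = [None] * n
    # Greedy list coloring: handle vertices with the fewest choices first.
    order = sorted(range(n), key=lambda v: len(lists[v]))
    for v in order:
        used = {color[u] for u in adj[v]}
        choices = lists[v] - used
        if not choices:
            return None          # greedy stand-in failed; retry or use the paper's method
        color[v] = min(choices)
    return color
```

On small graphs the sampled lists are essentially the full palette, so greedy always succeeds; the interesting regime is large Δ, where the lists are much smaller than Δ + 1.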

FOCS Conference 2018 Conference Paper

Testing Graph Clusterability: Algorithms and Lower Bounds

  • Ashish Chiplunkar
  • Michael Kapralov
  • Sanjeev Khanna
  • Aida Mousavifar
  • Yuval Peres

We consider the problem of testing graph cluster structure: given access to a graph G = (V, E), can we quickly determine whether the graph can be partitioned into a few clusters with good inner conductance, or is far from any such graph? This is a generalization of the well-studied problem of testing graph expansion, where one wants to distinguish between the graph having good expansion (i.e. being a good single cluster) and the graph having a sparse cut (i.e. being a union of at least two clusters). A recent work of Czumaj, Peng, and Sohler (STOC'15) gave an ingenious sublinear time algorithm for testing k-clusterability in time Õ(n^{1/2} poly(k)). Their algorithm implicitly embeds a random sample of vertices of the graph into Euclidean space, and then clusters the samples based on estimates of Euclidean distances between the points. This yields a very efficient testing algorithm, but only works if the cluster structure is very strong: it is necessary to assume that the gap between conductances of accepted and rejected graphs is at least logarithmic in the size of the graph G. In this paper we show how one can leverage more refined geometric information, namely angles as opposed to distances, to obtain a sublinear time tester that works even when the gap is a sufficiently large constant. Our tester is based on the singular value decomposition of a natural matrix derived from random walk transition probabilities from a small sample of seed nodes. We complement our algorithm with a matching lower bound on the query complexity of testing clusterability. Our lower bound is based on a novel property testing problem, which we analyze using Fourier analytic tools. As a byproduct of our techniques, we also achieve new lower bounds for the problem of approximating MAX-CUT value in sublinear time.

SODA Conference 2017 Conference Paper

(1 + Ω(1))-Approximation to MAX-CUT Requires Linear Space

  • Michael Kapralov
  • Sanjeev Khanna
  • Madhu Sudan 0001
  • Ameya Velingker

We consider the problem of estimating the value of MAX-CUT in a graph in the streaming model of computation. We show that there exists a constant ε* > 0 such that any randomized streaming algorithm that computes a (1 + ε*)-approximation to MAX-CUT requires Ω(n) space on an n-vertex graph. By contrast, there are algorithms that produce a (1 + ε)-approximation in space O(n/ε^2) for every ε > 0. Our result is the first linear space lower bound for the task of approximating the max cut value and partially answers an open question from the literature [2]. The prior state of the art ruled out (2 − ε)-approximation and (1 + ε)-approximation only for algorithms using substantially less than linear space, for any ε > 0. Previous lower bounds for the MAX-CUT problem relied, in essence, on a lower bound on the communication complexity of the following task: several players are each given some edges of a graph and they wish to determine if the union of these edges is ε-close to forming a bipartite graph, using one-way communication. The previous works proved a (weaker) lower bound for this task when ε = 1/2, and an n^{1−O(ε)} lower bound for every ε > 0, even when one of the players is given a candidate bipartition of the graph and the graph is promised to be bipartite with respect to this partition or ε-far from bipartite. This added information was essential in enabling the previous analyses but also yields a weak bound since, with this extra information, there is an n^{1−O(ε)} communication protocol for this problem. In this work, we give an Ω(n) lower bound on the communication complexity of the original problem (without the extra information) for ε = Ω(1) in the three-player setting. Obtaining this Ω(n) lower bound on the communication complexity is the main technical result in this paper. We achieve it by a delicate choice of distributions on instances as well as a novel use of the convolution theorem from Fourier analysis combined with graph-theoretic considerations to analyze the communication complexity.

SODA Conference 2017 Conference Paper

On Estimating Maximum Matching Size in Graph Streams

  • Sepehr Assadi
  • Sanjeev Khanna
  • Yang Li 0025

We study the problem of estimating the maximum matching size in graphs whose edges are revealed in a streaming manner. We consider both insertion-only streams, which only contain edge insertions, and dynamic streams that allow both insertions and deletions of the edges, and present new upper and lower bound results for both cases. On the upper bound front, we show that an α-approximate estimate of the matching size can be computed in dynamic streams using Õ(n^2/α^4) space, and in insertion-only streams using Õ(n/α^2) space. These bounds respectively shave off a factor of α from the space necessary to compute an α-approximate matching (as opposed to only its size), thus proving a non-trivial separation between approximate estimation and approximate computation of matchings in data streams. On the lower bound front, we prove that any α-approximation algorithm for estimating matching size in dynamic graph streams requires a non-trivial amount of space, even if the underlying graph is both sparse and has arboricity bounded by O(α). We further improve our lower bound to Ω(n/α^2) in the case of dense graphs. These results establish the first non-trivial streaming lower bounds for super-constant approximation of matching size. Furthermore, we present the first super-linear space lower bound for computing a (1 + ε)-approximation of matching size even in insertion-only streams. In particular, we prove that a (1 + ε)-approximation to matching size requires RS(n) · n^{1−O(ε)} space; here, RS(n) denotes the maximum number of edge-disjoint induced matchings of size Θ(n) in an n-vertex graph. It is a major open problem with far-reaching implications to determine the value of RS(n), and current results leave open the possibility that RS(n) may be as large as n/log n. Moreover, using the best known lower bounds for RS(n), our result already rules out any O(n · poly(log n/ε))-space algorithm for (1 + ε)-approximation of matchings.
We also show how to avoid the dependency on the parameter RS(n) in proving lower bounds for dynamic streams, and present a near-optimal lower bound of n^{2−O(ε)} for (1 + ε)-approximation in this model. Using a well-known connection between matching size and matrix rank, all our lower bounds also hold for the problem of estimating matrix rank. In particular, our results imply a near-optimal n^{2−O(ε)} bit lower bound for (1 + ε)-approximation of matrix rank for dense matrices in dynamic streams, answering an open question of Li and Woodruff (STOC 2016).

SODA Conference 2016 Conference Paper

Maximum Matchings in Dynamic Graph Streams and the Simultaneous Communication Model

  • Sepehr Assadi
  • Sanjeev Khanna
  • Yang Li 0025
  • Grigory Yaroslavtsev

We study the problem of finding an approximate maximum matching in two closely related computational models, namely, the dynamic graph streaming model and the simultaneous multi-party communication model. In the dynamic graph streaming model, the input graph is revealed as a stream of edge insertions and deletions, and the goal is to design a small space algorithm to approximate the maximum matching. In the simultaneous model, the input graph is partitioned across k players, and the goal is to design a protocol where the k players simultaneously send a small-size message to a coordinator, and the coordinator computes an approximate matching. Dynamic graph streams. We resolve the space complexity of single-pass turnstile streaming algorithms for approximating matchings by showing that for any ε > 0, Θ(n^{2−3ε}) space is both sufficient and necessary (up to polylogarithmic factors) to compute an n^ε-approximate matching; here n denotes the number of vertices in the input graph. The simultaneous communication model. Our results for dynamic graph streams also resolve the (per-player) simultaneous communication complexity for approximating matchings in the edge partition model. For the vertex partition model, we design new randomized and deterministic protocols for k players to achieve an α-approximation. Specifically, we provide a randomized protocol with total communication of O(nk/α^2) and a deterministic protocol with total communication of O(nk/α). Both these bounds are tight. Our work generalizes the results established by Dobzinski et al. (STOC 2014) for the special case of k = n. Finally, for a certain range of parameters, we establish a new lower bound on the simultaneous communication complexity which is super-linear in n.

SODA Conference 2015 Conference Paper

On (1, ε)-Restricted Assignment Makespan Minimization

  • Deeparnab Chakrabarty
  • Sanjeev Khanna
  • Shi Li 0001

Makespan minimization on unrelated machines is a classic problem in approximation algorithms. No polynomial time (2 − δ)-approximation algorithm is known for the problem for constant δ > 0. This is true even for certain special cases, most notably the restricted assignment problem where each job has the same load on any machine but can be assigned to one from a specified subset. Recently, in a breakthrough result, Svensson [16] proved that the integrality gap of a certain configuration LP relaxation is upper bounded by 1.95 for the restricted assignment problem; however, the rounding algorithm is not known to run in polynomial time. In this paper we consider the (1, ε)-restricted assignment problem where each job is either heavy (p_j = 1) or light (p_j = ε), for some parameter ε > 0. Our main result is a (2 − δ)-approximate polynomial time algorithm for the (1, ε)-restricted assignment problem for a fixed constant δ > 0. Even for this special case, the best polynomial-time approximation factor known so far is 2. We obtain this result by rounding the configuration LP relaxation for this problem. A simple reduction from vertex cover shows that this special case remains NP-hard to approximate to within a factor better than 7/6.

ICRA Conference 2015 Conference Paper

On embeddability of modular robot designs

  • Yannis Mantzouratos
  • Tarik Tosun
  • Sanjeev Khanna
  • Mark Yim

We address the problem of detecting embeddability of modular robots: namely, to decide automatically whether a given modular robot design can simulate the functionality of a seemingly different design. To that end, we introduce a novel graph representation for modular robots and formalize the notion of embedding through topological and kinematic conditions. Based on that, we develop an algorithm that decides embeddability when the two involved designs have tree topologies. Our algorithm performs two passes and involves dynamic programming and maximum cardinality matching. We demonstrate our approach on real modular robots and show that we can detect embeddability of complex designs efficiently.

SODA Conference 2015 Conference Paper

Streaming Lower Bounds for Approximating MAX-CUT

  • Michael Kapralov
  • Sanjeev Khanna
  • Madhu Sudan 0001

We consider the problem of estimating the value of max cut in a graph in the streaming model of computation. At one extreme, there is a trivial 2-approximation for this problem that uses only O(log n) space, namely, count the number of edges and output half of this value as the estimate for the max cut value. At the other extreme, if one allows Õ(n) space, then a near-optimal solution to the max cut value can be obtained by storing an Õ(n)-size sparsifier that essentially preserves the max cut. An intriguing question is if poly-logarithmic space suffices to obtain a non-trivial approximation to the max-cut value (that is, beating the factor 2). It was recently shown that the problem of estimating the size of a maximum matching in a graph admits a non-trivial approximation in poly-logarithmic space. Our main result is that any streaming algorithm that breaks the 2-approximation barrier requires more than poly-logarithmic space, even if the edges of the input graph are presented in random order. Our result is obtained by exhibiting a distribution over graphs which are either bipartite or ε-far from being bipartite, and establishing that such space is necessary to differentiate between these two cases. Thus, as a direct corollary, we obtain that the same space is also necessary to test if a graph is bipartite or ε-far from being bipartite. We also show that for any ε > 0, any streaming algorithm that obtains a (1 + ε)-approximation to the max cut value when edges arrive in adversarial order requires n^{1−O(ε)} space, implying that Ω(n) space is necessary to obtain an arbitrarily good approximation to the max cut value.
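For contrast, the trivial 2-approximation mentioned at the start is a one-line streaming algorithm: since the max-cut value of a graph with m edges always lies between m/2 (achieved by a uniformly random cut) and m, maintaining a single edge counter suffices.

```python
def maxcut_estimate(edge_stream):
    """Trivial streaming 2-approximation for max-cut: count edges and
    output half. For any graph, m/2 <= maxcut <= m, so the returned
    value is within a factor 2 of the true max-cut value, and only
    the counter (O(log n) bits) is stored."""
    m = sum(1 for _ in edge_stream)
    return m / 2
```

On a triangle (max-cut value 2) the estimate is 1.5, within the promised factor 2; the abstract's point is that doing any better forces a large space blow-up.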

SODA Conference 2014 Conference Paper

Approximating matching size from random streams

  • Michael Kapralov
  • Sanjeev Khanna
  • Madhu Sudan 0001

We present a streaming algorithm that makes one pass over the edges of an unweighted graph presented in random order, and produces a polylogarithmic approximation to the size of the maximum matching in the graph, while using only polylogarithmic space. Prior to this work the only approximations known were a folklore approximation with polylogarithmic space in an n-vertex graph and a constant approximation with Ω(n) space. Our work thus gives the first algorithm where both the space and approximation factors are smaller than any polynomial in n. Our algorithm is obtained by effecting a streaming implementation of a simple “local” algorithm that we design for this problem. The local algorithm produces an O(k · n^{1/k}) approximation to the size of a maximum matching by exploring the radius-k neighborhoods of vertices, for any parameter k. We show, somewhat surprisingly, that our local algorithm can be implemented in the streaming setting even for k = Ω(log n / log log n). Our analysis exposes some of the problems that arise in such conversions of local algorithms into streaming ones, and gives techniques to overcome such problems.

SODA Conference 2014 Conference Paper

Disjoint Set Union with Randomized Linking

  • Ashish Goel
  • Sanjeev Khanna
  • Daniel H. Larkin
  • Robert Endre Tarjan

A classic result in the analysis of data structures is that path compression with linking by rank solves the disjoint set union problem in almost-constant amortized time per operation. Recent experiments suggest that in practice, a naïve linking method works just as well if not better than linking by rank, in spite of being theoretically inferior. How can this be? We prove that randomized linking is asymptotically as efficient as linking by rank. This result provides theory that matches the experiments, which implicitly do randomized linking as a result of the way the input instances are generated.
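The randomized linking rule analyzed above takes only a few lines: assign each element a random priority once, and always link the root of lower priority under the root of higher priority. A sketch with path compression (names are ours):

```python
import random

class RandomizedDSU:
    """Disjoint set union with path compression and randomized linking:
    each element gets a random priority, and a union links the root of
    lower priority under the root of higher priority. The result above
    shows this is asymptotically as efficient as linking by rank."""
    def __init__(self, n, rng=random.Random(42)):
        self.parent = list(range(n))
        self.priority = [rng.random() for _ in range(n)]

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:    # path compression pass
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.priority[ra] < self.priority[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra             # lower priority under higher
        return True
```

The "naïve" linking the experiments observe arises implicitly when inputs are generated in random order; making the priorities explicit, as here, is what the paper analyzes.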

SODA Conference 2012 Conference Paper

On the communication and streaming complexity of maximum bipartite matching

  • Ashish Goel
  • Michael Kapralov
  • Sanjeev Khanna

Consider the following communication problem. Alice holds a graph G_A = (P, Q, E_A) and Bob holds a graph G_B = (P, Q, E_B), where |P| = |Q| = n. Alice is allowed to send Bob a message m that depends only on the graph G_A. Bob must then output a matching M ⊆ E_A ∪ E_B. What is the minimum size of the message m that Alice sends to Bob that allows Bob to recover a matching of size at least (1 − ε) times the maximum matching in G_A ∪ G_B? The minimum message length is the one-round communication complexity of approximating bipartite matching. It is easy to see that the one-round communication complexity also gives a lower bound on the space needed by a one-pass streaming algorithm to compute a (1 − ε)-approximate bipartite matching. The focus of this work is to understand the one-round communication complexity and one-pass streaming complexity of maximum bipartite matching. In particular, how well can one approximate these problems with linear communication and space? Prior to our work, only a ½-approximation was known for both these problems. In order to study these questions, we introduce the concept of an ε-matching cover of a bipartite graph G, which is a sparse subgraph of the original graph that preserves the size of the maximum matching between every subset of vertices to within an additive εn error. We give a polynomial time construction of a ½-matching cover of size O(n) with some crucial additional properties, thereby showing that Alice and Bob can achieve a ⅔-approximation with a message of size O(n). While we do not provide bounds on the size of ε-matching covers for ε < ½, we prove that in general, the size of the smallest ε-matching cover of a graph G on n vertices is essentially equal to the size of the largest so-called ε-Ruzsa-Szemerédi graph on n vertices. We use this connection to show that for any δ > 0, a (⅔ + δ)-approximation requires a communication complexity of n^{1+Ω(1/log log n)}.
We also consider the natural restriction of the problem in which G_A and G_B are only allowed to share vertices on one side of the bipartition, which is motivated by applications to one-pass streaming with vertex arrivals. We show that a ¾-approximation can be achieved with a linear size message in this case, and this result is best possible in that super-linear space is needed to achieve any better approximation. Finally, we build on our techniques for the restricted version above to design a one-pass streaming algorithm for the case when vertices on one side are known in advance, and the vertices on the other side arrive in a streaming manner together with all their incident edges. This is precisely the setting of the celebrated (1 − 1/e)-competitive randomized algorithm of Karp-Vazirani-Vazirani (KVV) for the online bipartite matching problem [12]. We present here the first deterministic one-pass streaming (1 − 1/e)-approximation algorithm using O(n) space for this setting.

FOCS Conference 2011 Conference Paper

Algorithms for the Generalized Sorting Problem

  • Zhiyi Huang 0002
  • Sampath Kannan
  • Sanjeev Khanna

We study the generalized sorting problem where we are given a set of n elements to be sorted but only a subset of all possible pairwise element comparisons is allowed. The goal is to determine the sorted order using the smallest possible number of allowed comparisons. The generalized sorting problem may be equivalently viewed as follows. Given an undirected graph G(V, E) where V is the set of elements to be sorted and E defines the set of allowed comparisons, adaptively find the smallest subset E' ⊆ E of edges to probe such that the directed graph induced by E' contains a Hamiltonian path. When G is a complete graph, we get the standard sorting problem, and it is well-known that Θ(n log n) comparisons are necessary and sufficient. An extensively studied special case of the generalized sorting problem is the nuts and bolts problem where the allowed comparison graph is a complete bipartite graph between two equal-size sets. It is known that for this special case also, there is a deterministic algorithm that sorts using Θ(n log n) comparisons. However, when the allowed comparison graph is arbitrary, to our knowledge, no bound better than the trivial Õ(n^2) bound is known. Our main result is a randomized algorithm that sorts any allowed comparison graph using O(n^{3/2}) comparisons with high probability (provided the input is sortable). We also study the sorting problem in randomly generated allowed comparison graphs, and show that when the edge probability is p, Õ(min{n/p^2, n^{3/2}√p}) comparisons suffice on average to sort.

FOCS Conference 2011 Conference Paper

Delays and the Capacity of Continuous-Time Channels

  • Sanjeev Khanna
  • Madhu Sudan 0001

Any physical channel of communication offers two potential reasons why its capacity (the number of bits it can transmit in a unit of time) might be unbounded: (1) (Uncountably) infinitely many choices of signal strength at any given instant of time, and (2) (Uncountably) infinitely many instances of time at which signals may be sent. However channel noise cancels out the potential unboundedness of the first aspect, leaving typical channels with only a finite capacity per instant of time. The latter source of infinity seems less extensively studied. A potential source of unreliability that might restrict the capacity also from the second aspect is “delay”: Signals transmitted by the sender at a given point of time may not be received with a predictable delay at the receiving end. In this work we examine this source of uncertainty by considering a simple discrete model of delay errors. In our model the communicating parties get to subdivide time as microscopically finely as they wish, but still have to cope with communication delays that are macroscopic and variable. The continuous process becomes the limit of our process as the time subdivision becomes infinitesimal. We taxonomize this class of communication channels based on whether the delays and noise are stochastic or adversarial, and based on how much information each aspect has about the other when introducing its errors. We analyze the limits of such channels and reach somewhat surprising conclusions: The capacity of a physical channel is finitely bounded only if at least one of the two sources of error (signal noise or delay noise) is adversarial. In particular the capacity is finitely bounded only if the delay is adversarial, or the noise is adversarial and acts with knowledge of the stochastic delay. If both error sources are stochastic, or if the noise is adversarial and independent of the stochastic delay, then the capacity of the associated physical channel is infinite!

STOC Conference 2010 Conference Paper

Perfect matchings in o( n log n ) time in regular bipartite graphs

  • Ashish Goel
  • Michael Kapralov
  • Sanjeev Khanna

In this paper we consider the well-studied problem of finding a perfect matching in a d-regular bipartite graph on 2n nodes with m = nd edges. The best-known algorithm for general bipartite graphs (due to Hopcroft and Karp) takes time O(m√n). In regular bipartite graphs, however, a matching is known to be computable in O(m) time (due to Cole, Ost, and Schirra). In a recent line of work by Goel, Kapralov, and Khanna, the O(m) time bound was improved first to Õ(min{m, n^{2.5}/d}) and then to Õ(min{m, n^2/d}). In this paper, we give a randomized algorithm that finds a perfect matching in a d-regular graph and runs in O(n log n) time (both in expectation and with high probability). The algorithm performs an appropriately truncated alternating random walk to successively find augmenting paths. Our algorithm may be viewed as using adaptive uniform sampling, and is thus able to bypass the limitations of (non-adaptive) uniform sampling established in earlier work. Our techniques also give an algorithm that successively finds a matching in the support of a doubly stochastic matrix in expected time O(n log^2 n), with O(m) pre-processing time; this gives a simple O(m + mn log^2 n) time algorithm for finding the Birkhoff-von Neumann decomposition of a doubly stochastic matrix. We show that randomization is crucial for obtaining o(nd) time algorithms by establishing an Ω(nd) lower bound for deterministic algorithms. We also show that there does not exist a randomized algorithm that finds a matching in a regular bipartite multigraph and takes o(n log n) time with high probability.
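A heavily simplified sketch of the random-walk idea follows (our own illustrative code, omitting the truncation and the O(n log n) analysis): from each unmatched left vertex, repeatedly take a random incident edge to a right vertex, jump to that vertex's partner if it is matched, and augment along the loop-erased walk once an unmatched right vertex is reached.

```python
import random

def regular_bipartite_matching(adj, n_right, rng=random.Random(7)):
    """Illustrative sketch of matching via alternating random walks in a
    d-regular bipartite graph (which always has a perfect matching).
    adj[u] lists the right-neighbors of left vertex u. Loops in the
    walk are erased so the final path is a simple alternating path."""
    n_left = len(adj)
    match_l = [None] * n_left    # match_l[u] = right partner of left u
    match_r = [None] * n_right   # match_r[v] = left partner of right v
    for u in range(n_left):
        path = [u]               # alternates left, right, left, ...
        seen = {}                # right vertex -> its index in path
        x = u
        while True:
            y = rng.choice(adj[x])
            if match_r[y] is None:
                # Unmatched right vertex found: augment along the walk.
                path.append(y)
                for i in range(0, len(path), 2):
                    match_l[path[i]] = path[i + 1]
                    match_r[path[i + 1]] = path[i]
                break
            if y in seen:
                # Revisited right vertex: erase the loop from the walk.
                j = seen[y]
                for v in path[j + 2::2]:
                    del seen[v]
                path = path[:j + 1]
            else:
                seen[y] = len(path)
                path.append(y)
            x = match_r[y]       # continue the walk from y's partner
            path.append(x)
    return match_l
```

This simplification already terminates with probability 1 on regular bipartite graphs; the paper's contribution is the truncation scheme and the proof that the total work is O(n log n).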

FOCS Conference 2009 Conference Paper

An O(k^3 log n)-Approximation Algorithm for Vertex-Connectivity Survivable Network Design

  • Julia Chuzhoy
  • Sanjeev Khanna

In the Survivable Network Design problem (SNDP), we are given an undirected graph G(V, E) with costs on edges, along with a connectivity requirement r(u, v) for each pair u, v of vertices. The goal is to find a minimum-cost subset E* of edges that satisfies the given set of pairwise connectivity requirements. In the edge-connectivity version we need to ensure that there are r(u, v) edge-disjoint paths for every pair u, v of vertices, while in the vertex-connectivity version the paths are required to be vertex-disjoint. The edge-connectivity version of SNDP is known to have a 2-approximation. However, no non-trivial approximation algorithm has been known so far for the vertex version of SNDP, except for special cases of the problem. We present an extremely simple algorithm to achieve an O(k^3 log |T|)-approximation for this problem, where k denotes the maximum connectivity requirement, and T is the set of vertices that participate in one or more pairs with non-zero connectivity requirements. We also give a simple proof of the recently discovered O(k^3 log |T|)-approximation algorithm for the single-source version of vertex-connectivity SNDP. Our results establish a natural connection between vertex-connectivity and a well-understood generalization of edge-connectivity, namely, element-connectivity, in that any instance of vertex-connectivity can be expressed by a small number of instances of the element-connectivity problem.

FOCS Conference 2009 Conference Paper

Dynamic and Non-uniform Pricing Strategies for Revenue Maximization

  • Tanmoy Chakraborty 0001
  • Zhiyi Huang 0002
  • Sanjeev Khanna

We study the ITEM PRICING problem for revenue maximization in the limited supply setting, where a single seller with n distinct items caters to m buyers with unknown subadditive valuation functions who arrive in a sequence. The seller sets the prices on individual items. Each buyer buys a subset of yet-unsold items that maximizes her utility. Our goal is to design pricing strategies that guarantee an expected revenue within a small multiplicative factor of the optimal social welfare, an upper bound on the maximum revenue that can be generated by any pricing mechanism. Most earlier work has focused on the unlimited supply setting, where selling an item to a buyer does not affect the availability of the item to future buyers. Recently, Balcan et al. studied the limited supply setting, giving a randomized pricing strategy that achieves a $2^{O(\sqrt{\log n \log\log n})}$-approximation; their strategy assigns a single price to all items (uniform pricing), and never changes it (static pricing). They also showed that no pricing strategy that is both static and uniform can give better than a $2^{\Omega(\log^{1/4} n)}$-approximation. Our first result strengthens the lower bound on the approximation achievable by static uniform pricing to $2^{\Omega(\sqrt{\log n})}$. We then design dynamic uniform pricing strategies (all items are identically priced but item prices can change over time) that achieve an $O(\log^2 n)$-approximation, and also show a lower bound of $\Omega((\log n/\log\log n)^2)$ for this class of strategies. Our strategies are simple to implement; in particular, one strategy is to smoothly decrease the price over time. We also design a static non-uniform pricing strategy (different items can have different prices but prices do not change over time) that gives a poly-logarithmic approximation in a more restricted setting with few buyers.
Thus in the limited supply setting, our results highlight a strong separation between the power of dynamic and non-uniform pricing strategies versus static uniform pricing strategies. To our knowledge, this is the first non-trivial analysis of dynamic and non-uniform pricing schemes for revenue maximization in a setting with multiple distinct items.
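The dynamic uniform strategy highlighted above, a single price that smoothly decreases over time, can be illustrated with a toy simulation. This is not the paper's algorithm: the halving schedule, the unit-demand buyers (a drastic simplification of the subadditive setting), and all function names below are assumptions for illustration only.

```python
def geometric_price_schedule(h, rounds):
    """Hypothetical dynamic uniform schedule: start at the highest
    possible valuation h and halve the (single, uniform) price each round."""
    return [h * (0.5 ** t) for t in range(rounds)]

def simulate_revenue(buyer_values, prices):
    """Unit-demand buyers arrive in sequence; buyer t buys one item
    iff her value meets the current uniform price prices[t].
    Returns the total revenue collected."""
    revenue = 0.0
    for value, price in zip(buyer_values, prices):
        if value >= price:
            revenue += price
    return revenue

prices = geometric_price_schedule(10, 4)        # [10, 5, 2.5, 1.25]
print(simulate_revenue([10, 6, 3, 1], prices))  # prints 17.5
```

The decreasing schedule lets high-value buyers pay near their value early while still serving lower-value buyers later, which is the intuition behind a single smoothly decaying price.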

FOCS Conference 2009 Conference Paper

On Allocating Goods to Maximize Fairness

  • Deeparnab Chakrabarty
  • Julia Chuzhoy
  • Sanjeev Khanna

We consider the Max-Min Allocation problem: given a set $A$ of $m$ agents and a set $I$ of $n$ items, where agent $a \in A$ has utility $u_{a,i}$ for item $i \in I$, our goal is to allocate items to agents so as to maximize fairness. Specifically, the utility of an agent is the sum of its utilities for the items it receives, and we seek to maximize the minimum utility of any agent. While this problem has received much attention recently, its approximability has not been well-understood thus far: the best known approximation algorithm achieves an $\tilde{O}(\sqrt{m})$-approximation, and in contrast, the best known hardness of approximation stands at 2. Our main result is an algorithm that achieves an $\tilde{O}(n^{\varepsilon})$-approximation for any $\varepsilon = \Omega((\log\log n)/(\log n))$ in time $n^{O(1/\varepsilon)}$. In particular, we obtain a poly-logarithmic approximation in quasi-polynomial time, and for every constant $\varepsilon > 0$, we obtain an $\tilde{O}(n^{\varepsilon})$-approximation in polynomial time. An interesting technical aspect of our algorithm is that we use as a building block a linear program whose integrality gap is $\Omega(\sqrt{m})$. We bypass this obstacle by iteratively using the solutions produced by the LP to construct new instances with significantly smaller integrality gaps, eventually obtaining the desired approximation. As a corollary of our main result, we also show that for any constant $\varepsilon > 0$, an $O(m^{\varepsilon})$-approximation can be achieved in quasi-polynomial time. We also investigate the special case of the problem where every item has non-zero utility for at most two agents. This problem is hard to approximate to within any factor better than 2. We give a factor-2 approximation algorithm.

FOCS Conference 2008 Conference Paper

Algorithms for Single-Source Vertex Connectivity

  • Julia Chuzhoy
  • Sanjeev Khanna

In the survivable network design problem (SNDP), the goal is to find a minimum-cost subset of edges that satisfies a given set of pairwise connectivity requirements among the vertices. This general network design framework has been studied extensively and is tied to the development of major algorithmic techniques. For the edge-connectivity version of the problem, a 2-approximation algorithm is known for arbitrary pairwise connectivity requirements. However, no non-trivial algorithms are known for its vertex-connectivity counterpart. In fact, even highly restricted special cases of the vertex-connectivity version remain poorly understood. We study the single-source k-vertex connectivity version of SNDP. We are given a graph $G = (V, E)$ with a subset $T$ of terminals and a source vertex $s$, and the goal is to find a minimum-cost subset of edges ensuring that every terminal is k-vertex connected to $s$. Our main result is an $O(k \log n)$-approximation algorithm for this problem; this improves upon the recent $2^{O(k^2)} \log^4 n$-approximation. Our algorithm is based on an intuitive rerouting scheme. The analysis relies on a structural result that may be of independent interest: we show that any solution can be decomposed into a disjoint collection of multiple-legged spiders, which are then used to re-route flow from terminals to the source via other terminals. We also obtain the first non-trivial approximation algorithm for the vertex-cost version of the same problem, achieving an $O(k^7 \log^2 n)$-approximation.

FOCS Conference 2005 Conference Paper

Hardness of the Undirected Edge-Disjoint Paths Problem with Congestion

  • Matthew Andrews
  • Julia Chuzhoy
  • Sanjeev Khanna
  • Lisa Zhang 0001

In the edge-disjoint paths problem with congestion (EDPwC), we are given a graph with $n$ nodes, a set of terminal pairs, and an integer $c$. The objective is to route as many terminal pairs as possible, subject to the constraint that at most $c$ demands can be routed through any edge in the graph. When $c = 1$, the problem is simply referred to as the edge-disjoint paths (EDP) problem. In this paper, we study the hardness of EDPwC in undirected graphs. We obtain an improved hardness result for EDP, and also show the first polylogarithmic integrality gaps and hardness of approximation results for EDPwC. Specifically, we prove that EDP is $\log^{1/2 - \varepsilon} n$-hard to approximate for any constant $\varepsilon > 0$, unless $NP \subseteq ZPTIME(n^{\mathrm{polylog}\, n})$. We also show that for any congestion $c = o(\log\log n/\log\log\log n)$, there is no $\log^{(1-\varepsilon)/(c+1)} n$-approximation algorithm for EDPwC, unless $NP \subseteq ZPTIME(n^{\mathrm{polylog}\, n})$. For larger congestion, where $c \leq \eta \log\log n/\log\log\log n$ for some constant $\eta$, we obtain superconstant inapproximability ratios. All of our hardness results can be converted into integrality gaps for the multicommodity flow relaxation. We also present a separate elementary direct proof of this integrality gap result. Finally, we note that similar results can be obtained for the all-or-nothing flow (ANF) problem, a relaxation of EDP in which the flow unit routed between a source-sink pair does not have to follow a single path, so the resulting flow is not necessarily integral. Using standard transformations, our results also extend to the node-disjoint versions of these problems, as well as to the directed setting.

FOCS Conference 2004 Conference Paper

Edge-Disjoint Paths in Planar Graphs

  • Chandra Chekuri
  • Sanjeev Khanna
  • F. Bruce Shepherd

We study the maximum edge-disjoint paths problem (MEDP). We are given a graph $G = (V, E)$ and a set $T = \{s_1t_1, s_2t_2, \ldots, s_kt_k\}$ of pairs of vertices; the objective is to find the maximum number of pairs in $T$ that can be connected via edge-disjoint paths. Our main result is a poly-logarithmic approximation for MEDP on undirected planar graphs if a congestion of 2 is allowed, that is, we allow up to 2 paths to share an edge. Prior to our work, for any constant congestion, only a polynomial-factor approximation was known for planar graphs, although much stronger results are known for some special cases such as grids and grid-like graphs. We note that the natural multicommodity flow relaxation of the problem has an integrality gap of $\Omega(\sqrt{|V|})$ even on planar graphs when no congestion is allowed. Our starting point is the same relaxation, and our result implies that the integrality gap shrinks to a poly-logarithmic factor once 2 paths are allowed per edge. Our result also extends to the unsplittable flow problem and the maximum integer multicommodity flow problem. A set $X \subseteq V$ is well-linked if for each $S \subset V$, $|\delta(S)| \geq \min\{|S \cap X|, |(V - S) \cap X|\}$. The heart of our approach is to show that in any undirected planar graph, given any matching $M$ on a well-linked set $X$, we can route $\Omega(|M|)$ pairs in $M$ with a congestion of 2. Moreover, all pairs in $M$ can be routed with constant congestion for a sufficiently large constant. This result also yields a different proof of a theorem of Klein, Plotkin, and Rao that shows an $O(1)$ max-flow min-cut gap for uniform multicommodity flow instances in planar graphs. The framework developed in this paper applies to general graphs as well. If a certain graph-theoretic conjecture is true, it yields a poly-logarithmic integrality gap for MEDP with constant congestion.
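The well-linkedness condition in the abstract above is concrete enough to check directly on small graphs. A minimal brute-force sketch (exponential in $|V|$, so illustration only; the function name is my own, not from the paper):

```python
from itertools import combinations

def is_well_linked(vertices, edges, X):
    """Check the definition directly: for every nonempty proper subset
    S of V, the cut size |delta(S)| (edges with exactly one endpoint
    in S) must be at least min(|S ∩ X|, |(V − S) ∩ X|)."""
    V, X = set(vertices), set(X)
    for r in range(1, len(V)):
        for S in map(set, combinations(V, r)):
            cut = sum(1 for u, v in edges if (u in S) != (v in S))
            if cut < min(len(S & X), len((V - S) & X)):
                return False
    return True

# A 4-cycle is well-linked with respect to all four of its vertices...
print(is_well_linked(range(4), [(0, 1), (1, 2), (2, 3), (3, 0)], range(4)))  # True
# ...but two disjoint edges are not: S = {0, 1} has cut 0 yet min is 2.
print(is_well_linked(range(4), [(0, 1), (2, 3)], range(4)))  # False
```

Intuitively, the condition says no small cut can separate a large part of $X$ from the rest, which is exactly what makes a well-linked set a good "crossbar" for routing matchings.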

FOCS Conference 2004 Conference Paper

Machine Minimization for Scheduling Jobs with Interval Constraints

  • Julia Chuzhoy
  • Sudipto Guha
  • Sanjeev Khanna
  • Joseph Naor

The problem of scheduling jobs with interval constraints is a well-studied classical scheduling problem. The input to the problem is a collection of n jobs where each job has a set of intervals on which it can be scheduled. The goal is to minimize the total number of machines needed to schedule all jobs subject to these interval constraints. In the continuous version, the allowed intervals associated with a job form a continuous time segment, described by a release date and a deadline. In the discrete version of the problem, the set of allowed intervals for a job is given explicitly. So far, only an $O(\log n/\log\log n)$-approximation is known for either version of the problem, obtained by a randomized rounding of a natural linear programming relaxation of the problem. In fact, we show here that this analysis is tight for both versions of the problem by providing a matching lower bound on the integrality gap of the linear program. Moreover, even when all jobs can be scheduled on a single machine, the discrete case has recently been shown to be $\Omega(\log\log n)$-hard to approximate. In this paper, we provide improved approximation factors for the number of machines needed to schedule all jobs in the continuous version of the problem. Our main result is an $O(1)$-approximation algorithm when the optimal number of machines needed is bounded by a fixed constant. Thus, our results separate the approximability of the continuous and the discrete cases of the problem. For general instances, we strengthen the natural linear programming relaxation in a recursive manner by forbidding certain configurations which cannot arise in an integral feasible solution. This yields an $O(\mathrm{OPT})$-approximation, where OPT denotes the number of machines needed by an optimal solution. Combined with earlier results, our work implies an $O(\sqrt{\log n/\log\log n})$-approximation for any value of OPT.

STOC Conference 2004 Conference Paper

Multi-processor scheduling to minimize flow time with epsilon resource augmentation

  • Chandra Chekuri
  • Ashish Goel
  • Sanjeev Khanna
  • Amit Kumar 0001

We investigate the problem of online scheduling of jobs to minimize flow time and stretch on m identical machines. We consider the case where the algorithm is given either (1+ε)m machines or m machines of speed (1+ε), for arbitrarily small ε > 0. We show that simple randomized and deterministic load balancing algorithms, coupled with simple single-machine scheduling strategies such as SRPT (shortest remaining processing time) and SJF (shortest job first), are O(poly(1/ε))-competitive for both flow time and stretch. These are the first results which prove constant factor competitive ratios for flow time or stretch with arbitrarily small resource augmentation. Both the randomized and the deterministic load balancing algorithms are non-migratory and do immediate dispatch of jobs. The randomized algorithm just allocates each incoming job to a random machine. Hence this algorithm is non-clairvoyant, and coupled with SETF (shortest elapsed time first), yields the first non-clairvoyant algorithm which is constant competitive for minimizing flow time with arbitrarily small resource augmentation. The deterministic algorithm that we analyze is due to Avrahami and Azar. For this algorithm, we show O(1/ε)-competitiveness for total flow time and stretch, and also for their $L_p$ norms, for any fixed p ≥ 1.
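SRPT on a single machine, the per-machine strategy named above, is easy to simulate in unit time steps. A minimal sketch assuming integer release times and sizes (this is only the single-machine building block, not the paper's load-balancing layer; the function name is my own):

```python
import heapq

def srpt_total_flow_time(jobs):
    """jobs: list of (release, size) with non-negative integers.
    Simulates SRPT (always run the pending job with the smallest
    remaining work) on one machine, one unit of work per step.
    Returns the total flow time, i.e. sum of (completion - release)."""
    jobs = sorted(jobs)          # process arrivals in release order
    t, i, total = 0, 0, 0
    pending = []                 # min-heap of [remaining, release]
    while i < len(jobs) or pending:
        if not pending and jobs[i][0] > t:
            t = jobs[i][0]       # idle until the next release
        while i < len(jobs) and jobs[i][0] <= t:
            heapq.heappush(pending, [jobs[i][1], jobs[i][0]])
            i += 1
        rem, rel = heapq.heappop(pending)
        t += 1                   # run the chosen job for one step
        rem -= 1
        if rem == 0:
            total += t - rel     # job finishes at time t
        else:
            heapq.heappush(pending, [rem, rel])
    return total

# Job A: release 0, size 3; job B: release 1, size 1.
# SRPT preempts A for B: B finishes at 2 (flow 1), A at 4 (flow 4).
print(srpt_total_flow_time([(0, 3), (1, 1)]))  # prints 5
```

Running jobs in arrival order instead would give total flow time 6 on the same instance, which is the kind of gap that makes SRPT the natural single-machine choice here.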

IROS Conference 2003 Conference Paper

Target tracking with distributed sensors: the focus of attention problem

  • Volkan Isler
  • John R. Spletzer
  • Sanjeev Khanna
  • Camillo J. Taylor

In this paper, we investigate data fusion techniques for target tracking using distributed sensors. Specifically, we are interested in how pairs of bearing or range sensors can best be assigned to targets in order to minimize the expected error in the estimates. We refer to this as the focus of attention (FOA) problem. In its general form, FOA is NP-hard and not well approximable. However, for specific geometries we obtain significant approximation results: a 2-approximation algorithm for stereo cameras on a line, a PTAS when the cameras are equidistant, and a 1.42-approximation for equally spaced range sensors on a circle. By reposing the problem as a maximization problem, where the goal is to maximize the number of tracks with bounded error, we are able to leverage results from maximum set-packing to render the problem approximable. We demonstrate the results in simulation for a target tracking task, and for localizing a team of mobile agents in a sensor network. These results provide insights into sensor/target assignment strategies, as well as sensor placement in a distributed network.

FOCS Conference 1999 Conference Paper

Approximation Schemes for Minimizing Average Weighted Completion Time with Release Dates

  • Foto N. Afrati
  • Evripidis Bampis
  • Chandra Chekuri
  • David R. Karger
  • Claire Mathieu
  • Sanjeev Khanna
  • Ioannis Milis
  • Maurice Queyranne

We consider the problem of scheduling n jobs with release dates on m machines so as to minimize their average weighted completion time. We present the first known polynomial time approximation schemes for several variants of this problem. Our results include PTASs for the case of identical parallel machines and a constant number of unrelated machines, with and without preemption allowed. Our schemes are efficient: for all variants, the running time for a $(1+\varepsilon)$-approximation is of the form $f(1/\varepsilon, m)\,\mathrm{poly}(n)$.

FOCS Conference 1994 Conference Paper

On Syntactic versus Computational Views of Approximability

  • Sanjeev Khanna
  • Rajeev Motwani 0001
  • Madhu Sudan 0001
  • Umesh V. Vazirani

We attempt to reconcile the two distinct views of approximation classes: syntactic and computational. Syntactic classes such as MAX SNP permit structural results and have natural complete problems, while computational classes such as APX allow us to work with classes of problems whose approximability is well-understood. Our results provide a syntactic characterization of computational classes, and give a computational framework for syntactic classes.