Arrow Research search

Author name cluster

Alantha Newman

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

10 papers
2 author rows

Possible papers (10)

STOC 2025 Conference Paper

Solving the Correlation Cluster LP in Sublinear Time

  • Nairen Cao
  • Vincent Cohen-Addad
  • Euiwoong Lee
  • Shi Li 0001
  • David Rasmussen Lolck
  • Alantha Newman
  • Mikkel Thorup
  • Lukas Vogl

Correlation Clustering is a fundamental and widely-studied problem in unsupervised learning and data mining. The input is a graph and the goal is to construct a clustering minimizing the number of inter-cluster edges plus the number of missing intra-cluster edges. Cao, Cohen-Addad, Lee, Li, Newman, and Vogl [STOC 2024] introduced the cluster LP for Correlation Clustering, which they argued captures the problem much more succinctly than previous linear programming formulations. However, the cluster LP has exponential size, with a variable for every possible set of vertices in the input graph. Nevertheless, they showed how to find a feasible solution for the cluster LP in time O(n^{poly(1/ε)}) with objective value at most (1+ε) times the value of an optimal solution for the respective Correlation Clustering instance. Furthermore, they showed how to round a solution to the cluster LP, yielding a (1.437+ε)-approximation algorithm for the Correlation Clustering problem. The main technical result of this paper is a new approach to find a feasible solution for the cluster LP with objective value at most (1+ε) of the optimum in time O(2^{poly(1/ε)} · n), where n is the number of vertices in the graph. We also show how to implement the rounding within the same time bounds, thus achieving a fast (1.437+ε)-approximation algorithm for the Correlation Clustering problem. This bridges the gap between state-of-the-art methods for approximating Correlation Clustering and the recent focus on fast algorithms.
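For a concrete reference point, the Correlation Clustering objective described in the abstract can be evaluated by brute force on small instances. This sketch is illustrative only; the function name and graph representation are assumptions, not from the paper:

```python
from itertools import combinations

def correlation_clustering_cost(n, edges, clustering):
    """Objective from the abstract: inter-cluster edges plus missing
    intra-cluster edges.

    n          -- number of vertices, labelled 0..n-1
    edges      -- set of frozensets {u, v} present in the graph
    clustering -- dict mapping each vertex to a cluster id
    """
    cost = 0
    for u, v in combinations(range(n), 2):
        same_cluster = clustering[u] == clustering[v]
        has_edge = frozenset((u, v)) in edges
        if has_edge and not same_cluster:   # edge cut between clusters
            cost += 1
        if not has_edge and same_cluster:   # missing edge inside a cluster
            cost += 1
    return cost
```

For the path 0–1–2 placed in a single cluster, the only contribution is the missing edge {0, 2}, so the cost is 1.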

FOCS 2023 Conference Paper

Handling Correlated Rounding Error via Preclustering: A 1.73-approximation for Correlation Clustering

  • Vincent Cohen-Addad
  • Euiwoong Lee
  • Shi Li 0001
  • Alantha Newman

We consider the classic correlation clustering problem: Given a complete graph where edges are labelled either + or −, the goal is to find a partition of the vertices that minimizes the sum of the +edges across parts plus the sum of the −edges within parts. Recently, Cohen-Addad, Lee and Newman [CLN22] gave a 1.995-approximation for the problem using the Sherali-Adams hierarchy, hence beating the integrality gap of 2 of the classic linear program. We significantly improve upon this result by providing a 1.73-approximation for the problem. Our approach brings together a new preprocessing of correlation clustering instances that enables a new LP formulation which, combined with the algorithm from [CLN22], yields the improved bound.

FOCS 2022 Conference Paper

Correlation Clustering with Sherali-Adams

  • Vincent Cohen-Addad
  • Euiwoong Lee
  • Alantha Newman

Given a complete graph G=(V, E) where each edge is labeled + or −, the CORRELATION CLUSTERING problem asks to partition V into clusters to minimize the number of +edges between different clusters plus the number of −edges within the same cluster. CORRELATION CLUSTERING has been used to model a large number of clustering problems in practice, making it one of the most widely studied clustering formulations. The approximability of CORRELATION CLUSTERING has been actively investigated [BBC04], [CGW05], [ACN08], culminating in a 2.06-approximation algorithm [CMSY15], based on rounding the standard LP relaxation. Since the integrality gap for this formulation is 2, it has remained a major open question to determine if the approximation factor of 2 can be reached, or even breached. In this paper, we answer this question affirmatively by showing that there exists a $(1.994+\varepsilon)$-approximation algorithm based on $O(1/\varepsilon^{2})$ rounds of the Sherali-Adams hierarchy. In order to round a solution to the Sherali-Adams relaxation, we adapt the correlated rounding originally developed for CSPs [BRS11], [GS11], [RT12]. With this tool, we reach an approximation ratio of $2+\varepsilon$ for CORRELATION CLUSTERING. To breach this ratio, we go beyond the traditional triangle-based analysis by employing a global charging scheme that amortizes the total cost of the rounding across different triangles.

FOCS 2012 Conference Paper

Beck's Three Permutations Conjecture: A Counterexample and Some Consequences

  • Alantha Newman
  • Ofer Neiman
  • Aleksandar Nikolov

Given three permutations on the integers 1 through n, consider the set system consisting of each interval in each of the three permutations. In 1982, Beck conjectured that the discrepancy of this set system is O(1). In other words, the conjecture says that each integer from 1 through n can be colored either red or blue so that the number of red and blue integers in each interval of each permutation differs only by a constant. (The discrepancy of a set system based on two permutations is at most two.) Our main result is a counterexample to this conjecture: for any positive integer n = 3^k, we construct three permutations whose corresponding set system has discrepancy Ω(log n). Our counterexample is based on a simple recursive construction, and our proof of the discrepancy lower bound is by induction. This construction also disproves a generalization of Beck's conjecture due to Spencer, Srinivasan and Tetali, who conjectured that a set system corresponding to ℓ permutations has discrepancy O(√ℓ). Our work was inspired by an intriguing paper from SODA 2011 by Eisenbrand, Palvolgyi and Rothvoß, who show a surprising connection between the discrepancy of three permutations and the bin packing problem: They show that Beck's conjecture implies a constant worst-case bound on the additive integrality gap for the Gilmore-Gomory LP relaxation for bin packing in the special case when all items have sizes strictly between 1/4 and 1/2, also known as the three partition problem. Our counterexample shows that this approach to bounding the additive integrality gap for bin packing will not work. We can, however, prove an interesting implication of our construction in the reverse direction: there are instances of bin packing and corresponding optimal basic feasible solutions for the Gilmore-Gomory LP relaxation such that any packing that contains only patterns from the support of these solutions requires at least opt + Ω(log m) bins, where m is the number of items. Finally, we discuss some implications that our construction has for other areas of discrepancy theory.
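To make the discrepancy notion concrete: the imbalance of a red/blue coloring over every interval of a permutation can be computed from prefix sums. A small illustrative sketch, unrelated to the paper's recursive construction (names are my own):

```python
def interval_discrepancy(perms, coloring):
    """Max |#red - #blue| over every contiguous interval of each permutation.

    perms    -- list of permutations of the same elements
    coloring -- dict mapping each element to +1 (red) or -1 (blue)

    The imbalance of the interval (i, j] equals prefix[j] - prefix[i],
    so the worst interval of one permutation is max(prefix) - min(prefix).
    """
    worst = 0
    for perm in perms:
        prefix, running = [0], 0
        for x in perm:
            running += coloring[x]
            prefix.append(running)
        worst = max(worst, max(prefix) - min(prefix))
    return worst
```

An alternating coloring of a single permutation achieves discrepancy 1; the paper shows that for three suitably constructed permutations no coloring can do better than Ω(log n).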

SODA 2011 Conference Paper

Tight Hardness Results for Minimizing Discrepancy

  • Moses Charikar
  • Alantha Newman
  • Aleksandar Nikolov

In the Discrepancy problem, we are given M sets {S_1, …, S_M} on N elements. Our goal is to find an assignment χ of {−1, +1} values to elements, so as to minimize the maximum discrepancy. Recently, Bansal gave an efficient algorithm for achieving O(√N) discrepancy for any set system where M = O(N) [Ban10], giving a constructive version of Spencer's proof that the discrepancy of any set system is at most O(√N) for this range of M [Spe85]. We show that from the perspective of computational efficiency, these results are tight for general set systems where M = O(N). Specifically, we show that it is NP-hard to distinguish between such set systems with discrepancy zero and those with discrepancy Ω(√N). This means that even if the optimal solution has discrepancy zero, we cannot hope to efficiently find a coloring with discrepancy o(√N). We also consider the hardness of the Discrepancy problem on sets with bounded shatter function, and show that the upper bounds due to Matoušek [Mat95] are tight for these set systems as well. The hardness results in both settings are obtained from a common framework: we compose a family of high discrepancy set systems with set systems for which it is NP-hard to distinguish instances with discrepancy zero from instances in which a large number of the sets (i.e., a constant fraction of the sets) have non-zero discrepancy. Our composition amplifies this zero versus non-zero gap.
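The zero-versus-large decision version studied in the hardness result can be checked exhaustively on tiny instances. The following brute force is for intuition only (exponential in N; names are my own, not from the paper):

```python
from itertools import product

def min_discrepancy(sets, n):
    """Exact minimum discrepancy: try all 2^n colorings chi: [n] -> {-1, +1}
    and return the best achievable maximum set imbalance."""
    best = float("inf")
    for chi in product((-1, 1), repeat=n):
        disc = max(abs(sum(chi[e] for e in s)) for s in sets)
        best = min(best, disc)
    return best
```

The set system {{0, 1}, {1, 2}} on three elements has discrepancy 0 (color element 1 opposite to 0 and 2); the hardness result says that distinguishing discrepancy 0 from Ω(√N) is NP-hard in general, so no efficient algorithm can replace this exhaustive search.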

TCS 2007 Journal Article

Decision-making based on approximate and smoothed Pareto curves

  • Heiner Ackermann
  • Alantha Newman
  • Heiko Röglin
  • Berthold Vöcking

We consider bicriteria optimization problems and investigate the relationship between two standard approaches to solving them: (i) computing the Pareto curve and (ii) the so-called decision-maker’s approach in which both criteria are combined into a single (usually nonlinear) objective function. Previous work by Papadimitriou and Yannakakis showed how to efficiently approximate the Pareto curve for problems like Shortest Path, Spanning Tree, and Perfect Matching. We wish to determine for which classes of combined objective functions the approximate Pareto curve also yields an approximate solution to the decision-maker’s problem. We show that an FPTAS for the Pareto curve also gives an FPTAS for the decision-maker’s problem if the combined objective function is growth bounded like a quasi-polynomial function. If the objective function, however, shows exponential growth then the decision-maker’s problem is NP-hard to approximate within any polynomial factor. In order to bypass these limitations of approximate decision making, we turn our attention to Pareto curves in the probabilistic framework of smoothed analysis. We show that in a smoothed model, we can efficiently generate the (complete and exact) Pareto curve with a small failure probability if there exists an algorithm for generating the Pareto curve whose worst-case running time is pseudopolynomial. This way, we can solve the decision-maker’s problem w.r.t. any non-decreasing objective function for randomly perturbed instances of, e.g., Shortest Path, Spanning Tree, and Perfect Matching.
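The relationship between the two approaches can be illustrated on explicit finite point sets: once the Pareto curve is available, the decision-maker's problem with a non-decreasing combined objective reduces to a minimization over the curve. A minimal sketch (function names are mine, not from the paper):

```python
def pareto_front(points):
    """Pareto-optimal points of a finite set of (cost1, cost2) pairs,
    with both criteria to be minimized."""
    front = []
    for p in sorted(points):                  # ascending in cost1, then cost2
        if not front or p[1] < front[-1][1]:  # keep only strict improvements in cost2
            front.append(p)
    return front

def decision_maker_optimum(points, combine):
    """Decision-maker's approach: minimize a combined objective; for a
    non-decreasing combination the optimum is attained on the Pareto curve."""
    return min(pareto_front(points), key=lambda p: combine(*p))
```

For the points (1, 5), (2, 3), (3, 4), (4, 1), the point (3, 4) is dominated by (2, 3) and drops off the front; minimizing the product of the two costs over the front selects (4, 1).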

STOC 2005 Conference Paper

Aggregating inconsistent information: ranking and clustering

  • Nir Ailon
  • Moses Charikar
  • Alantha Newman

We address optimization problems in which we are given contradictory pieces of input information and the goal is to find a globally consistent solution that minimizes the number of disagreements with the respective inputs. Specifically, the problems we address are rank aggregation, the feedback arc set problem on tournaments, and correlation and consensus clustering. We show that for all these problems (and various weighted versions of them), we can obtain improved approximation factors using essentially the same remarkably simple algorithm. Additionally, we almost settle a long-standing conjecture of Bang-Jensen and Thomassen and show that unless NP⊆BPP, there is no polynomial time algorithm for the problem of minimum feedback arc set in tournaments.
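For intuition, the "remarkably simple" algorithm family referred to here is pivot-based: repeatedly pick a random pivot and cluster it with its remaining agreeing neighbours. A hedged sketch of this idea for the correlation clustering case (representation and names are my own, and this omits the paper's analysis and weighted variants):

```python
import random

def pivot_clustering(vertices, plus_edges, rng=None):
    """Randomized pivot clustering: choose a random pivot, place it in a
    cluster with all remaining vertices joined to it by a + edge, remove
    that cluster, and recurse on what is left.

    plus_edges -- set of frozensets {u, v} labelled '+'
    """
    rng = rng or random.Random()
    remaining = list(vertices)
    clusters = []
    while remaining:
        pivot = rng.choice(remaining)
        cluster = [pivot] + [v for v in remaining
                             if v != pivot and frozenset((pivot, v)) in plus_edges]
        clusters.append(cluster)
        remaining = [v for v in remaining if v not in cluster]
    return clusters
```

Whatever pivots are chosen, the output is always a partition of the vertex set; the approximation guarantees come from bounding the expected number of disagreements relative to the optimum.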