Arrow Research search
Back to STOC

STOC 2025

Solving the Correlation Cluster LP in Sublinear Time

Conference Paper 7B Algorithms and Complexity · Theoretical Computer Science

Abstract

Correlation Clustering is a fundamental and widely-studied problem in unsupervised learning and data mining. The input is a graph and the goal is to construct a clustering minimizing the number of inter-cluster edges plus the number of missing intra-cluster edges. Cao, Cohen-Addad, Lee, Li, Newman, and Vogl [STOC 2024] introduced the cluster LP for Correlation Clustering, which they argued captures the problem much more succinctly than previous linear programming formulations. However, the cluster LP has exponential size, with a variable for every possible set of vertices in the input graph. Nevertheless, they showed how to find a feasible solution for the cluster LP in time O ( n poly(1/ε) ) with objective value at most (1+ε) times the value of an optimal solution for the respective Correlation Clustering instance. Furthermore, they showed how to round a solution to the cluster LP, yielding a (1.437+ε)-approximation algorithm for the Correlation Clustering problem. The main technical result of this paper is a new approach to find a feasible solution for the cluster LP with objective value at most (1+ε) of the optimum in time O (2 poly(1/ε) n ), where n is the number of vertices in the graph. We also show how to implement the rounding within the same time bounds, thus achieving a fast (1.437+ε)-approximation algorithm for the Correlation Clustering problem. This bridges the gap between state-of-the-art methods for approximating Correlation Clustering and the recent focus on fast algorithms.

Authors

Keywords

  • Approximation Algorithms
  • Clustering
  • Exponential-Size Linear Programming
  • MPC Algorithm
  • Sublinear-Time Algorithm

Context

Venue
ACM Symposium on Theory of Computing
Archive span
1969-2025
Indexed papers
4364
Paper id
1005006028389429210