Arrow Research search

Author name cluster

Slobodan Mitrovic

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

18 papers
2 author rows

Possible papers

18

ICML Conference 2025 Conference Paper

Breaking the $n^{1.5}$ Additive Error Barrier for Private and Efficient Graph Sparsification via Private Expander Decomposition

  • Anders Aamand
  • Justin Y. Chen
  • Mina Dalirrooyfard
  • Slobodan Mitrovic
  • Yuriy Nevmyvaka
  • Sandeep Silwal
  • Yinzhan Xu

We study differentially private algorithms for graph cut sparsification, a fundamental problem in algorithms, privacy, and machine learning. While significant progress has been made, the best-known private and efficient cut sparsifiers on $n$-node graphs approximate each cut within $\widetilde{O}(n^{1.5})$ additive error and $1+\gamma$ multiplicative error for any $\gamma > 0$ [Gupta, Roth, Ullman TCC’12]. In contrast, inefficient algorithms, i.e., those requiring exponential time, can achieve an $\widetilde{O}(n)$ additive error and $1+\gamma$ multiplicative error [Eliáš, Kapralov, Kulkarni, Lee SODA’20]. In this work, we break the $n^{1.5}$ additive error barrier for private and efficient cut sparsification. We present an $(\varepsilon, \delta)$-DP polynomial time algorithm that, given a non-negative weighted graph, outputs a private synthetic graph approximating all cuts with multiplicative error $1+\gamma$ and additive error $n^{1.25 + o(1)}$ (ignoring dependencies on $\varepsilon, \delta, \gamma$). At the heart of our approach lies a private algorithm for expander decomposition, a popular and powerful technique in (non-private) graph algorithms.
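For intuition on the privacy primitive behind results like this, here is a minimal, illustrative sketch of the Laplace mechanism applied to a single cut query. This is not the paper's algorithm (which releases a synthetic graph answering all cuts at once); it assumes edge-weight neighboring, under which one cut query has sensitivity 1, and the function names are hypothetical.

```python
import math
import random

def laplace_noise(scale, rng):
    # Inverse-CDF sampling of the Laplace(0, scale) distribution.
    u = rng.random() - 0.5
    return -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)

def private_cut_value(edges, S, eps, seed=0):
    # Illustrative eps-DP answer to ONE cut query: neighboring graphs are
    # assumed to differ by 1 in a single edge weight, so the value of any
    # fixed cut has sensitivity 1 and Lap(1/eps) noise suffices.
    rng = random.Random(seed)
    true_cut = sum(w for (u, v, w) in edges if (u in S) != (v in S))
    return true_cut + laplace_noise(1.0 / eps, rng)
```

A per-query mechanism like this cannot privately answer all $2^n$ cut queries with small error; the paper's contribution is a single synthetic graph that approximates every cut simultaneously with $n^{1.25+o(1)}$ additive error.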

NeurIPS Conference 2025 Conference Paper

Differentially Private Gomory-Hu Trees

  • Anders Aamand
  • Justin Chen
  • Mina Dalirrooyfard
  • Slobodan Mitrovic
  • Yuriy Nevmyvaka
  • Sandeep Silwal
  • Yinzhan Xu

Given an undirected, weighted $n$-vertex graph $G = (V, E, w)$, a Gomory-Hu tree $T$ is a weighted tree on $V$ that preserves the Min-$s$-$t$-Cut between any pair of vertices $s, t \in V$. Finding cuts in graphs is a key primitive in problems such as bipartite matching, spectral and correlation clustering, and community detection. We design a differentially private (DP) algorithm that computes an approximate Gomory-Hu tree. Our algorithm is $\varepsilon$-DP, runs in polynomial time, and can be used to compute $s$-$t$ cuts that are $\tilde{O}(n/\varepsilon)$-additive approximations of the Min-$s$-$t$-Cuts in $G$ for all distinct $s, t \in V$ with high probability. Our error bound is essentially optimal, since [Dalirrooyfard, Mitrović and Nevmyvaka, NeurIPS 2023] showed that privately outputting a single Min-$s$-$t$-Cut requires $\Omega(n)$ additive error even with $(\varepsilon, \delta)$-DP and allowing for multiplicative error. Prior to our work, the best additive error bounds for approximate all-pairs Min-$s$-$t$-Cuts were $O(n^{3/2}/\varepsilon)$ for $\varepsilon$-DP [Gupta, Roth, Ullman, TCC 2012] and $\tilde{O}(\sqrt{mn}/\varepsilon)$ for $(\varepsilon, \delta)$-DP [Liu, Upadhyay and Zou, SODA 2024], both achieved by DP algorithms that preserve all cuts in the graph. To achieve our result, we develop an $\varepsilon$-DP algorithm for the Minimum Isolating Cuts problem with near-linear error, and introduce a novel privacy composition technique combining elements of both parallel and basic composition to handle 'bounded overlap' computational branches in recursive algorithms, which may be of independent interest.

NeurIPS Conference 2025 Conference Paper

New Parallel and Streaming Algorithms for Directed Densest Subgraph

  • Slobodan Mitrovic
  • Theodore Pan
  • Mahdi Qaempanah
  • Mohammad Amin Raeisi

Finding dense subgraphs is a fundamental problem with applications to community detection, clustering, and data mining. Our work focuses on finding approximate densest subgraphs in directed graphs in computational models for processing massive data. We consider two such models: Massively Parallel Computation (MPC) and semi-streaming. We show how to find a $(2+\varepsilon)$-approximation in $\tilde{O}(\sqrt{\log n})$ MPC rounds with sublinear memory per machine. This improves the state-of-the-art results by Bahmani et al. (WAW 2014) and Mitrović & Pan (ICML 2024). Moreover, we show how to find an $O(\log n)$-approximation in a single pass in semi-streaming. This is in stark contrast to prior work, which implies $\tilde{\Omega}(n^{1/6})$-approximation for a single pass; a better approximation is known only for randomized streams (Mitrović & Pan). This is the first deterministic single-pass semi-streaming algorithm for the densest subgraph problem, both for undirected and directed graphs. Our semi-streaming approach is also an insertion-only dynamic algorithm, attaining the first directed densest subgraph algorithm with $O(\log^2 n)$ worst-case update time while using sub-linear memory. We empirically evaluate our approaches in two ways. First, we illustrate that our single-pass semi-streaming algorithm performs much better than the theoretical guarantee. Specifically, its approximation on temporal datasets matches the $(2+\varepsilon)$-approximation of an $O(\log n)$-pass algorithm by Bahmani et al. (VLDB 2012). Second, we demonstrate that our MPC algorithm requires fewer rounds than prior work.

ICML Conference 2025 Conference Paper

Sparse-pivot: Dynamic correlation clustering for node insertions

  • Mina Dalirrooyfard
  • Konstantin Makarychev
  • Slobodan Mitrovic

We present a new Correlation Clustering algorithm for a dynamic setting where nodes are added one at a time. In this model, proposed by Cohen-Addad, Lattanzi, Maggiori, and Parotsidis (ICML 2024), the algorithm uses database queries to access the input graph and updates the clustering as each new node is added. Our algorithm has an amortized update time of $\log^{O(1)}(n)$. Its approximation factor is $20+\varepsilon$, which is a substantial improvement over the approximation factor of the algorithm by Cohen-Addad et al. We complement our theoretical findings by empirically evaluating the approximation guarantee of our algorithm. The results show that it outperforms the algorithm by Cohen-Addad et al. in practice.

ICML Conference 2024 Conference Paper

Faster Streaming and Scalable Algorithms for Finding Directed Dense Subgraphs in Large Graphs

  • Slobodan Mitrovic
  • Theodore Pan

Finding dense subgraphs is a fundamental algorithmic tool in data mining, community detection, and clustering. In this problem, the aim is to find an induced subgraph whose edge-to-vertex ratio is maximized. We show how to find a $(2+\epsilon)$-approximation of the directed densest subgraph on randomized streams in a single pass while using $O(n \cdot {\rm poly} \log n)$ memory on $n$-vertex graphs. In contrast, the approach by Bahmani et al. (VLDB 2012) uses $O(\log n)$ passes, and that of Esfandiari et al. (2015) makes one pass but uses $O(n^{3/2})$ memory; both algorithms also apply to arbitrary-ordered streams. Our techniques extend to Massively Parallel Computation (MPC), yielding a quadratic improvement over the state-of-the-art by Bahmani et al. (VLDB 2012 and WAW 2014). We empirically show that the quality of our output is essentially the same as that of Bahmani et al. (VLDB 2012) while being $2$ times faster on large graphs, even on non-randomly ordered streams.

ICML Conference 2024 Conference Paper

Pruned Pivot: Correlation Clustering Algorithm for Dynamic, Parallel, and Local Computation Models

  • Mina Dalirrooyfard
  • Konstantin Makarychev
  • Slobodan Mitrovic

Given a graph with positive and negative edge labels, the correlation clustering problem aims to cluster the nodes so as to minimize the total number of between-cluster positive and within-cluster negative edges. This problem has many applications in data mining, particularly in unsupervised learning. Inspired by the prevalence of large graphs and constantly changing data in modern applications, we study correlation clustering in dynamic, parallel (MPC), and local computation (LCA) settings. We design an approach that improves state-of-the-art runtime complexities in all these settings. In particular, we provide the first fully dynamic algorithm that runs in expected amortized constant time, without any dependence on the graph size. Moreover, our algorithm essentially matches the approximation guarantee of the celebrated Pivot algorithm.
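For context, the classic Pivot algorithm (Ailon, Charikar, and Newman), whose approximation guarantee the abstract refers to, can be sketched in a few lines. This is the static baseline, not the authors' pruned dynamic variant; the names and the graph representation (a dict mapping each node to its positive neighbors) are illustrative.

```python
import random

def pivot_correlation_clustering(nodes, positive, seed=0):
    # Classic Pivot: repeatedly pick a random unclustered node as the
    # pivot and cluster it together with its still-unclustered positive
    # neighbors. Expected approximation factor 3 on complete graphs.
    rng = random.Random(seed)
    order = list(nodes)
    rng.shuffle(order)              # random pivot order drives the guarantee
    clustered = set()
    clusters = []
    for v in order:
        if v in clustered:
            continue
        cluster = {v} | {u for u in positive.get(v, ()) if u not in clustered}
        clustered |= cluster
        clusters.append(cluster)
    return clusters
```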

NeurIPS Conference 2023 Conference Paper

Nearly Tight Bounds For Differentially Private Multiway Cut

  • Mina Dalirrooyfard
  • Slobodan Mitrovic
  • Yuriy Nevmyvaka

Finding min $s$-$t$ cuts in graphs is a basic algorithmic tool, with applications in image segmentation, community detection, reinforcement learning, and data clustering. In this problem, we are given two nodes as terminals and the goal is to remove the smallest number of edges from the graph so that these two terminals are disconnected. We study the complexity of differential privacy for the min $s$-$t$ cut problem and show nearly tight lower and upper bounds where we achieve privacy at no cost for running time efficiency. We also develop a differentially private algorithm for the multiway $k$-cut problem, in which we are given $k$ nodes as terminals that we would like to disconnect. As a function of $k$, we obtain privacy guarantees that are exponentially more efficient than applying the advanced composition theorem to known algorithms for multiway $k$-cut. Finally, we empirically evaluate the approximation of our differentially private min $s$-$t$ cut algorithm and show that it almost matches the quality of the output of non-private ones.

STOC Conference 2022 Conference Paper

Deterministic (1+ε)-approximate maximum matching with poly(1/ε) passes in the semi-streaming model and beyond

  • Manuela Fischer
  • Slobodan Mitrovic
  • Jara Uitto

We present a deterministic $(1+\varepsilon)$-approximate maximum matching algorithm in $\mathrm{poly}(1/\varepsilon)$ passes in the semi-streaming model, solving the long-standing open problem of breaking the exponential barrier in the dependence on $1/\varepsilon$. Our algorithm exponentially improves on the well-known randomized $(1/\varepsilon)^{O(1/\varepsilon)}$-pass algorithm from the seminal work by McGregor [APPROX05] and the recent deterministic algorithm by Tirodkar with the same pass complexity [FSTTCS18]. Up to polynomial factors in $1/\varepsilon$, our work matches the state-of-the-art deterministic $(\log n / \log\log n) \cdot (1/\varepsilon)$-pass algorithm by Ahn and Guha [TOPC18], which is allowed a dependence on the number of nodes $n$. Our result also makes progress on Open Problem 60 at sublinear.info. Moreover, we design a general framework that simulates our approach for the streaming setting in other models of computation. This framework requires access to an algorithm computing a maximal matching and an algorithm for processing disjoint $(1/\varepsilon)$-size connected components. Instantiating our framework in CONGEST yields a $\mathrm{poly}(\log n, 1/\varepsilon)$-round algorithm for computing a $(1+\varepsilon)$-approximate maximum matching. In terms of the dependence on $1/\varepsilon$, this result exponentially improves on the state-of-the-art result by Lotker, Patt-Shamir, and Pettie [LPSP15]. Our framework leads to the same quality of improvement in the context of the Massively Parallel Computation model as well.
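The first subroutine the framework assumes, a maximal matching, has a simple sequential greedy sketch (illustrative only; the paper instantiates it with model-specific algorithms). A maximal matching is a 2-approximation to the maximum matching, which is why it serves as a useful building block.

```python
def maximal_matching(edges):
    # Greedy maximal matching: scan the edges once and take every edge
    # whose endpoints are both still unmatched. The result is maximal
    # (no edge can be added) and hence a 2-approximate maximum matching.
    matched = set()
    matching = []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched |= {u, v}
    return matching
```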

NeurIPS Conference 2022 Conference Paper

Near-Optimal Correlation Clustering with Privacy

  • Vincent Cohen-Addad
  • Chenglin Fan
  • Silvio Lattanzi
  • Slobodan Mitrovic
  • Ashkan Norouzi-Fard
  • Nikos Parotsidis
  • Jakub M. Tarnawski

Correlation clustering is a central problem in unsupervised learning, with applications spanning community detection, duplicate detection, automated labeling and many more. In the correlation clustering problem one receives as input a set of nodes and for each node a list of co-clustering preferences, and the goal is to output a clustering that minimizes the disagreement with the specified nodes' preferences. In this paper, we introduce a simple and computationally efficient algorithm for the correlation clustering problem with provable privacy guarantees. Our additive error is stronger than those obtained in prior work and is optimal up to polylogarithmic factors for fixed privacy parameters.

ICML Conference 2021 Conference Paper

Correlation Clustering in Constant Many Parallel Rounds

  • Vincent Cohen-Addad
  • Silvio Lattanzi
  • Slobodan Mitrovic
  • Ashkan Norouzi-Fard
  • Nikos Parotsidis
  • Jakub Tarnawski

Correlation clustering is a central topic in unsupervised learning, with many applications in ML and data mining. In correlation clustering, one receives as input a signed graph and the goal is to partition it to minimize the number of disagreements. In this work we propose a massively parallel computation (MPC) algorithm for this problem that is considerably faster than prior work. In particular, our algorithm uses machines with memory sublinear in the number of nodes in the graph and returns a constant approximation while running only for a constant number of rounds. To the best of our knowledge, our algorithm is the first that can provably approximate a clustering problem using only a constant number of MPC rounds in the sublinear memory regime. We complement our analysis with an experimental scalability evaluation of our techniques.

SODA Conference 2020 Conference Paper

Improved Local Computation Algorithm for Set Cover via Sparsification

  • Christoph Grunau
  • Slobodan Mitrovic
  • Ronitt Rubinfeld
  • Ali Vakilian

We design a Local Computation Algorithm (LCA) for the set cover problem. Given a set system where each set has size at most $s$ and each element is contained in at most $t$ sets, the algorithm reports whether a given set is in some fixed set cover whose expected size is $O(\log s)$ times the minimum fractional set cover value. Our algorithm requires $s^{O(\log s)} t^{O(\log s + \log\log t)}$ queries. This result improves upon the application of the reduction of [Parnas and Ron, TCS’07] to the result of [Kuhn et al., SODA’06], which leads to a query complexity of $(st)^{O(\log s \cdot \log t)}$. To obtain this result, we design a parallel set cover algorithm that admits an efficient simulation in the LCA model by using a sparsification technique introduced in [Ghaffari and Uitto, SODA’19] for the maximal independent set problem. The parallel algorithm adds a random subset of the sets to the solution in a style similar to the PRAM algorithm of [Berger et al., FOCS’89]. However, our algorithm differs in that it never revokes its decisions, which results in fewer adaptive rounds. This requires a novel approximation analysis which might be of independent interest.
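As a point of reference, the classical global greedy algorithm for set cover (logarithmic approximation) can be sketched as follows; an LCA like the one above answers per-set membership queries consistently with one fixed global solution without ever materializing it. Names are illustrative.

```python
def greedy_set_cover(universe, sets):
    # Classical greedy set cover: while elements remain uncovered, pick
    # the set covering the most uncovered elements. Returns the indices
    # of the chosen sets; achieves an O(log n) approximation.
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(range(len(sets)), key=lambda i: len(sets[i] & uncovered))
        if not sets[best] & uncovered:
            break                       # remaining elements are uncoverable
        chosen.append(best)
        uncovered -= sets[best]
    return chosen
```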

SODA Conference 2020 Conference Paper

Space Efficient Approximation to Maximum Matching Size from Uniform Edge Samples

  • Michael Kapralov
  • Slobodan Mitrovic
  • Ashkan Norouzi-Fard
  • Jakab Tardos

Given a source of iid samples of edges of an input graph $G$ with $n$ vertices and $m$ edges, how many samples does one need to compute a constant factor approximation to the maximum matching size in $G$? Moreover, is it possible to obtain such an estimate in a small amount of space? We show that this problem cannot be solved using a nontrivially sublinear (in $m$) number of samples: $m^{1-o(1)}$ samples are needed. On the other hand, $O(\log^2 n)$ bits of space suffice to compute an estimate. Our main technical tool is a new peeling type algorithm for matching and its local simulation. We show that a delicate balance between exploration depth and sampling rate allows our simulation to not lose precision over a logarithmic number of levels of recursion and achieve a constant factor approximation. Our algorithm also yields a constant factor approximate local computation algorithm (LCA) for matching with $O(d \log n)$ exploration starting from any vertex. Previous approaches were based on local simulations of randomized greedy, which take $O(d)$ time in expectation over the starting vertex or edge (Yoshida et al.’09, Onak et al.’12), and could not achieve a better than $d^2$ run-time. Interestingly, we also show that, unlike our algorithm, the local simulation of randomized greedy that is the basis of the most efficient prior results does take $\Omega(d^2) \gg O(d \log n)$ time for a worst-case edge.

ICML Conference 2019 Conference Paper

Improved Parallel Algorithms for Density-Based Network Clustering

  • Mohsen Ghaffari
  • Silvio Lattanzi
  • Slobodan Mitrovic

Clustering large-scale networks is a central topic in unsupervised learning with many applications in machine learning and data mining. A classic approach to cluster a network is to identify regions of high edge density, which in the literature is captured by two fundamental problems: the densest subgraph and the $k$-core decomposition problems. We design massively parallel computation (MPC) algorithms for these problems that are considerably faster than prior work. In the case of $k$-core decomposition, our work improves exponentially on the algorithm provided by Esfandiari et al. (ICML’18). Compared to the prior work on densest subgraph presented by Bahmani et al. (VLDB’12, ’14), our result requires quadratically fewer MPC rounds. We complement our analysis with an experimental scalability analysis of our techniques.
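The $k$-core decomposition that this paper parallelizes has a standard sequential peeling baseline: repeatedly remove a minimum-degree vertex, and record the running maximum of removal degrees as the core number. A minimal sketch, with an illustrative adjacency-dict representation:

```python
import heapq

def core_numbers(adj):
    # Sequential k-core peeling. `adj` maps each vertex to a set of
    # neighbors. Uses lazy deletion: stale heap entries (whose recorded
    # degree no longer matches the current degree) are skipped.
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    heap = [(d, v) for v, d in deg.items()]
    heapq.heapify(heap)
    removed, core, k = set(), {}, 0
    while heap:
        d, v = heapq.heappop(heap)
        if v in removed or d != deg[v]:
            continue                    # stale entry
        k = max(k, d)                   # core number = running max degree
        core[v] = k
        removed.add(v)
        for u in adj[v]:
            if u not in removed:
                deg[u] -= 1
                heapq.heappush(heap, (deg[u], u))
    return core
```

MPC algorithms such as the one above speed up exactly this inherently sequential peeling process.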

ICML Conference 2018 Conference Paper

Beyond 1/2-Approximation for Submodular Maximization on Massive Data Streams

  • Ashkan Norouzi-Fard
  • Jakub Tarnawski
  • Slobodan Mitrovic
  • Amir Zandieh
  • Aida Mousavifar
  • Ola Svensson

Many tasks in machine learning and data mining, such as data diversification, non-parametric learning, kernel machines, clustering, etc., require extracting a small but representative summary from a massive dataset. Often, such problems can be posed as maximizing a submodular set function subject to a cardinality constraint. We consider this question in the streaming setting, where elements arrive over time at a fast pace and thus we need to design an efficient, low-memory algorithm. One such method, proposed by Badanidiyuru et al. (2014), always finds a 0.5-approximate solution. Can this approximation factor be improved? We answer this question affirmatively by designing a new algorithm Salsa for streaming submodular maximization. It is the first low-memory, single-pass algorithm that improves the factor 0.5, under the natural assumption that elements arrive in a random order. We also show that this assumption is necessary, i.e., that there is no such algorithm with better than 0.5-approximation when elements arrive in arbitrary order. Our experiments demonstrate that Salsa significantly outperforms the state of the art in applications related to exemplar-based clustering, social graph analysis, and recommender systems.
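The thresholding idea behind sieve-style streaming algorithms such as the 0.5-approximate baseline of Badanidiyuru et al. (2014) can be sketched as: keep an arriving element iff its marginal gain clears a threshold τ (the actual algorithms run several guesses of τ, tied to OPT, in parallel). This single-threshold sketch is illustrative, not Salsa itself; `f` is any monotone submodular function given as a Python callable.

```python
def threshold_streaming(stream, k, f, tau):
    # One-pass thresholding for cardinality-constrained submodular
    # maximization: admit an element iff its marginal gain f(S+e) - f(S)
    # is at least tau, stopping once k elements are kept.
    S = []
    for e in stream:
        if len(S) >= k:
            break
        if f(S + [e]) - f(S) >= tau:
            S.append(e)
    return S
```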

ICML Conference 2017 Conference Paper

Robust Submodular Maximization: A Non-Uniform Partitioning Approach

  • Ilija Bogunovic
  • Slobodan Mitrovic
  • Jonathan Scarlett
  • Volkan Cevher

We study the problem of maximizing a monotone submodular function subject to a cardinality constraint $k$, with the added twist that a number of items $\tau$ from the returned set may be removed. We focus on the worst-case setting considered by Orlin et al. (2016), in which a constant-factor approximation guarantee was given for $\tau = o(\sqrt{k})$. In this paper, we solve a key open problem raised therein, presenting a new Partitioned Robust (PRo) submodular maximization algorithm that achieves the same guarantee for more general $\tau = o(k)$. Our algorithm constructs partitions consisting of buckets with exponentially increasing sizes, and applies standard submodular optimization subroutines on the buckets in order to construct the robust solution. We numerically demonstrate the performance of PRo in data summarization and influence maximization, demonstrating gains over both the greedy algorithm and the algorithm of Orlin et al. (2016).

NeurIPS Conference 2017 Conference Paper

Streaming Robust Submodular Maximization: A Partitioned Thresholding Approach

  • Slobodan Mitrovic
  • Ilija Bogunovic
  • Ashkan Norouzi-Fard
  • Jakub Tarnawski
  • Volkan Cevher

We study the classical problem of maximizing a monotone submodular function subject to a cardinality constraint k, with two additional twists: (i) elements arrive in a streaming fashion, and (ii) m items from the algorithm’s memory are removed after the stream is finished. We develop a robust submodular algorithm STAR-T. It is based on a novel partitioning structure and an exponentially decreasing thresholding rule. STAR-T makes one pass over the data and retains a short but robust summary. We show that after the removal of any m elements from the obtained summary, a simple greedy algorithm STAR-T-GREEDY that runs on the remaining elements achieves a constant-factor approximation guarantee. In two different data summarization tasks, we demonstrate that it matches or outperforms existing greedy and streaming methods, even if they are allowed the benefit of knowing the removed subset in advance.