Arrow Research

Author name cluster

Navid Nouri

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
2 author rows

Possible papers (5)

UAI 2023 · Conference Paper

Differentially private synthetic data using KD-trees

  • Eleonora Kreacic
  • Navid Nouri
  • Vamsi K. Potluru
  • Tucker R. Balch
  • Manuela Veloso

Creation of a synthetic dataset that faithfully represents the data distribution and simultaneously preserves privacy is a major research challenge. Many space-partitioning-based approaches have emerged in recent years for answering statistical queries in a differentially private manner. However, for the synthetic data generation problem, recent research has mainly focused on deep generative models. In contrast, we exploit space partitioning techniques together with noise perturbation, and thus achieve intuitive and transparent algorithms. We propose both data-independent and data-dependent algorithms for $\epsilon$-differentially private synthetic data generation whose kernel density resembles that of the real dataset. Additionally, we provide theoretical results on the utility-privacy trade-offs and show how our data-dependent approach overcomes the curse of dimensionality and leads to a scalable algorithm. We show empirical utility improvements over prior work, and discuss the performance of our algorithm on a downstream classification task on a real dataset.
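
The "space partitioning + noise perturbation" recipe the abstract contrasts with deep generative models can be made concrete in a few lines. Below is a minimal hypothetical sketch of the data-independent flavor, not the paper's KD-tree algorithm: a fixed grid partition, Laplace noise on cell counts, then sampling from the noisy histogram. The function name, grid width, and clipping step are illustrative choices.

```python
# Minimal sketch, assuming data scaled to the unit cube. This is the
# data-independent flavor of "space partitioning + noise perturbation"
# (a fixed grid rather than the paper's KD-trees); names and parameters
# are illustrative.
import numpy as np

def dp_synthetic_data(data, epsilon, bins_per_dim=8, rng=None):
    """data: (n, d) array in [0, 1]^d. Returns (n, d) synthetic points."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = data.shape
    # Histogram counts on a fixed, data-independent grid.
    edges = [np.linspace(0.0, 1.0, bins_per_dim + 1)] * d
    counts, _ = np.histogramdd(data, bins=edges)
    # Each point lands in exactly one cell, so the counts have sensitivity 1:
    # adding Laplace(1/epsilon) noise per cell yields epsilon-DP counts.
    noisy = np.clip(counts + rng.laplace(scale=1.0 / epsilon, size=counts.shape),
                    0.0, None)
    probs = noisy.ravel() + 1e-12          # guard against an all-zero histogram
    probs /= probs.sum()
    # Post-processing (privacy-free): sample cells from the noisy histogram,
    # then sample points uniformly inside each chosen cell.
    cells = rng.choice(probs.size, size=n, p=probs)
    idx = np.stack(np.unravel_index(cells, counts.shape), axis=1)  # (n, d)
    return (idx + rng.uniform(size=(n, d))) / bins_per_dim
```

Everything after the noise addition is post-processing of the private counts, so the $\epsilon$-DP guarantee is preserved; the data-dependent variants in the paper differ in how the cells are chosen, not in this basic pattern.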

NeurIPS 2021 · Conference Paper

Efficient and Local Parallel Random Walks

  • Michael Kapralov
  • Silvio Lattanzi
  • Navid Nouri
  • Jakab Tardos

Random walks are a fundamental primitive used in many machine learning algorithms, with several applications in clustering and semi-supervised learning. Despite their relevance, the first efficient parallel algorithm to compute random walks was introduced only very recently (Łącki et al.). Unfortunately, their method has a fundamental shortcoming: their algorithm is non-local, in that it heavily relies on computing random walks out of all nodes in the input graph, even though in many practical applications one is interested in computing random walks only from a small subset of nodes in the graph. In this paper, we present a new algorithm that overcomes this limitation by building random walks efficiently and locally at the same time. We show that our technique is both memory and round efficient, and in particular yields an efficient parallel local clustering algorithm. Finally, we complement our theoretical analysis with experimental results showing that our algorithm is significantly more scalable than previous approaches.
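
To make the locality requirement concrete, here is a hypothetical sketch of walks run only from a seed set, so that only visited nodes are ever touched. It is deliberately naive: a step-by-step loop takes one round per step, whereas the paper's contribution is achieving locality together with round efficiency by stitching shorter walks; all names below are illustrative.

```python
# Minimal sketch of walks that touch only the nodes they actually visit.
# Deliberately naive: this step-by-step loop needs `length` sequential
# rounds, while the paper achieves locality *and* round efficiency by
# stitching shorter walks together. All names here are illustrative.
import random
from collections import defaultdict

def local_random_walks(adj, seeds, length, walks_per_seed=10, rng=None):
    """adj: dict node -> list of neighbors. Returns visit counts per node."""
    rng = rng or random.Random(0)
    visits = defaultdict(int)
    for s in seeds:
        for _ in range(walks_per_seed):
            v = s
            for _ in range(length):
                nbrs = adj[v]
                if not nbrs:              # dangling node: stop this walk
                    break
                v = rng.choice(nbrs)
                visits[v] += 1            # only visited nodes are touched
    return visits
```

Visit frequencies from a single seed approximate its diffusion neighborhood, which is the quantity a local clustering routine would consume.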

SODA 2021 · Conference Paper

Graph Spanners by Sketching in Dynamic Streams and the Simultaneous Communication Model

  • Arnold Filtser
  • Michael Kapralov
  • Navid Nouri

Graph sketching is a powerful technique introduced by the seminal work of Ahn, Guha and McGregor'12 on connectivity in dynamic graph streams. It has enjoyed considerable attention in the literature since then, and has led to near-optimal dynamic streaming algorithms for many fundamental problems such as connectivity, cut and spectral sparsifiers, and matchings. Interestingly, however, the sketching and dynamic streaming complexity of approximating the shortest path metric of a graph is still far from well understood. Besides a direct $k$-pass implementation of classical spanner constructions (recently improved to roughly $k/2$ passes by Fernandez, Woodruff and Yasuda'20), the state of the art amounts to an $O(\log k)$-pass algorithm of Ahn, Guha and McGregor'12, and a 2-pass algorithm of Kapralov and Woodruff'14. In particular, no single-pass algorithm is known, and the optimal tradeoff between the number of passes, stretch and space complexity is open. In this paper we introduce several new graph sketching techniques for approximating the shortest path metric of the input graph. We give the first single-pass sketching algorithm for constructing graph spanners: we show how to obtain a $\tilde{O}(n^{2/3})$-spanner using $\tilde{O}(n)$ space, and in general a $\tilde{O}(n^{(2/3)(1-\alpha)})$-spanner using $\tilde{O}(n^{1+\alpha})$ space for every $\alpha \in [0, 1]$, a tradeoff that we think may be close to optimal. We also give new spanner construction algorithms for any number of passes, simultaneously improving upon all prior work on this problem. Finally, we note that unlike the original sketching approach of Ahn, Guha and McGregor'12, none of the existing spanner constructions yields simultaneous communication protocols with low per-player information. We give the first such protocols for the spanner problem that use a small number of rounds.
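
For context on the "classical spanner constructions" mentioned above, the following hypothetical sketch implements the textbook greedy $(2k-1)$-spanner for unweighted graphs: keep an edge only if the spanner built so far has no short path between its endpoints. This is background only; the paper's single-pass linear-sketching constructions work very differently.

```python
# Minimal sketch of the textbook greedy (2k-1)-spanner for unweighted
# graphs, shown for context only; it requires direct edge access rather
# than a single streaming pass.
from collections import deque, defaultdict

def greedy_spanner(edges, k):
    """edges: iterable of (u, v) pairs. Returns the kept spanner edges."""
    spanner = defaultdict(set)

    def within_hops(src, dst, limit):
        # BFS in the current spanner, truncated at `limit` hops.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            if dist[u] == limit:
                continue
            for w in spanner[u]:
                if w not in dist:
                    if w == dst:
                        return True
                    dist[w] = dist[u] + 1
                    queue.append(w)
        return False

    kept = []
    for u, v in edges:
        # Keep (u, v) only if the spanner so far has no u-v path of at
        # most 2k-1 hops; this bounds the stretch by 2k-1, and a girth
        # argument bounds the size by O(n^(1+1/k)).
        if not within_hops(u, v, 2 * k - 1):
            spanner[u].add(v)
            spanner[v].add(u)
            kept.append((u, v))
    return kept
```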

FOCS 2020 · Conference Paper

Kernel Density Estimation through Density Constrained Near Neighbor Search

  • Moses Charikar
  • Michael Kapralov
  • Navid Nouri
  • Paris Siminelakis

In this paper we revisit the kernel density estimation problem: given a kernel $K(x, y)$ and a dataset $P$ of $n$ points in high-dimensional Euclidean space, prepare a data structure that can quickly output, given a query $q$, a $(1+\epsilon)$-approximation to $\mu := \frac{1}{|P|} \sum_{p \in P} K(p, q)$. First, we give a single data structure based on classical near neighbor search techniques that improves upon or essentially matches the query time and space complexity for all radial kernels considered in the literature so far. We then show how to improve both the query complexity and runtime by using recent advances in data-dependent near neighbor search. We achieve our results by giving a new implementation of the natural importance sampling scheme. Unlike previous approaches, our algorithm first samples the dataset uniformly (considering a geometric sequence of sampling rates), and then uses existing approximate near neighbor search techniques on the resulting smaller dataset to retrieve the sampled points that lie at an appropriate distance from the query. We show that the resulting sampled dataset has strong geometric structure, making approximate near neighbor search return the required samples much more efficiently than for worst-case datasets of the same size. As an example application, we show that this approach yields a data structure that achieves query time $\mu^{-(1+o(1))/4}$ and space complexity $\mu^{-(1+o(1))}$ for the Gaussian kernel. Our data-dependent approach achieves query time $\mu^{-0.173-o(1)}$ and space $\mu^{-(1+o(1))}$ for the Gaussian kernel. The data-dependent analysis relies on new techniques for tracking the geometric structure of the input datasets in a recursive hashing process that we hope will be of interest in other applications in near neighbor search.
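
The importance sampling scheme can be sketched end to end under simplifying assumptions: a Gaussian kernel $K(p, q) = e^{-\|p-q\|^2}$, fresh uniform subsamples at the geometric rates $2^{-j}$, and a brute-force kernel scan standing in for the approximate near neighbor queries. The function name and parameters are illustrative; the paper's point, which this sketch cannot show, is that on the subsampled sets the near neighbor queries become cheap.

```python
# Minimal sketch, assuming the Gaussian kernel K(p, q) = exp(-||p - q||^2)
# and a brute-force scan standing in for approximate near neighbor search.
# Level j subsamples at rate 2^-j and handles points whose kernel value
# lies in (2^-(j+1), 2^-j]; reweighting by 2^j makes the estimate unbiased
# up to the (negligible) mass below the last band.
import numpy as np

def kde_estimate(P, q, levels=20, rng=None):
    """P: (n, d) dataset, q: (d,) query. Estimates the mean kernel value."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(P)
    kvals = np.exp(-np.sum((P - q) ** 2, axis=1))    # stand-in for ANN retrieval
    estimate = 0.0
    for j in range(levels):
        rate = 2.0 ** -j
        sampled = rng.random(n) < rate               # uniform subsample at rate 2^-j
        band = (kvals <= rate) & (kvals > rate / 2)  # points this level handles
        estimate += kvals[sampled & band].sum() / rate
    return estimate / n
```

Each point in band $j$ survives the level-$j$ subsample with probability $2^{-j}$ and is reweighted by $2^{j}$, so its expected contribution equals its true kernel value; only the mass below $2^{-\text{levels}}$ is dropped.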