Arrow Research search

Author name cluster

Rasmus Pagh

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

33 papers
2 author rows

Possible papers

33

NeurIPS Conference 2025 Conference Paper

Differentially Private Quantiles with Smaller Error

  • Jacob Imola
  • Fabrizio Boninsegna
  • Hannah Keller
  • Anders Aamand
  • Amrita Roy Chowdhury
  • Rasmus Pagh

In the approximate quantiles problem, the goal is to output $m$ quantile estimates, the ranks of which are as close as possible to $m$ given quantiles $0 \leq q_1 \leq\dots \leq q_m \leq 1$. We present a mechanism for approximate quantiles that satisfies $\varepsilon$-differential privacy for a dataset of $n$ real numbers where the ratio between the distance between the closest pair of points and the size of the domain is bounded by $\psi$. As long as the minimum gap between quantiles is sufficiently large, $|q_i-q_{i-1}|\geq \Omega\left(\frac{m\log(m)\log(\psi)}{n\varepsilon}\right)$ for all $i$, the maximum rank error of our mechanism is $O\left(\frac{\log(\psi) + \log^2(m)}{\varepsilon}\right)$ with high probability. Previously, the best known algorithm under pure DP was due to Kaplan, Schnapp, and Stemmer (ICML '22), who achieved a bound of $O\left(\frac{\log(\psi)\log^2(m) + \log^3(m)}{\varepsilon}\right)$. Our improvement stems from the use of continual counting techniques which allows the quantiles to be randomized in a correlated manner. We also present an $(\varepsilon, \delta)$-differentially private mechanism that relaxes the gap assumption without affecting the error bound, improving on existing methods when $\delta$ is sufficiently close to zero. We provide experimental evaluation which confirms that our mechanism performs favorably compared to prior work in practice, in particular when the number of quantiles $m$ is large.

ICML Conference 2025 Conference Paper

Lightweight Protocols for Distributed Private Quantile Estimation

  • Anders Aamand
  • Fabrizio Boninsegna
  • Abigail Gentle
  • Jacob Imola
  • Rasmus Pagh

Distributed data analysis is a large and growing field driven by a massive proliferation of user devices, and by privacy concerns surrounding the centralised storage of data. We consider two adaptive algorithms for estimating one quantile (e.g., the median) when each user holds a single data point in a domain $[B]$ and can be queried once through a private mechanism; one algorithm works under local differential privacy (LDP) and the other under shuffle differential privacy (shuffle-DP). In the adaptive setting we present an $\varepsilon$-LDP algorithm which can estimate any quantile within error $\alpha$ using only $O(\frac{\log B}{\varepsilon^2\alpha^2})$ users, and an $(\varepsilon, \delta)$-shuffle DP algorithm requiring only $\widetilde{O}((\frac{1}{\varepsilon^2}+\frac{1}{\alpha^2})\log B)$ users. Prior (non-adaptive) algorithms require more users by several logarithmic factors in $B$. We further provide a matching lower bound for adaptive protocols, showing that our LDP algorithm is optimal in the low-$\varepsilon$ regime. Additionally, we establish lower bounds against non-adaptive protocols which, paired with our understanding of the adaptive case, prove a fundamental separation between these models.
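
The protocols themselves are more involved, but the core adaptive idea can be sketched as a noisy binary search in which each queried user answers a single threshold comparison through randomized response. A minimal illustration, not the paper's algorithm; all function names and parameters here are invented for the sketch:

```python
import math
import random

def rr_bit(bit, eps, rng):
    # eps-LDP randomized response: report the true bit w.p. e^eps / (1 + e^eps).
    p = math.exp(eps) / (1 + math.exp(eps))
    return bit if rng.random() < p else 1 - bit

def debias(mean, eps):
    # Invert the randomized-response bias to estimate Pr[x <= threshold].
    p = math.exp(eps) / (1 + math.exp(eps))
    return (mean - (1 - p)) / (2 * p - 1)

def ldp_median(data, domain_size, eps, users_per_round, rng):
    # Adaptive noisy binary search over [0, domain_size);
    # every user is queried exactly once.
    users = iter(data)
    lo, hi = 0, domain_size - 1
    while lo < hi:
        mid = (lo + hi) // 2
        reports = [rr_bit(int(next(users) <= mid), eps, rng)
                   for _ in range(users_per_round)]
        frac = debias(sum(reports) / len(reports), eps)
        if frac < 0.5:
            lo = mid + 1
        else:
            hi = mid
    return lo
```

With a domain of size 1000, roughly 10 rounds of threshold queries suffice, and the rank error is governed by the per-round estimation noise rather than by the domain size.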

ICML Conference 2025 Conference Paper

Private Lossless Multiple Release

  • Joel Daniel Andersson
  • Lukas Retschmeier
  • Boel Nelson
  • Rasmus Pagh

Koufogiannis et al. (2016) showed a $\textit{gradual release}$ result for Laplace noise-based differentially private mechanisms: given an $\varepsilon$-DP release, a new release with privacy parameter $\varepsilon' > \varepsilon$ can be computed such that the combined privacy loss of both releases is at most $\varepsilon'$ and the distribution of the latter is the same as a single release with parameter $\varepsilon'$. They also showed gradual release techniques for Gaussian noise, later also explored by Whitehouse et al. (2022). In this paper, we consider a more general $\textit{multiple release}$ setting in which analysts hold private releases with different privacy parameters corresponding to different access/trust levels. These releases are determined one by one, with privacy parameters in arbitrary order. A multiple release is $\textit{lossless}$ if having access to a subset $S$ of the releases has the same privacy guarantee as the least private release in $S$, and each release has the same distribution as a single release with the same privacy parameter. Our main result is that lossless multiple release is possible for a large class of additive noise mechanisms. For the Gaussian mechanism we give a simple method for lossless multiple release with a short, self-contained analysis that does not require knowledge of the mathematics of Brownian motion. We also present lossless multiple release for the Laplace and Poisson mechanisms. Finally, we consider how to efficiently do gradual release of sparse histograms, and present a mechanism with running time independent of the number of dimensions.

NeurIPS Conference 2024 Conference Paper

Continual Counting with Gradual Privacy Expiration

  • Joel D. Andersson
  • Monika Henzinger
  • Rasmus Pagh
  • Teresa A. Steiner
  • Jalaj Upadhyay

Differential privacy with gradual expiration models the setting where data items arrive in a stream and at a given time $t$ the privacy loss guaranteed for a data item seen at time $(t-d)$ is $\varepsilon g(d)$, where $g$ is a monotonically non-decreasing function. We study the fundamental *continual (binary) counting* problem where each data item consists of a bit and the algorithm needs to output at each time step the sum of all the bits streamed so far. For a stream of length $T$ and privacy *without* expiration, continual counting is possible with maximum (over all time steps) additive error $O(\log^2(T)/\varepsilon)$, and the best known lower bound is $\Omega(\log(T)/\varepsilon)$; closing this gap is a challenging open problem. We show that the situation is very different for privacy with gradual expiration by giving upper and lower bounds for a large set of expiration functions $g$. Specifically, our algorithm achieves an additive error of $O(\log(T)/\varepsilon)$ for a large set of privacy expiration functions. We also give a lower bound that shows that if $C$ is the additive error of any $\varepsilon$-DP algorithm for this problem, then the product of $C$ and the privacy expiration function after $2C$ steps must be $\Omega(\log(T)/\varepsilon)$. Our algorithm matches this lower bound as its additive error is $O(\log(T)/\varepsilon)$, even when $g(2C) = O(1)$. Our empirical evaluation shows that we achieve a slowly growing privacy loss that has significantly smaller empirical privacy loss for large values of $d$ than a natural baseline algorithm.

ICML Conference 2024 Conference Paper

Profile Reconstruction from Private Sketches

  • Hao Wu 0057
  • Rasmus Pagh

Given a multiset of $n$ items from $\mathcal{D}$, the profile reconstruction problem is to estimate, for $t = 0, 1, \dots, n$, the fraction $\vec{f}[t]$ of items in $\mathcal{D}$ that appear exactly $t$ times. We consider differentially private profile estimation in a distributed, space-constrained setting where we wish to maintain an updatable, private sketch of the multiset that allows us to compute an approximation of $\vec{f} = (\vec{f}[0], \dots, \vec{f}[n])$. Using a histogram privatized using discrete Laplace noise, we show how to “reverse” the noise using an approach of Dwork et al. (ITCS '10). We show how to speed up the algorithm from polynomial time to $O(d + n \log n)$, and analyze the achievable error in the $\ell_1$, $\ell_2$ and $\ell_\infty$ norms. In all cases the dependency of the error on $d = |\mathcal{D}|$ is $O(1/\sqrt{d})$ — we give an information-theoretic lower bound showing that this dependence on $d$ is asymptotically optimal among all private, updatable sketches for the profile reconstruction problem with a high-probability error guarantee.
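
The starting point described above, a histogram privatized with discrete Laplace noise, can be sketched as follows. Illustrative only: the paper's sketch is additionally space-constrained and updatable, and the noise-“reversal” step is omitted:

```python
import math
import random

def discrete_laplace(scale, rng):
    # Two-sided geometric noise: the difference of two i.i.d. geometric
    # variables is discrete-Laplace distributed with alpha = exp(-1/scale).
    alpha = math.exp(-1.0 / scale)
    def geom():
        # Failures before first success, success probability 1 - alpha.
        return int(math.log(1 - rng.random()) / math.log(alpha))
    return geom() - geom()

def private_histogram(items, domain, eps, rng):
    # Adding/removing one item changes a single count by 1, so per-count
    # discrete Laplace noise of scale 1/eps gives an eps-DP histogram.
    hist = {d: 0 for d in domain}
    for x in items:
        hist[x] += 1
    return {d: c + discrete_laplace(1.0 / eps, rng) for d, c in hist.items()}
```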

NeurIPS Conference 2023 Conference Paper

A Smooth Binary Mechanism for Efficient Private Continual Observation

  • Joel Daniel Andersson
  • Rasmus Pagh

In privacy under continual observation we study how to release differentially private estimates based on a dataset that evolves over time. The problem of releasing private prefix sums of $x_1, x_2, x_3, \dots \in \{0, 1\}$ (where the value of each $x_i$ is to be private) is particularly well-studied, and a generalized form is used in state-of-the-art methods for private stochastic gradient descent (SGD). The seminal binary mechanism privately releases the first $t$ prefix sums with noise of variance polylogarithmic in $t$. Recently, Henzinger et al. and Denisov et al. showed that it is possible to improve on the binary mechanism in two ways: The variance of the noise can be reduced by a (large) constant factor, and also made more even across time steps. However, their algorithms for generating the noise distribution are not as efficient as one would like in terms of computation time and (in particular) space. We address the efficiency problem by presenting a simple alternative to the binary mechanism in which 1) generating the noise takes constant average time per value, 2) the variance is reduced by a factor of about 4 compared to the binary mechanism, and 3) the noise distribution at each step is identical. Empirically, a simple Python implementation of our approach outperforms the running time of the approach of Henzinger et al., as well as an attempt to improve their algorithm using high-performance algorithms for multiplication with Toeplitz matrices.
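
For context, the baseline being improved, the seminal binary mechanism, can be sketched as follows: each stream element enters $O(\log T)$ dyadic partial sums, each privatized once, and a prefix sum combines the noisy sums on the binary decomposition of $t$. A minimal sketch of that baseline, not the paper's smooth mechanism:

```python
import math
import random

def laplace(scale, rng):
    # The difference of two exponentials is Laplace-distributed with this scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

class BinaryMechanism:
    # eps-DP continual counting via noisy dyadic partial sums.
    def __init__(self, T, eps, rng):
        levels = max(1, math.ceil(math.log2(T + 1)))
        self.scale = levels / eps          # each item affects <= levels sums
        self.a = [0.0] * (levels + 1)      # exact dyadic partial sums
        self.noisy = [0.0] * (levels + 1)  # their privatized versions
        self.t = 0
        self.rng = rng

    def update_and_query(self, x):
        self.t += 1
        i = (self.t & -self.t).bit_length() - 1   # lowest set bit of t
        self.a[i] = x + sum(self.a[:i])           # close the new dyadic block
        self.noisy[i] = self.a[i] + laplace(self.scale, self.rng)
        for j in range(i):                        # lower levels restart
            self.a[j] = self.noisy[j] = 0.0
        # prefix sum = noisy dyadic sums on the binary decomposition of t
        return sum(n for j, n in enumerate(self.noisy) if (self.t >> j) & 1)
```

Each prefix sum aggregates at most $\log T$ noise terms, hence the polylogarithmic variance the abstract refers to.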

FOCS Conference 2023 Conference Paper

Pseudorandom Hashing for Space-bounded Computation with Applications in Streaming

  • Praneeth Kacham
  • Rasmus Pagh
  • Mikkel Thorup
  • David P. Woodruff

We revisit Nisan's classical pseudorandom generator (PRG) for space-bounded computation (STOC 1990) and its applications in streaming algorithms. We describe a new generator, HashPRG, that can be thought of as a symmetric version of Nisan's generator over larger alphabets. Our generator allows a trade-off between seed length and the time needed to compute a given block of the generator's output. HashPRG can be used to obtain derandomizations with much better update time and without sacrificing space for a large number of data stream algorithms, for example:

  • Andoni's $F_{p}$ estimation algorithm for constant $p \gt 2$ (ICASSP 2017) assumes a random oracle, but achieves optimal space and constant update time. Using HashPRG's time-space trade-off we eliminate the random oracle assumption while preserving the other properties. Previously no time-optimal derandomization was known. Using similar techniques, we give an algorithm for a relaxed version of $\ell_{p}$ sampling in a turnstile stream. Both of our algorithms use $\tilde{O}\left(d^{1-2 / p}\right)$ bits of space and have $O(1)$ update time.
  • For $0 \lt p \lt 2$, the $1 \pm \varepsilon$ approximate $F_{p}$ estimation algorithm of Kane et al. (STOC 2011) uses an optimal $O\left(\varepsilon^{-2} \log d\right)$ bits of space but has an update time of $O\left(\log^{2}(1/\varepsilon) \log \log(1/\varepsilon)\right)$. Using HashPRG, we show that if $1/\sqrt{d} \leq \varepsilon \leq 1/d^{c}$ for an arbitrarily small constant $c \gt 0$, then we can obtain a $1 \pm \varepsilon$ approximate $F_{p}$ estimation algorithm that uses the optimal $O\left(\varepsilon^{-2} \log d\right)$ bits of space and has an update time of $O(\log d)$ in the Word RAM model, which is more than a quadratic improvement in the update time. We obtain similar improvements for entropy estimation.
  • CountSketch, with the fine-grained error analysis of Minton and Price (SODA 2014). For derandomization, they suggested a direct application of Nisan's generator, yielding a logarithmic multiplicative space overhead. With HashPRG we obtain an efficient derandomization yielding the same asymptotic space as when assuming a random oracle. Our ability to obtain a time-efficient derandomization makes crucial use of HashPRG's symmetry.

We also give the first derandomization of a recent private version of CountSketch. For a $d$-dimensional vector $x$ being updated in a turnstile stream, we show that $\|x\|_{\infty}$ can be estimated up to an additive error of $\varepsilon\|x\|_{2}$ using $O\left(\varepsilon^{-2} \log (1/\varepsilon) \log d\right)$ bits of space. Additionally, the update time of this algorithm is $O(\log 1/\varepsilon)$ in the Word RAM model. We show that the space complexity of this algorithm is optimal up to constant factors. However, for vectors $x$ with $\|x\|_{\infty}=\Theta\left(\|x\|_{2}\right)$, we show that the lower bound can be broken by giving an algorithm that uses $O\left(\varepsilon^{-2} \log d\right)$ bits of space which approximates $\|x\|_{\infty}$ up to an additive error of $\varepsilon\|x\|_{2}$. We use our aforementioned derandomization of the CountSketch data structure to obtain this algorithm, and using the time-space trade-off of HashPRG, we show that the update time of this algorithm is also $O(\log 1/\varepsilon)$ in the Word RAM model.

NeurIPS Conference 2022 Conference Paper

Improved Utility Analysis of Private CountSketch

  • Rasmus Pagh
  • Mikkel Thorup

Sketching is an important tool for dealing with high-dimensional vectors that are sparse (or well-approximated by a sparse vector), especially useful in distributed, parallel, and streaming settings. It is known that sketches can be made differentially private by adding noise according to the sensitivity of the sketch, and this has been used in private analytics and federated learning settings. The post-processing property of differential privacy implies that \emph{all} estimates computed from the sketch can be released within the given privacy budget. In this paper we consider the classical CountSketch, made differentially private with the Gaussian mechanism, and give an improved analysis of its estimation error. Perhaps surprisingly, the privacy-utility trade-off is essentially the best one could hope for, independent of the number of repetitions in CountSketch: The error is almost identical to the error from non-private CountSketch plus the noise needed to make the vector private in the original, high-dimensional domain.

ICML Conference 2021 Conference Paper

CountSketches, Feature Hashing and the Median of Three

  • Kasper Green Larsen
  • Rasmus Pagh
  • Jakub Tetek

In this paper, we revisit the classic CountSketch method, which is a sparse, random projection that transforms a (high-dimensional) Euclidean vector $v$ to a vector of dimension $(2t-1)s$, where $t, s > 0$ are integer parameters. It is known that a CountSketch allows estimating coordinates of $v$ with variance bounded by $\|v\|_2^2/s$. For $t > 1$, the estimator takes the median of $2t-1$ independent estimates, and the probability that the estimate is off by more than $2\|v\|_2/\sqrt{s}$ is exponentially small in $t$. This suggests choosing $t$ to be logarithmic in a desired inverse failure probability. However, implementations of CountSketch often use a small, constant $t$. Previous work only predicts a constant factor improvement in this setting. Our main contribution is a new analysis of CountSketch, showing an improvement in variance to $O(\min\{\|v\|_1^2/s^2, \|v\|_2^2/s\})$ when $t > 1$. That is, the variance decreases proportionally to $s^{-2}$, asymptotically for large enough $s$.
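
The estimator under discussion can be sketched directly. A simplified illustration that uses Python's built-in tuple hashing as a stand-in for proper pairwise-independent hash functions:

```python
import statistics

class CountSketch:
    # 2t-1 rows; each row hashes a coordinate to one of s buckets with a
    # random sign, and the estimate of v[i] is the median over rows.
    def __init__(self, t, s):
        self.rows, self.s = 2 * t - 1, s
        self.table = [[0.0] * s for _ in range(self.rows)]

    def _bucket_sign(self, r, i):
        # Stand-in hash functions (not pairwise independent; demo only).
        return hash((r, i, 1)) % self.s, 1 - 2 * (hash((r, i, 2)) % 2)

    def update(self, i, delta):
        for r in range(self.rows):
            b, sg = self._bucket_sign(r, i)
            self.table[r][b] += sg * delta

    def estimate(self, i):
        return statistics.median(
            sg * self.table[r][b]
            for r in range(self.rows)
            for b, sg in [self._bucket_sign(r, i)])
```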

ICML Conference 2021 Conference Paper

Differentially Private Aggregation in the Shuffle Model: Almost Central Accuracy in Almost a Single Message

  • Badih Ghazi
  • Ravi Kumar 0001
  • Pasin Manurangsi
  • Rasmus Pagh
  • Amer Sinha

The shuffle model of differential privacy has attracted attention in the literature due to it being a middle ground between the well-studied central and local models. In this work, we study the problem of summing (aggregating) real numbers or integers, a basic primitive in numerous machine learning tasks, in the shuffle model. We give a protocol achieving error arbitrarily close to that of the (Discrete) Laplace mechanism in central differential privacy, while each user only sends 1 + o(1) short messages in expectation.
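
The role of the shuffler can be illustrated with the classical split-and-mix idea for exact summation from anonymous messages (a standard textbook sketch, not this paper's protocol, which additionally injects noise to achieve DP while sending only 1 + o(1) short messages per user):

```python
import random

def split_shares(x, m, q, rng):
    # Additive secret sharing over Z_q: m-1 uniform shares, the last fixes the sum.
    shares = [rng.randrange(q) for _ in range(m - 1)]
    shares.append((x - sum(shares)) % q)
    return shares

def shuffle_sum(values, m, q, rng):
    pool = [s for x in values for s in split_shares(x, m, q, rng)]
    rng.shuffle(pool)        # the shuffler hides which share came from whom
    return sum(pool) % q     # the analyzer recovers the exact sum
```

Because each user's shares are uniform conditioned on their sum, the shuffled pool reveals (information-theoretically, for large enough m) little beyond the total.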

ICML Conference 2020 Conference Paper

Composable Sketches for Functions of Frequencies: Beyond the Worst Case

  • Edith Cohen
  • Ofir Geri
  • Rasmus Pagh

Recently there has been increased interest in using machine learning techniques to improve classical algorithms. In this paper we study when it is possible to construct compact, composable sketches for weighted sampling and statistics estimation according to functions of data frequencies. Such structures are now central components of large-scale data analytics and machine learning pipelines. However, many common functions, such as thresholds and $p$th frequency moments with $p>2$, are known to require polynomial size sketches in the worst case. We explore performance beyond the worst case under two different types of assumptions. The first is having access to noisy \emph{advice} on item frequencies. This continues the line of work of Hsu et al. (ICLR 2019), who assume predictions are provided by a machine learning model. The second is providing guaranteed performance on a restricted class of input frequency distributions that are better aligned with what is observed in practice. This extends the work on heavy hitters under Zipfian distributions in a seminal paper of Charikar et al. (ICALP 2002). Surprisingly, we show analytically and empirically that "in practice" small polylogarithmic-size sketches provide accuracy for "hard" functions.

SODA Conference 2020 Conference Paper

Oblivious Sketching of High-Degree Polynomial Kernels

  • Thomas D. Ahle
  • Michael Kapralov
  • Jakob Bæk Tejs Knudsen
  • Rasmus Pagh
  • Ameya Velingker
  • David P. Woodruff
  • Amir Zandieh

Kernel methods are fundamental tools in machine learning that allow detection of non-linear dependencies between data without explicitly constructing feature vectors in high dimensional spaces. A major disadvantage of kernel methods is their poor scalability: primitives such as kernel PCA or kernel ridge regression generally take prohibitively large quadratic space and (at least) quadratic time, as kernel matrices are usually dense. Some methods for speeding up kernel linear algebra are known, but they all invariably take time exponential in either the dimension of the input point set (e.g., fast multipole methods suffer from the curse of dimensionality) or in the degree of the kernel function. Oblivious sketching has emerged as a powerful approach to speeding up numerical linear algebra over the past decade, but our understanding of oblivious sketching solutions for kernel matrices has remained quite limited, suffering from the aforementioned exponential dependence on input parameters. Our main contribution is a general method for applying sketching solutions developed in numerical linear algebra over the past decade to a tensoring of data points without forming the tensoring explicitly. This leads to the first oblivious sketch for the polynomial kernel with a target dimension that is only polynomially dependent on the degree of the kernel function, as well as the first oblivious sketch for the Gaussian kernel on bounded datasets that does not suffer from an exponential dependence on the dimensionality of input data points.

ICML Conference 2020 Conference Paper

Private Counting from Anonymous Messages: Near-Optimal Accuracy with Vanishing Communication Overhead

  • Badih Ghazi
  • Ravi Kumar 0001
  • Pasin Manurangsi
  • Rasmus Pagh

Differential privacy (DP) is a formal notion for quantifying the privacy loss of algorithms. Algorithms in the central model of DP achieve high accuracy but make the strongest trust assumptions whereas those in the local DP model make the weakest trust assumptions but incur substantial accuracy loss. The shuffled DP model [Bittau et al. 2017, Erlingsson et al. 2019, Cheu et al. 2019] has recently emerged as a feasible middle ground between the central and local models, providing stronger trust assumptions than the former while promising higher accuracies than the latter. In this paper, we obtain practical communication-efficient algorithms in the shuffled DP model for two basic aggregation primitives used in machine learning: 1) binary summation, and 2) histograms over a moderate number of buckets. Our algorithms achieve accuracy that is arbitrarily close to that of central DP algorithms with an expected communication per user essentially matching what is needed without any privacy constraints! We demonstrate the practicality of our algorithms by experimentally evaluating them and comparing their performance to several widely-used protocols such as Randomized Response [Warner 1965] and RAPPOR [Erlingsson et al. 2014].
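
The Randomized Response baseline cited above is a few lines; its debiased count is unbiased, but the error grows roughly as sqrt(n)/eps, which is the gap to central-DP accuracy that shuffled-DP protocols aim to close:

```python
import math
import random

def warner_rr_sum(bits, eps, rng):
    # Warner's randomized response: each user keeps their bit with
    # probability e^eps / (1 + e^eps), otherwise flips it; then debias.
    p = math.exp(eps) / (1 + math.exp(eps))
    noisy = sum(b if rng.random() < p else 1 - b for b in bits)
    return (noisy - (1 - p) * len(bits)) / (2 * p - 1)
```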

NeurIPS Conference 2020 Conference Paper

WOR and $p$'s: Sketches for $\ell_p$-Sampling Without Replacement

  • Edith Cohen
  • Rasmus Pagh
  • David Woodruff

Weighted sampling is a fundamental tool in data analysis and machine learning pipelines. Samples are used for efficient estimation of statistics or as sparse representations of the data. When weight distributions are skewed, as is often the case in practice, without-replacement (WOR) sampling is much more effective than with-replacement (WR) sampling: It provides a broader representation and higher accuracy for the same number of samples. We design novel composable sketches for WOR {\em $\ell_p$ sampling}, weighted sampling of keys according to a power $p\in[0, 2]$ of their frequency (or for signed data, sum of updates). Our sketches have size that grows only linearly with sample size. Our design is simple and practical, despite intricate analysis, and based on off-the-shelf use of widely implemented heavy hitters sketches such as \texttt{CountSketch}. Our method is the first to provide WOR sampling in the important regime of $p>1$ and the first to handle signed updates for $p>0$.
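
The sampling objective itself (though not the paper's composable sketch) has a clean non-streaming description via exponential race clocks: key $i$ receives an arrival time distributed as an exponential with rate $w_i^p$, and the $k$ earliest arrivals form a without-replacement sample proportional to $w^p$:

```python
import random

def wor_lp_sample(weights, k, p, rng):
    # Successive WOR sampling: the minimum arrival time lands on key i with
    # probability w_i^p / sum_j w_j^p, and the same holds recursively among
    # the remaining keys, so the k smallest times form a WOR sample.
    times = {key: rng.expovariate(w ** p) for key, w in weights.items()}
    return sorted(times, key=times.get)[:k]
```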

STOC Conference 2017 Conference Paper

Set similarity search beyond MinHash

  • Tobias Christiani
  • Rasmus Pagh

We consider the problem of approximate set similarity search under Braun-Blanquet similarity $B(x, y) = |x \cap y| / \max(|x|, |y|)$. The $(b_1, b_2)$-approximate Braun-Blanquet similarity search problem is to preprocess a collection of sets $P$ such that, given a query set $q$, if there exists $x \in P$ with $B(q, x) \geq b_1$, then we can efficiently return $x' \in P$ with $B(q, x') > b_2$. We present a simple data structure that solves this problem with space usage $O(n^{1+\rho}\log n + \sum_{x \in P}|x|)$ and query time $O(|q| n^{\rho} \log n)$ where $n = |P|$ and $\rho = \log(1/b_1)/\log(1/b_2)$. Making use of existing lower bounds for locality-sensitive hashing by O'Donnell et al. (TOCT 2014) we show that this value of $\rho$ is tight across the parameter space, i.e., for every choice of constants $0 < b_2 < b_1 < 1$. In the case where all sets have the same size our solution strictly improves upon the value of $\rho$ that can be obtained through the use of state-of-the-art data-independent techniques in the Indyk-Motwani locality-sensitive hashing framework (STOC 1998), such as Broder's MinHash (CCS 1997) for Jaccard similarity and Andoni et al.'s cross-polytope LSH (NIPS 2015) for cosine similarity. Surprisingly, even though our solution is data-independent, for a large part of the parameter space we outperform the currently best data-dependent method by Andoni and Razenshteyn (STOC 2015).
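
For comparison, the MinHash baseline referenced above estimates Jaccard similarity as the fraction of agreeing minimum hash values. A small demo with stand-in salted hash functions; Braun-Blanquet similarity is computed directly:

```python
import random

def minhash_signature(s, salts):
    # One minimum per salted hash function (tuple hashing, demo only).
    return [min(hash((x, salt)) for x in s) for salt in salts]

def estimate_jaccard(sig_a, sig_b):
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

def braun_blanquet(a, b):
    return len(a & b) / max(len(a), len(b))

rng = random.Random(0)
salts = [rng.getrandbits(32) for _ in range(200)]
```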

SODA Conference 2015 Conference Paper

Approximate Range Emptiness in Constant Time and Optimal Space

  • Mayank Goswami 0001
  • Allan Grønlund Jørgensen
  • Kasper Green Larsen
  • Rasmus Pagh

This paper studies the $\varepsilon$-approximate range emptiness problem, where the task is to represent a set $S$ of $n$ points from $\{0, \dots, U-1\}$ and answer emptiness queries of the form "$[a; b] \cap S \neq \emptyset$?" with a probability of false positives allowed. This generalizes the functionality of Bloom filters from single point queries to any interval length $L$. Setting the false positive rate to $\varepsilon/L$ and performing $L$ queries, Bloom filters yield a solution to this problem with space $O(n \lg(L/\varepsilon))$ bits, false positive probability bounded by $\varepsilon$ for intervals of length up to $L$, using query time $O(L \lg(L/\varepsilon))$. Our first contribution is to show that the space/error trade-off cannot be improved asymptotically: any data structure for answering approximate range emptiness queries on intervals of length up to $L$ with false positive probability $\varepsilon$ must use space $\Omega(n \lg(L/\varepsilon)) - O(n)$ bits. On the positive side we show that the query time can be improved greatly, to constant time, while matching our space lower bound up to a lower order additive term. This result is achieved through a succinct data structure for (non-approximate 1d) range emptiness/reporting queries, which may be of independent interest.

STOC Conference 2015 Conference Paper

From Independence to Expansion and Back Again

  • Tobias Christiani
  • Rasmus Pagh
  • Mikkel Thorup

We consider the following fundamental problems: Constructing k-independent hash functions with a space-time tradeoff close to Siegel's lower bound. Constructing representations of unbalanced expander graphs having small size and allowing fast computation of the neighbor function. It is not hard to show that these problems are intimately connected in the sense that a good solution to one of them leads to a good solution to the other one. In this paper we exploit this connection to present efficient, recursive constructions of k-independent hash functions (and hence expanders with a small representation). While the previously most efficient construction (Thorup, FOCS 2013) needed time quasipolynomial in Siegel's lower bound, our time bound is just a logarithmic factor from the lower bound.

FOCS Conference 2014 Conference Paper

Generating k-Independent Variables in Constant Time

  • Tobias Christiani
  • Rasmus Pagh

The generation of pseudorandom elements over finite fields is fundamental to the time, space and randomness complexity of randomized algorithms and data structures. We consider the problem of generating $k$-independent random values over a finite field $F$ in a word RAM model equipped with constant time addition and multiplication in $F$, and present the first nontrivial construction of a generator that outputs each value in constant time, not dependent on $k$. Our generator has period length $|F|^{\operatorname{poly}\log k}$ and uses $k \operatorname{poly}(\log k) \log |F|$ bits of space, which is optimal up to a $\operatorname{poly}\log k$ factor. We are able to bypass Siegel's lower bound on the time-space tradeoff for $k$-independent functions by a restriction to sequential evaluation.
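
The classical point of comparison is polynomial hashing (Joffe; Wegman-Carter): a uniformly random polynomial of degree $k-1$ over $F_p$ is $k$-wise independent, but each evaluation costs $O(k)$ time, which is exactly the barrier a constant-time generator must bypass:

```python
import random

def k_independent_generator(k, p, rng):
    # Uniform degree-(k-1) polynomial over F_p: its values at any k
    # distinct points are independent and uniform.
    coeffs = [rng.randrange(p) for _ in range(k)]
    def value(i):
        acc = 0
        for c in reversed(coeffs):   # Horner evaluation mod p, O(k) time
            acc = (acc * i + c) % p
        return acc
    return value
```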

FOCS Conference 2013 Conference Paper

How to Approximate a Set without Knowing Its Size in Advance

  • Rasmus Pagh
  • Gil Segev 0001
  • Udi Wieder

The dynamic approximate membership problem asks to represent a set S of size n, whose elements are provided in an on-line fashion, supporting membership queries without false negatives and with a false positive rate at most ε. That is, the membership algorithm must be correct on each x ∈ S, and may err with probability at most ε on each x ∉ S. We study a well-motivated, yet insufficiently explored, variant of this problem where the size n of the set is not known in advance. Existing optimal approximate membership data structures require that the size is known in advance, but in many practical scenarios this is not a realistic assumption. Moreover, even if the eventual size n of the set is known in advance, it is desirable to have the smallest possible space usage also when the current number of inserted elements is smaller than n. Our contribution consists of the following results: (1) We show a super-linear gap between the space complexity when the size is known in advance and the space complexity when the size is not known in advance. When the size is known in advance, it is well-known that Θ(n log(1/ε)) bits of space are necessary and sufficient (Bloom '70, Carter et al. '78). However, when the size is not known in advance, we prove that at least (1 - o(1))n log(1/ε) + Ω(n log log n) bits of space must be used. In particular, the average number of bits per element must depend on the size of the set. (2) We show that our space lower bound is tight, and can even be matched by a highly efficient data structure. We present a data structure that uses (1+o(1))n log(1/ε) + O(n log log n) bits of space for approximating any set of any size n, without having to know n in advance. Our data structure supports membership queries in constant time in the worst case with high probability, and supports insertions in expected amortized constant time. Moreover, it can be "de-amortized" to also support insertions in constant time in the worst case with high probability by only increasing its space usage to O(n log(1/ε) + n log log n) bits.
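
A folklore approach to the unknown-n setting, a chain of filters of doubling capacity with shrinking false-positive rates, illustrates where extra bits per element can come from: with level-l rate eps/((l+1)(l+2)) the rates sum to eps while each element pays only O(log l) = O(log log n) additional bits. A sketch built on a simple fingerprint filter, not the paper's optimal structure:

```python
import math

class FingerprintFilter:
    # Stores one fingerprint per key modulo m: for a non-member the
    # false-positive rate is at most (number stored) / m.
    def __init__(self, capacity, fp_rate, seed=0):
        self.capacity = capacity
        self.mod = max(2, math.ceil(capacity / fp_rate))
        self.seed = seed
        self.prints = set()

    def add(self, x):
        self.prints.add(hash((x, self.seed)) % self.mod)

    def __contains__(self, x):
        return hash((x, self.seed)) % self.mod in self.prints

class GrowingFilter:
    # Chain of filters with doubling capacity; level l gets false-positive
    # rate eps/((l+1)(l+2)), so the total error over all levels is <= eps.
    def __init__(self, eps):
        self.eps = eps
        self.filters = []
        self.in_last = 0

    def add(self, x):
        if not self.filters or self.in_last >= self.filters[-1].capacity:
            l = len(self.filters)
            rate = self.eps / ((l + 1) * (l + 2))
            self.filters.append(FingerprintFilter(8 * 2 ** l, rate, seed=l))
            self.in_last = 0
        self.filters[-1].add(x)
        self.in_last += 1

    def __contains__(self, x):
        return any(x in f for f in self.filters)
```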

SODA Conference 2012 Conference Paper

I/O-efficient data structures for colored range and prefix reporting

  • Kasper Green Larsen
  • Rasmus Pagh

Motivated by information retrieval applications, we consider the one-dimensional colored range reporting problem in rank space. The goal is to build a static data structure for sets $C_1, \dots, C_m \subseteq \{1, \dots, \sigma\}$ that supports queries of the kind: given indices $a, b$, report the set $\bigcup_{a \leq i \leq b} C_i$. We study the problem in the I/O model, and show that there exists an optimal linear-space data structure that answers queries in $O(1 + k/B)$ I/Os, where $k$ denotes the output size and $B$ the disk block size in words. In fact, we obtain the same bound for the harder problem of three-sided orthogonal range reporting. In this problem, we are to preprocess a set of $n$ two-dimensional points in rank space, such that all points inside a query rectangle of the form $[x_1, x_2] \times (-\infty, y]$ can be reported. The best previous bounds for this problem are either $O(n \lg^2_B n)$ space and $O(1 + k/B)$ query I/Os, or $O(n)$ space and $O(\lg^{(h)}_B n + k/B)$ query I/Os, where $\lg^{(h)}_B n$ is the base-$B$ logarithm iterated $h$ times, for any constant integer $h$. The previous bounds are both achieved under the indivisibility assumption, while our solution exploits the full capabilities of the underlying machine. Breaking the indivisibility assumption thus provides us with cleaner and optimal bounds. Our results also imply an optimal solution to the following colored prefix reporting problem. Given a set $S$ of strings, each $O(1)$ disk blocks in length, and a function $c: S \to 2^{\{1, \dots, \sigma\}}$, support queries of the kind: given a string $p$, report the set $\bigcup_{x \in S \cap p*} c(x)$, where $p*$ denotes the set of strings with prefix $p$. Finally, we consider the possibility of top-$k$ extensions of this result, and present a simple solution in a model that allows non-blocked I/O.

STOC Conference 2005 Conference Paper

On dynamic range reporting in one dimension

  • Christian Worm Mortensen
  • Rasmus Pagh
  • Mihai Patrascu

We consider the problem of maintaining a dynamic set of integers and answering queries of the form: report a point (equivalently, all points) in a given interval. Range searching is a natural and fundamental variant of integer search, and can be solved using predecessor search. However, for a RAM with $w$-bit words, we show how to perform updates in $O(\lg w)$ time and answer queries in $O(\lg \lg w)$ time. The update time is identical to the van Emde Boas structure, but the query time is exponentially faster. Existing lower bounds show that achieving our query time for predecessor search requires doubly-exponentially slower updates. We present some arguments supporting the conjecture that our solution is optimal. Our solution is based on a new and interesting recursion idea which is "more extreme" than the van Emde Boas recursion. Whereas van Emde Boas uses a simple recursion (repeated halving) on each path in a trie, we use a nontrivial, van Emde Boas-like recursion on every such path. Despite this, our algorithm is quite clean when seen from the right angle. To achieve linear space for our data structure, we solve a problem which is of independent interest. We develop the first scheme for dynamic perfect hashing requiring sublinear space. This gives a dynamic Bloomier filter (a storage scheme for sparse vectors) which uses low space. We strengthen previous lower bounds to show that these results are optimal.

STOC Conference 2001 Conference Paper

On the cell probe complexity of membership and perfect hashing

  • Rasmus Pagh

We study two fundamental static data structure problems, membership and perfect hashing, in Yao's cell probe model. The first space and bit probe optimal worst case upper bound is given for the membership problem. We also give a new efficient membership scheme where the query algorithm makes just one adaptive choice, and probes a total of three words. A lower bound shows that two word probes generally do not suffice. For minimal perfect hashing we show a tight bit probe lower bound, and give a simple scheme achieving this performance, making just one adaptive choice. Linear range perfect hashing is shown to be implementable with the same number of bit probes, of which just one is adaptive. In contrast, we establish that for sufficiently sparse sets, non-adaptive perfect hashing needs exponentially more bit probes. This is the first such separation of adaptivity and non-adaptivity.