Author name cluster

Stefan Tiegel

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

11 papers

2 author rows

NeurIPS Conference 2025 Conference Paper

Improved Robust Estimation for Erdős-Rényi Graphs: The Sparse Regime and Optimal Breakdown Point

Hongjie Chen
Jingqiu Ding
Yiding Hua
Stefan Tiegel

We study the problem of robustly estimating the edge density of Erdos Renyi random graphs $\mathbb{G}(n, d^\circ/n)$ when an adversary can arbitrarily add or remove edges incident to an $\eta$-fraction of the nodes. We develop the first polynomial-time algorithm for this problem that estimates $d^\circ$ up to an additive error $O\left({[\sqrt{\log(n) / n} + \eta\sqrt{\log(1/\eta)} ] \cdot \sqrt{d^\circ} + \eta \log(1/\eta)}\right)$. Our error guarantee matches information-theoretic lower bounds up to factors of $\log(1/\eta)$. Moreover, our estimator works for all $d^\circ \geq \Omega(1)$ and achieves optimal breakdown point $\eta = 1/2$. Previous algorithms [Acharya et al 2022, Chen et al 2024], including inefficient ones, incur significantly suboptimal errors. Furthermore, even admitting suboptimal error guarantees, only inefficient algorithms achieve optimal breakdown point. Our algorithm is based on the sum-of-squares (SoS) hierarchy. A key ingredient is to construct constant-degree SoS certificates for concentration of the number of edges incident to small sets in $\mathbb{G}(n, d^\circ/n)$. Crucially, we show that these certificates also exist in the sparse regime, when $d^\circ = o(\log n)$, a regime in which the performance of previous algorithms was significantly suboptimal.

PDF Details

STOC Conference 2025 Conference Paper

Near-Optimal Time-Sparsity Trade-Offs for Solving Noisy Linear Equations

Kiril Bangachev
Guy Bresler
Stefan Tiegel
Vinod Vaikuntanathan

Details

STOC Conference 2025 Conference Paper

Sample-Optimal Private Regression in Polynomial Time

Prashanti Anderson
Ainesh Bakshi
Mahbod Majid
Stefan Tiegel

We consider the task of privately obtaining prediction error guarantees in ordinary least-squares regression problems with Gaussian covariates (with unknown covariance structure). We provide the first sample-optimal polynomial time algorithm for this task under both pure and approximate differential privacy. We show that any improvement to the sample complexity of our algorithm would violate either statistical-query or information-theoretic lower bounds. Additionally, our algorithm is robust to a small fraction of arbitrary outliers and achieves optimal error rates as a function of the fraction of outliers. In contrast, all prior efficient algorithms either incurred sample complexities with sub-optimal dimension dependence, scaling with the condition number of the covariates, or obtained a polynomially worse dependence on the privacy parameters. Our technical contributions are two-fold: first, we leverage resilience guarantees of Gaussians within the sum-of-squares framework. As a consequence, we obtain efficient sum-of-squares algorithms for regression with optimal robustness rates and sample complexity. Second, we generalize the recent robustness-to-privacy framework of Hopkins, Kamath, Majid, and Narayanan to account for the geometry induced by the covariance of the input samples. This framework crucially relies on the robust estimators to be sum-of-squares algorithms, and combining the two steps yields a sample-optimal private regression algorithm. We believe our techniques are of independent interest, and we demonstrate this by obtaining an efficient algorithm for covariance-aware mean estimation, with an optimal dependence on the privacy parameters.

Details

STOC Conference 2025 Conference Paper

SoS Certifiability of Subgaussian Distributions and Its Algorithmic Applications

Ilias Diakonikolas
Samuel B. Hopkins 0001
Ankit Pensia
Stefan Tiegel

Details

STOC Conference 2025 Conference Paper

SoS Certificates for Sparse Singular Values and Their Applications: Robust Statistics, Subspace Distortion, and More

Ilias Diakonikolas
Samuel B. Hopkins 0001
Ankit Pensia
Stefan Tiegel

Details

NeurIPS Conference 2024 Conference Paper

Robust Mixture Learning when Outliers Overwhelm Small Groups

Daniil Dmitriev
Rares-Darius Buhai
Stefan Tiegel
Alexander Wolters
Gleb Novikov
Amartya Sanyal
David Steurer
Fanny Yang

We study the problem of estimating the means of well-separated mixtures when an adversary may add arbitrary outliers. While strong guarantees are available when the outlier fraction is significantly smaller than the minimum mixing weight, much less is known when outliers may crowd out low-weight clusters – a setting we refer to as list-decodable mixture learning (LD-ML). In this case, adversarial outliers can simulate additional spurious mixture components. Hence, if all means of the mixture must be recovered up to a small error in the output list, the list size needs to be larger than the number of (true) components. We propose an algorithm that obtains order-optimal error guarantees for each mixture mean with a minimal list-size overhead, significantly improving upon list-decodable mean estimation, the only existing method that is applicable for LD-ML. Although improvements are observed even when the mixture is non-separated, our algorithm achieves particularly strong guarantees when the mixture is separated: it can leverage the mixture structure to partially cluster the samples before carefully iterating a base learner for list-decodable mean estimation at different scales.

PDF Details DOI

NeurIPS Conference 2024 Conference Paper

Testably Learning Polynomial Threshold Functions

Lucas Slot
Stefan Tiegel
Manuel Wiedmer

Rubinfeld \& Vasilyan recently introduced the framework of *testable learning* as an extension of the classical agnostic model. It relaxes distributional assumptions which are difficult to verify by conditions that can be checked efficiently by a *tester*. The tester has to accept whenever the data truly satisfies the original assumptions, and the learner has to succeed whenever the tester accepts. We focus on the setting where the tester has to accept standard Gaussian data. There, it is known that basic concept classes such as halfspaces can be learned testably with the same time complexity as in the (distribution-specific) agnostic model. In this work, we ask whether there is a price to pay for testably learning more complex concept classes. In particular, we consider polynomial threshold functions (PTFs), which naturally generalize halfspaces. We show that PTFs of arbitrary constant degree can be testably learned up to excess error $\varepsilon > 0$ in time $n^{\mathrm{poly}(1/\varepsilon)}$. This qualitatively matches the best known guarantees in the agnostic model. Our results build on a connection between testable learning and *fooling*. In particular, we show that distributions that approximately match at least $\mathrm{poly}(1/\varepsilon)$ moments of the standard Gaussian fool constant-degree PTFs (up to error $\varepsilon$). As a secondary result, we prove that a direct approach to show testable learning (without fooling), which was successfully used for halfspaces, cannot work for PTFs.

PDF Details DOI

NeurIPS Conference 2023 Conference Paper

Private estimation algorithms for stochastic block models and mixture models

Hongjie Chen
Vincent Cohen-Addad
Tommaso d’Orsi
Alessandro Epasto
Jacob Imola
David Steurer
Stefan Tiegel

We introduce general tools for designing efficient private estimation algorithms, in the high-dimensional settings, whose statistical guarantees almost match those of the best known non-private algorithms. To illustrate our techniques, we consider two problems: recovery of stochastic block models and learning mixtures of spherical Gaussians. For the former, we present the first efficient $(\epsilon, \delta)$-differentially private algorithm for both weak recovery and exact recovery. Previously known algorithms achieving comparable guarantees required quasi-polynomial time. For the latter, we design an $(\epsilon, \delta)$-differentially private algorithm that recovers the centers of the $k$-mixture when the minimum separation is at least $O(k^{1/t}\sqrt{t})$. For all choices of $t$, this algorithm requires sample complexity $n\geq k^{O(1)}d^{O(t)}$ and time complexity $(nd)^{O(t)}$. Prior work required either an additional additive $\Omega(\sqrt{\log n})$ term in the minimum separation or an explicit upper bound on the Euclidean norm of the centers.

PDF Details

NeurIPS Conference 2023 Conference Paper

Robust Mean Estimation Without Moments for Symmetric Distributions

Gleb Novikov
David Steurer
Stefan Tiegel

We study the problem of robustly estimating the mean or location parameter without moment assumptions. Known computationally efficient algorithms rely on strong distributional assumptions, such as sub-Gaussianity, or (certifiably) bounded moments. Moreover, the guarantees that they achieve in the heavy-tailed setting are weaker than those for sub-Gaussian distributions with known covariance. In this work, we show that such a tradeoff, between error guarantees and heavy-tails, is not necessary for symmetric distributions. We show that for a large class of symmetric distributions, the same error as in the Gaussian setting can be achieved efficiently. The distributions we study include products of arbitrary symmetric one-dimensional distributions, such as product Cauchy distributions, as well as elliptical distributions, a vast generalization of the Gaussian distribution. For product distributions and elliptical distributions with known scatter (covariance) matrix, we show that given an $\varepsilon$-corrupted sample, we can with probability at least $1-\delta$ estimate its location up to error $O(\varepsilon \sqrt{\log(1/\varepsilon)})$ using $\tfrac{d\log(d) + \log(1/\delta)}{\varepsilon^2 \log(1/\varepsilon)}$ samples. This result matches the best-known guarantees for the Gaussian distribution and known SQ lower bounds (up to the $\log(d)$ factor). For elliptical distributions with unknown scatter (covariance) matrix, we propose a sequence of efficient algorithms that approaches this optimal error. Specifically, for every $k \in \mathbb{N}$, we design an estimator using time and samples $\tilde{O}({d^k})$ achieving error $O(\varepsilon^{1-\frac{1}{2k}})$. This matches the error and running time guarantees when assuming certifiably bounded moments of order up to $k$. For unknown covariance, such error bounds of $o(\sqrt{\varepsilon})$ are not even known for (general) sub-Gaussian distributions. Our algorithms are based on a generalization of the well-known filtering technique [DK22]. More specifically, we show how this machinery can be combined with Huber-loss-based techniques to work with projections of the noise that behave more nicely than the initial noise. Moreover, we show how sum-of-squares proofs can be used to obtain algorithmic guarantees even for distributions without a first moment. We believe that this approach may find other applications in future works.

PDF Details

NeurIPS Conference 2021 Conference Paper

Consistent Estimation for PCA and Sparse Regression with Oblivious Outliers

Tommaso d'Orsi
Chih-Hung Liu
Rajai Nasser
Gleb Novikov
David Steurer
Stefan Tiegel

We develop machinery to design efficiently computable and \emph{consistent} estimators, achieving estimation error approaching zero as the number of observations grows, when facing an oblivious adversary that may corrupt responses in all but an $\alpha$ fraction of the samples. As concrete examples, we investigate two problems: sparse regression and principal component analysis (PCA). For sparse regression, we achieve consistency for optimal sample size $n\gtrsim (k\log d)/\alpha^2$ and optimal error rate $O(\sqrt{(k\log d)/(n\cdot \alpha^2)})$where $n$ is the number of observations, $d$ is the number of dimensions and $k$ is the sparsity of the parameter vector, allowing the fraction of inliers to be inverse-polynomial in the number of samples. Prior to this work, no estimator was known to be consistent when the fraction of inliers $\alpha$ is $o(1/\log \log n)$, even for (non-spherical) Gaussian design matrices. Results holding under weak design assumptions and in the presence of such general noise have only been shown in dense setting (i. e. , general linear regression) very recently by d'Orsi et al. ~\cite{ICML-linear-regression}. In the context of PCA, we attain optimal error guarantees under broad spikiness assumptions on the parameter matrix (usually used in matrix completion). Previous works could obtain non-trivial guarantees only under the assumptions that the measurement noise corresponding to the inliers is polynomially small in $n$ (e. g. , Gaussian with variance $1/n^2$). To devise our estimators, we equip the Huber loss with non-smooth regularizers such as the $\ell_1$ norm or the nuclear norm, and extend d'Orsi et al. 's approach~\cite{ICML-linear-regression} in a novel way to analyze the loss function. Our machinery appears to be easily applicable to a wide range of estimation problems. We complement these algorithmic results with statistical lower bounds showing that the fraction of inliers that our PCA estimator can deal with is optimal up to a constant factor.

PDF Details

SODA Conference 2021 Conference Paper

SoS Degree Reduction with Applications to Clustering and Robust Moment Estimation

David Steurer
Stefan Tiegel

We develop a general framework to significantly reduce the degree of sum-of-squares proofs by introducing new variables. To illustrate the power of this framework, we use it to speed up previous algorithms based on sum-of-squares for two important estimation problems, clustering and robust moment estimation. The resulting algorithms offer the same statistical guarantees as the previous best algorithms but have significantly faster running times. Roughly speaking, given a sample of n points in dimension d, our algorithms can exploit order- ℓ moments in time d O ( ℓ ) · n O (1), whereas a naive implementation requires time ( d · n ) O ( ℓ ). Since for the aforementioned applications, the typical sample size is d Θ( ℓ ), our framework improves running times from to d O ( ℓ ).

Details