Author name cluster

Chutong Yang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers

2 author rows

STOC Conference 2025 Conference Paper

Omnipredicting Single-Index Models with Multi-index Models

Lunjia Hu
Kevin Tian
Chutong Yang

Recent work on supervised learning defined the notion of omnipredictors , i.e., predictor functions p over features that are simultaneously competitive for minimizing a family of loss functions L against a comparator class C . Omniprediction requires approximating the Bayes-optimal predictor beyond the loss minimization paradigm, and has generated significant interest in the learning theory community. However, even for basic settings such as agnostically learning single-index models (SIMs), existing omnipredictor constructions require impractically-large sample complexities and runtimes, and output complex, highly-improper hypotheses. Our main contribution is a new, simple construction of omnipredictors for SIMs. We give a learner outputting an omnipredictor that is ε-competitive on any matching loss induced by a monotone, Lipschitz link function, when the comparator class is bounded linear predictors. Our algorithm requires ≈ ε −4 samples and runs in nearly-linear time, and its sample complexity improves to ≈ ε −2 if link functions are bi-Lipschitz. This significantly improves upon the only prior known construction, which used ≳ ε −10 samples. We achieve our construction via a new, sharp analysis of the classical Isotron algorithm in the challenging agnostic learning setting, of potential independent interest. Previously, Isotron was known to properly learn SIMs in the realizable setting, as well as constant-factor competitive hypotheses under the squared loss. As they are based on Isotron, our omnipredictors are multi-index models with ≈ ε −2 prediction heads, bringing us closer to the tantalizing goal of proper omniprediction for general loss families and comparators.

Details

NeurIPS Conference 2025 Conference Paper

Private Geometric Median in Nearly-Linear Time

Syamantak Kumar
Daogao Liu
Kevin Tian
Chutong Yang

Estimating the geometric median of a dataset is a robust counterpart to mean estimation, and is a fundamental problem in computational geometry. Recently, [HSU24] gave an $(\epsilon, \delta)$-differentially private algorithm obtaining an $\alpha$-multiplicative approximation to the geometric median objective, $\frac 1 n \sum_{i \in [n]} \|\cdot - \mathbf{x}_i\|$, given a dataset $D$ of $x_i$ for $i \in [n]$. Their algorithm requires $n \gtrsim \sqrt d \cdot \frac 1 {\alpha\epsilon}$ samples, which they prove is information-theoretically optimal. This result is surprising because its error scales with the effective radius of $D$ (i. e. , of a ball capturing most points), rather than the worst-case radius. We give an improved algorithm that obtains the same approximation quality, also using $n \gtrsim \sqrt d \cdot \frac 1 {\alpha\epsilon}$ samples, but in time $\widetilde{O}(nd + \frac d {\alpha^2})$. Our runtime is nearly-linear, plus the cost of the cheapest non-private first-order method due to [CLMPS16]. To achieve our results, we use subsampling and geometric aggregation tools inspired by FriendlyCore [TCKMS22] to speed up the "warm start" component of the [HSU24] algorithm, combined with a careful custom analysis of DP-SGD's sensitivity for the geometric median objective.

PDF Details

NeurIPS Conference 2024 Conference Paper

Testing Calibration in Nearly-Linear Time

Lunjia Hu
Arun Jambulapati
Kevin Tian
Chutong Yang

In the recent literature on machine learning and decision making, calibration has emerged as a desirable and widely-studied statistical property of the outputs of binary prediction models. However, the algorithmic aspects of measuring model calibration have remained relatively less well-explored. Motivated by Blasiok et al '23, which proposed a rigorous framework for measuring distances to calibration, we initiate the algorithmic study of calibration through the lens of property testing. We define the problem of calibration testing from samples where given $n$ draws from a distribution $\mathcal{D}$ on $(\text{predictions}, \text{binary outcomes})$, our goal is to distinguish between the cases where $\mathcal{D}$ is perfectly calibrated or $\epsilon$-far from calibration. We make the simple observation that the empirical smooth calibration linear program can be reformulated as an instance of minimum-cost flow on a highly-structured graph, and design an exact dynamic programming-based solver for it which runs in time $O(n\log^2(n))$, and solves the calibration testing problem information-theoretically optimally in the same time. This improves upon state-of-the-art black-box linear program solvers requiring $\Omega(n^\omega)$ time, where $\omega > 2$ is the exponent of matrix multiplication. We also develop algorithms for tolerant variants of our testing problem improving upon black-box linear program solvers, and give sample complexity lower bounds for alternative calibration measures to the one considered in this work. Finally, we present experiments showing the testing problem we define faithfully captures standard notions of calibration, and that our algorithms scale efficiently to accommodate large sample sizes.

PDF Details DOI

ICML Conference 2023 Conference Paper

Omnipredictors for Constrained Optimization

Lunjia Hu
Inbal Livni Navon
Omer Reingold
Chutong Yang

The notion of omnipredictors (Gopalan, Kalai, Reingold, Sharan and Wieder ITCS 2022), suggested a new paradigm for loss minimization. Rather than learning a predictor based on a known loss function, omnipredictors can easily be post-processed to minimize any one of a rich family of loss functions compared with the loss of hypotheses in a class $\mathcal C$. It has been shown that such omnipredictors exist and are implied (for all convex and Lipschitz loss functions) by the notion of multicalibration from the algorithmic fairness literature. In this paper, we introduce omnipredictors for constrained optimization and study their complexity and implications. The notion that we introduce allows the learner to be unaware of the loss function that will be later assigned as well as the constraints that will be later imposed, as long as the subpopulations that are used to define these constraints are known. We show how to obtain omnipredictors for constrained optimization problems, relying on appropriate variants of multicalibration. We also investigate the implications of this notion when the constraints used are so-called group fairness notions.

Details

NeurIPS Conference 2022 Conference Paper

Active Learning Polynomial Threshold Functions

Omri Ben-Eliezer
Max Hopkins
Chutong Yang
Hantao Yu

We initiate the study of active learning polynomial threshold functions (PTFs). While traditional lower bounds imply that even univariate quadratics cannot be non-trivially actively learned, we show that allowing the learner basic access to the derivatives of the underlying classifier circumvents this issue and leads to a computationally efficient algorithm for active learning degree-$d$ univariate PTFs in $\tilde{O}(d^3\log(1/\varepsilon\delta))$ queries. We extend this result to the batch active setting, providing a smooth transition between query complexity and rounds of adaptivity, and also provide near-optimal algorithms for active learning PTFs in several average case settings. Finally, we prove that access to derivatives is insufficient for active learning multivariate PTFs, even those of just two variables.

PDF Details