Author name cluster

Arsen Vasilyan

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

11 papers

2 author rows

SODA Conference 2025 Conference Paper

Local Lipschitz Filters for Bounded-Range Functions with Applications to Arbitrary Real-Valued Functions

Jane Lange
Ephraim Linder
Sofya Raskhodnikova
Arsen Vasilyan

Details

NeurIPS Conference 2025 Conference Paper

Robust learning of halfspaces under log-concave marginals

Jane Lange
Arsen Vasilyan

We say that a classifier is $\text{\emph{adversarially robust}}$ to perturbations of norm $r$ if, with high probability over a point $x$ drawn from the input distribution, there is no point within distance $\le r$ from $x$ that is classified differently. The $\text{\emph{boundary volume}}$ is the probability that a point falls within distance $r$ of a point with a different label. This work studies the task of learning a hypothesis with small boundary volume, where the input is distributed as a subgaussian isotropic log-concave distribution over $\mathbb{R}^d$. Linear threshold functions are adversarially robust; they have boundary volume proportional to $r$. Such concept classes are efficiently learnable by polynomial regression, which produces a polynomial threshold function (PTF), but PTFs in general may have boundary volume $\Omega(1)$, even for $r \ll 1$. We give an algorithm that agnostically learns linear threshold functions and returns a classfier with boundary volume $O(r+\varepsilon)$ at radius of perturbation $r$. The time and sample complexity of $d^{\tilde{O}(1/\varepsilon^2)}$ matches the complexity of polynomial regression. Our algorithm augments the classic approach of polynomial regression with three additional steps: \ $\quad$ a) performing the $\ell_1$-error regression under $\ell_1$ noise sensitivity constraints, \ $\quad$ b) a structured partitioning and rounding step that returns a Boolean classifier with error $\mathrm{opt} + O(\varepsilon)$ and noise sensitivity $O(r+\varepsilon)$ simultaneously, and \ $\quad c)$ a local corrector that ``smooths'' a function with low noise sensitivity into a function that is adversarially robust.

PDF Details

NeurIPS Conference 2025 Conference Paper

The Power of Iterative Filtering for Supervised Learning with (Heavy) Contamination

Adam Klivans
Konstantinos Stavropoulos
Kevin Tian
Arsen Vasilyan

Inspired by recent work on learning with distribution shift, we give a general outlier removal algorithm called *iterative polynomial filtering* and show a number of striking applications for supervised learning with contamination: (1) We show that any function class that can be approximated by low-degree polynomials with respect to a hypercontractive distribution can be efficiently learned under bounded contamination (also known as *nasty noise*). This is a surprising resolution to a longstanding gap between the complexity of agnostic learning and learning with contamination, as it was widely believed that low-degree approximators only implied tolerance to label noise. (2) For any function class that admits the (stronger) notion of sandwiching approximators, we obtain near-optimal learning guarantees even with respect to heavy additive contamination, where far more than $1/2$ of the training set may be added adversarially. Prior related work held only for regression and in a list-decodable setting. (3) We obtain the first efficient algorithms for tolerant testable learning of functions of halfspaces with respect to any fixed log-concave distribution. Even the non-tolerant case for a single halfspace in this setting had remained open. These results significantly advance our understanding of efficient supervised learning under contamination, a setting that has been much less studied than its unsupervised counterpart.

PDF Details

ICLR Conference 2024 Conference Paper

An Efficient Tester-Learner for Halfspaces

Aravind Gollakota
Adam R. Klivans
Konstantinos Stavropoulos
Arsen Vasilyan

We give the first efficient algorithm for learning halfspaces in the testable learning model recently defined by Rubinfeld and Vasilyan [2022]. In this model, a learner certifies that the accuracy of its output hypothesis is near optimal whenever the training set passes an associated test, and training sets drawn from some target distribution must pass the test. This model is more challenging than distribution-specific agnostic or Massart noise models where the learner is allowed to fail arbitrarily if the distributional assumption does not hold. We consider the setting where the target distribution is the standard Gaussian in $d$ dimensions and the label noise is either Massart or adversarial (agnostic). For Massart noise, our tester-learner runs in polynomial time and outputs a hypothesis with (information-theoretically optimal) error $\mathrm{opt}+\epsilon$ (and extends to any fixed strongly log-concave target distribution). For adversarial noise, our tester-learner obtains error $O(\mathrm{opt})+\epsilon$ in polynomial time. Prior work on testable learning ignores the labels in the training set and checks that the empirical moments of the covariates are close to the moments of the base distribution. Here we develop new tests of independent interest that make critical use of the labels and combine them with the moment-matching approach of Gollakota et al. [2022]. This enables us to implement a testable variant of the algorithm of Diakonikolas et al. [2020a, 2020b] for learning noisy halfspaces using nonconvex SGD.

Details

NeurIPS Conference 2024 Conference Paper

Efficient Discrepancy Testing for Learning with Distribution Shift

Gautam Chandrasekaran
Adam R. Klivans
Vasilis Kontonis
Konstantinos Stavropoulos
Arsen Vasilyan

A fundamental notion of distance between train and test distributions from the field of domain adaptation is discrepancy distance. While in general hard to compute, here we provide the first set of provably efficient algorithms for testing localized discrepancy distance, where discrepancy is computed with respect to a fixed output classifier. These results imply a broad set of new, efficient learning algorithms in the recently introduced model of Testable Learning with Distribution Shift (TDS learning) due to Klivans et al. (2023). Our approach generalizes and improves all prior work on TDS learning: (1) we obtain universal learners that succeed simultaneously for large classes of test distributions, (2) achieve near-optimal error rates, and (3) give exponential improvements for constant depth circuits. Our methods further extend to semi-parametric settings and imply the first positive results for low-dimensional convex sets. Additionally, we separate learning and testing phases and obtain algorithms that run in fully polynomial time at test time.

PDF Details DOI

NeurIPS Conference 2024 Conference Paper

Plant-and-Steal: Truthful Fair Allocations via Predictions

Ilan R. Cohen
Alon Eden
Talya Eden
Arsen Vasilyan

We study truthful mechanisms for approximating the Maximin-Share (MMS) allocation of agents with additive valuations for indivisible goods. Algorithmically, constant factor approximations exist for the problem for any number of agents. When adding incentives to the mix, a jarring result by Amanatidis, Birmpas, Christodoulou, and Markakis [EC 2017] shows that the best possible approximation for two agents and $m$ items is $\lfloor \frac{m}{2} \rfloor$. We adopt a learning-augmented framework to investigate what is possible when some prediction on the input is given. For two agents, we give a truthful mechanism that takes agents' ordering over items as prediction. When the prediction is accurate, we give a $2$-approximation to the MMS (consistency), and when the prediction is off, we still get an $\lceil \frac{m}{2} \rceil$-approximation to the MMS (robustness). We further show that the mechanism's performance degrades gracefully in the number of ``mistakes" in the prediction; i. e. , we interpolate (up to constant factors) between the two extremes: when there are no mistakes, and when there is a maximum number of mistakes. We also show an impossibility result on the obtainable consistency for mechanisms with finite robustness. For the general case of $n\ge 2$ agents, we give a 2-approximation mechanism for accurate predictions, with relaxed fallback guarantees. Finally, we give experimental results which illustrate when different components of our framework, made to insure consistency and robustness, come into play.

PDF Details DOI

NeurIPS Conference 2024 Conference Paper

Tolerant Algorithms for Learning with Arbitrary Covariate Shift

Surbhi Goel
Abhishek Shetty
Konstantinos Stavropoulos
Arsen Vasilyan

We study the problem of learning under arbitrary distribution shift, where the learner is trained on a labeled set from one distribution but evaluated on a different, potentially adversarially generated test distribution. We focus on two frameworks: PQ learning [GKKM'20], allowing abstention on adversarially generated parts of the test distribution, and TDS learning [KSV'23], permitting abstention on the entire test distribution if distribution shift is detected. All prior known algorithms either rely on learning primitives that are computationally hard even for simple function classes, or end up abstaining entirely even in the presence of a tiny amount of distribution shift. We address both these challenges for natural function classes, including intersections of halfspaces and decision trees, and standard training distributions, including Gaussians. For PQ learning, we give efficient learning algorithms, while for TDS learning, our algorithms can tolerate moderate amounts of distribution shift. At the core of our approach is an improved analysis of spectral outlier-removal techniques from learning with nasty noise. Our analysis can (1) handle arbitrarily large fraction of outliers, which is crucial for handling arbitrary distribution shifts, and (2) obtain stronger bounds on polynomial moments of the distribution after outlier removal, yielding new insights into polynomial regression under distribution shifts. Lastly, our techniques lead to novel results for tolerant testable learning [RV'23], and learning with nasty noise.

PDF Details DOI

FOCS Conference 2023 Conference Paper

Agnostic proper learning of monotone functions: beyond the black-box correction barrier

Jane Lange
Arsen Vasilyan

We give the first agnostic, efficient, proper learning algorithm for monotone Boolean functions. Given $2^{\tilde{O}(\sqrt{n} / \varepsilon)}$ uniformly random examples of an unknown function $f: \{ \pm 1\}^{n} \rightarrow\{ \pm 1\}$, our algorithm outputs a hypothesis $g: \{ \pm 1\}^{n} \rightarrow\{ \pm 1\}$ that is monotone and (opt $+\varepsilon$)-close to f, where opt is the distance from f to the closest monotone function. The running time of the algorithm (and consequently the size and evaluation time of the hypothesis) is also $2^{\tilde{O}(\sqrt{n} / \varepsilon)}$, nearly matching the lower bound of [13]. We also give an algorithm for estimating up to additive error $\varepsilon$ the distance of an unknown function f to monotone using a run-time of $2^{\tilde{O}(\sqrt{n} / \varepsilon)}$. Previously, for both of these problems, sample-efficient algorithms were known, but these algorithms were not run-time efficient. Our work thus closes this gap in our knowledge between the run-time and sample complexity. This work builds upon the improper learning algorithm of [17] and the proper semiagnostic learning algorithm of [40], which obtains a non-monotone Boolean-valued hypothesis, then “corrects” it to monotone using query-efficient local computation algorithms on graphs. This black-box correction approach can achieve no error better than 2 opt $+\varepsilon$ information-theoretically; we bypass this barrier bya)augmenting the improper learner with a convex optimization step, andb)learning and correcting a real-valued function before rounding its values to Boolean. Our real-valued correction algorithm solves the “poset sorting” problem of [40] for functions over general posets with non-Boolean labels.

Details

NeurIPS Conference 2023 Conference Paper

Tester-Learners for Halfspaces: Universal Algorithms

Aravind Gollakota
Adam Klivans
Konstantinos Stavropoulos
Arsen Vasilyan

We give the first tester-learner for halfspaces that succeeds universally over a wide class of structured distributions. Our universal tester-learner runs in fully polynomial time and has the following guarantee: the learner achieves error $O(\mathrm{opt}) + \epsilon$ on any labeled distribution that the tester accepts, and moreover, the tester accepts whenever the marginal is any distribution that satisfies a Poincare inequality. In contrast to prior work on testable learning, our tester is not tailored to any single target distribution but rather succeeds for an entire target class of distributions. The class of Poincare distributions includes all strongly log-concave distributions, and, assuming the Kannan--Lovasz--Simonovits (KLS) conjecture, includes all log-concave distributions. In the special case where the label noise is known to be Massart, our tester-learner achieves error $\mathrm{opt} + \epsilon$ while accepting all log-concave distributions unconditionally (without assuming KLS). Our tests rely on checking hypercontractivity of the unknown distribution using a sum-of-squares (SOS) program, and crucially make use of the fact that Poincare distributions are certifiably hypercontractive in the SOS framework.

PDF Details

STOC Conference 2023 Conference Paper

Testing Distributional Assumptions of Learning Algorithms

Ronitt Rubinfeld
Arsen Vasilyan

Details

FOCS Conference 2022 Conference Paper

Properly learning monotone functions via local correction

Jane Lange
Ronitt Rubinfeld
Arsen Vasilyan

We give a $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$-time algorithm for properly learning monotone Boolean functions under the uniform distribution over $\{0, 1\}^{n}$. Our algorithm is robust to adversarial label noise and has a running time nearly matching that of the state-of-the-art improper learning algorithm of Bshouty and Tamon (JACM 96) and an information-theoretic lower bound of Blais et al (RANDOM ’15). Prior to this work, no proper learning algorithm with running time smaller than $2^{\Omega(n)}$ was known to exist. The core of our proper learner is a local computation algorithm for sorting binary labels on a poset. Our algorithm is built on a body of work on distributed greedy graph algorithms; specifically we rely on a recent work of Ghaffari (FOCS’22), which gives an efficient algorithm for computing maximal matchings in a graph in the LCA model of Rubinfeld et al and Alon et al (ICS’II, SODA’12). The applications of our local sorting algorithm extend beyond learning on the Boolean cube: we also give a tolerant tester for Boolean functions over general posets that distinguishes functions that are $\varepsilon$/3-close to monotone from those that are $\varepsilon-$far. Previous tolerant testers for the Boolean cube only distinguished between $\varepsilon/\Omega(\sqrt{n}$)-close and $\varepsilon-$far.

Details