Author name cluster

Travis Dick

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

22 papers

2 author rows

ICML Conference 2025 Conference Paper

Nearly Optimal Sample Complexity for Learning with Label Proportions

Róbert Busa-Fekete
Travis Dick
Claudio Gentile
Haim Kaplan
Tomer Koren
Uri Stemmer

We investigate Learning from Label Proportions (LLP), a partial information setting where examples in a training set are grouped into bags, and only aggregate label values in each bag are available. Despite the partial observability, the goal is still to achieve small regret at the level of individual examples. We give results on the sample complexity of LLP under square loss, showing that our sample complexity is essentially optimal. From an algorithmic viewpoint, we rely on carefully designed variants of Empirical Risk Minimization, and Stochastic Gradient Descent algorithms, combined with ad hoc variance reduction techniques. On one hand, our theoretical results improve in important ways on the existing literature on LLP, specifically in the way the sample complexity depends on the bag size. On the other hand, we validate our algorithmic solutions on several datasets, demonstrating improved empirical performance (better accuracy for less samples) against recent baselines.

Details

NeurIPS Conference 2025 Conference Paper

Private Set Union with Multiple Contributions

Travis Dick
Haim Kaplan
Alex Kulesza
Uri Stemmer
Ziteng Sun
Ananda Theertha Suresh

In the private set union problem each user owns a bag of at most $k$ items (from some large universe of items), and we are interested in computing the union of the items in the bags of all of the users. This is trivial without privacy, but a differentially private algorithm must be careful about reporting items contained in only a small number of bags. We consider differentially private algorithms that always report a subset of the union, and define the utility of an algorithm to be the expected size of the subset that it reports. Because the achievable utility varies significantly with the dataset, we introduce the *utility ratio*, which normalizes utility by a dataset-specific upper bound and characterizes a mechanism by its lowest normalized utility across all datasets. We then develop algorithms with guaranteed utility ratios and complement them with bounds on the best possible utility ratio. Prior work has shown that a single algorithm can be simultaneously optimal for all datasets when $k=1$, but we show that instance-optimal algorithms do not exist when $k>1$, and characterize how performance degrades as $k$ grows. At the same time, we design a private algorithm that achieves the maximum possible utility, regardless of $k$, when the item histogram matches a prior prediction (for instance, from a previous data release) and degrades gracefully with the $L_\infty$ distance between the prediction and the actual histogram when the prediction is imperfect.

PDF Details

NeurIPS Conference 2024 Conference Paper

Auditing Privacy Mechanisms via Label Inference Attacks

Róbert I. Busa-Fekete
Travis Dick
Claudio Gentile
Andrés M. Medina
Adam Smith
Marika Swanberg

We propose reconstruction advantage measures to audit label privatization mechanisms. A reconstruction advantage measure quantifies the increase in an attacker's ability to infer the true label of an unlabeled example when provided with a private version of the labels in a dataset (e. g. , aggregate of labels from different users or noisy labels output by randomized response), compared to an attacker that only observes the feature vectors, but may have prior knowledge of the correlation between features and labels. We consider two such auditing measures: one additive, and on multiplicative. These cover previous approaches taken in the literature on empirical auditing and differential privacy. These measures allow us to place a variety of proposed privatization schemes---some differentially private, some not---on the same footing. We analyze these measures theoretically under a distributional model which, we claim, encapsulates reasonable adversarial settings. We also quantify their behavior empirically on real and simulated prediction tasks. Across a range of experimental settings, we find that differentially private schemes dominate or match the privacy-utility tradeoff of more heuristic approaches.

PDF Details DOI

NeurIPS Conference 2023 Conference Paper

Better Private Linear Regression Through Better Private Feature Selection

Travis Dick
Jennifer Gillenwater
Matthew Joseph

Existing work on differentially private linear regression typically assumes that end users can precisely set data bounds or algorithmic hyperparameters. End users often struggle to meet these requirements without directly examining the data (and violating privacy). Recent work has attempted to develop solutions that shift these burdens from users to algorithms, but they struggle to provide utility as the feature dimension grows. This work extends these algorithms to higher-dimensional problems by introducing a differentially private feature selection method based on Kendall rank correlation. We prove a utility guarantee for the setting where features are normally distributed and conduct experiments across 25 datasets. We find that adding this private feature selection step before regression significantly broadens the applicability of ``plug-and-play'' private linear regression algorithms at little additional cost to privacy, computation, or decision-making by the end user.

PDF Details

NeurIPS Conference 2023 Conference Paper

Easy Learning from Label Proportions

Róbert Busa-Fekete
Heejin Choi
Travis Dick
Claudio Gentile
Andres Munoz Medina

We consider the problem of Learning from Label Proportions (LLP), a weakly supervised classification setup where instances are grouped into i. i. d. “bags”, and only the frequency of class labels at each bag is available. Albeit, the objective of the learner is to achieve low task loss at an individual instance level. Here we propose EASYLLP, a flexible and simple-to-implement debiasing approach based on aggregate labels, which operates on arbitrary loss functions. Our technique allows us to accurately estimate the expected loss of an arbitrary model at an individual level. We elucidate the differences between our method and standard methods based on label proportion matching, in terms of applicability and optimality conditions. We showcase the flexibility of our approach compared to alternatives by applying our method to popular learning frameworks, like Empirical Risk Minimization (ERM) and Stochastic Gradient Descent (SGD) with provable guarantees on instance level performance. Finally, we validate our theoretical results on multiple datasets, empirically illustrating the conditions under which our algorithm is expected to perform better or worse than previous LLP approaches

PDF Details

ICML Conference 2023 Conference Paper

Learning-augmented private algorithms for multiple quantile release

Mikhail Khodak
Kareem Amin 0002
Travis Dick
Sergei Vassilvitskii

When applying differential privacy to sensitive data, we can often improve performance using external information such as other sensitive data, public data, or human priors. We propose to use the learning-augmented algorithms (or algorithms with predictions) framework—previously applied largely to improve time complexity or competitive ratios—as a powerful way of designing and analyzing privacy-preserving methods that can take advantage of such external information to improve utility. This idea is instantiated on the important task of multiple quantile release, for which we derive error guarantees that scale with a natural measure of prediction quality while (almost) recovering state-of-the-art prediction-independent guarantees. Our analysis enjoys several advantages, including minimal assumptions about the data, a natural way of adding robustness, and the provision of useful surrogate losses for two novel ”meta” algorithms that learn predictions from other (potentially sensitive) data. We conclude with experiments on challenging tasks demonstrating that learning predictions across one or more instances can lead to large error reductions while preserving privacy.

Details

ICML Conference 2023 Conference Paper

Subset-Based Instance Optimality in Private Estimation

Travis Dick
Alex Kulesza
Ziteng Sun
Ananda Theertha Suresh

We propose a new definition of instance optimality for differentially private estimation algorithms. Our definition requires an optimal algorithm to compete, simultaneously for every dataset $D$, with the best private benchmark algorithm that (a) knows $D$ in advance and (b) is evaluated by its worst-case performance on large subsets of $D$. That is, the benchmark algorithm need not perform well when potentially extreme points are added to $D$; it only has to handle the removal of a small number of real data points that already exist. This makes our benchmark significantly stronger than those proposed in prior work. We nevertheless show, for real-valued datasets, how to construct private algorithms that achieve our notion of instance optimality when estimating a broad class of dataset properties, including means, quantiles, and $\ell_p$-norm minimizers. For means in particular, we provide a detailed analysis and show that our algorithm simultaneously matches or exceeds the asymptotic performance of existing algorithms under a range of distributional assumptions.

Details

STOC Conference 2021 Conference Paper

How much data is sufficient to learn high-performing algorithms? generalization guarantees for data-driven algorithm design

Maria-Florina Balcan
Dan F. DeBlasio
Travis Dick
Carl Kingsford
Tuomas Sandholm
Ellen Vitercik

Details

ICLR Conference 2020 Conference Paper

Learning to Link

Maria-Florina Balcan
Travis Dick
Manuel Lang

Clustering is an important part of many modern data analysis pipelines, including network analysis and data retrieval. There are many different clustering algorithms developed by various communities, and it is often not clear which algorithm will give the best performance on a specific clustering task. Similarly, we often have multiple ways to measure distances between data points, and the best clustering performance might require a non-trivial combination of those metrics. In this work, we study data-driven algorithm selection and metric learning for clustering problems, where the goal is to simultaneously learn the best algorithm and metric for a specific application. The family of clustering algorithms we consider is parameterized linkage based procedures that includes single and complete linkage. The family of distance functions we learn over are convex combinations of base distance functions. We design efficient learning algorithms which receive samples from an application-specific distribution over clustering instances and learn a near-optimal distance and clustering algorithm from these classes. We also carry out a comprehensive empirical evaluation of our techniques showing that they can lead to significantly improved clustering performance on real-world datasets.

Details

JMLR Journal 2020 Journal Article

Random Smoothing Might be Unable to Certify $\ell_\infty$ Robustness for High-Dimensional Images

Avrim Blum
Travis Dick
Naren Manoj
Hongyang Zhang

We show a hardness result for random smoothing to achieve certified adversarial robustness against attacks in the $\ell_p$ ball of radius $\epsilon$ when $p>2$. Although random smoothing has been well understood for the $\ell_2$ case using the Gaussian distribution, much remains unknown concerning the existence of a noise distribution that works for the case of $p>2$. This has been posed as an open problem by Cohen et al. (2019) and includes many significant paradigms such as the $\ell_\infty$ threat model. In this work, we show that any noise distribution $\mathcal{D}$ over $\mathbb{R}^d$ that provides $\ell_p$ robustness for all base classifiers with $p>2$ must satisfy $\mathbb{E} \eta_i^2=\Omega(d^{1-2/p}\epsilon^2(1-\delta)/\delta^2)$ for 99% of the features (pixels) of vector $\eta\sim\mathcal{D}$, where $\epsilon$ is the robust radius and $\delta$ is the score gap between the highest-scored class and the runner-up. Therefore, for high-dimensional images with pixel values bounded in $[0,255]$, the required noise will eventually dominate the useful information in the images, leading to trivial smoothed classifiers. [abs] [ pdf ][ bib ] [ code ] &copy JMLR 2020. ( edit, beta )

PDF Details

UAI Conference 2020 Conference Paper

Semi-bandit Optimization in the Dispersed Setting

Travis Dick
Wesley Pegden
Maria-Florina Balcan

The goal of data-driven algorithm design is to obtain high-performing algorithms for specific application domains using machine learning and data. Across many fields in AI, science, and engineering, practitioners will often fix a family of parameterized algorithms and then optimize those parameters to obtain good performance on example instances from the application domain. In the online setting, we must choose algorithm parameters for each instance as they arrive, and our goal is to be competitive with the best fixed algorithm in hindsight.There are two major challenges in online data-driven algorithm design. First, it can be computationally expensive to evaluate the loss functions that map algorithm parameters to performance, which often require the learner to run a combinatorial algorithm to measure its performance. Second, the losses can be extremely volatile and have sharp discontinuities. However, we show that in many applications, evaluating the loss function for one algorithm choice can sometimes reveal the loss for a range of similar algorithms, essentially for free. We develop online optimization algorithms capable of using this kind of extra information by working in the semi-bandit feedback setting. Our algorithms achieve regret bounds that are essentially as good as algorithms under full-information feedback and are significantly more computationally efficient. We apply our semi-bandit results to obtain the first provable guarantees for data-driven algorithm design for linkage-based clustering and we improve the best regret bounds for designing greedy knapsack algorithms.

Details

NeurIPS Conference 2019 Conference Paper

Differentially Private Covariance Estimation

Kareem Amin
Travis Dick
Alex Kulesza
Andres Munoz
Sergei Vassilvitskii

The covariance matrix of a dataset is a fundamental statistic that can be used for calculating optimum regression weights as well as in many other learning and data analysis settings. For datasets containing private user information, we often want to estimate the covariance matrix in a way that preserves differential privacy. While there are known methods for privately computing the covariance matrix, they all have one of two major shortcomings. Some, like the Gaussian mechanism, only guarantee (epsilon, delta)-differential privacy, leaving a non-trivial probability of privacy failure. Others give strong epsilon-differential privacy guarantees, but are impractical, requiring complicated sampling schemes, and tend to perform poorly on real data. In this work we propose a new epsilon-differentially private algorithm for computing the covariance matrix of a dataset that addresses both of these limitations. We show that it has lower error than existing state-of-the-art approaches, both analytically and empirically. In addition, the algorithm is significantly less complicated than other methods and can be efficiently implemented with rejection sampling.

PDF Details

NeurIPS Conference 2019 Conference Paper

Envy-Free Classification

Maria-Florina Balcan
Travis Dick
Ritesh Noothigattu
Ariel Procaccia

In classic fair division problems such as cake cutting and rent division, envy-freeness requires that each individual (weakly) prefer his allocation to anyone else's. On a conceptual level, we argue that envy-freeness also provides a compelling notion of fairness for classification tasks, especially when individuals have heterogeneous preferences. Our technical focus is the generalizability of envy-free classification, i. e. , understanding whether a classifier that is envy free on a sample would be almost envy free with respect to the underlying distribution with high probability. Our main result establishes that a small sample is sufficient to achieve such guarantees, when the classifier in question is a mixture of deterministic classifiers that belong to a family of low Natarajan dimension.

PDF Details

NeurIPS Conference 2018 Conference Paper

Data-Driven Clustering via Parameterized Lloyd's Families

Maria-Florina Balcan
Travis Dick
Colin White

Algorithms for clustering points in metric spaces is a long-studied area of research. Clustering has seen a multitude of work both theoretically, in understanding the approximation guarantees possible for many objective functions such as k-median and k-means clustering, and experimentally, in finding the fastest algorithms and seeding procedures for Lloyd's algorithm. The performance of a given clustering algorithm depends on the specific application at hand, and this may not be known up front. For example, a "typical instance" may vary depending on the application, and different clustering heuristics perform differently depending on the instance. In this paper, we define an infinite family of algorithms generalizing Lloyd's algorithm, with one parameter controlling the the initialization procedure, and another parameter controlling the local search procedure. This family of algorithms includes the celebrated k-means++ algorithm, as well as the classic farthest-first traversal algorithm. We design efficient learning algorithms which receive samples from an application-specific distribution over clustering instances and learn a near-optimal clustering algorithm from the class. We show the best parameters vary significantly across datasets such as MNIST, CIFAR, and mixtures of Gaussians. Our learned algorithms never perform worse than k-means++, and on some datasets we see significant improvements.

PDF Details

FOCS Conference 2018 Conference Paper

Dispersion for Data-Driven Algorithm Design, Online Learning, and Private Optimization

Maria-Florina Balcan
Travis Dick
Ellen Vitercik

A crucial problem in modern data science is data-driven algorithm design, where the goal is to choose the best algorithm, or algorithm parameters, for a specific application domain. In practice, we often optimize over a parametric algorithm family, searching for parameters with high performance on a collection of typical problem instances. While effective in practice, these procedures generally have not come with provable guarantees. A recent line of work initiated by a seminal paper of Gupta and Roughgarden (2017) analyzes application-specific algorithm selection from a theoretical perspective. We progress this research direction in several important settings. We provide upper and lower bounds on regret for algorithm selection in online settings, where problems arrive sequentially and we must choose parameters online. We also consider differentially private algorithm selection, where the goal is to find good parameters for a set of problems without divulging too much sensitive information contained therein. We analyze several important parameterized families of algorithms, including SDP-rounding schemes for problems formulated as integer quadratic programs as well as greedy techniques for several canonical subset selection problems. The cost function that measures an algorithm's performance is often a volatile piecewise Lipschitz function of its parameters, since a small change to the parameters can lead to a cascade of different decisions made by the algorithm. We present general techniques for optimizing the sum or average of piecewise Lipschitz functions when the underlying functions satisfy a sufficient and general condition called dispersion. Intuitively, a set of piecewise Lipschitz functions is dispersed if no small region contains many of the functions' discontinuities. Using dispersion, we improve over the best-known online learning regret bounds for a variety problems, prove regret bounds for problems not previously studied, and provide matching regret lower bounds. In the private optimization setting, we show how to optimize performance while preserving privacy for several important problems, providing matching upper and lower bounds on performance loss due to privacy preservation. Though algorithm selection is our primary motivation, we believe the notion of dispersion may be of independent interest. Therefore, we present our results for the more general problem of optimizing piecewise Lipschitz functions. Finally, we uncover dispersion in domains beyond algorithm selection, namely, auction design and pricing, providing online and privacy guarantees for these problems as well.

Details

ICML Conference 2018 Conference Paper

Learning to Branch

Maria-Florina Balcan
Travis Dick
Tuomas Sandholm
Ellen Vitercik

Tree search algorithms, such as branch-and-bound, are the most widely used tools for solving combinatorial problems. These algorithms recursively partition the search space to find an optimal solution. To keep the tree small, it is crucial to carefully decide, when expanding a tree node, which variable to branch on at that node to partition the remaining space. Many partitioning techniques have been proposed, but no theory describes which is optimal. We show how to use machine learning to determine an optimal weighting of any set of partitioning procedures for the instance distribution at hand using samples. Via theory and experiments, we show that learning to branch is both practical and hugely beneficial.

Details

ICML Conference 2017 Conference Paper

Differentially Private Clustering in High-Dimensional Euclidean Spaces

Maria-Florina Balcan
Travis Dick
Yingyu Liang
Wenlong Mou
Hongyang Zhang 0001

We study the problem of clustering sensitive data while preserving the privacy of individuals represented in the dataset, which has broad applications in practical machine learning and data analysis tasks. Although the problem has been widely studied in the context of low-dimensional, discrete spaces, much remains unknown concerning private clustering in high-dimensional Euclidean spaces $\mathbb{R}^d$. In this work, we give differentially private and efficient algorithms achieving strong guarantees for $k$-means and $k$-median clustering when $d=\Omega(\mathsf{polylog}(n))$. Our algorithm achieves clustering loss at most $\log^3(n)\mathsf{OPT}+\mathsf{poly}(\log n, d, k)$, advancing the state-of-the-art result of $\sqrt{d}\mathsf{OPT}+\mathsf{poly}(\log n, d^d, k^d)$. We also study the case where the data points are $s$-sparse and show that the clustering loss can scale logarithmically with $d$, i. e. , $\log^3(n)\mathsf{OPT}+\mathsf{poly}(\log n, \log d, k, s)$. Experiments on both synthetic and real datasets verify the effectiveness of the proposed method.

Details

AAAI Conference 2017 Conference Paper

Label Efficient Learning by Exploiting Multi-Class Output Codes

Maria Balcan
Travis Dick
Yishay Mansour

We present a new perspective on the popular multi-class algorithmic techniques of one-vs-all and error correcting output codes. Rather than studying the behavior of these techniques for supervised learning, we establish a connection between the success of these methods and the existence of labelefﬁcient learning procedures. We show that in both the realizable and agnostic cases, if output codes are successful at learning from labeled data, they implicitly assume structure on how the classes are related. By making that structure explicit, we design learning algorithms to recover the classes with low label complexity. We provide results for the commonly studied cases of one-vs-all learning and when the codewords of the classes are well separated. We additionally consider the more challenging case where the codewords are not well separated, but satisfy a boundary features condition that captures the natural intuition that every bit of the codewords should be signiﬁcant.

PDF Details

ICML Conference 2014 Conference Paper

Online Learning in Markov Decision Processes with Changing Cost Sequences

Travis Dick
András György 0001
Csaba Szepesvári

In this paper we consider online learning in finite Markov decision processes (MDPs) with changing cost sequences under full and bandit-information. We propose to view this problem as an instance of online linear optimization. We propose two methods for this problem: MD^2 (mirror descent with approximate projections) and the continuous exponential weights algorithm with Dikin walks. We provide a rigorous complexity analysis of these techniques, while providing near-optimal regret-bounds (in particular, we take into account the computational costs of performing approximate projections in MD^2). In the case of full-information feedback, our results complement existing ones. In the case of bandit-information feedback we consider the online stochastic shortest path problem, a special case of the above MDP problems, and manage to improve the existing results by removing the previous restrictive assumption that the state-visitation probabilities are uniformly bounded away from zero under all policies.

Details

RLDM Conference 2013 Conference Abstract

Lunar Lander: A Continous-Action Case Study for Policy-Gradient Actor-Critic Algorithms

Travis Dick
Roshan Shariff

We empirically investigate modifications and implementation techniques required to apply a policy-gradient actor-critic algorithm to reinforcement learning problems with continuous state and action spaces. As a test-bed, we introduce a new simulated task, which involves landing a lunar module in a simpli- fied two-dimensional world. The empirical results demonstrate the importance of efficiently implementing eligibility traces and appropriately weighting the features produced by a tile coder.

PDF Details

RLDM Conference 2013 Conference Abstract

Online Learning in Markov Decision Processes with Changing Reward Sequences

Travis Dick
Andras Gyorgy
Csaba Szepesvari

In this paper we consider online learning in finite Markovian Decision Process with changing reward sequences under full and bandit-information. We propose to view this problem as an instance of on- line linear optimization. We propose two methods for this problem: MD2 (mirror descent with approximate projections) and the continuous exponential weights algorithm with Dikin walks. We provide a rigorous complexity analysis of these techniques, while providing near-optimal regret-bounds. In the case of full- information feedback, our results complement existing results, while in the case of bandit-information feed- back, we manage to improve the dependence of regret significantly by removing the restrictive assumption that the state-visitation probabilities are uniformly bounded away from zero under all policies.

PDF Details

ICRA Conference 2013 Conference Paper

SEPO: Selecting by pointing as an intuitive human-robot command interface

Camilo Perez Quintero
Romeo Tatsambon Fomena
Azad Shademan
Nina Wolleb
Travis Dick
Martin Jägersand

Pointing to indicate direction or position is one of the intuitive communication mechanisms used by humans in all life stages. Our aim is to develop a natural human-robot command interface using pointing gestures for human-robot interaction (HRI). We propose an interface based on the Kinect sensor for selecting by pointing (SEPO) in a 3D real-world situation, where the user points to a target object or location and the interface returns the 3D position coordinates of the target. Through our interface we perform three experiments to study precision and accuracy of human pointing in typical household scenarios: pointing to a “wall”, pointing to a “table”, and pointing to a “floor”. Our results prove that the proposed SEPO interface enables users to point and select objects with an average 3D position accuracy of 9: 6 cm in household situations.

Details