Arrow Research search

Author name cluster

Brian D. Ziebart

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

19 papers
2 author rows

Possible papers

19

IROS 2023 Conference Paper

Robot Learning to Mop Like Humans Using Video Demonstrations

  • Sanket Gaurav
  • Aaron Crookes
  • David Hoying
  • Vignesh Narayanaswamy
  • Harish Venkataraman
  • Matthew Barker
  • Venugopal Vasudevan
  • Brian D. Ziebart

Though mopping the floor is a mundane and tedious daily task, enabling robots to perform it comparably to humans remains a challenge. Hand-coding desired mopping behaviors for variable surfaces and situations is particularly difficult. In this paper, we develop a robotic system for mopping the floor by mimicking the human behavior demonstrated in videos. Our baseline robotic system uses traditional computer vision techniques for tracking and inverse kinematics. Our proposed robot mop learning system combines advanced computer vision techniques, a Time Contrastive Network (TCN), and reinforcement learning. Using these, we devise a reward function for the mopping task. We use a Universal Robots UR10e robotic arm with an attached mop to perform the mopping task and a first-person camera mounted on top of the robotic arm to provide feedback for robotic learning. We evaluate our proposed robot mop learning system's imitative similarity using optical flow, distance in mop location, and force applied to the floor, as well as its cleaning efficiency using a white-glove method.

ICML 2023 Conference Paper

Superhuman Fairness

  • Omid Memarrast
  • Linh Vu
  • Brian D. Ziebart

The fairness of machine learning-based decisions has become an increasingly important focus in the design of supervised machine learning methods. Most fairness approaches optimize a specified trade-off between performance measure(s) (e.g., accuracy, log loss, or AUC) and fairness metric(s) (e.g., demographic parity, equalized odds). This raises the question: are the right performance-fairness trade-offs being specified? We instead re-cast fair machine learning as an imitation learning task by introducing superhuman fairness, which seeks to simultaneously outperform human decisions on multiple predictive performance and fairness measures. We demonstrate the benefits of this approach given suboptimal decisions.
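As a concrete instance of one fairness metric the abstract names, a minimal sketch (with made-up predictions and group labels) of the demographic parity gap, i.e. the difference in positive-prediction rates between two groups:

```python
# Demographic parity gap: |P(pred=1 | group a) - P(pred=1 | group b)|.
# Predictions and group labels below are invented for illustration.
def dp_gap(preds, groups):
    rate = lambda g: sum(p for p, gr in zip(preds, groups) if gr == g) / groups.count(g)
    return abs(rate("a") - rate("b"))

gap = dp_gap([1, 0, 1, 1, 0, 0], ["a", "a", "a", "b", "b", "b"])
```

Here group "a" receives positive predictions at rate 2/3 and group "b" at rate 1/3, so the gap is 1/3; a perfectly parity-fair predictor would have a gap of zero.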

ICML 2022 Conference Paper

Towards Uniformly Superhuman Autonomy via Subdominance Minimization

  • Brian D. Ziebart
  • Sanjiban Choudhury
  • Xinyan Yan
  • Paul Vernaza

Prevalent imitation learning methods seek to produce behavior that matches or exceeds average human performance. This often prevents achieving expert-level or superhuman performance when identifying the better demonstrations to imitate is difficult. We instead assume demonstrations are of varying quality and seek to induce behavior that is unambiguously better (i.e., Pareto dominant or minimally subdominant) than all human demonstrations. Our minimum subdominance inverse optimal control training objective is primarily defined by high-quality demonstrations; lower-quality demonstrations, which are more easily dominated, are effectively ignored instead of degrading imitation. With increasing probability, our approach produces superhuman behavior incurring lower cost than demonstrations on the demonstrator's unknown cost function, even if that cost function differs for each demonstration. We apply our approach to a computer cursor pointing task, producing behavior that is 78% superhuman, while minimizing demonstration suboptimality provides 50% superhuman behavior, and only 72% even after selective data cleaning.
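The notion of being "unambiguously better" than every demonstration can be sketched as a Pareto-dominance check over per-feature cost vectors (illustrative numbers, not the paper's subdominance objective):

```python
# Pareto dominance over cost vectors: a dominates b if a is no worse on
# every cost feature and strictly better on at least one. All numbers
# below are invented for illustration.
def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

demos = [(5.0, 2.0), (3.0, 4.0), (6.0, 6.0)]   # demonstration cost vectors
candidate = (2.5, 1.5)                          # induced behavior's costs
superhuman = all(dominates(candidate, d) for d in demos)
```

Because the candidate's costs are lower on both features than every demonstration's, it is "superhuman" in the abstract's Pareto sense; note that the demonstrations themselves are mutually non-dominating.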

AAAI 2021 Conference Paper

Robust Fairness Under Covariate Shift

  • Ashkan Rezaei
  • Anqi Liu
  • Omid Memarrast
  • Brian D. Ziebart

Making predictions that are fair with regard to protected attributes (race, gender, age, etc.) has become an important requirement for classification algorithms. Existing techniques derive a fair model from sampled labeled data relying on the assumption that training and testing data are identically and independently drawn (iid) from the same distribution. In practice, distribution shift can and does occur between training and testing datasets as the characteristics of individuals interacting with the machine learning system change. We investigate fairness under covariate shift, a relaxation of the iid assumption in which the inputs or covariates change while the conditional label distribution remains the same. We seek fair decisions under these assumptions on target data with unknown labels. We propose an approach that obtains the predictor that is robust to the worst-case testing performance while satisfying target fairness requirements and matching statistical properties of the source data. We demonstrate the benefits of our approach on benchmark prediction tasks.
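For background, the classical importance-weighting treatment of covariate shift (a standard textbook technique, not the robust approach this paper proposes) reweights source examples by the density ratio so that weighted source averages estimate target expectations; a toy sketch with invented densities:

```python
# Under covariate shift only p(x) changes, so E_target[f] can be estimated
# from source samples weighted by p_t(x) / p_s(x). Densities are invented.
def weighted_mean(values, src_density, tgt_density):
    w = [t / s for s, t in zip(src_density, tgt_density)]
    return sum(wi * v for wi, v in zip(w, values)) / sum(w)

m = weighted_mean([1.0, 2.0, 3.0],   # quantity of interest per example
                  [0.5, 0.3, 0.2],   # source density p_s(x) at each x
                  [0.2, 0.3, 0.5])   # target density p_t(x) at each x
```

The target distribution puts more mass on the example with value 3.0, so the weighted mean (about 2.54) is pulled above the unweighted source mean of 2.0.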

UAI 2020 Conference Paper

Adversarial Learning for 3D Matching

  • Wei Xing
  • Brian D. Ziebart

Structured prediction of objects in spaces that are inherently difficult to search or compactly characterize is a particularly challenging task. For example, though bipartite matchings in two dimensions can be tractably optimized and learned, the higher-dimensional generalization, 3D matching, is NP-hard to solve optimally, and the set of potential solutions cannot be compactly characterized. Though approximation is therefore necessary, prevalent structured prediction methods inherit the weaknesses they possess in the two-dimensional setting, suffering either from inconsistency or from intractability, even when the approximations are sufficient. In this paper, we explore extending an adversarial approach to learning bipartite matchings, which avoids these weaknesses, to the three-dimensional setting. We assess the benefits compared to margin-based methods on a three-frame tracking problem.

ICML 2019 Conference Paper

Active Learning for Probabilistic Structured Prediction of Cuts and Matchings

  • Sima Behpour
  • Anqi Liu 0001
  • Brian D. Ziebart

Active learning methods, like uncertainty sampling, combined with probabilistic prediction techniques have achieved success in various problems like image classification and text classification. For more complex multivariate prediction tasks, the relationships between labels play an important role in designing structured classifiers with better performance. However, computational time complexity limits prevalent probabilistic methods from effectively supporting active learning. Specifically, while non-probabilistic methods based on structured support vector machines can be tractably applied to predicting cuts and bipartite matchings, conditional random fields are intractable for these structures. We propose an adversarial approach for active learning with structured prediction domains that is tractable for cuts and matchings. We evaluate this approach algorithmically on two important structured prediction problems: multi-label classification and object tracking in videos. We demonstrate better accuracy and computational efficiency for our proposed method.
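The uncertainty-sampling baseline the abstract mentions can be sketched in a few lines (hypothetical unlabeled pool and predicted probabilities): query the example whose predicted label distribution has maximum entropy.

```python
# Uncertainty sampling in its simplest form: among unlabeled examples,
# request a label for the one the current model is least sure about.
# Pool contents and predicted probabilities are invented for illustration.
import math

def entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0)

pool = {"x1": [0.9, 0.1],   # model fairly confident
        "x2": [0.5, 0.5],   # maximally uncertain
        "x3": [0.7, 0.3]}

query = max(pool, key=lambda k: entropy(pool[k]))
```

For structured outputs like cuts and matchings, the hard part, and the paper's focus, is computing such uncertainty tractably over exponentially many joint labelings rather than over a handful of classes.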

ICML 2018 Conference Paper

Efficient and Consistent Adversarial Bipartite Matching

  • Rizal Fathony
  • Sima Behpour
  • Xinhua Zhang
  • Brian D. Ziebart

Many important structured prediction problems, including learning to rank items, correspondence-based natural language processing, and multi-object tracking, can be formulated as weighted bipartite matching optimizations. Existing structured prediction approaches have significant drawbacks when applied under the constraints of perfect bipartite matchings. Exponential family probabilistic models, such as the conditional random field (CRF), provide statistical consistency guarantees, but suffer computationally from the need to compute the normalization term of their distribution over matchings, which is a #P-hard matrix permanent computation. In contrast, the structured support vector machine (SSVM) provides computational efficiency, but lacks Fisher consistency, meaning that there are distributions of data for which it cannot learn the optimal matching even under ideal learning conditions (i.e., given the true distribution and selecting from all measurable potential functions). We propose adversarial bipartite matching to avoid both of these limitations. We develop this approach algorithmically, establish its computational efficiency and Fisher consistency properties, and apply it to matching problems that demonstrate its empirical benefits.
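The computational contrast the abstract draws can be seen on a toy cost matrix (illustrative numbers, not the paper's adversarial method): the best perfect matching is easy to find, but the CRF-style normalizer over all matchings is a matrix permanent with n! terms.

```python
# For a small n x n cost matrix, enumerate all perfect bipartite matchings
# (row i assigned to column p[i]). Finding the min-cost matching scales
# polynomially in general (e.g. the Hungarian algorithm), but the CRF
# normalizer below is the permanent of exp(-C): n! terms, #P-hard.
from itertools import permutations
import math

C = [[4.0, 1.0, 3.0],
     [2.0, 0.0, 5.0],
     [3.0, 2.0, 2.0]]
n = len(C)

def cost(p):
    return sum(C[i][p[i]] for i in range(n))

best = min(permutations(range(n)), key=cost)             # min-cost matching
Z = sum(math.exp(-cost(p)) for p in permutations(range(n)))  # permanent of exp(-C)
prob_best = math.exp(-cost(best)) / Z                    # CRF-style probability
```

Brute force is fine here at n = 3 (six matchings), but Z grows as n!, which is exactly why exact CRF normalization over matchings is intractable.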

ICRA 2017 Conference Paper

Goal-predictive robotic teleoperation from noisy sensors

  • Christopher Schultz
  • Sanket Gaurav
  • Mathew Monfort
  • Lingfei Zhang
  • Brian D. Ziebart

Robotic teleoperation from a human operator's pose demonstrations provides an intuitive and effective means of control that has been made feasible by improvements in sensor technologies in recent years. However, the imprecision of low-cost depth cameras and the difficulty of calibrating a frame of reference for the operator introduce inefficiencies in this process when performing tasks that require interactions with objects in the robot's workspace. We develop a goal-predictive teleoperation system that aids in “de-noising” the controls of the operator to be more goal-directed. Our approach uses inverse optimal control to predict the intended object of interaction from the current motion trajectory in real time and then adapts the degree of autonomy between the operator's demonstrations and autonomous completion of the predicted task. We evaluate our approach using the Microsoft Kinect depth camera as our input sensor to control a Rethink Robotics Baxter robot.

UAI 2016 Conference Paper

Adversarial Inverse Optimal Control for General Imitation Learning Losses and Embodiment Transfer

  • Xiangli Chen
  • Mathew Monfort
  • Brian D. Ziebart
  • Peter Carr

We develop a general framework for inverse optimal control that distinguishes between rationalizing demonstrated behavior and imitating inductively inferred behavior. This enables learning for more general imitative evaluation measures and differences between the capabilities of the demonstrator and those of the learner (i.e., differences in embodiment). Our formulation takes the form of a zero-sum game between a learner attempting to minimize an imitative loss measure, and an adversary attempting to maximize the loss by approximating the demonstrated examples in limited ways. We establish the consistency and generalization guarantees of this approach and illustrate its benefits on real and synthetic imitation learning tasks.
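The zero-sum structure can be illustrated on a tiny payoff matrix (invented values, pure strategies only; the paper's game is over loss measures and constrained approximations of the demonstrations, not a 2x2 matrix):

```python
# Entry L[i][j] is the loss when the learner plays row i and the adversary
# plays column j; the learner minimizes, the adversary maximizes.
L = [[1.0, 4.0],
     [3.0, 2.0]]

# Learner's pure-strategy guarantee: pick the row whose worst column is best.
learner_minimax = min(max(row) for row in L)
# Adversary's pure-strategy guarantee: pick the column whose best-for-learner
# row is still as costly as possible.
adversary_maximin = max(min(L[i][j] for i in range(2)) for j in range(2))
```

Here the pure-strategy guarantees bracket the game value (2 and 3); with mixed strategies the two sides' values coincide by von Neumann's minimax theorem, which is what makes the adversarial formulation well defined.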

IJCAI 2016 Conference Paper

Adversarial Sequence Tagging

  • Jia Li
  • Kaiser Asif
  • Hong Wang
  • Brian D. Ziebart
  • Tanya Berger-Wolf

Providing sequence taggings that minimize Hamming loss is a challenging, but important, task. Directly minimizing this loss over a training sample is generally an NP-hard problem. Instead, existing sequence tagging methods minimize a convex upper bound on the Hamming loss. Unfortunately, this often leads either to inconsistent predictors (e.g., max-margin methods) or to predictions that are mismatched with the Hamming loss (e.g., conditional random fields). We present adversarial sequence tagging, a consistent structured prediction framework for minimizing Hamming loss that treats uncertainty pessimistically. Our approach pessimistically approximates the training data, yielding an adversarial game between the sequence tag predictor and the sequence labeler. We demonstrate the benefits of the approach on activity recognition and information extraction/segmentation tasks.
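The Hamming loss at the center of the abstract is simply the per-position disagreement rate between a predicted and a true tag sequence; a minimal sketch (invented tags):

```python
# Hamming loss over tag sequences: fraction of positions where the
# predicted tag differs from the gold tag.
def hamming_loss(pred, gold):
    assert len(pred) == len(gold)
    return sum(p != g for p, g in zip(pred, gold)) / len(gold)

loss = hamming_loss(["B", "I", "O", "O"], ["B", "O", "O", "I"])
```

Computing the loss for one pair of sequences is trivial; the hardness the abstract refers to is in *training*, i.e. finding a predictor that minimizes this loss over a sample, which couples decisions across all positions.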

UAI 2015 Conference Paper

Adversarial Cost-Sensitive Classification

  • Kaiser Asif
  • Wei Xing
  • Sima Behpour
  • Brian D. Ziebart

In many classification settings, mistakes incur different application-dependent penalties based on the predicted and actual class labels. Cost-sensitive classifiers minimizing these penalties are needed. We propose a robust minimax approach for producing classifiers that directly minimize the cost of mistakes as a convex optimization problem. This is in contrast to previous methods that minimize the empirical risk using a convex surrogate for the cost of mistakes, since minimizing the empirical risk of the actual cost-sensitive loss is generally intractable. By treating properties of the training data as uncertain, our approach avoids these computational difficulties. We develop theory and algorithms for our approach and demonstrate its benefits on cost-sensitive classification tasks.
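For context, the basic cost-sensitive decision rule (a textbook construction with hypothetical numbers, not the paper's minimax formulation) predicts the label minimizing expected cost under the class probabilities, which can differ from the most probable label:

```python
# Cost-sensitive prediction: given class probabilities p and a cost matrix
# C[i][j] (penalty for predicting j when the truth is i), choose the label
# with the lowest expected cost. Numbers are invented for illustration.
p = [0.5, 0.3, 0.2]
C = [[0, 1, 4],
     [2, 0, 1],
     [6, 3, 0]]

def expected_cost(j):
    return sum(p[i] * C[i][j] for i in range(len(p)))

pred = min(range(len(p)), key=expected_cost)
```

Here the most probable class is 0, but its expected cost (1.8) exceeds that of class 1 (1.1), so the cost-sensitive prediction is class 1.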

AAMAS 2011 Conference Paper

Maximum Causal Entropy Correlated Equilibria for Markov Games

  • Brian D. Ziebart
  • J. Andrew Bagnell
  • Anind K. Dey

Motivated by a machine learning perspective in which game-theoretic equilibrium constraints serve as guidelines for predicting agents' strategies, we introduce maximum causal entropy correlated equilibria (MCECE), a novel solution concept for general-sum Markov games. In line with this perspective, an MCECE strategy profile is a uniquely defined joint probability distribution over actions for each game state that minimizes the worst-case prediction of agents' actions under log-loss. Equivalently, it maximizes the worst-case growth rate for gambling on the sequences of agents' joint actions under uniform odds. We present a convex optimization technique for obtaining MCECE strategy profiles that resembles value iteration in finite-horizon games. We assess the predictive benefits of our approach by predicting the strategies generated by previously proposed correlated equilibria solution concepts, and compare against those previous approaches on the same prediction task.

IROS 2009 Conference Paper

Planning-based prediction for pedestrians

  • Brian D. Ziebart
  • Nathan D. Ratliff
  • Garratt Gallagher
  • Christoph Mertz
  • Kevin M. Peterson
  • J. Andrew Bagnell
  • Martial Hebert
  • Anind K. Dey

We present a novel approach for determining robot movements that efficiently accomplish the robot's tasks while not hindering the movements of people within the environment. Our approach models the goal-directed trajectories of pedestrians using maximum entropy inverse optimal control. The advantage of this modeling approach is that its learned cost function generalizes to changes in the environment and to entirely different environments. We employ the predictions of this model of pedestrian trajectories in a novel incremental planner and quantitatively show the improvement in hindrance-sensitive robot trajectory planning provided by our approach.

ICAPS 2008 Conference Paper

Fast Planning for Dynamic Preferences

  • Brian D. Ziebart
  • Anind K. Dey
  • J. Andrew Bagnell

We present an algorithm that quickly finds optimal plans for unforeseen agent preferences within graph-based planning domains where actions have deterministic outcomes and action costs are linearly parameterized by preference parameters. We focus on vehicle route planning for drivers with personal trade-offs for different types of roads, and specifically on settings where these preferences are not known until planning time. We employ novel bounds (based on the triangle inequality and on the concavity of the optimal plan cost in the space of preferences) to enable the reuse of previously computed optimal plans for similar preferences. The resulting lower bounds guide the search for the optimal plan up to 60 times more efficiently than previous methods.
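The linearly parameterized cost setting can be sketched on a toy graph (invented edge features and weights, not the paper's road-network data): each edge carries a feature vector, its cost is a dot product with the driver's preference weights, and the optimal plan cost is a minimum over paths of functions linear in the weights, hence concave in them, which is the property the paper's bounds exploit.

```python
# Dijkstra search where each edge cost is w . f(edge), so replanning for a
# new preference vector w reuses the same graph with re-scored edges.
import heapq

edges = {  # node -> [(neighbor, edge feature vector)], e.g. (highway_km, side_km)
    "A": [("B", (1.0, 0.0)), ("C", (0.0, 1.0))],
    "B": [("D", (1.0, 0.0))],
    "C": [("D", (0.0, 1.0))],
    "D": [],
}

def optimal_cost(w, start="A", goal="D"):
    dist = {start: 0.0}
    pq = [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == goal:
            return d
        if d > dist.get(u, float("inf")):
            continue
        for v, f in edges[u]:
            nd = d + sum(wi * fi for wi, fi in zip(w, f))
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float("inf")

cost_prefers_highways = optimal_cost((1.0, 3.0))   # takes A-B-D
cost_avoids_highways  = optimal_cost((3.0, 1.0))   # takes A-C-D
```

Different preference vectors select different routes; because the optimal cost is a pointwise minimum of linear functions of `w`, any previously computed plan's cost evaluated at the new `w` is an upper bound, which is the kind of reuse the paper's bounds formalize.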

AAAI 2008 Conference Paper

Maximum Entropy Inverse Reinforcement Learning

  • Brian D. Ziebart
  • J. Andrew Bagnell

Recent research has shown the benefit of framing problems of imitation learning as solutions to Markov decision problems. This approach reduces learning to the problem of recovering a utility function that makes the behavior induced by a near-optimal policy closely mimic demonstrated behavior. In this work, we develop a probabilistic approach based on the principle of maximum entropy. Our approach provides a well-defined, globally normalized distribution over decision sequences, while providing the same performance guarantees as existing methods. We develop our technique in the context of modeling real-world navigation and driving behaviors where collected data is inherently noisy and imperfect. Our probabilistic approach enables modeling of route preferences as well as a powerful new approach to inferring destinations and routes based on partial trajectories.
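The globally normalized distribution described above assigns each decision sequence a probability proportional to its exponentiated reward, a linear function of its feature counts; a toy sketch with invented trajectories, features, and weights:

```python
# Maximum-entropy trajectory distribution: P(traj) = exp(theta . f) / Z,
# where f counts features (e.g. distance on each road type) along the
# trajectory. Trajectories, features, and weights are invented.
import math

theta = [-1.0, -0.5]                 # learned (negative-cost) feature weights
trajectories = {
    "highway":  [2.0, 1.0],          # feature counts per trajectory
    "backroad": [1.0, 3.0],
    "shortcut": [1.0, 1.0],
}

def weight(f):
    return math.exp(sum(t * x for t, x in zip(theta, f)))

Z = sum(weight(f) for f in trajectories.values())            # partition function
P = {name: weight(f) / Z for name, f in trajectories.items()}
```

Lower-cost trajectories receive exponentially more probability mass (here the "shortcut" is most likely), but all trajectories retain some probability, which is what makes the model robust to noisy, imperfect demonstrations.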

UAI 2007 Conference Paper

Learning Selectively Conditioned Forest Structures with Applications to DBNs and Classification

  • Brian D. Ziebart
  • Anind K. Dey
  • J. Andrew Bagnell

Dealing with uncertainty in Bayesian Network structures using maximum a posteriori (MAP) estimation or Bayesian Model Averaging (BMA) is often intractable due to the superexponential number of possible directed, acyclic graphs. When the prior is decomposable, two classes of graphs where efficient learning can take place are tree structures, and fixed-orderings with limited in-degree. We show how MAP estimates and BMA for selectively conditioned forests (SCF), a combination of these two classes, can be computed efficiently for ordered sets of variables. We apply SCFs to temporal data to learn Dynamic Bayesian Networks having an intra-timestep forest and inter-timestep limited in-degree structure, improving model accuracy over DBNs without the combination of structures. We also apply SCFs to Bayes Net classification to learn selective forest-augmented Naïve Bayes classifiers. We argue that the built-in feature selection of selective augmented Bayes classifiers makes them preferable to similar non-selective classifiers based on empirical evidence.