Author name cluster

Daniel Lizotte

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers

1 author row

RLDM Conference 2017 Conference Abstract

Prediction Regions and Tolerance Regions for Multi-Objective Markov Decision Processes

Maria Jahja
Daniel Lizotte

We present a framework for computing and presenting prediction regions and tolerance regions for the returns of an estimated policy operating within a multi-objective Markov decision process (MOMDP). Our framework draws on two bodies of existing work, one in computer science for learning in MOMDPs, and one in statistics for uncertainty quantification. We review the relevant methods from each body of work, give our framework, and illustrate its use with an empirical example. Finally, we discuss potential future directions of this work for supporting sequential decision-making.


NeurIPS Conference 2011 Conference Paper

Convergent Fitted Value Iteration with Linear Function Approximation

Daniel Lizotte

Fitted value iteration (FVI) with ordinary least squares regression is known to diverge. We present a new method, "Expansion-Constrained Ordinary Least Squares" (ECOLS), that produces a linear approximation but also guarantees convergence when used with FVI. To ensure convergence, we constrain the least squares regression operator to be a non-expansion in the infinity-norm. We show that the space of function approximators that satisfy this constraint is more rich than the space of "averagers," we prove a minimax property of the ECOLS residual error, and we give an efficient algorithm for computing the coefficients of ECOLS based on constraint generation. We illustrate the algorithmic convergence of FVI with ECOLS in a suite of experiments, and discuss its properties.
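
Illustrative note (not from the paper): the non-expansion condition mentioned in the abstract above can be checked numerically for any linear regression operator. The sketch below is a minimal illustration, not the ECOLS algorithm. It assumes the operator is the ordinary least squares hat matrix, which maps training targets to fitted values at the training points. The names `hat_matrix` and `inf_norm` and the toy design matrix are hypothetical choices made for this example. The induced infinity-norm of a matrix is its largest absolute row sum, so the operator is a non-expansion exactly when that quantity is at most 1.

```python
import numpy as np

def hat_matrix(X):
    """OLS operator mapping targets y to fitted values X @ beta_hat."""
    return X @ np.linalg.pinv(X)

def inf_norm(M):
    """Induced infinity-norm of a matrix: the largest absolute row sum."""
    return np.abs(M).sum(axis=1).max()

# Hypothetical toy design: an intercept plus one feature at 0, 1, 2.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
H = hat_matrix(X)

# An "averager" for comparison: every row is a convex combination of
# targets (non-negative weights summing to one), so its norm is exactly 1.
A = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.0, 0.5, 0.5]])

ols_norm = inf_norm(H)       # exceeds 1 here, so OLS can expand errors
averager_norm = inf_norm(A)  # equals 1, a non-expansion

print(f"OLS hat matrix infinity-norm: {ols_norm:.4f}")
print(f"Averager infinity-norm:       {averager_norm:.4f}")
```

On this toy design the OLS operator has infinity-norm above 1, which is consistent with the abstract's remark that FVI with unconstrained least squares can diverge, while the averager sits exactly at the non-expansion boundary.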


IJCAI Conference 2007 Conference Paper

Daniel Lizotte
Tao Wang
Michael Bowling
Dale Schuurmans

Gait optimization is a basic yet challenging problem for both quadrupedal and bipedal robots. Although techniques for automating the process exist, most involve local function optimization procedures that suffer from three key drawbacks. Local optimization techniques are naturally plagued by local optima, make no use of the expensive gait evaluations once a local step is taken, and do not explicitly model noise in gait evaluation. These drawbacks increase the need for a large number of gait evaluations, making optimization slow, data inefficient, and manually intensive. We present a Bayesian approach based on Gaussian process regression that addresses all three drawbacks. It uses a global search strategy based on a posterior model inferred from all of the individual noisy evaluations. We demonstrate the technique on a quadruped robot, using it to optimize two different criteria: speed and smoothness. We show in both cases our technique requires dramatically fewer gait evaluations than state-of-the-art local gradient approaches.


NeurIPS Conference 2007 Conference Paper

Stable Dual Dynamic Programming

Tao Wang
Michael Bowling
Dale Schuurmans
Daniel Lizotte

Recently, we have introduced a novel approach to dynamic programming and reinforcement learning that is based on maintaining explicit representations of stationary distributions instead of value functions. In this paper, we investigate the convergence properties of these dual algorithms both theoretically and empirically, and show how they can be scaled up by incorporating function approximation.
