RLDM Conference 2017 Conference Abstract
Prediction Regions and Tolerance Regions for Multi-Objective Markov Decision Processes
- Maria Jahja
- Daniel Lizotte
We present a framework for computing and presenting prediction regions and tolerance regions for the returns of an estimated policy operating within a multi-objective Markov decision process (MOMDP). Our framework draws on two bodies of existing work, one in computer science for learning in MOMDPs, and one in statistics for uncertainty quantification. We review the relevant methods from each body of work, give our framework, and illustrate its use with an empirical example. Finally, we discuss potential future directions of this work for supporting sequential decision-making.