Bootstrapping with Models: Confidence Intervals for Off-Policy Evaluation

Josiah Hanna; Peter Stone; Scott Niekum

AAAI 2017

Short Paper Student Abstract Track Artificial Intelligence

Abstract

In many reinforcement learning applications, it is desirable to determine confidence interval lower bounds on the performance of any given policy without executing said policy. In this context, we propose two bootstrapping off-policy evaluation methods which use learned MDP transition models in order to estimate lower confidence bounds on policy performance with limited data. We empirically evaluate the proposed methods on standard policy evaluation tasks.
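
To make the idea concrete, the following is a minimal sketch of a generic percentile bootstrap that uses a learned model. It is not the authors' exact procedure, which the abstract does not specify. It assumes a small tabular MDP, undiscounted finite-horizon returns, and trajectories stored as lists of (state, action, reward, next_state) tuples. All function and parameter names (fit_model, model_return, bootstrap_lower_bound, n_boot, n_rollouts) are hypothetical and chosen for illustration only.

```python
import random
from collections import defaultdict


def fit_model(trajectories):
    """Fit a tabular maximum-likelihood model: next-state counts and reward sums."""
    counts = defaultdict(lambda: defaultdict(int))
    reward_sum = defaultdict(float)
    visits = defaultdict(int)
    for traj in trajectories:
        for s, a, r, s_next in traj:
            counts[(s, a)][s_next] += 1
            reward_sum[(s, a)] += r
            visits[(s, a)] += 1
    return counts, reward_sum, visits


def model_return(model, policy, start_state, horizon, rng):
    """Simulate one episode of `policy` inside the learned model."""
    counts, reward_sum, visits = model
    s, total = start_state, 0.0
    for _ in range(horizon):
        a = policy(s, rng)
        n = visits.get((s, a), 0)
        if n == 0:
            # Assumption of this sketch: an unseen (state, action) pair
            # ends the simulated episode with no further reward.
            break
        total += reward_sum[(s, a)] / n  # empirical mean reward
        next_states = list(counts[(s, a)].keys())
        weights = list(counts[(s, a)].values())
        s = rng.choices(next_states, weights=weights)[0]
    return total


def bootstrap_lower_bound(trajectories, policy, start_state, horizon,
                          delta=0.05, n_boot=200, n_rollouts=50, seed=0):
    """Approximate (1 - delta) lower confidence bound on the policy's return."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample whole trajectories with replacement.
        sample = rng.choices(trajectories, k=len(trajectories))
        model = fit_model(sample)
        # Estimate the target policy's value by rolling it out in the model.
        value = sum(
            model_return(model, policy, start_state, horizon, rng)
            for _ in range(n_rollouts)
        ) / n_rollouts
        estimates.append(value)
    estimates.sort()
    # Percentile bootstrap: take the empirical delta-quantile of the estimates.
    index = min(n_boot - 1, int(delta * n_boot))
    return estimates[index]
```

Here the target policy is a callable taking a state and a random generator and returning an action. It is never executed in the real environment; only the logged trajectories and the models fitted to their resamples are used. The percentile bootstrap gives an approximate bound rather than a guaranteed one, and the model's bias carries over into the estimate.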

Context

Venue: AAAI Conference on Artificial Intelligence
Archive span: 1980-2026
Indexed papers: 28718
Paper id: 901005674917740972