Approximate Policy Iteration with Linear Action Models

Hengshuai Yao; Csaba Szepesvari

AAAI 2012

Abstract

In this paper we consider the problem of finding a good policy given some batch data. We propose a new approach, LAM-API, that first builds a so-called linear action model (LAM) from the data and then uses the learned model and the collected data in approximate policy iteration (API) to find a good policy. A natural choice for the policy evaluation step in this algorithm is the least-squares temporal difference (LSTD) learning algorithm. Empirical results on three benchmark problems show that this particular instance of LAM-API performs competitively with LSPI in terms of both data efficiency and computational efficiency.
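The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. This is one plausible reading of the abstract, not the authors' exact algorithm. It assumes that a linear action model consists of one matrix and one reward vector per action, fit by ridge-regularized least squares, and that the LSTD step evaluates the policy that is greedy under the model. The function names (`fit_lam`, `lam_api`, `greedy_action`), the `ridge` parameter, and the row-per-sample feature layout are hypothetical choices made for illustration.

```python
import numpy as np

def fit_lam(phi, actions, rewards, phi_next, n_actions, ridge=1e-6):
    """Fit a per-action linear model (assumed form, not taken from the paper).

    F[a] @ x approximates the expected next feature vector after action a,
    f[a] @ x approximates the expected reward, where x is a feature vector.
    """
    d = phi.shape[1]
    F = np.zeros((n_actions, d, d))
    f = np.zeros((n_actions, d))
    for a in range(n_actions):
        X = phi[actions == a]                      # samples where a was taken
        G = X.T @ X + ridge * np.eye(d)            # regularized Gram matrix
        F[a] = np.linalg.solve(G, X.T @ phi_next[actions == a]).T
        f[a] = np.linalg.solve(G, X.T @ rewards[actions == a])
    return F, f

def greedy_action(x, F, f, theta, gamma):
    """Action maximizing the model-predicted one-step lookahead value."""
    values = [f[a] @ x + gamma * theta @ (F[a] @ x) for a in range(len(f))]
    return int(np.argmax(values))

def lam_api(phi, F, f, gamma=0.9, n_iters=20, ridge=1e-6):
    """Approximate policy iteration with an LSTD-style evaluation step.

    Rewards and next features come from the learned model, evaluated at
    the sampled states, instead of from the raw transitions.
    """
    n, d = phi.shape
    theta = np.zeros(d)
    rows = np.arange(n)
    pred_r = phi @ f.T                              # shape (n, n_actions)
    pred_next = np.einsum('aij,nj->nai', F, phi)    # shape (n, n_actions, d)
    for _ in range(n_iters):
        # Policy improvement: greedy action at every sampled state.
        q = pred_r + gamma * (pred_next @ theta)
        greedy = q.argmax(axis=1)
        r = pred_r[rows, greedy]
        nxt = pred_next[rows, greedy]
        # Policy evaluation: solve the LSTD linear system A theta = b.
        A = phi.T @ (phi - gamma * nxt) + ridge * np.eye(d)
        b = phi.T @ r
        new_theta = np.linalg.solve(A, b)
        if np.allclose(new_theta, theta):
            break
        theta = new_theta
    return theta
```

In this sketch the model is fit once and then reused in every policy iteration step, so each iteration only requires predictions at the sampled states and one linear solve.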

Context

Venue: AAAI Conference on Artificial Intelligence
Paper id: 733780911609967739