AAAI 2012

Approximate Policy Iteration with Linear Action Models

Abstract

In this paper we consider the problem of finding a good policy given some batch data. We propose a new approach, LAM-API, that first builds a so-called linear action model (LAM) from the data and then uses the learned model and the collected data in approximate policy iteration (API) to find a good policy. A natural choice for the policy evaluation step in this algorithm is the least-squares temporal difference (LSTD) learning algorithm. Empirical results on three benchmark problems show that this particular instance of LAM-API is competitive with LSPI in terms of both data efficiency and computational efficiency.
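The abstract gives no algorithmic detail, so the following is only a minimal sketch of how the two stages might fit together, not the paper's implementation. It assumes a linear feature map, one ridge-regressed model per action (a matrix predicting the expected next feature vector and a vector predicting the expected reward), and an LSTD evaluation step that uses the model's predictions for the greedy action at each sampled state. All function names, the ridge term, and the fixed iteration count are hypothetical choices made for illustration.

```python
import numpy as np

def fit_lam(phi, actions, rewards, phi_next, n_actions, ridge=1e-6):
    """Assumed model form: F[a] @ phi(s) ~ E[phi(s')], f[a] @ phi(s) ~ E[r]."""
    d = phi.shape[1]
    F = np.zeros((n_actions, d, d))
    f = np.zeros((n_actions, d))
    for a in range(n_actions):
        mask = actions == a                      # transitions that took action a
        X = phi[mask]
        G = X.T @ X + ridge * np.eye(d)          # regularised Gram matrix
        F[a] = np.linalg.solve(G, X.T @ phi_next[mask]).T
        f[a] = np.linalg.solve(G, X.T @ rewards[mask])
    return F, f

def greedy_actions(phi, F, f, theta, gamma):
    """Greedy action per row of phi under the model and value weights theta."""
    # q[i, a] = f[a] . phi_i + gamma * theta . (F[a] @ phi_i)
    q = phi @ f.T + gamma * np.einsum("adk,ik,d->ia", F, phi, theta)
    return q.argmax(axis=1)

def lam_lstd(phi, F, f, theta, gamma, ridge=1e-6):
    """LSTD on model-predicted transitions for the policy greedy w.r.t. theta."""
    d = phi.shape[1]
    a = greedy_actions(phi, F, f, theta, gamma)
    phi_pred = np.einsum("idk,ik->id", F[a], phi)   # predicted next features
    r_pred = np.einsum("id,id->i", f[a], phi)       # predicted rewards
    A = phi.T @ (phi - gamma * phi_pred) + ridge * np.eye(d)
    b = phi.T @ r_pred
    return np.linalg.solve(A, b)

def lam_api(phi, actions, rewards, phi_next, n_actions, gamma=0.9, n_iters=20):
    """Build the model once from batch data, then alternate evaluation and improvement."""
    F, f = fit_lam(phi, actions, rewards, phi_next, n_actions)
    theta = np.zeros(phi.shape[1])
    for _ in range(n_iters):                        # fixed count, an assumption
        theta = lam_lstd(phi, F, f, theta, gamma)
    return theta, F, f
```

In this sketch the batch data is touched once to fit the model, and every later policy evaluation reuses the sampled states together with model predictions, so no action-value features over state-action pairs are needed.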

Context

Venue
AAAI Conference on Artificial Intelligence