
RLDM 2013

CAPI: Generalized Classification-based Approximate Policy Iteration

Conference abstract · Accepted · Artificial Intelligence · Decision Making · Machine Learning · Reinforcement Learning

Abstract

Efficient methods for tackling large reinforcement learning problems usually exploit regularities, or intrinsic structures, of the problem at hand. Most current methods benefit from the regularities of either the value function or the policy, but not both. In this paper, we introduce a general classification-based approximate policy iteration (CAPI) framework, which can benefit from both types of regularities. This framework has two main components: a generic user-specified value function estimator and a weighted classifier that learns a policy based on the estimated value function. The result is a flexible and sample-efficient class of algorithms. We also use a particular instantiation of CAPI to design an adaptive treatment strategy for HIV-infected patients. Comparison with a state-of-the-art purely value-based reinforcement learning algorithm, Tree-based Fitted Q-Iteration, shows that benefiting from the regularity of both policy and value function can lead to better performance.
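The two-component structure described in the abstract can be sketched as follows. This is a minimal illustrative toy, not the paper's method: the chain MDP, the Monte-Carlo value estimator, the action-gap sample weights, and the threshold classifier are all assumptions standing in for the generic value estimator and weighted classifier that CAPI allows.

```python
# Hedged sketch of classification-based approximate policy iteration on a
# toy 5-state chain MDP (all details here are illustrative assumptions).
# The two CAPI components:
#   1) a value-function estimator (here: Monte-Carlo rollouts), and
#   2) a weighted classifier trained to imitate the greedy action, with
#      action-gap sample weights (here: a weighted threshold rule).
import numpy as np

N_STATES, GAMMA, ACTIONS = 5, 0.9, (0, 1)  # action 0 = left, 1 = right

def step(s, a):
    """Deterministic chain: reward 1 only for going right at the last state."""
    if a == 1:
        return min(s + 1, N_STATES - 1), 1.0 if s == N_STATES - 1 else 0.0
    return max(s - 1, 0), 0.0

def rollout_q(policy, s, a, horizon=20):
    """Monte-Carlo estimate of Q^pi(s, a): take a once, then follow policy."""
    s, r = step(s, a)
    total, disc = r, GAMMA
    for _ in range(horizon):
        s, r = step(s, policy(s))
        total += disc * r
        disc *= GAMMA
    return total

def fit_weighted_threshold(targets, weights):
    """Weighted classifier: pick the t minimizing the weighted 0/1 loss of
    the rule 'go right iff state >= t'."""
    states = np.arange(N_STATES)
    losses = [weights[(states >= t).astype(int) != targets].sum()
              for t in range(N_STATES + 1)]
    return int(np.argmin(losses))

policy = lambda s: 0                       # start from "always left"
for _ in range(3):                         # a few CAPI iterations
    Q = np.array([[rollout_q(policy, s, a) for a in ACTIONS]
                  for s in range(N_STATES)])
    targets = Q.argmax(axis=1)             # greedy action at each state
    weights = np.abs(Q[:, 1] - Q[:, 0])    # action gap = cost of a mistake
    t = fit_weighted_threshold(targets, weights)
    policy = lambda s, t=t: int(s >= t)    # new classifier-based policy

result = [policy(s) for s in range(N_STATES)]
print(result)  # learned policy heads right, the optimal choice on this chain
```

Weighting each training state by its action gap is the key idea the abstract's "weighted classifier" alludes to: misclassifying a state where the two actions have nearly equal value costs little, so the classifier is free to spend its capacity where the gap, and hence the regret, is large.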

Authors

Keywords

No keywords are indexed for this paper.

Context

Venue
Multidisciplinary Conference on Reinforcement Learning and Decision Making
Archive span
2013-2025
Indexed papers
1004
Paper id
903653690957081092