Online Value Function Improvement

Mitchell Bloch; John Laird

RLDM 2013

Conference Abstract · Accepted abstract · Artificial Intelligence · Decision Making · Machine Learning · Reinforcement Learning

Abstract

Our goal is to develop broadly competent agents that can dynamically construct an appropriate value function for tasks with large state spaces so that they can effectively and efficiently learn using reinforcement learning. We study the case where an agent’s state is determined by a small number of continuous dimensions, so that the problem of determining the relevant features corresponds roughly to that of determining the appropriate level of discretization of the continuous values. We adopt hierarchical tile coding, which applies state aggregation at multiple levels of state abstraction simultaneously. Using our formulation, it is possible to capture the advantages of learning with state abstractions ranging from general to specific using linear function approximation. We then develop a novel algorithm for incrementally refining the degree of state abstraction, based on cumulative absolute temporal difference error, which produces a sparse non-uniform tile coding. We empirically evaluate our approach in the Puddle World and Mountain Car environments. The results demonstrate that the static and incremental hierarchical tile codings significantly outperform individual tilings and multilevel tile codings (CMACs) for initial learning. Our results also indicate that the incrementally constructed tilings perform nearly as well as the full hierarchical tile coding while requiring an order of magnitude fewer weights.
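
The two ideas named in the abstract, a value estimate summed over tiles from general to specific and refinement driven by cumulative absolute temporal difference error, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The one-dimensional state in [0, 1), the binary splitting of tiles, the fixed split threshold, the step size divided by the number of active tiles, and all names (`IncrementalHierarchicalTiling`, `td_update`, `split_threshold`, `max_depth`) are assumptions made for illustration only.

```python
from collections import defaultdict


class IncrementalHierarchicalTiling:
    """Sparse hierarchical tile coding over a 1-D state in [0, 1).

    Illustrative sketch only: binary splits and a fixed threshold are
    assumptions, not details taken from the paper.
    """

    def __init__(self, alpha=0.1, split_threshold=1.0, max_depth=6):
        self.alpha = alpha
        self.split_threshold = split_threshold  # hypothetical refinement criterion
        self.max_depth = max_depth
        # (depth, index) -> weight; the root tile (0, 0) covers the whole range.
        self.weights = {(0, 0): 0.0}
        # Cumulative absolute TD error observed while a tile was the finest one.
        self.cum_abs_td = defaultdict(float)

    def active_tiles(self, x):
        """Existing tiles that contain x, ordered from general to specific."""
        tiles = []
        depth = 0
        while True:
            index = min(int(x * 2 ** depth), 2 ** depth - 1)
            if (depth, index) not in self.weights:
                break
            tiles.append((depth, index))
            depth += 1
        return tiles

    def value(self, x):
        """Linear estimate: sum of the weights of all active tiles."""
        return sum(self.weights[t] for t in self.active_tiles(x))

    def td_update(self, x, reward, next_x=None, gamma=0.99):
        """One TD(0) update; next_x=None marks a terminal transition."""
        target = reward if next_x is None else reward + gamma * self.value(next_x)
        tiles = self.active_tiles(x)
        delta = target - sum(self.weights[t] for t in tiles)
        step = self.alpha / len(tiles)  # share the step across active tiles
        for t in tiles:
            self.weights[t] += step * delta
        # Refine the finest active tile once its accumulated |delta| is large.
        leaf = tiles[-1]
        self.cum_abs_td[leaf] += abs(delta)
        depth, index = leaf
        if self.cum_abs_td[leaf] > self.split_threshold and depth < self.max_depth:
            for child in (2 * index, 2 * index + 1):
                self.weights[(depth + 1, child)] = 0.0  # new tiles leave values unchanged
        return delta


if __name__ == "__main__":
    agent = IncrementalHierarchicalTiling(max_depth=4)
    for _ in range(2000):
        agent.td_update(0.25, reward=0.0)  # terminal transitions keep the demo simple
        agent.td_update(0.75, reward=1.0)
    print(len(agent.weights), agent.value(0.25), agent.value(0.75))
```

Because new child tiles start with zero weight, a split does not change any current value estimate; it only adds resolution where error has accumulated, so regions with little error keep coarse tiles and the resulting tiling is sparse and non-uniform.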

Context

Venue: Multidisciplinary Conference on Reinforcement Learning and Decision Making
Paper id: 78548707396093836