ICLR 2023

A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning

Conference Paper · Accepted Paper · Artificial Intelligence · Machine Learning

Abstract

With the increasing need to handle large state and action spaces, general function approximation has become a key technique in reinforcement learning (RL). In this paper, we propose a general framework that unifies model-based and model-free RL, and an Admissible Bellman Characterization (ABC) class that subsumes nearly all Markov decision process (MDP) models in the literature for tractable RL. We propose a novel estimation function with decomposable structural properties for optimization-based exploration, and the functional Eluder dimension as a complexity measure of the ABC class. Under our framework, we propose a new sample-efficient algorithm, OPtimization-based ExploRation with Approximation (OPERA), which achieves regret bounds that match or improve over the best-known results for a variety of MDP models. In particular, for MDPs with low Witness rank, under a slightly stronger assumption, OPERA improves the state-of-the-art sample complexity results by a factor of $dH$. Our framework provides a generic interface for designing and analyzing new RL models and algorithms.

Authors

Keywords

  • general function approximation
  • sample-efficient RL
  • optimization-based exploration
  • Eluder dimension
  • Bellman rank
  • witness rank
  • complexity measure
  • hypothesis class

Context

Venue
International Conference on Learning Representations