Kernel-Based Function Approximation for Average Reward Reinforcement Learning: An Optimistic No-Regret Algorithm

Sattar Vakili; Julia Olkhovskaya

EWRL 2024

Workshop Paper EWRL17 Artificial Intelligence · Machine Learning · Reinforcement Learning

Abstract

Reinforcement learning utilizing kernel ridge regression to predict the expected value function represents a powerful method with great representational capacity. This setting is a highly versatile framework amenable to analytical results. We consider kernel-based function approximation for RL in the infinite horizon average reward setting, also referred to as the undiscounted setting. We propose an optimistic algorithm, similar to acquisition function based algorithms in the special case of bandits. We establish novel no-regret performance guarantees for our algorithm, under kernel-based modelling assumptions. Additionally, we derive a novel confidence interval for the kernel-based prediction of the expected value function, applicable across various RL problems.
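
To make the ingredients named in the abstract concrete, the following is a minimal sketch of kernel ridge regression with an uncertainty width and an optimistic (upper confidence bound) estimate, in the spirit of acquisition functions used in kernel bandits. It is not the paper's algorithm: the function names, the squared-exponential kernel, and the parameters `lengthscale`, `reg`, and `beta` are all illustrative assumptions, and the paper's confidence interval uses its own multiplier rather than a fixed `beta`.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=0.5):
    """Squared-exponential kernel between rows of a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * lengthscale**2))


def krr_predict(x_train, y_train, x_query, reg=0.1, lengthscale=0.5):
    """Kernel ridge regression mean and uncertainty width at x_query.

    reg and lengthscale are assumed hyperparameters, not values from the paper.
    """
    k_tt = rbf_kernel(x_train, x_train, lengthscale)
    k_qt = rbf_kernel(x_query, x_train, lengthscale)
    gram = k_tt + reg * np.eye(len(x_train))

    # Mean: k(x)^T (K + reg I)^{-1} y
    mean = k_qt @ np.linalg.solve(gram, y_train)

    # Variance proxy: k(x, x) - k(x)^T (K + reg I)^{-1} k(x); k(x, x) = 1 for this kernel.
    solved = np.linalg.solve(gram, k_qt.T)
    var = 1.0 - np.sum(k_qt * solved.T, axis=1)
    width = np.sqrt(np.maximum(var, 0.0))
    return mean, width


def optimistic_value(mean, width, beta=2.0):
    """Upper confidence bound; beta is a hypothetical fixed multiplier."""
    return mean + beta * width


# Usage: fit on noiseless samples of a toy target and query near and far from the data.
x_train = np.linspace(0.0, 1.0, 20)[:, None]
y_train = np.sin(2.0 * np.pi * x_train[:, 0])
x_query = np.array([[0.5], [5.0]])
mean, width = krr_predict(x_train, y_train, x_query)
ucb = optimistic_value(mean, width)
```

In this sketch the width is small near observed points and approaches 1 far from them, so the optimistic estimate favours regions where the prediction is least certain.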

Keywords

Kernel function approximation
Reinforcement Learning

Context

Venue: European Workshop on Reinforcement Learning