A Provably-Efficient Model-Free Algorithm for Infinite-Horizon Average-Reward Constrained Markov Decision Processes

Honghao Wei; Xin Liu; Lei Ying

AAAI 2022

Conference Paper; AAAI Technical Track on Constraint Satisfaction and Optimization; Artificial Intelligence

Abstract

This paper presents a model-free reinforcement learning (RL) algorithm for infinite-horizon average-reward Constrained Markov Decision Processes (CMDPs). Considering a learning horizon K, which is sufficiently large, the proposed algorithm achieves $\tilde{\mathcal{O}}\left(\frac{\sqrt{SA}\,\kappa}{\delta} K^{5/6}\right)$ regret and zero constraint violation, where S is the number of states, A is the number of actions, and κ and δ are two constants independent of the learning horizon K.
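
To make the K^{5/6} rate concrete, the following minimal sketch (not taken from the paper) evaluates how a bound of that order behaves per step. It assumes a hypothetical `prefactor` argument that lumps together every term independent of K (the dependence on S, A, κ and δ) and ignores the logarithmic factors hidden by the Õ notation, so it illustrates the scaling only, not the algorithm's actual regret.

```python
def regret_scale(prefactor: float, K: int) -> float:
    """Leading-order growth prefactor * K**(5/6).

    `prefactor` is a hypothetical stand-in for all K-independent terms;
    logarithmic factors are ignored (assumption).
    """
    return prefactor * K ** (5 / 6)


def per_step_scale(prefactor: float, K: int) -> float:
    """Bound divided by the horizon, which equals prefactor * K**(-1/6)."""
    return regret_scale(prefactor, K) / K


if __name__ == "__main__":
    # The per-step value shrinks as the horizon grows, because 5/6 < 1.
    for K in (10**3, 10**6, 10**9):
        print(K, per_step_scale(1.0, K))
```

Because the exponent 5/6 is below 1, the bound is sublinear in K: dividing by K leaves a factor K^{-1/6}, which tends to zero as the horizon grows, although slowly.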

Context

Venue: AAAI Conference on Artificial Intelligence