Arrow Research search
Back to AAAI

AAAI 2022

A Provably-Efficient Model-Free Algorithm for Infinite-Horizon Average-Reward Constrained Markov Decision Processes

Conference Paper AAAI Technical Track on Constraint Satisfaction and Optimization Artificial Intelligence

Abstract

This paper presents a model-free reinforcement learning (RL) algorithm for infinite-horizon average-reward Constrained Markov Decision Processes (CMDPs). Considering a learning horizon K, which is sufficiently large, the proposed algorithm achieves Õ √ SAκ δ K 5 6 regret and zero constraint violation, where S is the number of states, A is the number of actions, and κ and δ are two constants independent of the learning horizon K.

Authors

Keywords

No keywords are indexed for this paper.

Context

Venue
AAAI Conference on Artificial Intelligence
Archive span
1980-2026
Indexed papers
28718
Paper id
3922982254696452