Efficient Value Iteration for s-rectangular Robust Markov Decision Processes

Navdeep Kumar; Kaixin Wang; Kfir Yehuda Levy; Shie Mannor

ICML 2024

Conference Paper · Accept (Poster) · Artificial Intelligence · Machine Learning

Abstract

We study s-rectangular robust Markov decision processes (MDPs), which capture coupled uncertainties across the different actions within each state. This framework is more general than sa-rectangular robust MDPs, where the uncertainty in each action is independent. However, this interdependence makes the problem significantly harder. Existing methods either come with slow performance guarantees or are inapplicable to even moderately large state spaces. In this work, we derive the optimal robust Bellman operators in explicit form. This yields robust value iteration methods with significantly lower time complexity than existing approaches, making them usable in large state spaces. Our results further show that the optimal policies exhibit a novel threshold behavior, selectively favoring a limited set of actions based on their advantage functions. We also uncover a connection between the robustness of a policy and the variance of its value function: policies whose value functions have lower variance are more robust.
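
The abstract does not state the explicit operators, so the sketch below is only an illustration of the kind of link between robustness and value-function variance that it mentions. It is not the paper's s-rectangular operator. It assumes a simpler, sa-rectangular-style setting in which each backup is reduced by a reward penalty and by a term proportional to the spread of the current value function. The names `penalized_value_iteration`, `alpha`, `beta` and `spread` are hypothetical, and the penalty form is an assumption made for illustration.

```python
import math


def penalized_value_iteration(R, P, gamma, alpha=0.0, beta=0.0, iters=500):
    """Value iteration with an assumed spread-based robustness penalty.

    R[s][a]    : nominal reward for action a in state s
    P[s][a][t] : nominal probability of moving from s to t under a
    gamma      : discount factor in [0, 1)
    alpha      : assumed reward-uncertainty radius (hypothetical)
    beta       : assumed transition-uncertainty radius (hypothetical)
    """
    n_states = len(R)
    v = [0.0] * n_states
    for _ in range(iters):
        # Spread of the current value function around its mean
        # (an unnormalized standard deviation).
        mean_v = sum(v) / n_states
        spread = math.sqrt(sum((x - mean_v) ** 2 for x in v))
        new_v = []
        for s in range(n_states):
            # Nominal Bellman backup minus the assumed penalties.
            best = max(
                R[s][a] - alpha - gamma * beta * spread
                + gamma * sum(P[s][a][t] * v[t] for t in range(n_states))
                for a in range(len(R[s]))
            )
            new_v.append(best)
        v = new_v
    return v


# Two absorbing states, one action each: state 0 pays 1, state 1 pays 0.
R = [[1.0], [0.0]]
P = [[[1.0, 0.0]], [[0.0, 1.0]]]
nominal = penalized_value_iteration(R, P, gamma=0.5)
robust = penalized_value_iteration(R, P, gamma=0.5, beta=0.1)
```

With `alpha = beta = 0` the loop reduces to ordinary value iteration. With `beta > 0`, a value function that varies more across states is penalized more heavily, which is the intuition behind the statement that lower-variance policies are more robust. A fixed iteration count is used because this assumed operator is not guaranteed to be a contraction for large `beta`.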
