On Shallow Planning Under Partial Observability

Randy Lefebvre; Audrey Durand

doi:10.1609/aaai.v39i25.34860

AAAI 2025

Conference Paper; AAAI Technical Track on Planning, Routing, and Scheduling; Artificial Intelligence

Abstract

Formulating a real-world problem under the Reinforcement Learning framework involves non-trivial design choices, such as selecting a discount factor for the learning objective (discounted cumulative rewards), which articulates the planning horizon of the agent. This work investigates the impact of the discount factor on the bias-variance trade-off given structural parameters of the underlying Markov Decision Process. Our results support the idea that a shorter planning horizon might be beneficial, especially under partial observability.
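
To make the link between the discount factor and the planning horizon concrete, here is a minimal sketch in Python. It is not taken from the paper: the function names `discounted_return` and `effective_horizon` are illustrative, and the `1 / (1 - gamma)` quantity is the commonly used rule of thumb for the effective horizon, assumed here rather than quoted from the authors.

```python
def discounted_return(rewards, gamma):
    """Return sum over t of gamma**t * rewards[t] for a finite reward sequence."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


def effective_horizon(gamma):
    """Rule-of-thumb horizon 1 / (1 - gamma); an assumption, defined for 0 <= gamma < 1."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must satisfy 0 <= gamma < 1")
    return 1.0 / (1.0 - gamma)


if __name__ == "__main__":
    # Constant reward of 1 per step, truncated at 200 steps for illustration.
    rewards = [1.0] * 200
    for gamma in (0.5, 0.9, 0.99):
        print(gamma, effective_horizon(gamma), discounted_return(rewards, gamma))
```

Under this reading, a smaller discount factor weights distant rewards less, which corresponds to the "shorter planning horizon" discussed in the abstract.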

Context

Venue: AAAI Conference on Artificial Intelligence
Archive span: 1980-2026
Indexed papers: 28718
Paper id: 999448399985428101