AAMAS 2025 Conference Paper
A Simple Integration of Epistemic Logic and Reinforcement Learning
- Thorsten Engesser
- Thibaut Le Marre
- Emiliano Lorini
- François Schwarzentruber
- Bruno Zanuttini
We propose an integration of epistemic logic with reinforcement learning via a semantics that uses the concept of belief bases. In our framework, an agent’s subjective state is identified with its belief base, which captures the agent’s personal representation of the environment. The agent’s subjective state is distinguished from the global state, which captures the overall information about the environment and about the agent’s belief base from an external perspective. We instantiate the concepts of global state and subjective state in Partially Observable Markov Decision Processes (POMDPs), defining so-called Belief Base POMDPs (BB-POMDPs). We show that, in our epistemic framework, the beliefs of the learning agent can be used to formalize and implement a natural form of shielding, which prevents agents from performing actions that are not known to be safe. Our implementation of shielding relies on a model-checking algorithm to automatically verify whether a given fact is deducible from the agent’s belief base. We perform a case study of model-free reinforcement learning on a simple wumpus scenario, applying a variant of Q-learning to the agent’s subjective states and using the agent’s beliefs for reward shaping and shielding. In particular, our experiments show that our version of shielding can successfully protect the agent from harm while improving the utility of the learned policy.
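To give a flavor of the shielding idea described above, here is a minimal, hedged sketch (not the authors' implementation): epsilon-greedy action selection where the shield restricts the agent to actions whose safety is deducible from its belief base. The belief base is modeled as a simple set of believed facts, and the helper `known_safe` is a hypothetical stand-in for the paper's model-checking query.

```python
import random
from collections import defaultdict

def known_safe(belief_base, state, action):
    # Hypothetical deduction check: an action counts as known safe if the
    # belief base contains (i.e., entails, in this toy model) the fact
    # ("safe", state, action).
    return ("safe", state, action) in belief_base

def shielded_epsilon_greedy(Q, belief_base, state, actions, epsilon=0.1, rng=random):
    # The shield: restrict the choice to actions known to be safe.
    safe = [a for a in actions if known_safe(belief_base, state, a)]
    if not safe:
        # Fall back to all actions if nothing is known safe (a design choice;
        # the paper may handle this case differently).
        safe = list(actions)
    if rng.random() < epsilon:
        return rng.choice(safe)          # explore among safe actions
    return max(safe, key=lambda a: Q[(state, a)])  # exploit among safe actions

# Tiny usage example on a single subjective state "s0".
Q = defaultdict(float, {("s0", "forward"): 1.0, ("s0", "jump"): 5.0})
bb = {("safe", "s0", "forward")}  # only "forward" is deducibly safe
action = shielded_epsilon_greedy(Q, bb, "s0", ["forward", "jump"], epsilon=0.0)
# "jump" has the higher Q-value, but the shield blocks it: action == "forward".
```

The point of the sketch is that shielding composes with any value-based learner: the Q-table update is unchanged, only the action-selection step is filtered through the belief base.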