AAMAS 2025 Conference Paper
A Simple Integration of Epistemic Logic and Reinforcement Learning
- Thorsten Engesser
- Thibaut Le Marre
- Emiliano Lorini
- François Schwarzentruber
- Bruno Zanuttini
We propose an integration of epistemic logic with reinforcement learning via a semantics that uses the concept of belief bases. In our framework, an agent’s subjective state is identified with its belief base, which captures the agent’s personal representation of the environment. The agent’s subjective state is distinguished from the global state, which captures the overall information about the environment and about the agent’s belief base from an external perspective. We instantiate the concepts of global state and subjective state in Partially Observable Markov Decision Processes (POMDPs), defining so-called Belief Base POMDPs (BB-POMDPs). We show that, in our epistemic framework, the beliefs of the learning agent can be used to formalize and implement a natural form of shielding, which prevents agents from performing actions that are not known to be safe. Our implementation of shielding relies on a model-checking algorithm to automatically verify whether a given fact is deducible from the agent’s belief base. We perform a case study of model-free reinforcement learning on a simple wumpus scenario, applying a variant of Q-learning to the agent’s subjective states and using the agent’s beliefs for reward shaping and shielding. In particular, our experiments show that our version of shielding can successfully protect the agent from harm while improving the utility of the learned policy.
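To give a flavor of the shielding idea described above, here is a minimal, hedged sketch (not the authors' implementation): epsilon-greedy action selection where the shield restricts the agent to actions whose safety is deducible from its belief base. The belief base is modeled as a simple set of believed facts, and the helper `known_safe` is a hypothetical stand-in for the paper's model-checking query.

```python
import random
from collections import defaultdict

def known_safe(belief_base, state, action):
    # Hypothetical deduction check: an action counts as known safe if the
    # belief base contains (i.e., entails, in this toy model) the fact
    # ("safe", state, action).
    return ("safe", state, action) in belief_base

def shielded_epsilon_greedy(Q, belief_base, state, actions, epsilon=0.1, rng=random):
    # The shield: restrict the choice to actions known to be safe.
    safe = [a for a in actions if known_safe(belief_base, state, a)]
    if not safe:
        # Fall back to all actions if nothing is known safe (a design choice;
        # the paper may handle this case differently).
        safe = list(actions)
    if rng.random() < epsilon:
        return rng.choice(safe)          # explore among safe actions
    return max(safe, key=lambda a: Q[(state, a)])  # exploit among safe actions

# Tiny usage example on a single subjective state "s0".
Q = defaultdict(float, {("s0", "forward"): 1.0, ("s0", "jump"): 5.0})
bb = {("safe", "s0", "forward")}  # only "forward" is deducibly safe
action = shielded_epsilon_greedy(Q, bb, "s0", ["forward", "jump"], epsilon=0.0)
# "jump" has the higher Q-value, but the shield blocks it: action == "forward".
```

The point of the sketch is that shielding composes with any value-based learner: the Q-table update is unchanged, only the action-selection step is filtered through the belief base.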