Arrow Research

Author name cluster

Olivier Buffet

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

38 papers
2 author rows

Possible papers (38)

NeurIPS Conference 2025 Conference Paper

ε-Optimally Solving Two-Player Zero-Sum POSGs

  • Erwan Escudie
  • Matthia Sabatelli
  • Olivier Buffet
  • Jilles Dibangoye

We present a novel framework for ε-optimally solving two-player zero-sum partially observable stochastic games (zs-POSGs). These games pose a major challenge due to the absence of a principled connection with dynamic programming (DP) techniques developed for two-player zero-sum stochastic games (zs-SGs). Prior attempts at transferring solution methods have lacked a lossless reduction—defined here as a transformation that preserves value functions, equilibrium strategies, and optimality structure—thereby limiting generalisation to ad hoc algorithms. This work introduces the first lossless reduction from zs-POSGs to transition-independent zs-SGs, enabling the principled application of a broad class of DP-based methods. We show empirically that point-based value iteration (PBVI) algorithms, applied via this reduction, produce ε-optimal strategies across a range of benchmark domains, consistently matching or outperforming existing state-of-the-art methods. Our results open a systematic pathway for algorithmic and theoretical transfer from SGs to partially observable settings.

AAMAS Conference 2025 Conference Paper

Observer-Aware Probabilistic Planning under Partial Observability

  • Salomé Lepers
  • Vincent Thomas
  • Olivier Buffet

We are interested in planning problems where the agent is aware of the presence of an observer, and where this observer is in a partial observability situation. The agent has to choose its strategy so as to optimize the information transmitted by observations. Building on observer-aware Markov decision processes (OAMDPs), we propose a framework to handle this type of problem and thus formalize properties such as legibility, explicability and predictability. This extension of OAMDPs to partial observability not only handles more realistic problems, but also permits considering dynamic hidden variables of interest. We discuss theoretical properties of PO-OAMDPs and, experimenting with benchmark problems, we analyze HSVI's convergence behavior with dedicated initializations and study the resulting strategies.

ECAI Conference 2025 Conference Paper

Post-Hoc Interpretation of POMDP Policies

  • Geoffrey Laforest
  • Olivier Buffet
  • Alexandre Niveau
  • Bruno Zanuttini

Policies for partially observable Markov decision processes are rich objects, prescribing actions to take depending on the whole history of observations and actions. Typical representations of such policies are by hyperplanes in the space of belief states, or by finite-state controllers, which are arguably not easy to interpret. We propose to redescribe policies into mappings defined on features of the current belief state, built in a systematic manner from state features. Such a mapping can in turn be represented by an intelligible object, like a decision tree, thereby providing an interpretable representation of the policy as a whole. We moreover show how our approach allows us to explain the decision taken by an agent at each step of an interaction with the environment. This provides an end-to-end process, starting from a policy computed by any solver, and ending with an explanation of each decision made at execution time. We formally define our approach, investigate related computational problems, and report on experiments on several families of problems.
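
The redescription step lends itself to a compact illustration. Below is a minimal sketch, assuming belief-level features are expectations of state features and using scikit-learn's decision tree as a stand-in for whichever learner is actually employed; `policy_action` and `state_features` are hypothetical hooks, not names from the paper.

```python
# Hedged sketch of policy redescription: label sampled beliefs with the
# action the solved policy prescribes, then fit an interpretable tree
# over belief features. Not the paper's implementation.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

def belief_features(belief, state_features):
    """Belief-level features as expectations of state features."""
    return belief @ state_features       # [n_states] @ [n_states, n_feats]

def redescribe(beliefs, state_features, policy_action, max_depth=3):
    X = np.stack([belief_features(b, state_features) for b in beliefs])
    y = np.array([policy_action(b) for b in beliefs])
    tree = DecisionTreeClassifier(max_depth=max_depth).fit(X, y)
    return tree  # export_text(tree) then yields a readable rule list
```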

UAI Conference 2024 Conference Paper

Approximation Algorithms for Observer Aware MDPs

  • Shuwa Miura
  • Olivier Buffet
  • Shlomo Zilberstein

We present approximation algorithms for Observer-Aware Markov Decision Processes (OAMDPs). OAMDPs model sequential decision-making problems in which rewards depend on the beliefs of an observer about the goals, intentions, or capabilities of the observed agent. The first proposed algorithm is a grid-based value iteration (Grid-VI), which discretizes the observer's belief into regular grids. Based on the same discretization, the second proposed algorithm is a variant of Real-Time Dynamic Programming (RTDP) called Grid-RTDP. Unlike Grid-VI, Grid-RTDP focuses its updates on promising states using heuristic estimates. We provide theoretical guarantees for the proposed algorithms and demonstrate that Grid-RTDP has good anytime performance, comparable to that of an existing approach that lacks performance guarantees.
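
The grid idea is easy to make concrete. The sketch below, under the simplifying assumption of deterministic belief dynamics (`belief_step` and `reward` are hypothetical hooks, not the paper's API), discretizes a belief simplex into a regular grid and runs value iteration by snapping each updated belief to its nearest grid point.

```python
# Minimal sketch (not the paper's implementation) of grid-based value
# iteration over a belief simplex.
import itertools
import numpy as np

def regular_grid(n_states: int, resolution: int) -> np.ndarray:
    """All beliefs whose coordinates are multiples of 1/resolution."""
    pts = [np.array(c, dtype=float) / resolution
           for c in itertools.product(range(resolution + 1), repeat=n_states)
           if sum(c) == resolution]
    return np.stack(pts)

def snap(belief: np.ndarray, grid: np.ndarray) -> int:
    """Index of the grid point closest to `belief` in L1 distance."""
    return int(np.argmin(np.abs(grid - belief).sum(axis=1)))

def grid_value_iteration(grid, actions, belief_step, reward,
                         gamma=0.95, iters=200):
    V = np.zeros(len(grid))
    for _ in range(iters):
        V = np.array([max(reward(b, a)
                          + gamma * V[snap(belief_step(b, a), grid)]
                          for a in actions)
                      for b in grid])
    return V
```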

ICML Conference 2024 Conference Paper

Solving Hierarchical Information-Sharing Dec-POMDPs: An Extensive-Form Game Approach

  • Johan Peralez
  • Aurélien Delage
  • Olivier Buffet
  • Jilles Dibangoye

A recent theory shows that a multi-player decentralized partially observable Markov decision process can be transformed into an equivalent single-player game, enabling the application of Bellman’s principle of optimality to solve the single-player game by breaking it down into single-stage subgames. However, this approach entangles the decision variables of all players at each single-stage subgame, resulting in backups with a double-exponential complexity. This paper demonstrates how to disentangle these decision variables while maintaining optimality under hierarchical information sharing, a prominent management style in our society. To achieve this, we apply the principle of optimality to solve any single-stage subgame by breaking it down further into smaller subgames, enabling us to make single-player decisions at a time. Our approach reveals that extensive-form games always exist with solutions to a single-stage subgame, significantly reducing time complexity. Our experimental results show that the algorithms leveraging these findings can scale up to much larger multi-player games without compromising optimality.

UAI Conference 2023 Conference Paper

Monte-Carlo Search for an Equilibrium in Dec-POMDPs

  • Yang You 0003
  • Vincent Thomas
  • Francis Colas
  • Olivier Buffet

Decentralized partially observable Markov decision processes (Dec-POMDPs) formalize the problem of designing individual controllers for a group of collaborative agents under stochastic dynamics and partial observability. Seeking a global optimum is difficult (NEXP-complete), but seeking a Nash equilibrium (each agent policy being a best response to the other agents) is more accessible, and has allowed addressing infinite-horizon problems with solutions in the form of finite-state controllers (FSCs). In this paper, we show that this approach can be adapted to cases where only a generative model (a simulator) of the Dec-POMDP is available. This requires relying on a simulation-based POMDP solver to construct an agent's FSC node by node. A related process is used to heuristically derive initial FSCs. Experiments on benchmarks show that MC-JESP is competitive with existing Dec-POMDP solvers, and even outperforms many offline methods that use explicit models.

ICRA Conference 2023 Conference Paper

Robust Robot Planning for Human-Robot Collaboration

  • Yang You 0003
  • Vincent Thomas
  • Francis Colas
  • Rachid Alami 0001
  • Olivier Buffet

In human-robot collaboration, the objectives of the human are often unknown to the robot. Moreover, even assuming a known objective, the human behavior is also uncertain. In order to plan a robust robot behavior, a key preliminary question is then: how do we derive realistic human behaviors given a known objective? A major issue is that such a human behavior should itself account for the robot behavior, otherwise collaboration cannot happen. In this paper, we rely on Markov decision models, representing the uncertainty over the human objective as a probability distribution over a finite set of objective functions (inducing a distribution over human behaviors). Based on this, we propose two contributions: 1) an approach to automatically generate an uncertain human behavior (a policy) for each given objective function while accounting for possible robot behaviors; and 2) a robot planning algorithm that is robust to the above-mentioned uncertainties and relies on solving a partially observable Markov decision process (POMDP) obtained by reasoning on a distribution over human behaviors. A co-working scenario allows us to conduct experiments and present qualitative and quantitative results evaluating our approach.

AAAI Conference 2021 Conference Paper

K-N-MOMDPs: Towards Interpretable Solutions for Adaptive Management

  • Jonathan Ferrer-Mestres
  • Thomas G. Dietterich
  • Olivier Buffet
  • Iadine Chades

In biodiversity conservation, adaptive management (AM) is the principal tool for decision making under uncertainty. AM problems are planning problems that can be modelled using Mixed Observability MDPs (MOMDPs). MOMDPs tackle decision problems where state variables are completely or partially observable. Unfortunately, MOMDP solutions (policy graphs) are too complex to be interpreted by human decision-makers. Here, we provide algorithms to solve K-N-MOMDPs, where K represents the maximum number of fully observable states and N represents the maximum number of α-vectors. Our algorithms calculate compact and more interpretable policy graphs from existing MOMDP models and solutions. We apply these algorithms to two computational sustainability applications: optimal release of biocontrol agents to prevent dengue epidemics and conservation of the threatened bird species Gouldian finch. The methods dramatically reduce the number of states and α-vectors in MOMDP problems without significantly reducing their quality. The resulting policies have small policy graphs (4-6 nodes) that can be easily interpreted by human decision-makers.

ECAI Conference 2020 Conference Paper

Monte Carlo Information-Oriented Planning

  • Vincent Thomas
  • Gérémy Hutin
  • Olivier Buffet

In this article, we discuss how to solve information-gathering problems expressed as ρ-POMDPs, an extension of Partially Observable Markov Decision Processes (POMDPs) whose reward ρ depends on the belief state. Point-based approaches used for solving POMDPs have been extended to solving ρ-POMDPs as belief MDPs when the reward ρ is convex in the belief space or when it is Lipschitz-continuous. In the present paper, we build on the POMCP algorithm to propose a Monte Carlo Tree Search for ρ-POMDPs, aiming for an efficient on-line planner which can be used for any ρ function. Adaptations are required due to the belief-dependent rewards to (i) propagate more than one state at a time, and (ii) prevent biases in value estimates. An asymptotic convergence proof to ε-optimal values is given when ρ is continuous. Experiments are conducted to analyze the algorithms at hand and show that they outperform myopic approaches.
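
The key adaptation, propagating several states at a time so the belief-dependent reward can be estimated inside the search, can be sketched as follows. Negative entropy stands in for ρ and `step(s, a) -> (s', o)` is an assumed generative model; this is an illustration, not the authors' planner.

```python
# Sketch of the "propagate several states at a time" adaptation: a random
# depth-limited rollout carrying a particle set so rho (negative entropy
# here) can be estimated at every step. The filtering is a crude particle
# filter, not the paper's exact scheme.
import math
import random
from collections import Counter

def neg_entropy(particles):
    """rho(b) estimated from a particle approximation of the belief b."""
    counts = Counter(particles)
    n = len(particles)
    return sum((c / n) * math.log(c / n) for c in counts.values())

def rollout(particles, step, actions, depth, gamma=0.95):
    if depth == 0:
        return 0.0
    a = random.choice(actions)
    successors = [step(s, a) for s in particles]
    o = random.choice([obs for _, obs in successors])  # sample an observation
    nxt = ([s2 for s2, o2 in successors if o2 == o]
           or [s2 for s2, _ in successors])            # avoid particle death
    return neg_entropy(nxt) + gamma * rollout(nxt, step, actions,
                                              depth - 1, gamma)
```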

ICML Conference 2020 Conference Paper

Optimally Solving Two-Agent Decentralized POMDPs Under One-Sided Information Sharing

  • Yuxuan Xie
  • Jilles Dibangoye
  • Olivier Buffet

Optimally solving decentralized partially observable Markov decision processes under either full or no information sharing received significant attention in recent years. However, little is known about how partial information sharing affects existing theory and algorithms. This paper addresses this question for a team of two agents, with one-sided information sharing—i.e., both agents have imperfect information about the state of the world, but only one has access to what the other sees and does. From the perspective of a central planner, we show that the original problem can be reformulated into an equivalent information-state Markov decision process and solved as such. In addition, we prove that the optimal value function exhibits a specific form of uniform continuity. We also present a heuristic search algorithm utilizing this property and provide the first results for this family of problems.

ICAPS Conference 2020 Conference Paper

Solving K-MDPs

  • Jonathan Ferrer-Mestres
  • Thomas G. Dietterich
  • Olivier Buffet
  • Iadine Chades

Markov Decision Processes (MDPs) are employed to model sequential decision-making problems under uncertainty. Traditionally, algorithms to solve MDPs have focused on solving large state or action spaces. With increasing applications of MDPs to human-operated domains such as conservation of biodiversity and health, developing easy-to-interpret solutions is of paramount importance to increase uptake of MDP policies. Here, we define the problem of solving K-MDPs, i.e., given an original MDP and a constraint on the number of states (K), generate a reduced state space MDP that minimizes the difference between the original optimal MDP value function and the reduced optimal K-MDP value function. Building on existing non-transitive and transitive approximate state abstraction functions, we propose a family of three algorithms based on binary search with sub-optimality bounded polynomially in a precision parameter: ϕQ*ε K-MDP-ILP, ϕQ*d K-MDP and ϕa*d K-MDP. We compare these algorithms to a greedy algorithm (ϕQ*ε Greedy K-MDP) and a clustering approach (k-means++ K-MDP). On randomly generated MDPs and two computational sustainability MDPs, ϕa*d K-MDP outperformed all algorithms when it could find a feasible solution. While numerous state abstraction problems have been proposed in the literature, this is the first time that the general problem of solving K-MDPs has been suggested. We hope that our work will generate future research aiming at increasing the interpretability of MDP policies in human-operated domains.
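
As a rough illustration of the Q*-based abstraction idea (closest in spirit to the greedy baseline, not the ILP or binary-search algorithms), one can cluster states whose optimal Q-vectors agree within a tolerance, coarsening until the K budget is met:

```python
# Hedged sketch: cluster states by similarity of their optimal Q-vectors,
# doubling the tolerance until at most K abstract states remain. Q is an
# assumed precomputed array of shape [n_states, n_actions].
import numpy as np

def greedy_k_abstraction(Q: np.ndarray, K: int, eps0: float = 1e-3):
    eps = eps0
    while True:
        clusters = []           # list of (representative Q-vector, [states])
        for s, q in enumerate(Q):
            for rep, members in clusters:
                if np.max(np.abs(q - rep)) <= eps:
                    members.append(s)
                    break
            else:
                clusters.append((q, [s]))
        if len(clusters) <= K:
            return [members for _, members in clusters], eps
        eps *= 2                # coarsen until the K budget is met
```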

ICML Conference 2018 Conference Paper

Learning to Act in Decentralized Partially Observable MDPs

  • Jilles Dibangoye
  • Olivier Buffet

We address a long-standing open problem of reinforcement learning in decentralized partially observable Markov decision processes. Previous attempts focussed on different forms of generalized policy iteration, which at best led to local optima. In this paper, we restrict attention to plans, which are simpler to store and update than policies. We derive, under certain conditions, the first near-optimal cooperative multi-agent reinforcement learning algorithm. To achieve significant scalability gains, we replace the greedy maximization by mixed-integer linear programming. Experiments show our approach can learn to act near-optimally in many finite domains from the literature.

NeurIPS Conference 2018 Conference Paper

ρ-POMDPs have Lipschitz-Continuous ε-Optimal Value Functions

  • Mathieu Fehr
  • Olivier Buffet
  • Vincent Thomas
  • Jilles Dibangoye

Many state-of-the-art algorithms for solving Partially Observable Markov Decision Processes (POMDPs) rely on turning the problem into a “fully observable” problem—a belief MDP—and exploiting the piece-wise linearity and convexity (PWLC) of the optimal value function in this new state space (the belief simplex ∆). This approach has been extended to solving ρ-POMDPs—i.e., for information-oriented criteria—when the reward ρ is convex in ∆. General ρ-POMDPs can also be turned into “fully observable” problems, but with no means to exploit the PWLC property. In this paper, we focus on POMDPs and ρ-POMDPs with λρ-Lipschitz reward function, and demonstrate that, for finite horizons, the optimal value function is Lipschitz-continuous. Then, value function approximators are proposed for both upper- and lower-bounding the optimal value function, which are shown to provide uniformly improvable bounds. This allows proposing two algorithms derived from HSVI which are empirically evaluated on various benchmark problems.
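
The central property can be restated compactly (notation mine, not copied verbatim from the paper): a Lipschitz reward induces a Lipschitz optimal value function, which in turn yields cone-shaped upper and lower bounds that an HSVI-style solver can improve uniformly.

```latex
% Hedged restatement; \lambda_t denotes a horizon-dependent constant.
\[
  |\rho(b,a) - \rho(b',a)| \le \lambda_\rho \,\|b - b'\|_1
  \;\Longrightarrow\;
  |V^*_t(b) - V^*_t(b')| \le \lambda_t \,\|b - b'\|_1 ,
\]
\[
  \underline{V}_t(b) = \max_i \big( v_i - \lambda_t \|b - b_i\|_1 \big),
  \qquad
  \overline{V}_t(b) = \min_j \big( w_j + \lambda_t \|b - b_j\|_1 \big).
\]
```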

JAIR Journal 2016 Journal Article

Goal Probability Analysis in Probabilistic Planning: Exploring and Enhancing the State of the Art

  • Marcel Steinmetz
  • Jörg Hoffmann
  • Olivier Buffet

Unavoidable dead-ends are common in many probabilistic planning problems, e.g., when actions may fail or when operating under resource constraints. An important objective in such settings is MaxProb, determining the maximal probability with which the goal can be reached, and a policy achieving that probability. Yet algorithms for MaxProb probabilistic planning are severely underexplored, to the extent that there is scant evidence of what the empirical state of the art actually is. We close this gap with a comprehensive empirical analysis. We design and explore a large space of heuristic search algorithms, systematizing known algorithms and contributing several new algorithm variants. We consider MaxProb, as well as weaker objectives that we baptize AtLeastProb (requiring to achieve a given goal probability threshold) and ApproxProb (requiring to compute the maximum goal probability up to a given accuracy). We explore both the general case where there may be 0-reward cycles, and the practically relevant special case of acyclic planning, such as planning with a limited action-cost budget. We design suitable termination criteria, search algorithm variants, dead-end pruning methods using classical planning heuristics, and node selection strategies. We design a benchmark suite comprising more than 1000 instances adapted from the IPPC, resource-constrained planning, and simulated penetration testing. Our evaluation clarifies the state of the art, characterizes the behavior of a wide range of heuristic search algorithms, and demonstrates significant benefits of our new algorithm variants.
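
To make the MaxProb objective concrete: on an explicit MDP it is the least fixed point of a Bellman operator over goal-reachability probabilities. The paper's contribution is heuristic search, not plain value iteration; the sketch below (with `P[s][a]` an assumed dict from next states to probabilities) only pins down the objective.

```python
# Minimal sketch of the MaxProb objective: iterate
# V(s) <- max_a sum_s' P(s'|s,a) V(s'), with V fixed to 1 at goals.
# Starting from 0, this converges from below to the maximal goal
# probability (the least fixed point).
def maxprob_value_iteration(states, goals, P, tol=1e-8):
    V = {s: (1.0 if s in goals else 0.0) for s in states}
    while True:
        delta = 0.0
        for s in states:
            if s in goals or not P.get(s):
                continue  # goals are absorbing; dead-ends keep V = 0
            new = max(sum(p * V[s2] for s2, p in dist.items())
                      for dist in P[s].values())
            delta = max(delta, abs(new - V[s]))
            V[s] = new
        if delta < tol:
            return V
```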

JAIR Journal 2016 Journal Article

Optimally Solving Dec-POMDPs as Continuous-State MDPs

  • Jilles Steeve Dibangoye
  • Christopher Amato
  • Olivier Buffet
  • François Charpillet

Decentralized partially observable Markov decision processes (Dec-POMDPs) provide a general model for decision-making under uncertainty in decentralized settings, but are difficult to solve optimally (NEXP-Complete). As a new way of solving these problems, we introduce the idea of transforming a Dec-POMDP into a continuous-state deterministic MDP with a piecewise-linear and convex value function. This approach makes use of the fact that planning can be accomplished in a centralized offline manner, while execution can still be decentralized. This new Dec-POMDP formulation, which we call an occupancy MDP, allows powerful POMDP and continuous-state MDP methods to be used for the first time. To provide scalability, we refine this approach by combining heuristic search and compact representations that exploit the structure present in multi-agent domains, without losing the ability to converge to an optimal solution. In particular, we introduce a feature-based heuristic search value iteration (FB-HSVI) algorithm that relies on feature-based compact representations, point-based updates and efficient action selection. A theoretical analysis demonstrates that FB-HSVI terminates in finite time with an optimal solution. We include an extensive empirical analysis using well-known benchmarks, thereby demonstrating that our approach provides significant scalability improvements compared to the state of the art.
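
The occupancy state at the heart of this reformulation admits a short sketch: a distribution over (hidden state, joint history) pairs, updated deterministically by a joint decision rule. `T` and `O` are assumed explicit model tables; this is not the FB-HSVI implementation.

```python
# Hedged sketch of the occupancy-state recursion behind the occupancy MDP.
# eta: {(state, joint_hist): prob}; d(joint_hist) -> joint action;
# T[s][a] -> {s': p}; O[a][s'] -> {joint_obs: p}.
# An initial occupancy might be eta0 = {(s0, ()): 1.0}.
from collections import defaultdict

def next_occupancy(eta, d, T, O):
    nxt = defaultdict(float)
    for (s, hist), p in eta.items():
        a = d(hist)
        for s2, pt in T[s][a].items():
            for z, po in O[a][s2].items():
                nxt[(s2, hist + ((a, z),))] += p * pt * po
    return dict(nxt)
```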

ICAPS Conference 2016 Conference Paper

Revisiting Goal Probability Analysis in Probabilistic Planning

  • Marcel Steinmetz
  • Jörg Hoffmann 0001
  • Olivier Buffet

Maximizing goal probability is an important objective in probabilistic planning, yet algorithms for its optimal solution are severely underexplored. There is scant evidence of what the empirical state of the art actually is. Focusing on heuristic search, we close this gap with a comprehensive empirical analysis of known and adapted algorithms. We explore both the general case where there may be 0-reward cycles, and the practically relevant special case of acyclic planning, like planning with a limited action-cost budget. We consider three different algorithmic objectives. We design suitable termination criteria, search algorithm variants, dead-end pruning methods using classical planning heuristics, and node selection strategies. Our evaluation on more than 1000 benchmark instances from the IPPC, resource-constrained planning, and simulated penetration testing reveals the behavior of heuristic search, and exhibits several improvements to the state of the art.

IJCAI Conference 2015 Conference Paper

Exploiting Separability in Multiagent Planning with Continuous-State MDPs (Extended Abstract)

  • Jilles Steeve Dibangoye
  • Christopher Amato
  • Olivier Buffet
  • François Charpillet

Decentralized partially observable Markov decision processes (Dec-POMDPs) provide a general model for decision-making under uncertainty in cooperative decentralized settings, but are difficult to solve optimally (NEXP-Complete). As a new way of solving these problems, we recently introduced a method for transforming a Dec-POMDP into a continuous-state deterministic MDP with a piecewise-linear and convex value function. This new Dec-POMDP formulation, which we call an occupancy MDP, allows powerful POMDP and continuous-state MDP methods to be used for the first time. However, scalability remains limited when the number of agents or problem variables becomes large. In this paper, we show that, under certain separability conditions of the optimal value function, the scalability of this approach can increase considerably. This separability is present when there is locality of interaction between agents, which can be exploited to improve performance. Unlike most previous methods, the novel continuous-state MDP algorithm retains optimality and convergence guarantees. Results show that the extension using separability can scale to a large number of agents and domain variables while maintaining optimality.

IJCAI Conference 2015 Conference Paper

Structural Results for Cooperative Decentralized Control Models

  • Jilles Steeve Dibangoye
  • Olivier Buffet
  • Olivier Simonin

The intractability in cooperative, decentralized control models is mainly due to prohibitive memory requirements in both optimal policies and value functions. Complexity analysis has emerged as the standard method for estimating the memory needed to solve a given computational problem, but complexity results may be somewhat limited. This paper introduces a general methodology, structural analysis, for the design of optimality-preserving concise policies and value functions, which will eventually lead to the development of efficient theory and algorithms. For the first time, we show that memory requirements for policies and value functions may be asymmetric, resulting in cooperative, decentralized control models with exponential reductions in memory requirements.

ECAI Conference 2014 Conference Paper

Learning Pruning Rules for Heuristic Search Planning

  • Michal Krajnanský
  • Jörg Hoffmann 0001
  • Olivier Buffet
  • Alan Fern

When it comes to learning control knowledge for planning, most works focus on "how to do it" knowledge which is then used to make decisions regarding which actions should be applied in which state. We pursue the opposite approach of learning "how to not do it" knowledge, used to make decisions regarding which actions should not be applied in which state. Our intuition is that "bad actions" are often easier to characterize than "good" ones. An obvious application, which has not been considered by the few prior works on learning bad actions, is to use such learned knowledge as action pruning rules in heuristic search planning. Fixing a canonical rule language and an off-the-shelf learning tool, we explore a novel method for generating training data, and implement rule evaluators in state-of-the-art planners. The experiments show that the learned rules can yield dramatic savings, even when the native pruning rules of these planners, i.e., preferred operators, are already switched on.

ECAI Conference 2014 Conference Paper

Simultaneous Tracking and Activity Recognition (STAR) using Advanced Agent-Based Behavioral Simulations

  • Arsène Fansi Tchango
  • Vincent Thomas
  • Olivier Buffet
  • Fabien Flacher
  • Alain Dutech

Tracking and understanding moving pedestrian behaviors is of major concern for a growing number of applications. Classical approaches either consider both problems separately or treat them simultaneously on the basis of limited contextual graphical models. In this paper, we consider tackling both problems jointly based on richer contextual information derived from agent-based behavioral simulators designed to realistically reproduce human behaviors within complex environments. We focus on the single-target case and experimentally show that the proposed approach maintains good performance even during long periods of occlusion.

ECAI Conference 2014 Conference Paper

Stop-Free Strategies for Traffic Networks: Decentralized On-line Optimization

  • Mohamed Tlig
  • Olivier Buffet
  • Olivier Simonin 0001

Traffic management in large networks remains an important challenge in transportation systems. The best approach would be to use existing infrastructure and find a solution to manage the increasing flows of vehicles. Multi-agent systems and autonomous vehicles are today considered a promising approach to deal with traffic control. In this paper, we propose a two-level decentralized multi-agent system that allows autonomous vehicles to cross network intersections without stopping. At the first level, we use a control agent at each intersection which (1) lets the vehicles from each road pass alternately, and (2) allows them to optimally regulate their speed in its vicinity. At the second level, each agent coordinates with its neighboring agents in order to optimize the flows inside the network. We evaluate this approach empirically, comparing it with a more opportunistic First-Come First-Served strategy. Experimental results in simulation, including energy-consumption measurements, show the advantages and disadvantages of each approach.

IJCAI Conference 2013 Conference Paper

Adaptive Management of Migratory Birds under Sea Level Rise

  • Samuel Nicol
  • Olivier Buffet
  • Takuya Iwamura
  • Iadine Chadès

The best practice method for managing ecological systems under uncertainty is adaptive management (AM), an iterative process of reducing uncertainty while simultaneously optimizing a management objective. Existing solution methods used for AM problems assume that the system dynamics are stationary, i.e., described by one of a set of pre-defined models. In reality ecological systems are rarely stationary and evolve over time. Importantly, the effects of climate change on populations are unlikely to be captured by stationary models. Practitioners need efficient algorithms to implement AM on real-world problems. AM can be formulated as a hidden model Markov Decision Process (hmMDP), which allows the state space to be factored and shows promise for the rapid resolution of large problems. We provide an ecological dataset and performance metrics for the AM of a network of shorebird species utilizing the East Asian-Australasian flyway given uncertainty about the rate of sea level rise. The non-stationary system is modelled as a stationary POMDP containing hidden alternative models with known probabilities of transition between them. We challenge the POMDP community to exploit the simplifications allowed by structuring the AM problem as an hmMDP and improve our benchmark solutions.

IJCAI Conference 2013 Conference Paper

Optimally Solving Dec-POMDPs as Continuous-State MDPs

  • Jilles Steeve Dibangoye
  • Christopher Amato
  • Olivier Buffet
  • François Charpillet

Optimally solving decentralized partially observable Markov decision processes (Dec-POMDPs) is a hard combinatorial problem. Current algorithms search through the space of full histories for each agent. Because of the doubly exponential growth in the number of policies in this space as the planning horizon increases, these methods quickly become intractable. However, in real world problems, computing policies over the full history space is often unnecessary. True histories experienced by the agents often lie near a structured, low-dimensional manifold embedded into the history space. We show that by transforming a Dec-POMDP into a continuous-state MDP, we are able to find and exploit these low-dimensional representations. Using this novel transformation, we can then apply powerful techniques for solving POMDPs and continuous-state MDPs. By combining a general search algorithm and dimension reduction based on feature selection, we introduce a novel approach to optimally solve problems with significantly longer planning horizons than previous methods.

AAAI Conference 2012 Conference Paper

MOMDPs: A Solution for Modelling Adaptive Management Problems

  • Iadine Chades
  • Josie Carwardine
  • Tara Martin
  • Samuel Nicol
  • Regis Sabbadin
  • Olivier Buffet

In conservation biology and natural resource management, adaptive management is an iterative process of improving management by reducing uncertainty via monitoring. Adaptive management is the principal tool for conserving endangered species under global change, yet adaptive management problems suffer from a poor suite of solution methods. The common approach used to solve an adaptive management problem is to assume the system state is known and the system dynamics can be one of a set of pre-defined models. The solution method used is unsatisfactory, employing value iteration on a discretized belief MDP which restricts the study to very small problems. We show how to overcome this limitation by modelling an adaptive management problem as a restricted Mixed Observability MDP called hidden model MDP (hmMDP). We demonstrate how to simplify the value function, the backup operator and the belief update computation. We show that, although a simplified case of POMDPs, hmMDPs are PSPACE-complete in the finite-horizon case. We illustrate the use of this model to manage a population of the threatened Gouldian finch, a bird species endemic to Northern Australia. Our simple modelling approach is an important step towards efficient algorithms for solving adaptive management problems.
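
The simplification hmMDPs afford can be sketched in a few lines: since the system state is observed, the belief collapses to a distribution over the finite model set, updated by how well each candidate model predicts the observed transition. Names below are illustrative, not from the paper.

```python
# Hedged sketch of the hidden-model belief update in an hmMDP.
# belief: {m: p}; models[m][s][a] -> {s2: p} (assumed explicit tables).
def model_belief_update(belief, s, a, s2, models):
    post = {m: p * models[m][s][a].get(s2, 0.0) for m, p in belief.items()}
    z = sum(post.values())
    return {m: p / z for m, p in post.items()} if z > 0 else belief
```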

AAAI Conference 2012 Conference Paper

POMDPs Make Better Hackers: Accounting for Uncertainty in Penetration Testing

  • Carlos Sarraute
  • Olivier Buffet
  • Jörg Hoffmann

Penetration Testing is a methodology for assessing network security, by generating and executing possible hacking attacks. Doing so automatically allows for regular and systematic testing. A key question is how to generate the attacks. This is naturally formulated as planning under uncertainty, i.e., under incomplete knowledge about the network configuration. Previous work uses classical planning, and requires costly pre-processes reducing this uncertainty by extensive application of scanning methods. By contrast, we herein model the attack planning problem in terms of partially observable Markov decision processes (POMDPs). This allows us to reason about the knowledge available, and to intelligently employ scanning actions as part of the attack. As one would expect, this accurate solution does not scale. We devise a method that relies on POMDPs to find good attacks on individual machines, which are then composed into an attack on the network as a whole. This decomposition exploits network structure to the extent possible, making targeted approximations (only) where needed. Evaluating this method on a suitably adapted industrial test suite, we demonstrate its effectiveness in both runtime and solution quality.

EWRL Workshop 2011 Conference Paper

Active Learning of MDP Models

  • Mauricio Araya-López
  • Olivier Buffet
  • Vincent Thomas
  • François Charpillet

We consider the active learning problem of inferring the transition model of a Markov Decision Process by acting and observing transitions. This is particularly useful when no reward function is a priori defined. Our proposal is to cast the active learning task as a utility maximization problem using Bayesian reinforcement learning with belief-dependent rewards. After presenting three possible performance criteria, we derive from them the belief-dependent rewards to be used in the decision-making process. As computing the optimal Bayesian value function is intractable for large horizons, we use a simple algorithm to approximately solve this optimization problem. Despite the sub-optimality of this technique, we show experimentally that our proposal is efficient in a number of domains.
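
A minimal sketch of the Bayesian model-learning substrate, assuming Dirichlet counts per (state, action) pair; the entropy score below is a hypothetical stand-in for the paper's three performance criteria:

```python
# Hedged sketch: Dirichlet posterior over an MDP's transition model,
# with an illustrative uncertainty score per (state, action).
import math
from collections import defaultdict

class DirichletModel:
    def __init__(self, prior=1.0):
        self.counts = defaultdict(lambda: defaultdict(lambda: prior))

    def update(self, s, a, s2):
        """Record one observed transition (s, a) -> s2."""
        self.counts[(s, a)][s2] += 1.0

    def mean(self, s, a):
        """Posterior-mean transition distribution for (s, a)."""
        c = self.counts[(s, a)]
        tot = sum(c.values())
        return {s2: v / tot for s2, v in c.items()}

    def entropy(self, s, a):
        """Entropy of the posterior mean: one possible uncertainty score."""
        return -sum(p * math.log(p)
                    for p in self.mean(s, a).values() if p > 0)
```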

NeurIPS Conference 2010 Conference Paper

A POMDP Extension with Belief-dependent Rewards

  • Mauricio Araya
  • Olivier Buffet
  • Vincent Thomas
  • François Charpillet

Partially Observable Markov Decision Processes (POMDPs) model sequential decision-making problems under uncertainty and partial observability. Unfortunately, some problems cannot be modeled with state-dependent reward functions, e.g., problems whose objective explicitly implies reducing the uncertainty on the state. To that end, we introduce ρ-POMDPs, an extension of POMDPs where the reward function ρ depends on the belief state. We show that, under the common assumption that ρ is convex, the value function is also convex, which makes it possible to (1) approximate ρ arbitrarily well with a piecewise linear and convex (PWLC) function, and (2) use state-of-the-art exact or approximate solving algorithms with limited changes.
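
For intuition, here is a hedged sketch of the PWLC machinery with negative entropy as a typical convex, uncertainty-reducing ρ (an example choice, not necessarily the paper's): tangent hyperplanes at sampled beliefs play the role of α-vectors and lower-bound ρ over the simplex.

```python
# Hedged sketch: PWLC lower approximation of a convex belief-dependent
# reward by a max over tangent hyperplanes.
import numpy as np

def neg_entropy(b):
    b = np.clip(b, 1e-12, 1.0)
    return float(np.sum(b * np.log(b)))

def tangent(b0):
    """On the simplex, alpha = log(b0) satisfies <alpha, b> <= rho(b),
    with equality at b0 (first-order expansion of sum b log b)."""
    return np.log(np.clip(b0, 1e-12, 1.0))

def pwlc_approx(b, anchor_beliefs):
    """Max over tangents: a PWLC lower approximation of rho at b."""
    return max(float(np.dot(tangent(b0), b)) for b0 in anchor_beliefs)
```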

AAMAS Conference 2010 Conference Paper

Influence of Different Execution Models on Patrolling Ants Behaviors: from Agents to Robots

  • Arnaud Glad
  • Olivier Simonin
  • Olivier Buffet
  • François Charpillet

Generally, swarm models and algorithms consider synchronous agents, i.e., agents that act simultaneously. This hypothesis fits neither multi-agent simulators nor robotic systems. In this paper, we consider such issues on a patrolling ant algorithm. We examine how different execution hypotheses influence the self-organization capabilities and patrolling performance of this algorithm. We consider the mono- and multi-agent cases, the synchronism and determinism hypotheses, and the execution of the model with real robots.

AIJ Journal 2009 Journal Article

The factored policy-gradient planner

  • Olivier Buffet
  • Douglas Aberdeen

We present an any-time concurrent probabilistic temporal planner (CPTP) that includes continuous and discrete uncertainties and metric functions. Rather than relying on dynamic programming, our approach builds on methods from stochastic local policy search. That is, we optimise a parameterised policy using gradient ascent. The flexibility of this policy-gradient approach, combined with its low memory use, the use of function approximation methods and the factorisation of the policy, allows us to tackle complex domains. This factored policy gradient (FPG) planner can optimise steps to goal, the probability of success, or attempt a combination of both. We compare the FPG planner to other planners on CPTP domains, and on simpler but better studied non-concurrent non-temporal probabilistic planning (PP) domains. We present FPG-ipc, the PP version of the planner, which was successful in the probabilistic track of the fifth international planning competition.
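
The underlying policy-gradient principle can be sketched compactly: a REINFORCE-style ascent step for a linear-softmax policy, with `run_episode` and `features` as hypothetical problem hooks. The planner's per-action function approximators and factorisation are reduced here to a single linear layer.

```python
# Hedged sketch of stochastic local policy search by gradient ascent
# (REINFORCE for a linear-softmax policy); not the FPG planner itself.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def reinforce_step(theta, features, run_episode, lr=0.01):
    """theta: [n_actions, n_features]; run_episode(policy) -> list of
    (state, action, return-to-go) triples from one sampled episode."""
    def policy(s):
        return softmax(theta @ features(s))
    grad = np.zeros_like(theta)
    for s, a, G in run_episode(policy):
        p = policy(s)
        onehot = np.zeros(len(p))
        onehot[a] = 1.0
        grad += G * np.outer(onehot - p, features(s))  # G * d log pi / d theta
    return theta + lr * grad
```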

ECAI Conference 2008 Conference Paper

Theoretical Study of Ant-based Algorithms for Multi-Agent Patrolling

  • Arnaud Glad
  • Olivier Simonin 0001
  • Olivier Buffet
  • François Charpillet

This paper addresses the multi-agent patrolling problem, which requires a set of autonomous agents to visit all the places of an unknown environment as regularly as possible. The proposed approach is based on the ant paradigm: each agent can only mark and move according to its local perception of the environment. We study EVAW, a pheromone-based variant of the EVAP [3] and VAW [12] algorithms. The main novelty of the paper is the proof of some emergent spatial properties of the proposed algorithm. In particular, we show that the cycles obtained are necessarily of the same length, which ensures an efficient spatial distribution of the agents. We also report some experimental results and discuss open questions concerning the proposed algorithm.
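
The pheromone mechanism itself fits in a few lines. A toy sketch, with hypothetical grid and neighbourhood hooks: each cell stores its last visit time, and an agent greedily moves to the least recently visited neighbour and stamps it.

```python
# Toy sketch of an EVAW-style patrolling step; `neighbours(pos)` is an
# assumed hook returning adjacent cells, `marks` maps cells to last visit
# times (unvisited cells default to -1 and are preferred).
def evaw_step(pos, marks, neighbours, clock):
    """Move to the least recently visited neighbour and stamp it."""
    nxt = min(neighbours(pos), key=lambda c: marks.get(c, -1))
    marks[nxt] = clock
    return nxt
```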

IJCAI Conference 2007 Conference Paper

Factored Planning Using Decomposition Trees

  • Elena Kelareva
  • Olivier Buffet
  • Jinbo Huang
  • Sylvie Thiébaux

Improving AI planning algorithms relies on the ability to exploit the structure of the problem at hand. A promising direction is that of factored planning, where the domain is partitioned into subdomains with as little interaction as possible. Recent work in this field has led to a detailed theoretical analysis of such approaches and to a couple of high-level planning algorithms, but with no practical implementation or only limited experimentation. This paper presents dTreePlan, a new generic factored planning algorithm which uses a decomposition tree to efficiently partition the domain. We discuss some of its aspects, progressively describing a specific implementation before presenting experimental results. This prototype algorithm is a promising contribution, with major possible improvements, and helps enrich the picture of factored planning approaches.

ICAPS Conference 2007 Conference Paper

Concurrent Probabilistic Temporal Planning with Policy-Gradients

  • Douglas Aberdeen
  • Olivier Buffet

We present an any-time concurrent probabilistic temporal planner that includes continuous and discrete uncertainties and metric functions. Our approach is a direct policy search that attempts to optimise a parameterised policy using gradient ascent. Low memory use, plus the use of function approximation methods, plus factorisation of the policy, allow us to scale to challenging domains. This Factored Policy Gradient (FPG) Planner also attempts to optimise both steps to goal and the probability of success. We compare the FPG planner to other planners on CPTP domains, and on simpler but better studied probabilistic non-temporal domains.

ICAPS Conference 2007 Conference Paper

FF + FPG: Guiding a Policy-Gradient Planner

  • Olivier Buffet
  • Douglas Aberdeen

The Factored Policy-Gradient planner (FPG) was a successful competitor in the probabilistic track of the 2006 International Planning Competition (IPC). FPG is innovative because it scales to large planning domains through the use of Reinforcement Learning. It essentially performs a stochastic local search in policy space. FPG's weakness is potentially long learning times, as it initially acts randomly and progressively improves its policy each time the goal is reached. This paper shows how to use an external teacher to guide FPG's exploration. While any teacher can be used, we concentrate on the actions suggested by FF's heuristic as FF-replan has proved efficient for probabilistic re-planning. To achieve this, FPG must learn its own policy while following another. We thus extend FPG to off-policy learning using importance sampling. The resulting algorithm is presented and evaluated on IPC benchmarks.

JAAMAS Journal 2007 Journal Article

Shaping multi-agent systems with gradient reinforcement learning

  • Olivier Buffet
  • Alain Dutech
  • François Charpillet

An original reinforcement learning (RL) methodology is proposed for the design of multi-agent systems. In the realistic setting of situated agents with local perception, the task of automatically building a coordinated system is of crucial importance. To that end, we design simple reactive agents in a decentralized way as independent learners. But to cope with the difficulties inherent to RL used in that framework, we have developed an incremental learning algorithm where agents face a sequence of progressively more complex tasks. We illustrate this general framework by computer experiments where agents have to coordinate to reach a global goal.

IJCAI Conference 2005 Conference Paper

Robust Planning with (L)RTDP

  • Olivier Buffet
  • Douglas Aberdeen

Stochastic Shortest Path problems (SSPs), a subclass of Markov Decision Problems (MDPs), can be efficiently dealt with using Real-Time Dynamic Programming (RTDP). Yet, MDP models are often uncertain (obtained through statistics or guessing). The usual approach is robust planning: searching for the best policy under the worst model. This paper shows how RTDP can be made robust in the common case where transition probabilities are known to lie in a given interval.