Shielding Atari Games with Bounded Prescience

Mirco Giacobbe; Mohammadhosein Hasanbeig; Daniel Kroening; Hjalmar Wijk

AAMAS 2021

Conference Paper Extended Abstracts Autonomous Agents and Multiagent Systems

Abstract

We present the first explicit-state method for analysing and ensuring the safety of deep reinforcement learning (DRL) agents for Atari games. Our method requires only access to the emulator. We give a suite of 42 properties that characterise “safe behaviour” for 31 games. We evaluate the safety of the best available DRL agents which, as our experiments show, violate most of our properties. We propose a countermeasure that implements shielding using bounded explicit-state exploration. Our method improves the overall safety of these agents, producing the safest DRL agents for Atari games currently available.
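
The idea of shielding through bounded explicit-state exploration can be illustrated with a small sketch. This is not the authors' implementation: it assumes a deterministic environment whose transition function can be called directly (a stand-in for cloning and stepping an emulator), and the names `step`, `unsafe`, `is_safe_within` and `shielded_action`, as well as the toy position/velocity model, are hypothetical. The shield walks through the agent's actions in order of preference and picks the first one from which some action sequence avoids unsafe states for the given horizon.

```python
from typing import Callable, Hashable, Sequence

State = Hashable
Action = int


def is_safe_within(state: State,
                   step: Callable[[State, Action], State],
                   actions: Sequence[Action],
                   unsafe: Callable[[State], bool],
                   depth: int) -> bool:
    """True if `state` is safe and some action sequence of length
    `depth` starting from it never visits an unsafe state."""
    if unsafe(state):
        return False
    if depth == 0:
        return True
    return any(
        is_safe_within(step(state, a), step, actions, unsafe, depth - 1)
        for a in actions
    )


def shielded_action(state: State,
                    ranked_actions: Sequence[Action],
                    step: Callable[[State, Action], State],
                    unsafe: Callable[[State], bool],
                    horizon: int) -> Action:
    """Return the agent's most preferred action that keeps a safe
    continuation within `horizon` steps (horizon >= 1). If no action
    qualifies, fall back to the agent's first choice."""
    for a in ranked_actions:
        if is_safe_within(step(state, a), step, ranked_actions,
                          unsafe, horizon - 1):
            return a
    return ranked_actions[0]


# Toy deterministic model (an assumption, not an Atari game):
# a state is (position, velocity); an action changes the velocity.
def step(state, action):
    pos, vel = state
    vel = vel + action
    return (pos + vel, vel)


def unsafe(state):
    return state[0] <= 0  # positions at or below 0 count as a crash


if __name__ == "__main__":
    ranked = [-1, 0, 1]   # the agent prefers to accelerate downwards
    start = (3, -1)
    print(shielded_action(start, ranked, step, unsafe, horizon=1))  # -1
    print(shielded_action(start, ranked, step, unsafe, horizon=2))  # 0
```

With a horizon of 1 the shield only sees that the preferred action is immediately safe and lets it through; with a horizon of 2 it finds that every continuation after that action crashes and overrides it with the next preference. The exploration cost grows exponentially with the horizon in this naive form, which is why the lookahead is bounded.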

Keywords

Safe AI
Deep Reinforcement Learning
Atari Games

Context

Venue: International Conference on Autonomous Agents and Multiagent Systems
Archive span: 2002-2025
Indexed papers: 7403
Paper id: 392466951110579944