Arrow Research

Author name cluster

Philippe Heim

Papers possibly associated with this exact author name in Arrow. This page groups case-insensitive exact name matches; it is not a full identity-disambiguation profile.

2 papers
2 author rows

Possible papers (2)

ICAPS 2022 Conference Paper

Debugging a Policy: Automatic Action-Policy Testing in AI Planning

  • Marcel Steinmetz
  • Daniel Fiser
  • Hasan Ferit Eniser
  • Patrick Ferber
  • Timo P. Gros
  • Philippe Heim
  • Daniel Höller
  • Xandra Schuler

Testing is a promising way to gain trust in neural action policies π. Previous work on policy testing in sequential decision making targeted environment behavior leading to failure conditions. But if the failure is unavoidable given that behavior, then π is not actually to blame. For a situation to qualify as a "bug" in π, there must be an alternative policy π' that does better. We introduce a generic policy testing framework based on that intuition. This raises the bug confirmation problem: deciding whether or not a given state is a bug. We analyze the use of optimistic and pessimistic bounds for the design of test oracles approximating that problem. We contribute an implementation of our framework in classical planning, experimenting with several test oracles and with random-walk methods that generate test states biased towards poor policy performance and/or state novelty. We evaluate these techniques on policies π learned with ASNets. We find that they effectively identify bugs in these π, and that our random-walk biases improve over uninformed baselines.
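The oracle idea in this abstract can be made concrete with bounds on the optimal cost h*(s) from a state s: a pessimistic bound (e.g., the cost of any plan found from s) confirms a bug whenever it beats the policy, while an optimistic (admissible) bound confirms the absence of one whenever the policy matches it. Below is a minimal Python sketch of that decision logic; the names (oracle_is_bug, policy_cost, lower_bound, upper_bound) are illustrative placeholders, not the paper's actual implementation.

```python
# Sketch of the bug-confirmation idea: a state s is a bug in policy pi
# if some alternative policy pi' achieves strictly better cost from s.
# Since the optimal cost h*(s) is hard to compute exactly, a test
# oracle decides via optimistic/pessimistic bounds on h*(s).
# All function and parameter names here are hypothetical placeholders.

from typing import Callable, Optional

def oracle_is_bug(
    state,
    policy_cost: Callable,   # cost of following pi from state (inf if pi fails)
    lower_bound: Callable,   # optimistic (admissible) bound: L(s) <= h*(s)
    upper_bound: Callable,   # pessimistic bound, e.g. cost of a plan found from s
) -> Optional[bool]:
    """Approximate the bug-confirmation problem with bounds on h*(s).

    Returns True if state is confirmed a bug, False if confirmed not a
    bug, and None if the bounds are too loose to decide.
    """
    cost_pi = policy_cost(state)
    # If some concrete plan is strictly cheaper than what pi achieves,
    # a better alternative policy exists: confirmed bug.
    if upper_bound(state) < cost_pi:
        return True
    # If pi already matches an optimistic bound on the optimal cost,
    # no alternative policy can do better: confirmed not a bug.
    if cost_pi <= lower_bound(state):
        return False
    return None  # undecided: the bounds do not separate pi from the optimum
```

The gap between the two bounds is exactly the region where such an oracle must stay inconclusive; tighter bounds (stronger heuristics, better plans) shrink it.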

PRL 2021 Workshop Paper

Debugging a Policy: A Framework for Automatic Action Policy Testing

  • Marcel Steinmetz
  • Timo P. Gros
  • Philippe Heim
  • Daniel Höller
  • Joerg Hoffmann

Neural network (NN) action policies are an attractive option for real-time action decisions in dynamic environments. Yet this requires a high degree of trust in the NN. How can such trust be gained? Systematic testing certainly is one possible answer, in analogy to program testing. The input to the program becomes the start state for the policy, and erroneous program behaviors ("bugs") become bad policy behavior, e.g., not reaching the goal from a solvable state. We introduce a framework spelling out this concept. The framework is generic and in principle applicable to arbitrary planning models. We discuss how this form of testing can be operationalized, i.e., how to confirm that a bug has been found, and how potential bugs might be identified in the first place. This essentially involves seeing standard planning concepts through the new lens of policy testing. The implementation and practical exploration of this framework remain open for future work. We believe that action policy testing is an important topic for ICAPS, and we hope that our framework will serve to start its discussion.
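The framework's analogy can be made concrete: test inputs are start states, and a bug is bad policy behavior such as not reaching the goal from a solvable state. Below is a minimal Python sketch of one plausible testing loop in that spirit. All names in it (model, policy.action, is_solvable, and the unbiased random walk) are hypothetical stand-ins; the paper itself leaves implementation to future work, so this is only an illustration, not the authors' system.

```python
# Sketch of a policy-testing loop: sample test states, run the policy
# from each, and flag states where the policy misbehaves as bugs.
# The model/policy interfaces used here are hypothetical placeholders.

import random

def test_policy(policy, model, num_tests=100, walk_length=20, step_limit=200):
    """Sample test states by random walks, run the policy, collect bugs."""
    bugs = []
    for _ in range(num_tests):
        # Generate a test state by an (unbiased) random walk from the
        # initial state; informed biases for test-state generation are
        # omitted here for brevity.
        state = model.initial_state()
        for _ in range(walk_length):
            actions = model.applicable_actions(state)
            if not actions:  # dead end reached during the walk
                break
            state = model.step(state, random.choice(actions))

        # Execute the policy from the test state up to a step limit.
        current, reached_goal = state, False
        for _ in range(step_limit):
            if model.is_goal(current):
                reached_goal = True
                break
            current = model.step(current, policy.action(current))

        # Simplest oracle from the abstract: not reaching the goal from
        # a solvable state is bad policy behavior, i.e., a bug.
        if not reached_goal and model.is_solvable(state):
            bugs.append(state)
    return bugs
```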