Arrow Research search

Author name cluster

Joerg Hoffmann

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers
1 author row

Possible papers

PRL Workshop 2021 Workshop Paper

Debugging a Policy: A Framework for Automatic Action Policy Testing

  • Marcel Steinmetz
  • Timo P. Gros
  • Philippe Heim
  • Daniel Höller
  • Joerg Hoffmann

Neural network (NN) action policies are an attractive option for real-time action decisions in dynamic environments. Yet this requires a high degree of trust in the NN. How to gain such trust? Systematic testing certainly is one possible answer, in analogy to program testing. The input to the program becomes the start state for the policy; and erroneous program behaviors – "bugs" – become bad policy behavior, e.g., not reaching the goal from a solvable state. We introduce a framework spelling out this concept. The framework is generic and in principle applicable to arbitrary planning models. We discuss how this form of testing can be operationalized, i.e., how to confirm that a bug has been found, and how potential bugs might be identified in the first place. This essentially involves seeing standard planning concepts through the new lens of policy testing. The implementation and practical exploration of this framework remains open for future work. We believe that action policy testing is an important topic for ICAPS, and we hope that our framework will serve to start its discussion.

PRL Workshop 2021 Workshop Paper

Neural Network Heuristics for Classical Planning: Reinforcement Learning and Comparison to Other Methods

  • Patrick Ferber
  • Florian Geißer
  • Felipe Trevizan
  • Malte Helmert
  • Joerg Hoffmann

How can we train neural network (NN) heuristic functions for classical planning, using only states as the NN input? Prior work addressed this question by (a) supervised learning and/or (b) per-domain learning generalizing over problem instances. The former limits the approach to instances small enough for training data generation, the latter to domains and instance distributions where the necessary knowledge generalizes across instances. Clearly, reinforcement learning (RL) on large instances can potentially avoid both difficulties. We explore this here in terms of three methods drawing on previous ideas relating to bootstrapping and approximate value iteration, including a new bootstrapping variant that estimates search effort instead of goal distance. We empirically compare these methods to (a) and (b), aligning three different NN heuristic function learning architectures for cross-comparison in an experiment of unprecedented breadth in this context. Key lessons from this experiment are that our methods and supervised learning are highly complementary; that per-instance learning often yields stronger heuristics than per-domain learning; and that LAMA is still dominant but is outperformed by our methods in one benchmark domain.

AAAI Conference 2020 Conference Paper

Beliefs We Can Believe in: Replacing Assumptions with Data in Real-Time Search

  • Maximilian Fickert
  • Tianyi Gu
  • Leonhard Staut
  • Wheeler Ruml
  • Joerg Hoffmann
  • Marek Petrik

Suboptimal heuristic search algorithms can benefit from reasoning about heuristic error, especially in a real-time setting where there is not enough time to search all the way to a goal. However, current reasoning methods implicitly or explicitly incorporate assumptions about the cost-to-go function. We consider a recent real-time search algorithm, called Nancy, that manipulates explicit beliefs about the cost-to-go. The original presentation of Nancy assumed that these beliefs are Gaussian, with parameters following a certain form. In this paper, we explore how to replace these assumptions with actual data. We develop a data-driven variant of Nancy, DDNancy, that bases its beliefs on heuristic performance statistics from the same domain. We extend Nancy and DDNancy with the notion of persistence and prove their completeness. Experimental results show that DDNancy can perform well in domains in which the original assumption-based Nancy performs poorly.
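The belief-based view of action selection can be illustrated with a small Monte Carlo sketch. This is a toy stand-in of our own devising, not Nancy's actual risk computation: beliefs here are arbitrary Gaussians (mean, stddev) over cost-to-go, and we estimate the regret of committing to the action with the lowest believed mean. The function name and interface are illustrative assumptions.

```python
import random

def expected_regret(beliefs, n=20000, seed=0):
    """beliefs: dict action -> (mean, stddev), a Gaussian belief over
    the cost-to-go after taking that action.

    Estimates, by sampling, the expected regret of committing to the
    action with the lowest believed mean, versus an oracle that always
    picks the truly cheapest action. (Toy illustration only.)
    """
    rng = random.Random(seed)
    greedy = min(beliefs, key=lambda a: beliefs[a][0])
    total = 0.0
    for _ in range(n):
        # One imagined "true" outcome drawn from each belief.
        draws = {a: rng.gauss(m, s) for a, (m, s) in beliefs.items()}
        total += draws[greedy] - min(draws.values())
    return total / n
```

Under beliefs like `{'left': (10.0, 4.0), 'right': (11.0, 0.5)}`, committing to the high-variance 'left' action carries nonzero expected regret even though its mean is lower; belief-aware strategies such as Nancy's aim to direct search effort at exactly this kind of risk.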

PRL Workshop 2020 Workshop Paper

Real-time Planning as Data-driven Decision-making

  • Maximilian Fickert
  • Tianyi Gu
  • Leonhard Staut
  • Sai Lekyang
  • Wheeler Ruml
  • Joerg Hoffmann
  • Marek Petrik

If reinforcement learning (RL) is the use of incrementally gathered data to drive decision-making, then any heuristic search strategy is fundamentally an RL process. This is perhaps clearest in real-time planning, where an agent must select the next action to take within a fixed time bound. Even in deterministic domains, real-time action selection inherently suffers from uncertainty about those portions of the state space that have not yet been computed by the lookahead search. In this paper, we present new results in a line of research that explores how an agent can benefit from metareasoning about this uncertainty. Taking inspiration from prior work in distributional methods from RL, the Nancy search framework represents its uncertainty explicitly as beliefs over cost-to-go. Nancy then expands nodes so as to minimize the expected regret in case a non-optimal action is chosen. We present detailed results showing how beliefs can be informed by prior experience and we experimentally compare Nancy against both conventional real-time search algorithms like LSS-LRTA* and approaches from RL that exploit uncertainty, such as Monte Carlo tree search and Kaelbling’s interval estimation. We find that Nancy generally outperforms previous methods, particularly on more difficult problems. This work illustrates how the distributional perspective from Bayesian RL can be adapted to deterministic planning settings, and how deterministic planning can provide useful testbeds for methods that metareason about uncertainty during planning.

PRL Workshop 2020 Workshop Paper

Reinforcement Learning for Planning Heuristics

  • Patrick Ferber
  • Malte Helmert
  • Joerg Hoffmann

Informed heuristics are essential for the success of heuristic search algorithms. But it is difficult to develop a new heuristic that is informed on a wide variety of tasks. Instead, we propose a framework that trains a neural network as a heuristic for the tasks it is supposed to solve. We present two reinforcement learning approaches to learn heuristics for fixed state spaces and fixed goals. Our first approach uses approximate value iteration; our second approach uses searches to generate training data. We show that in some domains our approaches outperform previous work, and we point out potential for future improvements.
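The first approach (approximate value iteration) can be sketched in a much-simplified tabular form on an explicit state space; in the paper's setting, a neural network would replace the table and generalize across states. The names and toy interface below are illustrative assumptions, not the authors' code.

```python
def approx_value_iteration(states, goals, successors, iters=100):
    """Learn a cost-to-go table h by the bootstrapping update
    h(s) <- min over (cost, s') in successors(s) of cost + h(s').

    `successors(s)` returns a list of (action_cost, next_state) pairs.
    The table is a stand-in for the neural network heuristic.
    """
    h = {s: 0.0 for s in states}
    for _ in range(iters):
        for s in states:
            if s in goals:
                h[s] = 0.0
                continue
            costs = [c + h[t] for c, t in successors(s)]
            h[s] = min(costs) if costs else float("inf")  # dead end
    return h
```

On a chain 0 → 1 → 2 → 3 with unit action costs and goal state 3, the table converges to the true goal distances h = {0: 3, 1: 2, 2: 1, 3: 0}.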

IJCAI Conference 2018 Conference Paper

LP Heuristics over Conjunctions: Compilation, Convergence, Nogood Learning

  • Marcel Steinmetz
  • Joerg Hoffmann

Two strands of research in classical planning are LP heuristics and conjunctions to improve approximations. Combinations of the two have also been explored. Here, we focus on convergence properties, forcing the LP heuristic to equal the perfect heuristic h* in the limit. We show that, under reasonable assumptions, partial variable merges are strictly dominated by the compilation Pi^C of explicit conjunctions, and that both render the state equation heuristic equal to h* for a suitable set C of conjunctions. We show that consistent potential heuristics can be computed from a variant of Pi^C, and that such heuristics can represent h* for suitable C. As an application of these convergence properties, we consider sound nogood learning in state space search, via refining the set C. We design a suitable refinement method to this end. Experiments on IPC benchmarks show significant performance improvements in several domains.

IJCAI Conference 2018 Conference Paper

Unchaining the Power of Partial Delete Relaxation, Part II: Finding Plans with Red-Black State Space Search

  • Maximilian Fickert
  • Daniel Gnad
  • Joerg Hoffmann

Red-black relaxation in classical planning makes it possible to interpolate between delete-relaxed and real planning. Yet the traditional use of relaxations to generate heuristics restricts relaxation usage to tractable fragments. How to actually tap into the red-black relaxation's interpolation power? Prior work has devised red-black state space search (RBS) for intractable red-black planning, and has explored two uses: proving unsolvability, and generating seed plans for plan repair. Here, we explore the generation of plans directly through RBS. We design two enhancements to this end: (A) use a known tractable fragment where possible, and RBS for the intractable parts; (B) check RBS state transitions for realizability, spawning relaxation refinements where the check fails. We show the potential merits of both techniques on IPC benchmarks.

JAIR Journal 2016 Journal Article

Combining the Delete Relaxation with Critical-Path Heuristics: A Direct Characterization

  • Maximilian Fickert
  • Joerg Hoffmann
  • Marcel Steinmetz

Recent work has shown how to improve delete relaxation heuristics by computing relaxed plans, i.e., the h^FF heuristic, in a compiled planning task Pi^C which represents a given set C of fact conjunctions explicitly. While this compilation view of such partial delete relaxation is simple and elegant, its meaning with respect to the original planning task is opaque, and the size of Pi^C grows exponentially in |C|. We herein provide a direct characterization, without compilation, making explicit how the approach arises from a combination of the delete relaxation with critical-path heuristics. Designing equations characterizing a novel view on h^+ on the one hand, and a generalized version h^C of h^m on the other hand, we show that h^+(Pi^C) can be characterized in terms of a combined h^{C+} equation. This naturally generalizes the standard delete-relaxation framework: understanding that framework as a relaxation over singleton facts as atomic subgoals, one can refine the relaxation by using the conjunctions C as atomic subgoals instead. Thanks to this explicit view, we identify the precise source of complexity in h^FF(Pi^C), namely the maximization over sets of supported atomic subgoals during relaxed plan extraction, which is easy for singleton-fact subgoals but NP-complete in the general case. Approximating that problem greedily, we obtain a polynomial-time h^{CFF} version of h^FF(Pi^C), superseding the Pi^C compilation, and superseding the modified Pi^C_ce compilation which achieves the same complexity reduction but at an information loss. Experiments on IPC benchmarks show that these theoretical advantages can translate into empirical ones.

AAAI Conference 2016 Conference Paper

Towards Clause-Learning State Space Search: Learning to Recognize Dead-Ends

  • Marcel Steinmetz
  • Joerg Hoffmann

We introduce a state space search method that identifies dead-end states, analyzes the reasons for failure, and learns to avoid similar mistakes in the future. Our work is placed in classical planning. The key technique is critical-path heuristics h^C, relative to a set C of conjunctions. These recognize a dead-end state s, returning h^C(s) = ∞, if s has no solution even when allowing conjunctive subgoals to be broken up into the elements of C. Our key idea is to learn C during search. Starting from a simple initial C, we augment search to identify unrecognized dead-ends s, where h^C(s) < ∞. We design methods analyzing the situation at such s, adding new conjunctions into C to obtain h^C(s) = ∞, thus learning to recognize s as well as similar dead-ends search may encounter in the future. We furthermore learn clauses φ where s ⊭ φ implies h^C(s) = ∞, to avoid the prohibitive overhead of computing h^C on every search state. Arranging these techniques in a depth-first search, we obtain an algorithm approaching the elegance of clause learning in SAT, learning to refute search subtrees. Our experiments show that this can be quite powerful. On problems where dead-ends abound, the learning reliably reduces the search space by several orders of magnitude.
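The base case of h^C dead-end detection, with C containing only singleton facts (i.e., the delete relaxation), can be sketched as a simple fixpoint reachability test. This is an illustrative simplification under a STRIPS-like encoding of our own choosing; the paper's contribution lies in refining C beyond singletons and in clause learning, which this sketch does not do.

```python
def relaxed_reachable(state, actions, goal):
    """Fixpoint fact reachability ignoring delete effects.

    state, goal: sets of facts; actions: list of (preconditions, adds)
    pairs of fact sets. If some goal fact is never reached, the state
    is a recognized dead end: h^C(state) = infinity for singleton C.
    """
    reached = set(state)
    changed = True
    while changed:
        changed = False
        for pre, add in actions:
            # Apply any action whose preconditions are reached and
            # which still adds something new.
            if pre <= reached and not add <= reached:
                reached |= add
                changed = True
    return goal <= reached

def is_recognized_dead_end(state, actions, goal):
    return not relaxed_reachable(state, actions, goal)
```

A state from which no action chain can even delete-relaxedly reach the goal is pruned immediately; the learning in the paper extends this detector so that ever more of the genuinely unsolvable states are recognized.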

AAAI Conference 2013 Conference Paper

Red-Black Relaxed Plan Heuristics

  • Michael Katz
  • Joerg Hoffmann
  • Carmel Domshlak

Despite its success, the delete relaxation has significant pitfalls. Recent work has devised the red-black planning framework, where red variables take the relaxed semantics (accumulating their values), while black variables take the regular semantics. Provided the red variables are chosen so that red-black plan generation is tractable, one can generate such a plan for every search state, and take its length as the heuristic distance estimate. Previous results were not suitable for this purpose because they identified tractable fragments for red-black plan existence, as opposed to red-black plan generation. We identify a new fragment of red-black planning that fixes this issue. We devise machinery to efficiently generate red-black plans, and to automatically select the red variables. Experiments show that the resulting heuristics can significantly improve over standard delete relaxation heuristics.

AAAI Conference 2012 Conference Paper

Semi-Relaxed Plan Heuristics

  • Emil Keyder
  • Joerg Hoffmann
  • Patrik Haslum

The currently dominant approach to domain-independent planning is planning as heuristic search, with most successful planning heuristics being based on solutions to delete-relaxed versions of planning problems, in which the negative effects of actions are ignored. We introduce a principled, flexible, and practical technique for augmenting delete-relaxed tasks with a limited amount of delete information, by introducing special fluents that explicitly represent conjunctions of fluents in the original planning task. Differently from previous work, conditional effects are used to limit the growth of the task to be linear in the number of such conjunctions, making its use for obtaining heuristic functions feasible. The resulting heuristics are empirically evaluated, and shown to be sometimes much more informative than standard delete-relaxation heuristics.

AAAI Conference 2010 Conference Paper

SAP Speaks PDDL

  • Joerg Hoffmann
  • Ingo Weber
  • Frank Kraft

In several application areas for Planning, in particular helping with the creation of new processes in Business Process Management (BPM), a major obstacle lies in the modeling. Obtaining a suitable model to plan with is often prohibitively complicated and/or costly. Our core observation in this work is that, for software-architectural purposes, SAP is already using a model that is essentially a variant of PDDL. That model describes the behavior of Business Objects, in terms of status variables and how they are affected by system transactions. We show herein that one can leverage the model to obtain (a) a promising BPM planning application which incurs hardly any modeling costs, and (b) an interesting planning benchmark. We design a suitable planning formalism and an adaptation of FF, and we perform large-scale experiments. Our prototype is part of a research extension to the SAP NetWeaver platform.

IJCAI Conference 2007 Conference Paper

From Sampling to Model Counting

  • Carla P. Gomes
  • Joerg Hoffmann
  • Ashish Sabharwal
  • Bart Selman

We introduce a new technique for counting models of Boolean satisfiability problems. Our approach incorporates information obtained from sampling the solution space. Unlike previous approaches, our method does not require uniform or near-uniform samples. It instead converts local search sampling without any guarantees into very good bounds on the model count with guarantees. We give a formal analysis and provide experimental results showing the effectiveness of our approach.
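One ingredient of sampling-based counting can be made concrete: from a batch of (possibly biased) solution samples, pick the variable whose true/false split is most balanced, since fixing such a variable roughly halves the model count. The function below is a toy of our own devising; it is not the paper's estimator and carries none of its guarantees.

```python
def most_balanced_variable(samples):
    """samples: non-empty list of solutions, each a dict var -> bool.

    Returns (var, multiplier): the variable whose observed true/false
    split is closest to 50/50, and the factor 1 / max(p, 1 - p) by
    which fixing var to its majority value is estimated to shrink the
    count (2.0 for a perfect 50/50 split).
    """
    n = len(samples)
    best_var, best_mult = None, 0.0
    for var in samples[0]:
        trues = sum(1 for sol in samples if sol[var])
        p = max(trues, n - trues) / n  # majority fraction
        if 1.0 / p > best_mult:
            best_var, best_mult = var, 1.0 / p
    return best_var, best_mult
```

Repeating this step (fix the chosen variable, re-sample, multiply the multipliers) yields a rough count estimate; the paper's contribution is turning such guarantee-free local search samples into bounds that do carry guarantees.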

AAAI Conference 2007 Conference Paper

Web Service Composition as Planning, Revisited: In Between Background Theories and Initial State Uncertainty

  • Joerg Hoffmann

Thanks to recent advances, AI Planning has become the underlying technique for several applications. Amongst these, a prominent one is automated Web Service Composition (WSC). One important issue in this context has hardly been addressed so far: WSC requires dealing with background ontologies. The support for those is severely limited in current planning tools. We introduce a planning formalism that faithfully represents WSC. We show that, unsurprisingly, planning in such a formalism is very hard. We then identify an interesting special case that covers many relevant WSC scenarios, and where the semantics are simpler and easier to deal with. This opens the way to the development of effective support tools for WSC. Furthermore, we show that if one additionally limits the amount and form of outputs that can be generated, then the set of possible states becomes static, and can be modelled in terms of a standard notion of initial state uncertainty. For this, effective tools exist; these can realize scalable WSC with powerful background ontologies. In an initial experiment, we show that scaling WSC instances are comfortably solved by a tool incorporating modern planning heuristics.