Author name cluster

Martin M

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers

1 author row

IJCAI Conference 2007 Conference Paper

Adi Botea
Martin M
uuml; ller
Jonathan Schaeffer

Research on macro-operators has a long history in planning and other search applications. There has been a revival of interest in this topic, leading to systems that successfully combine macro-operators with current state-of-the-art planning approaches based on heuristic search. However, research is still necessary to make macros become a standard, widely-used enhancement of search algorithms. This article introduces sequences of macro-actions, called iterative macros. Iterative macros exhibit both the potential advantages (e. g. , travel fast towards goal) and the potential limitations (e. g. , utility problem) of classical macros, only on a much larger scale. A family of techniques are introduced to balance this trade-off in favor of faster planning. Experiments on a collection of planning benchmarks show that, when compared to low-level search and even to search with classical macro-operators, iterative macros can achieve an impressive speed-up in search.

PDF Details

IJCAI Conference 2007 Conference Paper

Kazuki Yoshizoe
Akihiro Kishimoto
Martin M
uuml; ller

Thomsen's λ search and Nagai's depth-first proof-number (DFPN) search are two powerful but very different AND/OR tree search algorithms. Lambda Depth-First Proof Number search (LDFPN) is a novel algorithm that combines ideas from both algorithms. λ search can dramatically reduce a search space by finding different levels of threat sequences. DFPN employs the notion of proof and disproof numbers to expand nodes expected to be easiest to prove or disprove. The method was shown to be effective for many games. Integrating λ order with proof and disproof numbers enables LDFPN to select moves more effectively, while preserving the efficiency of DFPN. LDFPN has been implemented for capturing problems in Go and is shown to be more efficient than DFPN and more robust than an algorithm based on classical λ search.

PDF Details

IJCAI Conference 2007 Conference Paper

David Silver
Richard Sutton
Martin M
uuml; ller

We explore an application to the game of Go of a reinforcement learning approach based on a linear evaluation function and large numbers of binary features. This strategy has proved effective in game playing programs and other reinforcement learning applications. We apply this strategy to Go by creating over a million features based on templates for small fragments of the board, and then use temporal difference learning and self-play. This method identifies hundreds of low level shapes with recognisable significance to expert Go players, and provides quantitive estimates of their values. We analyse the relative contributions to performance of templates of different types and sizes. Our results show that small, translation-invariant templates are surprisingly effective. We assess the performance of our program by playing against the Average Liberty Player and a variety of computer opponents on the 9x9 Computer Go Server. Our linear evaluation function appears to outperform all other static evaluation functions that do not incorporate substantial domain knowledge.

PDF Details