Author name cluster

William Uther

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers
1 author row

Possible papers

3

AAMAS Conference 2010 Conference Paper

Approximate Dynamic Programming with Affine ADDs

  • Scott Sanner
  • William Uther
  • Karina Valdivia Delgado

The Affine ADD (AADD) is an extension of the Algebraic Decision Diagram (ADD) that compactly represents context-specific, additive and multiplicative structure in functions from a discrete domain to a real-valued range. In this paper, we introduce a novel algorithm for efficiently finding AADD approximations that we use to develop the MADCAP algorithm for AADD-based structured approximate dynamic programming (ADP) with factored MDPs. MADCAP requires less time and space to achieve comparable or better approximate solutions than the current state-of-the-art ADD-based ADP algorithm of APRICODD and can provide approximate solutions for problems with context-specific, additive and multiplicative structure on which APRICODD runs out of memory.
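As a rough illustration of the additive and multiplicative structure mentioned in the abstract, the sketch below evaluates a tiny decision diagram whose edges carry an affine transform (an offset and a scale) applied to the value of the sub-diagram beneath them. This is an assumption-laden toy, not the paper's data structure: the names `Edge`, `Node`, and `evaluate` are hypothetical, the terminal is assumed to have value 0, and the normalization and approximation steps that a real AADD or MADCAP would need are omitted.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Edge:
    offset: float            # additive constant applied on this edge
    scale: float             # multiplicative constant applied on this edge
    child: Optional["Node"]  # None stands for the terminal (assumed value 0)


@dataclass(frozen=True)
class Node:
    var: str    # name of the boolean variable tested at this node
    low: Edge   # edge followed when the variable is False
    high: Edge  # edge followed when the variable is True


def evaluate(edge: Edge, assignment: Dict[str, bool]) -> float:
    """Return offset + scale * (value of the sub-diagram below the edge)."""
    if edge.child is None:
        return edge.offset  # terminal value is 0, so only the offset remains
    node = edge.child
    branch = node.high if assignment[node.var] else node.low
    return edge.offset + edge.scale * evaluate(branch, assignment)


# Toy function f(x1, x2) = 2*x1 + x2. Both branches of x1 share one x2 node
# and differ only in their offsets, which is how additive structure stays small.
x2_node = Node("x2", low=Edge(0.0, 0.0, None), high=Edge(1.0, 0.0, None))
x1_node = Node("x1", low=Edge(0.0, 1.0, x2_node), high=Edge(2.0, 1.0, x2_node))
root = Edge(0.0, 1.0, x1_node)
```

Here the diagram needs two internal nodes for a function with four distinct values, whereas a diagram with plain terminal values would need one leaf per distinct value.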

NeurIPS Conference 2009 Conference Paper

Bootstrapping from Game Tree Search

  • Joel Veness
  • David Silver
  • Alan Blair
  • William Uther

In this paper we introduce a new algorithm for updating the parameters of a heuristic evaluation function, by updating the heuristic towards the values computed by an alpha-beta search. Our algorithm differs from previous approaches to learning from search, such as Samuel's checkers player and the TD-Leaf algorithm, in two key ways. First, we update all nodes in the search tree, rather than a single node. Second, we use the outcome of a deep search, instead of the outcome of a subsequent search, as the training signal for the evaluation function. We implemented our algorithm in a chess program, Meep, using a linear heuristic function. After initialising its weight vector to small random values, Meep was able to learn high-quality weights from self-play alone. When tested online against human opponents, Meep played at a master level, the best performance of any chess program with a heuristic learned entirely from self-play.
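The idea of moving a linear heuristic towards search values at every node of the tree can be sketched as follows. The abstract does not give the update rule, so this assumes a plain squared-error gradient step; the function names, the feature vectors, and the learning rate are all hypothetical, and the search itself is replaced by precomputed values.

```python
from typing import List, Sequence, Tuple


def dot(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear heuristic: weighted sum of the position's features."""
    return sum(w * f for w, f in zip(weights, features))


def update_towards_search_values(
    weights: Sequence[float],
    nodes: Sequence[Tuple[Sequence[float], float]],
    learning_rate: float = 0.1,
) -> List[float]:
    """Take one batch gradient step over every node in the search tree.

    `nodes` holds (features, search_value) pairs, one per tree node, where
    search_value is assumed to come from a deep search rooted at that node.
    Errors are measured against the old weights, then applied together.
    """
    new_weights = list(weights)
    for features, search_value in nodes:
        error = search_value - dot(weights, features)
        for i, f in enumerate(features):
            new_weights[i] += learning_rate * error * f
    return new_weights


# Two hypothetical tree nodes with one-hot features and search values +1 / -1.
tree_nodes = [([1.0, 0.0], 1.0), ([0.0, 1.0], -1.0)]
updated = update_towards_search_values([0.0, 0.0], tree_nodes, learning_rate=0.5)
```

Using every node as a training example, rather than only the root, is what distinguishes this style of update from single-node schemes such as TD-Leaf as described in the abstract.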