Arrow Research search

Author name cluster

Malcolm J. A. Strens

Papers possibly associated with this exact author name in Arrow. This page groups case-insensitive exact name matches; it is not a full author-disambiguation profile.

8 papers
2 author rows

Possible papers (8)

ICAPS 2006 (Conference Paper)

Combining Stochastic Task Models with Reinforcement Learning for Dynamic Scheduling

  • Malcolm J. A. Strens

We view dynamic scheduling as a sequential decision problem. Firstly, we introduce a generalized planning operator, the stochastic task model (STM), which predicts the effects of executing a particular task on state, time and reward using a general procedural format (pure stochastic function). Secondly, we show that effective planning under uncertainty can be obtained by combining adaptive horizon stochastic planning with reinforcement learning (RL) in a hybrid system. The benefits of the hybrid approach are evaluated using a repeatable job shop scheduling task.
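The abstract's core idea, a stochastic task model as a pure stochastic function mapping (state, task) to (next state, elapsed time, reward), combined with rollout planning up to a time horizon, can be sketched as follows. This is a toy illustration, not the paper's model: the `stm` dynamics, the task fields, and the Monte-Carlo planner are all hypothetical.

```python
import random

def stm(state, task, rng):
    """Hypothetical stochastic task model (STM): executing `task` in
    `state` yields (next_state, elapsed_time, reward) via a pure
    stochastic function. Toy dynamics: a machine whose load grows
    with each task, making later tasks less valuable."""
    load, clock = state
    duration = task["base_time"] * rng.uniform(0.8, 1.2)  # stochastic duration
    reward = task["value"] - 0.1 * load                   # busier machine, lower value
    return (load + 1, clock + duration), duration, reward

def plan_first_task(state, tasks, horizon_time, n_rollouts=200, seed=0):
    """Monte-Carlo rollout planner: simulate random task orderings
    through the STM, truncating each rollout at the time horizon,
    and return the first task of the best-scoring ordering."""
    rng = random.Random(seed)
    best_task, best_value = None, float("-inf")
    for _ in range(n_rollouts):
        order = rng.sample(tasks, len(tasks))
        s, total, elapsed = state, 0.0, 0.0
        for task in order:
            s, dt, r = stm(s, task, rng)
            elapsed += dt
            if elapsed > horizon_time:   # horizon cuts the rollout short
                break
            total += r
        if total > best_value:
            best_task, best_value = order[0], total
    return best_task

# Usage with two hypothetical tasks:
tasks = [{"base_time": 1.0, "value": 1.0}, {"base_time": 1.0, "value": 3.0}]
chosen = plan_first_task((0, 0.0), tasks, horizon_time=10.0)
```

In the hybrid system described above, an RL component would replace or bias the random orderings sampled here; this sketch shows only the planning half.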

JMLR 2002 (Journal Article)

Policy Search using Paired Comparisons

  • Malcolm J. A. Strens
  • Andrew W. Moore

Direct policy search is a practical way to solve reinforcement learning (RL) problems involving continuous state and action spaces. The goal becomes finding policy parameters that maximize a noisy objective function. The Pegasus method converts this stochastic optimization problem into a deterministic one, by using fixed start states and fixed random number sequences for comparing policies (Ng and Jordan, 2000). We evaluate Pegasus, and new paired comparison methods, using the mountain car problem, and a difficult pursuer-evader problem. We conclude that: (i) paired tests can improve performance of optimization procedures; (ii) several methods are available to reduce the 'overfitting' effect found with Pegasus; (iii) adapting the number of trials used for each comparison yields faster learning; (iv) pairing also helps stochastic search methods such as differential evolution.
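The Pegasus-style mechanism in the abstract, fixing start states and random number sequences so that two policies are compared on identical noise realizations, can be sketched in a few lines. The task, policies, and reward here are a hypothetical 1-D goal-reaching problem, not the mountain car or pursuer-evader domains from the paper.

```python
import random

def rollout_return(policy, start_state, seed, horizon=50):
    """Total reward of one episode under a fixed random seed.
    Toy 1-D task: the policy chooses a step, seeded Gaussian noise is
    added, and reward is negative distance from a goal at 1.0.
    Fixing the seed makes the noisy objective deterministic."""
    rng = random.Random(seed)          # fixed seed: the Pegasus trick
    state, total = start_state, 0.0
    for _ in range(horizon):
        state += policy(state) + rng.gauss(0.0, 0.1)
        total += -abs(state - 1.0)
    return total

def paired_compare(policy_a, policy_b, scenarios):
    """Paired comparison: evaluate both policies on the same
    (start state, seed) scenarios and average the differences.
    Pairing removes between-scenario variance, so fewer trials
    are needed to rank the policies."""
    diffs = [rollout_return(policy_a, s0, seed) - rollout_return(policy_b, s0, seed)
             for (s0, seed) in scenarios]
    return sum(diffs) / len(diffs)     # > 0 suggests policy_a is better

# Fixed evaluation scenarios shared by every policy comparison.
rng = random.Random(7)
scenarios = [(rng.uniform(-1.0, 1.0), seed) for seed in range(20)]

greedy = lambda s: 1.0 - s             # jumps straight toward the goal
cautious = lambda s: 0.3 * (1.0 - s)   # moves a fraction of the way
```

An optimizer (e.g. differential evolution, as the abstract mentions) would use `paired_compare` as its selection test between candidate policy parameters.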