Arrow Research search

Author name cluster

Daniel S. Bernstein

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers
2 author rows

Possible papers

8

JAAMAS Journal 2009 Journal Article

Optimizing fixed-size stochastic controllers for POMDPs and decentralized POMDPs

  • Christopher Amato
  • Daniel S. Bernstein
  • Shlomo Zilberstein

Abstract POMDPs and their decentralized multiagent counterparts, DEC-POMDPs, offer a rich framework for sequential decision making under uncertainty. Their high computational complexity, however, presents an important research challenge. One way to address the intractable memory requirements of current algorithms is based on representing agent policies as finite-state controllers. Using this representation, we propose a new approach that formulates the problem as a nonlinear program, which defines an optimal policy of a desired size for each agent. This new formulation allows a wide range of powerful nonlinear programming algorithms to be used to solve POMDPs and DEC-POMDPs. Although solving the NLP optimally is often intractable, the results we obtain using an off-the-shelf optimization method are competitive with state-of-the-art POMDP algorithms and outperform state-of-the-art DEC-POMDP algorithms. Our approach is easy to implement and it opens up promising research directions for solving POMDPs and DEC-POMDPs using nonlinear programming methods.

IJCAI Conference 2007 Conference Paper

  • Christopher Amato
  • Daniel S. Bernstein
  • Shlomo Zilberstein

Developing scalable algorithms for solving partially observable Markov decision processes (POMDPs) is an important challenge. One approach that effectively addresses the intractable memory requirements of POMDP algorithms is based on representing POMDP policies as finite-state controllers. In this paper, we illustrate some fundamental disadvantages of existing techniques that use controllers. We then propose a new approach that formulates the problem as a quadratically constrained linear program (QCLP), which defines an optimal controller of a desired size. This representation allows a wide range of powerful nonlinear programming algorithms to be used to solve POMDPs. Although QCLP optimization techniques guarantee only local optimality, the results we obtain using an existing optimization method show significant solution improvement over the state-of-the-art techniques. The results open up promising research directions for solving large POMDPs using nonlinear programming methods.

IJCAI Conference 2005 Conference Paper

Bounded Policy Iteration for Decentralized POMDPs

  • Daniel S. Bernstein
  • Eric A. Hansen
  • Shlomo

We present a bounded policy iteration algorithm for infinite-horizon decentralized POMDPs. Policies are represented as joint stochastic finite-state controllers, which consist of a local controller for each agent. We also let a joint controller include a correlation device that allows the agents to correlate their behavior without exchanging information during execution, and show that this leads to improved performance. The algorithm uses a fixed amount of memory, and each iteration is guaranteed to produce a controller with value at least as high as the previous one for all possible initial state distributions. For the case of a single agent, the algorithm reduces to Poupart and Boutilier’s bounded policy iteration for POMDPs.

IJCAI Conference 2003 Conference Paper

Contract Algorithms and Robots on Rays: Unifying Two Scheduling Problems

  • Daniel S. Bernstein
  • Lev Finkebtein
  • Shlomo Zilberstein

We study two apparently different, but formally similar, scheduling problems. The first problem involves contract algorithms, which can trade off run time for solution quality, as long as the amount of available run time is known in advance. The problem is to schedule contract algorithms to run on parallel processors, under the condition that an interruption can occur at any time, and upon interruption a solution to any one of a number of problems can be requested. Schedules are compared in terms of acceleration ratio, which is a worst-case measure of efficiency. We provide a schedule and prove its optimality among a particular class of schedules. Our second problem involves multiple robots searching for a goal on one of multiple rays. Search strategics are compared in terms of time-competitive ratio, the ratio of the total search time to the time it would take for one robot to traverse directly to the goal. We demonstrate that search strategies and contract schedules are formally equivalent. In addition, for our class of schedules, we derive a formula relating the acceleration ratio of a schedule to the time-competitive ratio of the corresponding search strategy.

AAAI Conference 2002 Conference Paper

Scheduling Contract Algorithms on Multiple Processors

  • Daniel S. Bernstein
  • and Shlomo Zilberstein

Anytime algorithms offer a tradeoff between computation time and the quality of the result returned. They can be divided into two classes: contract algorithms, for which the total run time must be specified in advance, and interruptible algorithms, which can be queried at any time for a solution. An interruptible algorithm can be constructed from a contract algorithm by repeatedly activating the contract algorithm with increasing run times. The acceleration ratio of a run-time schedule is a worst-case measure of how inefficient the constructed interruptible algorithm is compared to the contract algorithm. The smallest acceleration ratio achievable on a single processor is known. Using multiple processors, smaller acceleration ratios are possible. In this paper, we provide a schedule for m processors and prove that it is optimal for all m. Our results provide general guidelines for the use of parallel processors in the design of real-time systems.

UAI Conference 2000 Conference Paper

The Complexity of Decentralized Control of Markov Decision Processes

  • Daniel S. Bernstein
  • Shlomo Zilberstein
  • Neil Immerman

Planning for distributed agents with partial state information is considered from a decision- theoretic perspective. We describe generalizations of both the MDP and POMDP models that allow for decentralized control. For even a small number of agents, the finite-horizon problems corresponding to both of our models are complete for nondeterministic exponential time. These complexity results illustrate a fundamental difference between centralized and decentralized control of Markov processes. In contrast to the MDP and POMDP problems, the problems we consider provably do not admit polynomial-time algorithms and most likely require doubly exponential time to solve in the worst case. We have thus provided mathematical evidence corresponding to the intuition that decentralized planning problems cannot easily be reduced to centralized problems and solved exactly using established techniques.