Arrow Research search

Author name cluster

David Tolpin

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers
2 author rows

Possible papers

14

AAAI Conference 2023 Conference Paper

Probabilistic Programs as an Action Description Language

  • Ronen I. Brafman
  • David Tolpin
  • Or Wertheim

Action description languages (ADLs), such as STRIPS, PDDL, and RDDL, specify the input format for planning algorithms. Unfortunately, their syntax is familiar only to planning experts, not to potential users of planning technology. Moreover, this syntax limits the ability to describe complex and large domains. We argue that programming languages (PLs), and more specifically, probabilistic programming languages (PPLs), provide a more suitable alternative. PLs are familiar to all programmers, support complex data types and rich libraries for their manipulation, and have powerful constructs, such as loops, sub-routines, and local variables, with which complex, realistic models and complex objectives can be simply and naturally specified. PPLs, specifically, make it easy to specify distributions, which is essential for stochastic models. The natural objection to this proposal is that PLs are opaque and too expressive, making reasoning about them difficult. However, PPLs also come with efficient inference algorithms, which, coupled with a growing body of work on sampling-based and gradient-based planning, imply that planning and execution monitoring can be carried out efficiently in practice. In this paper, we expand on this proposal, illustrating its potential with examples.
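The proposal can be caricatured in a few lines of ordinary Python (a hypothetical sketch, not the paper's actual language or examples): a stochastic action is just a function that samples, and assessing a plan is Monte Carlo simulation, the kind of sampling-based reasoning a PPL inference engine provides.

```python
import random

# Hypothetical sketch: a stochastic planning action written as ordinary
# code instead of STRIPS/PDDL/RDDL syntax. A 'move' succeeds with
# probability 0.8; on failure the agent gets stuck and further moves fail.

def move(state, target):
    if state["location"] == "stuck":
        return state
    if random.random() < 0.8:
        return {**state, "location": target}
    return {**state, "location": "stuck"}

def plan_success_rate(plan, start, goal, trials=10_000):
    """Monte Carlo estimate of the probability that executing `plan`
    from `start` ends at `goal`."""
    hits = 0
    for _ in range(trials):
        state = dict(start)
        for action, arg in plan:
            state = action(state, arg)
        hits += state["location"] == goal
    return hits / trials

random.seed(0)
rate = plan_success_rate([(move, "B"), (move, "C")],
                         {"location": "A"}, "C")
print(round(rate, 2))  # close to 0.8 * 0.8 = 0.64
```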

ICML Conference 2021 Conference Paper

Probabilistic Programs with Stochastic Conditioning

  • David Tolpin
  • Yuan Zhou 0013
  • Tom Rainforth
  • Hongseok Yang

We tackle the problem of conditioning probabilistic programs on distributions of observable variables. Probabilistic programs are usually conditioned on samples from the joint data distribution, which we refer to as deterministic conditioning. However, in many real-life scenarios, the observations are given as marginal distributions, summary statistics, or samplers. Conventional probabilistic programming systems lack adequate means for modeling and inference in such scenarios. We propose a generalization of deterministic conditioning to stochastic conditioning, that is, conditioning on the marginal distribution of a variable taking a particular form. To this end, we first define the formal notion of stochastic conditioning and discuss its key properties. We then show how to perform inference in the presence of stochastic conditioning. We demonstrate potential usage of stochastic conditioning on several case studies which involve various kinds of stochastic conditioning and are difficult to solve otherwise. Although we present stochastic conditioning in the context of probabilistic programming, our formalization is general and applicable to other settings.
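A minimal numeric sketch of the idea (an illustration under one formalization consistent with the abstract, where the stochastic-conditioning weight is exp(E_{y~D}[log p(y|θ)]); the grid prior and names below are ours, not the paper's):

```python
import math

# Sketch: stochastic conditioning replaces the likelihood of a concrete
# sample with exp(E_{y ~ D}[log p(y | theta)]), where D is the observed
# *distribution*. Here D says "heads with probability 0.7" and the model
# is a Bernoulli(theta) coin with a uniform prior over a grid.

D = {1: 0.7, 0: 0.3}  # observed marginal distribution over outcomes

def log_bernoulli(y, theta):
    return math.log(theta if y == 1 else 1.0 - theta)

def stochastic_log_weight(theta):
    # E_{y ~ D}[log p(y | theta)]
    return sum(p * log_bernoulli(y, theta) for y, p in D.items())

grid = [(i + 0.5) / 1000 for i in range(1000)]
posterior_mode = max(grid, key=stochastic_log_weight)
print(posterior_mode)  # close to 0.7, the bias implied by D
```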

AIJ Journal 2018 Journal Article

Rational deployment of multiple heuristics in optimal state-space search

  • Erez Karpas
  • Oded Betzalel
  • Solomon Eyal Shimony
  • David Tolpin
  • Ariel Felner

The obvious way to use several admissible heuristics in searching for an optimal solution is to take their maximum. In this paper, we aim to reduce the time spent on computing heuristics within the context of A* and IDA*. We discuss Lazy A* and Lazy IDA*, variants of A* and IDA*, respectively, where heuristics are evaluated lazily: only when they are essential to a decision to be made in the search process. While these lazy algorithms outperform naive maximization, we can do even better by intelligently deciding when to compute the more expensive heuristic. We present a new rational metareasoning based scheme which decides whether to compute the more expensive heuristics at all, based on a myopic regret estimate. This scheme is used to create rational lazy A* and rational lazy IDA*. We also present different methods for estimating the parameters necessary for making such decisions. An empirical evaluation in several domains supports the theoretical results, and shows that the rational variants, rational lazy A* and rational lazy IDA*, are better than their non-rational counterparts.
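The lazy-evaluation idea can be sketched in a few lines (a toy illustration, not the paper's code): nodes are queued using only the cheap heuristic h1, and the expensive heuristic h2 is computed only when a node reaches the front of the open list, re-queueing the node if h2 raises its f-value.

```python
import heapq

# Toy sketch of Lazy A*: defer the expensive heuristic h2 until a node
# is about to be expanded; if h2 tightens the bound, push the node back.

def lazy_astar(start, goal, neighbors, h1, h2, stats):
    open_list = [(h1(start), start, 0, False)]  # (f, node, g, h2_done)
    closed = set()
    while open_list:
        f, node, g, h2_done = heapq.heappop(open_list)
        if node == goal:
            return g
        if node in closed:
            continue
        if not h2_done:
            stats["h2_calls"] += 1              # deferred expensive evaluation
            f2 = g + max(h1(node), h2(node))
            if f2 > f:                          # h2 raised the f-value:
                heapq.heappush(open_list, (f2, node, g, True))
                continue                        # re-queue instead of expanding
        closed.add(node)
        for nxt, cost in neighbors(node):
            if nxt not in closed:
                heapq.heappush(open_list,
                               (g + cost + h1(nxt), nxt, g + cost, False))
    return None

# Path-finding on a line graph 0..10 with two (here identical) heuristics.
def neighbors(n):
    return [(m, 1) for m in (n - 1, n + 1) if 0 <= m <= 10]

h = lambda n: abs(5 - n)
stats = {"h2_calls": 0}
result = lazy_astar(0, 5, neighbors, h, h, stats)
print(result, stats["h2_calls"])  # h2 is evaluated only on expanded nodes
```

Rational lazy A* would add a decision layer in front of the `h2` call, computing it only when a myopic regret estimate says it is worthwhile.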

IJCAI Conference 2015 Conference Paper

ICBS: Improved Conflict-Based Search Algorithm for Multi-Agent Pathfinding

  • Eli Boyarski
  • Ariel Felner
  • Roni Stern
  • Guni Sharon
  • David Tolpin
  • Oded Betzalel
  • Eyal Shimony

Conflict-Based Search (CBS) and its enhancements, Meta-Agent CBS and conflict bypassing, are amongst the strongest newly introduced algorithms for Multi-Agent Path Finding. This paper introduces two new improvements to CBS and incorporates them into a coherent, improved version of CBS, namely ICBS. Experimental results show that each of these improvements further reduces the runtime over the existing CBS-based approaches. When all improvements are combined, an even larger improvement is achieved, producing state-of-the-art results for a number of domains.

SoCS Conference 2015 Conference Paper

ICBS: The Improved Conflict-Based Search Algorithm for Multi-Agent Pathfinding

  • Eli Boyarski
  • Ariel Felner
  • Roni Stern
  • Guni Sharon
  • Oded Betzalel
  • David Tolpin
  • Solomon Eyal Shimony

Conflict-Based Search (CBS) and its generalization, Meta-Agent CBS, are amongst the strongest newly introduced algorithms for Multi-Agent Path Finding. This paper introduces ICBS, an improved version of CBS. ICBS incorporates three orthogonal improvements to CBS which are systematically described and studied. Experimental results show that each of these improvements reduces the runtime over basic CBS, by up to 20x in many cases. When all three improvements are combined, an even larger improvement is achieved, producing state-of-the-art results for a number of domains.

SoCS Conference 2015 Conference Paper

Maximum a Posteriori Estimation by Search in Probabilistic Programs

  • David Tolpin
  • Frank Wood

We introduce an approximate search algorithm for fast maximum a posteriori probability estimation in probabilistic programs, which we call Bayesian ascent Monte Carlo (BaMC). Probabilistic programs represent probabilistic models with a varying number of mutually dependent finite, countable, and continuous random variables. BaMC is an anytime MAP search algorithm applicable to any combination of random variables and dependencies. We compare BaMC to other MAP estimation algorithms and show that BaMC is faster and more robust on a range of probabilistic models.

IJCAI Conference 2013 Conference Paper

Towards Rational Deployment of Multiple Heuristics in A*

  • David Tolpin
  • Tal Beja
  • Solomon Eyal Shimony
  • Ariel Felner
  • Erez Karpas

The obvious way to use several admissible heuristics in A* is to take their maximum. In this paper we aim to reduce the time spent on computing heuristics. We discuss Lazy A*, a variant of A* where heuristics are evaluated lazily: only when they are essential to a decision to be made in the A* search process. We present a new rational meta-reasoning based scheme, rational lazy A*, which decides whether to compute the more expensive heuristics at all, based on a myopic value of information estimate. Both methods are examined theoretically. Empirical evaluation on several domains supports the theoretical results, and shows that lazy A* and rational lazy A* are state-of-the-art heuristic combination methods.

SoCS Conference 2013 Conference Paper

Towards Rational Deployment of Multiple Heuristics in A* (Extended Abstract)

  • David Tolpin
  • Tal Beja
  • Solomon Eyal Shimony
  • Ariel Felner
  • Erez Karpas

In this paper we discuss and experiment with Lazy A*, a variant of A* where heuristics are evaluated lazily and with Rational Lazy A*, which decides whether to compute the more expensive heuristics at all, based on a myopic value of information estimate. Full version appears in IJCAI-2013.

AAAI Conference 2012 Conference Paper

MCTS Based on Simple Regret

  • David Tolpin
  • Solomon Shimony

UCT, a state-of-the-art algorithm for Monte Carlo tree search (MCTS) in games and Markov decision processes, is based on UCB, a sampling policy for the Multi-armed Bandit problem (MAB) that minimizes the cumulative regret. However, search differs from MAB in that in MCTS it is usually only the final “arm pull” (the actual move selection) that collects a reward, rather than all “arm pulls”. Therefore, it makes more sense to minimize the simple regret, as opposed to the cumulative regret. We begin by introducing policies for multi-armed bandits with lower finite-time and asymptotic simple regret than UCB, using them to develop a two-stage scheme (SR+CR) for MCTS which outperforms UCT empirically. Optimizing the sampling process is itself a metareasoning problem, a solution of which can use value of information (VOI) techniques. Although the theory of VOI for search exists, applying it to MCTS is non-trivial, as typical myopic assumptions fail. Lacking a complete working VOI theory for MCTS, we nevertheless propose a sampling scheme that is “aware” of VOI, achieving an algorithm that in empirical evaluation outperforms both UCT and the other proposed algorithms.
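The root-sampling idea can be sketched on a plain bandit (a toy illustration, not the paper's SR+CR algorithm or its exact policies): when only the final recommendation matters, sample root arms with an exploration-heavy epsilon-greedy policy aimed at simple regret, then recommend the arm with the best empirical mean.

```python
import random

# Toy sketch: a simple-regret-oriented sampler for the root of a search.
# With probability eps it exploits the current empirical best arm,
# otherwise it explores uniformly; the reward of intermediate pulls is
# irrelevant -- only the quality of the final recommendation counts.

def simple_regret_sampler(arms, budget, eps=0.5, rng=random):
    counts = [0] * len(arms)
    sums = [0.0] * len(arms)
    for _ in range(budget):
        if rng.random() < eps and max(counts) > 0:
            i = max(range(len(arms)),
                    key=lambda j: sums[j] / counts[j] if counts[j] else 0.0)
        else:
            i = rng.randrange(len(arms))
        sums[i] += arms[i]()          # one rollout / arm pull
        counts[i] += 1
    # final "arm pull": recommend the empirically best arm
    return max(range(len(arms)),
               key=lambda j: sums[j] / counts[j] if counts[j] else float("-inf"))

random.seed(1)
arms = [lambda p=p: 1.0 if random.random() < p else 0.0
        for p in (0.3, 0.5, 0.7)]
best = simple_regret_sampler(arms, budget=2000)
print(best)  # the p = 0.7 arm (index 2) should be recommended
```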

SoCS Conference 2012 Conference Paper

MCTS Based on Simple Regret

  • David Tolpin
  • Solomon Eyal Shimony

UCT, a state-of-the-art algorithm for Monte Carlo tree search (MCTS), is based on UCB, a policy for the Multi-armed Bandit problem (MAB) that minimizes the cumulative regret. However, search differs from MAB in that in MCTS it is usually only the final "arm pull" (the actual move selection) that collects a reward, rather than all "arm pulls".

UAI Conference 2012 Conference Paper

Selecting Computations: Theory and Applications

  • Nicholas Hay
  • Stuart Russell 0001
  • David Tolpin
  • Solomon Eyal Shimony

Sequential decision problems are often approximately solvable by simulating possible future action sequences. Metalevel decision procedures have been developed for selecting which action sequences to simulate, based on estimating the expected improvement in decision quality that would result from any particular simulation; an example is the recent work on using bandit algorithms to control Monte Carlo tree search in the game of Go. In this paper we develop a theoretical basis for metalevel decisions in the statistical framework of Bayesian selection problems, arguing (as others have done) that this is more appropriate than the bandit framework. We derive a number of basic results applicable to Monte Carlo selection problems, including the first finite sampling bounds for optimal policies in certain cases; we also provide a simple counterexample to the intuitive conjecture that an optimal policy will necessarily reach a decision in all cases. We then derive heuristic approximations in both Bayesian and distribution-free settings and demonstrate their superiority to bandit-based heuristics in one-shot decision problems and in Go.

ECAI Conference 2012 Conference Paper

VOI-aware MCTS

  • David Tolpin
  • Solomon Eyal Shimony

UCT, a state-of-the-art algorithm for Monte Carlo tree search (MCTS) in games and Markov decision processes, is based on UCB1, a sampling policy for the Multi-armed Bandit problem (MAB) that minimizes the cumulative regret. However, search differs from MAB in that in MCTS it is usually only the final "arm pull" (the actual move selection) that collects a reward, rather than all "arm pulls". In this paper, an MCTS sampling policy based on Value of Information (VOI) estimates of rollouts is suggested. Empirical evaluation of the policy and comparison to UCB1 and UCT is performed on random MAB instances as well as on Computer Go.

IJCAI Conference 2011 Conference Paper

Rational Deployment of CSP Heuristics

  • David Tolpin
  • Solomon Eyal Shimony

Heuristics are crucial tools in decreasing search effort in varied fields of AI. In order to be effective, a heuristic must be efficient to compute, as well as provide useful information to the search algorithm. However, some well-known heuristics which do well in reducing backtracking are so expensive to compute that the gain of deploying them in a search algorithm might be outweighed by their overhead. We propose a rational metareasoning approach to decide when to deploy heuristics, using CSP backtracking search as a case study. In particular, a value of information approach is taken to adaptive deployment of solution-count estimation heuristics for value ordering. Empirical results show that the proposed mechanism indeed successfully balances the tradeoff between decreasing backtracking and heuristic computational overhead, resulting in a significant overall search time reduction.
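The deployment decision described above reduces, in caricature, to a one-line myopic value-of-information test (a simplification; the paper's actual model works with solution-count estimates and backtracking statistics):

```python
# Toy sketch of the rational-metareasoning deployment rule: compute an
# expensive heuristic only when its expected benefit exceeds its cost.
# The numbers below are made up for illustration.

def should_deploy(p_prune, time_saved_if_prune, heuristic_cost):
    """Myopic VOI test: deploy the heuristic iff its expected
    search-time saving exceeds its computation time."""
    return p_prune * time_saved_if_prune > heuristic_cost

# A subtree costing 50ms to search, pruned with probability 0.3 by a
# heuristic costing 10ms, is worth the check: 0.3 * 50 > 10.
print(should_deploy(0.3, 50.0, 10.0))   # True
print(should_deploy(0.05, 50.0, 10.0))  # False
```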