Arrow Research search

Author name cluster

David Tolpin

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers
2 author rows

Possible papers

14

AAAI Conference 2023 Conference Paper

Probabilistic Programs as an Action Description Language

  • Ronen I. Brafman
  • David Tolpin
  • Or Wertheim

Action description languages (ADLs), such as STRIPS, PDDL, and RDDL, specify the input format for planning algorithms. Unfortunately, their syntax is familiar only to planning experts, not to potential users of planning technology. Moreover, this syntax limits the ability to describe complex and large domains. We argue that programming languages (PLs), and more specifically, probabilistic programming languages (PPLs), provide a more suitable alternative. PLs are familiar to all programmers, support complex data types and rich libraries for their manipulation, and have powerful constructs, such as loops, sub-routines, and local variables, with which complex, realistic models and complex objectives can be simply and naturally specified. PPLs, specifically, make it easy to specify distributions, which is essential for stochastic models. The natural objection to this proposal is that PLs are opaque and too expressive, making reasoning about them difficult. However, PPLs also come with efficient inference algorithms, which, coupled with a growing body of work on sampling-based and gradient-based planning, imply that planning and execution monitoring can be carried out efficiently in practice. In this paper, we expand on this proposal, illustrating its potential with examples.
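The proposal can be caricatured in a few lines of ordinary Python (a hypothetical sketch, not the paper's actual language or examples): a stochastic action is just a function that samples, and assessing a plan is Monte Carlo simulation, the kind of sampling-based reasoning a PPL inference engine provides.

```python
import random

# Hypothetical sketch: a stochastic planning action written as ordinary
# code instead of STRIPS/PDDL/RDDL syntax. A 'move' succeeds with
# probability 0.8; on failure the agent gets stuck and further moves fail.

def move(state, target):
    if state["location"] == "stuck":
        return state
    if random.random() < 0.8:
        return {**state, "location": target}
    return {**state, "location": "stuck"}

def plan_success_rate(plan, start, goal, trials=10_000):
    """Monte Carlo estimate of the probability that executing `plan`
    from `start` ends at `goal`."""
    hits = 0
    for _ in range(trials):
        state = dict(start)
        for action, arg in plan:
            state = action(state, arg)
        hits += state["location"] == goal
    return hits / trials

random.seed(0)
rate = plan_success_rate([(move, "B"), (move, "C")],
                         {"location": "A"}, "C")
print(round(rate, 2))  # close to 0.8 * 0.8 = 0.64
```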

ICML Conference 2021 Conference Paper

Probabilistic Programs with Stochastic Conditioning

  • David Tolpin
  • Yuan Zhou 0013
  • Tom Rainforth
  • Hongseok Yang

We tackle the problem of conditioning probabilistic programs on distributions of observable variables. Probabilistic programs are usually conditioned on samples from the joint data distribution, which we refer to as deterministic conditioning. However, in many real-life scenarios, the observations are given as marginal distributions, summary statistics, or samplers. Conventional probabilistic programming systems lack adequate means for modeling and inference in such scenarios. We propose a generalization of deterministic conditioning to stochastic conditioning, that is, conditioning on the marginal distribution of a variable taking a particular form. To this end, we first define the formal notion of stochastic conditioning and discuss its key properties. We then show how to perform inference in the presence of stochastic conditioning. We demonstrate potential usage of stochastic conditioning on several case studies which involve various kinds of stochastic conditioning and are difficult to solve otherwise. Although we present stochastic conditioning in the context of probabilistic programming, our formalization is general and applicable to other settings.
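A minimal numeric sketch of the idea (an illustration under one formalization consistent with the abstract, where the stochastic-conditioning weight is exp(E_{y~D}[log p(y|θ)]); the grid prior and names below are ours, not the paper's):

```python
import math

# Sketch: stochastic conditioning replaces the likelihood of a concrete
# sample with exp(E_{y ~ D}[log p(y | theta)]), where D is the observed
# *distribution*. Here D says "heads with probability 0.7" and the model
# is a Bernoulli(theta) coin with a uniform prior over a grid.

D = {1: 0.7, 0: 0.3}  # observed marginal distribution over outcomes

def log_bernoulli(y, theta):
    return math.log(theta if y == 1 else 1.0 - theta)

def stochastic_log_weight(theta):
    # E_{y ~ D}[log p(y | theta)]
    return sum(p * log_bernoulli(y, theta) for y, p in D.items())

grid = [(i + 0.5) / 1000 for i in range(1000)]
posterior_mode = max(grid, key=stochastic_log_weight)
print(posterior_mode)  # close to 0.7, the bias implied by D
```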

AIJ Journal 2018 Journal Article

Rational deployment of multiple heuristics in optimal state-space search

  • Erez Karpas
  • Oded Betzalel
  • Solomon Eyal Shimony
  • David Tolpin
  • Ariel Felner

The obvious way to use several admissible heuristics in searching for an optimal solution is to take their maximum. In this paper, we aim to reduce the time spent on computing heuristics within the context of A* and IDA*. We discuss Lazy A* and Lazy IDA*, variants of A* and IDA*, respectively, where heuristics are evaluated lazily: only when they are essential to a decision to be made in the search process. While these lazy algorithms outperform naive maximization, we can do even better by intelligently deciding when to compute the more expensive heuristic. We present a new rational metareasoning based scheme which decides whether to compute the more expensive heuristics at all, based on a myopic regret estimate. This scheme is used to create rational lazy A* and rational lazy IDA*. We also present different methods for estimating the parameters necessary for making such decisions. An empirical evaluation in several domains supports the theoretical results, and shows that the rational variants, rational lazy A* and rational lazy IDA*, are better than their non-rational counterparts.
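The lazy-evaluation idea can be sketched in a few lines (a toy illustration, not the paper's code): nodes are queued using only the cheap heuristic h1, and the expensive heuristic h2 is computed only when a node reaches the front of the open list, re-queueing the node if h2 raises its f-value.

```python
import heapq

# Toy sketch of Lazy A*: defer the expensive heuristic h2 until a node
# is about to be expanded; if h2 tightens the bound, push the node back.

def lazy_astar(start, goal, neighbors, h1, h2, stats):
    open_list = [(h1(start), start, 0, False)]  # (f, node, g, h2_done)
    closed = set()
    while open_list:
        f, node, g, h2_done = heapq.heappop(open_list)
        if node == goal:
            return g
        if node in closed:
            continue
        if not h2_done:
            stats["h2_calls"] += 1              # deferred expensive evaluation
            f2 = g + max(h1(node), h2(node))
            if f2 > f:                          # h2 raised the f-value:
                heapq.heappush(open_list, (f2, node, g, True))
                continue                        # re-queue instead of expanding
        closed.add(node)
        for nxt, cost in neighbors(node):
            if nxt not in closed:
                heapq.heappush(open_list,
                               (g + cost + h1(nxt), nxt, g + cost, False))
    return None

# Path-finding on a line graph 0..10 with two (here identical) heuristics.
def neighbors(n):
    return [(m, 1) for m in (n - 1, n + 1) if 0 <= m <= 10]

h = lambda n: abs(5 - n)
stats = {"h2_calls": 0}
result = lazy_astar(0, 5, neighbors, h, h, stats)
print(result, stats["h2_calls"])  # h2 is evaluated only on expanded nodes
```

Rational lazy A* would add a decision layer in front of the `h2` call, computing it only when a myopic regret estimate says it is worthwhile.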

IJCAI Conference 2015 Conference Paper

ICBS: Improved Conflict-Based Search Algorithm for Multi-Agent Pathfinding

  • Eli Boyarski
  • Ariel Felner
  • Roni Stern
  • Guni Sharon
  • David Tolpin
  • Oded Betzalel
  • Eyal Shimony

Conflict-Based Search (CBS) and its enhancements, Meta-Agent CBS and conflict bypassing, are amongst the strongest newly introduced algorithms for Multi-Agent Path Finding. This paper introduces two new improvements to CBS and incorporates them into a coherent, improved version of CBS, namely ICBS. Experimental results show that each of these improvements further reduces the runtime over the existing CBS-based approaches. When all improvements are combined, an even larger improvement is achieved, producing state-of-the-art results for a number of domains.

SoCS Conference 2015 Conference Paper

ICBS: The Improved Conflict-Based Search Algorithm for Multi-Agent Pathfinding

  • Eli Boyarski
  • Ariel Felner
  • Roni Stern
  • Guni Sharon
  • Oded Betzalel
  • David Tolpin
  • Solomon Eyal Shimony

Conflict-Based Search (CBS) and its generalization, Meta-Agent CBS, are amongst the strongest newly introduced algorithms for Multi-Agent Path Finding. This paper introduces ICBS, an improved version of CBS. ICBS incorporates three orthogonal improvements to CBS which are systematically described and studied. Experimental results show that each of these improvements reduces the runtime over basic CBS, by up to 20x in many cases. When all three improvements are combined, an even larger improvement is achieved, producing state-of-the-art results for a number of domains.

SoCS Conference 2015 Conference Paper

Maximum a Posteriori Estimation by Search in Probabilistic Programs

  • David Tolpin
  • Frank Wood

We introduce an approximate search algorithm for fast maximum a posteriori probability estimation in probabilistic programs, which we call Bayesian ascent Monte Carlo (BaMC). Probabilistic programs represent probabilistic models with a varying number of mutually dependent finite, countable, and continuous random variables. BaMC is an anytime MAP search algorithm applicable to any combination of random variables and dependencies. We compare BaMC to other MAP estimation algorithms and show that BaMC is faster and more robust on a range of probabilistic models.

IJCAI Conference 2013 Conference Paper

Towards Rational Deployment of Multiple Heuristics in A*

  • David Tolpin
  • Tal Beja
  • Solomon Eyal Shimony
  • Ariel Felner
  • Erez Karpas

The obvious way to use several admissible heuristics in A* is to take their maximum. In this paper we aim to reduce the time spent on computing heuristics. We discuss Lazy A*, a variant of A* where heuristics are evaluated lazily: only when they are essential to a decision to be made in the A* search process. We present a new rational meta-reasoning based scheme, rational lazy A*, which decides whether to compute the more expensive heuristics at all, based on a myopic value of information estimate. Both methods are examined theoretically. Empirical evaluation on several domains supports the theoretical results, and shows that lazy A* and rational lazy A* are state-of-the-art heuristic combination methods.

SoCS Conference 2013 Conference Paper

Towards Rational Deployment of Multiple Heuristics in A* (Extended Abstract)

  • David Tolpin
  • Tal Beja
  • Solomon Eyal Shimony
  • Ariel Felner
  • Erez Karpas

In this paper we discuss and experiment with Lazy A*, a variant of A* where heuristics are evaluated lazily and with Rational Lazy A*, which decides whether to compute the more expensive heuristics at all, based on a myopic value of information estimate. Full version appears in IJCAI-2013.

AAAI Conference 2012 Conference Paper

MCTS Based on Simple Regret

  • David Tolpin
  • Solomon Shimony

UCT, a state-of-the-art algorithm for Monte Carlo tree search (MCTS) in games and Markov decision processes, is based on UCB, a sampling policy for the Multi-armed Bandit problem (MAB) that minimizes the cumulative regret. However, search differs from MAB in that in MCTS it is usually only the final “arm pull” (the actual move selection) that collects a reward, rather than all “arm pulls”. Therefore, it makes more sense to minimize the simple regret, as opposed to the cumulative regret. We begin by introducing policies for multi-armed bandits with lower finite-time and asymptotic simple regret than UCB, using them to develop a two-stage scheme (SR+CR) for MCTS which outperforms UCT empirically. Optimizing the sampling process is itself a metareasoning problem, a solution of which can use value of information (VOI) techniques. Although the theory of VOI for search exists, applying it to MCTS is non-trivial, as typical myopic assumptions fail. Lacking a complete working VOI theory for MCTS, we nevertheless propose a sampling scheme that is “aware” of VOI, achieving an algorithm that in empirical evaluation outperforms both UCT and the other proposed algorithms.
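The root-sampling idea can be sketched on a plain bandit (a toy illustration, not the paper's SR+CR algorithm or its exact policies): when only the final recommendation matters, sample root arms with an exploration-heavy epsilon-greedy policy aimed at simple regret, then recommend the arm with the best empirical mean.

```python
import random

# Toy sketch: a simple-regret-oriented sampler for the root of a search.
# With probability eps it exploits the current empirical best arm,
# otherwise it explores uniformly; the reward of intermediate pulls is
# irrelevant -- only the quality of the final recommendation counts.

def simple_regret_sampler(arms, budget, eps=0.5, rng=random):
    counts = [0] * len(arms)
    sums = [0.0] * len(arms)
    for _ in range(budget):
        if rng.random() < eps and max(counts) > 0:
            i = max(range(len(arms)),
                    key=lambda j: sums[j] / counts[j] if counts[j] else 0.0)
        else:
            i = rng.randrange(len(arms))
        sums[i] += arms[i]()          # one rollout / arm pull
        counts[i] += 1
    # final "arm pull": recommend the empirically best arm
    return max(range(len(arms)),
               key=lambda j: sums[j] / counts[j] if counts[j] else float("-inf"))

random.seed(1)
arms = [lambda p=p: 1.0 if random.random() < p else 0.0
        for p in (0.3, 0.5, 0.7)]
best = simple_regret_sampler(arms, budget=2000)
print(best)  # the p = 0.7 arm (index 2) should be recommended
```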

SoCS Conference 2012 Conference Paper

MCTS Based on Simple Regret

  • David Tolpin
  • Solomon Eyal Shimony

UCT, a state-of-the-art algorithm for Monte Carlo tree search (MCTS), is based on UCB, a policy for the Multi-armed Bandit problem (MAB) that minimizes the cumulative regret. However, search differs from MAB in that in MCTS it is usually only the final "arm pull" (the actual move selection) that collects a reward, rather than all "arm pulls".

UAI Conference 2012 Conference Paper

Selecting Computations: Theory and Applications

  • Nicholas Hay
  • Stuart Russell 0001
  • David Tolpin
  • Solomon Eyal Shimony

Sequential decision problems are often approximately solvable by simulating possible future action sequences. Metalevel decision procedures have been developed for selecting which action sequences to simulate, based on estimating the expected improvement in decision quality that would result from any particular simulation; an example is the recent work on using bandit algorithms to control Monte Carlo tree search in the game of Go. In this paper we develop a theoretical basis for metalevel decisions in the statistical framework of Bayesian selection problems, arguing (as others have done) that this is more appropriate than the bandit framework. We derive a number of basic results applicable to Monte Carlo selection problems, including the first finite sampling bounds for optimal policies in certain cases; we also provide a simple counterexample to the intuitive conjecture that an optimal policy will necessarily reach a decision in all cases. We then derive heuristic approximations in both Bayesian and distribution-free settings and demonstrate their superiority to bandit-based heuristics in one-shot decision problems and in Go.

ECAI Conference 2012 Conference Paper

VOI-aware MCTS

  • David Tolpin
  • Solomon Eyal Shimony

UCT, a state-of-the-art algorithm for Monte Carlo tree search (MCTS) in games and Markov decision processes, is based on UCB1, a sampling policy for the Multi-armed Bandit problem (MAB) that minimizes the cumulative regret. However, search differs from MAB in that in MCTS it is usually only the final "arm pull" (the actual move selection) that collects a reward, rather than all "arm pulls". In this paper, an MCTS sampling policy based on Value of Information (VOI) estimates of rollouts is suggested. Empirical evaluation of the policy and comparison to UCB1 and UCT is performed on random MAB instances as well as on Computer Go.

IJCAI Conference 2011 Conference Paper

Rational Deployment of CSP Heuristics

  • David Tolpin
  • Solomon Eyal Shimony

Heuristics are crucial tools in decreasing search effort in varied fields of AI. In order to be effective, a heuristic must be efficient to compute, as well as provide useful information to the search algorithm. However, some well-known heuristics which do well in reducing backtracking are so expensive to compute that the gain of deploying them in a search algorithm might be outweighed by their overhead. We propose a rational metareasoning approach to decide when to deploy heuristics, using CSP backtracking search as a case study. In particular, a value of information approach is taken to adaptive deployment of solution-count estimation heuristics for value ordering. Empirical results show that the proposed mechanism indeed successfully balances the tradeoff between decreasing backtracking and heuristic computational overhead, resulting in a significant overall search time reduction.
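The deployment decision described above reduces, in caricature, to a one-line myopic value-of-information test (a simplification; the paper's actual model works with solution-count estimates and backtracking statistics):

```python
# Toy sketch of the rational-metareasoning deployment rule: compute an
# expensive heuristic only when its expected benefit exceeds its cost.
# The numbers below are made up for illustration.

def should_deploy(p_prune, time_saved_if_prune, heuristic_cost):
    """Myopic VOI test: deploy the heuristic iff its expected
    search-time saving exceeds its computation time."""
    return p_prune * time_saved_if_prune > heuristic_cost

# A subtree costing 50ms to search, pruned with probability 0.3 by a
# heuristic costing 10ms, is worth the check: 0.3 * 50 > 10.
print(should_deploy(0.3, 50.0, 10.0))   # True
print(should_deploy(0.05, 50.0, 10.0))  # False
```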