Arrow Research search

Author name cluster

Parag Singla

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

33 papers
2 author rows

Possible papers

33

IROS Conference 2024 Conference Paper

Learning to Recover from Plan Execution Errors during Robot Manipulation: A Neuro-symbolic Approach

  • Namasivayam Kalithasan
  • Arnav Tuli
  • Vishal Bindal
  • Himanshu Singh 0002
  • Parag Singla
  • Rohan Paul

Automatically detecting and recovering from failures is an important but challenging problem for autonomous robots. Most of the recent work on learning to plan from demonstrations lacks the ability to detect and recover from errors in the absence of an explicit state representation and/or a (sub-) goal check function. We propose an approach (blending learning with symbolic search) for automated error discovery and recovery, without needing annotated data of failures. Central to our approach is a neuro-symbolic state representation, in the form of dense scene graph, structured based on the objects present within the environment. This enables efficient learning of the transition function and a discriminator that not only identifies failures but also localizes them facilitating fast re-planning via computation of heuristic distance function. We also present an anytime version of our algorithm, where instead of recovering to the last correct state, we search for a sub-goal in the original plan minimizing the total distance to the goal given a re-planning budget. Experiments on a physics simulator with a variety of simulated failures show the effectiveness of our approach compared to existing baselines, both in terms of efficiency as well as accuracy of our recovery mechanism.

ICLR Conference 2023 Conference Paper

Few-shot Cross-domain Image Generation via Inference-time Latent-code Learning

  • Arnab Kumar Mondal
  • Piyush Tiwary
  • Parag Singla
  • Prathosh A. P.

In this work, our objective is to adapt a Deep generative model trained on a large-scale source dataset to multiple target domains with scarce data. Specifically, we focus on adapting a pre-trained Generative Adversarial Network (GAN) to a target domain without re-training the generator. Our method draws the motivation from the fact that out-of-distribution samples can be `embedded' onto the latent space of a pre-trained source-GAN. We propose to train a small latent-generation network during the inference stage, each time a batch of target samples is to be generated. These target latent codes are fed to the source-generator to obtain novel target samples. Despite using the same small set of target samples and the source generator, multiple independent training episodes of the latent-generation network results in the diversity of the generated target samples. Our method, albeit simple, can be used to generate data from multiple target distributions using a generator trained on a single source distribution. We demonstrate the efficacy of our surprisingly simple method in generating multiple target datasets with only a single source generator and a few target samples.

ICRA Conference 2023 Conference Paper

Learning Neuro-symbolic Programs for Language Guided Robot Manipulation

  • Namasivayam Kalithasan
  • Himanshu Singh 0002
  • Vishal Bindal
  • Arnav Tuli
  • Vishwajeet Agrawal
  • Rahul Jain
  • Parag Singla
  • Rohan Paul

Given a natural language instruction and an input scene, our goal is to train a model to output a manipulation program that can be executed by the robot. Prior approaches for this task possess one of the following limitations: (i) rely on hand-coded symbols for concepts limiting generalization beyond those seen during training [1] (ii) infer action sequences from instructions but require dense sub-goal supervision [2] or (iii) lack semantics required for deeper object-centric reasoning inherent in interpreting complex instructions [3]. In contrast, our approach can handle linguistic as well as perceptual variations, end-to-end trainable and requires no intermediate supervision. The proposed model uses symbolic reasoning constructs that operate on a latent neural object-centric representation, allowing for deeper reasoning over the input scene. Central to our approach is a modular structure consisting of a hierarchical instruction parser and an action simulator to learn disentangled action representations. Our experiments on a simulated environment with a 7-DOF manipulator, consisting of instructions with varying number of steps and scenes with different number of objects, demonstrate that our model is robust to such variations and significantly outperforms baselines, particularly in the generalization settings. The code, dataset and experiment videos are available at https://nsrmp.github.io

PRL Workshop 2023 Workshop Paper

Object-Centric Learning of Neural Policies for Zero-shot Transfer over Domains with Varying Quantities of Interest

  • Vishal Sharma
  • Aniket Gupta
  • Prayushi Faldu
  • Rushil Gupta
  • Mausam
  • Parag Singla

Our goal is to learn policies that generalize across variation in quantities of interest in the domain (e.g., number of objects, motion dynamics, distance to the goal) in a zero shot manner. Recent work on object-centric approaches for image and video processing has shown significant promise in building models that generalize well to unseen settings. In this work, we present Object Centric Reinforcement Learning Agent (ORLA), the first object-centric approach for model-free RL in perceptual domains. ORLA works in three phases: first, it learns to extract a variable number of objects masks, via an expert trained using encoder-decoder architecture, which in turn generates data for fine-tuning a YOLO based model for extracting bounding boxes in unseen settings. Second, bounding boxes are used to construct a symbolic state consisting of object positions across a sequence of frames. Finally, a GAT based architecture is employed over extracted object positions to learn a dense state embedding, which is then decoded to get the final policy that generalizes to unseen environments. Extensive experimentation over a number of domains shows that ORLA can learn significantly better policies that transfer across variations in different quantities of interest compared to existing baselines, which often fail to do any meaningful transfer.

UAI Conference 2023 Conference Paper

SymNet 3. 0: Exploiting Long-Range Influences in Learning Generalized Neural Policies for Relational MDPs

  • Vishal Sharma
  • Daman Arora
  • Mausam
  • Parag Singla

We focus on the learning of generalized neural policies for Relational Markov Decision Processes (RMDPs) expressed in RDDL. Recent work first converts the instances of a relational domain into an instance graph, and then trains a Graph Attention Network (GAT) of fixed depth with parameters shared across instances to learn a state representation, which can be decoded to get the policy [sharma et al. , 22]. Unfortunately, this approach struggles to learn policies that exploit long-range dependencies – a fact we formally prove in this paper. As a remedy, we first construct a novel influence graph characterized by edges capturing one-step influence (dependence) between nodes based on the transition model. We then define influence distance between two nodes as the shortest path between them in this graph – a feature we exploit to represent long-range dependencies. We show that our architecture, referred to as Symbolic Influence Network (SymNet3. 0), with its distance-based features, does not suffer from the representational issues faced by earlier approaches. Extensive experimentation demonstrates that we are competitive with existing baselines on 12 standard IPPC domains, and perform significantly better on six additional domains (including IPPC variants), designed to test a model’s capability in capturing long-range dependencies. Further analysis shows that SymNet3. 0 automatically learns to focus on nodes that have key information for representing policies that capture long-range dependencies.

PRL Workshop 2023 Workshop Paper

SymNet 3.0: Exploiting Long-Range Influences in Learning Generalized Neural Policies for Relational MDPs

  • Vishal Sharma
  • Daman Arora
  • Mausam
  • Parag Singla

We focus on the learning of generalized neural policies for Relational Markov Decision Processes (RMDPs) expressed in RDDL. Recent work first converts the instances of a relational domain into an instance graph, and then trains a Graph Attention Network (GAT) of fixed depth with parameters shared across instances to learn a state representation, which can be decoded to get the policy [sharma et al., 22]. Unfortunately, this approach struggles to learn policies that exploit long-range dependencies – a fact we formally prove in this paper. As a remedy, we first construct a novel influence graph characterized by edges capturing one-step influence (dependence) between nodes based on the transition model. We then define influence distance between two nodes as the shortest path between them in this graph – a feature we exploit to represent long-range dependencies. We show that our architecture, referred to as Symbolic Influence Network (SymNet3.0), with its distance-based features, does not suffer from the representational issues faced by earlier approaches. Extensive experimentation demonstrates that we are competitive with existing baselines on 12 standard IPPC domains, and perform significantly better on six additional domains (including IPPC variants), designed to test a model’s capability in capturing long-range dependencies. Further analysis shows that SymNet3.0 automatically learns to focus on nodes that have key information for representing policies that capture long-range dependencies.

NeurIPS Conference 2022 Conference Paper

A Solver-free Framework for Scalable Learning in Neural ILP Architectures

  • Yatin Nandwani
  • Rishabh Ranjan
  • - Mausam
  • Parag Singla

There is a recent focus on designing architectures that have an Integer Linear Programming (ILP) layer within a neural model (referred to as \emph{Neural ILP} in this paper). Neural ILP architectures are suitable for pure reasoning tasks that require data-driven constraint learning or for tasks requiring both perception (neural) and reasoning (ILP). A recent SOTA approach for end-to-end training of Neural ILP explicitly defines gradients through the ILP black box [Paulus et al. [2021]] – this trains extremely slowly, owing to a call to the underlying ILP solver for every training data point in a minibatch. In response, we present an alternative training strategy that is \emph{solver-free}, i. e. , does not call the ILP solver at all at training time. Neural ILP has a set of trainable hyperplanes (for cost and constraints in ILP), together representing a polyhedron. Our key idea is that the training loss should impose that the final polyhedron separates the positives (all constraints satisfied) from the negatives (at least one violated constraint or a suboptimal cost value), via a soft-margin formulation. While positive example(s) are provided as part of the training data, we devise novel techniques for generating negative samples. Our solution is flexible enough to handle equality as well as inequality constraints. Experiments on several problems, both perceptual as well as symbolic, which require learning the constraints of an ILP, show that our approach has superior performance and scales much better compared to purely neural baselines and other state-of-the-art models that require solver-based training. In particular, we are able to obtain excellent performance in 9 x 9 symbolic and visual Sudoku, to which the other Neural ILP solver is not able to scale.

ICLR Conference 2022 Conference Paper

Neural Models for Output-Space Invariance in Combinatorial Problems

  • Yatin Nandwani
  • Vidit Jain
  • Mausam
  • Parag Singla

Recently many neural models have been proposed to solve combinatorial puzzles by implicitly learning underlying constraints using their solved instances, such as sudoku or graph coloring (GCP). One drawback of the proposed architectures, which are often based on Graph Neural Networks (GNN) (Zhou et al., 2020), is that they cannot generalize across the size of the output space from which variables are assigned a value, for example, set of colors in a GCP, or board-size in sudoku. We call the output space for the variables as ‘value-set’. While many works have demonstrated generalization of GNNs across graph size, there has been no study on how to design a GNN for achieving value-set invariance for problems that come from the same domain. For example, learning to solve 16 x 16 sudoku after being trained on only 9 x 9 sudokus, or coloring a 7 colorable graph after training on 4 colorable graphs. In this work, we propose novel methods to extend GNN based architectures to achieve value-set invariance. Specifically, our model builds on recently proposed Recurrent Relational Networks (RRN) (Palm et al., 2018). Our first approach exploits the graph-size invariance of GNNs by converting a multi-class node classification problem into a binary node classification problem. Our second approach works directly with multiple classes by adding multiple nodes corresponding to the values in the value-set, and then connecting variable nodes to value nodes depending on the problem initialization. Our experimental evaluation on three different combinatorial problems demonstrates that both our models perform well on our novel problem, compared to a generic neural reasoner. Between two of our models, we observe an inherent trade-off: while the binarized model gives better performance when trained on smaller value-sets, multi-valued model is much more memory efficient, resulting in improved performance when trained on larger value-sets, where binarized model fails to train.

UAI Conference 2022 Conference Paper

SymNet 2. 0: Effectively handling Non-Fluents and Actions in Generalized Neural Policies for RDDL Relational MDPs

  • Vishal Sharma
  • Daman Arora
  • Florian Geißer
  • Mausam
  • Parag Singla

Relational MDPs (RMDPs) compactly represent an infinite set of MDPs with an unbounded number of objects. Solving an RMDP requires a generalized policy that applies to all instances of a domain. Recently, Garg et al. proposed SymNet for this task– it constructs a graph neural network that shares parameters across all instances in a domain, thus making it applicable to any instance in a zero-shot manner. Our analysis of SymNet reveals that it performs no better than random on 1/4th of planning competition domains. The key reasons are its design choices: it misses important information during graph construction, leading to (1) poor generalizability, and (2) potential non-identifiability of different actions. In response, our solution, SymNet2. 0, substantially augments SymNet’s graph construction approach by introducing additional nodes and edges which allow a better transfer of important information about a domain. It also improves SymNet’s action decoders with relevant information from objects to make different actions identifiable during scoring. Extensive experiments on twelve competition domains, where we use imitation learning over data generated from the PROST planner, demonstrate that SymNet2. 0 performs vastly better than SymNet. Interestingly, even though SymNet2. 0 is trained over data from PROST, it outperforms the planner on several test instances due to former’s ability to scale to large instances in a zero-shot manner.

UAI Conference 2021 Conference Paper

FlexAE: flexibly learning latent priors for wasserstein auto-encoders

  • Arnab Kumar Mondal
  • Himanshu Asnani
  • Parag Singla
  • Prathosh A. P.

Auto-Encoder (AE) based neural generative frameworks model the joint-distribution between the data and the latent space using an Encoder-Decoder pair, with regularization imposed in terms of a prior over the latent space. Despite their advantages, such as stability in training, efficient inference, the performance of AE based models has not reached the superior standards of the other generative models such as Generative Adversarial Networks (GANs). Motivated by this, we examine the effect of the latent prior on the generation quality of deterministic AE models in this paper. Specifically, we consider the class of Generative AE models with deterministic Encoder-Decoder pair (such as Wasserstein Auto-Encoder (WAE), Adversarial Auto-Encoder (AAE)), and show that having a fixed prior distribution, a priori, oblivious to the dimensionality of the ‘true’ latent space, will lead to the infeasibility of the optimization problem considered. As a remedy to the issue mentioned above, we introduce an additional state space in the form of flexibly learnable latent priors, in the optimization objective of WAE/AAE. Additionally, we employ a latent-space interpolation based smoothing scheme to address the non-smoothness that may arise from highly flexible priors. We show the efficacy of our proposed models, called FlexAE and FlexAE-SR, through several experiments on multiple datasets, and demonstrate that FlexAE-SR is the new state-of-the-art for the AE based generative models in terms of generation quality as measured by several metrics such as Fr\’echet Inception Distance, Precision/Recall score.

ICLR Conference 2021 Conference Paper

Neural Learning of One-of-Many Solutions for Combinatorial Problems in Structured Output Spaces

  • Yatin Nandwani
  • Deepanshu Jindal
  • Mausam
  • Parag Singla

Recent research has proposed neural architectures for solving combinatorial problems in structured output spaces. In many such problems, there may exist multiple solutions for a given input, e.g. a partially filled Sudoku puzzle may have many completions satisfying all constraints. Further, we are often interested in finding any "one" of the possible solutions, without any preference between them. Existing approaches completely ignore this solution multiplicity. In this paper, we argue that being oblivious to the presence of multiple solutions can severely hamper their training ability. Our contribution is two-fold. First, we formally define the task of learning one-of-many solutions for combinatorial problems in structured output spaces, which is applicable for solving several problems of interest such as N-Queens, and Sudoku. Second, we present a generic learning framework that adapts an existing prediction network for a combinatorial problem to handle solution multiplicity. Our framework uses a selection module, whose goal is to dynamically determine, for every input, the solution that is most effective for training the network parameters in any given learning iteration. We propose an RL based approach to jointly train the selection module with the prediction network. Experiments on three different domains, and using two different prediction networks, demonstrate that our framework significantly improves the accuracy in our setting, obtaining up to 21 pt gain over the baselines.

UAI Conference 2020 Conference Paper

MaskAAE: Latent space optimization for Adversarial Auto-Encoders

  • Arnab Kumar Mondal
  • Sankalan Pal Chowdhury
  • Aravind Jayendran
  • Himanshu Asnani
  • Parag Singla
  • Prathosh A. P.

The field of neural generative models is dominated by the highly successful Generative Adversarial Networks (GANs) despite their challenges, such as training instability and mode collapse. Auto-Encoders (AE) with regularized latent space provide an alternative framework for generative models, albeit their performance levels have not reached that of GANs. In this work, we hypothesise that the dimensionality of the AE model’s latent space has a critical effect on the quality of generated data. Under the assumption that nature generates data by sampling from a “true" generative latent space followed by a deterministic function, we show that the optimal performance is obtained when the dimensionality of the latent space of the AE-model matches with that of the “true" generative latent space. Further, we propose an algorithm called the Mask Adversarial Auto-Encoder (MaskAAE), in which the dimensionality of the latent space of an adversarial auto encoder is brought closer to that of the “true" generative latent space, via a procedure to mask the spurious latent dimensions. We demonstrate through experiments on synthetic and several real-world datasets that the proposed formulation yields betterment in the generation quality.

NeurIPS Conference 2019 Conference Paper

A Primal Dual Formulation For Deep Learning With Constraints

  • Yatin Nandwani
  • Abhishek Pathak
  • Mausam
  • Parag Singla

For several problems of interest, there are natural constraints which exist over the output label space. For example, for the joint task of NER and POS labeling, these constraints might specify that the NER label ‘organization’ is consistent only with the POS labels ‘noun’ and ‘preposition’. These constraints can be a great way of injecting prior knowledge into a deep learning model, thereby improving overall performance. In this paper, we present a constrained optimization formulation for training a deep network with a given set of hard constraints on output labels. Our novel approach first converts the label constraints into soft logic constraints over probability distributions outputted by the network. It then converts the constrained optimization problem into an alternating min-max optimization with Lagrangian variables defined for each constraint. Since the constraints are independent of the target labels, our framework easily generalizes to semi-supervised setting. We experiment on the tasks of Semantic Role Labeling (SRL), Named Entity Recognition (NER) tagging, and fine-grained entity typing and show that our constraints not only significantly reduce the number of constraint violations, but can also result in state-of-the-art performance

UAI Conference 2018 Conference Paper

Block-Value Symmetries in Probabilistic Graphical Models

  • Gagan Madan
  • Ankit Anand
  • Mausam
  • Parag Singla

One popular way for lifted inference in probabilistic graphical models is to first merge symmetric states into a single cluster (orbit) and then use these for downstream inference, via variations of orbital MCMC [Niepert, 2012]. These orbits are represented compactly using permutations over variables, and variablevalue (VV) pairs, but they can miss several state symmetries in a domain. We define the notion of permutations over block-value (BV) pairs, where a block is a set of variables. BV strictly generalizes VV symmetries, and can compute many more symmetries for increasing block sizes. To operationalize use of BV permutations in lifted inference, we describe 1) an algorithm to compute BV permutations given a block partition of the variables, 2) BV-MCMC, an extension of orbital MCMC that can sample from BV orbits, and 3) a heuristic to suggest good block partitions. Our experiments show that BV-MCMC can mix much faster compared to vanilla MCMC and orbital MCMC.

UAI Conference 2018 Conference Paper

Lifted Marginal MAP Inference

  • Vishal Sharma
  • Noman Ahmed Sheikh
  • Happy Mittal
  • Vibhav Gogate
  • Parag Singla

Lifted inference reduces the complexity of inference in relational probabilistic models by identifying groups of constants (or atoms) which behave symmetric to each other. A number of techniques have been proposed in the literature for lifting marginal as well MAP inference. We present the first application of lifting rules for marginal-MAP (MMAP), an important inference problem in models having latent (random) variables. Our main contribution is two fold: (1) we define a new equivalence class of (logical) variables, called Single Occurrence for MAX (SOM), and show that solution lies at extreme with respect to the SOM variables, i. e. , predicate groundings differing only in the instantiation of the SOM variables take the same truth value (2) we define a sub-class SOM-R (SOM Reduce) and exploit properties of extreme assignments to show that MMAP inference can be performed by reducing the domain of SOM-R variables to a single constant. We refer to our lifting technique as the SOM-R rule for lifted MMAP. Combined with existing rules such as decomposer and binomial, this results in a powerful framework for lifted MMAP. Experiments on three benchmark domains show significant gains in both time and memory compared to ground inference as well as lifted approaches not using SOM-R.

IJCAI Conference 2017 Conference Paper

Coarse-to-Fine Lifted MAP Inference in Computer Vision

  • Haroun Habeeb
  • Ankit Anand
  • Mausam
  • Parag Singla

There is a vast body of theoretical research on lifted inference in probabilistic graphical models (PGMs). However, few demonstrations exist where lifting is applied in conjunction with top of the line applied algorithms. We pursue the applicability of lifted inference for computer vision (CV), with the insight that a globally optimal (MAP) labeling will likely have the same label for two symmetric pixels. The success of our approach lies in efficiently handling a distinct unary potential on every node (pixel), typical of CV applications. This allows us to lift the large class of algorithms that model a CV problem via PGM inference. We propose a generic template for coarse-to-fine (C2F) inference in CV, which progressively refines an initial coarsely lifted PGM for varying quality-time trade-offs. We demonstrate the performance of C2F inference by developing lifted versions of two near state-of-the-art CV algorithms for stereo vision and interactive image segmentation. We find that, against flat algorithms, the lifted versions have a much superior anytime performance, without any loss in final solution quality.

IJCAI Conference 2016 Conference Paper

Contextual Symmetries in Probabilistic Graphical Models

  • Ankit Anand
  • Aditya Grover
  • Mausam
  • Parag Singla

An important approach for efficient inference in probabilistic graphical models exploits symmetries among objects in the domain. Symmetric variables (states) are collapsed into meta-variables (metastates) and inference algorithms are run over the lifted graphical model instead of the flat one. Our paper extends existing definitions of symmetry by introducing the novel notion of contextual symmetry. Two states that are not globally symmetric, can be contextually symmetric under some specific assignment to a subset of variables, referred to as the context variables. Contextual symmetry subsumes previous symmetry definitions and can represent a large class of symmetries not representable earlier. We show how to compute contextual symmetries by reducing it to the problem of graph isomorphism. We extend previous work on exploiting symmetries in the MCMC framework to the case of contextual symmetries. Our experiments on several domains of interest demonstrate that exploiting contextual symmetries can result in significant computational gains.

ICAPS Conference 2016 Conference Paper

OGA-UCT: On-the-Go Abstractions in UCT

  • Ankit Anand
  • Ritesh Noothigattu
  • Mausam
  • Parag Singla

Recent work has begun exploring the value of domain abstractions in Monte-Carlo Tree Search (MCTS) algorithms for probabilistic planning. These algorithms automatically aggregate symmetric search nodes (states or state-action pairs) saving valuable planning time. Existing algorithms alternate between two phases: (1) abstraction computation forcomputing node aggregations, and (2) modified MCTS that use aggregate nodes. We believe that these algorithms do not achieve the full potential of abstractions because of disjoint phases – e. g. , it can take a while to recover from erroneous abstractions, or compute better abstractions based on newly found knowledge. In response, we propose On-the-Go Abstractions (OGA), a novel approach in which abstraction computation is tightlyintegrated into the MCTS algorithm. We implement these on top of UCT and name the resulting algorithm OGA-UCT. It has several desirable properties, including (1) rapid use of new information in modifying existing abstractions, (2) elimination of the expensive batch abstraction computationphase, and (3) focusing abstraction computation on important part of the sampled search space. We experimentally compare OGA-UCT against ASAP-UCT, a recent state-of-the-art MDP algorithm as well as vanilla UCT algorithm. We find that OGA-UCT is robust across a suite of planning competition and other MDP domains, and obtains up to 28 % quality improvements.

IJCAI Conference 2016 Conference Paper

Unsupervised Alignment of Actions in Video with Text Descriptions

  • Young Chol Song
  • Iftekhar Naim
  • Abdullah Al Mamun
  • Kaustubh Kulkarni
  • Parag Singla
  • Jiebo Luo
  • Daniel Gildea
  • Henry Kautz

Advances in video technology and data storage have made large scale video data collections of complex activities readily accessible. An increasingly popular approach for automatically inferring the details of a video is to associate the spatio-temporal segments in a video with its natural language descriptions. Most algorithms for connecting natural language with video rely on pre-aligned supervised training data. Recently, several models have been shown to be effective for unsupervised alignment of objects in video with language. However, it remains difficult to generate good spatio-temporal video segments for actions that align well with language. This paper presents a framework that extracts higher level representations of low-level action features through hyperfeature coding from video and aligns them with language. We propose a two-step process that creates a high-level action feature codebook with temporally consistent motions, and then applies an unsupervised alignment algorithm over the action codewords and verbs in the language to identify individual activities. We show an improvement over previous alignment models of objects and nouns on videos of biological experiments, and also evaluate our system on a larger scale collection of videos involving kitchen activities.

IJCAI Conference 2015 Conference Paper

ASAP-UCT: Abstraction of State-Action Pairs in UCT

  • Ankit Anand
  • Aditya Grover
  • Mausam
  • Parag Singla

Monte-Carlo Tree Search (MCTS) algorithms such as UCT are an attractive online framework for solving planning under uncertainty problems modeled as a Markov Decision Process. However, MCTS search trees are constructed in flat state and action spaces, which can lead to poor policies for large problems. In a separate research thread, domain abstraction techniques compute symmetries to reduce the original MDP. This can lead to significant savings in computation, but these have been predominantly implemented for offline planning. This paper makes two contributions. First, we define the ASAP (Abstraction of State-Action Pairs) framework, which extends and unifies past work on domain abstractions by holistically aggregating both states and state-action pairs – ASAP uncovers a much larger number of symmetries in a given domain. Second, we propose ASAP-UCT, which implements ASAP-style abstractions within a UCT framework combining strengths of online planning with domain abstractions. Experimental evaluation on several benchmark domains shows up to 26% improvement in the quality of policies obtained over existing algorithms.

NeurIPS Conference 2015 Conference Paper

Fast Lifted MAP Inference via Partitioning

  • Somdeb Sarkhel
  • Parag Singla
  • Vibhav Gogate

Recently, there has been growing interest in lifting MAP inference algorithms for Markov logic networks (MLNs). A key advantage of these lifted algorithms is that they have much smaller computational complexity than propositional algorithms when symmetries are present in the MLN and these symmetries can be detected using lifted inference rules. Unfortunately, lifted inference rules are sound but not complete and can often miss many symmetries. This is problematic because when symmetries cannot be exploited, lifted inference algorithms ground the MLN, and search for solutions in the much larger propositional space. In this paper, we present a novel approach, which cleverly introduces new symmetries at the time of grounding. Our main idea is to partition the ground atoms and force the inference algorithm to treat all atoms in each part as indistinguishable. We show that by systematically and carefully refining (and growing) the partitions, we can build advanced any-time and any-space MAP inference algorithms. Our experiments on several real-world datasets clearly show that our new algorithm is superior to previous approaches and often finds useful symmetries in the search space that existing lifted inference rules are unable to detect.

NeurIPS Conference 2015 Conference Paper

Lifted Inference Rules With Constraints

  • Happy Mittal
  • Anuj Mahajan
  • Vibhav Gogate
  • Parag Singla

Lifted inference rules exploit symmetries for fast reasoning in statistical rela-tional models. Computational complexity of these rules is highly dependent onthe choice of the constraint language they operate on and therefore coming upwith the right kind of representation is critical to the success of lifted inference. In this paper, we propose a new constraint language, called setineq, which allowssubset, equality and inequality constraints, to represent substitutions over the vari-ables in the theory. Our constraint formulation is strictly more expressive thanexisting representations, yet easy to operate on. We reformulate the three mainlifting rules: decomposer, generalized binomial and the recently proposed singleoccurrence for MAP inference, to work with our constraint representation. Exper-iments on benchmark MLNs for exact and sampling based inference demonstratethe effectiveness of our approach over several other existing techniques.

NeurIPS Conference 2015 Conference Paper

Lifted Symmetry Detection and Breaking for MAP Inference

  • Timothy Kopp
  • Parag Singla
  • Henry Kautz

Symmetry breaking is a technique for speeding up propositional satisfiability testing by adding constraints to the theory that restrict the search space while preserving satisfiability. In this work, we extend symmetry breaking to the problem of model finding in weighted and unweighted relational theories, a class of problems that includes MAP inference in Markov Logic and similar statistical-relational languages. We introduce term symmetries, which are induced by an evidence set and extend to symmetries over a relational theory. We provide the important special case of term equivalent symmetries, showing that such symmetries can be found in low-degree polynomial time. We show how to break an exponential number of these symmetries with added constraints whose number is linear in the size of the domain. We demonstrate the effectiveness of these techniques through experiments in two relational domains. We also discuss the connections between relational symmetry breaking and work on lifted inference in statistical-relational reasoning.

NeurIPS Conference 2014 Conference Paper

An Integer Polynomial Programming Based Framework for Lifted MAP Inference

  • Somdeb Sarkhel
  • Deepak Venugopal
  • Parag Singla
  • Vibhav Gogate

In this paper, we present a new approach for lifted MAP inference in Markov logic networks (MLNs). The key idea in our approach is to compactly encode the MAP inference problem as an Integer Polynomial Program (IPP) by schematically applying three lifted inference steps to the MLN: lifted decomposition, lifted conditioning, and partial grounding. Our IPP encoding is lifted in the sense that an integer assignment to a variable in the IPP may represent a truth-assignment to multiple indistinguishable ground atoms in the MLN. We show how to solve the IPP by first converting it to an Integer Linear Program (ILP) and then solving the latter using state-of-the-art ILP techniques. Experiments on several benchmark MLNs show that our new algorithm is substantially superior to ground inference and existing methods in terms of computational efficiency and solution quality.

AAAI Conference 2014 Conference Paper

Approximate Lifting Techniques for Belief Propagation

  • Parag Singla
  • Aniruddh Nath
  • Pedro Domingos

Many AI applications need to explicitly represent relational structure as well as handle uncertainty. First order probabilistic models combine the power of logic and probability to deal with such domains. A naive approach to inference in these models is to propositionalize the whole theory and carry out the inference on the ground network. Lifted inference techniques (such as lifted belief propagation; Singla and Domingos 2008) provide a more scalable approach to inference by combining together groups of objects which behave identically. In many cases, constructing the lifted network can itself be quite costly. In addition, the exact lifted network is often very close in size to the fully propositionalized model. To overcome these problems, we present approximate lifted inference, which groups together similar but distinguishable objects and treats them as if they were identical. Early stopping terminates the execution of the lifted network construction at an early stage resulting in a coarser network. Noisetolerant hypercubes allow for marginal errors in the representation of the lifted network itself. Both of our algorithms can significantly speed up the process of lifted network construction as well as result in much smaller models. The coarseness of the approximation can be adjusted depending on the accuracy required, and we can bound the resulting error. Extensive evaluation on six domains demonstrates great efficiency gains with only minor (or no) loss in accuracy.

NeurIPS Conference 2014 Conference Paper

New Rules for Domain Independent Lifted MAP Inference

  • Happy Mittal
  • Prasoon Goyal
  • Vibhav Gogate
  • Parag Singla

Lifted inference algorithms for probabilistic first-order logic frameworks such as Markov logic networks (MLNs) have received significant attention in recent years. These algorithms use so called lifting rules to identify symmetries in the first-order representation and reduce the inference problem over a large probabilistic model to an inference problem over a much smaller model. In this paper, we present two new lifting rules, which enable fast MAP inference in a large class of MLNs. Our first rule uses the concept of single occurrence equivalence class of logical variables, which we define in the paper. The rule states that the MAP assignment over an MLN can be recovered from a much smaller MLN, in which each logical variable in each single occurrence equivalence class is replaced by a constant (i. e. , an object in the domain of the variable). Our second rule states that we can safely remove a subset of formulas from the MLN if all equivalence classes of variables in the remaining MLN are single occurrence and all formulas in the subset are tautology (i. e. , evaluate to true) at extremes (i. e. , assignments with identical truth value for groundings of a predicate). We prove that our two new rules are sound and demonstrate via a detailed experimental evaluation that our approach is superior in terms of scalability and MAP solution quality to the state of the art approaches.

AAAI Conference 2011 Conference Paper

Abductive Markov Logic for Plan Recognition

  • Parag Singla
  • Raymond Mooney

Plan recognition is a form of abductive reasoning that involves inferring plans that best explain sets of observed actions. Most existing approaches to plan recognition and other abductive tasks employ either purely logical methods that do not handle uncertainty, or purely probabilistic methods that do not handle structured representations. To overcome these limitations, this paper introduces an approach to abductive reasoning using a first-order probabilistic logic, specifically Markov Logic Networks (MLNs). It introduces several novel techniques for making MLNs efficient and effective for abduction. Experiments on three plan recognition datasets show the benefit of our approach over existing methods.

AAAI Conference 2008 Conference Paper

Lifted First-Order Belief Propagation

  • Parag Singla

Unifying first-order logic and probability is a long-standing goal of AI, and in recent years many representations combining aspects of the two have been proposed. However, inference in them is generally still at the level of propositional logic, creating all ground atoms and formulas and applying standard probabilistic inference methods to the resulting network. Ideally, inference should be lifted as in first-order logic, handling whole sets of indistinguishable objects together, in time independent of their cardinality. Poole (2003) and Braz et al. (2005, 2006) developed a lifted version of the variable elimination algorithm, but it is extremely complex, generally does not scale to realistic domains, and has only been applied to very small artificial problems. In this paper we propose the first lifted version of a scalable probabilistic inference algorithm, belief propagation (loopy or not). Our approach is based on first constructing a lifted network, where each node represents a set of ground atoms that all pass the same messages during belief propagation. We then run belief propagation on this network. We prove the correctness and optimality of our algorithm. Experiments show that it can greatly reduce the cost of inference.

AAAI Conference 2006 Conference Paper

Memory-Efficient Inference in Relational Domains

  • Parag Singla

Propositionalization of a first-order theory followed by satisfiability testing has proved to be a remarkably efficient approach to inference in relational domains such as planning (Kautz & Selman 1996) and verification (Jackson 2000). More recently, weighted satisfiability solvers have been used successfully for MPE inference in statistical relational learners (Singla & Domingos 2005). However, fully instantiating a finite first-order theory requires memory on the order of the number of constants raised to the arity of the clauses, which significantly limits the size of domains it can be applied to. In this paper we propose LazySAT, a variation of the Walk- SAT solver that avoids this blowup by taking advantage of the extreme sparseness that is typical of relational domains (i. e. , only a small fraction of ground atoms are true, and most clauses are trivially satisfied). Experiments on entity resolution and planning problems show that LazySAT reduces memory usage by orders of magnitude compared to Walk- SAT, while taking comparable time to run and producing the same solutions.

AAAI Conference 2005 Conference Paper

Discriminative Training of Markov Logic Networks

  • Parag Singla

Many machine learning applications require a combination of probability and first-order logic. Markov logic networks (MLNs) accomplish this by attaching weights to first-order clauses, and viewing these as templates for features of Markov networks. Model parameters (i. e. , clause weights) can be learned by maximizing the likelihood of a relational database, but this can be quite costly and lead to suboptimal results for any given prediction task. In this paper we propose a discriminative approach to training MLNs, one which optimizes the conditional likelihood of the query predicates given the evidence ones, rather than the joint likelihood of all predicates. We extend Collins’s (2002) voted perceptron algorithm for HMMs to MLNs by replacing the Viterbi algorithm with a weighted satisfiability solver. Experiments on entity resolution and link prediction tasks show the advantages of this approach compared to generative MLN training, as well as compared to purely probabilistic and purely logical approaches.