Arrow Research search

Author name cluster

Ankit Anand

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers
2 author rows

Possible papers

14

ICML Conference 2025 Conference Paper

Discovering Symbolic Cognitive Models from Human and Animal Behavior

  • Pablo Samuel Castro
  • Nenad Tomasev
  • Ankit Anand
  • Navodita Sharma
  • Rishika Mohanta
  • Aparna Dev
  • Kuba Perlin
  • Siddhant Jain

Symbolic models play a key role in cognitive science, expressing computationally precise hypotheses about how the brain implements a cognitive process. Identifying an appropriate model typically requires a great deal of effort and ingenuity on the part of a human scientist. Here, we adapt FunSearch (Romera-Paredes et al. 2024), a recently developed tool that uses Large Language Models (LLMs) in an evolutionary algorithm, to automatically discover symbolic cognitive models that accurately capture human and animal behavior. We consider datasets from three species performing a classic reward-learning task that has been the focus of substantial modeling effort, and find that the discovered programs outperform state-of-the-art cognitive models for each. The discovered programs can readily be interpreted as hypotheses about human and animal cognition, instantiating interpretable symbolic learning and decision-making algorithms. Broadly, these results demonstrate the viability of using LLM-powered program synthesis to propose novel scientific hypotheses regarding mechanisms of human and animal cognition.

ICML Conference 2025 Conference Paper

GRAIL: Graph Edit Distance and Node Alignment using LLM-Generated Code

  • Samidha Verma
  • Arushi Goyal
  • Ananya Mathur
  • Ankit Anand
  • Sayan Ranu

Graph Edit Distance (GED) is a widely used metric for measuring similarity between two graphs. Computing the optimal GED is NP-hard, leading to the development of various neural and non-neural heuristics. While neural methods have achieved improved approximation quality compared to non-neural approaches, they face significant challenges: (1) They require large amounts of ground truth data, which is itself NP-hard to compute. (2) They operate as black boxes, offering limited interpretability. (3) They lack cross-domain generalization, necessitating expensive retraining for each new dataset. We address these limitations with GRAIL, introducing a paradigm shift in this domain. Instead of training a neural model to predict GED, GRAIL employs a novel combination of large language models (LLMs) and automated prompt tuning to generate a program that is used to compute GED. This shift from predicting GED to generating programs imparts various advantages, including end-to-end interpretability and an autonomous self-evolutionary learning mechanism without ground-truth supervision. Extensive experiments on seven datasets confirm that GRAIL not only surpasses state-of-the-art GED approximation methods in prediction quality but also achieves robust cross-domain generalization across diverse graph distributions.

NeurIPS Conference 2024 Conference Paper

AutoMix: Automatically Mixing Language Models

  • Pranjal Aggarwal
  • Aman Madaan
  • Ankit Anand
  • Srividya Pranavi Potharaju
  • Swaroop Mishra
  • Pei Zhou
  • Aditya Gupta
  • Dheeraj Rajagopal

Large language models (LLMs) are now available from cloud API providers in various sizes and configurations. While this diversity offers a broad spectrum of choices, effectively leveraging the options to optimize computational cost and performance remains challenging. In this work, we present AutoMix, an approach that strategically routes queries to larger LMs, based on the approximate correctness of outputs from a smaller LM. Central to AutoMix are two key technical contributions. First, it has a few-shot self-verification mechanism, which estimates the reliability of its own outputs without requiring extensive training. Second, given that self-verification can be noisy, it employs a POMDP based router that can effectively select an appropriately sized model, based on answer confidence. Experiments across five language models and five challenging datasets show that Automix consistently surpasses strong baselines, reducing computational cost by over 50\% for comparable performance.

ICML Conference 2024 Conference Paper

Code as Reward: Empowering Reinforcement Learning with VLMs

  • David Venuto
  • Mohammad Sami Nur Islam
  • Martin Klissarov
  • Doina Precup
  • Sherry Yang 0001
  • Ankit Anand

Pre-trained Vision-Language Models (VLMs) are able to understand visual concepts, describe and decompose complex tasks into sub-tasks, and provide feedback on task completion. In this paper, we aim to leverage these capabilities to support the training of reinforcement learning (RL) agents. In principle, VLMs are well suited for this purpose, as they can naturally analyze image-based observations and provide feedback (reward) on learning progress. However, inference in VLMs is computationally expensive, so querying them frequently to compute rewards would significantly slowdown the training of an RL agent. To address this challenge, we propose a framework named Code as Reward (VLM-CaR). VLM-CaR produces dense reward functions from VLMs through code generation, thereby significantly reducing the computational burden of querying the VLM directly. We show that the dense rewards generated through our approach are very accurate across a diverse set of discrete and continuous environments, and can be more effective in training RL policies than the original sparse environment rewards.

IJCAI Conference 2024 Conference Paper

Finding Increasingly Large Extremal Graphs with AlphaZero and Tabu Search

  • Abbas Mehrabian
  • Ankit Anand
  • Hyunjik Kim
  • Nicolas Sonnerat
  • Matej Balog
  • Gheorghe Comanici
  • Tudor Berariu
  • Andrew Lee

This work proposes a new learning-to-search benchmark and uses AI to discover new mathematical knowledge related to an open conjecture of Erdos (1975) in extremal graph theory. The problem is to find graphs with a given size (number of nodes) that maximize the number of edges without having 3- or 4-cycles. We formulate this as a sequential decision-making problem and compare AlphaZero, a neural network-guided tree search, with tabu search, a heuristic local search method. Using either method, by introducing a curriculum---jump-starting the search for larger graphs using good graphs found at smaller sizes---we improve the state-of-the-art lower bounds for several sizes. We also propose a flexible graph-generation environment and a permutation-invariant network architecture for learning to search in the space of graphs.

NeurIPS Conference 2024 Conference Paper

ReMI: A Dataset for Reasoning with Multiple Images

  • Mehran Kazemi
  • Nishanth Dikkala
  • Ankit Anand
  • Petar Devic
  • Ishita Dasgupta
  • Fangyu Liu
  • Bahare Fatemi
  • Pranjal Awasthi

With the continuous advancement of large language models (LLMs), it is essential to create new benchmarks to evaluate their expanding capabilities and identify areas for improvement. This work focuses on multi-image reasoning, an emerging capability in state-of-the-art LLMs. We introduce ReMI, a dataset designed to assess LLMs' ability to reason with multiple images. This dataset encompasses a diverse range of tasks, spanning various reasoning domains such as math, physics, logic, code, table/chart understanding, and spatial and temporal reasoning. It also covers a broad spectrum of characteristics found in multi-image reasoning scenarios. We have benchmarked several cutting-edge LLMs using ReMI and found a substantial gap between their performance and human-level proficiency. This highlights the challenges in multi-image reasoning and the need for further research. Our analysis also reveals the strengths and weaknesses of different models, shedding light on the types of reasoning that are currently attainable and areas where future models require improvement. We anticipate that ReMI will be a valuable resource for developing and evaluating more sophisticated LLMs capable of handling real-world multi-image understanding tasks.

ICLR Conference 2023 Conference Paper

Learning About Progress From Experts

  • Jake Bruce
  • Ankit Anand
  • Bogdan Mazoure
  • Rob Fergus

Many important tasks involve some notion of long-term progress in multiple phases: e.g. to clean a shelf it must be cleared of items, cleaning products applied, and then the items placed back on the shelf. In this work, we explore the use of expert demonstrations in long-horizon tasks to learn a monotonically increasing function that summarizes progress. This function can then be used to aid agent exploration in environments with sparse rewards. As a case study we consider the NetHack environment, which requires long-term progress at a variety of scales and is far from being solved by existing approaches. In this environment, we demonstrate that by learning a model of long-term progress from expert data containing only observations, we can achieve efficient exploration in challenging sparse tasks, well beyond what is possible with current state-of-the-art approaches. We have made the curated gameplay dataset used in this work available at https://github.com/deepmind/nao_top10.

ICML Conference 2022 Conference Paper

Proving Theorems using Incremental Learning and Hindsight Experience Replay

  • Eser Aygün
  • Ankit Anand
  • Laurent Orseau
  • Xavier Glorot
  • Stephen Marcus McAleer
  • Vlad Firoiu
  • Lei M. Zhang
  • Doina Precup

Traditional automated theorem proving systems for first-order logic depend on speed-optimized search and many handcrafted heuristics designed to work over a wide range of domains. Machine learning approaches in the literature either depend on these traditional provers to bootstrap themselves, by leveraging these heuristics, or can struggle due to limited existing proof data. The latter issue can be explained by the lack of a smooth difficulty gradient in theorem proving datasets; large gaps in difficulty between different theorems can make training harder or even impossible. In this paper, we adapt the idea of hindsight experience replay from reinforcement learning to the automated theorem proving domain, so as to use the intermediate data generated during unsuccessful proof attempts. We build a first-order logic prover by disabling all the smart clause-scoring heuristics of the state-of-the-art E prover and replacing them with a clause-scoring neural network learned by using hindsight experience replay in an incremental learning setting. Clauses are represented as graphs and presented to transformer networks with spectral features. We show that provers trained in this way can outperform previous machine learning approaches and compete with the state of the art heuristic-based theorem prover E in its best configuration, on the popular benchmarks MPTP2078, M2k and Mizar40. The proofs generated by our algorithm are also almost always significantly shorter than E’s proofs.

UAI Conference 2018 Conference Paper

Block-Value Symmetries in Probabilistic Graphical Models

  • Gagan Madan
  • Ankit Anand
  • Mausam
  • Parag Singla

One popular way for lifted inference in probabilistic graphical models is to first merge symmetric states into a single cluster (orbit) and then use these for downstream inference, via variations of orbital MCMC [Niepert, 2012]. These orbits are represented compactly using permutations over variables, and variablevalue (VV) pairs, but they can miss several state symmetries in a domain. We define the notion of permutations over block-value (BV) pairs, where a block is a set of variables. BV strictly generalizes VV symmetries, and can compute many more symmetries for increasing block sizes. To operationalize use of BV permutations in lifted inference, we describe 1) an algorithm to compute BV permutations given a block partition of the variables, 2) BV-MCMC, an extension of orbital MCMC that can sample from BV orbits, and 3) a heuristic to suggest good block partitions. Our experiments show that BV-MCMC can mix much faster compared to vanilla MCMC and orbital MCMC.

IJCAI Conference 2017 Conference Paper

Coarse-to-Fine Lifted MAP Inference in Computer Vision

  • Haroun Habeeb
  • Ankit Anand
  • Mausam
  • Parag Singla

There is a vast body of theoretical research on lifted inference in probabilistic graphical models (PGMs). However, few demonstrations exist where lifting is applied in conjunction with top of the line applied algorithms. We pursue the applicability of lifted inference for computer vision (CV), with the insight that a globally optimal (MAP) labeling will likely have the same label for two symmetric pixels. The success of our approach lies in efficiently handling a distinct unary potential on every node (pixel), typical of CV applications. This allows us to lift the large class of algorithms that model a CV problem via PGM inference. We propose a generic template for coarse-to-fine (C2F) inference in CV, which progressively refines an initial coarsely lifted PGM for varying quality-time trade-offs. We demonstrate the performance of C2F inference by developing lifted versions of two near state-of-the-art CV algorithms for stereo vision and interactive image segmentation. We find that, against flat algorithms, the lifted versions have a much superior anytime performance, without any loss in final solution quality.

IJCAI Conference 2016 Conference Paper

Contextual Symmetries in Probabilistic Graphical Models

  • Ankit Anand
  • Aditya Grover
  • Mausam
  • Parag Singla

An important approach for efficient inference in probabilistic graphical models exploits symmetries among objects in the domain. Symmetric variables (states) are collapsed into meta-variables (metastates) and inference algorithms are run over the lifted graphical model instead of the flat one. Our paper extends existing definitions of symmetry by introducing the novel notion of contextual symmetry. Two states that are not globally symmetric, can be contextually symmetric under some specific assignment to a subset of variables, referred to as the context variables. Contextual symmetry subsumes previous symmetry definitions and can represent a large class of symmetries not representable earlier. We show how to compute contextual symmetries by reducing it to the problem of graph isomorphism. We extend previous work on exploiting symmetries in the MCMC framework to the case of contextual symmetries. Our experiments on several domains of interest demonstrate that exploiting contextual symmetries can result in significant computational gains.

IJCAI Conference 2016 Conference Paper

Lifting Techniques for Sequential Decision Making and Probabilistic Inference (Extended Abstract)

  • Ankit Anand

Many traditional AI algorithms fail to scale as the size of state space increases exponentially with the number of features. One way to reduce computation in such scenarios is to reduce the problem size by grouping symmetric states together and then running the algorithm on the reduced problem. The focus of this work is to exploit symmetry in problems of sequential decision making and probabilistic inference. Our recent work- ASAP-UCT defines new State-Action Pair (SAP) symmetries in Markov Decision Processes. We also apply these SAP symmetries in Monte Carlo Tree Search (MCTS) framework. In probabilistic inference, we expand the notion of unconditional symmetries to contextual symmetries and apply them in Markov Chain Monte Carlo (MCMC) methods. In future, we plan to explore interesting links in symmetry exploitation in different problems and aim to develop a generic symmetry based framework.

ICAPS Conference 2016 Conference Paper

OGA-UCT: On-the-Go Abstractions in UCT

  • Ankit Anand
  • Ritesh Noothigattu
  • Mausam
  • Parag Singla

Recent work has begun exploring the value of domain abstractions in Monte-Carlo Tree Search (MCTS) algorithms for probabilistic planning. These algorithms automatically aggregate symmetric search nodes (states or state-action pairs) saving valuable planning time. Existing algorithms alternate between two phases: (1) abstraction computation forcomputing node aggregations, and (2) modified MCTS that use aggregate nodes. We believe that these algorithms do not achieve the full potential of abstractions because of disjoint phases – e. g. , it can take a while to recover from erroneous abstractions, or compute better abstractions based on newly found knowledge. In response, we propose On-the-Go Abstractions (OGA), a novel approach in which abstraction computation is tightlyintegrated into the MCTS algorithm. We implement these on top of UCT and name the resulting algorithm OGA-UCT. It has several desirable properties, including (1) rapid use of new information in modifying existing abstractions, (2) elimination of the expensive batch abstraction computationphase, and (3) focusing abstraction computation on important part of the sampled search space. We experimentally compare OGA-UCT against ASAP-UCT, a recent state-of-the-art MDP algorithm as well as vanilla UCT algorithm. We find that OGA-UCT is robust across a suite of planning competition and other MDP domains, and obtains up to 28 % quality improvements.

IJCAI Conference 2015 Conference Paper

ASAP-UCT: Abstraction of State-Action Pairs in UCT

  • Ankit Anand
  • Aditya Grover
  • Mausam
  • Parag Singla

Monte-Carlo Tree Search (MCTS) algorithms such as UCT are an attractive online framework for solving planning under uncertainty problems modeled as a Markov Decision Process. However, MCTS search trees are constructed in flat state and action spaces, which can lead to poor policies for large problems. In a separate research thread, domain abstraction techniques compute symmetries to reduce the original MDP. This can lead to significant savings in computation, but these have been predominantly implemented for offline planning. This paper makes two contributions. First, we define the ASAP (Abstraction of State-Action Pairs) framework, which extends and unifies past work on domain abstractions by holistically aggregating both states and state-action pairs – ASAP uncovers a much larger number of symmetries in a given domain. Second, we propose ASAP-UCT, which implements ASAP-style abstractions within a UCT framework combining strengths of online planning with domain abstractions. Experimental evaluation on several benchmark domains shows up to 26% improvement in the quality of policies obtained over existing algorithms.