Author name cluster

Michael Witbrock

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

17 papers

2 author rows

NeurIPS Conference 2025 Conference Paper

Trust Region Reward Optimization and Proximal Inverse Reward Optimization Algorithm

Yang Chen
Menglin Zou
Jiaqi Zhang
Yitan Zhang
Junyi Yang
Gaël Gendron
Libo Zhang
Jiamou Liu

Inverse Reinforcement Learning (IRL) learns a reward function to explain expert demonstrations. Modern IRL methods often use the adversarial (minimax) formulation that alternates between reward and policy optimization, which often leads to unstable training. Recent non-adversarial IRL approaches improve stability by jointly learning reward and policy via energy-based formulations but lack formal guarantees. This work bridges this gap. We first present a unified view showing canonical non-adversarial methods explicitly or implicitly maximize the likelihood of expert behavior, which is equivalent to minimizing the expected return gap. This insight leads to our main contribution: Trust Region Reward Optimization (TRRO), a framework that guarantees monotonic improvement in this likelihood via a Minorization-Maximization process. We instantiate TRRO into Proximal Inverse Reward Optimization (PIRO), a practical and stable IRL algorithm. Theoretically, TRRO provides the IRL counterpart to the stability guarantees of Trust Region Policy Optimization (TRPO) in forward RL. Empirically, PIRO matches or surpasses state-of-the-art baselines in reward recovery and policy imitation with high sample efficiency on MuJoCo and Gym-Robotics benchmarks and a real-world animal behavior modeling task.


AAMAS Conference 2024 Conference Paper

Behaviour Modelling of Social Animals via Causal Structure Discovery and Graph Neural Networks

Gaël Gendron
Yang Chen
Mitchell Rogers
Yiping Liu
Mihailo Azhar
Shahrokh Heidari
David Arturo Soriano Valdez
Kobe Knowles

Better understanding the natural world is a crucial task with a wide range of applications. In environments with close proximity between humans and animals, such as zoos, it is essential to better understand the causes behind animal behaviour to predict unusual changes, mitigate their detrimental effects and increase the well-being of animals. However, the complex social behaviours of mammalian groups remain largely unexplored. In this work, we propose a method to build behavioural models using causal structure discovery and graph neural networks for time series. We apply this method to a mob of meerkats in a zoo environment and study its ability to predict future actions and model the behaviour distribution at an individual-level and at a group level. We show that our method can match and outperform standard deep learning architectures and generate more realistic data, while using fewer parameters and providing increased interpretability.


IJCAI Conference 2024 Conference Paper

Large Language Models Are Not Strong Abstract Reasoners

Gaël Gendron
Qiming Bao
Michael Witbrock
Gillian Dobbie

Large Language Models have shown tremendous performance on a large variety of natural language processing tasks, ranging from text comprehension to common sense reasoning. However, the mechanisms responsible for this success remain opaque, and it is unclear whether LLMs can achieve human-like cognitive capabilities or whether these models are still fundamentally circumscribed. Abstract reasoning is a fundamental task for cognition, consisting of finding and applying a general pattern from few data. Evaluating deep neural architectures on this task could give insight into their potential limitations regarding reasoning and their broad generalisation abilities, yet this is currently an under-explored area. In this paper, we introduce a new benchmark for evaluating language models beyond memorization on abstract reasoning tasks. We perform extensive evaluations of state-of-the-art LLMs, showing that they currently achieve very limited performance in contrast with other natural language tasks, even when applying techniques that have been shown to improve performance on other NLP tasks. We argue that guiding LLM generation to follow causal paths could help improve the generalisation and reasoning abilities of LLMs.


AAAI Conference 2024 Conference Paper

Meta-Inverse Reinforcement Learning for Mean Field Games via Probabilistic Context Variables

Yang Chen
Xiao Lin
Bo Yan
Libo Zhang
Jiamou Liu
Neset Özkan Tan
Michael Witbrock

Designing suitable reward functions for numerous interacting intelligent agents is challenging in real-world applications. Inverse reinforcement learning (IRL) in mean field games (MFGs) offers a practical framework to infer reward functions from expert demonstrations. While promising, the assumption of agent homogeneity limits the capability of existing methods to handle demonstrations with heterogeneous and unknown objectives, which are common in practice. To this end, we propose a deep latent variable MFG model and an associated IRL method. Critically, our method can infer rewards from different yet structurally similar tasks without prior knowledge about underlying contexts or modifying the MFG model itself. Our experiments, conducted on simulated scenarios and a real-world spatial taxi-ride pricing problem, demonstrate the superiority of our approach over state-of-the-art IRL methods in MFGs.


AAAI Conference 2024 Conference Paper

Robust Node Classification on Graph Data with Graph and Label Noise

Yonghua Zhu
Lei Feng
Zhenyun Deng
Yang Chen
Robert Amor
Michael Witbrock

Current research for node classification focuses on dealing with either graph noise or label noise, but few studies consider both of them. In this paper, we propose a new robust node classification method to simultaneously deal with graph noise and label noise. To do this, we design a graph contrastive loss to conduct local graph learning and employ self-attention to conduct global graph learning. They enable us to improve the expressiveness of node representation by using comprehensive information among nodes. We also utilize pseudo graphs and pseudo labels to deal with graph noise and label noise, respectively. Furthermore, we numerically validate the superiority of our method in terms of robust node classification compared with all comparison methods.


AAMAS Conference 2023 Conference Paper

Adversarial Inverse Reinforcement Learning for Mean Field Games

Yang Chen
Libo Zhang
Jiamou Liu
Michael Witbrock

Goal-based agents respond to environments and adjust behaviour accordingly to reach objectives. Understanding incentives of interacting agents from observed behaviour is a core problem in multi-agent systems. Inverse reinforcement learning (IRL) solves this problem, which infers underlying reward functions by observing the behaviour of rational agents. Despite IRL being principled, it becomes intractable when the number of agents grows because of the curse of dimensionality and the explosion of agent interactions. The formalism of Mean field games (MFGs) has gained momentum as a mathematically tractable paradigm for studying large-scale multi-agent systems. By grounding IRL in MFGs, recent research attempts to push the limits of the agent number in IRL. However, the study of IRL for MFGs is far from being mature as existing methods assume strong rationality, while real-world agents often exhibit bounded rationality due to the limited cognitive or computational capacity. Towards a more general and practical IRL framework for MFGs, this paper proposes Mean-Field Adversarial IRL, a novel framework capable of tolerating bounded rationality. We build it upon the maximum entropy principle, adversarial learning, and a new equilibrium concept for MFGs. We evaluate our machinery on simulated tasks with imperfect demonstrations resulting from bounded rationality. Experimental results demonstrate the superiority of MF-AIRL over existing methods in reward recovery.


IJCAI Conference 2023 Conference Paper

Disentanglement of Latent Representations via Causal Interventions

Gaël Gendron
Michael Witbrock
Gillian Dobbie

The process of generating data such as images is controlled by independent and unknown factors of variation. The retrieval of these variables has been studied extensively in the disentanglement, causal representation learning, and independent component analysis fields. Recently, approaches merging these domains together have shown great success. Instead of directly representing the factors of variation, the problem of disentanglement can be seen as finding the interventions on one image that yield a change to a single factor. Following this assumption, we introduce a new method for disentanglement inspired by causal dynamics that combines causality theory with vector-quantized variational autoencoders. Our model considers the quantized vectors as causal variables and links them in a causal graph. It performs causal interventions on the graph and generates atomic transitions affecting a unique factor of variation in the image. We also introduce a new task of action retrieval that consists of finding the action responsible for the transition between two images. We test our method on standard synthetic and real-world disentanglement datasets. We show that it can effectively disentangle the factors of variation and perform precise interventions on high-level semantic attributes of an image without affecting its quality, even with imbalanced data distributions.


AAMAS Conference 2023 Conference Paper

Learning Density-Based Correlated Equilibria for Markov Games

Libo Zhang
Yang Chen
Toru Takisaka
Bakh Khoussainov
Michael Witbrock
Jiamou Liu

Correlated Equilibrium (CE) is a well-established solution concept that captures coordination among agents and enjoys good algorithmic properties. In real-world multi-agent systems, in addition to being in equilibrium, agents’ policies are often expected to meet requirements with respect to safety and fairness. Such additional requirements can often be expressed in terms of the state density, which measures the state-visitation frequencies during the course of a game. However, existing CE notions or CE-finding approaches cannot explicitly specify a CE with particular properties concerning state density; they do so implicitly by either modifying reward functions or using value functions as the selection criteria. The resulting CE may thus not fully fulfil the state-density requirements. In this paper, we propose Density-Based Correlated Equilibria (DBCE), a new notion of CE that explicitly takes state density as a selection criterion. Concretely, we instantiate DBCE by specifying different state-density requirements motivated by real-world applications. To compute DBCE, we put forward the Density Based Correlated Policy Iteration algorithm for the underlying control problem. We perform experiments on various games where results demonstrate the advantage of our CE-finding approach over existing methods in scenarios with state-density concerns.


TMLR Journal 2023 Journal Article

Teaching Smaller Language Models To Generalise To Unseen Compositional Questions

Tim Hartill
Neset Tan
Michael Witbrock
Patricia J. Riddle

We equip a smaller Language Model to generalise to answering challenging compositional questions that have not been seen in training. To do so we propose a combination of multitask supervised pretraining on up to 93 tasks designed to instill diverse reasoning abilities, and a dense retrieval system that aims to retrieve a set of evidential paragraph fragments. Recent progress in question-answering has been achieved either through prompting methods against very large pretrained Language Models in zero or few-shot fashion, or by fine-tuning smaller models, sometimes in conjunction with information retrieval. We focus on the less explored question of the extent to which zero-shot generalisation can be enabled in smaller models with retrieval against a corpus within which sufficient information to answer a particular question may not exist. We establish strong baselines in this setting for diverse evaluation datasets (StrategyQA, CommonsenseQA, IIRC, DROP, Musique and ARC-DA), and show that performance can be significantly improved by adding retrieval-augmented training datasets which are designed to expose our models to a variety of heuristic reasoning strategies such as weighing partial evidence or ignoring an irrelevant context.


IJCAI Conference 2022 Conference Paper

Interpretable AMR-Based Question Decomposition for Multi-hop Question Answering

Zhenyun Deng
Yonghua Zhu
Yang Chen
Michael Witbrock
Patricia Riddle

Effective multi-hop question answering (QA) requires reasoning over multiple scattered paragraphs and providing explanations for answers. Most existing approaches cannot provide an interpretable reasoning process to illustrate how these models arrive at an answer. In this paper, we propose a Question Decomposition method based on Abstract Meaning Representation (QDAMR) for multi-hop QA, which achieves interpretable reasoning by decomposing a multi-hop question into simpler sub-questions and answering them in order. Since annotating the decomposition is expensive, we first delegate the complexity of understanding the multi-hop question to an AMR parser. We then achieve decomposition of a multi-hop question via segmentation of the corresponding AMR graph based on the required reasoning type. Finally, we generate sub-questions using an AMR-to-Text generation model and answer them with an off-the-shelf QA model. Experimental results on HotpotQA demonstrate that our approach is competitive for interpretable reasoning and that the sub-questions generated by QDAMR are well-formed, outperforming existing question-decomposition-based multi-hop QA approaches.


NeSy Conference 2022 Conference Paper

Multi-Step Deductive Reasoning Over Natural Language: An Empirical Study on Out-of-Distribution Generalisation

Qiming Bao
Alex Yuxuan Peng
Tim Hartill
Neset Tan
Zhenyun Deng
Michael Witbrock
Jiamou Liu

Combining deep learning with symbolic logic reasoning aims to capitalize on the success of both fields and is drawing increasing attention. Inspired by DeepLogic, an end-to-end model trained to perform inference on logic programs, we introduce IMA-GloVe-GA, an iterative neural inference network for multi-step reasoning expressed in natural language. In our model, reasoning is performed using an iterative memory neural network based on RNN with a gate attention mechanism. We evaluate IMA-GloVe-GA on three datasets: PARARULES, CONCEPTRULES V1 and CONCEPTRULES V2. Experimental results show DeepLogic with gate attention can achieve higher test accuracy than DeepLogic and other RNN baseline models. Our model achieves better out-of-distribution generalisation than RoBERTa-Large when the rules have been shuffled. Furthermore, to address the issue of unbalanced distribution of reasoning depths in the current multi-step reasoning datasets, we develop PARARULE-Plus, a large dataset with more examples that require deeper reasoning steps. Experimental results show that the addition of PARARULE-Plus can increase the model’s performance on examples requiring deeper reasoning depths. The source code and data are available at https://github.com/Strong-AI-Lab/Multi-Step-Deductive-Reasoning-Over-Natural-Language.


AAAI Conference 2021 Conference Paper

A Deep Reinforcement Learning Approach to First-Order Logic Theorem Proving

Maxwell Crouse
Ibrahim Abdelaziz
Bassem Makni
Spencer Whitehead
Cristina Cornelio
Pavan Kapanipathi
Kavitha Srinivas
Veronika Thost

Automated theorem provers have traditionally relied on manually tuned heuristics to guide how they perform proof search. Deep reinforcement learning has been proposed as a way to obviate the need for such heuristics; however, its deployment in automated theorem proving remains a challenge. In this paper we introduce TRAIL, a system that applies deep reinforcement learning to saturation-based theorem proving. TRAIL leverages (a) a novel neural representation of the state of a theorem prover and (b) a novel characterization of the inference selection process in terms of an attention-based action policy. We show through systematic analysis that these mechanisms allow TRAIL to significantly outperform previous reinforcement-learning-based theorem provers on two benchmark datasets for first-order logic automated theorem proving (proving around 15% more theorems).


AAAI Conference 2019 Conference Paper

A Sequential Set Generation Method for Predicting Set-Valued Outputs

Tian Gao
Jie Chen
Vijil Chenthamarakshan
Michael Witbrock

Consider a general machine learning setting where the output is a set of labels or sequences. This output set is unordered and its size varies with the input. Whereas multi-label classification methods seem a natural first resort, they are not readily applicable to set-valued outputs because of the growth rate of the output space; and because conventional sequence generation doesn’t reflect sets’ order-free nature. In this paper, we propose a unified framework—sequential set generation (SSG)—that can handle output sets of labels and sequences. SSG is a meta-algorithm that leverages any probabilistic learning method for label or sequence prediction, but employs a proper regularization such that a new label or sequence is generated repeatedly until the full set is produced. Though SSG is sequential in nature, it does not penalize the ordering of the appearance of the set elements and can be applied to a variety of set output problems, such as a set of classification labels or sequences. We perform experiments with both benchmark and synthetic data sets and demonstrate SSG’s strong performance over baseline methods.


AAAI Conference 2019 Conference Paper

Improving Natural Language Inference Using External Knowledge in the Science Questions Domain

Xiaoyan Wang
Pavan Kapanipathi
Ryan Musa
Mo Yu
Kartik Talamadupula
Ibrahim Abdelaziz
Maria Chang
Achille Fokoue

Natural Language Inference (NLI) is fundamental to many Natural Language Processing (NLP) applications including semantic search and question answering. The NLI problem has gained significant attention due to the release of large scale, challenging datasets. Present approaches to the problem largely focus on learning-based methods that use only textual information in order to classify whether a given premise entails, contradicts, or is neutral with respect to a given hypothesis. Surprisingly, the use of methods based on structured knowledge – a central topic in artificial intelligence – has not received much attention vis-a-vis the NLI problem. While there are many open knowledge bases that contain various types of reasoning information, their use for NLI has not been well explored. To address this, we present a combination of techniques that harness external knowledge to improve performance on the NLI problem in the science questions domain. We present the results of applying our techniques on text, graph, and text-and-graph based models; and discuss the implications of using external knowledge to solve the NLI problem. Our model achieves close to state-of-the-art performance for NLI on the SciTail science questions dataset.


NeurIPS Conference 2017 Conference Paper

Dilated Recurrent Neural Networks

Shiyu Chang
Yang Zhang
Wei Han
Mo Yu
Xiaoxiao Guo
Wei Tan
Xiaodong Cui
Michael Witbrock

Learning with recurrent neural networks (RNNs) on long sequences is a notoriously difficult task. There are three major challenges: 1) complex dependencies, 2) vanishing and exploding gradients, and 3) efficient parallelization. In this paper, we introduce a simple yet effective RNN connection structure, the DilatedRNN, which simultaneously tackles all of these challenges. The proposed architecture is characterized by multi-resolution dilated recurrent skip connections and can be combined flexibly with diverse RNN cells. Moreover, the DilatedRNN reduces the number of parameters needed and enhances training efficiency significantly, while matching state-of-the-art performance (even with standard RNN cells) in tasks involving very long-term dependencies. To provide a theory-based quantification of the architecture's advantages, we introduce a memory capacity measure, the mean recurrent length, which is more suitable for RNNs with long skip connections than existing measures. We rigorously prove the advantages of the DilatedRNN over other recurrent neural architectures. The code for our method is publicly available at https://github.com/code-terminator/DilatedRNN.


KR Conference 2004 Conference Paper

Towards a Quantitative, Platform-Independent Analysis of Knowledge Systems

Noah S. Friedland
Paul G. Allen
Michael Witbrock
Gavin Matthews
Nancy Salay
Pierluigi Miraglia
Jurgen Angele
Steffen Staab

The Halo Pilot, a six-month effort to evaluate the state-of-the-art in applied Knowledge Representation and Reasoning (KRR) systems, collaboratively developed a taxonomy of failures with the goal of creating a common framework of metrics against which we could measure inter- and intra-system failure characteristics of each of the three Halo knowledge applications. This platform-independent taxonomy was designed with the intent of maximizing its coverage of potential failure types; providing the necessary granularity and precision to enable clear categorization of failure types; and providing a productive framework for short and longer term corrective action. Examining the failure analysis and initial empirical use of the taxonomy provides quantitative insights into the strengths and weaknesses of individual systems and raises some issues shared by all three. These results are particularly interesting when considered against the long history of assumed reasons for knowledge system failure. Our study has also uncovered some shortcomings in the taxonomy itself, implying the need to improve both its granularity and precision. It is the hope of Project Halo to eventually produce a failure taxonomy and associated methodology that will be of general use in the fine-grained analysis of knowledge systems.

IJCAI Conference 2003 Conference Paper

Inducing criteria for lexicalization parts of speech using the Cyc KB

Tom O'Hara
Michael Witbrock
Bjørn Aldag
Stefano Bertolo
Nancy Salay
Jon Curtis
Kathy Panton

We present an approach for learning part-of-speech distinctions by induction over the lexicon of the Cyc knowledge base. This produces good results (74.6%) using a decision tree that incorporates both semantic features and syntactic features. Accurate results (90.5%) are achieved for the special case of deciding whether lexical mappings should use count noun or mass noun headwords. Comparable results are also obtained using OpenCyc, the publicly available version of Cyc.
