Arrow Research search

Author name cluster

Svetha Venkatesh

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

78 papers
2 author rows

Possible papers (78)

JBHI Journal 2026 Journal Article

Confident and Trustworthy Model for Fidgety Movement Classification

  • Romero Morais
  • Thao Minh Le
  • Truyen Tran
  • Caroline Alexander
  • Natasha Amery
  • Catherine Morgan
  • Alicia Spittle
  • Vuong Le

General movements (GMs) are part of the spontaneous movement repertoire and are present from early fetal life onwards up to the age of five months. GMs are connected to infants’ neurological development and can be qualitatively assessed via the General Movement Assessment (GMA). In particular, between the ages of three and five months, typically developing infants produce Fidgety Movements (FM), and their absence provides strong evidence for the presence of cerebral palsy (CP). To improve accessibility to the GMA, automated GMA solutions have been a key research area, with proposed models becoming increasingly accurate and interpretable. However, current models cannot gauge their ability to make decisions, which may lead to overconfident mistakes. To address this issue, we propose a deep-learning-based approach that not only classifies movements as fidgety or non-fidgety but also selectively abstains from classification when uncertain. Through two novel regularization losses, our model maintains balanced coverage across the two movement types, which prevents bias toward an easy-to-classify subset of movements. We show that our proposed model learns to gauge its own confidence in movement classification, and our proposed regularization losses effectively ensure that the model maintains similar confidence across movement types. We also show that the local movement abstentions have little impact on the video-level coverage and that relying on the most confident predictions improves the video-level performance.

JBHI Journal 2025 Journal Article

Fine-Grained Fidgety Movement Classification Using Active Learning

  • Romero Morais
  • Truyen Tran
  • Caroline Alexander
  • Natasha Amery
  • Catherine Morgan
  • Alicia Spittle
  • Vuong Le
  • Nadia Badawi

Typically developing infants, between the corrected age of 9–20 weeks, produce fidgety movements. These movements can be identified with the General Movement Assessment, but their identification requires trained professionals to conduct the assessment from video recordings. Since trained professionals are expensive and demand may exceed their availability, computer vision-based solutions have been developed to assist practitioners. However, most solutions to date treat the problem as a direct mapping from video to infant status, without modeling fidgety movements throughout the video. To address that, we propose to directly model infants' short movements and classify them as fidgety or non-fidgety. In this way, we model the explanatory factor behind the infant's status and improve model interpretability. The issue with our proposal is that labels for an infant's short movements are not available, which precludes us from training such a model. We overcome this issue with active learning. Active learning is a framework that minimizes the amount of labeled data required to train a model by only labeling examples that are considered “informative” to the model. The assumption is that a model trained on informative examples reaches a higher performance level than a model trained with randomly selected examples. We validate our framework by modeling the movements of infants' hips on two representative cohorts: typically developing and at-risk infants. Our results show that active learning is suitable for our problem and that it works adequately even when the models are trained with labels provided by a novice annotator.

AAAI Conference 2025 Conference Paper

Multi-Reference Preference Optimization for Large Language Models

  • Hung Le
  • Quan Hung Tran
  • Dung Nguyen
  • Kien Do
  • Saloni Mittal
  • Kelechi Ogueji
  • Svetha Venkatesh

How can Large Language Models (LLMs) be aligned with human intentions and values? A typical solution is to gather human preferences on model outputs and finetune the LLMs accordingly while ensuring that updates do not deviate too far from a reference model. Recent approaches, such as direct preference optimization (DPO), have eliminated the need for unstable and sluggish reinforcement learning optimization by introducing closed-form supervised losses. However, a significant limitation of the current approach is its design for a single reference model only, neglecting to leverage the collective power of numerous pretrained LLMs. To overcome this limitation, we introduce a novel closed-form formulation for direct preference optimization using multiple reference models. The resulting algorithm, Multi-Reference Preference Optimization (MRPO), leverages broader prior knowledge from diverse reference models, substantially enhancing preference learning capabilities compared to the single-reference DPO. Our experiments demonstrate that LLMs finetuned with MRPO generalize better across various preference datasets, regardless of data scarcity or abundance. Furthermore, MRPO effectively finetunes LLMs to exhibit superior performance on several downstream natural language processing benchmarks such as HH-RLHF, GSM8K and TruthfulQA.
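
For intuition, here is a minimal sketch of a DPO-style preference loss extended to several reference models. Averaging the reference log-probabilities is an illustrative assumption for this sketch, not MRPO's actual closed form, which is derived in the paper.

```python
# Minimal sketch of a DPO-style preference loss with several reference
# models. The aggregation below (averaging reference log-probabilities)
# is an illustrative assumption, not the paper's exact MRPO formulation.
import torch
import torch.nn.functional as F

def multi_ref_dpo_loss(policy_logp_w, policy_logp_l,
                       ref_logps_w, ref_logps_l, beta=0.1):
    """policy_logp_*: (batch,) log-probs of chosen/rejected responses.
    ref_logps_*: (num_refs, batch) log-probs under each reference model."""
    ref_w = ref_logps_w.mean(dim=0)  # assumption: simple average of refs
    ref_l = ref_logps_l.mean(dim=0)
    # Standard DPO margin between implicit rewards of chosen vs rejected.
    margin = (policy_logp_w - ref_w) - (policy_logp_l - ref_l)
    return -F.logsigmoid(beta * margin).mean()

# Toy usage: 3 reference models, batch of 4 preference pairs.
pw, pl = torch.randn(4), torch.randn(4)
rw, rl = torch.randn(3, 4), torch.randn(3, 4)
print(multi_ref_dpo_loss(pw, pl, rw, rl))
```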

AAMAS Conference 2025 Conference Paper

Navigating Social Dilemmas with LLM-based Agents via Consideration of Future Consequences

  • Dung Nguyen
  • Hung Le
  • Kien Do
  • Sunil Gupta
  • Svetha Venkatesh
  • Truyen Tran

Agents built on LLMs have shown versatile capabilities but face difficulties in being cooperative in social dilemma situations. When making decisions under the strain of selecting between long-term consequences and short-term benefits in commonly shared resources, LLM-based agents are vulnerable to the tragedy of the commons, i.e. individuals’ greedy exploitation leads to early depletion. We propose LLM agents that consider future consequences to aid them in navigating intertemporal social dilemmas. We introduce two approaches—prompting and intervention—to equip the agent with the ability to consider future consequences when making a decision, which results in a new kind of agent—the CFC-Agent. Furthermore, we enable the CFC-Agent to act toward different levels of consideration of future consequences. Our experiments in different settings show that agents that consider future consequences exhibit sustainable behaviour and achieve high common rewards for the population.

IJCAI Conference 2025 Conference Paper

Navigating Social Dilemmas with LLM-based Agents via Consideration of Future Consequences

  • Dung Nguyen
  • Hung Le
  • Kien Do
  • Sunil Gupta
  • Svetha Venkatesh
  • Truyen Tran

Artificial agents with the aid of large language models (LLMs) are effective in various real-world scenarios but struggle to cooperate in social dilemmas. When making decisions under the strain of selecting between long-term consequences and short-term benefits in commonly shared resources, LLM-based agents often exploit the environment, leading to early depletion. Inspired by the concept of consideration of future consequences (CFC), which is well known in social psychology, we propose a framework that enables LLM-based agents to consider future consequences, resulting in a new kind of agent that we term the CFC-Agent. We enable the CFC-Agent to act toward different levels of consideration of future consequences. Our first set of experiments, where the LLM is directly asked to make decisions, shows that agents considering future consequences exhibit sustainable behaviour and achieve high common rewards for the population. Extensive experiments in complex environments showed that the CFC-Agent can manage a sequence of calls to the LLM for reasoning and engage in communication to cooperate with others to better resolve the common dilemma. Finally, our analysis showed that considering future consequences not only affects the final decision but also improves the conversations between LLM-based agents toward a better resolution of social dilemmas.

TMLR Journal 2025 Journal Article

Reasoning Under 1 Billion: Memory-Augmented Reinforcement Learning for Large Language Models

  • Hung Le
  • Van Dai Do
  • Dung Nguyen
  • Svetha Venkatesh

Recent advances in fine-tuning large language models (LLMs) with reinforcement learning (RL) have shown promising improvements in complex reasoning tasks, particularly when paired with chain-of-thought (CoT) prompting. However, these successes have been largely demonstrated on large-scale models with billions of parameters, where a strong pretraining foundation ensures effective initial exploration. In contrast, RL remains challenging for tiny LLMs with 1 billion parameters or fewer because they lack the necessary pretraining strength to explore effectively, often leading to suboptimal reasoning patterns. This work introduces a novel intrinsic motivation approach, called Memory-R+, that leverages episodic memory to address this challenge, improving tiny LLMs in CoT reasoning tasks. Inspired by human memory-driven learning, our method leverages successful reasoning patterns stored in memory while allowing controlled exploration to generate novel responses. Intrinsic rewards are computed efficiently using a kNN-based episodic memory, allowing the model to discover new reasoning strategies while quickly adapting to effective past solutions. Experiments on three reasoning datasets demonstrate that our approach significantly enhances smaller LLMs' reasoning performance and generalization capability, making RL-based reasoning improvements more accessible in low-resource settings.
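
A toy sketch of a kNN-based episodic memory producing an intrinsic reward, in the spirit described above; the data layout and the exploit/explore mixing weight are assumptions made purely for illustration.

```python
# Illustrative kNN episodic-memory intrinsic reward. The reward shaping
# (mean neighbour value plus a distance-based novelty bonus) is an
# assumption for this sketch, not the paper's implementation.
import numpy as np

class EpisodicMemory:
    def __init__(self, k=5):
        self.k = k
        self.keys = []      # embeddings of past responses
        self.values = []    # task rewards obtained for those responses

    def add(self, embedding, reward):
        self.keys.append(embedding)
        self.values.append(reward)

    def intrinsic_reward(self, embedding):
        if not self.keys:
            return 0.0
        keys = np.stack(self.keys)
        dists = np.linalg.norm(keys - embedding, axis=1)
        idx = np.argsort(dists)[: self.k]
        # Exploit: similarity to successful past reasoning patterns.
        exploit = float(np.mean(np.asarray(self.values)[idx]))
        # Explore: novelty bonus from distance to the memory.
        explore = float(np.mean(dists[idx]))
        return exploit + 0.1 * explore

mem = EpisodicMemory(k=3)
for _ in range(10):
    mem.add(np.random.randn(8), np.random.rand())
print(mem.intrinsic_reward(np.random.randn(8)))
```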

NeurIPS Conference 2025 Conference Paper

Reproducing Kernel Banach Space Models for Neural Networks with Application to Rademacher Complexity Analysis

  • Alistair Shilton
  • Sunil Gupta
  • Santu Rana
  • Svetha Venkatesh

This paper explores the use of Hermite-transform-based reproducing kernel Banach space methods to construct exact or un-approximated models of feedforward neural networks of arbitrary width, depth and topology, including ResNet and Transformer networks, assuming only a feedforward topology, finite-energy activations and finite (spectral-) norm weights and biases. Using this model, two straightforward but surprisingly tight bounds on Rademacher complexity are derived, precisely (1) a general bound that is width-independent and scales exponentially with depth; and (2) a width- and depth-independent bound for networks with appropriately constrained (below-threshold) weights and biases.

ICLR Conference 2025 Conference Paper

Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning

  • Hung Le 0002
  • Dung Nguyen 0001
  • Kien Do
  • Sunil Gupta 0001
  • Svetha Venkatesh

Effective decision-making in partially observable environments demands robust memory management. Despite their success in supervised learning, current deep-learning memory models struggle in reinforcement learning environments that are partially observable and long-term. They fail to efficiently capture relevant past information, adapt flexibly to changing observations, and maintain stable updates over long episodes. We theoretically analyze the limitations of existing memory models within a unified framework and introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents. Our model dynamically adjusts memory by erasing no-longer-needed experiences and reinforcing crucial ones in a computationally efficient manner. To this end, we leverage the Hadamard product for calibrating and updating memory, specifically designed to enhance memory capacity while mitigating numerical and learning challenges. Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks, such as meta-reinforcement learning, long-horizon credit assignment, and POPGym, demonstrating superior performance in handling long-term and evolving contexts.
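
A schematic of a Hadamard-product memory update may help fix the idea: the memory is calibrated by an elementwise gate and then augmented with new content. The gate/update parameterisation below is an assumption made for illustration; the paper's exact design differs.

```python
# Schematic Hadamard-style memory update: calibrate the memory with an
# elementwise (Hadamard) gate, then add new content. The parameterisation
# of the gate and update is an illustrative assumption for this sketch.
import numpy as np

rng = np.random.default_rng(0)
d_mem, d_in = 16, 8
W_c = rng.normal(size=(d_mem, d_in))
W_u = rng.normal(size=(d_mem, d_in))

def step(memory, x):
    calibration = 1.0 + np.tanh(W_c @ x)   # multiplicative gate around 1
    update = np.tanh(W_u @ x)              # new content to write
    return calibration * memory + update   # Hadamard calibrate, then add

memory = np.zeros(d_mem)
for t in range(100):                       # rolled out over a long episode
    memory = step(memory, rng.normal(size=d_in))
print(memory[:4])
```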

NeurIPS Conference 2024 Conference Paper

Active Set Ordering

  • Quoc Phong Nguyen
  • Sunil Gupta
  • Svetha Venkatesh
  • Bryan Kian Hsiang Low
  • Patrick Jaillet

In this paper, we formalize the active set ordering problem, which involves actively discovering a set of inputs based on their orderings determined by expensive evaluations of a black-box function. We then propose the mean prediction (MP) algorithm and theoretically analyze it in terms of the regret of predicted pairwise orderings between inputs. Notably, as a special case of this framework, we can cast Bayesian optimization as an active set ordering problem by recognizing that maximizers can be identified solely by comparison rather than by precisely estimating the function evaluations. As a result, we are able to construct the popular Gaussian process upper confidence bound (GP-UCB) algorithm through the lens of ordering with several nuanced insights. We empirically validate the performance of our proposed solution using various synthetic functions and real-world datasets.

AAMAS Conference 2024 Conference Paper

Beyond Surprise: Improving Exploration Through Surprise Novelty

  • Hung Le
  • Kien Do
  • Dung Nguyen
  • Svetha Venkatesh

We present a new computing model for intrinsic rewards in reinforcement learning that addresses the limitations of existing surprise-driven explorations. The reward is the novelty of the surprise rather than the surprise norm. We estimate the surprise novelty as retrieval errors of a memory network wherein the memory stores and reconstructs surprises. Our surprise memory (SM) augments the capability of surprise-based intrinsic motivators, maintaining the agent’s interest in exciting exploration while reducing unwanted attraction to unpredictable or noisy observations. Our experiments demonstrate that the SM combined with various surprise predictors exhibits efficient exploring behaviors and significantly boosts the final performance in sparse reward environments, including Noisy-TV, navigation and challenging Atari games.
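
A minimal sketch of the surprise-novelty idea: the intrinsic reward is the error with which a memory of past surprises reconstructs the current surprise vector. The attention-based readout below is an illustrative choice, not the paper's memory network.

```python
# Sketch of "surprise novelty": reward equals how poorly a memory of past
# surprises can reconstruct the current surprise. The softmax-attention
# reconstruction is an assumption made for this illustration.
import numpy as np

def surprise_novelty(memory, s, temperature=1.0):
    """memory: (n, d) stored surprise vectors; s: (d,) current surprise."""
    if len(memory) == 0:
        return float(np.linalg.norm(s))
    sims = memory @ s / temperature
    w = np.exp(sims - sims.max())
    w /= w.sum()
    reconstruction = w @ memory          # attention-weighted readout
    return float(np.linalg.norm(s - reconstruction))

mem = np.random.randn(32, 8)
novel = np.random.randn(8) * 3           # unfamiliar surprise: high reward
seen = mem[0]                            # stored surprise: lower reward
print(surprise_novelty(mem, novel), surprise_novelty(mem, seen))
```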

IJCAI Conference 2024 Conference Paper

Diversifying Training Pool Predictability for Zero-shot Coordination: A Theory of Mind Approach

  • Dung Nguyen
  • Hung Le
  • Kien Do
  • Sunil Gupta
  • Svetha Venkatesh
  • Truyen Tran

The challenge in constructing artificial social agents is to enable them to adapt to novel agents, a problem known as zero-shot coordination (ZSC). A promising approach is to train the adaptive agents by interacting with a diverse pool of collaborators, assuming that the greater the diversity in other agents seen during training, the better the generalisation. In this paper, we explore an alternative procedure by considering the behavioural predictability of collaborators, i.e. whether their actions and intentions are predictable, and use it to select a diverse set of agents for the training pool. More specifically, we develop a pool of agents through self-play training, during which agents' behaviour evolves and exhibits diversity in levels of behavioural predictability (LoBP). We construct an observer to compute the level of behavioural predictability for each version of the collaborators. To do so, the observer is equipped with theory of mind (ToM) capability to learn to infer the actions and intentions of others. We then use an episodic memory based on the LoBP metric to maintain agents with different levels of behavioural predictability in the pool of agents. Since behaviours that emerge at the later training phase are more complex and meaningful, the memory is updated with the latest versions of training agents. Our extensive experiments demonstrate that LoBP-based diversity training leads to better ZSC than other diversity training methods.

ECAI Conference 2024 Conference Paper

Large Language Model Prompting with Episodic Memory

  • Dai Do
  • Quan Tran
  • Svetha Venkatesh
  • Hung Le 0002

Prompt optimization is essential for enhancing the performance of Large Language Models (LLMs) in a range of Natural Language Processing (NLP) tasks, particularly in scenarios of few-shot learning where training examples are incorporated directly into the prompt. Despite the growing interest in optimizing prompts with few-shot examples, existing methods for prompt optimization are often resource-intensive or perform inadequately. In this work, we propose PrOmpting with Episodic Memory (POEM), a novel prompt optimization technique that is simple, efficient, and demonstrates strong generalization capabilities. We approach prompt optimization as a Reinforcement Learning (RL) challenge, using episodic memory to archive combinations of input data, permutations of few-shot examples, and the rewards observed during training. In the testing phase, we optimize the sequence of examples for each test query by selecting the sequence that yields the highest total rewards from the top-k most similar training examples in the episodic memory. Our results show that POEM outperforms recent techniques like TEMPERA and RLPrompt by over 5.3% in various text classification tasks. Furthermore, our approach adapts well to broader language understanding tasks, consistently outperforming conventional heuristic methods for ordering examples.
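
A minimal sketch of the test-time retrieval step described above, assuming a simple (embedding, permutation, reward) memory layout and cosine similarity; both are illustrative choices rather than the paper's implementation.

```python
# Sketch of POEM-style test-time retrieval: find the top-k most similar
# stored queries and reuse the few-shot example ordering with the highest
# recorded reward. Data layout and similarity measure are assumptions.
import numpy as np

def best_permutation(memory, query_emb, k=5):
    """memory: list of (embedding, permutation, reward) tuples."""
    embs = np.stack([m[0] for m in memory])
    sims = embs @ query_emb / (
        np.linalg.norm(embs, axis=1) * np.linalg.norm(query_emb) + 1e-8)
    topk = np.argsort(sims)[-k:]                  # most similar queries
    best = max(topk, key=lambda i: memory[i][2])  # highest stored reward
    return memory[best][1]

mem = [(np.random.randn(16), tuple(np.random.permutation(4)), np.random.rand())
       for _ in range(100)]
print(best_permutation(mem, np.random.randn(16)))
```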

TMLR Journal 2024 Journal Article

Plug, Play, and Generalize: Length Extrapolation with Pointer-Augmented Neural Memory

  • Hung Le
  • Dung Nguyen
  • Kien Do
  • Svetha Venkatesh
  • Truyen Tran

We introduce Pointer-Augmented Neural Memory (PANM), a versatile module designed to enhance neural networks' ability to process symbols and extend their capabilities to longer data sequences. PANM integrates an external neural memory utilizing novel physical addresses and pointer manipulation techniques, emulating human and computer-like symbol processing abilities. PANM facilitates operations like pointer assignment, dereferencing, and arithmetic by explicitly employing physical pointers for memory access. This module can be trained end-to-end on sequence data, empowering various sequential models, from simple recurrent networks to large language models (LLMs). Our experiments showcase PANM's exceptional length extrapolation capabilities and its enhancement of recurrent neural networks in symbol processing tasks, including algorithmic reasoning and Dyck language recognition. PANM enables Transformers to achieve up to 100% generalization accuracy in compositional learning tasks and significantly improves performance in mathematical reasoning, question answering, and machine translation. Notably, the generalization effectiveness scales with stronger backbone models, as evidenced by substantial performance gains when we test LLMs finetuned with PANM for tasks up to 10-100 times longer than the training data.

ECAI Conference 2024 Conference Paper

Revisiting the Dataset Bias Problem from a Statistical Perspective

  • Kien Do
  • Dung Nguyen 0001
  • Hung Le 0002
  • Thao Le 0003
  • Dang Nguyen 0002
  • Haripriya Harikumar
  • Tran The Truyen
  • Santu Rana

In this paper, we study the “dataset bias” problem from a statistical standpoint, and identify the main cause of the problem as the strong correlation between a class attribute $u$ and a non-class attribute $b$ in the input $x$, represented by $p(u|b)$ differing significantly from $p(u)$. Since $p(u|b)$ appears as part of the sampling distributions in the standard maximum log-likelihood (MLL) objective, a model trained on a biased dataset via MLL inherently incorporates such correlation into its parameters, leading to poor generalization to unbiased test data. From this observation, we propose to mitigate dataset bias via either weighting the objective of each sample $n$ by $1/p(u_n|b_n)$ or sampling that sample with a weight proportional to $1/p(u_n|b_n)$. While both methods are statistically equivalent, the former proves more stable and effective in practice. Additionally, we establish a connection between our debiasing approach and causal reasoning, reinforcing our method’s theoretical foundation. However, when the bias label is unavailable, computing $p(u|b)$ exactly is difficult. To overcome this challenge, we propose to approximate $1/p(u|b)$ using a biased classifier trained with “bias amplification” losses. Extensive experiments on various biased datasets demonstrate the superiority of our method over existing debiasing techniques in most settings, validating our theoretical analysis.
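
As a worked illustration of the reweighting scheme, the sketch below estimates $p(u|b)$ from observed (class, bias) label pairs and returns per-sample weights $1/p(u_n|b_n)$. When bias labels are unobserved, the paper instead approximates the weight with a bias-amplified classifier, which is not shown here.

```python
# Debiasing by reweighting: each sample's loss is weighted by
# 1 / p(u_n | b_n), estimated here from empirical (class, bias) counts.
import numpy as np

def debias_weights(u, b):
    """u: (n,) class labels; b: (n,) bias-attribute labels."""
    classes, biases = np.unique(u), np.unique(b)
    p_u_given_b = {
        (ui, bi): max(np.mean(u[b == bi] == ui), 1e-8)
        for ui in classes for bi in biases
    }
    return np.array([1.0 / p_u_given_b[(ui, bi)] for ui, bi in zip(u, b)])

u = np.array([0, 0, 0, 1, 1, 0])   # class correlates with bias attribute
b = np.array([0, 0, 0, 1, 1, 1])
w = debias_weights(u, b)
print(w)  # rare (class, bias) pairs such as (0, 1) get large weights
# Training then minimises the weighted loss: mean(w_n * loss_n).
```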

AAAI Conference 2024 Conference Paper

Root Cause Explanation of Outliers under Noisy Mechanisms

  • Phuoc Nguyen
  • Truyen Tran
  • Sunil Gupta
  • Thin Nguyen
  • Svetha Venkatesh

Identifying root causes of anomalies in causal processes is vital across disciplines. Once identified, one can isolate the root causes and implement necessary measures to restore normal operation. Causal processes are often modelled as graphs, with entities being nodes and their paths/interconnections being edges. Existing work only considers the contribution of nodes in the generative process, and thus cannot attribute the outlier score to the edges of the mechanism if the anomaly occurs in the connections. In this paper, we consider both the individual edges and nodes of each mechanism when identifying the root causes. We introduce a noisy functional causal model for this purpose. Then, we employ Bayesian learning and inference methods to infer the noises of the nodes and edges. We then represent the functional form of a target outlier leaf as a function of the node and edge noises. Finally, we propose an efficient gradient-based attribution method to compute the anomaly attribution scores, which scales linearly with the number of nodes and edges. Experiments on simulated datasets and two real-world scenario datasets show better anomaly attribution performance of the proposed method compared to the baselines. Our method scales to larger graphs with more nodes and edges.

ICML Conference 2023 Conference Paper

Gradient Descent in Neural Networks as Sequential Learning in Reproducing Kernel Banach Space

  • Alistair Shilton
  • Sunil Gupta 0001
  • Santu Rana
  • Svetha Venkatesh

The study of Neural Tangent Kernels (NTKs) has provided much-needed insight into the convergence and generalization properties of neural networks in the over-parametrized (wide) limit by approximating the network using a first-order Taylor expansion with respect to its weights in the neighborhood of their initialization values. This allows neural network training to be analyzed from the perspective of reproducing kernel Hilbert spaces (RKHS), which is informative in the over-parametrized regime, but a poor approximation for narrower networks as the weights change more during training. Our goal is to extend beyond the limits of NTK toward a more general theory. We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights as an inner product of two feature maps, respectively from data and weight-step space, to feature space, allowing neural network training to be analyzed from the perspective of reproducing kernel Banach space (RKBS). We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning in RKBS. Using this, we present a novel bound on uniform convergence in which the iteration count and learning rate play a central role, giving new theoretical insight into neural network training.

AAAI Conference 2023 Conference Paper

Memory-Augmented Theory of Mind Network

  • Dung Nguyen
  • Phuoc Nguyen
  • Hung Le
  • Kien Do
  • Svetha Venkatesh
  • Truyen Tran

Social reasoning necessitates the capacity of theory of mind (ToM), the ability to contextualise and attribute mental states to others without having access to their internal cognitive structure. Recent machine learning approaches to ToM have demonstrated that we can train an observer to read the past and present behaviours of other agents and infer their beliefs (including false beliefs about things that no longer exist), goals, intentions and future actions. The challenges arise when the behavioural space is complex, demanding skilful space navigation for rapidly changing contexts over an extended period. We tackle the challenges by equipping the observer with novel neural memory mechanisms to encode, and hierarchical attention to selectively retrieve, information about others. The memories allow rapid, selective querying of distal related past behaviours of others to deliberatively reason about their current mental state, beliefs and future behaviours. This results in ToMMY, a theory of mind model that learns to reason while making few assumptions about the underlying mental processes. We also construct a new suite of experiments to demonstrate that memories facilitate the learning process and achieve better theory of mind performance, especially for high-demand false-belief tasks that require inferring through multiple steps of changes.

AAAI Conference 2023 Conference Paper

On Instance-Dependent Bounds for Offline Reinforcement Learning with Linear Function Approximation

  • Thanh Nguyen-Tang
  • Ming Yin
  • Sunil Gupta
  • Svetha Venkatesh
  • Raman Arora

Sample-efficient offline reinforcement learning (RL) with linear function approximation has been studied extensively recently. Much of the prior work has yielded instance-independent rates that hold even for the worst-case realization of problem instances. This work seeks to understand instance-dependent bounds for offline RL with linear function approximation. We present an algorithm called Bootstrapped and Constrained Pessimistic Value Iteration (BCP-VI), which leverages data bootstrapping and constrained optimization on top of pessimism. We show that under a partial data coverage assumption, that of concentrability with respect to an optimal policy, the proposed algorithm yields a fast rate for offline RL when there is a positive gap in the optimal Q-value functions, even if the offline data were collected adaptively. Moreover, when the linear features of the optimal actions in the states reachable by an optimal policy span those reachable by the behavior policy and the optimal actions are unique, offline RL achieves absolute zero sub-optimality error when the number of episodes exceeds a (finite) instance-dependent threshold. To the best of our knowledge, these are the first results that give a fast rate bound on the sub-optimality and an absolute zero sub-optimality bound for offline RL with linear function approximation from adaptive data with partial coverage. We also provide instance-agnostic and instance-dependent information-theoretical lower bounds to complement our upper bounds.

JBHI Journal 2023 Journal Article

Robust and Interpretable General Movement Assessment Using Fidgety Movement Detection

  • Romero Morais
  • Vuong Le
  • Catherine Morgan
  • Alicia Spittle
  • Nadia Badawi
  • Jane Valentine
  • Elizabeth M Hurrion
  • Paul A Dawson

Fidgety movements occur in infants between the ages of 9 and 20 weeks post-term, and their absence is a strong indicator that an infant has cerebral palsy. Prechtl's General Movement Assessment method evaluates whether an infant has fidgety movements, but requires a trained expert to conduct it. Timely evaluation facilitates early interventions, and thus computer-based methods have been developed to aid domain experts. However, current solutions rely on complex models or high-dimensional representations of the data, which hinder their interpretability and generalization ability. To address this, we propose $\text{FidgetyFind}$, a method that detects fidgety movements and uses them towards an assessment of the quality of an infant's general movements. $\text{FidgetyFind}$ is true to the domain expert process, more accurate, and highly interpretable due to its fine-grained scoring system. The main idea behind $\text{FidgetyFind}$ is to specify signal properties of fidgety movements that are measurable and quantifiable. In particular, we measure the movement direction variability of joints of interest, for movements of small amplitude in short video segments. $\text{FidgetyFind}$ also comprises a strategy to reduce those measurements to a single score that quantifies the quality of an infant's general movements; the strategy is a direct translation of the qualitative procedure domain experts use to assess infants. This brings $\text{FidgetyFind}$ closer to the process a domain expert applies to decide whether an infant produced enough fidgety movements. We evaluated $\text{FidgetyFind}$ on the largest clinical dataset reported, where it proved to be interpretable and more accurate than many methods published to date.

IJCAI Conference 2023 Conference Paper

Social Motivation for Modelling Other Agents under Partial Observability in Decentralised Training

  • Dung Nguyen
  • Hung Le
  • Kien Do
  • Svetha Venkatesh
  • Truyen Tran

Understanding other agents is a key challenge in constructing artificial social agents. Current works focus on centralised training, wherein agents are allowed to know all the information about others and the environmental state during training. In contrast, this work studies decentralised training, wherein agents must learn the model of other agents in order to cooperate with them under partially observable conditions, even during training, i.e. learning agents are myopic. The intrinsic motivation for artificial agents is modelled on the concept of human social motivation that entices humans to meet and understand each other, especially when experiencing a utility loss. Our intrinsic motivation encourages agents to stay near each other to obtain better observations and construct a model of others. They do so when their model of other agents is poor, or the overall task performance is bad during the learning phase. This simple but effective method facilitates the process of modelling others, resulting in a significant improvement in performance on cooperative tasks. Our experiments demonstrate that the socially-motivated agent can model others better and promote cooperation across different tasks.

AAAI Conference 2022 Conference Paper

Episodic Policy Gradient Training

  • Hung Le
  • Majid Abdolshah
  • Thommen K. George
  • Kien Do
  • Dung Nguyen
  • Svetha Venkatesh

We introduce a novel training procedure for policy gradient methods wherein episodic memory is used to optimize the hyperparameters of reinforcement learning algorithms on-the-fly. Unlike other hyperparameter searches, we formulate hyperparameter scheduling as a standard Markov Decision Process and use episodic memory to store the outcome of used hyperparameters and their training contexts. At any policy update step, the policy learner refers to the stored experiences, and adaptively reconfigures its learning algorithm with the new hyperparameters determined by the memory. This mechanism, dubbed Episodic Policy Gradient Training (EPGT), enables an episodic learning process, and jointly learns the policy and the learning algorithm’s hyperparameters within a single run. Experimental results on both continuous and discrete environments demonstrate the advantage of using the proposed method in boosting the performance of various policy gradient algorithms.

NeurIPS Conference 2022 Conference Paper

Expected Improvement for Contextual Bandits

  • Hung Tran-The
  • Sunil Gupta
  • Santu Rana
  • Tuan Truong
  • Long Tran-Thanh
  • Svetha Venkatesh

The expected improvement (EI) is a popular technique to handle the tradeoff between exploration and exploitation under uncertainty. This technique has been widely used in Bayesian optimization, but it is not applicable to the contextual bandit problem, which is a generalization of the standard bandit and Bayesian optimization. In this paper, we initiate and study the EI technique for contextual bandits from both theoretical and practical perspectives. We propose two novel EI-based algorithms, one when the reward function is assumed to be linear and the other for more general reward functions. With linear reward functions, we demonstrate that our algorithm achieves a near-optimal regret. Notably, our regret improves that of LinTS \cite{agrawal13} by a factor of $\sqrt{d}$ while avoiding solving an NP-hard problem at each iteration as in LinUCB \cite{Abbasi11}. For more general reward functions, which are modeled by deep neural networks, we prove that our algorithm achieves a $\tilde{\mathcal O} (\tilde{d}\sqrt{T})$ regret, where $\tilde{d}$ is the effective dimension of a neural tangent kernel (NTK) matrix, and $T$ is the number of iterations. Our experiments on various benchmark datasets show that both proposed algorithms work well and consistently outperform existing approaches, especially in high dimensions.
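
For reference, the standard closed-form expected improvement that the EI technique builds on, for a Gaussian posterior with mean mu and standard deviation sigma at a candidate point and incumbent value f_best (maximisation convention):

```python
# Standard closed-form expected improvement for a Gaussian posterior:
# EI = (mu - f_best) * Phi(z) + sigma * phi(z), with z = (mu - f_best) / sigma.
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best):
    sigma = np.maximum(sigma, 1e-12)   # guard against zero variance
    z = (mu - f_best) / sigma
    return (mu - f_best) * norm.cdf(z) + sigma * norm.pdf(z)

print(expected_improvement(mu=1.2, sigma=0.5, f_best=1.0))
```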

ICLR Conference 2022 Conference Paper

Generative Pseudo-Inverse Memory

  • Kha Pham
  • Hung Le 0002
  • Man Ngo
  • Tran The Truyen
  • Bao Ho
  • Svetha Venkatesh

We propose Generative Pseudo-Inverse Memory (GPM), a class of deep generative memory models that are fast to write in and read out. Memory operations are recast as seeking robust solutions of linear systems, which naturally leads to the use of matrix pseudo-inverses. The pseudo-inverses are iteratively approximated, with a practical computational complexity of almost $O(1)$. We prove theoretically and verify empirically that our model can retrieve exactly what has been written to the memory under mild conditions. A key capability of GPM is iterative reading, during which the attractor dynamics towards fixed points are enabled, allowing the model to iteratively improve sample quality in denoising and generating. More impressively, GPM can store a large amount of data while maintaining the key abilities of accurately retrieving stored patterns, denoising corrupted data and generating novel samples. Empirically we demonstrate the efficiency and versatility of GPM on a comprehensive suite of experiments involving binarized MNIST, binarized Omniglot, FashionMNIST, CIFAR10 & CIFAR100 and CelebA.
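
For intuition on iterative pseudo-inverse approximation, the sketch below uses the classic Ben-Israel-Cohen iteration; GPM's own scheme and its near-$O(1)$ amortised cost are described in the paper, and this is only the textbook variant.

```python
# Textbook iterative pseudo-inverse (Ben-Israel-Cohen):
#   X_{k+1} = X_k (2I - A X_k),
# which converges quadratically to pinv(A) when X_0 = alpha * A^T with
# 0 < alpha < 2 / sigma_max(A)^2.
import numpy as np

def iterative_pinv(A, n_iter=30):
    alpha = 1.0 / (np.linalg.norm(A, 2) ** 2)   # satisfies the condition
    X = alpha * A.T
    for _ in range(n_iter):
        X = X @ (2 * np.eye(A.shape[0]) - A @ X)
    return X

A = np.random.randn(6, 4)
print(np.allclose(iterative_pinv(A), np.linalg.pinv(A), atol=1e-6))
```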

NeurIPS Conference 2022 Conference Paper

Human-AI Collaborative Bayesian Optimisation

  • Arun Kumar A V
  • Santu Rana
  • Alistair Shilton
  • Svetha Venkatesh

Human-AI collaboration looks at harnessing the complementary strengths of both humans and AI. We propose a new method for human-AI collaboration in Bayesian optimisation where the optimum is mainly pursued by the Bayesian optimisation algorithm following complex computation, whilst getting occasional help from the accompanying expert who has deeper knowledge of the underlying physical phenomenon. We expect experts to have some understanding of the correlation structures of the experimental system, but not the location of the optimum. The expert provides feedback by either changing the current recommendation or providing her belief on the good and bad regions of the search space based on the current observations. Our proposed method takes such feedback to build a model that aligns with the expert’s model and then uses it for optimisation. We provide theoretical underpinning on why such an approach may be more efficient than one without expert feedback. The empirical results show the robustness and superiority of our method, with promising efficiency gains.

AAMAS Conference 2022 Conference Paper

Learning Theory of Mind via Dynamic Traits Attribution

  • Dung Nguyen
  • Phuoc Nguyen
  • Hung Le
  • Kien Do
  • Svetha Venkatesh
  • Truyen Tran

Machine learning of Theory of Mind (ToM) is essential to build social agents that co-live with humans and other agents. This capacity, once acquired, will help machines infer the mental states of others from observed contextual action trajectories, enabling future prediction of goals, intentions, actions and successor representations. The underlying mechanism for such a prediction remains unclear, however. Inspired by the observation that humans often infer the character traits of others, then use them to explain behaviour, we propose a new neural ToM architecture that learns to generate a latent trait vector of an actor from past trajectories. This trait vector then multiplicatively modulates the prediction mechanism via a ‘fast weights’ scheme in the prediction neural network, which reads the current context and predicts the behaviour. We empirically show that the fast weights provide a good inductive bias to model the character traits of agents and hence improve mindreading ability. On the indirect assessment of false-belief understanding, the new ToM model enables more efficient helping behaviours.

NeurIPS Conference 2022 Conference Paper

Learning to Constrain Policy Optimization with Virtual Trust Region

  • Thai Hung Le
  • Thommen Karimpanal George
  • Majid Abdolshah
  • Dung Nguyen
  • Kien Do
  • Sunil Gupta
  • Svetha Venkatesh

We introduce a constrained optimization method for policy gradient reinforcement learning, which uses two trust regions to regulate each policy update. In addition to using the proximity of one single old policy as the first trust region as done by prior works, we propose forming a second trust region by constructing another virtual policy that represents a wide range of past policies. We then enforce the new policy to stay closer to the virtual policy, which is beneficial if the old policy performs poorly. We propose a mechanism to automatically build the virtual policy from a memory buffer of past policies, providing a new capability for dynamically selecting appropriate trust regions during the optimization process. Our proposed method, dubbed Memory-Constrained Policy Optimization (MCPO), is examined in diverse environments, including robotic locomotion control, navigation with sparse rewards and Atari games, consistently demonstrating competitive performance against recent on-policy constrained policy gradient methods.

AAMAS Conference 2022 Conference Paper

Learning to Transfer Role Assignment Across Team Sizes

  • Dung Nguyen
  • Phuoc Nguyen
  • Svetha Venkatesh
  • Truyen Tran

Multi-agent reinforcement learning holds the key to solving complex tasks that demand the coordination of learning agents. However, strong coordination often leads to expensive exploration over the exponentially large state-action space. A powerful approach is to decompose teamwork into roles, which are ideally assigned to agents with the relevant skills. Training agents to adaptively choose and play emerging roles in a team thus allows the team to scale to complex tasks and quickly adapt to changing environments. These promises, however, have not been fully realised by current role-based multi-agent reinforcement learning methods, as they assume either a pre-defined role structure or a fixed team size. We propose a framework to learn role assignment and transfer across team sizes. In particular, we train a role assignment network for small teams by demonstration and transfer the network to larger teams, which continue to learn through interaction with the environment. We demonstrate that re-using the role-based credit assignment structure can foster the learning process of larger reinforcement learning teams to achieve tasks requiring different roles. Our proposal outperforms competing techniques in enriched role-enforcing Prey-Predator games and in new scenarios in the StarCraft II Micro-Management benchmark.

NeurIPS Conference 2022 Conference Paper

Momentum Adversarial Distillation: Handling Large Distribution Shifts in Data-Free Knowledge Distillation

  • Kien Do
  • Thai Hung Le
  • Dung Nguyen
  • Dang Nguyen
  • Haripriya Harikumar
  • Truyen Tran
  • Santu Rana
  • Svetha Venkatesh

Data-free Knowledge Distillation (DFKD) has attracted attention recently thanks to its appealing capability of transferring knowledge from a teacher network to a student network without using training data. The main idea is to use a generator to synthesize data for training the student. As the generator gets updated, the distribution of synthetic data will change. Such distribution shift could be large if the generator and the student are trained adversarially, causing the student to forget the knowledge it acquired at the previous steps. To alleviate this problem, we propose a simple yet effective method called Momentum Adversarial Distillation (MAD) which maintains an exponential moving average (EMA) copy of the generator and uses synthetic samples from both the generator and the EMA generator to train the student. Since the EMA generator can be considered as an ensemble of the generator's old versions and often undergoes a smaller change in updates compared to the generator, training on its synthetic samples can help the student recall the past knowledge and prevent the student from adapting too quickly to the new updates of the generator. Our experiments on six benchmark datasets including big datasets like ImageNet and Places365 demonstrate the superior performance of MAD over competing methods for handling the large distribution shift problem. Our method also compares favorably to existing DFKD methods and even achieves state-of-the-art results in some cases.
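
The EMA-generator bookkeeping at the core of MAD can be sketched as follows; the decay value and the stand-in generator network are illustrative assumptions.

```python
# MAD-style momentum bookkeeping: after each generator update, the EMA
# copy moves a small step toward it, and the student sees samples from
# both. The decay and the toy generator are assumptions for this sketch.
import copy
import torch

@torch.no_grad()
def ema_update(ema_model, model, decay=0.999):
    for p_ema, p in zip(ema_model.parameters(), model.parameters()):
        p_ema.mul_(decay).add_(p, alpha=1.0 - decay)

generator = torch.nn.Linear(8, 8)          # stand-in generator network
ema_generator = copy.deepcopy(generator)

# ... after each adversarial update of `generator`:
ema_update(ema_generator, generator)
z = torch.randn(4, 8)
student_batch = torch.cat([generator(z), ema_generator(z)])  # train student on both
print(student_batch.shape)
```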

ICML Conference 2022 Conference Paper

Neurocoder: General-Purpose Computation Using Stored Neural Programs

  • Hung Le 0002
  • Svetha Venkatesh

Artificial Neural Networks are functionally equivalent to special-purpose computers. Their inter-neuronal connection weights represent the learnt Neural Program that instructs the networks on how to compute the data. However, without storing Neural Programs, they are restricted to only one, overwriting learnt programs when trained on new data. Here we design Neurocoder, a new class of general-purpose neural networks in which the neural network “codes” itself in a data-responsive way by composing relevant programs from a set of shareable, modular programs stored in external memory. This time, a Neural Program is efficiently treated as data in memory. Integrating Neurocoder into current neural architectures, we demonstrate new capacity to learn modular programs, reuse simple programs to build complex ones, handle pattern shifts and remember old programs as new ones are learnt, and show substantial performance improvement in solving object recognition, playing video games and continual learning tasks.

ICLR Conference 2022 Conference Paper

Offline Neural Contextual Bandits: Pessimism, Optimization and Generalization

  • Thanh Nguyen-Tang
  • Sunil Gupta 0001
  • A. Tuan Nguyen
  • Svetha Venkatesh

Offline policy learning (OPL) leverages existing data collected a priori for policy optimization without any active exploration. Despite the prevalence and recent interest in this problem, its theoretical and algorithmic foundations in function approximation settings remain under-developed. In this paper, we consider this problem on the axes of distributional shift, optimization, and generalization in offline contextual bandits with neural networks. In particular, we propose a provably efficient offline contextual bandit with neural network function approximation that does not require any functional assumption on the reward. We show that our method provably generalizes over unseen contexts under a milder condition for distributional shift than the existing OPL works. Notably, unlike any other OPL method, our method learns from the offline data in an online manner using stochastic gradient descent, allowing us to leverage the benefits of online learning into an offline setting. Moreover, we show that our method is more computationally efficient and has a better dependence on the effective dimension of the neural network than an online counterpart. Finally, we demonstrate the empirical effectiveness of our method in a range of synthetic and real-world OPL problems.

TMLR Journal 2022 Journal Article

On Sample Complexity of Offline Reinforcement Learning with Deep ReLU Networks in Besov Spaces

  • Thanh Nguyen-Tang
  • Sunil Gupta
  • Hung Tran-The
  • Svetha Venkatesh

Offline reinforcement learning (RL) leverages previously collected data for policy optimization without any further active exploration. Despite the recent interest in this problem, its theoretical results in neural network function approximation settings remain elusive. In this paper, we study the statistical theory of offline RL with deep ReLU network function approximation. In particular, we establish the sample complexity of $n = \tilde{\mathcal{O}}( H^{4 + 4 \frac{d}{\alpha}} \kappa_{\mu}^{1 + \frac{d}{\alpha}} \epsilon^{-2 - 2\frac{d}{\alpha}} )$ for offline RL with deep ReLU networks, where $\kappa_{\mu}$ is a measure of distributional shift, $H = (1-\gamma)^{-1}$ is the effective horizon length, $d$ is the dimension of the state-action space, $\alpha$ is a (possibly fractional) smoothness parameter of the underlying Markov decision process (MDP), and $\epsilon$ is a user-specified error. Notably, our sample complexity holds under two novel considerations: the Besov dynamic closure and the correlated structure. While the Besov dynamic closure subsumes the dynamic conditions for offline RL in the prior works, the correlated structure renders the prior works of offline RL with general/neural network function approximation improper or inefficient in long (effective) horizon problems. To the best of our knowledge, this is the first theoretical characterization of the sample complexity of offline RL with deep neural network function approximation under the general Besov regularity condition that goes beyond the linearity regime in the traditional Reproducing Hilbert kernel spaces and Neural Tangent Kernels.

AAAI Conference 2022 Conference Paper

TRF: Learning Kernels with Tuned Random Features

  • Alistair Shilton
  • Sunil Gupta
  • Santu Rana
  • Arun Kumar Venkatesh
  • Svetha Venkatesh

Random Fourier features (RFF) are a popular set of tools for constructing low-dimensional approximations of translation-invariant kernels, allowing kernel methods to be scaled to big data. Apart from their computational advantages, by working in the spectral domain random Fourier features expose the translation-invariant kernel as a density function that may, in principle, be manipulated directly to tune the kernel. In this paper we propose selecting the density function from a reproducing kernel Hilbert space to allow us to search the space of all translation-invariant kernels. Our approach, which we call tuned random features (TRF), achieves this by approximating the density function as the RKHS-norm regularised least-squares best fit to an unknown “true” optimal density function, resulting in an RFF formulation where kernel selection is reduced to regularised risk minimisation with a novel regulariser. We derive bounds on the Rademacher complexity for our method, showing that our random features approximation method converges to optimal kernel selection in the large $N$, $D$ limit. Finally, we present experimental results for a variety of real-world learning problems, demonstrating the performance of our approach compared to comparable methods.
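
For reference, the untuned random Fourier feature construction for the RBF kernel that TRF starts from; the spectral-density tuning itself is not shown here.

```python
# Random Fourier features for the RBF kernel: k(x, y) ≈ z(x) . z(y),
# where the frequencies W are drawn from the kernel's spectral density.
import numpy as np

def rff(X, num_features=500, lengthscale=1.0, seed=0):
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=1.0 / lengthscale, size=(d, num_features))
    b = rng.uniform(0, 2 * np.pi, size=num_features)
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

X = np.random.randn(5, 3)
Z = rff(X)
exact = np.exp(-0.5 * np.sum((X[:, None] - X[None]) ** 2, axis=-1))
print(np.max(np.abs(Z @ Z.T - exact)))  # small approximation error
```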

ICML Conference 2021 Conference Paper

A New Representation of Successor Features for Transfer across Dissimilar Environments

  • Majid Abdolshah
  • Hung Le 0002
  • Thommen George Karimpanal
  • Sunil Gupta 0001
  • Santu Rana
  • Svetha Venkatesh

Transfer in reinforcement learning is usually achieved through generalisation across tasks. Whilst many studies have investigated transferring knowledge when the reward function changes, they have assumed that the dynamics of the environments remain consistent. Many real-world RL problems require transfer among environments with different dynamics. To address this problem, we propose an approach based on successor features in which we model successor feature functions with Gaussian Processes permitting the source successor features to be treated as noisy measurements of the target successor feature function. Our theoretical analysis proves the convergence of this approach as well as the bounded error on modelling successor feature functions with Gaussian Processes in environments with both different dynamics and rewards. We demonstrate our method on benchmark datasets and show that it outperforms current baselines.

JBHI Journal 2021 Journal Article

A Spatio-Temporal Attention-Based Model for Infant Movement Assessment From Videos

  • Binh Nguyen-Thai
  • Vuong Le
  • Catherine Morgan
  • Nadia Badawi
  • Truyen Tran
  • Svetha Venkatesh

The absence or abnormality of fidgety movements of joints or limbs is strongly indicative of cerebral palsy in infants. Developing computer-based methods for assessing infant movements in videos is pivotal for improved cerebral palsy screening. Most existing methods use appearance-based features and are thus sensitive to strong but irrelevant signals caused by background clutter or a moving camera. Moreover, these features are computed over the whole frame, so they measure gross whole-body movements rather than specific joint/limb motion. Addressing these challenges, we develop and validate a new method for fidgety movement assessment from consumer-grade videos using human poses extracted from short clips. Human poses capture only relevant motion profiles of joints and limbs and are thus free from irrelevant appearance artifacts. The dynamics and coordination between joints are modeled using spatio-temporal graph convolutional networks. Frames and body parts that contain discriminative information about fidgety movements are selected through a spatio-temporal attention mechanism. We validate the proposed model on the cerebral palsy screening task using a real-life consumer-grade video dataset collected at an Australian hospital through the Cerebral Palsy Alliance, Australia. Our experiments show that the proposed method achieves an ROC-AUC score of 81.87%, significantly outperforming existing competing methods, with better interpretability.

ICML Conference 2021 Conference Paper

Bayesian Optimistic Optimisation with Exponentially Decaying Regret

  • Hung Tran-The
  • Sunil Gupta 0001
  • Santu Rana
  • Svetha Venkatesh

Bayesian optimisation (BO) is a well-known algorithm for finding the global optimum of expensive, black-box functions. The current practical BO algorithms have regret bounds ranging from $\mathcal{O}(\frac{\log N}{\sqrt{N}})$ to $\mathcal O(e^{-\sqrt{N}})$, where $N$ is the number of evaluations. This paper explores the possibility of improving the regret bound in the noise-free setting by intertwining concepts from BO and optimistic optimisation methods which are based on partitioning the search space. We propose the BOO algorithm, a first practical approach which can achieve an exponential regret bound of order $\mathcal O(N^{-\sqrt{N}})$ under the assumption that the objective function is sampled from a Gaussian process with a Matérn kernel with smoothness parameter $\nu > 4 +\frac{D}{2}$, where $D$ is the number of dimensions. We perform experiments on optimisation of various synthetic functions and machine learning hyperparameter tuning tasks and show that our algorithm outperforms baselines.

AAAI Conference 2021 Conference Paper

Distributional Reinforcement Learning via Moment Matching

  • Thanh Nguyen-Tang
  • Sunil Gupta
  • Svetha Venkatesh

We consider the problem of learning a set of probability distributions from the empirical Bellman dynamics in distributional reinforcement learning (RL), a class of state-of-the-art methods that estimate the distribution, as opposed to only the expectation, of the total return. We formulate a method that learns a finite set of statistics from each return distribution via neural networks, as in the distributional RL literature. Existing distributional RL methods, however, constrain the learned statistics to predefined functional forms of the return distribution, which is both restrictive in representation and makes the predefined statistics difficult to maintain. Instead, we learn unrestricted statistics, i.e., deterministic (pseudo-)samples, of the return distribution by leveraging a technique from hypothesis testing known as maximum mean discrepancy (MMD), which leads to a simpler objective amenable to backpropagation. Our method can be interpreted as implicitly matching all orders of moments between a return distribution and its Bellman target. We establish sufficient conditions for the contraction of the distributional Bellman operator and provide finite-sample analysis for the deterministic samples in distribution approximation. Experiments on the suite of Atari games show that our method outperforms the distributional RL baselines and sets a new record in the Atari games for non-distributed agents.
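
A minimal sketch of the moment-matching objective: the squared MMD between a set of learned deterministic pseudo-samples and Bellman-target samples under a Gaussian kernel, with bandwidth handling simplified for illustration.

```python
# Squared MMD between two pseudo-sample sets with a Gaussian kernel.
# A single fixed bandwidth is an illustrative simplification.
import numpy as np

def mmd2(x, y, bandwidth=1.0):
    """x, y: (n,) and (m,) pseudo-sample sets for two return distributions."""
    k = lambda a, b: np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * bandwidth**2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

samples = np.random.randn(30)                  # learned pseudo-samples
target = 0.99 * np.random.randn(30) + 1.0      # Bellman target samples
print(mmd2(samples, target))                   # loss to backpropagate
```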

AAAI Conference 2021 Conference Paper

High Dimensional Level Set Estimation with Bayesian Neural Network

  • Huong Ha
  • Sunil Gupta
  • Santu Rana
  • Svetha Venkatesh

Level Set Estimation (LSE) is an important problem with applications in various fields such as material design, biotechnology, and machine operational testing. Existing techniques suffer from a scalability issue: they do not work well with high-dimensional inputs. This paper proposes novel methods to solve high-dimensional LSE problems using Bayesian Neural Networks. In particular, we consider two types of LSE problems: (1) the explicit LSE problem, where the threshold level is a fixed user-specified value, and (2) the implicit LSE problem, where the threshold level is defined as a percentage of the (unknown) maximum of the objective function. For each problem, we derive the corresponding information-theoretic acquisition function to sample the data points so as to maximally increase the level set accuracy. Furthermore, we also analyse the theoretical time complexity of our proposed acquisition functions, and suggest a practical methodology to efficiently tune the network hyper-parameters to achieve high model accuracy. Numerical experiments on both synthetic and real-world datasets show that our proposed method can achieve better results compared to existing state-of-the-art approaches.

NeurIPS Conference 2021 Conference Paper

Kernel Functional Optimisation

  • Arun Kumar Anjanapura Venkatesh
  • Alistair Shilton
  • Santu Rana
  • Sunil Gupta
  • Svetha Venkatesh

Traditional methods for kernel selection rely on parametric kernel functions or a combination thereof and although the kernel hyperparameters are tuned, these methods often provide sub-optimal results due to the limitations induced by the parametric forms. In this paper, we propose a novel formulation for kernel selection using efficient Bayesian optimisation to find the best fitting non-parametric kernel. The kernel is expressed using a linear combination of functions sampled from a prior Gaussian Process (GP) defined by a hyperkernel. We also provide a mechanism to ensure the positive definiteness of the Gram matrix constructed using the resultant kernels. Our experimental results on GP regression and Support Vector Machine (SVM) classification tasks involving both synthetic functions and several real-world datasets show the superiority of our approach over the state-of-the-art.

NeurIPS Conference 2021 Conference Paper

Model-Based Episodic Memory Induces Dynamic Hybrid Controls

  • Hung Le
  • Thommen Karimpanal George
  • Majid Abdolshah
  • Truyen Tran
  • Svetha Venkatesh

Episodic control enables sample efficiency in reinforcement learning by recalling past experiences from an episodic memory. We propose a new model-based episodic memory of trajectories addressing current limitations of episodic control. Our memory estimates trajectory values, guiding the agent towards good policies. Built upon the memory, we construct a complementary learning model via a dynamic hybrid control unifying model-based, episodic and habitual learning into a single architecture. Experiments demonstrate that our model allows significantly faster and better learning than other strong reinforcement learning agents across a variety of environments including stochastic and non-Markovian settings.

AAAI Conference 2021 Conference Paper

Semi-Supervised Learning with Variational Bayesian Inference and Maximum Uncertainty Regularization

  • Kien Do
  • Truyen Tran
  • Svetha Venkatesh

We propose two generic methods for improving semi-supervised learning (SSL). The first integrates weight perturbation (WP) into existing “consistency regularization” (CR) based methods. We implement WP by leveraging variational Bayesian inference (VBI). The second method proposes a novel consistency loss called “maximum uncertainty regularization” (MUR). While most consistency losses act on perturbations in the vicinity of each data point, MUR actively searches for “virtual” points situated beyond this region that cause the most uncertain class predictions. This allows MUR to impose smoothness on a wider area in the input-output manifold. Our experiments show clear improvements in classification errors of various CR based methods when they are combined with VBI or MUR or both.

AAAI Conference 2020 Conference Paper

Bayesian Optimization for Categorical and Category-Specific Continuous Inputs

  • Dang Nguyen
  • Sunil Gupta
  • Santu Rana
  • Alistair Shilton
  • Svetha Venkatesh

Many real-world functions are defined over both categorical and category-specific continuous variables and thus cannot be optimized by traditional Bayesian optimization (BO) methods. To optimize such functions, we propose a new method that formulates the problem as a multi-armed bandit problem, wherein each category corresponds to an arm with its reward distribution centered around the optimum of the objective function in the continuous variables. Our goal is to simultaneously identify the best arm and the maximizer of the corresponding continuous function. Our algorithm uses a Thompson sampling scheme that helps connect the multi-armed bandit and BO views in a unified framework. We extend our method to batch BO to allow parallel optimization when multiple resources are available. We theoretically analyze our method for convergence and prove sub-linear regret bounds. We perform a variety of experiments: optimization of several benchmark functions, hyper-parameter tuning of a neural network, and automatic selection of the best machine learning model along with its optimal hyper-parameters (a.k.a. automated machine learning). Comparisons with other methods demonstrate the effectiveness of our proposed method.
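
A toy sketch of the bandit view described above: each category is an arm, one posterior draw of each arm's best value is taken, and the winning arm's continuous variables are optimised next. The per-category posteriors here are fabricated placeholders, not the paper's model.

```python
import numpy as np

def select_arm(posterior_draws):
    """Thompson-sampling arm choice (sketch).

    posterior_draws maps each category to one sampled value of that
    arm's best achievable objective (e.g. a draw of the GP posterior
    maximum). The winning arm's continuous variables are optimised next.
    """
    return max(posterior_draws, key=posterior_draws.get)

# Toy usage with fabricated per-category posterior draws.
rng = np.random.default_rng(0)
draws = {"relu": rng.normal(0.82, 0.05),
         "tanh": rng.normal(0.79, 0.08),
         "gelu": rng.normal(0.81, 0.03)}
arm = select_arm(draws)  # run one continuous BO step inside this category
```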

ICML Conference 2020 Conference Paper

DeepCoDA: personalized interpretability for compositional health data

  • Thomas P. Quinn
  • Dang Nguyen 0002
  • Santu Rana
  • Sunil Gupta 0001
  • Svetha Venkatesh

Interpretability allows the domain-expert to directly evaluate the model’s relevance and reliability, a practice that offers assurance and builds trust. In the healthcare setting, interpretable models should implicate relevant biological mechanisms independent of technical factors like data pre-processing. We define personalized interpretability as a measure of sample-specific feature attribution, and view it as a minimum requirement for a precision health model to justify its conclusions. Some health data, especially those generated by high-throughput sequencing experiments, have nuances that compromise precision health models and their interpretation. These data are compositional, meaning that each feature is conditionally dependent on all other features. We propose the Deep Compositional Data Analysis (DeepCoDA) framework to extend precision health modelling to high-dimensional compositional data, and to provide personalized interpretability through patient-specific weights. Our architecture maintains state-of-the-art performance across 25 real-world data sets, all while producing interpretations that are both personalized and fully coherent for compositional data.

IJCAI Conference 2020 Conference Paper

Dynamic Language Binding in Relational Visual Reasoning

  • Thao Minh Le
  • Vuong Le
  • Svetha Venkatesh
  • Truyen Tran

We present the Language-binding Object Graph Network, the first neural reasoning method with dynamic relational structures across both visual and textual domains, with applications in visual question answering. Relaxing the common assumption made by current models that object predicates pre-exist and stay static, passive to the reasoning process, we propose that these dynamic predicates expand across domain borders to include pair-wise visual-linguistic object binding. In our method, these contextualized object links are actively found within each recurrent reasoning step without relying on external predicative priors. These dynamic structures reflect the conditional dual-domain object dependency given the evolving context of the reasoning, discovered through co-attention. The resulting dynamic graphs facilitate multi-step knowledge combination and refinement that iteratively deduce a compact representation of the final answer. The model demonstrates favorable performance on major VQA datasets and outperforms other methods on sophisticated question-answering tasks in which multiple object relations are involved. The graph structure also effectively assists training, so the network learns more efficiently than other reasoning models.

ICLR Conference 2020 Conference Paper

Neural Stored-program Memory

  • Hung Le 0002
  • Tran The Truyen
  • Svetha Venkatesh

Neural networks powered with external memory simulate computer behaviors. These models, which use the memory to store data for a neural controller, can learn algorithms and other complex tasks. In this paper, we introduce a new memory to store weights for the controller, analogous to the stored-program memory in modern computer architectures. The proposed model, dubbed Neural Stored-program Memory, augments current memory-augmented neural networks, creating differentiable machines that can switch programs through time, adapt to variable contexts and thus fully resemble the Universal Turing Machine. A wide range of experiments demonstrate that the resulting machines not only excel in classical algorithmic problems, but also have potential for compositional, continual, few-shot learning and question-answering tasks.

IJCAI Conference 2020 Conference Paper

Randomised Gaussian Process Upper Confidence Bound for Bayesian Optimisation

  • Julian Berk
  • Sunil Gupta
  • Santu Rana
  • Svetha Venkatesh

In order to improve the performance of Bayesian optimisation, we develop a modified Gaussian process upper confidence bound (GP-UCB) acquisition function. This is done by sampling the exploration-exploitation trade-off parameter from a distribution. We prove that this allows the expected trade-off parameter to be altered to better suit the problem without compromising a bound on the function's Bayesian regret. We also provide results showing that our method achieves better performance than GP-UCB in a range of real-world and synthetic problems.
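
The core change is small enough to show directly. In this hedged sketch the trade-off parameter is drawn from an exponential distribution, which is an illustrative choice, not necessarily the distribution analysed in the paper.

```python
import numpy as np

def randomised_ucb(mu, sigma, rng, scale=1.0):
    """GP-UCB with a sampled trade-off parameter (sketch).

    beta is drawn afresh each iteration instead of following a fixed
    schedule; the exponential distribution is an illustrative choice.
    """
    beta = rng.exponential(scale)
    return mu + np.sqrt(beta) * sigma

# Toy usage over three candidate points with posterior mean mu and std sigma.
rng = np.random.default_rng(42)
scores = randomised_ucb(np.array([0.1, 0.4, 0.3]), np.array([0.5, 0.1, 0.3]), rng)
next_point = int(np.argmax(scores))
```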

ICML Conference 2020 Conference Paper

Self-Attentive Associative Memory

  • Hung Le 0002
  • Tran The Truyen
  • Svetha Venkatesh

Heretofore, neural networks with external memory have been restricted to a single memory with lossy representations of memory interactions. A rich representation of relationships between memory pieces calls for a high-order and segregated relational memory. In this paper, we propose to separate the storage of individual experiences (item memory) from their occurring relationships (relational memory). The idea is implemented through a novel Self-attentive Associative Memory (SAM) operator. Founded upon the outer product, SAM forms a set of associative memories that represent the hypothetical high-order relationships between arbitrary pairs of memory elements, through which a relational memory is constructed from an item memory. The two memories are wired into a single sequential model capable of both memorization and relational reasoning. We achieve competitive results with our proposed two-memory model on a diversity of machine learning tasks, from challenging synthetic problems to practical testbeds such as geometry, graph, reinforcement learning, and question answering.

NeurIPS Conference 2020 Conference Paper

Sub-linear Regret Bounds for Bayesian Optimisation in Unknown Search Spaces

  • Hung Tran-The
  • Sunil Gupta
  • Santu Rana
  • Huong Ha
  • Svetha Venkatesh

Bayesian optimisation is a popular method for efficient optimisation of expensive black-box functions. Traditionally, BO assumes that the search space is known. However, in many problems this assumption does not hold. To this end, we propose a novel BO algorithm which expands (and shifts) the search space over iterations, controlling the expansion rate through a hyperharmonic series. Further, we propose another variant of our algorithm that scales to high dimensions. We show theoretically that for both algorithms the cumulative regret grows at a sub-linear rate. Our experiments with synthetic and real-world optimisation tasks demonstrate the superiority of our algorithms over current state-of-the-art methods for Bayesian optimisation in unknown search spaces.
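
A back-of-the-envelope sketch of expansion controlled by a hyperharmonic (p-)series, where increments shrink over iterations; the exact schedule, the value of p, and the space-shifting step are simplified away and should be treated as assumptions.

```python
def search_radius(r0, t, p=1.5):
    """Search-space radius after t expansions (sketch).

    Increments follow a p-series term 1/k**p, so expansion slows over
    iterations; the paper's exact schedule and shifting mechanism are
    not reproduced here.
    """
    return r0 + sum(1.0 / k**p for k in range(1, t + 1))

print(search_radius(1.0, 10))  # grows, but with diminishing increments
```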

AAAI Conference 2020 Conference Paper

Trading Convergence Rate with Computational Budget in High Dimensional Bayesian Optimization

  • Hung Tran-The
  • Sunil Gupta
  • Santu Rana
  • Svetha Venkatesh

Scaling Bayesian optimisation (BO) to high-dimensional search spaces is an active and open research problem, particularly when no assumptions are made about function structure. The main reason is that at each iteration, BO requires a global maximisation of the acquisition function, which is itself a non-convex optimisation problem in the original search space. With growing dimensions, the computational budget for this maximisation becomes increasingly insufficient, leading to inaccurate solutions. This inaccuracy adversely affects both the convergence and the efficiency of BO. We propose a novel approach where the acquisition function only requires maximisation on a discrete set of low-dimensional subspaces embedded in the original high-dimensional search space. Unlike many recent high-dimensional BO methods, our method is free of any low-dimensional structure assumption on the function. Optimising the acquisition function in low-dimensional subspaces allows our method to obtain accurate solutions within a limited computational budget. We show that, in spite of this convenience, our algorithm remains convergent: its cumulative regret grows only sub-linearly with the number of iterations. More importantly, as evident from our regret bounds, our algorithm provides a way to trade the convergence rate against the number of subspaces used in the optimisation. Finally, when the number of subspaces is “sufficiently large”, our algorithm’s cumulative regret is at most $\mathcal{O}^*(\sqrt{T\gamma_T})$, as opposed to $\mathcal{O}^*(\sqrt{DT\gamma_T})$ for the GP-UCB of Srinivas et al. (2012), removing a crucial factor of $\sqrt{D}$, where $D$ is the dimension of the input space. We perform extensive empirical experiments, showing that our method's sample efficiency is better than existing methods on many optimisation problems involving dimensions up to 5000.
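
A hedged sketch of acquisition maximisation restricted to random low-dimensional subspaces through the incumbent; the subspace construction, bounds, and candidate search here are illustrative, not the paper's exact embedding.

```python
import numpy as np

def maximise_on_subspaces(acq, x_best, dim, n_subspaces=5, sub_dim=2,
                          n_cand=256, rng=None):
    """Maximise an acquisition function over random low-dim subspaces (sketch).

    Each subspace passes through the incumbent and varies only `sub_dim`
    randomly chosen coordinates, so each inner search is low-dimensional.
    """
    if rng is None:
        rng = np.random.default_rng()
    best_x, best_val = x_best, acq(x_best[None, :])[0]
    for _ in range(n_subspaces):
        coords = rng.choice(dim, size=sub_dim, replace=False)
        cand = np.tile(x_best, (n_cand, 1))
        cand[:, coords] = rng.uniform(-1.0, 1.0, size=(n_cand, sub_dim))
        vals = acq(cand)
        i = int(np.argmax(vals))
        if vals[i] > best_val:
            best_x, best_val = cand[i], vals[i]
    return best_x

# Toy usage: a 50-D acquisition that prefers the origin.
acq = lambda X: -np.sum(X**2, axis=1)
x_next = maximise_on_subspaces(acq, np.ones(50), dim=50, rng=np.random.default_rng(1))
```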

AAAI Conference 2019 Conference Paper

Bayesian Functional Optimisation with Shape Prior

  • Pratibha Vellanki
  • Santu Rana
  • Sunil Gupta
  • David Rubin de Celis Leal
  • Alessandra Sutti
  • Murray Height
  • Svetha Venkatesh

Real-world experiments are expensive, and thus it is important to reach a target in a minimum number of experiments. Experimental processes often involve control variables that change over time; such problems can be formulated as functional optimisation problems. We develop a novel Bayesian optimisation framework for functional optimisation of expensive black-box processes. We represent the control function using a Bernstein polynomial basis and optimise in the coefficient space. We derive the theory and practice required to dynamically adjust the degree of the polynomial, and show how prior information about shape can be integrated. We demonstrate the effectiveness of our approach on short polymer fibre design and on optimising learning-rate schedules for deep networks.
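
Representing the control function in a Bernstein basis is concrete enough to sketch; the degree and the example schedule below are illustrative assumptions.

```python
import numpy as np
from math import comb

def bernstein_curve(coeffs, t):
    """Evaluate a control function expressed in the Bernstein basis.

    coeffs: the degree-n coefficients being optimised (the BO search
    space); t: time points rescaled to [0, 1]. Monotone coefficients
    yield a monotone curve, which is one reason shape priors are easy
    to impose in this representation.
    """
    n = len(coeffs) - 1
    basis = np.array([comb(n, k) * t**k * (1 - t) ** (n - k) for k in range(n + 1)])
    return np.asarray(coeffs) @ basis

# Toy usage: a degree-3 decaying learning-rate schedule over 100 steps.
t = np.linspace(0.0, 1.0, 100)
schedule = bernstein_curve([0.1, 0.05, 0.02, 0.001], t)
```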

NeurIPS Conference 2019 Conference Paper

Bayesian Optimization with Unknown Search Space

  • Huong Ha
  • Santu Rana
  • Sunil Gupta
  • Thanh Nguyen
  • Hung Tran-The
  • Svetha Venkatesh

Applying Bayesian optimization to problems in which the search space is unknown is challenging. To address this problem, we propose a systematic volume expansion strategy for Bayesian optimization. We devise a strategy to guarantee that, in iterative expansions of the search space, our method can find a point whose function value is within epsilon of the objective function maximum. Without the need to specify any parameters, our algorithm automatically triggers the minimal required expansion at each iteration. We derive analytic expressions for when to trigger the expansion and by how much to expand. We also provide theoretical analysis showing that our method achieves epsilon-accuracy after a finite number of iterations. We demonstrate our method on both benchmark test functions and machine learning hyper-parameter tuning tasks, and show that it outperforms baselines.

ICLR Conference 2019 Conference Paper

Learning to Remember More with Less Memorization

  • Hung Le 0002
  • Tran The Truyen
  • Svetha Venkatesh

Memory-augmented neural networks consisting of a neural controller and an external memory have shown potential in long-term sequential learning. Current RAM-like memory models access memory at every timestep, and thus do not effectively leverage the short-term memory held in the controller. We hypothesize that this writing scheme is suboptimal in memory utilization and introduces redundant computation. To validate our hypothesis, we derive a theoretical bound on the amount of information stored in a RAM-like system and formulate an optimization problem that maximizes the bound. The proposed solution, dubbed Uniform Writing, is proved to be optimal under the assumption of equal timestep contributions. To relax this assumption, we introduce modifications to the original solution, resulting in a solution termed Cached Uniform Writing. This method aims to balance maximizing memorization against forgetting via overwriting mechanisms. Through an extensive set of experiments, we empirically demonstrate the advantages of our solutions over other recurrent architectures, achieving state-of-the-art results in various sequential modeling tasks.
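
A tiny sketch of the uniform-writing schedule as described, with writes spread evenly across the sequence instead of happening at every timestep; the cached variant's buffering is omitted.

```python
def write_steps(seq_len, num_writes):
    """Timesteps at which memory writes occur under uniform writing (sketch).

    Writes are spread evenly across the sequence rather than happening
    every timestep; the controller's short-term state covers the gaps.
    The cached variant would additionally buffer the skipped steps.
    """
    interval = max(1, seq_len // num_writes)
    return list(range(interval - 1, seq_len, interval))

print(write_steps(12, 3))  # -> [3, 7, 11]
```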

NeurIPS Conference 2019 Conference Paper

Multi-objective Bayesian optimisation with preferences over objectives

  • Majid Abdolshah
  • Alistair Shilton
  • Santu Rana
  • Sunil Gupta
  • Svetha Venkatesh

We present a multi-objective Bayesian optimisation algorithm that allows the user to express preference-order constraints on the objectives of the type “objective A is more important than objective B”. These preferences are defined based on the stability of the obtained solutions with respect to the preferred objective functions. Rather than attempting to find a representative subset of the complete Pareto front, our algorithm selects those Pareto-optimal points that satisfy these constraints. We formulate a new acquisition function based on expected improvement in dominated hypervolume (EHI) to ensure that the subset of the Pareto front satisfying the constraints is thoroughly explored. The hypervolume calculation is weighted by the probability of a point satisfying the constraints, estimated from a gradient Gaussian Process model. We demonstrate our algorithm on both synthetic and real-world problems.

NeurIPS Conference 2018 Conference Paper

Algorithmic Assurance: An Active Approach to Algorithmic Testing using Bayesian Optimisation

  • Shivapratap Gopakumar
  • Sunil Gupta
  • Santu Rana
  • Vu Nguyen
  • Svetha Venkatesh

We introduce algorithmic assurance, the problem of testing whether machine learning algorithms conform to their intended design goal. We address this problem by proposing an efficient framework for algorithmic testing. To provide assurance, we need to efficiently discover scenarios where an algorithm's decision deviates maximally from its intended gold standard. We mathematically formulate this task as an optimisation problem of an expensive, black-box function. We use an active learning approach based on Bayesian optimisation to solve this optimisation problem. We extend this framework to algorithms with vector-valued outputs by making appropriate modifications to Bayesian optimisation via the EXP3 algorithm. We theoretically analyse our methods for convergence. Using two real-world applications, we demonstrate the efficiency of our methods. The significance of our problem formulation and initial solutions is that they will serve as a foundation for assuring humans about machines making complex decisions.

UAI Conference 2018 Conference Paper

Multi-Target Optimisation via Bayesian Optimisation and Linear Programming

  • Alistair Shilton
  • Santu Rana
  • Sunil Gupta 0001
  • Svetha Venkatesh

In Bayesian multi-objective optimisation, expected hypervolume improvement is often used to measure the goodness of candidate solutions. However, when there are many objectives, the calculation of expected hypervolume improvement can become computationally prohibitive. An alternative approach measures the goodness of a candidate based on the distance of that candidate from the Pareto front in objective space. In this paper we present a novel distance-based Bayesian many-objective optimisation algorithm. We demonstrate the efficacy of our algorithm on three problems, namely the DTLZ2 benchmark problem, a hyper-parameter selection problem, and high-temperature creep-resistant alloy design.

NeurIPS Conference 2018 Conference Paper

Variational Memory Encoder-Decoder

  • Hung Le
  • Truyen Tran
  • Thin Nguyen
  • Svetha Venkatesh

Introducing variability while maintaining coherence is a core task in learning to generate utterances in conversation. Standard neural encoder-decoder models and their extensions using conditional variational autoencoder often result in either trivial or digressive responses. To overcome this, we explore a novel approach that injects variability into neural encoder-decoder via the use of external memory as a mixture model, namely Variational Memory Encoder-Decoder (VMED). By associating each memory read with a mode in the latent mixture distribution at each timestep, our model can capture the variability observed in sequential data such as natural conversations. We empirically compare the proposed model against other recent approaches on various conversational datasets. The results show that VMED consistently achieves significant improvement over others in both metric-based and qualitative evaluations.

JBHI Journal 2017 Journal Article

Deepr: A Convolutional Net for Medical Records

  • Phuoc Nguyen
  • Truyen Tran
  • Nilmini Wickramasinghe
  • Svetha Venkatesh

Feature engineering remains a major bottleneck when creating predictive systems from electronic medical records. At present, an important missing element is detecting predictive regular clinical motifs from irregular episodic records. We present Deepr (short for Deep record), a new end-to-end deep learning system that learns to extract features from medical records and predicts future risk automatically. Deepr transforms a record into a sequence of discrete elements separated by coded time gaps and hospital transfers. On top of the sequence is a convolutional neural net that detects and combines predictive local clinical motifs to stratify the risk. Deepr permits transparent inspection and visualization of its inner working. We validate Deepr on hospital data to predict unplanned readmission after discharge. Deepr achieves superior accuracy compared to traditional techniques, detects meaningful clinical motifs, and uncovers the underlying structure of the disease and intervention space.
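
A minimal sketch of the preprocessing the abstract describes, where a record becomes a token sequence with coded time-gap tokens before a 1-D CNN; the gap buckets and token names are invented for illustration.

```python
def record_to_sequence(episodes):
    """Flatten an episodic record into a token sequence (sketch).

    episodes: list of (days_since_previous_episode, [codes]) pairs; the
    gap buckets and token names below are invented for illustration.
    """
    tokens = []
    for gap_days, codes in episodes:
        if gap_days is not None:          # insert a coded time-gap token
            if gap_days < 30:
                tokens.append("GAP_0-1M")
            elif gap_days < 180:
                tokens.append("GAP_1-6M")
            else:
                tokens.append("GAP_6M+")
        tokens.extend(codes)
    return tokens

seq = record_to_sequence([(None, ["I50", "E11"]), (45, ["N17"]), (200, ["I50"])])
# -> ['I50', 'E11', 'GAP_1-6M', 'N17', 'GAP_6M+', 'I50'], then fed to a 1-D CNN
```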

JBHI Journal 2017 Journal Article

A Framework for Mixed-Type Multioutcome Prediction With Applications in Healthcare

  • Budhaditya Saha
  • Sunil Gupta
  • Dinh Phung
  • Svetha Venkatesh

Health analysis often involves predicting multiple outcomes of mixed type. Existing work is restricted either to a limited number of outcomes or to specific outcome types. We propose a framework for mixed-type multioutcome prediction based on a cumulative loss function composed of a type-specific loss for each outcome, for example least squares (continuous), hinge (binary), Poisson (count), and exponential (nonnegative). To model these outcomes jointly, we impose commonality across the prediction parameters through a common matrix-normal prior. The framework is formulated as iterative optimization problems and solved using an efficient block-coordinate descent method. We empirically demonstrate both scalability and convergence. We apply the proposed model to a synthetic dataset and then to two real-world cohorts: a cancer cohort and an acute myocardial infarction cohort collected over a two-year period. We predict multiple emergency-related outcomes, for example future emergency presentations (binary), emergency admissions (count), emergency length of stay in days (nonnegative), and days to next emergency admission (nonnegative). We show that the predictive performance of the proposed model is better than several state-of-the-art baselines.
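
The cumulative loss is easy to make concrete; the sketch below sums type-appropriate losses (with constants dropped from the Poisson term) and omits the matrix-normal prior that couples the parameters.

```python
import numpy as np

def squared(y, f):     return (y - f) ** 2                # continuous outcome
def hinge(y, f):       return np.maximum(0.0, 1 - y * f)  # binary, y in {-1, +1}
def poisson_nll(y, f): return np.exp(f) - y * f           # count, f = log-rate

def cumulative_loss(outcomes, preds, losses):
    """Sum of type-appropriate losses, one term per outcome."""
    return sum(loss(y, f).mean() for y, f, loss in zip(outcomes, preds, losses))

total = cumulative_loss(
    [np.array([2.0]), np.array([1.0]), np.array([3.0])],   # observed outcomes
    [np.array([1.8]), np.array([0.7]), np.array([1.0])],   # model predictions
    [squared, hinge, poisson_nll])
```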

AAAI Conference 2017 Conference Paper

Column Networks for Collective Classification

  • Trang Pham
  • Truyen Tran
  • Dinh Phung
  • Svetha Venkatesh

Relational learning deals with data that are characterized by relational structures. An important task is collective classification, which is to jointly classify networked objects. While it holds great promise of better accuracy than non-collective classifiers, collective classification is computationally challenging and has not leveraged the recent breakthroughs of deep learning. We present Column Network (CLN), a novel deep learning model for collective classification in multi-relational domains. CLN has many desirable theoretical properties: (i) it encodes multi-relations between any two instances; (ii) it is deep and compact, allowing complex functions to be approximated at the network level with a small set of free parameters; (iii) local and relational features are learned simultaneously; (iv) long-range, higher-order dependencies between instances are supported naturally; and (v) crucially, learning and inference are efficient with linear complexity in the size of the network and the number of relations. We evaluate CLN on multiple real-world applications: (a) delay prediction in software projects, (b) PubMed Diabetes publication classification and (c) film genre classification. In all of these applications, CLN demonstrates a higher accuracy than state-of-the-art rivals.

IJCAI Conference 2017 Conference Paper

High Dimensional Bayesian Optimization using Dropout

  • Cheng Li
  • Sunil Gupta
  • Santu Rana
  • Vu Nguyen
  • Svetha Venkatesh
  • Alistair Shilton

Scaling Bayesian optimization to high dimensions is a challenging task, as the global optimization of the high-dimensional acquisition function can be expensive and often infeasible. Existing methods depend either on a limited set of “active” variables or on an additive form of the objective function. We propose a new method for high-dimensional Bayesian optimization that uses a dropout strategy to optimize only a subset of variables at each iteration. We derive theoretical bounds for the regret and show how they can inform the derivation of our algorithm. We demonstrate the efficacy of our algorithms for optimization on two benchmark functions and two real-world applications: training cascade classifiers and optimizing alloy composition.
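
A sketch of one dropout-BO iteration as described: optimise the acquisition over a random coordinate subset and fill in the rest, here copied from the incumbent, which is one of several possible fill-in strategies. The sub-optimiser callback is a hypothetical stand-in.

```python
import numpy as np

def dropout_bo_candidate(optimise_acq_over, x_best, dim, active_k, rng):
    """One dropout-BO iteration (sketch).

    `optimise_acq_over(active)` is assumed to return acquisition-optimal
    values for just the `active` coordinates; the remaining coordinates
    are filled in by copying the best observed point.
    """
    active = rng.choice(dim, size=active_k, replace=False)
    x_next = x_best.copy()                        # fill-in from the incumbent
    x_next[active] = optimise_acq_over(active)    # low-dim acquisition optimum
    return x_next

# Toy usage with a stand-in sub-optimiser.
rng = np.random.default_rng(0)
x_next = dropout_bo_candidate(lambda a: np.full(len(a), 0.5),
                              np.zeros(100), dim=100, active_k=10, rng=rng)
```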

ICML Conference 2017 Conference Paper

High Dimensional Bayesian Optimization with Elastic Gaussian Process

  • Santu Rana
  • Cheng Li 0003
  • Sunil Gupta 0001
  • Vu Nguyen 0001
  • Svetha Venkatesh

Bayesian optimization is an efficient way to optimize expensive black-box functions, such as designing a new product with the highest quality or tuning the hyperparameters of a machine learning algorithm. However, it has a serious limitation when the parameter space is high-dimensional, as Bayesian optimization crucially depends on solving a global optimization of a surrogate utility function in the same number of dimensions. The surrogate utility function, commonly known as the acquisition function, is continuous but can be extremely sharp in high dimensions, having only a few peaks marooned in a large terrain of almost-flat surface. Global optimization algorithms such as DIRECT are infeasible at higher dimensions, and gradient-dependent methods cannot move if initialized in the flat terrain. We propose an algorithm that enables local gradient-dependent algorithms to move through the flat terrain by using a sequence of coarse-to-fine Gaussian process priors on the objective function. We leverage two underlying facts: (a) there exists a large enough length-scale for which the acquisition function has a significant gradient at any location in the parameter space, and (b) the extrema of consecutive acquisition functions are close, since the functions differ only by a small change in length-scale. Theoretical guarantees are provided, and experiments clearly demonstrate the utility of the proposed method in high dimensions using both benchmark test functions and real-world case studies.

NeurIPS Conference 2017 Conference Paper

Process-constrained batch Bayesian optimisation

  • Pratibha Vellanki
  • Santu Rana
  • Sunil Gupta
  • David Rubin
  • Alessandra Sutti
  • Thomas Dorin
  • Murray Height
  • Paul Sanders

Prevailing batch Bayesian optimisation methods allow all control variables to be freely altered at each iteration. Real-world experiments, however, often have physical limitations that make it time-consuming to alter all settings for each recommendation in a batch. This gives rise to a unique problem in BO: in a recommended batch, a set of variables that are expensive to change experimentally must be fixed, while the remaining control variables can be varied. We formulate this as a process-constrained batch Bayesian optimisation problem. We propose two algorithms, pc-BO(basic) and pc-BO(nested). pc-BO(basic) is simpler but lacks a convergence guarantee. In contrast, pc-BO(nested) is slightly more complex but admits convergence analysis, and we show that its regret is sublinear. We demonstrate the performance of both pc-BO(basic) and pc-BO(nested) by optimising benchmark test functions, tuning hyper-parameters of an SVM classifier, optimising the heat-treatment process for an Al-Sc alloy to achieve a target hardness, and optimising the short polymer fibre production process.

JBHI Journal 2016 Journal Article

A Framework for Classifying Online Mental Health-Related Communities With an Interest in Depression

  • Budhaditya Saha
  • Thin Nguyen
  • Dinh Phung
  • Svetha Venkatesh

Mental illness has a deep impact on individuals, families, and, by extension, society as a whole. Social networks allow individuals with mental disorders to communicate with other sufferers via online communities, providing an invaluable resource for studies on textual signs of psychological health problems. Mental disorders often occur in combination; e.g., a patient with an anxiety disorder may also develop depression. This co-occurrence of mental health conditions provides the focus for our work on classifying online communities with an interest in depression. For this, we have crawled a large body of 620,000 posts made by 80,000 users in 247 online communities. We have extracted the topics and psycholinguistic features expressed in the posts, using these as inputs to our model. Following a machine learning approach, we formulate a joint modeling framework to classify mental health-related co-occurring online communities from these features. Finally, we perform empirical validation of the model on the crawled dataset, where it outperforms recent state-of-the-art baselines.

UAI Conference 2016 Conference Paper

Scalable Nonparametric Bayesian Multilevel Clustering

  • Viet Huynh
  • Dinh Q. Phung
  • Svetha Venkatesh
  • XuanLong Nguyen
  • Matthew D. Hoffman
  • Hung Hai Bui

Multilevel clustering problems, where content and contextual information are jointly clustered, are ubiquitous in modern datasets. Existing works on this problem are limited to small datasets due to the use of the Gibbs sampler. We address the problem of scaling up multilevel clustering under a Bayesian nonparametric setting, extending the MC2 model proposed in (Nguyen et al., 2014). We ground our approach in structured mean-field and stochastic variational inference (SVI) and develop a tree-structured SVI algorithm that exploits the interplay between content and context modeling. Our new algorithm avoids the need to repeatedly pass through the corpus, as the Gibbs sampler requires. More crucially, our method is immediately amenable to parallelization, facilitating a scalable distributed implementation on the Apache Spark platform. We conduct extensive experiments in a variety of domains including text, images, and real-world user application activities. Direct comparison with the Gibbs sampler demonstrates that our method is an order of magnitude faster without loss of model quality. Our Spark-based implementation gains another order-of-magnitude speedup and can scale to large real-world datasets containing millions of documents and groups.

IJCAI Conference 2015 Conference Paper

Groupwise Registration of Aerial Images

  • Ognjen Arandjelovic
  • Duc-Son Pham
  • Svetha Venkatesh

This paper addresses the task of time-separated aerial image registration. The ability to solve this problem accurately and reliably is important for a variety of subsequent image understanding applications. The principal challenge lies in the extent and nature of transient appearance variation that a land area can undergo, such as that caused by changes in illumination conditions, seasonal variations, or occlusion by non-persistent objects (people, cars). Our work introduces several novelties: (i) unlike all previous work on aerial image registration, we approach the problem using a set-based paradigm; (ii) we show how local, pairwise constraints can be used to enforce a globally good registration using a constraints graph structure; (iii) we show how a simple holistic representation derived from raw aerial images can be used as a basic building block of the constraints graph in a manner which achieves both high registration accuracy and speed. We demonstrate: (i) that the proposed method outperforms the state-of-the-art for pair-wise registration already, achieving greater accuracy and reliability, while at the same time reducing the computational cost of the task; and (ii) that the increase in the number of available images in a set consistently reduces the average registration error.

JBHI Journal 2015 Journal Article

Stabilizing High-Dimensional Prediction Models Using Feature Graphs

  • Shivapratap Gopakumar
  • Truyen Tran
  • Tu Dinh Nguyen
  • Dinh Phung
  • Svetha Venkatesh

We investigate feature stability in the context of clinical prognosis derived from high-dimensional electronic medical records. To reduce variance in the selected predictive features, we introduce Laplacian-based regularization into a regression model. The Laplacian is derived on a feature graph that captures both the temporal and hierarchical relations between hospital events, diseases, and interventions. Using a cohort of patients with heart failure, we demonstrate better feature stability and goodness-of-fit through feature graph stabilization.
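
A closed-form sketch of Laplacian-based stabilisation on a linear model; the feature graph, the penalty weights, and the added ridge term are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

def graph_stabilised_ridge(X, y, L, alpha=1.0, beta=1.0):
    """Linear regression with a feature-graph Laplacian penalty (sketch).

    Minimises ||y - Xw||^2 + alpha ||w||^2 + beta w' L w, so features
    adjacent in the graph (e.g. related hospital events) get similar
    weights, stabilising which features are selected across resamples.
    """
    d = X.shape[1]
    A = X.T @ X + alpha * np.eye(d) + beta * L
    return np.linalg.solve(A, X.T @ y)

# Toy usage: a 3-feature chain graph 0-1-2.
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
Lap = np.diag(adj.sum(axis=1)) - adj       # unnormalised graph Laplacian
rng = np.random.default_rng(0)
w = graph_stabilised_ridge(rng.standard_normal((20, 3)), rng.standard_normal(20), Lap)
```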

AAAI Conference 2015 Conference Paper

Tensor-Variate Restricted Boltzmann Machines

  • Tu Nguyen
  • Truyen Tran
  • Dinh Phung
  • Svetha Venkatesh

Restricted Boltzmann Machines (RBMs) are an important class of latent variable models for representing vector data. An under-explored area is multimode data, where each data point is a matrix or a tensor. Standard RBMs applied to such data would require vectorizing matrices and tensors, resulting in unnecessarily high dimensionality and, at the same time, destroying the inherent higher-order interaction structures. This paper introduces Tensor-variate Restricted Boltzmann Machines (TvRBMs), which generalize RBMs to capture the multiplicative interaction between data modes and the latent variables. TvRBMs are highly compact in that the number of free parameters grows only linearly with the number of modes. We demonstrate the capacity of TvRBMs on three real-world applications: handwritten digit classification, face recognition, and EEG-based alcoholic diagnosis. The learnt features of the model are more discriminative than those of rival methods, resulting in better classification performance.

ICML Conference 2014 Conference Paper

Bayesian Nonparametric Multilevel Clustering with Group-Level Contexts

  • Vu Nguyen 0001
  • Dinh Q. Phung
  • XuanLong Nguyen
  • Svetha Venkatesh
  • Hung Hai Bui

We present a Bayesian nonparametric framework for multilevel clustering which utilizes group-level context information to simultaneously discover low-dimensional structures of the group contents and partitions groups into clusters. Using the Dirichlet process as the building block, our model constructs a product base-measure with a nested structure to accommodate content and context observations at multiple levels. The proposed model possesses properties that link the nested Dirichlet processes (nDP) and the Dirichlet process mixture models (DPM) in an interesting way: integrating out all contents results in the DPM over contexts, whereas integrating out group-specific contexts results in the nDP mixture over content variables. We provide a Polya-urn view of the model and an efficient collapsed Gibbs inference procedure. Extensive experiments on real-world datasets demonstrate the advantage of utilizing context information via our model in both text and image domains.

ICML Conference 2013 Conference Paper

Factorial Multi-Task Learning: A Bayesian Nonparametric Approach

  • Sunil Gupta 0001
  • Dinh Q. Phung
  • Svetha Venkatesh

Multi-task learning is a paradigm shown to improve the performance of related tasks through their joint learning. However, for real-world data it is usually difficult to assess task relatedness, and joint learning with unrelated tasks may lead to serious performance degradation. To this end, we propose a framework that groups tasks based on their relatedness in a low-dimensional subspace and allows a varying degree of relatedness among tasks by sharing the subspace bases across groups. This provides the flexibility of no sharing when two sets of tasks are unrelated, and partial or total sharing when the tasks are related. Importantly, the number of task groups and the subspace dimensionality are automatically inferred from the data, which keeps the model from being tied to a fixed parameterization. To realize our framework, we present a novel Bayesian nonparametric prior that extends the traditional hierarchical beta process prior using a Dirichlet process to permit a potentially infinite number of child beta processes. We apply our model to multi-task regression and classification applications. Experimental results using several synthetic and real-world datasets show the superiority of our model over other recent state-of-the-art multi-task learning methods.

ICML Conference 2013 Conference Paper

Thurstonian Boltzmann Machines: Learning from Multiple Inequalities

  • Tran The Truyen
  • Dinh Q. Phung
  • Svetha Venkatesh

We introduce Thurstonian Boltzmann Machines (TBM), a unified architecture that can naturally incorporate a wide range of data inputs at the same time. Our motivation rests in the Thurstonian view that many discrete data types can be considered as being generated from a subset of underlying latent continuous variables, and in the observation that each realisation of a discrete type imposes certain inequalities on those variables. Thus learning and inference in TBM reduce to making sense of a set of inequalities. Our proposed TBM naturally supports the following types: Gaussian, interval, censored, binary, categorical, multicategorical, ordinal, and (in)complete rank with and without ties. We demonstrate the versatility and capacity of the proposed model on three applications of very different natures, namely handwritten digit recognition, collaborative filtering, and complex social survey analysis.

AAAI Conference 2012 Conference Paper

A Sequential Decision Approach to Ordinal Preferences in Recommender Systems

  • Truyen Tran
  • Dinh Phung
  • Svetha Venkatesh

We propose a novel sequential decision approach to modeling ordinal ratings in collaborative filtering problems. The rating process is assumed to start from the lowest level, evaluates against the latent utility at the corresponding level and moves up until a suitable ordinal level is found. Crucial to this generative process is the underlying utility random variables that govern the generation of ratings and their modelling choices. To this end, we make a novel use of the generalised extreme value distributions, which is found to be particularly suitable for our modeling tasks and at the same time, facilitate our inference and learning procedure. The proposed approach is flexible to incorporate features from both the user and the item. We evaluate the proposed framework on three well-known datasets: MovieLens, Dating Agency and Netflix. In all cases, it is demonstrated that the proposed work is competitive against state-of-the-art collaborative filtering methods.

UAI Conference 2012 Conference Paper

A Slice Sampler for Restricted Hierarchical Beta Process with Applications to Shared Subspace Learning

  • Sunil Gupta 0001
  • Dinh Q. Phung
  • Svetha Venkatesh

The hierarchical beta process has found interesting applications in recent years. In this paper we present a modified hierarchical beta process prior with applications to hierarchical modeling of multiple data sources. The novel use of the prior over a hierarchical factor model allows factors to be shared across different sources. We derive a slice sampler for this model, enabling tractable inference even when the likelihood and the prior over parameters are non-conjugate. This allows the application of the model in much wider contexts without restrictions. We present two different data generative models: a linear Gaussian-Gaussian model for real-valued data and a linear Poisson-gamma model for count data. Encouraging transfer learning results are shown for two real-world applications: text modeling and content-based image retrieval.

UAI Conference 2009 Conference Paper

Ordinal Boltzmann Machines for Collaborative Filtering

  • Tran The Truyen
  • Dinh Q. Phung
  • Svetha Venkatesh

Collaborative filtering is an effective recommendation technique wherein the preference of an individual can potentially be predicted based on the preferences of other members. Early algorithms often relied on the strong locality in the preference data; that is, it is enough to predict the preference of a user on a particular item based on a small subset of other users with similar tastes or of other items with similar properties. More recently, dimensionality reduction techniques have proved to be equally competitive, and these are based on co-occurrence patterns rather than locality. This paper explores and extends a probabilistic model known as the Boltzmann Machine for collaborative filtering tasks. It seamlessly integrates both similarity and co-occurrence in a principled manner. In particular, we study parameterisation options to deal with the ordinal nature of the preferences, and propose a joint modelling of both the user-based and item-based processes. Experiments on moderate and large-scale movie recommendation show that our framework rivals existing well-known methods.

NeurIPS Conference 2008 Conference Paper

Hierarchical Semi-Markov Conditional Random Fields for Recursive Sequential Data

  • Tran Truyen
  • Dinh Phung
  • Hung Bui
  • Svetha Venkatesh

Inspired by hierarchical hidden Markov models (HHMM), we present the hierarchical semi-Markov conditional random field (HSCRF), a generalisation of embedded undirected Markov chains to model complex hierarchical, nested Markov processes. It is parameterised in a discriminative framework and has polynomial-time algorithms for learning and inference. Importantly, we develop efficient algorithms for learning and constrained inference in a partially-supervised setting, which is an important issue in practice where labels can only be obtained sparsely. We demonstrate the HSCRF in two applications: (i) recognising human activities of daily living (ADLs) from indoor surveillance cameras, and (ii) noun-phrase chunking. We show that the HSCRF is capable of learning rich hierarchical models with reasonable accuracy in both fully and partially observed data cases.

AAAI Conference 2008 Conference Paper

The Hidden Permutation Model and Location-Based Activity Recognition

  • Hung H. Bui
  • Svetha Venkatesh

Permutation modeling is challenging because of the combinatorial nature of the problem. However, such modeling is often required in many real-world applications, including activity recognition where subactivities are often permuted and partially ordered. This paper introduces a novel Hidden Permutation Model (HPM) that can learn the partial ordering constraints in permuted state sequences. The HPM is parameterized as an exponential family distribution and is flexible so that it can encode constraints via different feature functions. A chain-flipping Metropolis-Hastings Markov chain Monte Carlo (MCMC) is employed for inference to overcome the O(n!) complexity. Gradient-based maximum likelihood parameter learning is presented for two cases when the permutation is known and when it is hidden. The HPM is evaluated using both simulated and real data from a location-based activity recognition domain. Experimental results indicate that the HPM performs far better than other baseline models, including the naive Bayes classifier, the HMM classifier, and Kirshner’s multinomial permutation model. Our presented HPM is generic and can potentially be utilized in any problem where the modeling of permuted states from noisy data is needed.

IJCAI Conference 2007 Conference Paper

Face Recognition via the Overlapping Energy Histogram

  • Ronny Tjahyadi
  • Wanquan Liu
  • Senjian An
  • Svetha Venkatesh

In this paper we investigate the face recognition problem via the overlapping energy histogram of the DCT coefficients. In particular, we investigate issues important to recognition performance, such as the selection of the threshold and the number of bins. These selection methods utilise information obtained from the training dataset. Experimentation is conducted on the Yale face database, and results indicate that the proposed parameter selection methods perform well in selecting the threshold and number of bins. Furthermore, we show that the proposed overlapping energy histogram approach significantly outperforms Eigenfaces, 2DPCA, and the plain energy histogram.