Arrow Research search

Author name cluster

Anthony Cohn

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers
1 author row

Possible papers

8

AAAI Conference 2025 Conference Paper

Language-Models-as-a-Service: Overview of a New Paradigm and its Challenges

  • Emanuele La Malfa
  • Aleksandar Petrov
  • Simon Frieder
  • Christoph Weinhuber
  • Ryan Burnell
  • Raza Nazar
  • Anthony Cohn
  • Nigel Shadbolt

Some of the most powerful language models currently are proprietary systems, accessible only via (typically restrictive) web or software programming interfaces. This is the Language-Models-as-a-Service (LMaaS) paradigm. In contrast with scenarios where full model access is available, as in the case of open-source models, such closed-off language models present specific challenges for evaluating, benchmarking, and testing them. This paper has two goals: on the one hand, we delineate how the aforementioned challenges act as impediments to the accessibility, reproducibility, reliability, and trustworthiness of LMaaS. We systematically examine the issues that arise from a lack of information about language models for each of these four aspects. We conduct a detailed analysis of existing solutions, put forth a number of recommendations, and highlight directions for future advancements. On the other hand, it serves as a synthesized overview of the licences and capabilities of the most popular LMaaS.

JAIR Journal 2024 Journal Article

Language-Models-as-a-Service: Overview of a New Paradigm and its Challenges

  • Emanuele La Malfa
  • Aleksandar Petrov
  • Simon Frieder
  • Christoph Weinhuber
  • Ryan Burnell
  • Raza Nazar
  • Anthony Cohn
  • Nigel Shadbolt

Some of the most powerful language models currently are proprietary systems, accessible only via (typically restrictive) web or software programming interfaces. This is the Language-Models-as-a-Service (LMaaS) paradigm. In contrast with scenarios where full model access is available, as in the case of open-source models, such closed-off language models present specific challenges for evaluating, benchmarking, and testing them. This paper has two goals: on the one hand, we delineate how the aforementioned challenges act as impediments to the accessibility, reproducibility, reliability, and trustworthiness of LMaaS. We systematically examine the issues that arise from a lack of information about language models for each of these four aspects. We conduct a detailed analysis of existing solutions, put forth a number of recommendations, and highlight directions for future advancements. On the other hand, it serves as a synthesized overview of the licences and capabilities of the most popular LMaaS.

AAAI Conference 2022 Conference Paper

Towards Explainable Action Recognition by Salient Qualitative Spatial Object Relation Chains

  • Hua Hua
  • Dongxu Li
  • Ruiqi Li
  • Peng Zhang
  • Jochen Renz
  • Anthony Cohn

In order to be trusted by humans, Artificial Intelligence agents should be able to describe rationales behind their decisions. One such application is human action recognition in critical or sensitive scenarios, where trustworthy and explainable action recognizers are expected. For example, reliable pedestrian action recognition is essential for self-driving cars and explanations for real-time decision making are critical for investigations if an accident happens. In this regard, learning-based approaches, despite their popularity and accuracy, are disadvantageous due to their limited interpretability. This paper presents a novel neuro-symbolic approach that recognizes actions from videos with human-understandable explanations. Specifically, we first propose to represent videos symbolically by qualitative spatial relations between objects called qualitative spatial object relation chains. We further develop a neural saliency estimator to capture the correlation between such object relation chains and the occurrence of actions. Given an unseen video, this neural saliency estimator is able to tell which object relation chains are more important for the action recognized. We evaluate our approach on two real-life video datasets, with respect to recognition accuracy and the quality of generated action explanations. Experiments show that our approach achieves superior performance on both aspects to previous symbolic approaches, thus facilitating trustworthy intelligent decision making. Our approach can be used to augment state-of-the-art learning approaches with explainability.
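A toy illustration of the representational idea in this abstract (not the paper's actual formalism or relation set): two object tracks are abstracted frame by frame into a qualitative spatial relation, and consecutive repeats are compressed into a relation chain. The relation names and the one-dimensional setting here are hypothetical simplifications.

```python
def qualitative_relation(xa, xb, eps=1.0):
    """Abstract two 1-D positions into a qualitative relation (toy example)."""
    if xa < xb - eps:
        return "left-of"
    if xa > xb + eps:
        return "right-of"
    return "aligned"

def relation_chain(track_a, track_b):
    """Compress per-frame relations into a chain of distinct relations."""
    rels = [qualitative_relation(a, b) for a, b in zip(track_a, track_b)]
    chain = [rels[0]]
    for r in rels[1:]:
        if r != chain[-1]:
            chain.append(r)
    return chain

# Object A crosses from the left of a stationary object B to its right:
print(relation_chain([0, 2, 5, 8, 10], [5, 5, 5, 5, 5]))
# -> ['left-of', 'aligned', 'right-of']
```

Chains like this give a discrete, human-readable summary of how two objects moved relative to each other, which is what a saliency estimator could then weigh per action class.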

AAAI Conference 2017 Conference Paper

Latent Dirichlet Allocation for Unsupervised Activity Analysis on an Autonomous Mobile Robot

  • Paul Duckworth
  • Muhannad Alomari
  • James Charles
  • David Hogg
  • Anthony Cohn

For autonomous robots to collaborate on joint tasks with humans they require a shared understanding of an observed scene. We present a method for unsupervised learning of common human movements and activities on an autonomous mobile robot, which generalises and improves on recent results. Our framework encodes multiple qualitative abstractions of RGBD video from human observations and does not require external temporal segmentation. Analogously to information retrieval in text corpora, each human detection is modelled as a random mixture of latent topics. A generative probabilistic technique is used to recover topic distributions over an auto-generated vocabulary of discrete, qualitative spatio-temporal code words. We show that the emergent categories align well with human activities as interpreted by a human. This is a particularly challenging task on a mobile robot due to the varying camera viewpoints which lead to incomplete, partial and occluded human detections.
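The core modelling step described in this abstract, treating each human detection as a mixture of latent topics over discrete code words, can be sketched with an off-the-shelf LDA implementation. This is a minimal illustration with random toy counts, not the authors' pipeline; the real system builds its vocabulary from qualitative spatio-temporal abstractions of RGBD video.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

# Toy corpus: each row is one human detection, each column the count of
# one discrete qualitative spatio-temporal code word (vocabulary size 6).
rng = np.random.default_rng(0)
counts = rng.integers(0, 5, size=(20, 6))

# Recover 3 latent "activity" topics, analogously to topics in text corpora.
lda = LatentDirichletAllocation(n_components=3, random_state=0)
doc_topics = lda.fit_transform(counts)

# Each row is that detection's distribution over the latent activity topics.
print(doc_topics.shape)  # (20, 3)
```

In the paper's setting, the emergent topics are then compared against human-labelled activity categories.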

AAAI Conference 2017 Conference Paper

Natural Language Acquisition and Grounding for Embodied Robotic Systems

  • Muhannad Alomari
  • Paul Duckworth
  • David Hogg
  • Anthony Cohn

We present a cognitively plausible novel framework capable of learning the grounding in visual semantics and the grammar of natural language commands given to a robot in a table top environment. The input to the system consists of video clips of a manually controlled robot arm, paired with natural language commands describing the action. No prior knowledge is assumed about the meaning of words, or the structure of the language, except that there are different classes of words (corresponding to observable actions, spatial relations, and objects and their observable properties). The learning process automatically clusters the continuous perceptual spaces into concepts corresponding to linguistic input. A novel relational graph representation is used to build connections between language and vision. As well as the grounding of language to perception, the system also induces a set of probabilistic grammar rules. The knowledge learned is used to parse new commands involving previously unseen objects.

AAAI Conference 2013 Conference Paper

An Effective Approach for Imbalanced Classification: Unevenly Balanced Bagging

  • Guohua Liang
  • Anthony Cohn

Learning from imbalanced data is an important problem in data mining research. Much research has addressed the problem of imbalanced data by using sampling methods to generate an equally balanced training set to improve the performance of the prediction models, but it is unclear what ratio of class distribution is best for training a prediction model. Bagging is one of the most popular and effective ensemble learning methods for improving the performance of prediction models; however, there is a major drawback on extremely imbalanced data-sets. It is unclear under which conditions bagging is outperformed by other sampling schemes in terms of imbalanced classification. These issues motivate us to propose a novel approach, unevenly balanced bagging (UBagging), to boost the performance of the prediction model for imbalanced binary classification. Our experimental results demonstrate that UBagging is effective and statistically significantly superior to single learner decision trees J48 (SingleJ48), bagging, and equally balanced bagging (BBagging) on 32 imbalanced data-sets.
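A rough sketch of the idea named in this abstract: where standard balanced bagging draws 1:1 bootstraps, unevenly balanced bagging draws each bootstrap with a chosen non-equal class ratio and lets the ensemble vote. The `minority_frac` parameter and the fixed ratio used here are hypothetical simplifications; the paper's actual ratio scheme is not reproduced.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import resample

def ubagging_fit(X, y, n_estimators=11, minority_frac=0.4, seed=0):
    """Train trees on bootstraps with an uneven class ratio (a sketch)."""
    X_min, X_maj = X[y == 1], X[y == 0]
    size = 2 * len(X_min)                      # total bootstrap size
    n_min = int(round(size * minority_frac))   # uneven, not 50/50
    models = []
    for i in range(n_estimators):
        bs_min = resample(X_min, n_samples=n_min, random_state=seed + i)
        bs_maj = resample(X_maj, n_samples=size - n_min, random_state=seed + i)
        Xb = np.vstack([bs_min, bs_maj])
        yb = np.hstack([np.ones(n_min), np.zeros(size - n_min)])
        models.append(DecisionTreeClassifier(random_state=0).fit(Xb, yb))
    return models

def ubagging_predict(models, X):
    """Majority vote over the ensemble."""
    votes = np.mean([m.predict(X) for m in models], axis=0)
    return (votes >= 0.5).astype(int)
```

The contrast with equally balanced bagging (BBagging) is only the value of `minority_frac`: 0.5 recovers the balanced scheme.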

AAAI Conference 2010 Conference Paper

Unsupervised Learning of Event Classes from Video

  • Muralikrishna Sridhar
  • Anthony Cohn
  • David Hogg

We present a method for unsupervised learning of event classes from videos in which multiple actions might occur simultaneously. It is assumed that all such activities are produced from an underlying set of event class generators. The learning task is then to recover this generative process from visual data. A set of event classes is derived from the most likely decomposition of the tracks into a set of labelled events involving subsets of interacting tracks. Interactions between subsets of tracks are modelled as a relational graph structure that captures qualitative spatio-temporal relationships between these tracks. The posterior probability of candidate solutions favours decompositions in which events of the same class have a similar relational structure, together with other measures of well-formedness. A Markov Chain Monte Carlo (MCMC) procedure is used to efficiently search for the MAP solution. This search moves between possible decompositions of the tracks into sets of unlabelled events and at each move adds a close to optimal labelling (for this decomposition) using spectral clustering. Experiments on real data show that the discovered event classes are often semantically meaningful and correspond well with ground-truth event classes assigned by hand.