Author name cluster

Michael Cohen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers

2 author rows

TMLR Journal 2023 Journal Article

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Aarohi Srivastava
Abhinav Rastogi
Abhishek Rao
Abu Awal Md Shoeb
Abubakar Abid
Adam Fisch
Adam R. Brown
Adam Santoro

Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 450 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.


TARK Conference 2021 Conference Paper

De Re Updates

Michael Cohen
Wen Tang
Yanjing Wang 0001

In this paper, we propose a lightweight yet powerful dynamic epistemic logic that captures not only the distinction between de dicto and de re knowledge but also the distinction between de dicto and de re updates. The logic is based on the dynamified version of an epistemic language extended with the assignment operator borrowed from dynamic logic, following the work of Wang and Seligman (Proc. AiML 2018). We obtain complete axiomatizations for the counterparts of public announcement logic and event-model-based DEL based on new reduction axioms taking care of the interactions between dynamics and assignments.


AAAI Conference 2020 Conference Paper

Asymptotically Unambitious Artificial General Intelligence

Michael Cohen
Badri Vellambi
Marcus Hutter

General intelligence, the ability to solve arbitrary solvable problems, is supposed by many to be artificially constructible. Narrow intelligence, the ability to solve a given particularly difficult problem, has seen impressive recent development. Notable examples include self-driving cars, Go engines, image classifiers, and translators. Artificial General Intelligence (AGI) presents dangers that narrow intelligence does not: if something smarter than us across every domain were indifferent to our concerns, it would be an existential threat to humanity, just as we threaten many species despite no ill will. Even the theory of how to maintain the alignment of an AGI’s goals with our own has proven highly elusive. We present the first algorithm we are aware of for asymptotically unambitious AGI, where “unambitiousness” includes not seeking arbitrary power. Thus, we identify an exception to the Instrumental Convergence Thesis, which is roughly that by default, an AGI would seek power, including over us.


LORI Conference 2017 Conference Paper

A Note on Belief, Question Embedding and Neg-Raising

Michael Cohen

The epistemic verb to believe does not embed polar questions, unlike the verb to know. After reviewing this phenomenon, I propose an explanation which connects the neg-raising behavior of belief with its embedding patterns (following [14]). I use dynamic epistemic logic to model the presuppositions and the effects associated with belief assertions.


LORI Conference 2015 Conference Paper

A Dynamic Epistemic Logic with a Knowability Principle

Michael Cohen

A dynamic epistemic logic is presented in which the single agent can reason about his knowledge stages before and after announcements. The logic is generated by reinterpreting multi-agent private announcements in a single-agent environment. It is shown that a knowability principle is valid for such logic: any initially true ϕ can be known after a certain number of announcements.


NeurIPS Conference 1992 Conference Paper

Context-Dependent Multiple Distribution Phonetic Modeling with MLPs

Michael Cohen
Horacio Franco
Nelson Morgan
David Rumelhart
Victor Abrash

A number of hybrid multilayer perceptron (MLP)/hidden Markov model (HMM) speech recognition systems have been developed in recent years (Morgan and Bourlard, 1990). In this paper, we present a new MLP architecture and training algorithm which allows the modeling of context-dependent phonetic classes in a hybrid MLP/HMM framework. The new training procedure smooths MLPs trained at different degrees of context dependence in order to obtain a robust estimate of the context-dependent probabilities. Tests with the DARPA Resource Management database have shown substantial advantages of the context-dependent MLPs over earlier context-independent MLPs, and have shown substantial advantages of this hybrid approach over a pure HMM approach.


NeurIPS Conference 1992 Conference Paper

Modeling Consistency in a Speaker Independent Continuous Speech Recognition System

Yochai Konig
Nelson Morgan
Chuck Wooters
Victor Abrash
Michael Cohen
Horacio Franco

We would like to incorporate speaker-dependent consistencies, such as gender, in an otherwise speaker-independent speech recognition system. In this paper we discuss a Gender Dependent Neural Network (GDNN) which can be tuned for each gender, while sharing most of the speaker-independent parameters. We use a classification network to help generate gender-dependent phonetic probabilities for a statistical (HMM) recognition system. The gender classification net predicts the gender with high accuracy, 98.3% on a Resource Management test set. However, the integration of the GDNN into our hybrid HMM-neural network recognizer provided an improvement in the recognition score that is not statistically significant on a Resource Management test set.


NeurIPS Conference 1991 Conference Paper

Connectionist Optimisation of Tied Mixture Hidden Markov Models

Steve Renals
Nelson Morgan
Hervé Bourlard
Horacio Franco
Michael Cohen

Issues relating to the estimation of hidden Markov model (HMM) local probabilities are discussed. In particular we note the isomorphism of radial basis functions (RBF) networks to tied mixture density modelling; additionally we highlight the differences between these methods arising from the different training criteria employed. We present a method in which connectionist training can be modified to resolve these differences and discuss some preliminary experiments. Finally, we discuss some outstanding problems with discriminative training.
