Arrow Research search

Author name cluster

Murray Campbell

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

13 papers
2 author rows

Possible papers

13

ICLR Conference 2024 Conference Paper

On the generalization capacity of neural networks during generic multimodal reasoning

  • Takuya Ito
  • Soham Dan
  • Mattia Rigotti
  • James R. Kozloski
  • Murray Campbell

The advent of the Transformer has led to the development of large language models (LLM), which appear to demonstrate human-like capabilities. To assess the generality of this class of models and a variety of other base neural network architectures to multimodal domains, we evaluated and compared their capacity for multimodal generalization. We introduce a multimodal question-answer benchmark to evaluate three specific types of out-of-distribution (OOD) generalization performance: distractor generalization (generalization in the presence of distractors), systematic compositional generalization (generalization to new task permutations), and productive compositional generalization (generalization to more complex tasks with deeper dependencies). While we found that most architectures fared poorly on most forms of generalization (e.g., RNNs and standard Transformers), models that leveraged cross-attention mechanisms between input domains, such as the Perceiver, fared better. Our positive results demonstrate that for multimodal distractor and systematic generalization, cross-attention is an important mechanism to integrate multiple sources of information. On the other hand, all architectures failed in productive generalization, suggesting fundamental limitations of existing architectures for specific types of multimodal OOD generalization. These results demonstrate the strengths and limitations of specific architectural components underlying modern neural models for multimodal reasoning. Finally, we provide Generic COG (gCOG), a configurable benchmark with several multimodal generalization splits, for future studies to explore.

NeSy Conference 2022 Conference Paper

Combining Fast and Slow Thinking for Human-like and Efficient Decisions in Constrained Environments

  • Marianna Bergamaschi Ganapini
  • Murray Campbell
  • Francesco Fabiano
  • Lior Horesh
  • Jonathan Lenchner
  • Andrea Loreggia
  • Nicholas Mattei
  • Francesca Rossi 0001

Current AI systems lack several important human capabilities, such as adaptability, generalizability, self-control, consistency, common sense, and causal reasoning. We believe that existing cognitive theories of human decision making, such as the thinking fast and slow theory, can provide insights on how to advance AI systems towards some of these capabilities. In this paper, we propose a general architecture that is based on fast/slow solvers and a metacognitive component. We then present experimental results on the behavior of an instance of this architecture, for AI systems that make decisions about navigating in a constrained environment. We show how combining the fast and slow decision modalities, which can be implemented by learning and reasoning components respectively, allows the system to evolve over time and gradually pass from slow to fast thinking with enough experience, and that this greatly helps in decision quality, resource consumption, and efficiency.
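The core arbitration idea in the abstract, slow deliberation that hands off to a fast path once enough experience accumulates, can be sketched as follows. The confidence rule (a simple visit-count threshold), the solver, and all names here are illustrative assumptions, not the paper's actual components.

```python
# Minimal sketch of metacognitive fast/slow arbitration, assuming a
# visit-count threshold as the hand-off criterion (a simplification).
class Metacognition:
    def __init__(self, min_experience=3):
        self.experience = {}            # state -> times solved slowly
        self.cache = {}                 # state -> remembered decision
        self.min_experience = min_experience

    def decide(self, state, slow_solver):
        seen = self.experience.get(state, 0)
        if seen >= self.min_experience:
            # enough experience: answer from the fast, learned path
            return self.cache[state], "fast"
        # otherwise pay for the slow, deliberate solver
        action = slow_solver(state)
        self.cache[state] = action
        self.experience[state] = seen + 1
        return action, "slow"

def slow_solver(state):
    # stand-in for an expensive planner or reasoner
    return state % 2

agent = Metacognition()
modes = [agent.decide(7, slow_solver)[1] for _ in range(5)]
```

Repeated queries on the same state start on the slow path and switch to the fast path once the experience threshold is reached.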

PRL Workshop 2021 Workshop Paper

AI Planning Annotation in Reinforcement Learning: Options and Beyond

  • Junkyu Lee
  • Michael Katz
  • Don Joven Agravante
  • Miao Liu
  • Tim Klinger
  • Murray Campbell
  • Shirin Sohrabi
  • Gerald Tesauro

AI planning and reinforcement learning (RL) both solve sequential decision-making problems, taking fundamentally different approaches. In this work, we aim to bring AI planning and RL closer by investigating the relationship between abstractions in AI planning and the options framework in RL. To this end, we propose annotating RL tasks with AI planning models, allowing us to define options based purely on the planning model. Our experimental investigation shows that these options can be quickly trained offline and can improve the sample efficiency of a reinforcement learning algorithm.
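One way to read "options defined purely on the planning model" is that an operator's preconditions induce an option's initiation set and its effects induce the termination condition. The STRIPS-style operator, the door-unlocking example, and all names below are hypothetical illustrations, not the paper's construction.

```python
# Sketch: deriving an RL option's initiation and termination conditions
# from a STRIPS-style planning operator (assumed, simplified encoding).
from dataclasses import dataclass

@dataclass(frozen=True)
class Operator:
    name: str
    preconditions: frozenset
    effects: frozenset

def option_from_operator(op):
    # initiation set: states satisfying the operator's preconditions;
    # termination: states where the desired effects already hold
    initiation = lambda state: op.preconditions <= state
    termination = lambda state: op.effects <= state
    return initiation, termination

unlock = Operator("unlock-door",
                  preconditions=frozenset({"has-key", "at-door"}),
                  effects=frozenset({"door-open"}))
can_start, should_stop = option_from_operator(unlock)
```

The option's inner policy, which the paper trains offline, is omitted here.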

AAAI Conference 2021 Conference Paper

Circles are like Ellipses, or Ellipses are like Circles? Measuring the Degree of Asymmetry of Static and Contextual Word Embeddings and the Implications to Representation Learning

  • Wei Zhang
  • Murray Campbell
  • Yang Yu
  • Sadhana Kumaravel

Human judgments of word similarity have been a popular method of evaluating the quality of word embeddings, but they fail to measure geometric properties such as asymmetry. For example, it is more natural to say "Ellipses are like Circles" than "Circles are like Ellipses". Such asymmetry has been observed in the word evocation experiment, where one word is used to recall another. These association data have been understudied as a measure of embedding quality. In this paper, we use three well-known evocation datasets for this purpose and study both static embeddings and contextual embeddings, such as BERT. To cope with the dynamic nature of BERT embeddings, we probe BERT's conditional probabilities as a language model, using a large number of Wikipedia contexts to derive a theoretically justifiable Bayesian asymmetry score. The results show that asymmetry judgments and similarity judgments disagree, and that asymmetry judgment aligns with BERT's strong performance on "extrinsic evaluations". This is the first time contextual embeddings' strength has been shown on an intrinsic evaluation, and the asymmetry judgment provides a new perspective for evaluating contextual embeddings and new insights for representation learning.
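The directionality the abstract describes can be made concrete with a toy asymmetry score: a log-ratio of the two directional conditional probabilities. The evocation counts below are invented for illustration; the paper derives its score from BERT's conditional probabilities over Wikipedia contexts, not from raw counts.

```python
# Toy log-ratio asymmetry score over hypothetical evocation counts
# (cue word -> recalled word); a sketch, not the paper's Bayesian score.
import math
from collections import Counter

evoked = Counter({("ellipse", "circle"): 8, ("circle", "ellipse"): 3,
                  ("ellipse", "oval"): 2, ("circle", "ring"): 5})

def cond_prob(target, cue):
    # P(target | cue): fraction of the cue's evocations that hit the target
    total = sum(c for (a, _), c in evoked.items() if a == cue)
    return evoked[(cue, target)] / total

def asymmetry(a, b):
    # positive when "a evokes b" is stronger than "b evokes a"
    return math.log(cond_prob(b, a)) - math.log(cond_prob(a, b))
```

By construction the score is antisymmetric, so "ellipse evokes circle" and "circle evokes ellipse" receive scores of opposite sign.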

IJCAI Conference 2021 Conference Paper

Mental Models of AI Agents in a Cooperative Game Setting (Extended Abstract)

  • Katy Ilonka Gero
  • Zahra Ashktorab
  • Casey Dugan
  • Qian Pan
  • James Johnson
  • Werner Geyer
  • Maria Ruiz
  • Sarah Miller

As more and more forms of AI become prevalent, it becomes increasingly important to understand how people develop mental models of these systems. In this work we study people's mental models of an AI agent in a cooperative word guessing game. We run a study in which people play the game with an AI agent while "thinking out loud"; through thematic analysis we identify features of the mental models developed by participants. In a large-scale study we have participants play the game with the AI agent online and use a post-game survey to probe their mental model. We find that those who win more often have better estimates of the AI agent's abilities. We present three components (global knowledge, local knowledge, and knowledge distribution) for modeling AI systems and propose that understanding the underlying technology is insufficient for developing appropriate conceptual models; analysis of behavior is also necessary.

AAAI Conference 2021 Conference Paper

Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines

  • Keerthiram Murugesan
  • Mattia Atzeni
  • Pavan Kapanipathi
  • Pushkar Shukla
  • Sadhana Kumaravel
  • Gerald Tesauro
  • Kartik Talamadupula
  • Mrinmaya Sachan

Text-based games have emerged as an important test-bed for Reinforcement Learning (RL) research, requiring RL agents to combine grounded language understanding with sequential decision making. In this paper, we examine the problem of infusing RL agents with commonsense knowledge. Such knowledge would allow agents to efficiently act in the world by pruning out implausible actions, and to perform lookahead planning to determine how current actions might affect future world states. We design a new text-based gaming environment called TextWorld Commonsense (TWC) for training and evaluating RL agents with a specific kind of commonsense knowledge about objects, their attributes, and affordances. We also introduce several baseline RL agents which track the sequential context and dynamically retrieve the relevant commonsense knowledge from ConceptNet. We show that agents which incorporate commonsense knowledge in TWC perform better, while acting more efficiently. We conduct user studies to estimate human performance on TWC and show that there is ample room for future improvement.
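The "pruning out implausible actions" idea can be sketched with a ConceptNet-style triple store used as a plausibility filter. The triples, the action format, and the relation name below are invented placeholders; the paper's agents retrieve knowledge dynamically from ConceptNet itself.

```python
# Toy commonsense filter: keep only actions whose (object, AtLocation,
# target) triple appears in an assumed ConceptNet-style knowledge set.
TRIPLES = {("apple", "AtLocation", "fridge"),
           ("dirty sock", "AtLocation", "washing machine")}

def plausible(action):
    verb, obj, target = action
    return (obj, "AtLocation", target) in TRIPLES

candidates = [("put", "apple", "fridge"),
              ("put", "apple", "washing machine"),
              ("put", "dirty sock", "washing machine")]
pruned = [a for a in candidates if plausible(a)]
```

Only the two commonsense-consistent actions survive the filter, shrinking the action space the agent must explore.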

AAAI Conference 2019 Conference Paper

Hybrid Reinforcement Learning with Expert State Sequences

  • Xiaoxiao Guo
  • Shiyu Chang
  • Mo Yu
  • Gerald Tesauro
  • Murray Campbell

Existing imitation learning approaches often require that the complete demonstration data, including sequences of actions and states, are available. In this paper, we consider a more realistic and difficult scenario where a reinforcement learning agent only has access to the state sequences of an expert, while the expert actions are unobserved. We propose a novel tensor-based model to infer the unobserved actions of the expert state sequences. The policy of the agent is then optimized via a hybrid objective combining reinforcement learning and imitation learning. We evaluated our hybrid approach on an illustrative domain and Atari games. The empirical results show that (1) the agents are able to leverage state expert sequences to learn faster than pure reinforcement learning baselines, (2) our tensor-based action inference model is advantageous compared to standard deep neural networks in inferring expert actions, and (3) the hybrid policy optimization objective is robust against noise in expert state sequences.
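The first stage of the pipeline, recovering unobserved expert actions from consecutive states, can be illustrated on a toy grid. The paper learns a tensor-based inference model; this deterministic lookup is only a stand-in for that stage, and the grid world is an assumption.

```python
# Sketch of inverse-dynamics action inference on an assumed (x, y) grid:
# each state difference maps to exactly one action.
MOVES = {(1, 0): "right", (-1, 0): "left", (0, 1): "up", (0, -1): "down"}

def infer_actions(states):
    """Recover unobserved actions from consecutive expert (x, y) states."""
    return [MOVES[(x2 - x1, y2 - y1)]
            for (x1, y1), (x2, y2) in zip(states, states[1:])]

demo = [(0, 0), (1, 0), (1, 1), (2, 1)]
```

The agent would then optimize the paper's hybrid objective, a weighted combination of environment return and an imitation term over these inferred actions.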

AAAI Conference 2019 Conference Paper

Learning to Teach in Cooperative Multiagent Reinforcement Learning

  • Shayegan Omidshafiei
  • Dong-Ki Kim
  • Miao Liu
  • Gerald Tesauro
  • Matthew Riemer
  • Christopher Amato
  • Murray Campbell
  • Jonathan P. How

Collective human knowledge has clearly benefited from the fact that innovations by individuals are taught to others through communication. Similar to human social groups, agents in distributed learning systems would likely benefit from communication to share knowledge and teach skills. The problem of teaching to improve agent learning has been investigated by prior works, but these approaches make assumptions that prevent application of teaching to general multiagent problems, or require domain expertise for problems they can apply to. This learning to teach problem has inherent complexities related to measuring long-term impacts of teaching that compound the standard multiagent coordination challenges. In contrast to existing works, this paper presents the first general framework and algorithm for intelligent agents to learn to teach in a multiagent environment. Our algorithm, Learning to Coordinate and Teach Reinforcement (LeCTR), addresses peer-to-peer teaching in cooperative multiagent reinforcement learning. Each agent in our approach learns both when and what to advise, then uses the received advice to improve local learning. Importantly, these roles are not fixed; these agents learn to assume the role of student and/or teacher at the appropriate moments, requesting and providing advice in order to improve teamwide performance and learning. Empirical comparisons against state-of-the-art teaching methods show that our teaching agents not only learn significantly faster, but also learn to coordinate in tasks where existing methods fail.

IJCAI Conference 2019 Conference Paper

Teaching AI Agents Ethical Values Using Reinforcement Learning and Policy Orchestration

  • Ritesh Noothigattu
  • Djallel Bouneffouf
  • Nicholas Mattei
  • Rachita Chandra
  • Piyush Madan
  • Kush R. Varshney
  • Murray Campbell
  • Moninder Singh

Autonomous cyber-physical agents play an increasingly large role in our lives. To ensure that they behave in ways aligned with the values of society, we must develop techniques that allow these agents to not only maximize their reward in an environment, but also to learn and follow the implicit constraints of society. We detail a novel approach that uses inverse reinforcement learning to learn a set of unspecified constraints from demonstrations and reinforcement learning to learn to maximize environmental rewards. A contextual bandit-based orchestrator then picks between the two policies: constraint-based and environment reward-based. The contextual bandit orchestrator allows the agent to mix policies in novel ways, taking the best actions from either a reward-maximizing or constrained policy. In addition, the orchestrator is transparent on which policy is being employed at each time step. We test our algorithms using Pac-Man and show that the agent is able to learn to act optimally, act within the demonstrated constraints, and mix these two functions in complex ways.
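The orchestration step, a contextual bandit picking between the constrained policy and the reward-maximizing policy, can be sketched as follows. The contexts, the fixed per-context rewards, and the epsilon-greedy rule are illustrative assumptions, not the paper's Pac-Man setup.

```python
# Epsilon-greedy contextual bandit choosing between two stand-in policies;
# contexts and rewards are invented for illustration.
import random
random.seed(0)

def constrained_policy(ctx):
    return 1.0 if ctx == "near_ghost" else 0.2   # follows demonstrated constraints

def reward_policy(ctx):
    return 0.2 if ctx == "near_ghost" else 1.0   # chases environment reward

policies = (constrained_policy, reward_policy)
contexts = ("near_ghost", "clear")
values = {(c, a): 0.0 for c in contexts for a in (0, 1)}
counts = {k: 0 for k in values}

def pick_arm(ctx, eps=0.1):
    # explore with probability eps, otherwise exploit the better arm
    if random.random() < eps:
        return random.randrange(2)
    return max((0, 1), key=lambda a: values[(ctx, a)])

for _ in range(2000):
    ctx = random.choice(contexts)
    arm = pick_arm(ctx)
    reward = policies[arm](ctx)
    counts[(ctx, arm)] += 1
    values[(ctx, arm)] += (reward - values[(ctx, arm)]) / counts[(ctx, arm)]
```

After training, the bandit has learned to defer to the constrained policy near a ghost and to the reward policy otherwise, and its choice at each step remains inspectable, matching the transparency property the abstract highlights.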

ICML Conference 2017 Conference Paper

Local-to-Global Bayesian Network Structure Learning

  • Tian Gao
  • Kshitij P. Fadnis
  • Murray Campbell

We introduce a new local-to-global structure learning algorithm, called graph growing structure learning (GGSL), to learn Bayesian network (BN) structures. GGSL starts at a (random) node and then gradually expands the learned structure through a series of local learning steps. At each local learning step, the proposed algorithm only needs to revisit a subset of the learned nodes, consisting of the local neighborhood of a target, and therefore improves on both memory and time efficiency compared to traditional global structure learning approaches. GGSL also improves on the existing local-to-global learning approaches by removing the need for conflict-resolving AND-rules, and achieves better learning accuracy. We provide theoretical analysis for the local learning step, and show that GGSL outperforms existing algorithms on benchmark datasets. Overall, GGSL demonstrates a novel direction to scale up BN structure learning while limiting accuracy loss.
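The growth loop the abstract describes can be shown as a structural skeleton. Here `neighbors` plays the role of the statistical local learning step (neighborhood discovery from data); the real algorithm performs that step with conditional-independence machinery, not a lookup table, and edge orientation is omitted.

```python
# Skeleton of local-to-global growth: start at one node and expand the
# learned structure via local steps; `neighbors` is an assumed oracle.
def ggsl(neighbors, start):
    learned, edges, frontier = {start}, set(), [start]
    while frontier:
        target = frontier.pop()
        # local step: only the target's neighborhood is revisited
        for nb in neighbors[target]:
            edges.add(frozenset((target, nb)))
            if nb not in learned:
                learned.add(nb)
                frontier.append(nb)
    return edges

skeleton = ggsl({"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}, "A")
```

Because each step touches only one node's neighborhood, memory and time stay local, which is the efficiency argument the abstract makes against global approaches.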

AAAI Conference 1983 Conference Paper

A Chess Program That Chunks

  • Murray Campbell

CHUNKER is a chess program that uses chunked knowledge to achieve success. Its domain is a subset of king and pawn endings in chess that has been studied for over 300 years. CHUNKER has a large library of chunk instances, where each chunk type has a property list and each instance has a set of values for these properties. This allows CHUNKER to reason about positions that come up in the search that would otherwise have to be handled by means of additional search. Thus the program is able to solve the most difficult problem of its present domain (a problem that would require 45 ply of search and on the order of 10^13 years of CPU time to be solved by the best of present-day chess programs) in 18 ply and one minute of CPU time. Further, CHUNKER is undoubtedly the world's foremost expert in its domain; it has discovered 2 mistakes in the literature and has been instrumental in discovering a new theorem about the domain that allows the assessing of positions with a new degree of ease and confidence. In this paper we describe CHUNKER's structure and performance, and discuss our plans for extending it to play the whole domain of king and pawn endings.
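The chunk-library organization the abstract describes, types with property lists, instances with values, and lookup replacing search, can be sketched as a table. The chunk type, its properties, and the position keys below are invented placeholders; real CHUNKER chunks encode king-and-pawn configurations.

```python
# Sketch of a chunk library: type -> property list, instance -> values;
# positions matching a known instance are assessed by lookup, not search.
CHUNK_TYPES = {"kp-vs-k": ("result", "moves-to-win")}

CHUNK_LIBRARY = {
    ("kp-vs-k", "Kd6 Pe5 / kd8"): {"result": "win", "moves-to-win": 12},
    ("kp-vs-k", "Kf3 Ph4 / kh5"): {"result": "draw", "moves-to-win": None},
}

def assess(chunk_type, position):
    """Answer from chunked knowledge when possible, else fall back to search."""
    props = CHUNK_LIBRARY.get((chunk_type, position))
    if props is not None:
        return props["result"], "lookup"
    return None, "search"   # a real program would launch a game-tree search
```

The speedup the abstract reports comes from exactly this substitution: a single library hit stands in for many plies of search.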