Arrow Research search

Author name cluster

Frederick Callaway

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

10 papers
2 author rows

Possible papers

10

RLDM Conference 2019 Conference Abstract

A cognitive tutor for helping people overcome present bias

  • Falk Lieder
  • Frederick Callaway
  • Yash Raj Jain
  • Paul M Krueger
  • Priyam Das
  • Sayan Gul
  • Thomas Griffiths

People’s reliance on suboptimal heuristics gives rise to a plethora of cognitive biases in decision-making including the present bias, which denotes people’s tendency to be overly swayed by an action’s immediate costs/benefits rather than its more important long-term consequences. One approach to helping people overcome such biases is to teach them better decision strategies. But which strategies should we teach them? And how can we teach them effectively? Here, we leverage an automatic method for discovering rational heuristics and insights into how people acquire cognitive skills to develop an intelligent tutor that teaches people how to make better decisions. As a proof of concept, we derive the optimal planning strategy for a simple model of situations where people fall prey to the present bias. Our cognitive tutor teaches people this optimal planning strategy by giving them metacognitive feedback on how they plan in a 3-step sequential decision-making task. Our tutor’s feedback is designed to maximally accelerate people’s metacognitive reinforcement learning towards the optimal planning strategy. A series of four experiments confirmed that training with the cognitive tutor significantly reduced present bias and improved people’s decision-making competency: Experiment 1 demonstrated that the cognitive tutor’s feedback can help participants discover far-sighted planning strategies. Experiment 2 found that this training effect transfers to more complex environments. Experiment 3 found that these transfer effects are retained for at least 24 hours after the training. Finally, Experiment 4 found that practicing with the cognitive tutor can have additional benefits over being told the strategy in words. The results suggest that promoting metacognitive reinforcement learning with optimal feedback is a promising approach to improving the human mind.

RLDM Conference 2019 Conference Abstract

Attention in value-based choice as optimal sequential sampling

  • Frederick Callaway
  • Thomas Griffiths

When faced with a decision between several options, people rarely fully consider every alternative. Instead, we direct our attention to the most promising candidates, focusing our limited cognitive resources on evaluating the options that we are most likely to choose. A growing body of empirical work has shown that attention plays an important role in human decision making, but it is still unclear how people choose which option to attend to at each moment in the decision making process. In this paper, we present an analysis of how a rational decision maker should allocate her attention. We cast attention allocation in decision making as a sequential sampling problem, in which the decision maker iteratively selects from which distribution to sample in order to update her beliefs about the values of the available alternatives. By approximating the optimal solution to this problem, we derive a model in which both the selection and integration of evidence are rational. This model predicts choices and reaction times, as well as sequences of visual fixations. Applying the model to a ternary-choice dataset, we find that its predictions align well with human data.
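The sequential-sampling framing above can be sketched in a few lines. This is a minimal illustration with my own simplifications, not the authors' model: option values are tracked as Gaussian beliefs, each step the agent attends to one option, draws a noisy sample, and does a conjugate Bayesian update. The attention rule and stopping threshold are illustrative stand-ins for the approximately optimal policy the paper derives.

```python
import random

def sample_and_choose(true_values, noise_sd=1.0, threshold=2.0,
                      max_steps=200, seed=0):
    """Sequential sampling sketch: attend, sample, update, until one
    option's posterior mean clearly dominates. Parameters are assumptions."""
    rng = random.Random(seed)
    n = len(true_values)
    mean = [0.0] * n           # posterior means of the option values
    prec = [1.0] * n           # posterior precisions (1 / variance)
    obs_prec = 1.0 / noise_sd ** 2
    for step in range(max_steps):
        # heuristic attention rule: favor options that are promising
        # and still uncertain
        i = max(range(n), key=lambda j: mean[j] + 1.0 / prec[j])
        obs = true_values[i] + rng.gauss(0.0, noise_sd)
        # conjugate Gaussian update of the attended option's belief
        mean[i] = (prec[i] * mean[i] + obs_prec * obs) / (prec[i] + obs_prec)
        prec[i] += obs_prec
        best, runner_up = sorted(mean, reverse=True)[:2]
        if best - runner_up > threshold:   # confident enough: stop sampling
            break
    return max(range(n), key=lambda j: mean[j]), step + 1
```

Because sampling stops once the leading posterior mean clearly dominates, the same mechanism yields both choices and "reaction times" (number of samples), mirroring the quantities the model is fit to.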

RLDM Conference 2019 Conference Abstract

Compositional subgoal representations

  • Carlos G Correa
  • Frederick Callaway
  • Thomas Griffiths

When faced with a complex problem, people naturally break it up into several simpler problems. This hierarchical decomposition of an ultimate goal into sub-goals facilitates planning by reducing the number of factors that must be considered at one time. However, it can also lead to suboptimal decision-making, obscuring opportunities to make progress towards multiple subgoals with a single action. Is it possible to take advantage of the hierarchical structure of problems without sacrificing opportunities to kill two birds with one stone? We propose that people are able to do this by representing and pursuing multiple subgoals at once. We present a formal model of planning with compositional goals, and show that it explains human behavior better than the standard “one-at-a-time” subgoal model as well as non-hierarchical limited-depth search models. Our results suggest that people are capable of representing and pursuing multiple subgoals at once; however, there are limitations on how many subgoals one can pursue concurrently. We find that these limitations vary by individual.

RLDM Conference 2019 Conference Abstract

Measuring how people learn how to plan

  • Yash Raj Jain
  • Frederick Callaway
  • Falk Lieder

The human mind has an unparalleled ability to acquire complex cognitive skills, discover new strategies, and refine its ways of thinking and decision-making; these phenomena are collectively known as cognitive plasticity. One important manifestation of cognitive plasticity is learning to make better – more far-sighted – decisions via planning. A serious obstacle to studying how people learn how to plan is that cognitive plasticity is even more difficult to observe than cognitive strategies are. To address this problem, we develop a computational microscope for measuring cognitive plasticity and validate it on simulated and empirical data. Our approach employs a process tracing paradigm recording signatures of human planning and how they change over time. We then invert a generative model of the recorded changes to infer the underlying cognitive plasticity. Our computational microscope measures cognitive plasticity significantly more accurately than simpler approaches, and it correctly detected the effect of an external manipulation known to promote cognitive plasticity. We illustrate how computational microscopes can be used to gain new insights into the time course of metacognitive learning and to test theories of cognitive development and hypotheses about the nature of cognitive plasticity. Future work will leverage our computational microscope to reverse-engineer the learning mechanisms enabling people to acquire complex cognitive skills such as planning and problem solving.

RLDM Conference 2019 Conference Abstract

Optimal nudging

  • Mathew Hardy
  • Frederick Callaway

People often face decisions where errors are costly but computing the optimal choice is intractable or prohibitively difficult. To address this, researchers have developed nudge theory as a way to lead people to better options without imposing restrictions on their freedom of choice. While heuristics and case-by-case evaluations are usually used to predict and explain nudges’ effects on choice, another way of interpreting these effects is that nudges can change the costs of attaining certain pieces of information. These changes in costs then bias people towards or away from making particular choices. In this paper, we propose a method for predicting the effects of choice architecture on option selection by modeling deliberation as a metalevel Markov decision process and nudging as the reduction of certain computational costs. This allows us to construct optimal nudges by choosing cost modifications to maximize some objective function. This approach is flexible and can be adapted to arbitrary decision making problems. Furthermore, by making the objectives of nudging explicit, the approach can address ethical concerns regarding the effects of nudging and the role people should have in choosing how, when, and why they are nudged. We demonstrate the strength of this framework by applying it to the Mouselab paradigm, where deliberation costs are made explicit. We find that a version of our approach leads to significantly higher participant reward, both increasing the quality of their choices and lowering the cost of making these choices.

RLDM Conference 2019 Conference Abstract

Remediating Cognitive Decline with Cognitive Tutors

  • Priyam Das
  • Frederick Callaway
  • Thomas Griffiths
  • Falk Lieder

As people age, their cognitive abilities tend to deteriorate, including their ability to make complex plans. To remediate this cognitive decline, many commercial brain training programs target basic cognitive capacities, such as working memory. We have recently developed an alternative approach: intelligent tutors that teach people cognitive strategies for making the best possible use of their limited cognitive resources. Here, we apply this approach to improve older adults’ planning skills. In a process-tracing experiment we found that the decline in planning performance may be partly because older adults use less effective planning strategies. We also found that, with practice, both older and younger adults learned more effective planning strategies from experience. But despite these gains there was still room for improvement – especially for older people. In a second experiment, we let older and younger adults train their planning skills with an intelligent cognitive tutor that teaches optimal planning strategies via metacognitive feedback. We found that practicing planning with this intelligent tutor allowed older adults to catch up to their younger counterparts. These findings suggest that intelligent tutors that teach clever cognitive strategies can help aging decision-makers stay sharp.

UAI Conference 2018 Conference Paper

Learning to select computations

  • Frederick Callaway
  • Sayan Gul
  • Paul M. Krueger
  • Thomas L. Griffiths 0001
  • Falk Lieder

The efficient use of limited computational resources is an essential ingredient of intelligence. Selecting computations optimally according to rational metareasoning would achieve this, but this is computationally intractable. Inspired by psychology and neuroscience, we propose the first concrete and domain-general learning algorithm for approximating the optimal selection of computations: Bayesian metalevel policy search (BMPS). We derive this general, sample-efficient search algorithm for a computation-selecting metalevel policy based on the insight that the value of information lies between the myopic value of information and the value of perfect information. We evaluate BMPS on three increasingly difficult metareasoning problems: when to terminate computation, how to allocate computation between competing options, and planning. Across all three domains, BMPS achieved near-optimal performance and compared favorably to previously proposed metareasoning heuristics. Finally, we demonstrate the practical utility of BMPS in an emergency management scenario, even accounting for the overhead of metareasoning.
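The bracketing that BMPS builds on — the value of information of a computation lies between the myopic value of information and the value of perfect information — can be illustrated for Gaussian beliefs. The toy setup below (one uncertain option competing against a known alternative worth 0) is my own sketch, not the paper's implementation.

```python
import math

def gauss_pdf(x):
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def gauss_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_gain(mu, s):
    """E[max(X, 0)] - max(mu, 0) for X ~ N(mu, s^2): how much the decision
    maker expects to gain from learning X before choosing between the
    uncertain option and a known alternative worth 0."""
    if s == 0.0:
        return 0.0
    z = mu / s
    return mu * gauss_cdf(z) + s * gauss_pdf(z) - max(mu, 0.0)

def voi_bounds(mu, sigma, tau):
    """Myopic VOI of one noisy sample (noise sd tau) and the value of
    perfect information, for a belief N(mu, sigma^2) about one option."""
    # sd of the posterior mean after one sample (conjugate Gaussian update)
    s_myopic = sigma ** 2 / math.sqrt(sigma ** 2 + tau ** 2)
    return expected_gain(mu, s_myopic), expected_gain(mu, sigma)
```

One sample shifts the posterior mean less than perfect information would, so the myopic VOI is always the smaller of the two; a policy whose value estimate interpolates between these bounds is the kind of object the BMPS search is over.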

RLDM Conference 2017 Conference Abstract

A reward shaping method for promoting metacognitive learning

  • Falk Lieder
  • Paul Krueger
  • Frederick Callaway
  • Tom Griffiths

The human mind has an impressive ability to improve itself based on experience, but this potential for cognitive growth is rarely fully realized. Cognitive training programs seek to tap into this unrealized potential but their theoretical foundation is incomplete and the scientific findings on their effectiveness are mixed. Recent work suggests that mechanisms by which people learn to think and decide better can be understood in terms of metacognitive reinforcement learning. This perspective allows us to translate the theory of reward shaping developed in machine learning into a computational method for designing feedback structures for effective cognitive training. Concretely, our method applies the shaping theorem for accelerating model-free reinforcement learning to an MDP formulation of a meta-decision problem whose actions are computations that update the decision-maker’s probabilistic beliefs about the returns of alternative courses of action. As a proof of concept, we show that our method can be applied to accelerate learning to plan in an environment similar to a grid world where every location contained a reward. To measure and give feedback on people’s planning process, each reward was initially occluded and had to be revealed by clicking on the corresponding location. We found that participants in the feedback condition learned faster to deliberate more and consequently reaped higher rewards and identified the optimal sequence of moves more frequently. These findings inspire optimism that meta-level reward shaping might provide a principled theoretical foundation for cognitive training and enable more effective interventions for improving the human mind by giving feedback that is optimized for promoting metacognitive reinforcement learning.
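The shaping theorem invoked above guarantees that adding F(s, a, s') = γΦ(s') − Φ(s) to the rewards leaves the optimal policy unchanged. A minimal sketch, assuming a toy 3-state chain MDP and potentials of my own choosing (not the paper's meta-decision problem):

```python
GAMMA = 0.9
# transitions[state][action] = (next_state, reward); state 2 is terminal
transitions = {
    0: {"left": (0, 0.0), "right": (1, -1.0)},
    1: {"left": (0, 0.0), "right": (2, 10.0)},
}
# potential function: rough "progress toward the goal" (0 at the terminal)
phi = {0: 0.0, 1: 5.0, 2: 0.0}

def q(s, a, V, shaped):
    """Action value with or without the shaping bonus F = gamma*phi' - phi."""
    s2, r = transitions[s][a]
    f = GAMMA * phi[s2] - phi[s] if shaped else 0.0
    return r + f + GAMMA * V[s2]

def solve(shaped, sweeps=100):
    """Value iteration; returns the greedy policy."""
    V = {0: 0.0, 1: 0.0, 2: 0.0}
    for _ in range(sweeps):
        for s in transitions:
            V[s] = max(q(s, a, V, shaped) for a in transitions[s])
    return {s: max(transitions[s], key=lambda a: q(s, a, V, shaped))
            for s in transitions}
```

Running `solve(False)` and `solve(True)` yields the same policy: the shaped rewards speed up learning in the model-free setting without changing what is optimal, which is exactly the property the feedback design relies on.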

RLDM Conference 2017 Conference Abstract

Helping people choose subgoals with sparse pseudo rewards

  • Frederick Callaway
  • Falk Lieder
  • Tom Griffiths

Many decisions require planning multiple steps into the future, but optimal planning is computationally intractable. One way people cope with this problem is by setting subgoals, suggesting that we can help people make better decisions by helping them identify good subgoals. Here, we evaluate the benefits and perils of highlighting potential subgoals with pseudo-rewards. We first show that sparse pseudo-rewards based on the value function of a Markov decision process (MDP) lead a limited-depth planner to follow the optimal policy in the MDP. We then demonstrate the effectiveness of these pseudo-rewards in an online experiment. Each of 84 participants solved 40 sequential decision-making problems. In control trials, participants only saw the state-transition diagram and the reward structure. In experimental trials, participants additionally saw pseudo-rewards equal to the value (sum of future rewards) for the states 1, 2, or 3 steps ahead of the current state. When the participant reached one of those states, the experiment would again reveal the values of the states located 1, 2, or 3 steps ahead of the current state. We found that showing participants the value of proximal states induced goal-directed planning and improved their average score per second. This benefit was largest when the incentives were 1 or 2 steps away and decreased as they were moved farther into the future. Although these pseudo-rewards were beneficial overall, they also caused systematic errors: Participants sometimes neglected the costs and rewards along the paths to potential subgoals, leading them to make “unwarranted sacrifices” in the pursuit of the most valuable highlighted states. Overall, our results suggest that highlighting valuable future states with pseudo-rewards can help people make better decisions. More research is needed to understand what constitutes optimal subgoals and how to better assist people in selecting them.
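The formal claim at the start of the abstract — pseudo-rewards equal to the MDP's value function let a shallow planner act optimally — can be sketched concretely. The toy MDP below is my own illustration (not the experiment's environment): a myopic planner is misled by immediate rewards, but a depth-1 planner that also sees value-function pseudo-rewards picks the optimal action.

```python
GAMMA = 1.0
# step[state][action] = (next_state, reward); state 3 is terminal.
# Action "a" at state 0 looks costly now but pays off later.
step = {
    0: {"a": (1, -3.0), "b": (2, 0.0)},
    1: {"a": (3, 10.0)},
    2: {"a": (3, 1.0)},
}

def values(sweeps=10):
    """Value function of the MDP via value iteration."""
    V = {s: 0.0 for s in (0, 1, 2, 3)}
    for _ in range(sweeps):
        for s in step:
            V[s] = max(r + GAMMA * V[s2] for s2, r in step[s].values())
    return V

def depth1_plan(s, pseudo):
    """Depth-1 planner: maximize immediate reward plus the pseudo-reward
    shown at the successor state (0 everywhere for the myopic baseline)."""
    return max(step[s], key=lambda a: step[s][a][1] + pseudo[step[s][a][0]])
```

With zero pseudo-rewards the depth-1 planner takes the greedy action "b" at state 0 (immediate reward 0); with pseudo-rewards equal to `values()` it takes "a" (−3 now, but −3 + V(1) = 7), which is the optimal policy.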

RLDM Conference 2017 Conference Abstract

Mouselab-MDP: A new paradigm for tracing how people plan

  • Frederick Callaway
  • Falk Lieder
  • Paul Krueger

Planning is a latent cognitive process that cannot be observed directly. This makes it difficult to study how people plan. To address this problem, we propose a new paradigm for studying planning that provides experimenters with a timecourse of participant attention to information in the task environment. This paradigm employs the information-acquisition mechanism of the Mouselab paradigm, in which participants click on options to reveal the outcome of choosing those options. However, in contrast to the original Mouselab paradigm, our paradigm is a sequential decision process, in which participants must plan multiple steps ahead to achieve high scores. We release Mouselab-MDP open-source as a plugin for the jsPsych online psychology experiment library. The plugin displays a Markov decision process as a directed graph, which the participant navigates to maximize reward. To trace the process of planning, the rewards associated with states or actions are initially occluded; the participant has to click on a transition to reveal its reward. Thus, the participant makes explicit the states she considers in her information gathering behavior. We illustrate the utility of the Mouselab-MDP paradigm with a proof-of-concept experiment in which we trace the temporal dynamics of planning in a simple environment. Our data shed new light on people’s approximate planning strategies and on how people prune decision trees. We hope that the release of Mouselab-MDP will facilitate future research on human planning strategies. In particular, we hope that the fine-grained time course data the paradigm generates will be instrumental in specifying algorithms, tracking learning trajectories, and characterizing individual differences in human planning.