Author name cluster

Jonathan Cohen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

25 papers

1 author row

TMLR Journal 2022 Journal Article

A Self-Supervised Framework for Function Learning and Extrapolation

Simon Segert
Jonathan Cohen

Understanding how agents learn to generalize, and in particular to extrapolate, in high-dimensional, naturalistic environments remains a challenge for both machine learning and the study of biological agents. One approach to this has been the use of function learning paradigms, which allow agents’ empirical patterns of generalization for smooth scalar functions to be described precisely. However, to date, such work has not succeeded in identifying mechanisms that acquire the kinds of general purpose representations over which function learning can operate to exhibit the patterns of generalization observed in human empirical studies. Here, we present a framework for how a learner may acquire such representations, which then support generalization, and extrapolation in particular, in a few-shot fashion in the domain of scalar function learning. Taking inspiration from a classic theory of visual processing, we construct a self-supervised encoder that implements the basic inductive bias of invariance under topological distortions. We show the resulting representations outperform those from other models for unsupervised time series learning in several downstream function learning tasks, including extrapolation.

AAAI Conference 2020 Conference Paper

People Do Not Just Plan, They Plan to Plan

Mark Ho
David Abel
Jonathan Cohen
Michael Littman
Thomas Griffiths

Planning is useful. It lets people take actions that have desirable long-term consequences. But planning is hard. It requires thinking about consequences, which consumes limited computational and cognitive resources. Thus, people should plan their actions, but they should also be smart about how they deploy resources used for planning their actions. Put another way, people should also “plan their plans”. Here, we formulate this aspect of planning as a meta-reasoning problem and formalize it in terms of a recursive Bellman objective that incorporates both task rewards and information-theoretic planning costs. Our account makes quantitative predictions about how people should plan and meta-plan as a function of the overall structure of a task, which we test in two experiments with human participants. We find that people’s reaction times reflect a planned use of information processing, consistent with our account. This formulation of planning to plan provides new insight into the function of hierarchical planning, state abstraction, and cognitive control in both humans and machines.

RLDM Conference 2019 Conference Abstract

Evidence for a cost of cognitive control effect on foraging behavior

Laura A Bustamante
Allison Burton
Nathaniel Daw
Jonathan Cohen

Objective: Evidence suggests exerting cognitive control carries an intrinsic cost and that individual differences in subjective costs may account for differences in everyday control allocation. Previous studies have demonstrated individual differences in the subjective effort associated with engaging control but are limited in that the choices are explicit and may introduce experimenter demand characteristics, or the choice period is separated from the realization of the cognitive effort. We sought to build on this literature using a novel method to quantify individual differences in the cost of cognitive control that addresses these limitations. Methods: We designed a method for quantifying control costs using a patch foraging task in which participants (N=20) had to complete a control-demanding task (N-Back) to travel between patches. We predicted that participants would over-exploit a patch, yielding diminishing rewards, when performance of the more demanding 3-Back task vs. a 1-Back task was required to travel. We applied the Marginal Value Theorem to quantify how costly participants treated the 3-Back task based on their shift of exit threshold. Results: Most participants treated control as costly and exited later in the 3-Back condition. Control costs may be separable from error avoidance as there was no reliable correlation with N-Back task performance. Conclusions: Our results demonstrate that along with time costs, cognitive control registers as a cost in a patch foraging environment. Advantages of this design include that control costs can be measured implicitly and cost is expressed directly in terms of reward (money). Additionally, reward and cost learning are experiential, and control allocation is an immediate consequence of choice. This measure can be used to explore the extent to which control costs are experienced and utilized in decisions about control differently across individuals.
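The Marginal Value Theorem logic in the abstract above can be illustrated with a minimal sketch. It is not the authors' analysis: the exponential depletion schedule, the function name `best_harvest_count`, and all parameter values are assumptions chosen for illustration. The sketch searches for the number of harvests per patch that maximizes the long-run net reward rate, and shows that charging a cost for travel (standing in for the effort of the N-Back task) pushes the best exit point later.

```python
def best_harvest_count(initial=10.0, decay=0.9, harvest_time=1.0,
                       travel_time=5.0, travel_cost=0.0, max_harvests=100):
    """Return (n, rate): the harvest count that maximizes net reward rate.

    Hypothetical setup: each harvest yields `decay` times the previous one,
    and every patch visit also incurs `travel_time` and `travel_cost`.
    """
    best_n, best_rate = 1, float("-inf")
    total, reward = 0.0, initial
    for n in range(1, max_harvests + 1):
        total += reward
        # Net reward per unit time if the agent always leaves after n harvests.
        rate = (total - travel_cost) / (n * harvest_time + travel_time)
        if rate > best_rate:
            best_n, best_rate = n, rate
        reward *= decay
    return best_n, best_rate


n_free, _ = best_harvest_count(travel_cost=0.0)
n_costly, _ = best_harvest_count(travel_cost=20.0)
print(n_free, n_costly)
```

Under these assumed values the costly-travel agent stays in each patch for more harvests, which is the direction of the shift in exit threshold that the abstract uses to infer control costs.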

AAMAS Conference 2019 Conference Paper

Power Indices for Team Reformation Planning Under Uncertainty

Jonathan Cohen
Abdel-Illah Mouaddib

This work is an attempt at solving the problem of decentralized team formation and reformation under uncertainty with partial observability. We describe a model coined Team-POMDP, derived from the standard Dec-POMDP model, and we propose an approach based on the computation of team power indices using the Elo rating system to determine the most fitting team of agents in every situation. We couple this to a Monte-Carlo Tree Search algorithm to efficiently compute joint policies.

NeurIPS Conference 2017 Conference Paper

A graph-theoretic approach to multitasking

Noga Alon
Daniel Reichman
Igor Shinkar
Tal Wagner
Sebastian Musslick
Jonathan Cohen
Tom Griffiths
Biswadip Dey

A key feature of neural network architectures is their ability to support the simultaneous interaction among large numbers of units in the learning and processing of representations. However, how the richness of such interactions trades off against the ability of a network to simultaneously carry out multiple independent processes (a salient limitation in many domains of human cognition) remains largely unexplored. In this paper we use a graph-theoretic analysis of network architecture to address this question, where tasks are represented as edges in a bipartite graph $G=(A \cup B, E)$. We define a new measure of multitasking capacity of such networks, based on the assumptions that tasks that need to be multitasked rely on independent resources, i.e., form a matching, and that tasks can be performed without interference if they form an induced matching. Our main result is an inherent tradeoff between the multitasking capacity and the average degree of the network that holds regardless of the network architecture. These results are also extended to networks of depth greater than $2$. On the positive side, we demonstrate that networks that are random-like (e.g., locally sparse) can have desirable multitasking properties. Our results shed light on the parallel-processing limitations of neural systems and provide insights that may be useful for the analysis and design of parallel architectures.

RLDM Conference 2017 Conference Abstract

Exploring fixed-threshold and optimal policies in multi-alternative decision making

Michael Shvartsman
Vaibhav Srivastava
Jonathan Cohen

The dynamics of human and animal behavior within a perceptual decision made based on a single stationary stimulus are consistent with sequential statistical testing (e.g. Bogacz et al. 2006) instantiated as the discrete-time sequential probability ratio test (SPRT; Wald & Wolfowitz, 1948) or its continuous time analogue, the diffusion model (DDM; Ratcliff, 1978). In this simple domain, the SPRT/DDM with a fixed threshold is both reward-rate- and Bayes-optimal. However, in nonstationary or multihypothesis settings, these criteria need not be equivalent: fixed threshold policies are not optimal under either criterion, and there is no systematic framework to compute reward-rate-optimal policies (though cf. Mahadevan, 1996; Dayanik & Yu, 2013). Consequently, work on the dynamics of decisions over nonstationary stimuli or multiple choices has either explored Bayes-optimal policies by dynamic programming (e.g. Frazier & Yu, 2008; Drugowitsch et al. 2012) or used fixed threshold policies (e.g. McMillen & Holmes 2006, Norris 2009). We use our model of the dynamics of multi-stimulus decision making to explore the differences between fixed-threshold and Bayes-optimal policies in different tasks, exploiting the connections between Markov decision processes, Bayesian inference, and diffusion (e.g. Dayan & Daw, 2008) to do so. We show that even in simple tasks, predictions can depend on whether we assume the organism uses the fixed-threshold policy or the Bayes-optimal policy. Specifically, we show that different explanations for the flanker effect (Yu et al. 2009; White et al. 2011) are normative under different choices of the action set and policy space. We additionally show that the Bayes-optimal policy makes the unusual prediction that as the posterior probability of some hypotheses drops due to evidence, the decision criterion for the remaining hypotheses should rise. We speculate that the intention superiority effect in prospective memory could be evidence of such a rise.
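For readers unfamiliar with the fixed-threshold SPRT that the abstract above takes as its starting point, here is a minimal two-hypothesis sketch. It is an illustration only and not the authors' multi-alternative model: the Gaussian observation model, the function name `sprt`, and the default parameters are assumptions.

```python
def sprt(samples, mu0=-0.5, mu1=0.5, sigma=1.0, threshold=3.0):
    """Fixed-threshold sequential probability ratio test (illustrative).

    Accumulates the log-likelihood ratio of H1 (mean mu1) against H0
    (mean mu0) for Gaussian samples with known sigma. Returns
    (decision, n_used), where decision is 1, 0, or None if neither
    bound was reached before the samples ran out.
    """
    llr = 0.0
    for n, x in enumerate(samples, start=1):
        # log N(x; mu1, sigma) - log N(x; mu0, sigma)
        llr += ((x - mu0) ** 2 - (x - mu1) ** 2) / (2.0 * sigma ** 2)
        if llr >= threshold:
            return 1, n
        if llr <= -threshold:
            return 0, n
    return None, len(samples)


print(sprt([1.0, 1.0, 1.0, 1.0]))  # evidence favors H1
print(sprt([0.0, 0.0]))            # uninformative samples, no decision
```

The threshold here is constant over time, which is the policy class the abstract contrasts with Bayes-optimal policies whose criteria can change as evidence accumulates.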

RLDM Conference 2017 Conference Abstract

Learning to (mis)allocate control: maltransfer can lead to self-control failure

Laura Bustamante
Falk Lieder
Sebastian Musslick
Amitai Shenhav
Jonathan Cohen

How do people learn when and how much control to allocate to which cognitive mechanism? A satisfactory answer to this question should account not only for people’s adaptive control strategies but also for common forms of self-control failure, including the phenomenon that people sometimes engage in effortful controlled processing even when it harms performance relative to automatic alternatives. For example, a driver may focus so much of their attention on solving a complex math problem that they fail to notice the traffic ahead of them. We propose that people transfer what they have learned about the value of control in a particular situation to other situations with similar features and formally express this in a computational model. We explore whether failures of self-control may result from maltransfer in learning a rational approximation of the optimal control policy prescribed by the Expected Value of Control theory. We designed a novel color-word Stroop paradigm where reward for a task performed on an incongruent stimulus is jointly determined by the color and meaning of the word. In an initial “association phase” words and colors were reinforced for performing either color-naming (CN) or word-reading (WR). In a “transfer phase” CN was rewarded when either the word or the color was previously associated with it (SINGLE trials), but when both the word and the color were associated with CN the correct response was WR (X trials). We varied the frequency of SINGLE trials from 0% to 50% and hypothesized that participants would incorrectly transfer the control demand they experienced on SINGLE trials to X trials and consequently reduce their reward rate. Empirical data from 30 participants confirmed this hypothesis and support the conclusion that maltransfer in learning about the value of control can mislead people to overexert cognitive control even when it hurts their performance.

RLDM Conference 2017 Conference Abstract

Mechanisms of Overharvesting in Patch Foraging

Gary Kane
Aaron Bornstein
Amitai Shenhav
Robert Wilson
Nathaniel Daw
Jonathan Cohen

Serial stay-or-search decisions are ubiquitous across many domains, including decisions regarding employment, relationships, and foraging for resources or information. Studies of animal foraging, in which animals decide to harvest depleting rewards contained within a patch or to leave the patch in search of a new, full one, have revealed a consistent bias towards overharvesting, or staying in patches longer than is predicted by optimal foraging theory (the Marginal Value Theorem; MVT). Yet, the cognitive biases that lead to overharvesting are poorly understood. We attempt to determine the cognitive biases that underlie overharvesting in rats. We characterized rat foraging behavior in response to two basic manipulations in patch foraging tasks: travel time between reward sources and depletion rate of the source; and to two novel manipulations to the foraging environment: proportional changes to the size of rewards and length of delays, and placement of delays (pre- vs. post-reward). In response to the basic manipulations, rats qualitatively followed predictions of MVT, but stayed in patches longer than is predicted. In the latter two manipulations, rats deviated from predictions of MVT, exhibiting changes in behavior not predicted by MVT. We formally tested whether four separate cognitive biases (subjective costs, decreasing marginal utility for reward, discounting of future reward, and ignoring post-reward delays) could explain overharvesting in the former two manipulations and deviations from MVT in the latter two. All the biases tested explained overharvesting behavior in the former contexts, but only one bias, in which rats ignore post-reward delays, also explained deviations from MVT in the latter contexts. Our results reveal that multiple cognitive biases may contribute to overharvesting, but inaccurate estimation of post-reward delays provided the best explanation across all contexts.

RLDM Conference 2015 Conference Abstract

A Drift Diffusion Model of Proactive and Reactive Control in a Context-Dependent Two-Alternative Forced Choice Task

Olga Lositsky
Robert Wilson
Michael Shvartsman
Jonathan Cohen

Most of our everyday decisions rely crucially on context: foraging for food in the fridge may be appropriate at home, but not at someone else’s house. Yet the mechanism by which context modulates how we respond to stimuli remains a topic of intense investigation. In order to isolate such decisions experimentally, investigators have employed simple context-based decision-making tasks like the AX-Continuous Performance Test (AX-CPT). In this task, the correct response to a probe stimulus depends on a cue stimulus that appeared several seconds earlier. It has been proposed (Braver, 2007) that humans might employ two strategies to perform this task: one in which rule information is proactively maintained in working memory, and another one in which rule information is retrieved reactively at the time of probe onset. While this framework has inspired considerable investigation, it has not yet been committed to a formal model. Such a model would be valuable for testing quantitative predictions about the influence of proactive and reactive strategies on choice and reaction time behavior. To this end, we have built a drift diffusion model of behavior on the AX-CPT, in which evidence accumulation about a stimulus is modulated by context. We implemented proactive and reactive strategies as two distinct models: in the proactive variant, perception of the probe is modulated by the remembered cue; in the reactive variant, retrieval of the cue from memory is modulated by the perceived probe. Fitting these models to data shows that, counter-intuitively, behavior taken as a signature of reactive control is better fit by the proactive variant of the model, while proactive profiles of behavior are better fit by the reactive variant. We offer possible interpretations of this result, and use simulations to suggest experimental manipulations for which the two models make divergent predictions.

NeurIPS Conference 2015 Conference Paper

A Theory of Decision Making Under Dynamic Context

Michael Shvartsman
Vaibhav Srivastava
Jonathan Cohen

The dynamics of simple decisions are well understood and modeled as a class of random walk models (e.g. Laming, 1968; Ratcliff, 1978; Busemeyer and Townsend, 1993; Usher and McClelland, 2001; Bogacz et al., 2006). However, most real-life decisions include a rich and dynamically-changing influence of additional information we call context. In this work, we describe a computational theory of decision making under dynamically shifting context. We show how the model generalizes the dominant existing model of fixed-context decision making (Ratcliff, 1978) and can be built up from a weighted combination of fixed-context decisions evolving simultaneously. We also show how the model generalizes recent work on the control of attention in the Flanker task (Yu et al., 2009). Finally, we show how the model recovers qualitative data patterns in another task of longstanding psychological interest, the AX Continuous Performance Test (Servan-Schreiber et al., 1996), using the same model parameters.

RLDM Conference 2015 Conference Abstract

Directed and random exploration in realistic environments

Paul Krueger
Alexandria Oliver
Jonathan Cohen
Robert Wilson

Many everyday decisions involve a tradeoff between exploiting well-known options and exploring lesser-known options in hopes of a better outcome. Our previous work has shown that humans use at least two strategies to address this dilemma: directed exploration, driven by information-seeking, and random exploration, driven by decision noise. However, in the interest of simplicity, our task had two artificial constraints (explicit cues for previous outcomes and numeric rewards) that are often not present in real-world decisions. In the current study, we relaxed these constraints to test whether our previous findings hold true in more ecologically valid situations. Our first experiment removed cues for previous outcomes while still using numeric rewards, requiring participants to use working memory to track past outcomes. Experiment 2 went further and also presented rewards as patches of green dots instead of numbers, with more dense patches of green corresponding to higher reward. In all conditions, we replicated our previous findings, thus showing that both directed and random exploration are robust across a variety of conditions.

RLDM Conference 2015 Conference Abstract

Humans tradeoff information seeking and randomness in explore-exploit decisions

Robert Wilson
Jonathan Cohen

The explore-exploit dilemma occurs when we must choose between exploring options that yield information (potentially useful for the future) and exploiting options that yield known reward (certain to be useful right now). We have previously shown that humans use two distinct strategies for resolving this dilemma: the optimal-but-complex ‘directed exploration’ in which choices are biased towards information, and the suboptimal-but-simple ‘random exploration’ in which choice variability leads to exploration by chance. Here we ask how these two strategies interact. We find that humans exhibit a tradeoff between these two forms of exploration, with higher levels of directed exploration associated with lower random exploration and vice versa. This directed-random tradeoff is described remarkably well by a parameter-free optimal theory that accurately captures individual differences between participants, as well as adjustments by individuals in response to simple experimental manipulations. These results show that humans combine information seeking and randomness in a rational way to solve the explore-exploit dilemma.

RLDM Conference 2015 Conference Abstract

Strategies for exploration in the domain of losses

Paul Krueger
Robert Wilson
Jonathan Cohen

In everyday life, many decisions involve choosing between familiar options (exploiting) and unfamiliar options (exploring). On average, exploiting yields good results but tells you nothing new, while exploring yields information but at a cost of uncertain and often worse outcomes. In previous work we have shown that a key factor in these ‘explore-exploit’ choices is the number of future decisions that people will make, the ‘time horizon’. As this horizon gets longer, people are more likely to explore, because acquiring information is useful for making future choices. Moreover, we found that this exploration is effected with two distinct strategies: directed exploration, in which an ‘information bonus’ that grows with horizon explicitly biases subjects to explore, and random exploration, in which increasing ‘decision noise’ drives exploration by chance. However, this, as well as most other previous work on the explore-exploit dilemma, focused on decisions in the domain of gains where the goal was to maximize reward. In many real-world decisions, however, the primary objective is to minimize losses and it is well known that humans can behave differently in this domain. In this study, we compared explore-exploit behavior of human participants under conditions of gain and loss. We found that people use both directed and random exploration regardless of whether they are exploring in response to gains or losses and that there is quantitative agreement between the exploration parameters across domains. Our model also revealed an overall bias towards exploration in the domain of losses that did not change with horizon. This seems to reflect an overall bias towards the uncertain outcomes in the domain of losses. Taken together, our results show that explore-exploit decisions in the domain of losses are driven by three independent processes: a baseline bias toward the uncertain option, and directed and random exploration.
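The three processes named in the abstract above (a baseline bias, an information bonus, and decision noise) can be combined in a simple choice rule. The sketch below assumes a logistic form, which the abstract does not state; the function name `p_choose_uncertain` and its parameters are hypothetical and chosen only to show how each term would shift choice.

```python
import math


def p_choose_uncertain(reward_diff, info_bonus, bias, noise):
    """Probability of picking the less-sampled option (illustrative).

    reward_diff: estimated mean of the uncertain option minus the familiar one.
    info_bonus:  added value for information (directed exploration).
    bias:        baseline preference for the uncertain option.
    noise:       decision noise; larger values flatten choice toward 0.5
                 (random exploration). Must be positive.
    """
    if noise <= 0:
        raise ValueError("noise must be positive")
    z = (reward_diff + info_bonus + bias) / noise
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# A worse-looking uncertain option is chosen more often when noise is high.
print(p_choose_uncertain(-4.0, 0.0, 0.0, noise=1.0))
print(p_choose_uncertain(-4.0, 0.0, 0.0, noise=8.0))
```

In this parameterization a horizon-dependent `info_bonus` would capture directed exploration, a horizon-dependent `noise` would capture random exploration, and a horizon-independent `bias` would play the role of the baseline preference reported for the loss domain.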

RLDM Conference 2013 Conference Abstract

Exploration strategies in human decision making

Robert Wilson
Andra Geana
John White
Elliot Ludvig
Jonathan Cohen

The tradeoff between pursuing a known reward (exploitation) and sampling unknown, potentially better opportunities (exploration) is a fundamental challenge faced by all adaptive organisms. Theories formalize the value of exploration (gathering information) as an information bonus. However, this may be difficult to compute; a simpler alternative is to increase decision noise, driving random exploration. Relatively few studies have characterized human exploratory behavior, and most have failed to find an information bonus, suggesting it relies entirely on random exploration. However, these previous studies have either confounded reward and information or failed to account for baseline levels of ambiguity aversion and decision noise. To overcome these limitations, we conducted a sequential choice task that independently manipulated reward, information, and number of choices. Contrary to previous work, we found that humans do show an information bonus when given the opportunity to explore. In addition, we found adaptive changes in decision noise consistent with a type of random exploration that is subject to cognitive control.

RLDM Conference 2013 Conference Abstract

Reward, Risk and Ambiguity in Human Exploration: A Wheel of Fortune Task

Andra Geana
Robert Wilson
Jonathan Cohen

In realistic environments, organisms are frequently faced with multiple resource alternatives, and must balance the tradeoff between pursuing the known options (exploitation), and searching the environment for unknown opportunities (exploration). Exploration can be most beneficial in the presence of environmental uncertainty: when the range and benefits of all reward options are not fully known, exploration can lead to the discovery of new, better resources and an ultimately higher overall reward. However, uncertainty can take many forms, and it is unclear how different types of uncertainty impact people’s exploratory behaviour. We used a ‘wheel of fortune’ task to separate two well-established sources of uncertainty: risk (when outcomes are stochastic, but the probabilities of outcomes are known) and ambiguity (when the probabilities and/or the outcomes are unknown), and examine how they impact exploration. The results suggest that the presence of ambiguity in the environment drives people to explore in order to acquire more information and reduce the ambiguity. Conversely, a higher risk level in the environment increases exploration by increasing decision noise and making people less sensitive to the reward values of the available options. We examined these effects under two different decision horizons, and found that ambiguity-related, and not risk-related, exploration increases with decision horizon. These findings imply that different sources of uncertainty impact exploration differently, and may shed light on the mechanisms behind two distinguishable types of exploration that have been previously identified: random (characterized by an increase in decision noise) and directed (information-seeking) exploration.

NeurIPS Conference 2008 Conference Paper

Learning to Use Working Memory in Partially Observable Environments through Dopaminergic Reinforcement

Michael Todd
Yael Niv
Jonathan Cohen

Working memory is a central topic of cognitive neuroscience because it is critical for solving real world problems in which information from multiple temporally distant sources must be combined to generate appropriate behavior. However, an often neglected fact is that learning to use working memory effectively is itself a difficult problem. The "Gating" framework is a collection of psychological models that show how dopamine can train the basal ganglia and prefrontal cortex to form useful working memory representations in certain types of problems. We bring together gating with ideas from machine learning about using finite memory systems in more general problems. Thus we present a normative Gating model that learns, by online temporal difference methods, to use working memory to maximize discounted future rewards in general partially observable settings. The model successfully solves a benchmark working memory problem, and exhibits limitations similar to those observed in human experiments. Moreover, the model introduces a concise, normative definition of high level cognitive concepts such as working memory and cognitive control in terms of maximizing discounted future rewards.

NeurIPS Conference 2008 Conference Paper

Sequential effects: Superstition or rational behavior?

Angela Yu
Jonathan Cohen

In a variety of behavioral tasks, subjects exhibit an automatic and apparently sub-optimal sequential effect: they respond more rapidly and accurately to a stimulus if it reinforces a local pattern in stimulus history, such as a string of repetitions or alternations, compared to when it violates such a pattern. This is often the case even if the local trends arise by chance in the context of a randomized design, such that stimulus history has no predictive power. In this work, we use a normative Bayesian framework to examine the hypothesis that such idiosyncrasies may reflect the inadvertent engagement of fundamental mechanisms critical for adapting to changing statistics in the natural environment. We show that prior belief in non-stationarity can induce experimentally observed sequential effects in an otherwise Bayes-optimal algorithm. The Bayesian algorithm is shown to be well approximated by linear-exponential filtering of past observations, a feature also apparent in the behavioral data. We derive an explicit relationship between the parameters and computations of the exact Bayesian algorithm and those of the approximate linear-exponential filter. Since the latter is equivalent to a leaky-integration process, a commonly used model of neuronal dynamics underlying perceptual decision-making and trial-to-trial dependencies, our model provides a principled account of why such dynamics are useful. We also show that near-optimal tuning of the leaky-integration process is possible, using stochastic gradient descent based only on the noisy binary inputs. This is a proof of concept that not only can neurons implement near-optimal prediction based on standard neuronal dynamics, but that they can also learn to tune the processing parameters without explicitly representing probabilities.
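The leaky-integration process mentioned in the abstract above can be sketched as an exponentially weighted running estimate of a binary stimulus. This is a simplified illustration, not the paper's derived filter (which relates its coefficients to the parameters of the exact Bayesian model); the function name `leaky_integrate`, the leak rate, and the prior are assumptions.

```python
def leaky_integrate(observations, leak=0.3, prior=0.5):
    """Exponentially discounted estimate of P(next observation = 1).

    Update rule (illustrative): p <- (1 - leak) * p + leak * x.
    Returns (predictions, final_estimate), where predictions[t] is the
    estimate held just before observation t was seen.
    """
    p = prior
    predictions = []
    for x in observations:
        predictions.append(p)
        # Recent observations count more; older ones decay geometrically.
        p = (1.0 - leak) * p + leak * x
    return predictions, p


# A run of repetitions raises the predicted probability of another repetition,
# even if the sequence was in fact generated at random.
preds, final = leaky_integrate([1, 1, 1], leak=0.5)
print(preds, final)
```

Because the estimate tracks local runs, a stimulus that continues a run is better predicted than one that breaks it, which is the qualitative signature of the sequential effects described in the abstract.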

NeurIPS Conference 2005 Conference Paper

An exploration-exploitation model based on norepinepherine and dopamine activity

Samuel McClure
Mark Gilzenrat
Jonathan Cohen

We propose a model by which dopamine (DA) and norepinepherine (NE) combine to alternate behavior between relatively exploratory and exploitative modes. The model is developed for a target detection task for which there is extant single neuron recording data available from locus coeruleus (LC) NE neurons. An exploration-exploitation trade-off is elicited by regularly switching which of the two stimuli are rewarded. DA functions within the model to change synaptic weights according to a reinforcement learning algorithm. Exploration is mediated by the state of LC firing, with higher tonic and lower phasic activity producing greater response variability. The opposite state of LC function, with lower baseline firing rate and greater phasic responses, favors exploitative behavior. Changes in LC firing mode result from combined measures of response conflict and reward rate, where response conflict is monitored using models of anterior cingulate cortex (ACC). Increased long-term response conflict and decreased reward rate, which occurs following reward contingency switch, favors the higher tonic state of LC function and NE release. This increases exploration, and facilitates discovery of the new target.

YNIMG Journal 2001 Journal Article

Conflict and the evaluative functions of the anterior cingulate: Converging evidence from event-related fMRI and high density ERP

Cameron Carter
Vincent van Veen
Matthew Botvinick
Jonathan Cohen
V. Andrew Stenger

YNIMG Journal 2001 Journal Article

The role of anterior cingulate cortex in performance monitoring

Nick Yeung
Jack Gelfand
Mike Scanlon
Jonathan Cohen

YNIMG Journal 2000 Journal Article

Functional double dissociation of dorsolateral prefrontal cortex and anterior cingulate cortex in cognitive control

Angus MacDonald
Jonathan Cohen
V. Andrew Stenger
Cameron Carter

YNIMG Journal 2000 Journal Article

Special Issue

Jonathan Cohen

YNIMG Journal 1996 Journal Article

Anterior cingulate gyrus dysfunction and attentional pathology in schizophrenia

Cameron Carter
Mark Mintun
Jonathan Cohen
Thomas Nichols
Marybeth Wiseman

NeurIPS Conference 1994 Conference Paper

A Computational Model of Prefrontal Cortex Function

Todd Braver
Jonathan Cohen
David Servan-Schreiber

Accumulating data from neurophysiology and neuropsychology have suggested two information processing roles for prefrontal cortex (PFC): 1) short-term active memory; and 2) inhibition. We present a new behavioral task and a computational model which were developed in parallel. The task was developed to probe both of these prefrontal functions simultaneously, and produces a rich set of behavioral data that act as constraints on the model. The model is implemented in continuous-time, thus providing a natural framework in which to study the temporal dynamics of processing in the task. We show how the model can be used to examine the behavioral consequences of neuromodulation in PFC. Specifically, we use the model to make novel and testable predictions regarding the behavioral performance of schizophrenics, who are hypothesized to suffer from reduced dopaminergic tone in this brain area.

NeurIPS Conference 1989 Conference Paper

The Effect of Catecholamines on Performance: From Unit to System Behavior

David Servan-Schreiber
Harry Printz
Jonathan Cohen

At the level of individual neurons, catecholamine release increases the responsivity of cells to excitatory and inhibitory inputs. We present a model of catecholamine effects in a network of neural-like elements. We argue that changes in the responsivity of individual elements do not affect their ability to detect a signal and ignore noise. However, the same changes in cell responsivity in a network of such elements do improve the signal detection performance of the network as a whole. We show how this result can be used in a computer simulation of behavior to account for the effect of CNS stimulants on the signal detection performance of human subjects.