Arrow Research search

Author name cluster

Thomas Griffiths

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

38 papers
1 author row

Possible papers (38)

AAAI Conference 2020 Conference Paper

People Do Not Just Plan, They Plan to Plan

  • Mark Ho
  • David Abel
  • Jonathan Cohen
  • Michael Littman
  • Thomas Griffiths

Planning is useful. It lets people take actions that have desirable long-term consequences. But, planning is hard. It requires thinking about consequences, which consumes limited computational and cognitive resources. Thus, people should plan their actions, but they should also be smart about how they deploy resources used for planning their actions. Put another way, people should also “plan their plans”. Here, we formulate this aspect of planning as a meta-reasoning problem and formalize it in terms of a recursive Bellman objective that incorporates both task rewards and information-theoretic planning costs. Our account makes quantitative predictions about how people should plan and meta-plan as a function of the overall structure of a task, which we test in two experiments with human participants. We find that people’s reaction times reflect a planned use of information processing, consistent with our account. This formulation of planning to plan provides new insight into the function of hierarchical planning, state abstraction, and cognitive control in both humans and machines.

RLDM Conference 2019 Conference Abstract

A cognitive tutor for helping people overcome present bias

  • Falk Lieder
  • Frederick Callaway
  • Yash Raj Jain
  • Paul M Krueger
  • Priyam Das
  • Sayan Gul
  • Thomas Griffiths

People’s reliance on suboptimal heuristics gives rise to a plethora of cognitive biases in decision-making including the present bias, which denotes people’s tendency to be overly swayed by an action’s immediate costs/benefits rather than its more important long-term consequences. One approach to helping people overcome such biases is to teach them better decision strategies. But which strategies should we teach them? And how can we teach them effectively? Here, we leverage an automatic method for discovering rational heuristics and insights into how people acquire cognitive skills to develop an intelligent tutor that teaches people how to make better decisions. As a proof of concept, we derive the optimal planning strategy for a simple model of situations where people fall prey to the present bias. Our cognitive tutor teaches people this optimal planning strategy by giving them metacognitive feedback on how they plan in a 3-step sequential decision-making task. Our tutor’s feedback is designed to maximally accelerate people’s metacognitive reinforcement learning towards the optimal planning strategy. A series of four experiments confirmed that training with the cognitive tutor significantly reduced present bias and improved people’s decision-making competency: Experiment 1 demonstrated that the cognitive tutor’s feedback can help participants discover far-sighted planning strategies. Experiment 2 found that this training effect transfers to more complex environments. Experiment 3 found that these transfer effects are retained for at least 24 hours after the training. Finally, Experiment 4 found that practicing with the cognitive tutor can have additional benefits over being told the strategy in words. The results suggest that promoting metacognitive reinforcement learning with optimal feedback is a promising approach to improving the human mind.

RLDM Conference 2019 Conference Abstract

Attention in value-based choice as optimal sequential sampling

  • Frederick Callaway
  • Thomas Griffiths

When faced with a decision between several options, people rarely fully consider every alternative. Instead, we direct our attention to the most promising candidates, focusing our limited cognitive resources on evaluating the options that we are most likely to choose. A growing body of empirical work has shown that attention plays an important role in human decision making, but it is still unclear how people choose which option to attend to at each moment in the decision making process. In this paper, we present an analysis of how a rational decision maker should allocate her attention. We cast attention allocation in decision making as a sequential sampling problem, in which the decision maker iteratively selects from which distribution to sample in order to update her beliefs about the values of the available alternatives. By approximating the optimal solution to this problem, we derive a model in which both the selection and integration of evidence are rational. This model predicts choices and reaction times, as well as sequences of visual fixations. Applying the model to a ternary-choice dataset, we find that its predictions align well with human data.

RLDM Conference 2019 Conference Abstract

Compositional subgoal representations

  • Carlos G Correa
  • Frederick Callaway
  • Thomas Griffiths

When faced with a complex problem, people naturally break it up into several simpler problems. This hierarchical decomposition of an ultimate goal into subgoals facilitates planning by reducing the number of factors that must be considered at one time. However, it can also lead to suboptimal decision-making, obscuring opportunities to make progress towards multiple subgoals with a single action. Is it possible to take advantage of the hierarchical structure of problems without sacrificing opportunities to kill two birds with one stone? We propose that people are able to do this by representing and pursuing multiple subgoals at once. We present a formal model of planning with compositional goals, and show that it explains human behavior better than the standard “one-at-a-time” subgoal model as well as non-hierarchical limited-depth search models. Our results suggest that people are capable of representing and pursuing multiple subgoals at once; however, there are limitations on how many subgoals one can pursue concurrently. We find that these limitations vary by individual.

AAAI Conference 2017 Conference Paper

When Does Bounded-Optimal Metareasoning Favor Few Cognitive Systems?

  • Smitha Milli
  • Falk Lieder
  • Thomas Griffiths

While optimal metareasoning is notoriously intractable, humans are nonetheless able to adaptively allocate their computational resources. A possible approximation that humans may use to do this is to only metareason over a finite set of cognitive systems that perform variable amounts of computation. The highly influential “dual-process” accounts of human cognition, which postulate the coexistence of a slow accurate system with a fast error-prone system, can be seen as a special case of this approximation. This raises two questions: how many cognitive systems should a bounded optimal agent be equipped with and what characteristics should those systems have? We investigate these questions in two settings: a one-shot decision between two alternatives, and planning under uncertainty in a Markov decision process. We find that the optimal number of systems depends on the variability of the environment and the costliness of metareasoning. Consistent with dual-process theories, we also find that when having two systems is optimal, then the first system is fast but error-prone and the second system is slow but accurate.

RLDM Conference 2015 Conference Abstract

Model-based strategy selection learning

  • Falk Lieder
  • Thomas Griffiths

Humans possess a repertoire of decision strategies. This raises the question of how we decide how to decide. Behavioral experiments suggest that the answer includes metacognitive reinforcement learning: rewards reinforce not only our behavior but also the cognitive processes that lead to it. Previous theories of strategy selection, namely SSL and RELACS, assumed that model-free reinforcement learning identifies the cognitive strategy that works best on average across all problems in the environment. Here we explore the alternative: model-based reinforcement learning about how the differential effectiveness of cognitive strategies depends on the features of individual problems. Our theory posits that people learn a predictive model of each strategy’s accuracy and execution time and choose strategies according to their predicted speed-accuracy tradeoff for the problem to be solved. We evaluate our theory against previous accounts by fitting published data on multi-attribute decision making, conducting a novel experiment, and demonstrating that our theory can account for people’s adaptive flexibility in risky choice. We find that while SSL and RELACS are sufficient to explain people’s ability to adapt to a homogeneous environment in which all decision problems are of the same type, model-based strategy selection learning can also explain people’s ability to adapt to heterogeneous environments and flexibly switch to a different decision strategy when the situation changes.

NeurIPS Conference 2012 Conference Paper

Human memory search as a random walk in a semantic network

  • Joseph Austerweil
  • Joshua Abbott
  • Thomas Griffiths

The human mind has a remarkable ability to store a vast amount of information in memory, and an even more remarkable ability to retrieve these experiences when needed. Understanding the representations and algorithms that underlie human memory search could potentially be useful in other information retrieval settings, including internet search. Psychological studies have revealed clear regularities in how people search their memory, with clusters of semantically related items tending to be retrieved together. These findings have recently been taken as evidence that human memory search is similar to animals foraging for food in patchy environments, with people making a rational decision to switch away from a cluster of related information as it becomes depleted. We demonstrate that the results that were taken as evidence for this account also emerge from a random walk on a semantic network, much like the random web surfer model used in internet search engines. This offers a simpler and more unified account of how people search their memory, postulating a single process rather than one process for exploring a cluster and one process for switching between clusters.
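The random-walk account lends itself to a compact sketch. The toy semantic network below is invented for illustration (it is not the network used in the paper): an unweighted walk that reports each node on its first visit tends to emit semantically clustered runs without any explicit cluster-switching process.

```python
import random

# Toy semantic network: two loosely connected clusters of animal words,
# joined through "lion"/"tiger". Words and edges are purely illustrative.
graph = {
    "dog":   ["cat", "wolf", "lion"],
    "cat":   ["dog", "wolf", "lion"],
    "wolf":  ["dog", "cat", "lion"],
    "lion":  ["cat", "wolf", "tiger"],
    "tiger": ["lion", "shark", "whale"],
    "shark": ["tiger", "whale"],
    "whale": ["tiger", "shark"],
}

def memory_search(start, n_items, rng):
    """Random walk that emits each node the first time it is visited."""
    retrieved, current = [], start
    while len(retrieved) < n_items:
        if current not in retrieved:
            retrieved.append(current)   # first visit = a retrieval event
        current = rng.choice(graph[current])
    return retrieved

order = memory_search("dog", 6, random.Random(0))
```

Because retrievals are first visits of the walk, items from the same cluster tend to appear in adjacent positions, mimicking the patch-like retrieval pattern the foraging account attributes to a separate switching process.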

AAAI Conference 2011 Conference Paper

A Nonparametric Bayesian Model of Multi-Level Category Learning

  • Kevin Canini
  • Thomas Griffiths

Categories are often organized into hierarchical taxonomies, that is, tree structures where each node represents a labeled category, and a node’s parent and children are, respectively, the category’s supertype and subtypes. A natural question is whether it is possible to reconstruct category taxonomies in cases where we are not given explicit information about how categories are related to each other, but only a sample of observations of the members of each category. In this paper, we introduce a nonparametric Bayesian model of multi-level category learning, an extension of the hierarchical Dirichlet process (HDP) that we call the tree-HDP. We demonstrate the ability of the tree-HDP to reconstruct simulated datasets of artificial taxonomies, and show that it produces similar performance to human learners on a taxonomy inference task.

NeurIPS Conference 2011 Conference Paper

A rational model of causal inference with continuous causes

  • Thomas Griffiths
  • Michael James

Rational models of causal induction have been successful in accounting for people's judgments about the existence of causal relationships. However, these models have focused on explaining inferences from discrete data of the kind that can be summarized in a 2 × 2 contingency table. This severely limits the scope of these models, since the world often provides non-binary data. We develop a new rational model of causal induction using continuous dimensions, which aims to diminish the gap between empirical and theoretical approaches and real-world causal induction. This model successfully predicts human judgments from previous studies better than models of discrete causal inference, and outperforms several other plausible models of causal induction with continuous causes in accounting for people's inferences in a new experiment.

NeurIPS Conference 2011 Conference Paper

An ideal observer model for identifying the reference frame of objects

  • Joseph Austerweil
  • Abram Friesen
  • Thomas Griffiths

The object people perceive in an image can depend on its orientation relative to the scene it is in (its reference frame). For example, the images of the symbols × and + differ by a 45-degree rotation. Although real scenes have multiple images and reference frames, psychologists have focused on scenes with only one reference frame. We propose an ideal observer model based on nonparametric Bayesian statistics for inferring the number of reference frames in a scene and their parameters. When an ambiguous image could be assigned to two conflicting reference frames, the model predicts two factors should influence the reference frame inferred for the image: The image should be more likely to share the reference frame of the closer object (proximity) and it should be more likely to share the reference frame containing the most objects (alignment). We confirm people use both cues using a novel methodology that allows for easy testing of human reference frame inference.

NeurIPS Conference 2011 Conference Paper

Testing a Bayesian Measure of Representativeness Using a Large Image Database

  • Joshua Abbott
  • Katherine Heller
  • Zoubin Ghahramani
  • Thomas Griffiths

How do people determine which elements of a set are most representative of that set? We extend an existing Bayesian measure of representativeness, which indicates the representativeness of a sample from a distribution, to define a measure of the representativeness of an item to a set. We show that this measure is formally related to a machine learning method known as Bayesian Sets. Building on this connection, we derive an analytic expression for the representativeness of objects described by a sparse vector of binary features. We then apply this measure to a large database of images, using it to determine which images are the most representative members of different sets. Comparing the resulting predictions to human judgments of representativeness provides a test of this measure with naturalistic stimuli, and illustrates how databases that are more commonly used in computer vision and machine learning can be used to evaluate psychological theories.

NeurIPS Conference 2010 Conference Paper

Learning invariant features using the Transformed Indian Buffet Process

  • Joseph Austerweil
  • Thomas Griffiths

Identifying the features of objects becomes a challenge when those features can change in their appearance. We introduce the Transformed Indian Buffet Process (tIBP), and use it to define a nonparametric Bayesian model that infers features that can transform across instantiations. We show that this model can identify features that are location invariant by modeling a previous experiment on human feature learning. However, allowing features to transform adds new kinds of ambiguity: Are two parts of an object the same feature with different transformations or two unique features? What transformations can features undergo? We present two new experiments in which we explore how people resolve these questions, showing that the tIBP model demonstrates a similar sensitivity to context to that shown by human learners when determining the invariant aspects of features.

NeurIPS Conference 2009 Conference Paper

Differential Use of Implicit Negative Evidence in Generative and Discriminative Language Learning

  • Anne Hsu
  • Thomas Griffiths

A classic debate in cognitive science revolves around understanding how children learn complex linguistic rules, such as those governing restrictions on verb alternations, without negative evidence. Traditionally, formal learnability arguments have been used to claim that such learning is impossible without the aid of innate language-specific knowledge. However, recently, researchers have shown that statistical models are capable of learning complex rules from only positive evidence. These two kinds of learnability analyses differ in their assumptions about the role of the distribution from which linguistic input is generated. The former analyses assume that learners seek to identify grammatical sentences in a way that is robust to the distribution from which the sentences are generated, analogous to discriminative approaches in machine learning. The latter assume that learners are trying to estimate a generative model, with sentences being sampled from that model. We show that these two learning approaches differ in their use of implicit negative evidence -- the absence of a sentence -- when learning verb alternations, and demonstrate that human learners can produce results consistent with the predictions of both approaches, depending on the context in which the learning problem is presented.

NeurIPS Conference 2009 Conference Paper

Neural Implementation of Hierarchical Bayesian Inference by Importance Sampling

  • Lei Shi
  • Thomas Griffiths

The goal of perception is to infer the hidden states in the hierarchical process by which sensory data are generated. Human behavior is consistent with the optimal statistical solution to this problem in many tasks, including cue combination and orientation detection. Understanding the neural mechanisms underlying this behavior is of particular importance, since probabilistic computations are notoriously challenging. Here we propose a simple mechanism for Bayesian inference which involves averaging over a few feature detection neurons which fire at a rate determined by their similarity to a sensory stimulus. This mechanism is based on a Monte Carlo method known as importance sampling, commonly used in computer science and statistics. Moreover, a simple extension to recursive importance sampling can be used to perform hierarchical Bayesian inference. We identify a scheme for implementing importance sampling with spiking neurons, and show that this scheme can account for human behavior in cue combination and oblique effect.
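The proposed mechanism reduces to standard importance sampling. A minimal sketch under illustrative assumptions (a one-dimensional conjugate-Gaussian prior and likelihood standing in for the feature-detection setting; all parameters invented): each "neuron" is a sample from the prior, its "firing rate" is the likelihood of the stimulus under that sample, and the posterior mean is the rate-weighted average.

```python
import math
import random

rng = random.Random(1)
prior_mu, prior_sd = 0.0, 2.0   # prior over the hidden state
noise_sd = 0.5                  # sensory noise
stimulus = 1.0                  # observed sensory datum

# Each "neuron" is a sample from the prior; its "firing rate" is the
# likelihood of the stimulus under that sample.
samples = [rng.gauss(prior_mu, prior_sd) for _ in range(20000)]
rates = [math.exp(-(stimulus - s) ** 2 / (2 * noise_sd ** 2)) for s in samples]

# Importance-sampling estimate of the posterior mean: rate-weighted average.
posterior_mean = sum(r * s for r, s in zip(rates, samples)) / sum(rates)

# Exact conjugate-Gaussian posterior mean, for comparison.
exact = (stimulus / noise_sd**2 + prior_mu / prior_sd**2) / (
    1 / noise_sd**2 + 1 / prior_sd**2
)
```

With enough samples the weighted average converges on the exact posterior mean, which is the sense in which averaging over tuned feature detectors can implement Bayesian inference.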

NeurIPS Conference 2009 Conference Paper

Nonparametric Latent Feature Models for Link Prediction

  • Kurt Miller
  • Michael Jordan
  • Thomas Griffiths

As the availability and importance of relational data -- such as the friendships summarized on a social networking website -- increases, it becomes increasingly important to have good models for such data. The kinds of latent structure that have been considered for use in predicting links in such networks have been relatively limited. In particular, the machine learning community has focused on latent class models, adapting nonparametric Bayesian methods to jointly infer how many latent classes there are while learning which entities belong to each class. We pursue a similar approach with a richer kind of latent variable -- latent features -- using a nonparametric Bayesian technique to simultaneously infer the number of features at the same time we learn which entities have each feature. The greater expressiveness of this approach allows us to improve link prediction on three datasets.

NeurIPS Conference 2008 Conference Paper

A rational model of preference learning and choice prediction by children

  • Christopher Lucas
  • Thomas Griffiths
  • Fei Xu
  • Christine Fawcett

Young children demonstrate the ability to make inferences about the preferences of other agents based on their choices. However, there exists no overarching account of what children are doing when they learn about preferences or how they use that knowledge. We use a rational model of preference learning, drawing on ideas from economics and computer science, to explain the behavior of children in several recent experiments. Specifically, we show how a simple econometric model can be extended to capture two- to four-year-olds’ use of statistical information in inferring preferences, and their generalization of these preferences.

NeurIPS Conference 2008 Conference Paper

Analyzing human feature learning as nonparametric Bayesian inference

  • Thomas Griffiths
  • Joseph Austerweil

Almost all successful machine learning algorithms and cognitive models require powerful representations capturing the features that are relevant to a particular problem. We draw on recent work in nonparametric Bayesian statistics to define a rational model of human feature learning that forms a featural representation from raw sensory data without pre-specifying the number of features. By comparing how the human perceptual system and our rational model use distributional and category information to infer feature representations, we seek to identify some of the forces that govern the process by which people separate and combine sensory primitives to form features.

NeurIPS Conference 2008 Conference Paper

How memory biases affect information transmission: A rational analysis of serial reproduction

  • Jing Xu
  • Thomas Griffiths

Many human interactions involve pieces of information being passed from one person to another, raising the question of how this process of information transmission is affected by the capacities of the agents involved. In the 1930s, Sir Frederic Bartlett explored the influence of memory biases in “serial reproduction” of information, in which one person’s reconstruction of a stimulus from memory becomes the stimulus seen by the next person. These experiments were done using relatively uncontrolled stimuli such as pictures and stories, but suggested that serial reproduction would transform information in a way that reflected the biases inherent in memory. We formally analyze serial reproduction using a Bayesian model of reconstruction from memory, giving a general result characterizing the effect of memory biases on information transmission. We then test the predictions of this account in two experiments using simple one-dimensional stimuli. Our results provide theoretical and empirical justification for the idea that serial reproduction reflects memory biases.

NeurIPS Conference 2008 Conference Paper

Modeling human function learning with Gaussian processes

  • Thomas Griffiths
  • Chris Lucas
  • Joseph Williams
  • Michael Kalish

Accounts of how people learn functional relationships between continuous variables have tended to focus on two possibilities: that people are estimating explicit functions, or that they are simply performing associative learning supported by similarity. We provide a rational analysis of function learning, drawing on work on regression in machine learning and statistics. Using the equivalence of Bayesian linear regression and Gaussian processes, we show that learning explicit rules and using similarity can be seen as two views of one solution to this problem. We use this insight to define a Gaussian process model of human function learning that combines the strengths of both approaches.
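The GP view of function learning can be sketched in a few lines of NumPy. This is an illustrative toy, not the fitted model from the paper: an RBF (squared-exponential) kernel with invented length-scale and noise settings, applied to observations of a linear function.

```python
import numpy as np

def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two 1-D point sets."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length**2)

def gp_posterior_mean(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean of a zero-mean GP with RBF kernel at x_test."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    return rbf(x_test, x_train) @ np.linalg.solve(K, np.asarray(y_train, float))

x_train = [0.0, 1.0, 2.0, 3.0]
y_train = [0.0, 1.0, 2.0, 3.0]   # observations of f(x) = x
pred = gp_posterior_mean(x_train, y_train, [1.5])
```

The prediction at an unseen point is a kernel-weighted combination of the training examples, which is the "similarity" reading; choosing the kernel to encode a family of functions recovers the "explicit rule" reading of the same computation.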

NeurIPS Conference 2008 Conference Paper

Modeling the effects of memory on human online sentence processing with particle filters

  • Roger Levy
  • Florencia Reali
  • Thomas Griffiths

Language comprehension in humans is significantly constrained by memory, yet rapid, highly incremental, and capable of utilizing a wide range of contextual information to resolve ambiguity and form expectations about future input. In contrast, most of the leading psycholinguistic models and fielded algorithms for natural language parsing are non-incremental, have run time superlinear in input length, and/or enforce structural locality constraints on probabilistic dependencies between events. We present a new limited-memory model of sentence comprehension which involves an adaptation of the particle filter, a sequential Monte Carlo method, to the problem of incremental parsing. We show that this model can reproduce classic results in online sentence comprehension, and that it naturally provides the first rational account of an outstanding problem in psycholinguistics, in which the preferred alternative in a syntactic ambiguity seems to grow more attractive over time even in the absence of strong disambiguating information.

NeurIPS Conference 2007 Conference Paper

A Probabilistic Approach to Language Change

  • Alexandre Bouchard-Côté
  • Percy Liang
  • Dan Klein
  • Thomas Griffiths

We present a probabilistic approach to language change in which word forms are represented by phoneme sequences that undergo stochastic edits along the branches of a phylogenetic tree. Our framework combines the advantages of the classical comparative method with the robustness of corpus-based probabilistic models. We use this framework to explore the consequences of two different schemes for defining probabilistic models of phonological change, evaluating these schemes using the reconstruction of ancient word forms in Romance languages. The result is an efficient inference procedure for automatically inferring ancient word forms from modern languages, which can be generalized to support inferences about linguistic phylogenies.

NeurIPS Conference 2007 Conference Paper

Markov Chain Monte Carlo with People

  • Adam Sanborn
  • Thomas Griffiths

Many formal models of cognition implicitly use subjective probability distributions to capture the assumptions of human learners. Most applications of these models determine these distributions indirectly. We propose a method for directly determining the assumptions of human learners by sampling from subjective probability distributions. Using a correspondence between a model of human choice and Markov chain Monte Carlo (MCMC), we describe a method for sampling from the distributions over objects that people associate with different categories. In our task, subjects choose whether to accept or reject a proposed change to an object. The task is constructed so that these decisions follow an MCMC acceptance rule, defining a Markov chain for which the stationary distribution is the category distribution. We test this procedure for both artificial categories acquired in the laboratory, and natural categories acquired from experience.
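The accept/reject logic can be illustrated with a simulated subject. This toy assumes a one-dimensional Gaussian "category distribution" (invented for the example) and uses the Barker acceptance rule, p(accept) = π(proposed) / (π(proposed) + π(current)); under that rule the chain's stationary distribution is the category distribution itself.

```python
import math
import random

def category_density(x):
    """Subject's (unnormalized) category distribution; Gaussian(3, 1) here."""
    return math.exp(-(x - 3.0) ** 2 / 2)

rng = random.Random(2)
x, chain = 0.0, []
for _ in range(20000):
    proposal = x + rng.gauss(0, 1)              # symmetric proposed change
    accept_p = category_density(proposal) / (
        category_density(proposal) + category_density(x)
    )
    if rng.random() < accept_p:                 # subject "accepts" the change
        x = proposal
    chain.append(x)

mean_est = sum(chain[5000:]) / len(chain[5000:])
```

Discarding an initial burn-in, the chain's samples recover the category's parameters; the experimental method works the same way, except that the acceptance decision is made by a human participant rather than a density function.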

NeurIPS Conference 2006 Conference Paper

A Nonparametric Bayesian Method for Inferring Features From Similarity Judgments

  • Daniel Navarro
  • Thomas Griffiths

The additive clustering model is widely used to infer the features of a set of stimuli from their similarities, on the assumption that similarity is a weighted linear function of common features. This paper develops a fully Bayesian formulation of the additive clustering model, using methods from nonparametric Bayesian statistics to allow the number of features to vary. We use this to explore several approaches to parameter estimation, showing that the nonparametric Bayesian approach provides a straightforward way to obtain estimates of both the number of features used in producing similarity judgments and their importance.
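The additive clustering assumption fits in one line: similarity is a weighted count of shared features, s(i, j) = Σ_k w_k f_ik f_jk. The stimuli, feature vectors, and weights below are invented for illustration.

```python
features = {               # invented binary feature vectors for four stimuli
    "robin":   [1, 1, 0],
    "sparrow": [1, 1, 0],
    "bat":     [0, 1, 1],
    "mouse":   [0, 0, 1],
}
weights = [2.0, 1.0, 1.5]  # invented saliency weight per feature

def similarity(a, b):
    """Weighted count of shared features: the additive clustering form."""
    return sum(w * fa * fb
               for w, fa, fb in zip(weights, features[a], features[b]))
```

The inference problem the paper addresses runs in the opposite direction: given a similarity matrix like the one this function produces, recover the feature matrix and weights, with the number of features left open.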

NeurIPS Conference 2006 Conference Paper

Adaptor Grammars: A Framework for Specifying Compositional Nonparametric Bayesian Models

  • Mark Johnson
  • Thomas Griffiths
  • Sharon Goldwater

This paper introduces adaptor grammars, a class of probabilistic models of language that generalize probabilistic context-free grammars (PCFGs). Adaptor grammars augment the probabilistic rules of PCFGs with “adaptors” that can induce dependencies among successive uses. With a particular choice of adaptor, based on the Pitman-Yor process, nonparametric Bayesian models of language using Dirichlet processes and hierarchical Dirichlet processes can be written as simple grammars. We present a general-purpose inference algorithm for adaptor grammars, making it easy to define and use such models, and illustrate how several existing nonparametric Bayesian models can be expressed within this framework.

AAAI Conference 2006 Conference Paper

Learning Systems of Concepts with an Infinite Relational Model

  • Charles Kemp
  • Thomas Griffiths

Relationships between concepts account for a large proportion of semantic knowledge. We present a nonparametric Bayesian model that discovers systems of related concepts. Given data involving several sets of entities, our model discovers the kinds of entities in each set and the relations between kinds that are possible or likely. We apply our approach to four problems: clustering objects and features, learning ontologies, discovering kinship systems, and discovering structure in political data.

NeurIPS Conference 2006 Conference Paper

Particle Filtering for Nonparametric Bayesian Matrix Factorization

  • Frank Wood
  • Thomas Griffiths

Many unsupervised learning problems can be expressed as a form of matrix factorization, reconstructing an observed data matrix as the product of two matrices of latent variables. A standard challenge in solving these problems is determining the dimensionality of the latent matrices. Nonparametric Bayesian matrix factorization is one way of dealing with this challenge, yielding a posterior distribution over possible factorizations of unbounded dimensionality. A drawback to this approach is that posterior estimation is typically done using Gibbs sampling, which can be slow for large problems and when conjugate priors cannot be used. As an alternative, we present a particle filter for posterior estimation in nonparametric Bayesian matrix factorization models. We illustrate this approach with two matrix factorization models and show favorable performance relative to Gibbs sampling.

NeurIPS Conference 2005 Conference Paper

Infinite latent feature models and the Indian buffet process

  • Zoubin Ghahramani
  • Thomas Griffiths

We define a probability distribution over equivalence classes of binary matrices with a finite number of rows and an unbounded number of columns. This distribution is suitable for use as a prior in probabilistic models that represent objects using a potentially infinite array of features. We identify a simple generative process that results in the same distribution over equivalence classes, which we call the Indian buffet process. We illustrate the use of this distribution as a prior in an infinite latent feature model, deriving a Markov chain Monte Carlo algorithm for inference in this model and applying the algorithm to an image dataset.
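The buffet metaphor doubles as a sampler. A minimal generative sketch (alpha and the number of customers are arbitrary choices for illustration): customer i takes each previously sampled dish k with probability m_k / i, where m_k counts earlier takers, then samples Poisson(alpha / i) new dishes.

```python
import math
import random

def poisson(lam, rng):
    """Knuth's inversion sampler for Poisson(lam), lam > 0."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while p > threshold:
        k += 1
        p *= rng.random()
    return k - 1

def sample_ibp(n_customers, alpha, rng):
    """Return a binary customer-by-dish matrix drawn from the IBP."""
    counts = []                                 # m_k: takers per dish so far
    rows = []
    for i in range(1, n_customers + 1):
        row = [1 if rng.random() < m / i else 0 for m in counts]
        for k, taken in enumerate(row):
            counts[k] += taken
        new_dishes = poisson(alpha / i, rng)    # customer i's novel dishes
        counts.extend([1] * new_dishes)
        row.extend([1] * new_dishes)
        rows.append(row)
    width = len(counts)                         # pad early rows with zeros
    return [r + [0] * (width - len(r)) for r in rows]

Z = sample_ibp(10, 2.0, random.Random(3))
```

Each row is one object's feature assignment and each column one latent feature; the number of columns is unbounded in principle but finite in any sample, which is what makes the distribution usable as a prior over feature matrices.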

NeurIPS Conference 2005 Conference Paper

Interpolating between types and tokens by estimating power-law generators

  • Sharon Goldwater
  • Mark Johnson
  • Thomas Griffiths

Standard statistical models of language fail to capture one of the most striking properties of natural languages: the power-law distribution in the frequencies of word tokens. We present a framework for developing statistical models that generically produce power-laws, augmenting standard generative models with an adaptor that produces the appropriate pattern of token frequencies. We show that taking a particular stochastic process – the Pitman-Yor process – as an adaptor justifies the appearance of type frequencies in formal analyses of natural language, and improves the performance of a model for unsupervised learning of morphology.
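A Pitman-Yor adaptor can be sketched as a Chinese-restaurant-style process (the alpha, discount, and corpus size below are invented for illustration): token n+1 reuses an existing type k with mass proportional to count_k − d, or creates a new type with mass proportional to alpha + d·K, where K is the number of types so far.

```python
import random

def pitman_yor_counts(n_tokens, alpha, d, rng):
    """Token counts per word type under a Pitman-Yor (CRP-style) adaptor."""
    counts = []                            # tokens assigned to each type
    for n in range(n_tokens):
        r = rng.random() * (n + alpha)
        acc = alpha + d * len(counts)      # mass reserved for a new type
        if r < acc:
            counts.append(1)               # token creates a new type
            continue
        for k in range(len(counts)):
            acc += counts[k] - d           # type k gets mass count_k - d
            if r < acc:
                counts[k] += 1
                break
        else:
            counts[-1] += 1                # guard against float round-off
    return counts

counts = pitman_yor_counts(5000, 1.0, 0.7, random.Random(4))
```

With discount d > 0 the sorted type counts decay roughly as a power law, which is the type-token behavior the framework targets; at d = 0 the process reduces to the ordinary Chinese restaurant process, whose tails are lighter.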

NeurIPS Conference 2004 Conference Paper

Integrating Topics and Syntax

  • Thomas Griffiths
  • Mark Steyvers
  • David Blei
  • Joshua Tenenbaum

Statistical approaches to language learning typically focus on either short-range syntactic dependencies or long-range semantic dependencies between words. We present a generative model that uses both kinds of dependencies, and can be used to simultaneously find syntactic classes and semantic topics despite having no representation of syntax or semantics beyond statistical dependency. This model is competitive on tasks like part-of-speech tagging and document classification with models that exclusively use short- and long-range dependencies respectively.

NeurIPS Conference 2004 Conference Paper

Parametric Embedding for Class Visualization

  • Tomoharu Iwata
  • Kazumi Saito
  • Naonori Ueda
  • Sean Stromsten
  • Thomas Griffiths
  • Joshua Tenenbaum

In this paper, we propose a new method, Parametric Embedding (PE), for visualizing the posteriors estimated over a mixture model. PE simultaneously embeds both objects and their classes in a low-dimensional space. PE takes as input a set of class posterior vectors for given data points, and tries to preserve the posterior structure in an embedding space by minimizing a sum of Kullback-Leibler divergences, under the assumption that samples are generated by a Gaussian mixture with equal covariances in the embedding space. PE has many potential uses depending on the source of the input data, providing insight into the classifier’s behavior in supervised, semi-supervised and unsupervised settings. The PE algorithm has a computational advantage over conventional embedding methods based on pairwise object relations since its complexity scales with the product of the number of objects and the number of classes. We demonstrate PE by visualizing supervised categorization of web pages, semi-supervised categorization of digits, and the relations of words and latent topics found by an unsupervised algorithm, Latent Dirichlet Allocation.

NeurIPS Conference 2003 Conference Paper

From Algorithmic to Subjective Randomness

  • Thomas Griffiths
  • Joshua Tenenbaum

We explore the phenomena of subjective randomness as a case study in understanding how people discover structure embedded in noise. We present a rational account of randomness perception based on the statistical problem of model selection: given a stimulus, inferring whether the process that generated it was random or regular. Inspired by the mathematical definition of randomness given by Kolmogorov complexity, we characterize regularity in terms of a hierarchy of automata that augment a finite controller with different forms of memory. We find that the regularities detected in binary sequences depend upon presentation format, and that the kinds of automata that can identify these regularities are informative about the cognitive processes engaged by different formats.

NeurIPS Conference 2003 Conference Paper

Hierarchical Topic Models and the Nested Chinese Restaurant Process

  • Thomas Griffiths
  • Michael Jordan
  • Joshua Tenenbaum
  • David Blei

We address the problem of learning topic hierarchies from data. The model selection problem in this domain is daunting—which of the large collection of possible trees to use? We take a Bayesian approach, generating an appropriate prior via a distribution on partitions that we refer to as the nested Chinese restaurant process. This nonparametric prior allows arbitrarily large branching factors and readily accommodates growing data collections. We build a hierarchical topic model by combining this prior with a likelihood that is based on a hierarchical variant of latent Dirichlet allocation. We illustrate our approach on simulated data and with an application to the modeling of NIPS abstracts.
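The nested Chinese restaurant process assigns each document a root-to-leaf path in an infinitely branching tree: at every node, the document picks an existing child with probability proportional to how many earlier documents chose it, or a new child with probability proportional to a concentration parameter. A minimal sketch of that path-sampling step, with illustrative names:

```python
import random

def sample_ncrp_paths(num_docs, depth, gamma, seed=0):
    """Assign each of num_docs documents a depth-length path through a
    tree grown by the nested Chinese restaurant process; gamma is the
    concentration parameter governing new branches."""
    rng = random.Random(seed)
    child_counts = {}  # path prefix (tuple) -> list of child visit counts
    paths = []
    for _ in range(num_docs):
        path = ()
        for _ in range(depth):
            counts = child_counts.setdefault(path, [])
            r = rng.random() * (sum(counts) + gamma)
            for k, c in enumerate(counts):
                if r < c:
                    break
                r -= c
            else:              # landed in the gamma mass: open a new child
                k = len(counts)
                counts.append(0)
            counts[k] += 1
            path = path + (k,)
        paths.append(path)
    return paths

paths = sample_ncrp_paths(num_docs=20, depth=3, gamma=1.0)
```

Because popular branches attract further documents while new branches stay possible at every level, the prior accommodates both arbitrarily large branching factors and growing collections, as the abstract notes.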

NeurIPS Conference 2003 Conference Paper

Semi-Supervised Learning with Trees

  • Charles Kemp
  • Thomas Griffiths
  • Sean Stromsten
  • Joshua Tenenbaum

We describe a nonparametric Bayesian approach to generalizing from few labeled examples, guided by a larger set of unlabeled objects and the assumption of a latent tree-structure to the domain. The tree (or a distribution over trees) may be inferred using the unlabeled data. A prior over concepts generated by a mutation process on the inferred tree(s) allows efficient computation of the optimal Bayesian classification function from the labeled examples. We test our approach on eight real-world datasets.
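The mutation-process prior can be sketched as follows: the root's label is a fair coin flip, and each edge flips the parent's label with a small mutation probability, so nearby leaves tend to share labels. This is a toy rendering of the idea, with illustrative names; `children` maps each internal node to its child nodes:

```python
import random

def sample_tree_concept(children, root, mutation_p=0.1, rng=None):
    """Sample a binary concept (a labeling of the leaves) from a mutation
    process on a tree: root label ~ fair coin, each edge flips the
    parent's label with probability mutation_p."""
    rng = rng or random.Random(0)
    labels = {root: rng.random() < 0.5}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            labels[child] = labels[node] ^ (rng.random() < mutation_p)
            stack.append(child)
    # leaves are the nodes with no children
    return {n: v for n, v in labels.items() if n not in children}

children = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
concept = sample_tree_concept(children, "root", mutation_p=0.05)
```

Under this prior, a single labeled leaf already constrains the likely labels of its near neighbors in the tree, which is what makes generalization from few labeled examples possible.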

NeurIPS Conference 2002 Conference Paper

Dynamical Causal Learning

  • David Danks
  • Thomas Griffiths
  • Joshua Tenenbaum

Current psychological theories of human causal learning and judgment focus primarily on long-run predictions: two by estimating parameters of a causal Bayes net (though for different parameterizations), and a third through structural learning. This paper focuses on people's short-run behavior by examining dynamical versions of these three theories, and comparing their predictions to a real-world dataset.

NeurIPS Conference 2002 Conference Paper

Prediction and Semantic Association

  • Thomas Griffiths
  • Mark Steyvers

We explore the consequences of viewing semantic association as the result of attempting to predict the concepts likely to arise in a particular context. We argue that the success of existing accounts of semantic representation comes as a result of indirectly addressing this problem, and show that a closer correspondence to human data can be obtained by taking a probabilistic approach that explicitly models the generative structure of language.

NeurIPS Conference 2002 Conference Paper

Theory-Based Causal Inference

  • Joshua Tenenbaum
  • Thomas Griffiths

People routinely make sophisticated causal inferences unconsciously, effortlessly, and from very little data – often from just one or a few observations. We argue that these inferences can be explained as Bayesian computations over a hypothesis space of causal graphical models, shaped by strong top-down prior knowledge in the form of intuitive theories. We present two case studies of our approach, including quantitative models of human causal judgments and brief comparisons with traditional bottom-up models of inference.

NeurIPS Conference 2001 Conference Paper

Using Vocabulary Knowledge in Bayesian Multinomial Estimation

  • Thomas Griffiths
  • Joshua Tenenbaum

Estimating the parameters of sparse multinomial distributions is an important component of many statistical learning tasks. Recent approaches have used uncertainty over the vocabulary of symbols in a multinomial distribution as a means of accounting for sparsity. We present a Bayesian approach that allows weak prior knowledge, in the form of a small set of approximate candidate vocabularies, to be used to dramatically improve the resulting estimates. We demonstrate these improvements in applications to text compression and estimating distributions over words in newsgroup data.
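One simplified way to render the idea of weighting candidate vocabularies: score each candidate by the Dirichlet-multinomial marginal likelihood of the observed counts restricted to that vocabulary (zero if an observed symbol falls outside it), then normalize. This is a stand-in sketch, not the paper's estimator, and the parameter names are illustrative:

```python
from math import exp, lgamma

def dirichlet_multinomial_loglik(counts, vocab, alpha=1.0):
    """Log marginal likelihood of symbol counts under a symmetric
    Dirichlet(alpha) prior over `vocab`; -inf if any observed symbol
    lies outside the vocabulary."""
    if any(s not in vocab for s in counts):
        return float("-inf")
    n, k = sum(counts.values()), len(vocab)
    ll = lgamma(k * alpha) - lgamma(k * alpha + n)
    for s in vocab:
        ll += lgamma(alpha + counts.get(s, 0)) - lgamma(alpha)
    return ll

def posterior_over_vocabularies(counts, candidates, alpha=1.0):
    """Posterior weight on each candidate vocabulary, assuming a uniform
    prior over the candidates."""
    logliks = [dirichlet_multinomial_loglik(counts, v, alpha)
               for v in candidates]
    m = max(logliks)
    ws = [exp(l - m) for l in logliks]
    return [w / sum(ws) for w in ws]

counts = {"a": 5, "b": 3}
candidates = [{"a", "b"}, {"a", "b", "c", "d"}]
post = posterior_over_vocabularies(counts, candidates)
```

The smaller vocabulary that still covers the observed symbols gets more weight, which is how a few approximate candidate vocabularies can sharpen otherwise sparse multinomial estimates.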

NeurIPS Conference 2000 Conference Paper

Structure Learning in Human Causal Induction

  • Joshua Tenenbaum
  • Thomas Griffiths

We use graphical models to explore the question of how people learn simple causal relationships from data. The two leading psychological theories can both be seen as estimating the parameters of a fixed graph. We argue that a complete account of causal induction should also consider how people learn the underlying causal graph structure, and we propose to model this inductive process as a Bayesian inference. Our argument is supported through the discussion of three data sets.
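The structure-learning proposal can be illustrated as a log-ratio of marginal likelihoods between a graph in which the cause influences the effect (here via a noisy-OR link) and a graph in which it does not, integrating out the strength parameters numerically. This is a simplified sketch of that Bayesian comparison under a uniform prior on the parameters; the names are illustrative:

```python
import numpy as np

def causal_support(n_e_c, n_c, n_e_nc, n_nc, grid=101):
    """Log marginal-likelihood ratio between Graph 1 (cause C influences
    effect E via noisy-OR) and Graph 0 (E occurs at a base rate only),
    given counts of E with the cause present (n_e_c of n_c trials) and
    absent (n_e_nc of n_nc trials).  Uniform priors on the parameters,
    integrated on a grid."""
    w = np.linspace(1e-6, 1 - 1e-6, grid)
    # Graph 0: a single base rate w explains all trials
    lik0 = (w ** (n_e_c + n_e_nc)
            * (1 - w) ** ((n_c - n_e_c) + (n_nc - n_e_nc)))
    m0 = lik0.mean()
    # Graph 1: P(e|c) = w0 + w1 - w0*w1 (noisy-OR), P(e|~c) = w0
    w0, w1 = np.meshgrid(w, w)
    p_ec = w0 + w1 - w0 * w1
    lik1 = (p_ec ** n_e_c * (1 - p_ec) ** (n_c - n_e_c)
            * w0 ** n_e_nc * (1 - w0) ** (n_nc - n_e_nc))
    m1 = lik1.mean()
    return float(np.log(m1 / m0))

# strong contingency: effect far more likely when the cause is present
support = causal_support(n_e_c=8, n_c=10, n_e_nc=1, n_nc=10)
```

Strong contingencies yield positive support for the causal link, while an uninformative contingency favors the simpler base-rate-only graph, reflecting the inference over graph structure rather than parameters alone.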