Author name cluster

David D. Jensen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers

2 author rows

ICML Conference 2021 Conference Paper

How and Why to Use Experimental Data to Evaluate Methods for Observational Causal Inference

Amanda Gentzel
Purva Pruthi
David D. Jensen

Methods that infer causal dependence from observational data are central to many areas of science, including medicine, economics, and the social sciences. A variety of theoretical properties of these methods have been proven, but empirical evaluation remains a challenge, largely due to the lack of observational data sets for which treatment effect is known. We describe and analyze observational sampling from randomized controlled trials (OSRCT), a method for evaluating causal inference methods using data from randomized controlled trials (RCTs). This method can be used to create constructed observational data sets with corresponding unbiased estimates of treatment effect, substantially increasing the number of data sets available for evaluating causal inference methods. We show that, in expectation, OSRCT creates data sets that are equivalent to those produced by randomly sampling from empirical data sets in which all potential outcomes are available. We then perform a large-scale evaluation of seven causal inference methods over 37 data sets, drawn from RCTs, as well as simulators, real-world computational systems, and observational data sets augmented with a synthetic response variable. We find notable performance differences when comparing across data from different sources, demonstrating the importance of using data from a variety of sources when evaluating any causal inference method.
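
The abstract above does not spell out the sampling procedure, so the following is only a minimal sketch of one way covariate-biased subsampling of an RCT can work. It assumes a single covariate and a logistic biasing function. The names `osrct_sample` and `logistic_bias`, the row layout, and the synthetic data are hypothetical and are not taken from the paper.

```python
import math
import random


def logistic_bias(x, strength=2.0):
    """Hypothetical biasing function: probability of preferring treatment given covariate x."""
    return 1.0 / (1.0 + math.exp(-strength * x))


def osrct_sample(rct_rows, bias_prob, rng):
    """Subsample RCT rows so that treatment becomes dependent on a covariate.

    rct_rows: list of dicts with keys "x" (covariate), "t" (randomized 0/1 treatment), "y" (outcome).
    bias_prob: function mapping a covariate value to a probability of treatment.
    """
    kept = []
    for row in rct_rows:
        # Draw a covariate-dependent "desired" treatment for this unit.
        desired_t = 1 if rng.random() < bias_prob(row["x"]) else 0
        # Keep the unit only if the desired treatment matches the randomized one.
        if desired_t == row["t"]:
            kept.append(row)
    return kept


def mean(values):
    return sum(values) / len(values)


def covariate_gap(rows):
    """Difference in mean covariate between treated and control units."""
    treated = [r["x"] for r in rows if r["t"] == 1]
    control = [r["x"] for r in rows if r["t"] == 0]
    return mean(treated) - mean(control)


# Synthetic stand-in for an RCT (assumed data, true treatment effect set to 1.0).
rng = random.Random(0)
rct = []
for _ in range(4000):
    x = rng.gauss(0.0, 1.0)
    t = rng.randint(0, 1)  # randomized assignment, independent of x
    y = 1.0 * t + 0.5 * x + rng.gauss(0.0, 0.1)
    rct.append({"x": x, "t": t, "y": y})

sample = osrct_sample(rct, logistic_bias, rng)
rct_gap = covariate_gap(rct)
sample_gap = covariate_gap(sample)
```

In this sketch the full RCT still supplies an unbiased effect estimate (difference in mean outcome between arms), while the subsample shows confounding through `x`, which is the kind of constructed observational data set the abstract refers to.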


AAAI Conference 2021 Conference Paper

Improving Causal Inference by Increasing Model Expressiveness

David D. Jensen

The ability to learn and reason with causal knowledge is a key aspect of intelligent behavior. In contrast to mere statistical association, knowledge of causation enables reasoning about the effects of actions. Causal reasoning is vital for autonomous agents and for a range of applications in science, medicine, business, and government. However, current methods for causal inference are hobbled because they use relatively inexpressive models. Surprisingly, current causal models eschew nearly every major representational innovation common in a range of other fields both inside and outside of computer science, including representation of objects, relationships, time, space, and hierarchy. Even more surprisingly, a range of recent research provides strong evidence that more expressive representations make possible causal inferences that are otherwise impossible and remove key biases that would otherwise afflict more naive inferences. New research on causal inference should target increases in expressiveness to improve accuracy and effectiveness.


ICML Conference 2020 Conference Paper

Causal Inference using Gaussian Processes with Structured Latent Confounders

Sam Witty
Kenta Takatsu
David D. Jensen
Vikash K. Mansinghka

Latent confounders (unobserved variables that influence both treatment and outcome) can bias estimates of causal effects. In some cases, these confounders are shared across observations, e.g., all students taking a course are influenced by the course’s difficulty in addition to any educational interventions they receive individually. This paper shows how to semiparametrically model latent confounders that have this structure and thereby improve estimates of causal effects. The key innovations are a hierarchical Bayesian model, Gaussian processes with structured latent confounders (GP-SLC), and a Monte Carlo inference algorithm for this model based on elliptical slice sampling. GP-SLC provides principled Bayesian uncertainty estimates of individual treatment effect with minimal assumptions about the functional forms relating confounders, covariates, treatment, and outcome. Finally, this paper shows GP-SLC is competitive with or more accurate than widely used causal inference techniques on three benchmark datasets, including the Infant Health and Development Program and a dataset showing the effect of changing temperatures on state-wide energy consumption across New England.


ICLR Conference 2020 Conference Paper

Exploratory Not Explanatory: Counterfactual Analysis of Saliency Maps for Deep Reinforcement Learning

Akanksha Atrey
Kaleigh Clary
David D. Jensen

Saliency maps are frequently used to support explanations of the behavior of deep reinforcement learning (RL) agents. However, a review of how saliency maps are used in practice indicates that the derived explanations are often unfalsifiable and can be highly subjective. We introduce an empirical approach grounded in counterfactual reasoning to test the hypotheses generated from saliency maps and assess the degree to which they correspond to the semantics of RL environments. We use Atari games, a common benchmark for deep RL, to evaluate three types of saliency maps. Our results show the extent to which existing claims about Atari games can be evaluated and suggest that saliency maps are best viewed as an exploratory tool rather than an explanatory tool.


UAI Conference 2019 Conference Paper

Object Conditioning for Causal Inference

David D. Jensen
Javier Burroni
Matthew J. Rattigan

We describe and analyze a form of conditioning that is widely applied within social science and applied statistics but that is virtually unknown within causal graphical models. This approach, which we term object conditioning, can adjust for the effects of latent confounders and yet avoid the pitfall of conditioning on colliders. We describe object conditioning using plate models and show how its probabilistic implications can be explained using the property of exchangeability. We show that several seemingly obvious interpretations of object conditioning are insufficient to describe its probabilistic implications. Finally, we use object conditioning to describe and unify key aspects of a diverse set of techniques for causal inference, including within-subjects designs, difference-in-differences designs, and interrupted time-series designs.


UAI Conference 2016 Conference Paper

Inferring Causal Direction from Relational Data

David Arbour
Katerina Marazopoulou
David D. Jensen

Inferring the direction of causal dependence from observational data is a fundamental problem in many scientific fields. Significant progress has been made in inferring causal direction from data that are independent and identically distributed (i.i.d.), but little is understood about this problem in the more general relational setting with multiple types of interacting entities. This work examines the task of inferring the causal direction of peer dependence in relational data. We show that, in contrast to the i.i.d. setting, the direction of peer dependence can be inferred using simple procedures, regardless of the form of the underlying distribution, and we provide a theoretical characterization on the identifiability of direction. We then examine the conditions under which the presence of confounding can be detected. Finally, we demonstrate the efficacy of the proposed methods with synthetic experiments, and we provide an application on real-world data.


UAI Conference 2015 Conference Paper

Learning the Structure of Causal Models with Relational and Temporal Dependence

Katerina Marazopoulou
Marc E. Maier
David D. Jensen

Many real-world domains are inherently relational and temporal—they consist of heterogeneous entities that interact with each other over time. Effective reasoning about causality in such domains requires representations that explicitly model relational and temporal dependence. In this work, we provide a formalization of temporal relational models. We define temporal extensions to abstract ground graphs—a lifted representation that abstracts paths of dependence over all possible ground graphs. Temporal abstract ground graphs enable a sound and complete method for answering d-separation queries on temporal relational models. These methods provide the foundation for a constraint-based algorithm, TRCD, that learns causal models from temporal relational data. We provide experimental evidence that demonstrates the need to explicitly represent time when inferring causal dependence. We also demonstrate the expressive gain of TRCD compared to earlier algorithms that do not explicitly represent time.


UAI Conference 2013 Conference Paper

A Sound and Complete Algorithm for Learning Causal Models from Relational Data

Marc E. Maier
Katerina Marazopoulou
David Arbour
David D. Jensen

The PC algorithm learns maximally oriented causal Bayesian networks. However, there is no equivalent complete algorithm for learning the structure of relational models, a more expressive generalization of Bayesian networks. Recent developments in the theory and representation of relational models support lifted reasoning about conditional independence. This enables a powerful constraint for orienting bivariate dependencies and forms the basis of a new algorithm for learning structure. We present the relational causal discovery (RCD) algorithm that learns causal relational models. We prove that RCD is sound and complete, and we present empirical results that demonstrate effectiveness.


ICML Conference 2013 Conference Paper

Copy or Coincidence? A Model for Detecting Social Influence and Duplication Events

Lisa Friedland
David D. Jensen
Michael Lavine

In this paper, we analyze the task of inferring rare links between pairs of entities that seem too similar to have occurred by chance. Variations of this task appear in such diverse areas as social network analysis, security, fraud detection, and entity resolution. To address the task in a general form, we propose a simple, flexible mixture model in which most entities are generated independently from a distribution but a small number of pairs are constrained to be similar. We predict the true pairs using a likelihood ratio that trades off the entities’ similarity with their rarity. This method always outperforms using only similarity; however, with certain parameter settings, similarity turns out to be surprisingly competitive. Using real data, we apply the model to detect twins given their birth weights and to re-identify cell phone users based on distinctive usage patterns.


ICML Conference 2007 Conference Paper

Graph clustering with network structure indices

Matthew J. Rattigan
Marc E. Maier
David D. Jensen

Graph clustering has become ubiquitous in the study of relational data sets. We examine two simple algorithms: a new graphical adaptation of the k-medoids algorithm and the Girvan-Newman method based on edge betweenness centrality. We show that they can be effective at discovering the latent groups or communities that are defined by the link structure of a graph. However, both approaches rely on prohibitively expensive computations, given the size of modern relational data sets. Network structure indices (NSIs) are a proven technique for indexing network structure and efficiently finding short paths. We show how incorporating NSIs into these graph clustering algorithms can overcome these complexity limitations. We also present promising quantitative and qualitative evaluations of the modified algorithms on synthetic and real data sets.


ICML Conference 2003 Conference Paper

Avoiding Bias when Aggregating Relational Data with Degree Disparity

David D. Jensen
Jennifer Neville
Michael Hay


ICML Conference 2003 Conference Paper

Identifying Predictive Structures in Relational Data Using Multiple Instance Learning

Amy McGovern
David D. Jensen


ICML Conference 2002 Conference Paper

Linkage and Autocorrelation Cause Feature Selection Bias in Relational Learning

David D. Jensen
Jennifer Neville


ICML Conference 1997 Conference Paper

The Effects of Training Set Size on Decision Tree Complexity

Tim Oates
David D. Jensen
