Arrow Research search

Author name cluster

Gary Doran

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers
2 author rows

Possible papers

IJCAI Conference 2016 Conference Paper

A Unifying Framework for Learning Bag Labels from Generalized Multiple-Instance Data

  • Gary Doran
  • Andrew Latham
  • Soumya Ray

We study the problem of bag-level classification from generalized multiple-instance (GMI) data. GMI learning is an extension of the popular multiple-instance setting. In GMI data, bags are labeled positive if they contain instances of certain types, and avoid instances of other types. For example, an image of a "sunny beach" should contain sand and sea, but not clouds. We formulate a novel generative process for the GMI setting in which bags are distributions over instances. In this model, we show that a broad class of distribution-distance kernels is sufficient to represent arbitrary GMI concepts. Further, we show that a variety of previously proposed kernel approaches to the standard MI and GMI settings can be unified under the distribution kernel framework. We perform an extensive empirical study which indicates that the family of distribution-distance kernels is accurate for a wide variety of real-world MI and GMI tasks as well as efficient when compared to a large set of baselines. Our theoretical and empirical results indicate that distribution-distance kernels can serve as a unifying framework for learning bag labels from GMI (and therefore MI) problems.
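To make the idea of a distribution-distance kernel concrete, here is a minimal sketch that treats each bag as a sample from a distribution and compares two bags through the squared maximum mean discrepancy (MMD) of their empirical kernel mean embeddings. The abstract does not say which distance or kernel the authors use; the RBF instance kernel, the MMD distance, and the parameter names `gamma` and `theta` are all assumptions chosen for illustration.

```python
import math

def rbf(x, y, gamma=1.0):
    """RBF kernel between two instances given as tuples of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def mean_embedding_inner(bag_a, bag_b, gamma=1.0):
    """Average pairwise instance kernel, i.e. the inner product of the
    two bags' empirical kernel mean embeddings."""
    total = sum(rbf(x, y, gamma) for x in bag_a for y in bag_b)
    return total / (len(bag_a) * len(bag_b))

def squared_mmd(bag_a, bag_b, gamma=1.0):
    """Squared MMD between two bags (biased empirical estimate)."""
    return (mean_embedding_inner(bag_a, bag_a, gamma)
            - 2.0 * mean_embedding_inner(bag_a, bag_b, gamma)
            + mean_embedding_inner(bag_b, bag_b, gamma))

def bag_kernel(bag_a, bag_b, gamma=1.0, theta=1.0):
    """Hypothetical distribution-distance kernel: an RBF applied to the
    distance between bags rather than between instances."""
    # Clamp at zero to guard against tiny negative rounding errors.
    return math.exp(-theta * max(squared_mmd(bag_a, bag_b, gamma), 0.0))

beach = [(0.9, 0.1), (0.8, 0.2)]   # toy 2-D instance features
cloudy = [(0.1, 0.9), (0.2, 0.7), (0.0, 1.0)]
similarity = bag_kernel(beach, cloudy)
```

A bag-level Gram matrix built from `bag_kernel` could then be passed to any kernel classifier that accepts precomputed kernels. Bags may have different sizes, because the kernel only depends on averages over instance pairs.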

JMLR Journal 2016 Journal Article

Multiple-Instance Learning from Distributions

  • Gary Doran
  • Soumya Ray

We propose a new theoretical framework for analyzing the multiple-instance learning (MIL) setting. In MIL, training examples are provided to a learning algorithm in the form of labeled sets, or "bags," of instances. Applications of MIL include 3-D quantitative structure-activity relationship prediction for drug discovery and content-based image retrieval for web search. The goal of an algorithm is to learn a function that correctly labels new bags or a function that correctly labels new instances. We propose that bags should be treated as latent distributions from which samples are observed. We show that it is possible to learn accurate instance- and bag-labeling functions in this setting as well as functions that correctly rank bags or instances under weak assumptions. Additionally, our theoretical results suggest that it is possible to learn to rank efficiently using traditional, well-studied "supervised" learning approaches. We perform an extensive empirical evaluation that supports the theoretical predictions entailed by the new framework. The proposed theoretical framework leads to a better understanding of the relationship between the MI and standard supervised learning settings, and it provides new methods for learning from MI data that are more accurate, more efficient, and have better understood theoretical properties than existing MI-specific algorithms.

IJCAI Conference 2016 Conference Paper

Precision Instrument Targeting via Image Registration for the Mars 2020 Rover

  • Gary Doran
  • David R. Thompson
  • Tara Estlin

A key component of Mars exploration is the operation of robotic instruments on the surface, such as those on board the Mars Exploration Rovers, the Mars Science Laboratory (MSL), and the planned Mars 2020 Rover. As the instruments carried by these rovers have become more advanced, the area targeted by some instruments has become smaller, revealing more fine-grained details about the geology and chemistry of rocks on the surface. However, thermal fluctuations, rover settling or slipping, and inherent inaccuracies in pointing mechanisms all lead to pointing error that is on the order of the target size (several millimeters) or larger. We show that given a target located on a previously acquired image, the rover can align this with a new image to visually locate the target and refine the current pointing. Due to round-trip communication constraints, this visual targeting must be done efficiently on board the rover using relatively limited computing hardware. We employ existing ORB features for landmark-based image registration, describe and theoretically justify a novel approach to filtering false landmark matches, and employ a random forest classifier to automatically reject failed alignments. We demonstrate the efficacy of our approach using over 3,800 images acquired by the Remote Micro-Imager on board the Curiosity rover.

UAI Conference 2014 Conference Paper

A Permutation-Based Kernel Conditional Independence Test

  • Gary Doran
  • Krikamol Muandet
  • Kun Zhang
  • Bernhard Schölkopf

Determining conditional independence (CI) relationships between random variables is a challenging but important task for problems such as Bayesian network learning and causal discovery. We propose a new kernel CI test that uses a single, learned permutation to convert the CI test problem into an easier two-sample test problem. The learned permutation leaves the joint distribution unchanged if and only if the null hypothesis of CI holds. Then, a kernel two-sample test, which has been studied extensively in prior work, can be applied to a permuted and an unpermuted sample to test for CI. We demonstrate that the test (1) easily allows the incorporation of prior knowledge during the permutation step, (2) has power competitive with state-of-the-art kernel CI tests, and (3) accurately estimates the null distribution of the test statistic, even as the dimensionality of the conditioning variable grows.
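The two-sample step mentioned in the abstract can be sketched as follows. This is only an illustration of a generic kernel two-sample test based on the squared maximum mean discrepancy (MMD); it does not implement the learned permutation that the paper uses to turn the CI problem into a two-sample problem. The RBF kernel, the function names, and the use of random pooled shuffles to calibrate the p-value are assumptions, and those random shuffles should not be confused with the paper's single learned permutation.

```python
import math
import random

def rbf(x, y, gamma=1.0):
    """RBF kernel between two points given as tuples of floats."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of the squared MMD between two samples."""
    def avg(p, q):
        return sum(rbf(a, b, gamma) for a in p for b in q) / (len(p) * len(q))
    return avg(xs, xs) - 2.0 * avg(xs, ys) + avg(ys, ys)

def two_sample_p_value(xs, ys, n_shuffles=200, gamma=1.0, seed=0):
    """Estimate a p-value for H0: both samples share one distribution.

    The null distribution of the statistic is approximated by randomly
    re-splitting the pooled sample (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    observed = mmd2(xs, ys, gamma)
    pooled = list(xs) + list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        if mmd2(pooled[:len(xs)], pooled[len(xs):], gamma) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_shuffles + 1)

# Two clearly separated toy samples in one dimension.
sample_a = [(0.1 * i,) for i in range(10)]
sample_b = [(5.0 + 0.1 * i,) for i in range(10)]
p_value = two_sample_p_value(sample_a, sample_b, n_shuffles=100)
```

In the setting the abstract describes, `sample_a` would be the original joint sample and `sample_b` the sample after applying the learned permutation, so a small p-value would count as evidence against conditional independence.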

AAAI Conference 2014 Conference Paper

Learning Instance Concepts from Multiple-Instance Data with Bags as Distributions

  • Gary Doran
  • Soumya Ray

We analyze and evaluate a generative process for multiple-instance learning (MIL) in which bags are distributions over instances. We show that our generative process contains as special cases generative models explored in prior work, while excluding scenarios known to be hard for MIL. Further, under the mild assumption that every negative instance is observed with nonzero probability in some negative bag, we show that it is possible to learn concepts that accurately label instances from MI data in this setting. Finally, we show that standard supervised approaches can learn concepts with low area-under-ROC error from MI data in this setting. We validate this surprising result with experiments using several synthetic and real-world MI datasets that have been annotated with instance labels.
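One way to read the claim about standard supervised approaches is that each instance simply inherits the label of its bag, an ordinary classifier is trained on the flattened data, and its scores are judged by area under the ROC curve against the true instance labels. The sketch below shows only the flattening step and the AUC computation; the function names are hypothetical, and the abstract does not confirm that this is exactly the authors' procedure.

```python
def flatten_bags(bags, bag_labels):
    """Give every instance the label of the bag it came from."""
    return [(inst, label)
            for bag, label in zip(bags, bag_labels)
            for inst in bag]

def auc(scores, labels):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly, counting ties as half."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

bags = [[(0.9,), (0.1,)], [(0.2,)]]          # toy 1-D instances
training_set = flatten_bags(bags, [1, 0])     # bag labels copied to instances
ranking_quality = auc([0.9, 0.8, 0.1], [1, 1, 0])
```

Because AUC depends only on how instances are ranked, it tolerates the label noise introduced by negative instances that sit in positive bags better than plain accuracy does.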

AAAI Conference 2013 Conference Paper

SMILe: Shuffled Multiple-Instance Learning

  • Gary Doran
  • Soumya Ray

Resampling techniques such as bagging are often used in supervised learning to produce more accurate classifiers. In this work, we show that multiple-instance learning admits a different form of resampling, which we call “shuffling.” In shuffling, we resample instances in such a way that the resulting bags are likely to be correctly labeled. We show that resampling results in both a reduction of bag label noise and a propagation of additional informative constraints to a multiple-instance classifier. We empirically evaluate shuffling in the context of multiple-instance classification and multiple-instance active learning and show that the approach leads to significant improvements in accuracy.
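A minimal sketch of one plausible reading of shuffling: pool the instances of all positive bags, then draw new bags from the pool that are large enough to contain a positive instance with high probability. The abstract does not give the sampling scheme or the bag-size rule, so the sampling without replacement, the function names, and the bound below are assumptions. The bound relies on the standard MI assumption that every positive bag contains at least one positive instance.

```python
import math
import random

def prob_no_positive(pool_size, n_positive_bags, shuffled_size):
    """Upper bound on the chance that a shuffled bag has no positive
    instance, assuming each positive bag contributed at least one."""
    max_negatives = pool_size - n_positive_bags
    if shuffled_size > max_negatives:
        return 0.0  # too large to avoid every positive instance
    return (math.comb(max_negatives, shuffled_size)
            / math.comb(pool_size, shuffled_size))

def shuffled_bags(positive_bags, n_new_bags, shuffled_size, seed=0):
    """Draw new positive bags from the pooled positive-bag instances."""
    rng = random.Random(seed)
    pool = [inst for bag in positive_bags for inst in bag]
    # Each new bag is sampled without replacement from the pool.
    return [rng.sample(pool, shuffled_size) for _ in range(n_new_bags)]

positive_bags = [["a1", "a2", "a3"], ["b1", "b2"], ["c1", "c2", "c3"]]
new_bags = shuffled_bags(positive_bags, n_new_bags=4, shuffled_size=6)
noise_bound = prob_no_positive(pool_size=8, n_positive_bags=3, shuffled_size=6)
```

Here the pool holds 8 instances of which at least 3 are positive, so any shuffled bag of 6 instances must include a positive one and `noise_bound` is zero. Smaller bags trade a higher chance of label noise for more varied resamples.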