Author name cluster

Alexander Gray

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

25 papers

2 author rows

NeurIPS Conference 2025 Conference Paper

Transformers Learn Faster with Semantic Focus

Parikshit Ram
Kenneth Clarkson
Tim Klinger
Shashanka Ubaru
Alexander Gray

Various forms of sparse attention have been explored to mitigate the quadratic computational and memory cost of the attention mechanism in transformers. We study sparse transformers not through a lens of efficiency but rather in terms of learnability and generalization. Empirically studying a range of attention mechanisms, we find that input-dependent sparse attention models appear to converge faster and generalize better than standard attention models, while input-agnostic sparse attention models show no such benefits -- a phenomenon that is robust across architectural and optimization hyperparameter choices. This can be interpreted as demonstrating that concentrating a model's "semantic focus" with respect to the tokens currently being considered (in the form of input-dependent sparse attention) accelerates learning. We develop a theoretical characterization of the conditions that explain this behavior. We establish a connection between the stability of the standard softmax and the loss function's Lipschitz properties, then show how sparsity affects the stability of the softmax and the subsequent convergence and generalization guarantees resulting from the attention mechanism. This allows us to theoretically establish that input-agnostic sparse attention does not provide any benefits. We also characterize conditions when semantic focus (input-dependent sparse attention) can provide improved guarantees, and we validate that these conditions are in fact met in our empirical evaluations.

NeurIPS Conference 2024 Conference Paper

Abductive Reasoning in Logical Credal Networks

Radu Marinescu
Junkyu Lee
Debarun Bhattacharjya
Fabio Cozman
Alexander Gray

Logical Credal Networks or LCNs were recently introduced as a powerful probabilistic logic framework for representing and reasoning with imprecise knowledge. Unlike many existing formalisms, LCNs have the ability to represent cycles and allow specifying marginal and conditional probability bounds on logic formulae which may be important in many realistic scenarios. Previous work on LCNs has focused exclusively on marginal inference, namely computing posterior lower and upper probability bounds on a query formula. In this paper, we explore abductive reasoning tasks such as solving MAP and Marginal MAP queries in LCNs given some evidence. We first formally define the MAP and Marginal MAP tasks for LCNs and subsequently show how to solve these tasks exactly using search-based approaches. We then propose several approximate schemes that allow us to scale MAP and Marginal MAP inference to larger problem instances. An extensive empirical evaluation demonstrates the effectiveness of our algorithms on both random LCN instances as well as LCNs derived from more realistic use-cases.

PDF Details DOI

NeurIPS Conference 2024 Conference Paper

LogiCity: Advancing Neuro-Symbolic AI with Abstract Urban Simulation

Bowen Li
Zhaoyu Li
Qiwei Du
Jinqi Luo
Wenshan Wang
Yaqi Xie
Simon Stepputtis
Chen Wang

Recent years have witnessed the rapid development of Neuro-Symbolic (NeSy) AI systems, which integrate symbolic reasoning into deep neural networks. However, most of the existing benchmarks for NeSy AI fail to provide long-horizon reasoning tasks with complex multi-agent interactions. Furthermore, they are usually constrained by fixed and simplistic logical rules over limited entities, making them far from real-world complexities. To address these crucial gaps, we introduce LogiCity, the first simulator based on customizable first-order logic (FOL) for an urban-like environment with multiple dynamic agents. LogiCity models diverse urban elements using semantic and spatial concepts, such as $\texttt{IsAmbulance}(\texttt{X})$ and $\texttt{IsClose}(\texttt{X}, \texttt{Y})$. These concepts are used to define FOL rules that govern the behavior of various agents. Since the concepts and rules are abstractions, they can be universally applied to cities with any agent compositions, facilitating the instantiation of diverse scenarios. Besides, a key feature of LogiCity is its support for user-configurable abstractions, enabling customizable simulation complexities for logical reasoning. To explore various aspects of NeSy AI, LogiCity introduces two tasks, one features long-horizon sequential decision-making, and the other focuses on one-step visual reasoning, varying in difficulty and agent behaviors. Our extensive evaluation reveals the advantage of NeSy frameworks in abstract reasoning. Moreover, we highlight the significant challenges of handling more complex abstractions in long-horizon multi-agent scenarios or under high-dimensional, imbalanced data. With its flexible design, various features, and newly raised challenges, we believe LogiCity represents a pivotal step forward in advancing the next generation of NeSy AI. All the code and data are open-sourced at our website.

PDF Details DOI

IJCAI Conference 2023 Conference Paper

Approximate Inference in Logical Credal Networks

Radu Marinescu
Haifeng Qian
Alexander Gray
Debarun Bhattacharjya
Francisco Barahona
Tian Gao
Ryan Riegel

The Logical Credal Network or LCN is a recent probabilistic logic designed for effective aggregation and reasoning over multiple sources of imprecise knowledge. An LCN specifies a set of probability distributions over all interpretations of a set of logical formulas for which marginal and conditional probability bounds on their truth values are known. Inference in LCNs involves the exact solution of a non-convex non-linear program defined over an exponentially large number of non-negative real valued variables and, therefore, is limited to relatively small problems. In this paper, we present ARIEL -- a novel iterative message-passing scheme for approximate inference in LCNs. Inspired by classical belief propagation for graphical models, our method propagates messages that involve solving considerably smaller local non-linear programs. Experiments on several classes of LCNs demonstrate clearly that ARIEL yields high quality solutions compared with exact inference and scales to much larger problems than previously considered.

PDF Details DOI

NeurIPS Conference 2023 Conference Paper

Credal Marginal MAP

Radu Marinescu
Debarun Bhattacharjya
Junkyu Lee
Fabio Cozman
Alexander Gray

Credal networks extend Bayesian networks to allow for imprecision in probability values. Marginal MAP is a widely applicable mixed inference task that identifies the most likely assignment for a subset of variables (called MAP variables). However, the task is extremely difficult to solve in credal networks particularly because the evaluation of each complete MAP assignment involves exact likelihood computations (combinatorial sums) over the vertices of a complex joint credal set representing the space of all possible marginal distributions of the MAP variables. In this paper, we explore Credal Marginal MAP inference and develop new exact methods based on variable elimination and depth-first search as well as several approximation schemes based on the mini-bucket partitioning and stochastic local search. An extensive empirical evaluation demonstrates the effectiveness of our new methods on random as well as real-world benchmark problems.

NeurIPS Conference 2022 Conference Paper

Logical Credal Networks

Radu Marinescu
Haifeng Qian
Alexander Gray
Debarun Bhattacharjya
Francisco Barahona
Tian Gao
Ryan Riegel
Pravinda Sahu

We introduce Logical Credal Networks (or LCNs for short) -- an expressive probabilistic logic that generalizes prior formalisms that combine logic and probability. Given imprecise information represented by probability bounds and conditional probability bounds on logic formulas, an LCN specifies a set of probability distributions over all its interpretations. Our approach allows propositional and first-order logic formulas with few restrictions, e. g. , without requiring acyclicity. We also define a generalized Markov condition that allows us to identify implicit independence relations between atomic formulas. We evaluate our method on benchmark problems such as random networks, Mastermind games with uncertainty and credit card fraud detection. Our results show that the LCN outperforms existing approaches; its advantage lies in aggregating multiple sources of imprecise information.

AAAI Conference 2022 Conference Paper

Neuro-Symbolic Inductive Logic Programming with Logical Neural Networks

Prithviraj Sen
Breno W. S. R. de Carvalho
Ryan Riegel
Alexander Gray

Recent work on neuro-symbolic inductive logic programming has led to promising approaches that can learn explanatory rules from noisy, real-world data. While some proposals approximate logical operators with differentiable operators from fuzzy or real-valued logic that are parameter-free thus diminishing their capacity to fit the data, other approaches are only loosely based on logic making it difficult to interpret the learned “rules”. In this paper, we propose learning rules with the recently proposed logical neural networks (LNN). Compared to others, LNNs offer a strong connection to classical Boolean logic thus allowing for precise interpretation of learned rules while harboring parameters that can be trained with gradient-based optimization to effectively fit the data. We extend LNNs to induce rules in first-order logic. Our experiments on standard benchmarking tasks confirm that LNN rules are highly interpretable and can achieve comparable or higher accuracy due to their flexible parameterization.

AAAI Conference 2021 System Paper

A Semantic Parsing and Reasoning-Based Approach to Knowledge Base Question Answering

Ibrahim Abdelaziz
Srinivas Ravishankar
Pavan Kapanipathi
Salim Roukos
Alexander Gray

Knowledge Base Question Answering (KBQA) is a task where existing techniques have faced significant challenges, such as the need for complex question understanding, reasoning, and large training datasets. In this work, we demonstrate Deep Thinking Question Answering (DTQA), a semantic parsing and reasoning-based KBQA system. DTQA (1) integrates multiple, reusable modules that are trained specifically for their individual tasks (e. g. semantic parsing, entity linking, and relationship linking), eliminating the need for end-to-end KBQA training data; (2) leverages semantic parsing and a reasoner for improved question understanding. DTQA is a system of systems that achieves state-of-the-art performance on two popular KBQA datasets.

AAAI Conference 2020 Conference Paper

An ADMM Based Framework for AutoML Pipeline Configuration

Sijia Liu
Parikshit Ram
Deepak Vijaykeerthy
Djallel Bouneffouf
Gregory Bramble
Horst Samulowitz
Dakuo Wang
Andrew Conn

We study the AutoML problem of automatically conﬁguring machine learning pipelines by jointly selecting algorithms and their appropriate hyper-parameters for all steps in supervised learning pipelines. This black-box (gradient-free) optimization with mixed integer & continuous variables is a challenging problem. We propose a novel AutoML scheme by leveraging the alternating direction method of multipliers (ADMM). The proposed framework is able to (i) decompose the optimization problem into easier sub-problems that have a reduced number of variables and circumvent the challenge of mixed variable categories, and (ii) incorporate black-box constraints alongside the black-box optimization objective. We empirically evaluate the ﬂexibility (in utilizing existing AutoML techniques), effectiveness (against open source AutoML toolkits), and unique capability (of executing AutoML with practically motivated black-box constraints) of our proposed scheme on a collection of binary classiﬁcation data sets from UCI ML & OpenML repositories. We observe that on an average our framework provides signiﬁcant gains in comparison to other AutoML frameworks (Auto-sklearn & TPOT), highlighting the practical advantages of this framework.

NeurIPS Conference 2013 Conference Paper

Which Space Partitioning Tree to Use for Search?

Parikshit Ram
Alexander Gray

We consider the task of nearest-neighbor search with the class of binary-space-partitioning trees, which includes kd-trees, principal axis trees and random projection trees, and try to rigorously answer the question which tree to use for nearest-neighbor search? '' To this end, we present the theoretical results which imply that trees with better vector quantization performance have better search performance guarantees. We also explore another factor affecting the search performance -- margins of the partitions in these trees. We demonstrate, both theoretically and empirically, that large margin partitions can improve the search performance of a space-partitioning tree. "

NeurIPS Conference 2012 Conference Paper

Minimax Multi-Task Learning and a Generalized Loss-Compositional Paradigm for MTL

Nishant Mehta
Dongryeol Lee
Alexander Gray

Since its inception, the modus operandi of multi-task learning (MTL) has been to minimize the task-wise mean of the empirical risks. We introduce a generalized loss-compositional paradigm for MTL that includes a spectrum of formulations as a subfamily. One endpoint of this spectrum is minimax MTL: a new MTL formulation that minimizes the maximum of the tasks' empirical risks. Via a certain relaxation of minimax MTL, we obtain a continuum of MTL formulations spanning minimax MTL and classical MTL. The full paradigm itself is loss-compositional, operating on the vector of empirical risks. It incorporates minimax MTL, its relaxations, and many new MTL formulations as special cases. We show theoretically that minimax MTL tends to avoid worst case outcomes on newly drawn test tasks in the learning to learn (LTL) test setting. The results of several MTL formulations on synthetic and real problems in the MTL and LTL test settings are encouraging.

NeurIPS Conference 2009 Conference Paper

Linear-time Algorithms for Pairwise Statistical Problems

Parikshit Ram
Dongryeol Lee
William March
Alexander Gray

Several key computational bottlenecks in machine learning involve pairwise distance computations, including all-nearest-neighbors (finding the nearest neighbor(s) for each point, e. g. in manifold learning) and kernel summations (e. g. in kernel density estimation or kernel machines). We consider the general, bichromatic case for these problems, in addition to the scientific problem of N-body potential calculation. In this paper we show for the first time O(N) worst case runtimes for practical algorithms for these problems based on the cover tree data structure (Beygelzimer, Kakade, Langford, 2006).

NeurIPS Conference 2009 Conference Paper

Rank-Approximate Nearest Neighbor Search: Retaining Meaning and Speed in High Dimensions

Parikshit Ram
Dongryeol Lee
Hua Ouyang
Alexander Gray

The long-standing problem of efficient nearest-neighbor (NN) search has ubiquitous applications ranging from astrophysics to MP3 fingerprinting to bioinformatics to movie recommendations. As the dimensionality of the dataset increases, exact NN search becomes computationally prohibitive; (1+eps)-distance-approximate NN search can provide large speedups but risks losing the meaning of NN search present in the ranks (ordering) of the distances. This paper presents a simple, practical algorithm allowing the user to, for the first time, directly control the true accuracy of NN search (in terms of ranks) while still achieving the large speedups over exact NN. Experiments with high-dimensional datasets show that it often achieves faster and more accurate results than the best-known distance-approximate method, with much more stable behavior.

NeurIPS Conference 2009 Conference Paper

Submanifold density estimation

Arkadas Ozakin
Alexander Gray

Kernel density estimation is the most widely-used practical method for accurate nonparametric density estimation. However, long-standing worst-case theoretical results showing that its performance worsens exponentially with the dimension of the data have quashed its application to modern high-dimensional datasets for decades. In practice, it has been recognized that often such data have a much lower-dimensional intrinsic structure. We propose a small modification to kernel density estimation for estimating probability density functions on Riemannian submanifolds of Euclidean space. Using ideas from Riemannian geometry, we prove the consistency of this modified estimator and show that the convergence rate is determined by the intrinsic dimension of the submanifold. We conclude with empirical results demonstrating the behavior predicted by our theory.

NeurIPS Conference 2008 Conference Paper

Fast High-dimensional Kernel Summations Using the Monte Carlo Multipole Method

Dongryeol Lee
Alexander Gray

We propose a new fast Gaussian summation algorithm for high-dimensional datasets with high accuracy. First, we extend the original fast multipole-type methods to use approximation schemes with both hard and probabilistic error. Second, we utilize a new data structure called subspace tree which maps each data point in the node to its lower dimensional mapping as determined by any linear dimension reduction method such as PCA. This new data structure is suitable for reducing the cost of each pairwise distance computation, the most dominant cost in many kernel methods. Our algorithm guarantees probabilistic relative error on each kernel sum, and can be applied to high-dimensional Gaussian summations which are ubiquitous inside many kernel methods as the key computational bottleneck. We provide empirical speedup results on low to high-dimensional datasets up to 89 dimensions.

NeurIPS Conference 2008 Conference Paper

QUIC-SVD: Fast SVD Using Cosine Trees

Michael Holmes
Isbell
Charles Lee
Alexander Gray

The Singular Value Decomposition is a key operation in many machine learning methods. Its computational cost, however, makes it unscalable and impractical for the massive-sized datasets becoming common in applications. We present a new method, QUIC-SVD, for fast approximation of the full SVD with automatic sample size minimization and empirical relative error control. Previous Monte Carlo approaches have not addressed the full SVD nor benefited from the efficiency of automatic, empirically-driven sample sizing. Our empirical tests show speedups of several orders of magnitude over exact SVD. Such scalability should enable QUIC-SVD to meet the needs of a wide array of methods and applications.

NeurIPS Conference 2007 Conference Paper

Ultrafast Monte Carlo for Statistical Summations

Charles Isbell
Michael Holmes
Alexander Gray

Machine learning contains many computational bottlenecks in the form of nested summations over datasets. Kernel estimators and other methods are burdened by these expensive computations. Exact evaluation is typically O(n2 ) or higher, which severely limits application to large datasets. We present a multi-stage stratified Monte Carlo method for approximating such summations with probabilistic relative error control. The essential idea is fast approximation by sampling in trees. This method differs from many previous scalability techniques (such as standard multi-tree methods) in that its error is stochastic, but we derive conditions for error control and demonstrate that they work. Further, we give a theoretical sample complexity for the method that is independent of dataset size, and show that this appears to hold in experiments, where speedups reach as high as 1014, many orders of magnitude beyond the previous state of the art.

JMLR Journal 2006 Journal Article

New Algorithms for Efficient High-Dimensional Nonparametric Classification

Ting Liu
Andrew W. Moore
Alexander Gray

This paper is about non-approximate acceleration of high-dimensional nonparametric operations such as k nearest neighbor classifiers. We attempt to exploit the fact that even if we want exact answers to nonparametric queries, we usually do not need to explicitly find the data points close to the query, but merely need to answer questions about the properties of that set of data points. This offers a small amount of computational leeway, and we investigate how much that leeway can be exploited. This is applicable to many algorithms in nonparametric statistics, memory-based learning and kernel-based learning. But for clarity, this paper concentrates on pure k -NN classification. We introduce new ball-tree algorithms that on real-world data sets give accelerations from 2-fold to 100-fold compared against highly optimized traditional ball-tree-based k -NN. These results include data sets with up to 10 6 dimensions and 10 5 records, and demonstrate non-trivial speed-ups while giving exact answers. [abs] [ pdf ][ bib ] &copy JMLR 2006. ( edit, beta )

NeurIPS Conference 2005 Conference Paper

Dual-Tree Fast Gauss Transforms

Dongryeol Lee
Andrew Moore
Alexander Gray

In previous work we presented an efﬁcient approach to computing ker- nel summations which arise in many machine learning methods such as kernel density estimation. This approach, dual-tree recursion with ﬁnite- difference approximation, generalized existing methods for similar prob- lems arising in computational physics in two ways appropriate for sta- tistical problems: toward distribution sensitivity and general dimension, partly by avoiding series expansions. While this proved to be the fastest practical method for multivariate kernel density estimation at the optimal bandwidth, it is much less efﬁcient at larger-than-optimal bandwidths. In this work, we explore the extent to which the dual-tree approach can be integrated with multipole-like Hermite expansions in order to achieve reasonable efﬁciency across all bandwidth scales, though only for low di- mensionalities. In the process, we derive and demonstrate the ﬁrst truly hierarchical fast Gauss transforms, effectively combining the best tools from discrete algorithms and continuous approximation theory. 1 Fast Gaussian Summation Kernel summations are fundamental in both statistics/learning and computational physics.

NeurIPS Conference 2004 Conference Paper

An Investigation of Practical Approximate Nearest Neighbor Algorithms

Ting Liu
Andrew Moore
Ke Yang
Alexander Gray

This paper concerns approximate nearest neighbor searching algorithms, which have become increasingly important, especially in high dimen- sional perception areas such as computer vision, with dozens of publica- tions in recent years. Much of this enthusiasm is due to a successful new approximate nearest neighbor approach called Locality Sensitive Hash- ing (LSH). In this paper we ask the question: can earlier spatial data structure approaches to exact nearest neighbor, such as metric trees, be altered to provide approximate answers to proximity queries and if so, how? We introduce a new kind of metric tree that allows overlap: certain datapoints may appear in both the children of a parent. We also intro- duce new approximate k-NN search algorithms on this structure. We show why these structures should be able to exploit the same random- projection-based approximations that LSH enjoys, but with a simpler al- gorithm and perhaps with greater efficiency. We then provide a detailed empirical evaluation on five large, high dimensional datasets which show up to 31-fold accelerations over LSH. This result holds true throughout the spectrum of approximation levels.

NeurIPS Conference 2003 Conference Paper

New Algorithms for Efficient High Dimensional Non-parametric Classification

Ting Liu
Andrew Moore
Alexander Gray

Alexander Gray Computer Science Dept. Carnegie Mellon University Pittsburgh, PA 15213 agray@cs. cmu. edu This paper is about non-approximate acceleration of high dimensional nonparametric operations such as k nearest neighbor classiﬁers and the prediction phase of Support Vector Machine classiﬁers. We attempt to exploit the fact that even if we want exact answers to nonparametric queries, we usually do not need to explicitly ﬁnd the datapoints close to the query, but merely need to ask questions about the properties about that set of datapoints. This offers a small amount of computational lee- way, and we investigate how much that leeway can be exploited. For clarity, this paper concentrates on pure k-NN classiﬁcation and the pre- diction phase of SVMs. We introduce new ball tree algorithms that on real-world datasets give accelerations of 2-fold up to 100-fold compared against highly optimized traditional ball-tree-based k-NN. These results include datasets with up to 106 dimensions and 105 records, and show non-trivial speedups while giving exact answers.

NeurIPS Conference 2002 Conference Paper

Automatic Derivation of Statistical Algorithms: The EM Family and Beyond

Bernd Fischer
Johann Schumann
Wray Buntine
Alexander Gray

Machine learning has reached a point where many probabilistic meth- ods can be understood as variations, extensions and combinations of a much smaller set of abstract themes, e. g. , as different instances of the EM algorithm. This enables the systematic derivation of algorithms cus- tomized for different models. Here, we describe the AUTO BAYES sys- tem which takes a high-level statistical model speciﬁcation, uses power- ful symbolic techniques based on schema-based program synthesis and computer algebra to derive an efﬁcient specialized algorithm for learning that model, and generates executable code implementing that algorithm. This capability is far beyond that of code collections such as Matlab tool- boxes or even tools for model-independent optimization such as BUGS for Gibbs sampling: complex new algorithms can be generated with- out new programming, algorithms can be highly specialized and tightly crafted for the exact structure of the model and data, and efﬁcient and commented code can be generated for different languages or systems. We present automatically-derived algorithms ranging from closed-form solutions of Bayesian textbook problems to recently-proposed EM algo- rithms for clustering, regression, and a multinomial form of PCA. 1 Automatic Derivation of Statistical Algorithms Overview. We describe a symbolic program synthesis system which works as a “statistical algorithm compiler: ” it compiles a statistical model speciﬁcation into a custom algorithm design and from that further down into a working program implementing the algorithm design. This system, AUTOBAYES, can be loosely thought of as “part theorem prover, part Mathematica, part learning textbook, and part Numerical Recipes. ” It provides much more ﬂexibility than a ﬁxed code repository such as a Matlab toolbox, and allows the creation of efﬁcient algorithms which have never before been implemented, or even written down. AUTOBAYES is intended to automate the more routine application of complex methods in novel contexts. For example, recent multinomial extensions to PCA [2, 4] can be derived in this way. The algorithm design problem. Given a dataset and a task, creating a learning method can be characterized by two main questions: 1. What is the model? 2. What algorithm will optimize the model parameters? The statistical algorithm (i. e. , a parameter optimization algorithm for the statistical model) can then be implemented manually. The system in this paper answers the algorithm question given that the user has chosen a model for the data, and continues through to implementation. Performing this task at the state-of-the-art level requires an intertwined meld of probability theory, computational mathematics, and software engineering. However, a number of factors unite to allow us to solve the algorithm design problem computationally: 1. The existence of fundamental building blocks (e. g. , standardized probability distributions, standard optimization procedures, and generic data structures). 2. The existence of common representations (i. e. , graphical models [3, 13] and program schemas). 3. The formalization of schema applicability constraints as guards. 1 The challenges of algorithm design. The design problem has an inherently combinatorial nature, since subparts of a function may be optimized recursively and in different ways. It also involves the use of new data structures or approximations to gain performance. As the research in statistical algorithms advances, its creative focus should move beyond the ultimately mechanical aspects and towards extending the abstract applicability of already existing schemas (algorithmic principles like EM), improving schemas in ways that gener- alize across anything they can be applied to, and inventing radically new schemas. 2 Combining Schema-based Synthesis and Bayesian Networks with 0 < n_points; 1 model mog as ’Mixture of Gaussians’; with 0 < n classes with n classes << n_points; with 1 = sum(I: = 1. .n_classes, phi(I)); 7 double phi(1. .n classes) as ’weights’ 8 9 double mu(1. .n classes); 9 double sigma(1. .n_classes); 2 const int n points as ’nr. of data points’ 3 4 const int n classes: = 3 as ’nr. classes’ 5 6 Statistical Models. Externally, AUTOBAYES has the look and feel of a compiler. Users specify their model of interest in a high-level speciﬁcation language (as opposed to a program- ming language). The ﬁgure shows the speciﬁcation of the mixture of Gaus- sians example used throughout this paper. 2 Note the constraint that the sum of the class probabilities must equal one (line 8) along with others (lines 3 and 5) that make optimization of the model well-deﬁned. Also note the ability to specify assumptions of the kind in line 6, which may be used by some algorithms. The last line speciﬁes the goal 10 int c(1. .n points) as ’class labels’; 11 c ˜ disc(vec(I: = 1. .n classes, phi(I))); 12 data double x(1. .n_points) as ’data’; 13 x(I) ˜ gauss(mu(c(I)), sigma(c(I))); 14 max pr(x| phi, mu, sigma ) wrt phi, mu, sigma; inference task: maximize the conditional probability pr rameters

NeurIPS Conference 2000 Conference Paper

`N-Body' Problems in Statistical Learning

Alexander Gray
Andrew Moore

We present efficient algorithms for all-point-pairs problems, or 'N(cid: 173) body '-like problems, which are ubiquitous in statistical learning. We focus on six examples, including nearest-neighbor classification, kernel density estimation, outlier detection, and the two-point correlation. These include any problem which abstractly requires a comparison of each of the N points in a dataset with each other point and would naively be solved using N 2 distance computations. In practice N is often large enough to make this infeasible. We present a suite of new geometric t echniques which are applicable in principle to any 'N-body' computation including large-scale mixtures of Gaussians, RBF neural networks, and HMM 's. Our algorithms exhibit favorable asymptotic scaling and are empirically several orders of magnitude faster than the naive computation, even for small datasets. We are aware of no exact algorithms for these problems which are more effi(cid: 173) cient either empirically or theoretically. In addition, our framework yields simple and elegant algorithms. It also permits two important generalizations beyond the standard all-point-pairs problems, which are more difficult. These are represented by our final examples, the multiple two-point correlation and the notorious n-point correlation.

AAAI Conference 1999 Conference Paper

An Integrated System for Multi-Rover Scientific Exploration

Tara Estlin
Alexander Gray
Tobias Mann
Gregg Rabideau
Rebecca Castaño
Steve Chien
Eric Mjolsness
Jet Propulsion Laboratory

This paper describes an integrated system for coordinating multiple rover behavior with the overall goal of collecting planetary surface data. TheMulti- P~overIntegrated Science UnderstandingSystemcombines concepts from machine learning with planning and scheduling to perform autonomous scientific exploration by cooperating rovers. Theintegrated system utilizes a novel machinelearning clustering component to analyze science data and direct newscience activities. Aplanning and scheduling systemis employedto generate rover plans for achievingscience goals andto coordinate activities amongrovers. Wedescribe each of these components and discuss someof the key integration issues that arose during developmentand influenced both systemdesign and performance.

ICRA Conference 1988 Conference Paper

FLAIR-robotic printed circuit board assembly workcell

Sidney Liebes
William Gong
Alexander Gray
Henrique A. S. Martins
Bruce Modesitt
Richard Tella

The FLAIR (flexible assembly intelligent robot) workcell loads odd-shaped through-hole and surface-mount components onto printed circuit boards. It is 100% data driven, handles diverse components and boards in arbitrary batch sizes, uses computer vision for viewing components and boards, and incorporates sensor-based error detection and recovery, generic parts feeders, general purpose lead straightening, and lead clinching. It automatically calibrates cameras and robot. FLAIR functions either stand alone or are integrated into an assembly line. >