Author name cluster

Ali Rahimi

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers

2 author rows

ICML Conference 2019 Conference Paper

Faster Algorithms for Binary Matrix Factorization

Ravi Kumar 0001
Rina Panigrahy
Ali Rahimi
David P. Woodruff

We give faster approximation algorithms for well-studied variants of Binary Matrix Factorization (BMF), where we are given a binary $m \times n$ matrix $A$ and would like to find binary rank-$k$ matrices $U, V$ to minimize the Frobenius norm of $U \cdot V - A$. In the first setting, $U \cdot V$ denotes multiplication over $\mathbb{Z}$, and we give a constant-factor approximation algorithm that runs in $2^{O(k^2 \log k)} \textrm{poly}(mn)$ time, improving upon the previous $\min(2^{2^k}, 2^n) \textrm{poly}(mn)$ time. Our techniques generalize to minimizing $\|U \cdot V - A\|_p$ for $p \geq 1$, in $2^{O(k^{\lceil p/2 \rceil + 1}\log k)} \textrm{poly}(mn)$ time. For $p = 1$, this has a graph-theoretic consequence, namely, a $2^{O(k^2)} \textrm{poly}(mn)$-time algorithm to approximate a graph as a union of disjoint bicliques. In the second setting, $U \cdot V$ is over $\mathrm{GF}(2)$, and we give a bicriteria constant-factor approximation algorithm that runs in $2^{O(k^3)} \textrm{poly}(mn)$ time to find binary rank-$O(k \log m)$ matrices $U$, $V$ whose cost is as good as the best rank-$k$ approximation, improving upon $\min(2^{2^k}mn, \min(m, n)^{k^{O(1)}} \textrm{poly}(mn))$ time.
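As a small illustration of the objective described in this abstract (not the paper's algorithm), the sketch below evaluates the cost of a fixed binary factorization in both settings: multiplication over the integers and over GF(2). The function names and the example matrices are hypothetical, and the cost is the squared Frobenius norm, which has the same minimizers as the norm itself.

```python
def binary_product(U, V, over_gf2=False):
    """Multiply binary matrices U (m x k) and V (k x n) over the integers or over GF(2)."""
    product = [[sum(u * v for u, v in zip(row, col)) for col in zip(*V)] for row in U]
    if over_gf2:
        # Over GF(2), addition is XOR, so reduce every entry modulo 2.
        product = [[entry % 2 for entry in row] for row in product]
    return product


def squared_frobenius_cost(U, V, A, over_gf2=False):
    """Squared Frobenius norm of U.V - A (illustrative helper, not from the paper)."""
    P = binary_product(U, V, over_gf2)
    return sum((p - a) ** 2 for p_row, a_row in zip(P, A) for p, a in zip(p_row, a_row))


# Hypothetical 3 x 3 example with inner dimension k = 2.
U = [[1, 0], [1, 1], [0, 1]]
V = [[1, 1, 0], [0, 1, 1]]
A = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]

cost_over_integers = squared_frobenius_cost(U, V, A)
cost_over_gf2 = squared_frobenius_cost(U, V, A, over_gf2=True)
```

In this example the middle entry of the product is 2 over the integers but 0 over GF(2), so the same pair of factors fits $A$ exactly in the second setting and not in the first.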

NeurIPS Conference 2011 Conference Paper

Structure Learning for Optimization

Shulin Yang
Ali Rahimi

We describe a family of global optimization procedures that automatically decompose optimization problems into smaller loosely coupled problems, then combine the solutions of these with message passing algorithms. We show empirically that these methods excel in avoiding local minima and produce better solutions with fewer function evaluations than existing global optimization methods. To develop these methods, we introduce a notion of coupling between variables of optimization that generalizes the notion of coupling that arises from factoring functions into terms that involve small subsets of the variables. It therefore subsumes the notion of independence between random variables in statistics, sparseness of the Hessian in nonlinear optimization, and the generalized distributive law. Despite being more general, this notion of coupling is easier to verify empirically -- making structure estimation easy -- yet it allows us to migrate well-established inference methods on graphical models to the setting of global optimization.

NeurIPS Conference 2010 Conference Paper

Random Conic Pursuit for Semidefinite Programming

Ariel Kleiner
Ali Rahimi
Michael Jordan

We present a novel algorithm, Random Conic Pursuit, that solves semidefinite programs (SDPs) via repeated optimization over randomly selected two-dimensional subcones of the PSD cone. This scheme is simple, easily implemented, applicable to very general SDPs, scalable, and theoretically interesting. Its advantages are realized at the expense of an ability to readily compute highly exact solutions, though useful approximate solutions are easily obtained. This property renders Random Conic Pursuit of particular interest for machine learning applications, in which the relevant SDPs are generally based upon random data and so exact minima are often not a priority. Indeed, we present empirical results to this effect for various SDPs encountered in machine learning; these experiments demonstrate the potential practical usefulness of Random Conic Pursuit. We also provide a preliminary analysis that yields insight into the theoretical properties and convergence of the algorithm.

JMLR Journal 2009 Journal Article

Similarity-based Classification: Concepts and Algorithms

Yihua Chen
Eric K. Garcia
Maya R. Gupta
Ali Rahimi
Luca Cazzanti

This paper reviews and extends the field of similarity-based classification, presenting new analyses, algorithms, data sets, and a comprehensive set of experimental results for a rich collection of classification problems. Specifically, the generalizability of using similarities as features is analyzed, design goals and methods for weighting nearest-neighbors for similarity-based learning are proposed, and different methods for consistently converting similarities into kernels are compared. Experiments on eight real data sets compare eight approaches and their variants to similarity-based learning.

NeurIPS Conference 2008 Conference Paper

Weighted Sums of Random Kitchen Sinks: Replacing minimization with randomization in learning

Ali Rahimi
Benjamin Recht

Randomized neural networks are immortalized in this AI Koan: In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6. "What are you doing?" asked Minsky. "I am training a randomly wired neural net to play tic-tac-toe," Sussman replied. "Why is the net wired randomly?" asked Minsky. Sussman replied, "I do not want it to have any preconceptions of how to play." Minsky then shut his eyes. "Why do you close your eyes?" Sussman asked his teacher. "So that the room will be empty," replied Minsky. At that moment, Sussman was enlightened. We analyze shallow random networks with the help of concentration of measure inequalities. Specifically, we consider architectures that compute a weighted sum of their inputs after passing them through a bank of arbitrary randomized nonlinearities. We identify conditions under which these networks exhibit good classification performance, and bound their test error in terms of the size of the dataset and the number of random nonlinearities.

NeurIPS Conference 2007 Conference Paper

Random Features for Large-Scale Kernel Machines

Ali Rahimi
Benjamin Recht

To accelerate the training of kernel machines, we propose to map the input data to a randomized low-dimensional feature space and then apply existing fast linear methods. The features are designed so that the inner products of the transformed data are approximately equal to those in the feature space of a user specified shift-invariant kernel. We explore two sets of random features, provide convergence bounds on their ability to approximate various radial basis kernels, and show that in large-scale classification and regression tasks linear machine learning algorithms applied to these features outperform state-of-the-art large-scale kernel machines.
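The idea in this abstract can be sketched for one common case, the Gaussian kernel $k(x, y) = \exp(-\gamma \|x - y\|^2)$, using random cosine features. This is a minimal standard-library sketch under stated assumptions, not the authors' code: the function names, the `gamma` parameterization, the feature count, and the test points are all illustrative choices.

```python
import math
import random


def make_random_fourier_features(dim, num_features, gamma, seed=0):
    """Return a map z(x) whose inner products approximate exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    # For this Gaussian kernel, frequencies are drawn from N(0, 2 * gamma * I).
    std = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(num_features)]
    # Random phase offsets, uniform on [0, 2 * pi].
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def featurize(x):
        return [
            scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, offsets)
        ]

    return featurize


def gaussian_kernel(x, y, gamma):
    """Exact kernel value, used only to check the approximation."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical inputs: compare the feature inner product with the exact kernel.
featurize = make_random_fourier_features(dim=3, num_features=5000, gamma=0.5)
x = [0.2, -0.1, 0.4]
y = [0.0, 0.3, 0.1]
approx = sum(a * b for a, b in zip(featurize(x), featurize(y)))
exact = gaussian_kernel(x, y, gamma=0.5)
```

Once the data are mapped through `featurize`, any fast linear method can be trained on the resulting vectors in place of a kernel machine; the approximation error shrinks as `num_features` grows.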

NeurIPS Conference 2006 Conference Paper

Unsupervised Regression with Applications to Nonlinear System Identification

Ali Rahimi
Ben Recht

We derive a cost functional for estimating the relationship between high-dimensional observations and the low-dimensional process that generated them with no input-output examples. Limiting our search to invertible observation functions confers numerous benefits, including a compact representation and no suboptimal local minima. Our approximation algorithms for optimizing this cost functional are fast and give diagnostic bounds on the quality of their solution. Our method can be viewed as a manifold learning algorithm that utilizes a prior on the low-dimensional manifold coordinates. The benefits of taking advantage of such priors in manifold learning and searching for the inverse observation functions in system identification are demonstrated empirically by learning to track moving targets from raw measurements in a sensor network setting and in an RFID tracking experiment.

IROS Conference 2002 Conference Paper

Bayesian network for online global pose estimation

Ali Rahimi
Trevor Darrell

We cast the location estimation problem in vision-based robotic navigation in a Bayesian framework. We derive an efficient online algorithm for updating the trajectory of a robot as new frames of data become available. For each new frame, the algorithm computes the pose of the robot relative to past frames and combines these relative pose changes to obtain a robust estimate of its trajectory. The complexity of this algorithm grows linearly with the number of frames so far processed. Since it is effectively tracking against an appearance-based map, our algorithm provides consistent results in circular environments, where the robot returns to places already visited.

NeurIPS Conference 2002 Conference Paper

Location Estimation with a Differential Update Network

Ali Rahimi
Trevor Darrell

Given a set of hidden variables with an a priori Markov structure, we derive an online algorithm which approximately updates the posterior as pairwise measurements between the hidden variables become available. The update is performed using Assumed Density Filtering: to incorporate each pairwise measurement, we compute the optimal Markov structure which represents the true posterior and use it as a prior for incorporating the next measurement. We demonstrate the resulting algorithm by calculating globally consistent trajectories of a robot as it navigates along a 2D trajectory. To update a trajectory of length t, the update takes O(t). When all conditional distributions are linear-Gaussian, the algorithm can be thought of as a Kalman Filter which simplifies the state covariance matrix after incorporating each measurement.
