Author name cluster

Ravindran Kannan

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

48 papers

2 author rows

ICLR Conference 2025 Conference Paper

LevAttention: Time, Space and Streaming Efficient Algorithm for Heavy Attentions

Ravindran Kannan
Chiranjib Bhattacharyya
Praneeth Kacham
David P. Woodruff

A central problem related to transformers can be stated as follows: given two $n \times d$ matrices $Q$ and $K$, and a non-negative function $f$, define the matrix $A$ as follows: (1) apply the function $f$ to each entry of the $n \times n$ matrix $Q K^T$, and then (2) normalize each of the row sums of $A$ to be equal to $1$. The matrix $A$ can be computed in $O(n^2 d)$ time assuming $f$ can be applied to a number in constant time, but the quadratic dependence on $n$ is prohibitive in applications where it corresponds to long context lengths. For a large class of functions $f$, we show how to find all the "large attention scores", i.e., entries of $A$ which are at least a positive value $\varepsilon$, in time with linear dependence on $n$ (i.e., $n \cdot \textrm{poly}(d/\varepsilon)$) for a positive parameter $\varepsilon > 0$. Our class of functions include all functions $f$ of the form $f(x) = |x|^p$, as explored recently in transformer models. Using recently developed tools from randomized numerical linear algebra, we prove that for any $K$, there is a "universal set" $U \subset [n]$ of size independent of $n$, such that for any $Q$ and any row $i$, the large attention scores $A_{i,j}$ in row $i$ of $A$ all have $j \in U$. We also find $U$ in $n \cdot \textrm{poly}(d/\varepsilon)$ time. Notably, we (1) make no assumptions on the data, (2) our workspace does not grow with $n$, and (3) our algorithms can be computed in streaming and parallel settings. We empirically show the benefits of our scheme for vision transformers, showing how to train new models that use our universal set while training as well, showing that our model is able to consistently select "important keys'" during training. We also provide theoretical motivation by formulating a planted model in which our efficient algorithms provably identify relevant keys for each query.

SODA Conference 2022 Conference Paper

How many Clusters? - An algorithmic answer

Chiranjib Bhattacharyya
Ravindran Kannan
Amit Kumar 0001

Many algorithms for clustering high dimensional data assume that k, the number of clusters, is given. However, there has been little work on provably inferring k from the data. This paper gives polynomial time algorithms for finding k from the data assuming it satisfies certain natural deterministic conditions. Informally, we assume that there is a Ground Truth (GT) clustering of the data with the following properties: (i) Each cluster has a certain minimum size, (ii) the inter-mean separation of any two distinct clusters in the GT is large enough (although still weaker than what is typically assumed in the literature), and (iii) we define a novel “no large sub-cluster” (NLSC) property that characterizes the notion of a cluster by stipulating that there be no subsets of low “directional variance”. NLSC is indeed satisfied by large class of distributions including log-concave densities. The first major contribution is an algorithm for finding k where m, the minimum GT cluster size, is assumed to be known. This algorithm uses a novel rounding procedure which finds subsets of size m with low Directional Variance by rounding a SDP relaxation using Cheeger's inequality and it is shown that k is precisely the number of such sets whose means are well-separated. The harder problem of finding k when m not given is addressed by running the previous algorithm for each value of m to find candidate values of k and the corresponding k -clustering. The second major contribution of this paper is a test which certifies the correct candidate thereby yielding a polynomial time algorithm which finds k.

ICML Conference 2021 Conference Paper

Finding k in Latent k- polytope

Chiranjib Bhattacharyya
Ravindran Kannan
Amit Kumar 0001

The recently introduced Latent $k-$ Polytope($\LkP$) encompasses several stochastic Mixed Membership models including Topic Models. The problem of finding $k$, the number of extreme points of $\LkP$, is a fundamental challenge and includes several important open problems such as determination of number of components in Ad-mixtures. This paper addresses this challenge by introducing Interpolative Convex Rank(\INR) of a matrix defined as the minimum number of its columns whose convex hull is within Hausdorff distance $\varepsilon$ of the convex hull of all columns. The first important contribution of this paper is to show that under \emph{standard assumptions} $k$ equals the \INR of a \emph{subset smoothed data matrix} defined from Data generated from an $\LkP$. The second important contribution of the paper is a polynomial time algorithm for finding $k$ under standard assumptions. An immediate corollary is the first polynomial time algorithm for finding the \emph{inner dimension} in Non-negative matrix factorisation(NMF) with assumptions which are qualitatively different than existing ones such as \emph{Separability}. %An immediate corollary is the first polynomial time algorithm for finding the \emph{inner dimension} in Non-negative matrix factorisation(NMF) with assumptions considerably weaker than \emph{Separability}.

ICLR Conference 2021 Conference Paper

Learning a Latent Simplex in Input Sparsity Time

Ainesh Bakshi
Chiranjib Bhattacharyya
Ravindran Kannan
David P. Woodruff
Samson Zhou

We consider the problem of learning a latent $k$-vertex simplex $K\in\mathbb{R}^d$, given $\mathbf{A}\in\mathbb{R}^{d\times n}$, which can be viewed as $n$ data points that are formed by randomly perturbing some latent points in $K$, possibly beyond $K$. A large class of latent variable models, such as adversarial clustering, mixed membership stochastic block models, and topic models can be cast in this view of learning a latent simplex. Bhattacharyya and Kannan (SODA 2020) give an algorithm for learning such a $k$-vertex latent simplex in time roughly $O(k\cdot\text{nnz}(\mathbf{A}))$, where $\text{nnz}(\mathbf{A})$ is the number of non-zeros in $\mathbf{A}$. We show that the dependence on $k$ in the running time is unnecessary given a natural assumption about the mass of the top $k$ singular values of $\mathbf{A}$, which holds in many of these applications. Further, we show this assumption is necessary, as otherwise an algorithm for learning a latent simplex would imply a better low rank approximation algorithm than what is known. We obtain a spectral low-rank approximation to $\mathbf{A}$ in input-sparsity time and show that the column space thus obtained has small $\sin\Theta$ (angular) distance to the right top-$k$ singular space of $\mathbf{A}$. Our algorithm then selects $k$ points in the low-rank subspace with the largest inner product (in absolute value) with $k$ carefully chosen random vectors. By working in the low-rank subspace, we avoid reading the entire matrix in each iteration and thus circumvent the $\Theta(k\cdot\text{nnz}(\mathbf{A}))$ running time.

SODA Conference 2020 Conference Paper

Finding a latent k -simplex in O * ( k · nnz(data)) time via Subset Smoothing

Chiranjib Bhattacharyya
Ravindran Kannan

ICML Conference 2020 Conference Paper

Near-optimal sample complexity bounds for learning Latent k-polytopes and applications to Ad-Mixtures

Chiranjib Bhattacharyya
Ravindran Kannan

Deriving Optimal bounds on Sample Complexity of Latent Variable models is an active area of research. Recently such bounds were obtained for Mixture of Gaussians \cite{HSNCAY18}, no such results are known for Ad-mixtures, a generalization of Mixture distributions. In this paper we show that $O^*(dk/m)$ samples are sufficient to learn each of $k-$ topic vectors of LDA, a popular Ad-mixture model, with vocabulary size $d$ and $m\in \Omega(1)$ words per document, to any constant error in $L_1$ norm. The result is a corollary of the major contribution of this paper: the first sample complexity upper bound for the problem (introduced in \cite{BK20}) of learning the vertices of a Latent $k-$ Polytope in $\RR^d$, given perturbed points from it. The bound, $O^*(dk/\beta)$, is optimal and linear in number of parameters. It applies to many stochastic models including a broad class Ad-mixtures. To demonstrate the generality of the approach we specialize the setting to Mixed Membership Stochastic Block Models(MMSB) and show for the first time that if an MMSB has $k$ blocks, the sample complexity is $O^*(k^2)$ under usual assumptions.

ICML Conference 2016 Conference Paper

Non-negative Matrix Factorization under Heavy Noise

Chiranjib Bhattacharyya
Navin Goyal
Ravindran Kannan
Jagdeep Pani

The Noisy Non-negative Matrix factorization (NMF) is: given a data matrix A (d x n), find non-negative matrices B; C (d x k, k x n respy.) so that A = BC +N, where N is a noise matrix. Existing polynomial time algorithms with proven error guarantees require EACH column N_⋅j to have l1 norm much smaller than ||(BC)_⋅j ||_1, which could be very restrictive. In important applications of NMF such as Topic Modeling as well as theoretical noise models (e. g. Gaussian with high sigma), almost EVERY column of N_. j violates this condition. We introduce the heavy noise model which only requires the average noise over large subsets of columns to be small. We initiate a study of Noisy NMF under the heavy noise model. We show that our noise model subsumes noise models of theoretical and practical interest (for e. g. Gaussian noise of maximum possible sigma). We then devise an algorithm TSVDNMF which under certain assumptions on B, C, solves the problem under heavy noise. Our error guarantees match those of previous algorithms. Our running time of O(k. (d+n)^2) is substantially better than the O(d. n^3) for the previous best. Our assumption on B is weaker than the “Separability” assumption made by all previous results. We provide empirical justification for our assumptions on C. We also provide the first proof of identifiability (uniqueness of B) for noisy NMF which is not based on separability and does not use hard to check geometric conditions. Our algorithm outperforms earlier polynomial time algorithms both in time and error, particularly in the presence of high noise.

NeurIPS Conference 2014 Conference Paper

A provable SVD-based algorithm for learning topics in dominant admixture corpus

Trapit Bansal
Chiranjib Bhattacharyya
Ravindran Kannan

Topic models, such as Latent Dirichlet Allocation (LDA), posit that documents are drawn from admixtures of distributions over words, known as topics. The inference problem of recovering topics from such a collection of documents drawn from admixtures, is NP-hard. Making a strong assumption called separability, [4] gave the first provable algorithm for inference. For the widely used LDA model, [6] gave a provable algorithm using clever tensor-methods. But [4, 6] do not learn topic vectors with bounded $l_1$ error (a natural measure for probability vectors). Our aim is to develop a model which makes intuitive and empirically supported assumptions and to design an algorithm with natural, simple components such as SVD, which provably solves the inference problem for the model with bounded $l_1$ error. A topic in LDA and other models is essentially characterized by a group of co-occurring words. Motivated by this, we introduce topic specific Catchwords, a group of words which occur with strictly greater frequency in a topic than any other topic individually and are required to have high frequency together rather than individually. A major contribution of the paper is to show that under this more realistic assumption, which is empirically verified on real corpora, a singular value decomposition (SVD) based algorithm with a crucial pre-processing step of thresholding, can provably recover the topics from a collection of documents drawn from Dominant admixtures. Dominant admixtures are convex combination of distributions in which one distribution has a significantly higher contribution than the others. Apart from the simplicity of the algorithm, the sample complexity has near optimal dependence on $w_0$, the lowest probability that a topic is dominant, and is better than [4]. Empirical evidence shows that on several real world corpora, both Catchwords and Dominant admixture assumptions hold and the proposed algorithm substantially outperforms the state of the art [5].

FOCS Conference 2014 Conference Paper

Spectral Approaches to Nearest Neighbor Search

Amirali Abdullah
Alexandr Andoni
Ravindran Kannan
Robert Krauthgamer

We study spectral algorithms for the high-dimensional Nearest Neighbor Search problem (NNS). In particular, we consider a semi-random setting where a dataset is chosen arbitrarily from an unknown subspace of low dimension, and then perturbed by full-dimensional Gaussian noise. We design spectral NNS algorithms whose query time depends polynomially on the dimension and logarithmically on the size of the point set. These spectral algorithms use a repeated computation of the top PCA vector/subspace, and are effective even when the random-noise magnitude is much larger than the interpoint distances. Our motivation is that in practice, a number of spectral NNS algorithms outperform the random-projection methods that seem otherwise theoretically optimal on worst-case datasets. In this paper we aim to provide theoretical justification for this disparity. The full version of this extended abstract is available on arXiv.

STOC Conference 2012 Conference Paper

Computing a nonnegative matrix factorization - provably

Sanjeev Arora
Rong Ge 0001
Ravindran Kannan
Ankur Moitra

FOCS Conference 2010 Conference Paper

Clustering with Spectral Norm and the k-Means Algorithm

Amit Kumar 0001
Ravindran Kannan

There has been much progress on efficient algorithms for clustering data points generated by a mixture of k probability distributions under the assumption that the means of the distributions are well-separated, i. e. , the distance between the means of any two distributions is at least Ω(k) standard deviations. These results generally make heavy use of the generative model and particular properties of the distributions. In this paper, we show that a simple clustering algorithm works without assuming any generative (probabilistic) model. Our only assumption is what we call a "proximity condition'': the projection of any data point onto the line joining its cluster center to any other cluster center is Ω(k) standard deviations closer to its own center than the other center. Here the notion of standard deviations is based on the spectral norm of the matrix whose rows represent the difference between a point and the mean of the cluster to which it belongs. We show that in the generative models studied, our proximity condition is satisfied and so we are able to derive most known results for generative models as corollaries of our main result. We also prove some new results for generative models - e. g. , we can cluster all but a small fraction of points only assuming a bound on the variance. Our algorithm relies on the well known k-means algorithm, and along the way, we prove a result of independent interest - that the k-means algorithm converges to the "true centers'' even in the presence of spurious points provided the initial (estimated) centers are close enough to the corresponding actual centers and all but a small fraction of the points satisfy the proximity condition. Finally, we present a new technique for boosting the ratio of inter-center separation to standard deviation. This allows us to prove results for learning certain mixture of distributions under weaker separation conditions.

STOC Conference 2010 Conference Paper

Spectral methods for matrices and tensors

Ravindran Kannan

FOCS Conference 2009 Conference Paper

A New Probability Inequality Using Typical Moments and Concentration Results

Ravindran Kannan

We present two probability inequalities. The simpler first inequality weakens both hypotheses in Hoffding-Azumaine quality. Using it, we generalize concentration results previously known for the uniform density for the TSP, MWST and Random Projections to long-tailed inhomogeneous distributions. The second more complicated inequality further weakens the moment requirements and using it, we prove the best possible concentration for the long-studied bin packing problem as well as some others.

STOC Conference 2009 Conference Paper

Random walks on polytopes and an affine interior point method for linear programming

Ravindran Kannan
Hariharan Narayanan

FOCS Conference 2008 Conference Paper

Market Equilibria in Polynomial Time for Fixed Number of Goods or Agents

Nikhil R. Devanur
Ravindran Kannan

We consider markets in the classical Arrow-Debreu model. There are n agents and m goods. Each buyer has a concave utility function (of the bundle of goods he/she buys) and an initial bundle. At an ldquoequilibriumrdquo set of prices for goods, if each individual buyer separately ex-changes the initial bundle for an optimal bundle at the set prices, the market clears, i. e. , all goods are exactly consumed. Classical theorems guarantee the existence of equilibria, but computing them has been the subject of much recent research. In the related area of Multi-Agent Games, much attention has been paid to the complexity as well as algorithms. While most general problems are hard, polynomial time algorithms have been developed for restricted classes of games, when one assumes the number of strategies is constant. For the Market Equilibrium problem, several important special cases of utility functions have been tackled. Here we begin a program for this problem similar to that for multi-agent games, where general utilities are considered. We begin by showing that if the utilities are separable piece-wise linear concave (PLC) functions, and the number of goods(or alternatively the number of buyers) is constant, then we can compute an exact equilibrium in polynomial time. Our technique for the constant number of goods is to de-compose the space of price vectors into cells using certain hyperplanes, so that in each cell, each buyerpsilas threshold marginal utility is known. Still, one needs to solve a linear optimization problem in each cell. We then show the main result - that for general (non-separable) PLC utilities, an exact equilibrium can be found in polynomial time provided the number of goods is constant. The starting point of the algorithm is a ldquocell-decompositionrdquo of the space of price vectors using polynomial surfaces (instead of hyperplanes). We use results from computational algebraic geometry to bound the number of such cells. For solving the problem inside each cell, we introduce and use a novel LP-duality based method. We note that if the number of buyers and agents both can vary, the problem is PPAD hard even for the very special case of PLC utilities - namely Leontief utilities.

SODA Conference 2007 Conference Paper

Games of fixed rank: a hierarchy of bimatrix games

Ravindran Kannan
Thorsten Theobald

SODA Conference 2007 Conference Paper

Spectral clustering with limited independence

Anirban Dasgupta 0001
John E. Hopcroft
Ravindran Kannan
Pradipta Mitra

SODA Conference 2006 Conference Paper

The space complexity of pass-efficient algorithms for clustering

Kevin L. Chang
Ravindran Kannan

STOC Conference 2005 Conference Paper

Tensor decomposition and approximation schemes for constraint satisfaction problems

Wenceslas Fernandez de la Vega
Marek Karpinski
Ravindran Kannan
Santosh S. Vempala

SODA Conference 2003 Conference Paper

Pass efficient algorithms for approximating large matrices

Petros Drineas
Ravindran Kannan

STOC Conference 2002 Conference Paper

Random sampling and approximation of MAX-CSP problems

Noga Alon
Wenceslas Fernandez de la Vega
Ravindran Kannan
Marek Karpinski

We present a new efficient sampling method for approximating r -dimensional Maximum Constraint Satisfaction Problems, MAX-rCSP, on n variables up to an additive error εn r . We prove a newgeneral paradigm in that it suffices, for a given set of constraints, to pick a small uniformly random subset of its variables, and the optimum value of the subsystem induced on these variables gives (after a direct normalization and with high probability) an approximation to the optimum of the whole system up to an additive error of εn r . Our method gives for the first time a polynomial in ε —1 bound on the sample size necessary to carry out the above approximation. Moreover, this bound is independent in the exponent on the dimension r . The above method gives a completely uniform sampling technique for all the MAX-rCSP problems, and improves the best known sample bounds for the low dimensional problems, like MAX-CUT. The method of solution depends on a new result on t he cut norm of random subarrays, and a new sampling technique for high dimensional linear programs. This method could be also of independent interest.

FOCS Conference 2001 Conference Paper

Fast Monte-Carlo Algorithms for Approximate Matrix Multiplication

Petros Drineas
Ravindran Kannan

Given an m x n matrix A and an n x p matrix B, we present 2 simple and intuitive algorithms to compute an approximation P to the pr oductA·B, with provable bounds for the norm of the err or matrix "P- A·B. Both algorithms run in O(mp+mn+np) time. In both algorithms, we randomly pick s = O(1) columns of A to form an m x s matrix S and the corresponding rows of B to form an s x p matrix R. After scaling the columns of S and the rows of R, we multiply them together to obtain our approximation P. The choice of the probability distribution we use for picking the columns of A and the scaling are the crucial features which enable us to give fairly elementary proofs of the error bounds. Our first algorithm can be implemented without storing the matrices A and B in Random Access Memory, provided we can make two passes through the matrices (stored in external memory). The second algorithm has a smaller bound on the 2-norm of the error matrix, but requires storage of A and B in RAM. We also present a fast algorithm that "describes" P as a sum of rank one matrices if B = A T.

STOC Conference 2001 Conference Paper

Learning mixtures of arbitrary gaussians

Sanjeev Arora
Ravindran Kannan

Mixtures of gaussian (or normal) distributions arise in a variety of application areas. Many techniques have been proposed for the task of finding the component gaussians given samples from the mixture, such as the EM algorithm , a local-search heuristic from Dempster, Laird and Rubin~(1977). However, such heuristics are known to require time exponential in the dimension (i.e., number of variables) in the worst case, even when the number of components is $2$. This paper presents the first algorithm that provably learns the component gaussians in time that is polynomial in the dimension. The gaussians may have arbitrary shape provided they satisfy a “nondegeneracy” condition, which requires their high-probability regions to be not “too close” together.

FOCS Conference 2000 Conference Paper

On Clusterings - Good, Bad and Spectral

Ravindran Kannan
Santosh S. Vempala
Adrian Vetta

We propose a new measure for assessing the quality of a clustering. A simple heuristic is shown to give worst-case guarantees under the new measure. Then we present two results regarding the quality of the clustering found by a popular spectral algorithm. One proffers worst case guarantees whilst the other shows that if there exists a "good" clustering then the spectral algorithm will find one close to it.

SODA Conference 1999 Conference Paper

Clustering in Large Graphs and Matrices

Petros Drineas
Alan M. Frieze
Ravindran Kannan
Santosh S. Vempala
V. Vinay

STOC Conference 1999 Conference Paper

Faster Mixing via Average Conductance

László Lovász 0001
Ravindran Kannan

FOCS Conference 1998 Conference Paper

Approximation of Diameters: Randomization Doesn't Help

Andreas Brieden
Peter Gritzmann
Ravindran Kannan
Victor Klee
László Lovász 0001
Miklós Simonovits

We describe a deterministic polynomial-time algorithm which, for a convex body K in Euclidean n-space, finds upper and lower bounds on K's diameter which differ by a factor of O(/spl radic/n/logn). We show that this is, within a constant factor, the best approximation to the diameter that a polynomial-time algorithm can produce even if randomization is allowed. We also show that the above results hold for other quantities similar to the diameter-namely; inradius, circumradius, width, and maximization of the norm over K. In addition to these results for Euclidean spaces, we give tight results for the error of deterministic polynomial-time approximations of radii and norm-maxima for convex bodies in finite-dimensional l/sub p/ spaces.

FOCS Conference 1998 Conference Paper

Fast Monte-Carlo Algorithms for Finding Low-Rank Approximations

Alan M. Frieze
Ravindran Kannan
Santosh S. Vempala

In several applications, the data consists of an m/spl times/n matrix A and it is of interest to find an approximation D of a specified rank k to A where, k is much smaller than m and n. Traditional methods like the Singular Value Decomposition (SVD) help us find the "best" such approximation. However, these methods take time polynomial in m, n which is often too prohibitive. In this paper, we develop an algorithm which is qualitatively faster provided we may sample the entries of the matrix according to a natural probability distribution. Indeed, in the applications such sampling is possible. Our main result is that we can find the description of a matrix D* of rank at most k so that /spl par/A-D*/spl par//sub F//spl les/min/D, rank(D)/spl les/k/spl par/A-D/spl par//sub F/+/spl epsiv//spl par/A/spl par//sub F/ holds with probability at least 1-/spl delta/. (For any matrix M, /spl par/M/spl par//sub F//sup 2/ denotes the sum of the squares of all the entries of M.) The algorithm takes time polynomial in k, 1//spl epsiv/, log(1//spl delta/) only, independent of m, n.

FOCS Conference 1998 Conference Paper

Local Search in Smooth Convex Sets

Ravindran Kannan
Andreas Nolte

In this paper we analyse two very simple techniques to minimize a linear function over a convex set. The first is a deterministic algorithm based on gradient descent. The second is a randomized algorithm which makes a small local random change at every step. The second method can be used when the convex set is presented by just a membership oracle whereas the first requires something similar to a separation oracle. We define a simple notation of smoothness of convex sets and show that both algorithms provide a near optimal solution for smooth convex sets in polynomial time. We describe several application examples from linear and stochastic programming where the relevant sets are indeed smooth and thus our algorithms apply. The main point of the paper is that such simple algorithms yield good running time bounds for natural problems.

STOC Conference 1997 Conference Paper

Sampling Lattice Points

Ravindran Kannan
Santosh S. Vempala

SODA Conference 1997 Conference Paper

Simple Markov-Chain Algorithms for Generating Bipartite Graphs and Tournaments (Extended Abstract)

Ravindran Kannan
Prasad Tetali
Santosh S. Vempala

FOCS Conference 1996 Conference Paper

A Polynomial-Time Algorithm for Learning Noisy Linear Threshold Functions

Avrim Blum
Alan M. Frieze
Ravindran Kannan
Santosh S. Vempala

The authors consider the problem of learning a linear threshold function (a halfspace in n dimensions, also called a "perceptron"). Methods for solving this problem generally fall into two categories. In the absence of noise, this problem can be formulated as a linear program and solved in polynomial time with the ellipsoid algorithm (or interior point methods). On the other hand, simple greedy algorithms such as the perceptron algorithm seem to work well in practice and can be made noise tolerant; but, their running time depends on a separation parameter (which quantifies the amount of "wiggle room" available) and can be exponential in the description length of the input. They show how simple greedy methods can be used to find weak hypotheses (hypotheses that classify noticeably more than half of the examples) in polynomial time, without dependence on any separation parameter. This results in a polynomial-time algorithm for learning linear threshold functions in the PAC model in the presence of random classification noise. The algorithm is based on a new method for removing outliers in data. Specifically, for any set S of points in R/sup n/, each given to b bits of precision, they show that one can remove only a small fraction of S so that in the remaining set T, for every vector v, max/sub x/spl epsiv/T/(v/spl middot/x)/sup 2//spl les/poly(n, b)|T|/sup -1//spl Sigma//sub x/spl epsiv/T/(v/spl middot/x)/sup 2/. After removing these outliers, they are able to show that a modified version of the perceptron learning algorithm works in polynomial time, even in the presence of random classification noise.

FOCS Conference 1996 Conference Paper

Learning Linear Transformations

Alan M. Frieze
Mark Jerrum
Ravindran Kannan

We present a polynomial time algorithm to learn (in Valiant's PAC model) an arbitrarily oriented cube in n-space, given uniformly distributed sample points from it. In fact, we solve the more general problem of learning, in polynomial time, a linear (affine) transformation of a product distribution.

FOCS Conference 1996 Conference Paper

Sampling According to the Multivariate Normal Density

Ravindran Kannan
Guangxing Li

This paper deals with the normal density of n dependent random variables. This is a function of the form: ce(-x/sup T/Ax) where A is an n/spl times/n positive definite matrix, a: is the n-vector of the random variables and c is a suitable constant. The first problem we consider is the (approximate) evaluation of the integral of this function over the positive orthant /spl int/(x/sub 1/=0)/sup /spl infin///spl int/(x/sub 2/=0)/sup /spl infin///spl middot//spl middot//spl middot//spl int/(x/sub n/=0)/sup /spl infin//ce(-x/sup T/Ax). This problem has a long history and a substantial literature. Related to it is the problem of drawing a sample from the positive orthant with probability density (approximately) equal to ce(-x/sup T/Ax). We solve both these problems here in polynomial time using rapidly mixing Markov Chains. For proving rapid convergence of the chains to their stationary distribution, we use a geometric property called the isoperimetric inequality. Such an inequality has been the subject of recent papers for general log-concave functions. We use these techniques, but the main thrust of the paper is to exploit the special property of the normal density to prove a stronger inequality than for general log-concave functions. We actually consider first the problem of drawing a sample according to the normal density with A equal to the identity matrix from a convex set K in R/sup n/ which contains the unit ball. This problem is motivated by the problem of computing the volume of a convex set in a way we explain later. Also, the methods used in the solution of this and the orthant problem are similar.

FOCS Conference 1996 Conference Paper

The Regularity Lemma and Approximation Schemes for Dense Problems

Alan M. Frieze
Ravindran Kannan

There are two main contributions of the present paper. In the first, we use the constructive version of the Regularity Lemma to give directly simple polynomial time approximation schemes for several graph "subdivision" problems in dense graphs including the Max Cut problem, the Graph Bisection problem, the Min l-way cut problem and Graph Separator problem. Arora, Karger and Karpinski (1992) gave the first PTASs for these problems whose running time is O(n/sup o(1/e2)/). Our PTASs have running time where the exponent of n is a constant independent of e. The central point here is that the Regularity Lemma provides an explanation of why these Max-SNP hard problems turn out to be easy in dense graphs. We also give a simple PTAS for dense versions of a special case of the Quadratic Assignment Problem (QAP).

FOCS Conference 1994 Conference Paper

Markov Chains and Polynomial Time Algorithms

Ravindran Kannan

This paper outlines the use of rapidly mixing Markov Chains in randomized polynomial time algorithms to solve approximately certain counting problems. They fall into two classes: combinatorial problems like counting the number of perfect matchings in certain graphs and geometric ones like computing the volumes of convex sets. >

FOCS Conference 1993 Conference Paper

Learning an Intersection of k Halfspaces over a Uniform Distribution

Avrim Blum
Ravindran Kannan

We present a polynomial-time algorithm to learn an intersection of a constant number of halfspaces in n dimensions, over the uniform distribution on an n-dimensional ball. The algorithm we present in fact can learn an intersection of an arbitrary (polynomial) number of halfspaces over this distribution, if the subspace spanned by the normal vectors to the bounding hyperplanes has constant dimension. This generalizes previous results for this distribution, in particular a result of E. B. Baum (1990) who showed how to learn an intersection of 2 halfspaces defined by hyperplanes that pass through the origin (his results in fact held for a variety of symmetric distributions). Our algorithm uses estimates of second moments to find vectors in a low-dimensional "relevant subspace". We believe that the algorithmic techniques studied here may be useful in other geometric learning applications. >

STOC Conference 1991 Conference Paper

Sampling and Integration of Near Log-Concave functions

David L. Applegate
Ravindran Kannan

STOC Conference 1989 Conference Paper

A Random Polynomial Time Algorithm for Approximating the Volume of Convex Bodies

Martin E. Dyer
Alan M. Frieze
Ravindran Kannan

STOC Conference 1986 Conference Paper

On Nontrivial Separators for k-Page Graphs and Simulations by Nondeterministic One-Tape Turing Machines

Zvi Galil
Ravindran Kannan
Endre Szemerédi

FOCS Conference 1984 Conference Paper

Linear Congruential Generators Do Not Produce Random Sequences

Alan M. Frieze
Ravindran Kannan
J. C. Lagarias 0001

One of the most popular and fast methods of generating "random" sequence are linear congruential generators. This paper discusses the predictability of the sequence given only a constant proportion /spl alpha/ of the leading bits of the first few numbers generated. We show that the rest of the sequence is predictable in polynomial time, almost always, provided /spl alpha/ > 2/5.

STOC Conference 1984 Conference Paper

Polynomial Factorization and Nonrandomness of Bits of Algebraic and Some Transcendental Numbers

Ravindran Kannan
Arjen K. Lenstra
László Lovász 0001

FOCS Conference 1984 Conference Paper

Sublinear Parallel Algorithm for Computing the Greatest Common Divisor of Two Integers

Ravindran Kannan
Gary L. Miller
Larry Rudolph

The advent of practical parallel processors has caused a reexamination of many existing algorithms with the hope of discovering a parallel implementation. One of the oldest and best known algorithms is Euclid's algorithm for computing the greatest common divisor (GCD). In this paper we present a parallel algorithm to compute the GCD of two integers. The two salient features of the algorithm are: the observation based on the pigeon hole principle that we can easily find an integer combination of the two integers A and B which has fewer bits than n and the idea of working in phases so as to perform arithmetics on n-bit integers only once every phase, the more frequent operations being performed on O(log/sup 2/n)-bit integers. It appears that yet another approach is needed if the GCD is to be computed in poly-log parallel time.

STOC Conference 1983 Conference Paper

Alternation and the Power of Nondeterminism

Ravindran Kannan

STOC Conference 1983 Conference Paper

Improved Algorithms for Integer Programming and Related Lattice Problems

Ravindran Kannan

FOCS Conference 1981 Conference Paper

A Circuit-Size Lower Bound

Ravindran Kannan

As remarked in Cook (1980), we do not know any nonlinear lower bound on the circuit size of a language in P or even in NP. The best known lower bound seems to be due to Paul (1975). Instead of trying to prove lower bounds on the circuit-size of a "natural" language, this note raises the question of whether some language in a class is of provably high circuit complexity. We show that for each nonnegative integer k, there is a language Lk in Σ2P ∩ π2P (of the Meyer and Stockmeyer (1972) hierarchy) Which does not have O(nk)-size circuits. The method is indirect and does not produce the language Lk. Other results of a similar nature are presented and several questions raised.

FOCS Conference 1981 Conference Paper

Towards Separating Nondeterministic Time from Deterministic Time

Ravindran Kannan

It would be of interest to separate nondeterminism from determinism i. e. , to show that for all "nice" functions t(n), NTIME (t(n)), (the class of languages accepted by multitape nondeterministic Turing machines in time O(t(n))) strictly contains DTIME (t(n)) (the class of languages accepted by multitape deterministic Turing machines in time O(t(n)). We establish a weaker form of the statement. We show that there is a universal constant k such that for all "nice" functions t(n), the class of languages that can be accepted simultaneously in deterministic time O(t(n)) and space o((t(n))1/k) is strictly contained in NTIME (t(n)). (We will use the notation SPACE, TIME (s(n), t(n)) to denote the class of languages accepted by a deterministic TM in time O(t(n)) and simultaneously space O(s(n)).) This result is proved using a time-alternation trade-off and several other applications of this trade-off are presented. For example, we show that for each language L in SPACE, TIME (nl-ε, ni) (where o≪ε≪l, ε, i constants) there exists a j such that L is accepted by a O(n) time bounded alternating Turing machine with j alternations. The trade-off also leads to the separation ∪SεSt SPACE, TIME (s, t)⊂+Σ2 TIME(t) where t(n) is any "nice" function and St is a class of "nice" functions in o(t). Here St includes most natural functions for natural t. For example, nj/log*n is in Snj.

STOC Conference 1980 Conference Paper

The Orbit Problem is Decidable

Ravindran Kannan
Richard J. Lipton