Arrow Research search

Author name cluster

Guy Lebanon

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

29 papers
2 author rows

Possible papers (29)

JMLR Journal 2016 Journal Article

LLORMA: Local Low-Rank Matrix Approximation

  • Joonseok Lee
  • Seungyeon Kim
  • Guy Lebanon
  • Yoram Singer
  • Samy Bengio

Matrix approximation is a common tool in recommendation systems, text mining, and computer vision. A prevalent assumption in constructing matrix approximations is that the partially observed matrix is low-rank. In this paper, we propose, analyze, and experiment with two procedures, one parallel and the other global, for constructing local matrix approximations. The two approaches approximate the observed matrix as a weighted sum of low-rank matrices. These matrices are limited to a local region of the observed matrix. We analyze the accuracy of the proposed local low-rank modeling. Our experiments show improvements in prediction accuracy over classical approaches for recommendation tasks.
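
A minimal sketch of the local low-rank idea, under stated assumptions: the anchor points, kernel widths, and index-based similarity kernel below are illustrative stand-ins, not the paper's choices. Each anchor gets its own low-rank factorization fit to nearby observed entries, and predictions are the kernel-weighted combination of the local models.

```python
# Toy sketch of locally weighted low-rank approximation (LLORMA-style),
# NOT the paper's implementation. Anchors are random cells; similarity
# between cells is a product kernel over row/column indices (a stand-in
# for the paper's similarity measure).
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, rank, n_anchors = 60, 40, 2, 5

# Ground-truth ratings with missing entries.
R = rng.normal(size=(n_users, rank)) @ rng.normal(size=(rank, n_items))
mask = rng.random((n_users, n_items)) < 0.3        # observed cells

def kernel(idx, anchor, width):
    """Smoothing kernel over row/column indices (illustrative only)."""
    return np.exp(-((idx - anchor) / width) ** 2)

anchors = [(rng.integers(n_users), rng.integers(n_items)) for _ in range(n_anchors)]
pred = np.zeros_like(R)
wsum = np.zeros_like(R)

for (ua, ia) in anchors:
    # Cell weights: product of a user kernel and an item kernel.
    w = np.outer(kernel(np.arange(n_users), ua, 15.0),
                 kernel(np.arange(n_items), ia, 10.0))
    # Fit a local rank-`rank` factorization by weighted gradient descent.
    U = 0.1 * rng.normal(size=(n_users, rank))
    V = 0.1 * rng.normal(size=(n_items, rank))
    for _ in range(200):
        E = (mask * w) * (R - U @ V.T)   # weighted residual on observed cells
        U += 0.01 * (E @ V)
        V += 0.01 * (E.T @ U)
    pred += w * (U @ V.T)                # kernel-weighted sum of local models
    wsum += w

pred /= np.maximum(wsum, 1e-8)
rmse = np.sqrt((((pred - R) ** 2)[~mask]).mean())
print(f"held-out RMSE of the local model sketch: {rmse:.3f}")
```

The point the abstract makes is that the rank constraint holds only near each anchor, so the combined prediction can have much higher effective rank than any single local model.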

AAAI Conference 2015 Conference Paper

Estimating Temporal Dynamics of Human Emotions

  • Seungyeon Kim
  • Joonseok Lee
  • Guy Lebanon
  • Haesun Park

Sentiment analysis predicts a one-dimensional quantity describing the positive or negative emotion of an author. Mood analysis extends the one-dimensional sentiment response to a multi-dimensional quantity, describing a diverse set of human emotions. In this paper, we extend sentiment and mood analysis temporally and model emotions as a function of time based on temporal streams of blog posts authored by a specific author. The model is useful for constructing predictive models and discovering scientific models of human emotions.

AAAI Conference 2015 Conference Paper

Local Context Sparse Coding

  • Seungyeon Kim
  • Joonseok Lee
  • Guy Lebanon
  • Haesun Park

The n-gram model has been widely used to capture the local ordering of words, yet its exploding feature space often causes an estimation issue. This paper presents local context sparse coding (LCSC), a non-probabilistic topic model that effectively handles large feature spaces using sparse coding. In addition, it introduces a new concept of locality, local contexts, which provides a representation that can generate locally coherent topics and document representations. Our model efficiently finds topics and representations by applying greedy coordinate descent updates. The model is useful for discovering local topics and the semantic flow of a document, as well as constructing predictive models.
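
The sparse-coding component the abstract relies on can be illustrated with a generic coordinate-descent solve. This is a textbook lasso-style update, not LCSC itself; the local-context construction and the paper's exact greedy updates are omitted, and the dictionary setup is hypothetical.

```python
# Generic sparse coding via coordinate descent with soft-thresholding --
# a sketch of the kind of update the abstract mentions, not LCSC itself.
# Assumes a fixed dictionary D with unit-norm columns.
import numpy as np

def soft_threshold(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def sparse_code(x, D, lam=0.1, n_sweeps=50):
    """Minimize 0.5*||x - D s||^2 + lam*||s||_1 by coordinate descent."""
    n_atoms = D.shape[1]
    s = np.zeros(n_atoms)
    r = x - D @ s                      # current residual
    for _ in range(n_sweeps):
        for j in range(n_atoms):
            r += D[:, j] * s[j]        # remove atom j's contribution
            s[j] = soft_threshold(D[:, j] @ r, lam)  # unit-norm column
            r -= D[:, j] * s[j]        # add it back with the new coefficient
    return s

rng = np.random.default_rng(0)
D = rng.normal(size=(20, 50))
D /= np.linalg.norm(D, axis=0)         # unit-norm atoms
x = D[:, 3] * 2.0 + D[:, 17] * -1.5 + 0.01 * rng.normal(size=20)
s = sparse_code(x, D)
print("nonzero atoms:", np.flatnonzero(np.abs(s) > 1e-6))
```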

ICML Conference 2013 Conference Paper

Local Low-Rank Matrix Approximation

  • Joonseok Lee
  • Seungyeon Kim 0001
  • Guy Lebanon
  • Yoram Singer

Matrix approximation is a common tool in recommendation systems, text mining, and computer vision. A prevalent assumption in constructing matrix approximations is that the partially observed matrix is low-rank. We propose a new matrix approximation model where we assume instead that the matrix is locally low-rank, leading to a representation of the observed matrix as a weighted sum of low-rank matrices. We analyze the accuracy of the proposed local low-rank modeling. Our experiments show improvements in prediction accuracy over classical approaches for recommendation tasks.

ICML Conference 2013 Conference Paper

Smooth Sparse Coding via Marginal Regression for Learning Sparse Representations

  • Krishnakumar Balasubramanian 0002
  • Kai Yu 0001
  • Guy Lebanon

We propose and analyze a novel framework for learning sparse representations based on two statistical techniques: kernel smoothing and marginal regression. The proposed approach provides a flexible framework for incorporating feature similarity or temporal information present in data sets via nonparametric kernel smoothing. We provide generalization bounds for dictionary learning using smooth sparse coding and show how the sample complexity depends on the L1 norm of the kernel function used. Furthermore, we propose using marginal regression for obtaining sparse codes, which significantly improves speed and allows one to scale to large dictionary sizes easily. We demonstrate the advantages of the proposed approach, in terms of both accuracy and speed, through extensive experiments on several real data sets. In addition, we demonstrate how the proposed approach can be used to improve semi-supervised sparse coding.
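
To see why marginal regression improves speed: instead of jointly solving a lasso per signal, codes come from a single matrix of atom-signal correlations followed by a selection step. A hedged sketch, with the top-k selection rule and all sizes chosen for illustration (the paper's kernel-smoothing step is omitted):

```python
# Sketch of marginal regression for sparse coding: one matrix multiply
# gives all atom-signal correlations; keep only the strongest per signal.
import numpy as np

def marginal_sparse_code(X, D, k=5):
    """Codes for all signals at once; keep top-k atoms per signal."""
    C = D.T @ X                                 # (n_atoms, n_signals)
    codes = np.zeros_like(C)
    top = np.argsort(-np.abs(C), axis=0)[:k]    # strongest k atoms per column
    cols = np.arange(X.shape[1])
    codes[top, cols] = C[top, cols]
    return codes

rng = np.random.default_rng(0)
D = rng.normal(size=(20, 100))
D /= np.linalg.norm(D, axis=0)
X = D[:, :3] @ rng.normal(size=(3, 8))          # signals spanned by a few atoms
print(marginal_sparse_code(X, D).shape)         # (100, 8)
```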

NeurIPS Conference 2012 Conference Paper

Automatic Feature Induction for Stagewise Collaborative Filtering

  • Joonseok Lee
  • Mingxuan Sun
  • Seungyeon Kim
  • Guy Lebanon

Recent approaches to collaborative filtering have concentrated on estimating an algebraic or statistical model, and using the model for predicting missing ratings. In this paper we observe that different models have relative advantages in different regions of the input space. This motivates our approach of using stagewise linear combinations of collaborative filtering algorithms, with non-constant combination coefficients based on kernel smoothing. The resulting stagewise model is computationally scalable and outperforms a wide selection of state-of-the-art collaborative filtering algorithms.
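
A sketch of the non-constant combination idea under simplifying assumptions: two base recommenders are blended with a weight that varies with a user feature (here, hypothetically, the user's rating count) via Nadaraya-Watson kernel smoothing. The stagewise fitting of the coefficients is not shown; the centers and coefficients below are invented.

```python
# Combining two recommenders with kernel-smoothed, non-constant weights,
# in the spirit of the abstract. Feature and kernel are illustrative.
import numpy as np

def smooth_weight(user_feature, centers, alphas, bandwidth=5.0):
    """Nadaraya-Watson estimate of the combination weight at a feature value."""
    k = np.exp(-((user_feature - centers) / bandwidth) ** 2)
    return (k @ alphas) / k.sum()

rng = np.random.default_rng(0)
n_users = 200
n_ratings = rng.integers(1, 100, size=n_users)  # per-user activity feature

pred_a = rng.normal(3.5, 1.0, size=n_users)     # e.g., a neighborhood model
pred_b = rng.normal(3.5, 1.0, size=n_users)     # e.g., a factorization model

# Hypothetical fitted coefficients at a few feature centers:
centers = np.array([5.0, 25.0, 50.0, 90.0])
alphas  = np.array([0.2, 0.5, 0.7, 0.9])        # weight on model B

w = np.array([smooth_weight(f, centers, alphas) for f in n_ratings])
blended = (1 - w) * pred_a + w * pred_b
print(blended[:5])
```

The design choice this illustrates is that heavy raters can lean on one model while cold-start users lean on another, rather than using one global mixing coefficient.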

JMLR Journal 2012 Journal Article

PREA: Personalized Recommendation Algorithms Toolkit

  • Joonseok Lee
  • Mingxuan Sun
  • Guy Lebanon

Recommendation systems are important business applications with significant economic impact. In recent years, a large number of algorithms have been proposed for recommendation systems. In this paper, we describe an open-source toolkit implementing many recommendation algorithms as well as popular evaluation metrics. In contrast to other packages, our toolkit implements recent state-of-the-art algorithms as well as most classic algorithms.

JMLR Journal 2011 Journal Article

Unsupervised Supervised Learning II: Margin-Based Classification Without Labels

  • Krishnakumar Balasubramanian
  • Pinar Donmez
  • Guy Lebanon

Many popular linear classifiers, such as logistic regression, boosting, or SVM, are trained by optimizing a margin-based risk function. Traditionally, these risk functions are computed based on a labeled data set. We develop a novel technique for estimating such risks using only unlabeled data and the marginal label distribution. We prove that the proposed risk estimator is consistent on high-dimensional data sets and demonstrate it on synthetic and real-world data. In particular, we show how the estimate is used for evaluating classifiers in transfer learning, and for training classifiers with no labeled data whatsoever.
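
One simplified reading of how such an estimator can work, not the paper's exact construction: treat the unlabeled classifier scores as a two-component Gaussian mixture whose mixing weights are pinned to the known label marginal, fit only the component parameters by EM, and read the error rate off the fitted components. The Gaussian assumption, the synthetic data, and the zero threshold below are all illustrative assumptions.

```python
# Sketch: estimate a classifier's error from unlabeled scores f(x) plus
# the known label marginal, via EM with FIXED mixing weights.
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)
p_pos = 0.3                                   # known marginal P(y = +1)
n = 5000
y = np.where(rng.random(n) < p_pos, 1, -1)
f = np.where(y == 1, 1.0, -1.0) + rng.normal(0, 0.8, n)  # scores; y unseen

# EM over means/variances only; mixing weights stay (p_pos, 1 - p_pos).
mu, sig = np.array([0.5, -0.5]), np.array([1.0, 1.0])
w = np.array([p_pos, 1 - p_pos])
for _ in range(100):
    dens = np.exp(-0.5 * ((f[:, None] - mu) / sig) ** 2) / sig
    r = w * dens
    r /= r.sum(axis=1, keepdims=True)         # responsibilities
    nk = r.sum(axis=0)
    mu = (r * f[:, None]).sum(axis=0) / nk
    sig = np.sqrt((r * (f[:, None] - mu) ** 2).sum(axis=0) / nk)

# Error of the threshold-at-zero rule: positive mass below 0 plus
# negative mass above 0, under the fitted components.
Phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))
err = p_pos * Phi(-mu[0] / sig[0]) + (1 - p_pos) * (1 - Phi(-mu[1] / sig[1]))
print(f"estimated error {err:.3f} vs. actual {np.mean(np.sign(f) != y):.3f}")
```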

JMLR Journal 2010 Journal Article

Stochastic Composite Likelihood

  • Joshua V. Dillon
  • Guy Lebanon

Maximum likelihood estimators are often of limited practical use due to the intensive computation they require. We propose a family of alternative estimators that maximize a stochastic variation of the composite likelihood function. Each of the estimators resolves the computation-accuracy tradeoff differently, and taken together they span a continuous spectrum of computation-accuracy tradeoff resolutions. We prove the consistency of the estimators, provide formulas for their asymptotic variance, and analyze their statistical robustness and computational complexity. We discuss experimental results in the context of Boltzmann machines and conditional random fields. The theoretical and experimental studies demonstrate the effectiveness of the estimators when computational resources are insufficient. They also demonstrate that in some cases reduced computational complexity is associated with robustness, thereby increasing statistical accuracy.
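
A toy sketch of the stochastic composite likelihood idea for a fully visible Boltzmann machine, with illustrative model sizes and step sizes: each iteration, every pseudo-likelihood component log p(x_j | x_-j) is included independently with probability q, so smaller q buys computation at the cost of statistical efficiency. The paper's estimator families and weighting schemes are more general than this.

```python
# Stochastic composite (pseudo-)likelihood sketch for a fully visible
# Boltzmann machine with symmetric weights W and zero diagonal.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

d, n, q = 8, 500, 0.5
A = rng.normal(size=(d, d))
W_true = 0.25 * (A + A.T)
np.fill_diagonal(W_true, 0.0)

# Gibbs-sample a training set from the true model (one chain per row).
X = (rng.random((n, d)) < 0.5).astype(float)
for _ in range(200):
    for j in range(d):
        p = sigmoid(X @ W_true[:, j])
        X[:, j] = (rng.random(n) < p).astype(float)

# Maximize the stochastic composite likelihood.
W = np.zeros((d, d))
for step in range(500):
    S = rng.random(d) < q                  # components selected this step
    G = np.zeros_like(W)
    for j in np.flatnonzero(S):
        p = sigmoid(X @ W[:, j])           # P(x_j = 1 | x_-j) under current W
        G[:, j] = X.T @ (X[:, j] - p) / n  # gradient of component j
    G = (G + G.T) / 2                      # keep W symmetric
    np.fill_diagonal(G, 0.0)
    W += 0.5 * G / q                       # inverse-probability scaling

print("max abs parameter error:", np.abs(W - W_true).max())
```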

JMLR Journal 2010 Journal Article

Unsupervised Supervised Learning I: Estimating Classification and Regression Errors without Labels

  • Pinar Donmez
  • Guy Lebanon
  • Krishnakumar Balasubramanian

Estimating the error rates of classifiers or regression models is a fundamental task in machine learning which has thus far been studied exclusively using supervised learning techniques. We propose a novel unsupervised framework for estimating these error rates using only unlabeled data and mild assumptions. We prove consistency results for the framework and demonstrate its practical applicability on both synthetic and real-world data.

UAI Conference 2009 Conference Paper

Domain Knowledge Uncertainty and Probabilistic Parameter Constraints

  • Yi Mao
  • Guy Lebanon

Incorporating domain knowledge into the modeling process is an effective way to improve learning accuracy. However, as it is provided by humans, domain knowledge can only be specified with some degree of uncertainty. We propose to explicitly model such uncertainty through probabilistic constraints over the parameter space. In contrast to hard parameter constraints, our approach is effective also when the domain knowledge is inaccurate and generally results in superior modeling accuracy. We focus on generative and conditional modeling where the parameters are assigned a Dirichlet or Gaussian prior and demonstrate the framework with experiments on both synthetic and real-world data.
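
A small sketch contrasting the two regimes the abstract compares, in the Gaussian-prior setting it mentions: inaccurate domain knowledge ("the first coefficient is about 2") is imposed either as a hard clamp or as a Gaussian prior whose variance encodes the knowledge's uncertainty. Data, prior mean, and variance are invented for illustration.

```python
# Hard parameter constraint vs. probabilistic (Gaussian-prior) constraint
# for logistic regression, fit by gradient ascent on the (penalized)
# log-likelihood. Toy data; the stated domain knowledge is wrong.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

n = 300
X = rng.normal(size=(n, 2))
w_true = np.array([1.2, -0.7])            # knowledge "w0 ~ 2" is inaccurate
y = (rng.random(n) < sigmoid(X @ w_true)).astype(float)

prior_mean = np.array([2.0, 0.0])         # stated domain knowledge
prior_var = np.array([0.5, 10.0])         # uncertainty about it

def fit(soft):
    w = np.zeros(2)
    for _ in range(2000):
        g = X.T @ (y - sigmoid(X @ w)) / n          # log-likelihood gradient
        if soft:
            g -= (w - prior_mean) / (prior_var * n) # Gaussian prior term
        w += 0.5 * g
        if not soft:
            w[0] = 2.0                              # hard constraint: clamp
    return w

print("hard constraint:", fit(False), "soft prior:", fit(True))
```

The soft version lets the data pull the constrained coefficient back toward its true value, which is the abstract's argument for why probabilistic constraints tolerate inaccurate knowledge.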

JMLR Journal 2008 Journal Article

Non-Parametric Modeling of Partially Ranked Data

  • Guy Lebanon
  • Yi Mao

Statistical models on full and partial rankings of n items are often of limited practical use for large n due to computational considerations. We explore the use of non-parametric models for partially ranked data and derive computationally efficient procedures for their use for large n. The derivations are largely possible through combinatorial and algebraic manipulations based on the lattice of partial rankings. A bias-variance analysis and an experimental study demonstrate the applicability of the proposed method.

NeurIPS Conference 2007 Conference Paper

Non-parametric Modeling of Partially Ranked Data

  • Guy Lebanon
  • Yi Mao

Statistical models on full and partial rankings of n items are often of limited practical use for large n due to computational considerations. We explore the use of non-parametric models for partially ranked data and derive efficient procedures for their use for large n. The derivations are largely possible through combinatorial and algebraic manipulations based on the lattice of partial rankings. In particular, we demonstrate for the first time a non-parametric coherent and consistent model capable of efficiently aggregating partially ranked data of different types.

UAI Conference 2007 Conference Paper

Statistical Translation, Heat Kernels and Expected Distances

  • Joshua V. Dillon
  • Yi Mao
  • Guy Lebanon
  • Jian Zhang 0003

High dimensional structured data such as text and images is often poorly understood and misrepresented in statistical modeling. The standard histogram representation suffers from high variance and performs poorly in general. We explore novel connections between statistical translation, heat kernels on manifolds and graphs, and expected distances. These connections provide a new framework for unsupervised metric learning for text documents. Experiments indicate that the resulting distances are generally superior to their more standard counterparts.

JMLR Journal 2007 Journal Article

The Locally Weighted Bag of Words Framework for Document Representation

  • Guy Lebanon
  • Yi Mao
  • Joshua Dillon

The popular bag of words assumption represents a document as a histogram of word occurrences. While computationally efficient, such a representation is unable to maintain any sequential information. We present an effective sequential document representation that goes beyond the bag of words representation and its n-gram extensions. This representation uses local smoothing to embed documents as smooth curves in the multinomial simplex, thereby preserving valuable sequential information. In contrast to bag of words or n-grams, the new representation is able to robustly capture medium and long range sequential trends in the document. We discuss the representation and its geometric properties and demonstrate its applicability for various text processing tasks.
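
A minimal sketch of the local-smoothing construction: one-hot word indicators are smoothed across document positions with a Gaussian kernel, so each position yields a local word histogram and the document becomes a curve of points in the multinomial simplex. Kernel shape, bandwidth, and the toy vocabulary are illustrative assumptions, not the paper's settings.

```python
# Locally weighted bag-of-words sketch: a document as a curve of local
# word histograms (points in the simplex) along its positions.
import numpy as np

def lowbow(doc, vocab, bandwidth=2.0, n_points=20):
    """Return an (n_points, |vocab|) array of local histograms."""
    idx = {word: i for i, word in enumerate(vocab)}
    T = len(doc)
    onehot = np.zeros((T, len(vocab)))
    onehot[np.arange(T), [idx[word] for word in doc]] = 1.0
    positions = np.linspace(0, T - 1, n_points)
    curve = np.empty((n_points, len(vocab)))
    for k, mu in enumerate(positions):
        w = np.exp(-0.5 * ((np.arange(T) - mu) / bandwidth) ** 2)
        curve[k] = (w[:, None] * onehot).sum(axis=0) / w.sum()  # simplex point
    return curve

doc = "good start good middle bad bad end".split()
vocab = sorted(set(doc))
curve = lowbow(doc, vocab)
print(curve.shape, curve.sum(axis=1)[:3])   # each row sums to 1
```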

NeurIPS Conference 2006 Conference Paper

Isotonic Conditional Random Fields and Local Sentiment Flow

  • Yi Mao
  • Guy Lebanon

We examine the problem of predicting local sentiment flow in documents, and its application to several areas of text analysis. Formally, the problem is stated as predicting an ordinal sequence based on a sequence of word sets. In the spirit of isotonic regression, we develop a variant of conditional random fields that is well suited to handle this problem. Using the Möbius transform, we express the model as a simple convex optimization problem. Experiments demonstrate the model and its applications to sentiment prediction, style analysis, and text summarization.

UAI Conference 2006 Conference Paper

Sequential Document Representations and Simplicial Curves

  • Guy Lebanon

The popular bag of words assumption represents a document as a histogram of word occurrences. While computationally efficient, such a representation is unable to maintain any sequential information. We present a continuous and differentiable sequential document representation that goes beyond the bag of words assumption, and yet is efficient and effective. This representation employs smooth curves in the multinomial simplex to account for sequential information. We discuss the representation and its geometric properties and demonstrate its applicability for the task of text classification.

JMLR Journal 2005 Journal Article

Diffusion Kernels on Statistical Manifolds

  • John Lafferty
  • Guy Lebanon

A family of kernels for statistical learning is introduced that exploits the geometric structure of statistical models. The kernels are based on the heat equation on the Riemannian manifold defined by the Fisher information metric associated with a statistical family, and generalize the Gaussian kernel of Euclidean space. As an important special case, kernels based on the geometry of multinomial families are derived, leading to kernel-based learning algorithms that apply naturally to discrete data. Bounds on covering numbers and Rademacher averages for the kernels are proved using bounds on the eigenvalues of the Laplacian on Riemannian manifolds. Experimental results are presented for document classification, for which the use of multinomial geometry is natural and well motivated, and improvements are obtained over the Gaussian and linear kernels that have been standard for text classification.
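
The multinomial special case admits a compact closed-form approximation: the Fisher geodesic distance between two points on the simplex is d(x, y) = 2 arccos(Σ_i √(x_i y_i)), and the heat-kernel (parametrix) approximation is proportional to exp(-d²/(4t)). A sketch, with smoothed term frequencies standing in for real documents:

```python
# First-order approximation of the multinomial diffusion kernel.
import numpy as np

def diffusion_kernel(X, Y, t=0.1):
    """X, Y: rows are points on the simplex (e.g., normalized word counts)."""
    inner = np.sqrt(X) @ np.sqrt(Y).T
    d = 2.0 * np.arccos(np.clip(inner, -1.0, 1.0))  # Fisher geodesic distance
    return np.exp(-d ** 2 / (4.0 * t))

rng = np.random.default_rng(0)
counts = rng.integers(0, 5, size=(4, 10)).astype(float) + 0.1  # smoothed counts
X = counts / counts.sum(axis=1, keepdims=True)                 # simplex points
print(np.round(diffusion_kernel(X, X), 3))
```

The resulting Gram matrix can be plugged into any kernel method; the diffusion time t plays the role of the Gaussian kernel's bandwidth.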

UAI Conference 2004 Conference Paper

An Extended Cencov-Campbell Characterization of Conditional Information Geometry

  • Guy Lebanon

We formulate and prove an axiomatic characterization of conditional information geometry, for both the normalized and the nonnormalized cases. This characterization extends the axiomatic derivation of the Fisher geometry by Cencov and Campbell to the cone of positive conditional models, and as a special case to the manifold of conditional distributions. Due to the close connection between the conditional I-divergence and the product Fisher information metric the characterization provides a new axiomatic interpretation of the primal problems underlying logistic regression and AdaBoost.

UAI Conference 2003 Conference Paper

Learning Riemannian Metrics

  • Guy Lebanon

We propose a solution to the problem of estimating a Riemannian metric associated with a given differentiable manifold. The metric learning problem is based on minimizing the relative volume of a given set of points. We derive the details for a family of metrics on the multinomial simplex. The resulting metric has applications in text classification and bears some similarity to the TFIDF representation of text documents.

NeurIPS Conference 2002 Conference Paper

Conditional Models on the Ranking Poset

  • Guy Lebanon
  • John Lafferty

A distance-based conditional model on the ranking poset is presented for use in classification and ranking. The model is an extension of the Mallows model, and generalizes the classifier combination methods used by several ensemble learning algorithms, including error correcting output codes, discrete AdaBoost, logistic regression and cranking. The algebraic structure of the ranking poset leads to a simple Bayesian interpretation of the conditional model and its special cases. In addition to a unifying view, the framework suggests a probabilistic interpretation for error correcting output codes and an extension beyond the binary coding scheme.
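
Since the model extends the Mallows model, a sketch of that base case may help: p(π | σ) ∝ exp(-θ d(π, σ)) with Kendall's tau distance, enumerated exactly for a small item set. The ranking-poset extension itself is not reproduced here.

```python
# Mallows model over permutations with Kendall's tau distance,
# computed by exact enumeration (feasible only for small n).
import numpy as np
from itertools import permutations

def kendall_tau(pi, sigma):
    """Number of item pairs ordered differently by the two rankings."""
    n = len(pi)
    pos_pi = {v: i for i, v in enumerate(pi)}
    pos_sig = {v: i for i, v in enumerate(sigma)}
    return sum(
        (pos_pi[a] < pos_pi[b]) != (pos_sig[a] < pos_sig[b])
        for a in range(n) for b in range(a + 1, n)
    )

def mallows_pmf(sigma, theta):
    perms = list(permutations(range(len(sigma))))
    w = np.array([np.exp(-theta * kendall_tau(p, sigma)) for p in perms])
    return perms, w / w.sum()

perms, pmf = mallows_pmf(sigma=(0, 1, 2, 3), theta=1.0)
best = int(np.argmax(pmf))
print("modal ranking:", perms[best], "probability:", round(float(pmf[best]), 3))
```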

NeurIPS Conference 2002 Conference Paper

Information Diffusion Kernels

  • Guy Lebanon
  • John Lafferty

A new family of kernels for statistical learning is introduced that exploits the geometric structure of statistical models. Based on the heat equation on the Riemannian manifold defined by the Fisher information metric, information diffusion kernels generalize the Gaussian kernel of Euclidean space, and provide a natural way of combining generative statistical modeling with non-parametric discriminative learning. As a special case, the kernels give a new approach to applying kernel-based learning algorithms to discrete data. Bounds on covering numbers for the new kernels are proved using spectral theory in differential geometry, and experimental results are presented for text classification.

NeurIPS Conference 2001 Conference Paper

Boosting and Maximum Likelihood for Exponential Models

  • Guy Lebanon
  • John Lafferty

We derive an equivalence between AdaBoost and the dual of a convex optimization problem, showing that the only difference between minimizing the exponential loss used by AdaBoost and maximum likelihood for exponential models is that the latter requires the model to be normalized to form a conditional probability distribution over labels. In addition to establishing a simple and easily understood connection between the two methods, this framework enables us to derive new regularization procedures for boosting that directly correspond to penalized maximum likelihood. Experiments on UCI datasets support our theoretical analysis and give additional insight into the relationship between boosting and logistic regression.
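
The equivalence can be made concrete on a toy problem: gradient descent on the unnormalized exponential loss exp(-y f(x)) versus the normalized logistic loss log(1 + exp(-y f(x))), with the same linear model, so only the loss (normalized into a conditional label distribution or not) differs. All data below is synthetic.

```python
# Exponential loss (AdaBoost-style) vs. logistic loss on one linear model.
import numpy as np

rng = np.random.default_rng(0)
n = 400
X = rng.normal(size=(n, 2))
y = np.sign(X @ np.array([1.0, -1.0]) + 0.3 * rng.normal(size=n))

def fit(loss_grad, lr=0.1, steps=500):
    w = np.zeros(2)
    for _ in range(steps):
        m = y * (X @ w)                       # margins
        # dL/dw = loss_grad(m) * d(-m)/dw = loss_grad(m) * (-y * x)
        w -= lr * (loss_grad(m)[:, None] * (-y[:, None] * X)).mean(axis=0)
    return w

# Magnitude of dL/dm for each loss (sign folded into -y*x above):
exp_grad = lambda m: np.exp(-np.clip(m, -30, 30))               # exp(-m)
log_grad = lambda m: 1.0 / (1.0 + np.exp(np.clip(m, -30, 30)))  # 1/(1+e^m)

print("exp-loss w:", fit(exp_grad))
print("log-loss w:", fit(log_grad))
```

Both recover weight vectors pointing in roughly the same direction; the exponential loss simply penalizes badly misclassified points much more aggressively, which is where the normalization in the likelihood view makes the difference.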