
Author name cluster

Kai Yu 0001

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers
1 author row

Possible papers


UAI Conference 2013 Conference Paper

High-dimensional Joint Sparsity Random Effects Model for Multi-task Learning

  • Krishnakumar Balasubramanian 0002
  • Kai Yu 0001
  • Tong Zhang 0001

Joint sparsity regularization in multi-task learning has attracted much attention in recent years. The traditional convex formulation employs the group Lasso relaxation to achieve joint sparsity across tasks. Although this approach leads to a simple convex formulation, it suffers from several issues due to the looseness of the relaxation. To remedy this problem, we view jointly sparse multi-task learning as a specialized random effects model, and derive a convex relaxation approach that involves two steps. The first step learns the covariance matrix of the coefficients using a convex formulation which we refer to as sparse covariance coding; the second step solves a ridge regression problem with a sparse quadratic regularizer based on the covariance matrix obtained in the first step. It is shown that this approach produces an asymptotically optimal quadratic regularizer in the multi-task learning setting when the number of tasks approaches infinity. Experimental results demonstrate that the convex formulation obtained via the proposed model significantly outperforms group Lasso (and related multi-stage formulations).
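
As a rough illustration of the two formulations contrasted in this abstract, the group Lasso objective and a ridge problem with a quadratic regularizer can be written as below. The notation (X_t, y_t, w_t, \lambda, \omega_j) is assumed here and does not come from the listing. The diagonal form of the covariance in the second problem is likewise an assumption of this sketch, not the paper's stated formulation.

```latex
% Group Lasso relaxation: W = [w_1, ..., w_T]; row W_{j,:} collects feature j across all T tasks.
\min_{W} \; \sum_{t=1}^{T} \lVert y_t - X_t w_t \rVert_2^2
  + \lambda \sum_{j=1}^{d} \lVert W_{j,:} \rVert_2

% Ridge step for task t, ASSUMING a diagonal covariance \Omega = \mathrm{diag}(\omega_1, \dots, \omega_d):
\min_{w_t} \; \lVert y_t - X_t w_t \rVert_2^2
  + \lambda \sum_{j:\, \omega_j > 0} \frac{w_{t,j}^2}{\omega_j}
\quad \text{subject to } w_{t,j} = 0 \text{ whenever } \omega_j = 0
```

Under this assumed form, features whose estimated variance \omega_j is zero are switched off in every task, which is how a sparse covariance estimate would translate into joint sparsity.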

ICML Conference 2013 Conference Paper

Smooth Sparse Coding via Marginal Regression for Learning Sparse Representations

  • Krishnakumar Balasubramanian 0002
  • Kai Yu 0001
  • Guy Lebanon

We propose and analyze a novel framework for learning sparse representations, based on two statistical techniques: kernel smoothing and marginal regression. The proposed approach provides a flexible framework for incorporating feature similarity or temporal information present in data sets, via nonparametric kernel smoothing. We provide generalization bounds for dictionary learning using smooth sparse coding and show how the sample complexity depends on the L1 norm of the kernel function used. Furthermore, we propose using marginal regression for obtaining sparse codes, which significantly improves the speed and allows one to scale to large dictionary sizes easily. We demonstrate the advantages of the proposed approach, in terms of both accuracy and speed, through extensive experimentation on several real data sets. In addition, we demonstrate how the proposed approach could be used for improving semisupervised sparse coding.
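
The marginal regression step mentioned in this abstract can be sketched in a few lines: each dictionary atom is scored against the signal independently, and only the strongest coefficients are kept. This is a minimal illustration, not the paper's implementation. The function name, the top-k selection rule, and the unit-norm assumption on atoms are all assumptions of this sketch, and the kernel smoothing component is not shown.

```python
def marginal_regression_code(dictionary, x, k):
    """Sparse code for x that keeps the k atoms most correlated with x.

    dictionary: list of atoms, each a list of floats (assumed unit-norm).
    x: signal as a list of floats, same length as each atom.
    k: number of nonzero coefficients to keep (hypothetical sparsity knob).
    """
    # Regress x on each atom separately: one inner product per atom.
    scores = [sum(a_i * x_i for a_i, x_i in zip(atom, x)) for atom in dictionary]
    # Rank atoms by absolute score and keep the k largest.
    keep = sorted(range(len(scores)), key=lambda j: abs(scores[j]), reverse=True)[:k]
    code = [0.0] * len(scores)
    for j in keep:
        code[j] = scores[j]
    return code
```

Because every atom is scored with a single inner product and no joint optimization is solved, the cost grows linearly with the dictionary size, which is consistent with the speed argument made in the abstract.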

ICML Conference 2006 Conference Paper

Active learning via transductive experimental design

  • Kai Yu 0001
  • Jinbo Bi
  • Volker Tresp

This paper considers the problem of selecting the most informative experiments x to get measurements y for learning a regression model y = f(x). We propose a novel and simple concept for active learning, transductive experimental design, that explores available unmeasured experiments (i.e., unlabeled data) and scales better than classic experimental design methods. Our in-depth analysis shows that the new method tends to favor experiments that are, on the one hand, hard to predict and, on the other hand, representative of the rest of the experiments. Efficient optimization of the new design problem is achieved through alternating optimization and sequential greedy search. Extensive experimental results on synthetic problems and three real-world tasks, including questionnaire design for preference learning, active learning for text categorization, and spatial sensor placement, highlight the advantages of the proposed approaches.
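
A sequential greedy search of the kind this abstract mentions can be sketched over a kernel (Gram) matrix of the candidate experiments. The scoring rule used here (squared column norm divided by the point's own kernel value plus a regularizer `mu`), the deflation update, and the function name are assumptions of this sketch and are not taken from the listing.

```python
def greedy_transductive_design(K, m, mu=0.1):
    """Greedily pick m candidate indices from a symmetric kernel matrix K.

    K: list of lists of floats, K[i][j] = similarity of candidates i and j.
    m: number of experiments to select.
    mu: positive regularization constant (hypothetical parameter).
    """
    n = len(K)
    K = [row[:] for row in K]  # work on a copy so the caller's matrix is untouched
    selected = []
    for _ in range(m):
        best, best_score = None, -1.0
        for i in range(n):
            if i in selected:
                continue
            # A large column norm means the point is similar to many others.
            score = sum(K[j][i] ** 2 for j in range(n)) / (K[i][i] + mu)
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        # Deflate K: subtract what the chosen point already explains.
        col = [K[j][best] for j in range(n)]
        denom = K[best][best] + mu
        for a in range(n):
            for b in range(n):
                K[a][b] -= col[a] * col[b] / denom
    return selected
```

After each pick, the deflation shrinks the scores of points similar to those already chosen, so later picks tend to land in regions of the candidate pool that are not yet covered.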

UAI Conference 2006 Conference Paper

Infinite Hidden Relational Models

  • Zhao Xu 0001
  • Volker Tresp
  • Kai Yu 0001
  • Hans-Peter Kriegel

In many cases it makes sense to model a relationship symmetrically, not implying any particular directionality. Consider the classical example of a recommendation system where the rating of an item by a user should symmetrically be dependent on the attributes of both the user and the item. The attributes of the (known) relationships are also relevant for predicting attributes of entities and for predicting attributes of new relations. In recommendation systems, the exploitation of relational attributes is often referred to as collaborative filtering. Again, in many applications one might prefer to model the collaborative effect in a symmetrical way. In this paper we present a relational model, which is completely symmetrical. The key innovation is that we introduce for each entity (or object) an infinite-dimensional latent variable as part of a Dirichlet process (DP) model. We discuss inference in the model, which is based on a DP Gibbs sampler, i.e., the Chinese restaurant process. We extend the Chinese restaurant process to be applicable to relational modeling. Our approach is evaluated in three applications. One is a recommendation system based on the MovieLens data set. The second application concerns the prediction of the function of yeast genes/proteins on the data set of KDD Cup 2001 using a multi-relational model. The third application involves a relational medical domain. The experimental results show that our model gives significantly improved estimates of attributes describing relationships or entities in complex relational models.
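
For readers unfamiliar with the Chinese restaurant process named in this abstract, the sketch below draws cluster assignments from the standard (non-relational) process. It only illustrates the prior that a DP Gibbs sampler builds on; the relational extension described in the paper is not shown. The function name, the `alpha` concentration parameter name, and the seeding are assumptions of this sketch.

```python
import random


def sample_crp(n_customers, alpha, seed=0):
    """Draw table assignments from a Chinese restaurant process.

    n_customers: number of entities to seat.
    alpha: positive concentration parameter; larger values open more tables.
    seed: seed for the local random generator (for reproducibility).
    """
    rng = random.Random(seed)
    assignments = []  # assignments[i] is the table index of customer i
    table_sizes = []  # table_sizes[k] is the number of customers at table k
    for _ in range(n_customers):
        # Existing table k is chosen in proportion to its size;
        # a new table is chosen in proportion to alpha.
        weights = table_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_sizes):
            table_sizes.append(1)  # open a new table
        else:
            table_sizes[k] += 1
        assignments.append(k)
    return assignments, table_sizes
```

The number of tables is not fixed in advance and grows with the data, which is the sense in which each entity's latent variable can be treated as infinite-dimensional.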

UAI Conference 2003 Conference Paper

Collaborative Ensemble Learning: Combining Collaborative and Content-Based Information Filtering via Hierarchical Bayes

  • Kai Yu 0001
  • Anton Schwaighofer
  • Volker Tresp
  • Wei-Ying Ma
  • HongJiang Zhang

Collaborative filtering (CF) and content-based filtering (CBF) have been widely used in information filtering applications. Both approaches have their strengths and weaknesses, which is why researchers have developed hybrid systems. This paper proposes a novel approach to unify CF and CBF in a probabilistic framework, named collaborative ensemble learning. It uses probabilistic SVMs to model each user's profile (as CBF does). At the prediction phase, it combines a society of users' profiles, represented by their respective SVM models, to predict an active user's preferences (the CF idea). The combination scheme is embedded in a probabilistic framework and retains an intuitive explanation. Moreover, collaborative ensemble learning does not require a global training stage and thus can incrementally incorporate new data. We report results based on two data sets. For the Reuters-21578 text data set, we simulate user ratings under the assumption that each user is interested in only one category. In the second experiment, we use users' opinions on a set of 642 art images that were collected through a web-based survey. For both data sets, collaborative ensemble learning achieved excellent performance in terms of recommendation accuracy.
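
One way to picture the combination step in this abstract is a weighted vote over stored profile models, where each profile is weighted by how well it explains the active user's known ratings. The sketch below assumes that weighting rule; it is an illustration, not the paper's scheme. The profiles are stand-in callables, not SVMs, and all names are hypothetical.

```python
import math


def ensemble_predict(profiles, observed, item):
    """Predict the probability that the active user likes `item`.

    profiles: list of callables p(item) -> probability of a positive rating,
              one per stored user profile (stand-ins for probabilistic SVMs).
    observed: list of (item, liked) pairs already rated by the active user.
    item: the item to score.
    """
    log_weights = []
    for p in profiles:
        # Log-likelihood of the active user's ratings under this profile.
        ll = 0.0
        for it, liked in observed:
            prob = min(max(p(it), 1e-9), 1.0 - 1e-9)  # avoid log(0)
            ll += math.log(prob if liked else 1.0 - prob)
        log_weights.append(ll)
    # Normalize in log space for numerical stability.
    top = max(log_weights)
    weights = [math.exp(lw - top) for lw in log_weights]
    total = sum(weights)
    return sum(w * p(item) for w, p in zip(weights, profiles)) / total
```

Adding a user under this assumed scheme only means appending one more profile to the list, with no joint retraining, which matches the incremental property described in the abstract.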