Author name cluster

Shipeng Yu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

17 papers

2 author rows

JMLR Journal 2016 Journal Article

Multiplicative Multitask Feature Learning

Xin Wang
Jinbo Bi
Shipeng Yu
Jiangwen Sun
Minghu Song

We investigate a general framework of multiplicative multitask feature learning which decomposes each task's model parameters into a multiplication of two components. One of the components is used across all tasks and the other component is task-specific. Several previous methods can be proved to be special cases of our framework. We study the theoretical properties of this framework when different regularization conditions are applied to the two decomposed components. We prove that this framework is mathematically equivalent to the widely used multitask feature learning methods that are based on a joint regularization of all model parameters, but with a more general form of regularizers. Further, an analytical formula is derived for the across-task component as related to the task-specific component for all these regularizers, leading to a better understanding of the shrinkage effects of different regularizers. Study of this framework motivates new multitask learning algorithms. We propose two new learning formulations by varying the parameters in the proposed framework. An efficient blockwise coordinate descent algorithm is developed suitable for solving the entire family of formulations with rigorous convergence analysis. Simulation studies have identified the statistical properties of data that would be in favor of the new formulations. Extensive empirical studies on various classification and regression benchmark data sets have revealed the relative advantages of the two new formulations by comparing with the state of the art, which provides instructive insights into the feature learning problem with multiple tasks.

AIIM Journal 2015 Journal Article

Predicting readmission risk with institution-specific prediction models

Shipeng Yu
Faisal Farooq
Alexander Van Esbroeck
Glenn Fung
Vikram Anand
Balaji Krishnapuram

NeurIPS Conference 2014 Conference Paper

On Multiplicative Multitask Feature Learning

Xin Wang
Jinbo Bi
Shipeng Yu
Jiangwen Sun

We investigate a general framework of multiplicative multitask feature learning which decomposes each task's model parameters into a multiplication of two components. One of the components is used across all tasks and the other component is task-specific. Several previous methods have been proposed as special cases of our framework. We study the theoretical properties of this framework when different regularization conditions are applied to the two decomposed components. We prove that this framework is mathematically equivalent to the widely used multitask feature learning methods that are based on a joint regularization of all model parameters, but with a more general form of regularizers. Further, an analytical formula is derived for the across-task component as related to the task-specific component for all these regularizers, leading to a better understanding of the shrinkage effect. Study of this framework motivates new multitask learning algorithms. We propose two new learning formulations by varying the parameters in the proposed framework. Empirical studies have revealed the relative advantages of the two new formulations by comparing with the state of the art, which provides instructive insights into the feature learning problem with multiple tasks.

JMLR Journal 2012 Journal Article

Eliminating Spammers and Ranking Annotators for Crowdsourced Labeling Tasks

Vikas C. Raykar
Shipeng Yu

With the advent of crowdsourcing services it has become quite cheap and reasonably effective to get a data set labeled by multiple annotators in a short amount of time. Various methods have been proposed to estimate the consensus labels by correcting for the bias of annotators with different kinds of expertise. Since we do not have control over the quality of the annotators, very often the annotations can be dominated by spammers, defined as annotators who assign labels randomly without actually looking at the instance. Spammers can make the cost of acquiring labels very expensive and can potentially degrade the quality of the final consensus labels. In this paper we propose an empirical Bayesian algorithm called SpEM that iteratively eliminates the spammers and estimates the consensus labels based only on the good annotators. The algorithm is motivated by defining a spammer score that can be used to rank the annotators. Experiments on simulated and real data show that the proposed approach is better than (or as good as) the earlier approaches in terms of the accuracy and uses a significantly smaller number of annotators.

JMLR Journal 2011 Journal Article

Bayesian Co-Training

Shipeng Yu
Balaji Krishnapuram
Rómer Rosales
R. Bharat Rao

Co-training (or more generally, co-regularization) has been a popular algorithm for semi-supervised learning in data with two feature representations (or views), but the fundamental assumptions underlying this type of model are still unclear. In this paper we propose a Bayesian undirected graphical model for co-training, or more generally for semi-supervised multi-view learning. This makes explicit the previously unstated assumptions of a large class of co-training type algorithms, and also clarifies the circumstances under which these assumptions fail. Building upon new insights from this model, we propose an improved method for co-training, which is a novel co-training kernel for Gaussian process classifiers. The resulting approach is convex and avoids local-maxima problems, and it can also automatically estimate how much each view should be trusted to accommodate noisy or unreliable views. The Bayesian co-training approach can also elegantly handle data samples with missing views, that is, some of the views are not available for some data points at learning time. This is further extended to an active sensing framework, in which the missing (sample, view) pairs are actively acquired to improve learning performance. The strength of the active sensing model is that one actively sensed (sample, view) pair would improve the joint multi-view classification on all the samples. Experiments on toy data and several real world data sets illustrate the benefits of this approach.

NeurIPS Conference 2011 Conference Paper

Ranking annotators for crowdsourced labeling tasks

Vikas Raykar
Shipeng Yu

JMLR Journal 2010 Journal Article

Learning From Crowds

Vikas C. Raykar
Shipeng Yu
Linda H. Zhao
Gerardo Hermosillo Valadez
Charles Florin
Luca Bogoni
Linda Moy

For many supervised learning tasks it may be infeasible (or very expensive) to obtain objective and reliable labels. Instead, we can collect subjective (possibly noisy) labels from multiple experts or annotators. In practice, there is a substantial amount of disagreement among the annotators, and hence it is of great practical interest to address conventional supervised learning problems in this scenario. In this paper we describe a probabilistic approach for supervised learning when we have multiple annotators providing (possibly noisy) labels but no absolute gold standard. The proposed algorithm evaluates the different experts and also gives an estimate of the actual hidden labels. Experimental results indicate that the proposed method is superior to the commonly used majority voting baseline.

IJCAI Conference 2009 Conference Paper

Liang Sun
Shuiwang Ji
Shipeng Yu
Jieping Ye

Canonical correlation analysis (CCA) and partial least squares (PLS) are well-known techniques for feature extraction from two sets of multidimensional variables. The fundamental difference between CCA and PLS is that CCA maximizes the correlation while PLS maximizes the covariance. Although both CCA and PLS have been applied successfully in various applications, the intrinsic relationship between them remains unclear. In this paper, we attempt to address this issue by showing the equivalence relationship between CCA and orthonormalized partial least squares (OPLS), a variant of PLS. We further extend the equivalence relationship to the case when regularization is employed for both sets of variables. In addition, we show that the CCA projection for one set of variables is independent of the regularization on the other set of variables. We have performed experimental studies using both synthetic and real data sets and our results confirm the established equivalence relationship. The presented analysis provides novel insights into the connection between these two existing algorithms as well as the effect of the regularization.

IJCAI Conference 2009 Conference Paper

Zheng Zhao
Liang Sun
Shipeng Yu
Huan Liu
Jieping Ye

Kernel discriminant analysis (KDA) is an effective approach for supervised nonlinear dimensionality reduction. Probabilistic models can be used with KDA to improve its robustness. However, the state of the art of such models could only handle binary class problems, which confines their application in many real world problems. To overcome this limitation, we propose a novel nonparametric probabilistic model based on Gaussian process for KDA to handle multiclass problems. The model provides a novel Bayesian interpretation for KDA, which allows its parameters to be automatically tuned through the optimization of the marginal log-likelihood of the data. Empirical study demonstrates the efficacy of the proposed model.

ICML Conference 2009 Conference Paper

Supervised learning from multiple experts: whom to trust when everyone lies a bit

Vikas C. Raykar
Shipeng Yu
Linda H. Zhao
Anna K. Jerebko
Charles Florin
Gerardo Hermosillo Valadez
Luca Bogoni
Linda Moy

We describe a probabilistic approach for supervised learning when we have multiple experts/annotators providing (possibly noisy) labels but no absolute gold standard. The proposed algorithm evaluates the different experts and also gives an estimate of the actual hidden labels. Experimental results indicate that the proposed method is superior to the commonly used majority voting baseline.

NeurIPS Conference 2007 Conference Paper

Bayesian Co-Training

Shipeng Yu
Balaji Krishnapuram
Harald Steck
R. Rao
Rómer Rosales

We propose a Bayesian undirected graphical model for co-training, or more generally for semi-supervised multi-view learning. This makes explicit the previously unstated assumptions of a large class of co-training type algorithms, and also clarifies the circumstances under which these assumptions fail. Building upon new insights from this model, we propose an improved method for co-training, which is a novel co-training kernel for Gaussian process classifiers. The resulting approach is convex and avoids local-maxima problems, unlike some previous multi-view learning methods. Furthermore, it can automatically estimate how much each view should be trusted, and thus accommodate noisy or unreliable views. Experiments on toy data and real world data sets illustrate the benefits of this approach.

ICML Conference 2007 Conference Paper

Local learning projections

Mingrui Wu
Kai Yu 0001
Shipeng Yu
Bernhard Schölkopf

ICML Conference 2007 Conference Paper

Robust multi-task learning with t-processes

Shipeng Yu
Volker Tresp
Kai Yu 0001

ICML Conference 2006 Conference Paper

Collaborative ordinal regression

Shipeng Yu
Kai Yu 0001
Volker Tresp
Hans-Peter Kriegel

NeurIPS Conference 2006 Conference Paper

Stochastic Relational Models for Discriminative Link Prediction

Kai Yu
Wei Chu
Shipeng Yu
Volker Tresp
Zhao Xu

We introduce a Gaussian process (GP) framework, stochastic relational models (SRM), for learning social, physical, and other relational phenomena where interactions between entities are observed. The key idea is to model the stochastic structure of entity relationships (i.e., links) via a tensor interaction of multiple GPs, each defined on one type of entities. These models in fact define a set of nonparametric priors on infinite dimensional tensor matrices, where each element represents a relationship between a tuple of entities. By maximizing the marginalized likelihood, information is exchanged between the participating GPs through the entire relational network, so that the dependency structure of links is messaged to the dependency of entities, reflected by the adapted GP kernels. The framework offers a discriminative approach to link prediction, namely, predicting the existences, strengths, or types of relationships based on the partially observed linkage network as well as the attributes of entities (if given). We discuss properties and variants of SRM and derive an efficient learning algorithm. Very encouraging experimental results are achieved on a toy problem and a user-movie preference link prediction task. In the end we discuss extensions of SRM to general relational learning tasks.

ICML Conference 2005 Conference Paper

Dirichlet enhanced relational learning

Zhao Xu 0001
Volker Tresp
Kai Yu 0001
Shipeng Yu
Hans-Peter Kriegel

NeurIPS Conference 2005 Conference Paper

Soft Clustering on Graphs

Kai Yu
Shipeng Yu
Volker Tresp

We propose a simple clustering framework on graphs encoding pairwise data similarities. Unlike usual similarity-based methods, the approach softly assigns data to clusters in a probabilistic way. More importantly, a hierarchical clustering is naturally derived in this framework to gradually merge lower-level clusters into higher-level ones. A random walk analysis indicates that the algorithm exposes clustering structures in various resolutions, i.e., a higher level statistically models a longer-term diffusion on graphs and thus discovers a more global clustering structure. Finally we provide very encouraging experimental results.
