
Author name cluster

Hailun Lin

Papers associated with this exact author name in Arrow. This page groups case-insensitive exact-name matches; it is not a full author-identity disambiguation profile.

5 papers

Possible papers


IJCAI 2018 · Conference Paper

Fast Cross-Validation

  • Yong Liu
  • Hailun Lin
  • Lizhong Ding
  • Weiping Wang
  • Shizhong Liao

Cross-validation (CV) is the most widely adopted approach to model selection. However, CV is computationally expensive because it requires training the learner multiple times, which makes it infeasible for large-scale model selection. In this paper, we present an approximate approach to CV based on the theoretical notion of the Bouligand influence function (BIF) and the Nyström method for kernel methods. We first establish the relationship between the BIF and CV, and propose a method to approximate CV via the Taylor expansion of the BIF. Then, we provide a novel computing method to calculate the BIF for a general distribution and evaluate the BIF for the sample distribution. Finally, we use the Nyström method to accelerate the computation of the BIF matrix, yielding the final approximate CV criterion. The proposed approximate CV requires training only once and is suitable for a wide variety of kernel methods. Experimental results on many datasets show that our approximate CV has no statistical discrepancy with the original CV but significantly improves efficiency.
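
As a rough illustration of the acceleration step the abstract mentions, here is a minimal Python sketch of a rank-m Nyström approximation of a kernel matrix. The `rbf` helper, the landmark count `m`, and all names are our assumptions; the BIF-based CV criterion itself is defined in the paper.

```python
import numpy as np

def rbf(A, B, gamma=0.5):
    # Gaussian (RBF) kernel matrix between row sets A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_factors(kernel_fn, X, m, seed=0):
    # Sample m landmark rows and return factors C and W^+ such that the
    # full n x n kernel matrix K is approximated by C @ W_pinv @ C.T.
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=m, replace=False)
    C = kernel_fn(X, X[idx])        # n x m cross-kernel block
    W = C[idx, :]                   # m x m landmark block
    return C, np.linalg.pinv(W)     # pseudo-inverse for numerical stability

X = np.random.default_rng(1).normal(size=(500, 5))
C, W_pinv = nystrom_factors(rbf, X, m=50)
K_approx = C @ W_pinv @ C.T         # rank-50 surrogate for the 500 x 500 K
```

The low-rank factors are what make evaluating matrix quantities such as the BIF matrix cheap relative to working with the full kernel matrix.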

IJCAI 2017 · Conference Paper

Efficient Kernel Selection via Spectral Analysis

  • Jian Li
  • Yong Liu
  • Hailun Lin
  • Yinliang Yue
  • Weiping Wang

Kernel selection is a fundamental problem in kernel methods. Existing measures for kernel selection either provide weak theoretical guarantees or have high computational complexity. In this paper, we propose a novel kernel selection criterion based on a newly defined spectral measure of a kernel matrix, with a sound theoretical foundation and high computational efficiency. We first show that the spectral measure can be used to derive generalization bounds for some kernel-based algorithms. By minimizing the derived generalization bounds, we obtain the kernel selection criterion with the spectral measure. Moreover, we demonstrate that the popular minimum graph cut and maximum mean discrepancy are two special cases of the proposed criterion. Experimental results on many data sets show that our criterion not only gives results comparable to state-of-the-art criteria but also significantly improves efficiency.
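
The workflow the abstract describes, scoring candidate kernels by a spectral quantity of the kernel matrix, can be sketched as follows. The paper defines its own spectral measure and selection criterion; the top-k trace-share proxy, the RBF candidates, and all names below are our illustrative assumptions.

```python
import numpy as np

def rbf(X, gamma):
    # Gaussian (RBF) kernel matrix on the sample X
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def spectral_score(K, k=10):
    # Proxy spectral measure: share of the trace held by the top-k
    # eigenvalues of the PSD kernel matrix (not the paper's measure).
    eig = np.clip(np.linalg.eigvalsh(K), 0.0, None)   # ascending order
    return eig[-k:].sum() / max(eig.sum(), 1e-12)

X = np.random.default_rng(0).normal(size=(200, 4))
scores = {g: spectral_score(rbf(X, g)) for g in (0.01, 0.1, 1.0)}
best_gamma = max(scores, key=scores.get)   # pick the best-scoring kernel
print(best_gamma, scores[best_gamma])
```

Whether the criterion is minimized or maximized depends on how the measure enters the generalization bound, which is the part the paper works out.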

AAAI 2017 · Conference Paper

Generalization Analysis for Ranking Using Integral Operator

  • Yong Liu
  • Shizhong Liao
  • Hailun Lin
  • Yinliang Yue
  • Weiping Wang

The study of the generalization performance of ranking algorithms is one of the fundamental issues in ranking learning theory. Although several generalization bounds have been proposed based on different measures, the convergence rates of the existing bounds are usually at most O(1/√n), where n is the size of the data set. In this paper, we derive novel generalization bounds for regularized ranking in a reproducing kernel Hilbert space via the integral operator of the kernel function. We prove that the rates of our bounds are much faster than O(1/√n). Specifically, we first introduce a notion of local Rademacher complexity for ranking, called local ranking Rademacher complexity, which measures the complexity of the space of loss functions of the ranking. Then, we use the local ranking Rademacher complexity to obtain a basic generalization bound. Finally, we establish the relationship between the local Rademacher complexity and the eigenvalues of the integral operator, and further derive sharp generalization bounds with faster convergence rates.
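
To make the rate comparison concrete, here is a schematic LaTeX sketch of the two bound shapes involved, with generic constants and notation of our own choosing rather than the paper's exact statements.

```latex
% Global (slow-rate) analysis: with probability at least 1 - \delta,
\[
  R(f) \;\le\; \widehat{R}_n(f) + C_1 \sqrt{\frac{\log(1/\delta)}{n}},
\]
% i.e. an O(1/\sqrt{n}) rate. Localized analyses, driven by the decay of
% the eigenvalues \lambda_1 \ge \lambda_2 \ge \cdots of the kernel's
% integral operator, can instead yield bounds of the shape
\[
  R(f) \;\le\; (1 + \epsilon)\,\widehat{R}_n(f) + C_2\,\frac{\log(1/\delta)}{n},
\]
% improving the rate toward O(1/n) when the local ranking Rademacher
% complexity is small.
```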

AAAI 2017 · Conference Paper

Infinite Kernel Learning: Generalization Bounds and Algorithms

  • Yong Liu
  • Shizhong Liao
  • Hailun Lin
  • Yinliang Yue
  • Weiping Wang

Kernel learning is a fundamental problem in both recent research on and applications of kernel methods. Existing kernel learning methods commonly use some measure of generalization error to learn the optimal kernel in a convex (or conic) combination of prescribed basic kernels. However, the generalization bounds derived from these measures usually have slow convergence rates, and the basic kernels are finite in number and must be specified in advance. In this paper, we propose a new kernel learning method based on a novel measure of generalization error, called principal eigenvalue proportion (PEP), which can learn the optimal kernel with sharp generalization bounds over the convex hull of a possibly infinite set of basic kernels. We first derive sharp generalization bounds based on the PEP measure. Then we design two kernel learning algorithms, for finite and infinite kernels respectively, in which the derived sharp generalization bounds are exploited to guarantee faster convergence rates; moreover, the basic kernels can be learned automatically for infinite kernel learning instead of being prescribed in advance. Theoretical analysis and empirical results demonstrate that the proposed kernel learning method outperforms state-of-the-art kernel learning methods.
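
A minimal sketch of the idea follows, assuming a matrix-level stand-in for PEP (the paper defines PEP via the integral operator) and a one-parameter RBF family as the "infinite" kernel set. The bandwidth search and all names are our illustrative choices, not the paper's algorithm.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def rbf(X, gamma):
    # Gaussian (RBF) kernel matrix on the sample X
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def pep_proxy(K, t=1):
    # Matrix-level stand-in for the principal eigenvalue proportion:
    # fraction of the trace captured by the top-t eigenvalues.
    eig = np.clip(np.linalg.eigvalsh(K), 0.0, None)
    return eig[-t:].sum() / max(eig.sum(), 1e-12)

# Treat the continuous bandwidth range as the infinite kernel family and
# pick the member with the largest PEP proxy (assuming, as one reading of
# the abstract suggests, that larger PEP corresponds to sharper bounds).
X = np.random.default_rng(0).normal(size=(200, 4))
res = minimize_scalar(lambda g: -pep_proxy(rbf(X, g)),
                      bounds=(1e-3, 10.0), method="bounded")
print(f"selected gamma = {res.x:.3f}, PEP proxy = {-res.fun:.3f}")
```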

AAAI 2016 · Conference Paper

Locally Adaptive Translation for Knowledge Graph Embedding

  • Yantao Jia
  • Yuanzhuo Wang
  • Hailun Lin
  • Xiaolong Jin
  • Xueqi Cheng

Knowledge graph embedding aims to represent the entities and relations of a large-scale knowledge graph as elements of a continuous vector space. Existing methods, e.g., TransE and TransH, learn embeddings by defining a global margin-based loss function over the data. However, the optimal loss function is determined experimentally, with its parameters examined among a closed set of candidates. Moreover, embeddings over two knowledge graphs with different entities and relations share the same set of candidate loss functions, ignoring the locality of each graph. This limits the performance of embedding-related applications. In this paper, we propose a locally adaptive translation method for knowledge graph embedding, called TransA, which finds the optimal loss function by adaptively determining its margin over different knowledge graphs. Experiments on two benchmark data sets demonstrate the superiority of the proposed method compared to the state-of-the-art ones.
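
For orientation, here is a minimal Python sketch of the translation-based scoring and margin-ranking loss shared by this family of models. TransA's contribution is the rule for adaptively determining the margin, which is defined in the paper, so `margin` below is left as a plain parameter and every name here is our own.

```python
import numpy as np

def translation_score(h, r, t):
    # Translation-based plausibility: smaller ||h + r - t|| is more plausible.
    return np.linalg.norm(h + r - t, axis=-1)

def margin_ranking_loss(pos, neg, margin):
    # Margin-based ranking loss used by TransE/TransH-style models; TransA
    # (per the abstract) determines `margin` adaptively per knowledge graph.
    return np.maximum(0.0, margin + translation_score(*pos)
                           - translation_score(*neg)).mean()

rng = np.random.default_rng(0)
dim = 50
h, r, t = (rng.normal(size=(32, dim)) for _ in range(3))
t_corrupt = rng.normal(size=(32, dim))   # corrupted tails as negative triples
loss = margin_ranking_loss((h, r, t), (h, r, t_corrupt), margin=1.0)
print(f"loss = {loss:.4f}")
```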