Arrow Research search

Author name cluster

Yingming Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

11 papers
2 author rows

Possible papers

11

IJCAI Conference 2022 Conference Paper

Multi-Proxy Learning from an Entropy Optimization Perspective

  • Yunlong Yu
  • Dingyi Zhang
  • Yingming Li
  • Zhongfei Zhang

Deep metric learning, the task of learning a feature embedding space in which semantically similar samples lie closer together than dissimilar ones, is a cornerstone of many computer vision applications. Most existing proxy-based approaches exploit the global context by learning a single proxy for each training class, which struggles to capture complex, non-uniform data distributions with different patterns. In this work, we present an easy-to-implement framework that effectively captures local neighbor relationships by learning multiple proxies for each class that collectively approximate the intra-class distribution. In the context of large intra-class visual diversity, we revisit entropy learning under the multi-proxy framework and provide a training routine that both minimizes the entropy of the intra-class probability distribution and maximizes the entropy of the inter-class probability distribution. In this way, our model better captures intra-class variations and smooths inter-class differences, and thus helps extract more semantic feature representations for downstream tasks. Extensive experimental results demonstrate that the proposed approach achieves competitive performance. Code and an appendix are provided.
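The entropy objective described in the abstract can be illustrated with a minimal sketch (all function and variable names here are hypothetical, not the authors' code): score a sample against K proxies per class, then form an intra-class distribution over the ground-truth class's proxies, whose entropy the training routine would minimize, and an inter-class distribution over classes, whose entropy it would maximize.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of a probability distribution."""
    return -np.sum(p * np.log(p + eps))

def multi_proxy_entropy_losses(x, proxies, label, temperature=0.1):
    """x: (d,) embedding; proxies: (C, K, d), K proxies per class; label: int.

    Returns (intra_entropy, inter_entropy): a training objective in this
    spirit would minimize the first and maximize the second.
    """
    # cosine similarity of the sample to every proxy
    xn = x / np.linalg.norm(x)
    pn = proxies / np.linalg.norm(proxies, axis=-1, keepdims=True)
    sims = pn @ xn                       # (C, K)

    # intra-class: distribution over the K proxies of the ground-truth class
    logits_intra = sims[label] / temperature
    p_intra = np.exp(logits_intra - logits_intra.max())
    p_intra /= p_intra.sum()

    # inter-class: distribution over classes, each summarized by its best proxy
    logits_inter = sims.max(axis=1) / temperature
    p_inter = np.exp(logits_inter - logits_inter.max())
    p_inter /= p_inter.sum()

    return entropy(p_intra), entropy(p_inter)
```

The entropies are bounded by log K and log C respectively, which is what makes them usable as opposing regularization terms.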

AAAI Conference 2021 Conference Paper

Harmonized Dense Knowledge Distillation Training for Multi-Exit Architectures

  • Xinglu Wang
  • Yingming Li

Multi-exit architectures, in which a sequence of intermediate classifiers is introduced at different depths of the feature layers, perform adaptive computation by early-exiting “easy” samples to speed up inference. In this paper, a novel Harmonized Dense Knowledge Distillation (HDKD) training method for multi-exit architectures is designed to encourage each exit to flexibly learn from all of its later exits. In particular, a general dense knowledge distillation training objective is proposed to incorporate all possible beneficial supervision information for multi-exit learning, where a harmonized weighting scheme is designed for the multi-objective optimization problem consisting of the multi-exit classification loss and the dense distillation loss. A bilevel optimization algorithm is introduced for alternately updating the weights of the multiple objectives and the multi-exit network parameters. Specifically, the loss weighting parameters are optimized with respect to performance on a validation set by gradient descent. Experiments on CIFAR100 and ImageNet show that the HDKD strategy harmoniously improves the performance of state-of-the-art multi-exit neural networks. Moreover, the method requires no within-architecture modifications and can be effectively combined with other previously proposed training techniques to further boost performance.
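The dense distillation objective the abstract describes, every exit distilling from all of its later exits with per-pair weights, might look roughly as follows in NumPy. The function names are illustrative; the paper's harmonized weighting scheme and bilevel optimization of those weights are not reproduced here.

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, targets):
    """Mean negative log-likelihood of the target classes."""
    p = softmax(logits)
    return -np.mean(np.log(p[np.arange(len(targets)), targets] + 1e-12))

def kl_div(p, q, eps=1e-12):
    """Mean KL(p || q) over a batch of distributions."""
    return np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1))

def dense_distillation_loss(exit_logits, targets, weights, T=2.0):
    """exit_logits: list of (B, C) arrays, ordered shallow -> deep.
    weights[i][j]: weight for distilling exit i from a later exit j.
    Classification loss at every exit, plus a temperature-scaled KL term
    from each exit to all of its later exits."""
    loss = sum(cross_entropy(z, targets) for z in exit_logits)
    n = len(exit_logits)
    for i in range(n):
        for j in range(i + 1, n):
            teacher = softmax(exit_logits[j], T)
            student = softmax(exit_logits[i], T)
            loss += weights[i][j] * (T * T) * kl_div(teacher, student)
    return loss
```

In the paper the weights themselves are learned on a validation set; here they would simply be supplied as a matrix.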

NeurIPS Conference 2020 Conference Paper

Deep Metric Learning with Spherical Embedding

  • Dingyi Zhang
  • Yingming Li
  • Zhongfei Zhang

Deep metric learning has attracted much attention in recent years, as it seamlessly combines distance metric learning with deep neural networks. Many endeavors are devoted to designing pair-based angular loss functions, which decouple the magnitude and direction information of embedding vectors and ensure consistency between the training and testing measures. However, these traditional angular losses cannot guarantee that all sample embeddings lie on the surface of the same hypersphere during training, which results in unstable gradients in batch optimization and may hinder the quick convergence of the embedding learning. In this paper, we first investigate the effect of the embedding norm on deep metric learning with angular distance, and then propose a spherical embedding constraint (SEC) to regularize the distribution of the norms. SEC adaptively adjusts the embeddings to fall on the same hypersphere and performs more balanced direction updates. Extensive experiments on deep metric learning, face recognition, and contrastive self-supervised learning show that the SEC-based angular space learning strategy significantly improves the performance of the state of the art.
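The core idea of SEC, keeping all embedding norms close to a common radius, can be sketched as a simple penalty on the spread of batch norms. This is an assumption-laden illustration of the idea, not the paper's exact formulation:

```python
import numpy as np

def spherical_embedding_constraint(embeddings):
    """SEC-style regularizer (sketch): penalize the deviation of each
    embedding's L2 norm from the batch-average norm, pushing all
    embeddings toward a common hypersphere.

    embeddings: (B, d) array of a batch of embedding vectors.
    """
    norms = np.linalg.norm(embeddings, axis=1)
    mu = norms.mean()
    return np.mean((norms - mu) ** 2)
```

Added to an angular loss, a term like this leaves directions free while discouraging the norm drift that the abstract identifies as a source of unstable gradients.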

IJCAI Conference 2020 Conference Paper

SBAT: Video Captioning with Sparse Boundary-Aware Transformer

  • Tao Jin
  • Siyu Huang
  • Ming Chen
  • Yingming Li
  • Zhongfei Zhang

In this paper, we focus on applying the transformer architecture to video captioning effectively. The vanilla transformer was proposed for uni-modal language generation tasks such as machine translation. Video captioning, however, is a multimodal learning problem, and video features carry much redundancy across time steps. Based on these concerns, we propose a novel method called the sparse boundary-aware transformer (SBAT) to reduce redundancy in the video representation. SBAT employs a boundary-aware pooling operation on the multi-head attention scores and selects diverse features from different scenarios. SBAT also includes a local correlation scheme to compensate for the local information loss brought about by the sparse operation. Based on SBAT, we further propose an aligned cross-modal encoding scheme to boost multimodal interaction. Experimental results on two benchmark datasets show that SBAT outperforms state-of-the-art methods on most metrics.
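The boundary-aware sparse selection can be illustrated with a toy sketch that scores each time step by its feature change from the previous frame and keeps the top-k steps. This is a loose illustration of the idea of keeping scene boundaries and dropping redundant frames, not the paper's attention-score-based operator:

```python
import numpy as np

def boundary_aware_select(features, k):
    """Sketch of boundary-aware sparse selection: score each time step by
    the magnitude of the feature change from its predecessor and keep the
    top-k steps, reducing redundancy between neighbouring video frames.

    features: (T, d) frame features; returns sorted indices of the k kept steps.
    """
    diffs = np.linalg.norm(np.diff(features, axis=0), axis=1)  # (T-1,)
    scores = np.concatenate(([np.inf], diffs))  # always keep the first frame
    keep = np.sort(np.argsort(-scores)[:k])
    return keep
```

On a clip with one sharp scene change, this keeps the first frame and the boundary frame while discarding the near-duplicate frames in between.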

AAAI Conference 2018 Short Paper

FR-ANet: A Face Recognition Guided Facial Attribute Classification Network

  • Jiajiong Cao
  • Yingming Li
  • Xi Li
  • Zhongfei Zhang

In this paper, we study the problem of facial attribute learning. In particular, we propose a Face Recognition guided facial Attribute classification Network, called FR-ANet. All attributes share low-level features, while high-level features are learned separately for attribute groups. Further, to exploit identity information, the high-level features are merged to perform face identity recognition. Experimental results on the CelebA and LFWA datasets demonstrate the promise of FR-ANet.

AAAI Conference 2018 Conference Paper

Learning With Incomplete Labels

  • Yingming Li
  • Zenglin Xu
  • Zhongfei Zhang

For many real-world tagging problems, training labels are usually obtained through social tagging and are notoriously incomplete. Consequently, handling data with incomplete labels has become a difficult challenge, which usually leads to degraded performance on label prediction. To improve generalization performance, in this paper we first propose the Improved Cross-View Learning (ICVL) model, which considers both global and local patterns of label relationships to enrich the original label set. Further, by extending the ICVL model with an outlier detection mechanism, we introduce the Improved Cross-View Learning with Outlier Detection (ICVL-OD) model to remove the abnormal tags resulting from label enrichment. Extensive evaluations on three benchmark datasets demonstrate that ICVL and ICVL-OD outperform the competing methods.

AAAI Conference 2018 Conference Paper

Multi-Channel Pyramid Person Matching Network for Person Re-Identification

  • Chaojie Mao
  • Yingming Li
  • Yaqing Zhang
  • Zhongfei Zhang
  • Xi Li

In this work, we present a Multi-Channel deep convolutional Pyramid Person Matching Network (MC-PPMN) based on the combination of semantic components and color-texture distributions to address the problem of person re-identification. In particular, we learn separate deep representations for the semantic components and the color-texture distributions of two person images and then employ a pyramid person matching network (PPMN) to obtain correspondence representations. These correspondence representations are fused to perform the re-identification task. Further, the proposed framework is optimized via a unified end-to-end deep learning scheme. Extensive experiments on several benchmark datasets demonstrate the effectiveness of our approach against the state-of-the-art literature, especially on the rank-1 recognition rate.

AAAI Conference 2017 Conference Paper

Learning with Feature Network and Label Network Simultaneously

  • Yingming Li
  • Ming Yang
  • Zenglin Xu
  • Zhongfei (Mark) Zhang

For many supervised learning problems, limited training samples and incomplete labels are two difficult challenges, which usually lead to degraded performance on label prediction. To improve generalization performance, in this paper we propose Doubly Regularized Multi-Label learning (DRML), which exploits feature-network and label-network regularization simultaneously. In more detail, the proposed algorithm first constructs a feature network and a label network with marginalized linear denoising autoencoders on the data feature set and the label set, respectively, and then learns a robust predictor under both the feature-network and the label-network regularization. While DRML is a general method for multi-label learning, in the evaluations we focus on the specific application of multi-label text tagging. Extensive evaluations on three benchmark datasets demonstrate that DRML outperforms several existing multi-label learning methods.
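The marginalized linear denoising autoencoder mentioned above has a well-known closed form: under random feature dropout, the expected linear reconstruction map can be computed analytically instead of by sampling corrupted copies. A minimal sketch, in which the function name and the small ridge term are illustrative additions:

```python
import numpy as np

def marginalized_denoising_map(X, p):
    """Closed-form marginalized linear denoising autoencoder (sketch).

    X: (d, n) data matrix, columns are samples; p: probability that each
    feature is independently zeroed out. Returns W minimizing the expected
    reconstruction error E||X - W * X_corrupted||^2 over all corruptions.
    """
    d = X.shape[0]
    q = np.full(d, 1.0 - p)          # survival probability per feature
    S = X @ X.T                      # scatter matrix
    # E[x_corr x_corr^T]: off-diagonal scaled by q_i*q_j, diagonal by q_i
    Q = S * np.outer(q, q)
    np.fill_diagonal(Q, q * np.diag(S))
    # E[x x_corr^T]: columns scaled by q_j
    P = S * q[np.newaxis, :]
    # small ridge term for numerical stability (illustrative)
    W = P @ np.linalg.inv(Q + 1e-6 * np.eye(d))
    return W
```

With corruption probability zero the map reduces to the identity, which is a useful sanity check; with p > 0 it re-weights features so that reconstruction is robust to dropout, the property DRML leverages to build its feature and label networks.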

AAAI Conference 2016 Conference Paper

Learning with Marginalized Corrupted Features and Labels Together

  • Yingming Li
  • Ming Yang
  • Zenglin Xu
  • Zhongfei Zhang

Tagging has become increasingly important in many real-world applications, notably web applications such as blogs and resource sharing systems. Despite this importance, tagging methods often face difficult challenges such as limited training samples and incomplete labels, which usually lead to degraded performance on tag prediction. To improve generalization performance, in this paper we propose Regularized Marginalized Cross-View learning (RMCV), which jointly models attribute noise and label noise. In more detail, the proposed model constructs infinite training examples with attribute noise drawn from known exponential-family distributions and exploits label noise via a marginalized denoising autoencoder. The model thus benefits from its robustness and alleviates the problem of tag sparsity. While RMCV is a general method for learning to tag, in the evaluations we focus on the specific application of multi-label text tagging. Extensive evaluations on three benchmark datasets demonstrate that RMCV achieves superior performance in comparison with state-of-the-art methods.

IJCAI Conference 2016 Conference Paper

Multi-View Learning with Limited and Noisy Tagging

  • Yingming Li
  • Ming Yang
  • Zenglin Xu
  • Zhongfei (Mark) Zhang

Multi-view tagging has become increasingly popular in applications where data are represented by multiple views. A robust multi-view tagging method must meet two challenging requirements: limited labeled training samples and noisy labeled training samples. In this paper, we investigate this challenging problem of learning with limited and noisy tagging and propose a discriminative model, called MSMC, that exploits both labeled and unlabeled data through semi-parametric regularization and incorporates multi-label space consistency into the optimization. While MSMC is a general method for learning with multi-view, limited, and noisy tagging, in the evaluations we focus on the specific application of noisy image tagging with limited labeled training samples on a benchmark dataset. Extensive evaluations in comparison with the state-of-the-art literature demonstrate that MSMC achieves superior performance.

ICML Conference 2013 Conference Paper

Multi-Task Learning with Gaussian Matrix Generalized Inverse Gaussian Model

  • Ming Yang 0012
  • Yingming Li
  • Zhongfei Zhang

In this paper, we study the multi-task learning problem from a new perspective, considering the structure of the residual error matrix and a low-rank approximation to the task covariance matrix simultaneously. In particular, we first introduce the Matrix Generalized Inverse Gaussian (MGIG) prior and define a Gaussian Matrix Generalized Inverse Gaussian (GMGIG) model for the low-rank approximation to the task covariance matrix. By combining the GMGIG model with the residual error structure assumption, we propose the GMGIG regression model for multi-task learning. To make the computation tractable, we use variational inference and sampling techniques together. In particular, we propose two sampling strategies for computing the statistics of the MGIG distribution. Experiments show that this model is superior to peer methods in regression and prediction.