Arrow Research search

Author name cluster

Yingming Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

11 papers
2 author rows

Possible papers

11

IJCAI Conference 2022 Conference Paper

Multi-Proxy Learning from an Entropy Optimization Perspective

  • Yunlong Yu
  • Dingyi Zhang
  • Yingming Li
  • Zhongfei Zhang

Deep metric learning, the task of learning a feature embedding space in which semantically similar samples lie closer together than dissimilar ones, is a cornerstone of many computer vision applications. Most existing proxy-based approaches exploit the global context by learning a single proxy for each training class, which struggles to capture complex, non-uniform data distributions with different patterns. In this work, we present an easy-to-implement framework that effectively captures local neighbor relationships by learning multiple proxies for each class that collectively approximate the intra-class distribution. In the context of large intra-class visual diversity, we revisit entropy learning under the multi-proxy framework and provide a training routine that both minimizes the entropy of the intra-class probability distribution and maximizes the entropy of the inter-class probability distribution. In this way, our model better captures intra-class variations and smooths inter-class differences, and thus helps extract more semantic feature representations for downstream tasks. Extensive experimental results demonstrate that the proposed approach achieves competitive performance. Code and an appendix are provided.
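The entropy objective described in the abstract can be illustrated with a minimal sketch (all function and variable names here are hypothetical, not the authors' code): score a sample against K proxies per class, then form an intra-class distribution over the ground-truth class's proxies, whose entropy the training routine would minimize, and an inter-class distribution over classes, whose entropy it would maximize.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of a probability distribution."""
    return -np.sum(p * np.log(p + eps))

def multi_proxy_entropy_losses(x, proxies, label, temperature=0.1):
    """x: (d,) embedding; proxies: (C, K, d), K proxies per class; label: int.

    Returns (intra_entropy, inter_entropy): a training objective in this
    spirit would minimize the first and maximize the second.
    """
    # cosine similarity of the sample to every proxy
    xn = x / np.linalg.norm(x)
    pn = proxies / np.linalg.norm(proxies, axis=-1, keepdims=True)
    sims = pn @ xn                       # (C, K)

    # intra-class: distribution over the K proxies of the ground-truth class
    logits_intra = sims[label] / temperature
    p_intra = np.exp(logits_intra - logits_intra.max())
    p_intra /= p_intra.sum()

    # inter-class: distribution over classes, each summarized by its best proxy
    logits_inter = sims.max(axis=1) / temperature
    p_inter = np.exp(logits_inter - logits_inter.max())
    p_inter /= p_inter.sum()

    return entropy(p_intra), entropy(p_inter)
```

The entropies are bounded by log K and log C respectively, which is what makes them usable as opposing regularization terms.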

AAAI Conference 2021 Conference Paper

Harmonized Dense Knowledge Distillation Training for Multi-Exit Architectures

  • Xinglu Wang
  • Yingming Li

Multi-exit architectures, in which a sequence of intermediate classifiers is introduced at different depths of the feature layers, perform adaptive computation by early-exiting “easy” samples to speed up inference. In this paper, a novel Harmonized Dense Knowledge Distillation (HDKD) training method for multi-exit architectures is designed to encourage each exit to flexibly learn from all of its later exits. In particular, a general dense knowledge distillation training objective is proposed to incorporate all possible beneficial supervision information for multi-exit learning, where a harmonized weighting scheme is designed for the multi-objective optimization problem consisting of the multi-exit classification loss and the dense distillation loss. A bilevel optimization algorithm is introduced for alternately updating the weights of the multiple objectives and the multi-exit network parameters. Specifically, the loss weighting parameters are optimized with respect to performance on a validation set by gradient descent. Experiments on CIFAR100 and ImageNet show that the HDKD strategy harmoniously improves the performance of state-of-the-art multi-exit neural networks. Moreover, the method requires no within-architecture modifications and can be effectively combined with other previously proposed training techniques to further boost performance.
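The dense distillation objective the abstract describes, every exit distilling from all of its later exits with per-pair weights, might look roughly as follows in NumPy. The function names are illustrative; the paper's harmonized weighting scheme and bilevel optimization of those weights are not reproduced here.

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, targets):
    """Mean negative log-likelihood of the target classes."""
    p = softmax(logits)
    return -np.mean(np.log(p[np.arange(len(targets)), targets] + 1e-12))

def kl_div(p, q, eps=1e-12):
    """Mean KL(p || q) over a batch of distributions."""
    return np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1))

def dense_distillation_loss(exit_logits, targets, weights, T=2.0):
    """exit_logits: list of (B, C) arrays, ordered shallow -> deep.
    weights[i][j]: weight for distilling exit i from a later exit j.
    Classification loss at every exit, plus a temperature-scaled KL term
    from each exit to all of its later exits."""
    loss = sum(cross_entropy(z, targets) for z in exit_logits)
    n = len(exit_logits)
    for i in range(n):
        for j in range(i + 1, n):
            teacher = softmax(exit_logits[j], T)
            student = softmax(exit_logits[i], T)
            loss += weights[i][j] * (T * T) * kl_div(teacher, student)
    return loss
```

In the paper the weights themselves are learned on a validation set; here they would simply be supplied as a matrix.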

NeurIPS Conference 2020 Conference Paper

Deep Metric Learning with Spherical Embedding

  • Dingyi Zhang
  • Yingming Li
  • Zhongfei Zhang

Deep metric learning has attracted much attention in recent years, as it seamlessly combines distance metric learning with deep neural networks. Many endeavors are devoted to designing pair-based angular loss functions, which decouple the magnitude and direction information of embedding vectors and ensure consistency between the training and testing measures. However, these traditional angular losses cannot guarantee that all sample embeddings lie on the surface of the same hypersphere during training, which results in unstable gradients in batch optimization and may hinder the quick convergence of the embedding learning. In this paper, we first investigate the effect of the embedding norm on deep metric learning with angular distance, and then propose a spherical embedding constraint (SEC) to regularize the distribution of the norms. SEC adaptively adjusts the embeddings to fall on the same hypersphere and performs more balanced direction updates. Extensive experiments on deep metric learning, face recognition, and contrastive self-supervised learning show that the SEC-based angular space learning strategy significantly improves the performance of the state of the art.
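The core idea of SEC, keeping all embedding norms close to a common radius, can be sketched as a simple penalty on the spread of batch norms. This is an assumption-laden illustration of the idea, not the paper's exact formulation:

```python
import numpy as np

def spherical_embedding_constraint(embeddings):
    """SEC-style regularizer (sketch): penalize the deviation of each
    embedding's L2 norm from the batch-average norm, pushing all
    embeddings toward a common hypersphere.

    embeddings: (B, d) array of a batch of embedding vectors.
    """
    norms = np.linalg.norm(embeddings, axis=1)
    mu = norms.mean()
    return np.mean((norms - mu) ** 2)
```

Added to an angular loss, a term like this leaves directions free while discouraging the norm drift that the abstract identifies as a source of unstable gradients.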

IJCAI Conference 2020 Conference Paper

SBAT: Video Captioning with Sparse Boundary-Aware Transformer

  • Tao Jin
  • Siyu Huang
  • Ming Chen
  • Yingming Li
  • Zhongfei Zhang

In this paper, we focus on applying the transformer architecture to video captioning effectively. The vanilla transformer was proposed for uni-modal language generation tasks such as machine translation. Video captioning, however, is a multimodal learning problem, and video features carry much redundancy across time steps. Based on these concerns, we propose a novel method called the sparse boundary-aware transformer (SBAT) to reduce redundancy in the video representation. SBAT employs a boundary-aware pooling operation on the multi-head attention scores and selects diverse features from different scenarios. SBAT also includes a local correlation scheme to compensate for the local information loss brought about by the sparse operation. Based on SBAT, we further propose an aligned cross-modal encoding scheme to boost multimodal interaction. Experimental results on two benchmark datasets show that SBAT outperforms state-of-the-art methods on most metrics.
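The boundary-aware sparse selection can be illustrated with a toy sketch that scores each time step by its feature change from the previous frame and keeps the top-k steps. This is a loose illustration of the idea of keeping scene boundaries and dropping redundant frames, not the paper's attention-score-based operator:

```python
import numpy as np

def boundary_aware_select(features, k):
    """Sketch of boundary-aware sparse selection: score each time step by
    the magnitude of the feature change from its predecessor and keep the
    top-k steps, reducing redundancy between neighbouring video frames.

    features: (T, d) frame features; returns sorted indices of the k kept steps.
    """
    diffs = np.linalg.norm(np.diff(features, axis=0), axis=1)  # (T-1,)
    scores = np.concatenate(([np.inf], diffs))  # always keep the first frame
    keep = np.sort(np.argsort(-scores)[:k])
    return keep
```

On a clip with one sharp scene change, this keeps the first frame and the boundary frame while discarding the near-duplicate frames in between.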

AAAI Conference 2018 Short Paper

FR-ANet: A Face Recognition Guided Facial Attribute Classification Network

  • Jiajiong Cao
  • Yingming Li
  • Xi Li
  • Zhongfei Zhang

In this paper, we study the problem of facial attribute learning. In particular, we propose a Face Recognition guided facial Attribute classification Network, called FR-ANet. All attributes share low-level features, while high-level features are learned separately for attribute groups. Further, to exploit identity information, the high-level features are merged to perform face identity recognition. Experimental results on the CelebA and LFWA datasets demonstrate the promise of FR-ANet.

AAAI Conference 2018 Conference Paper

Learning With Incomplete Labels

  • Yingming Li
  • Zenglin Xu
  • Zhongfei Zhang

For many real-world tagging problems, training labels are usually obtained through social tagging and are notoriously incomplete. Consequently, handling data with incomplete labels has become a difficult challenge, which usually leads to degraded performance on label prediction. To improve generalization performance, in this paper we first propose the Improved Cross-View Learning (ICVL) model, which considers both global and local patterns of label relationships to enrich the original label set. Further, by extending the ICVL model with an outlier detection mechanism, we introduce the Improved Cross-View Learning with Outlier Detection (ICVL-OD) model to remove the abnormal tags resulting from label enrichment. Extensive evaluations on three benchmark datasets demonstrate that ICVL and ICVL-OD outperform the competing methods.

AAAI Conference 2018 Conference Paper

Multi-Channel Pyramid Person Matching Network for Person Re-Identification

  • Chaojie Mao
  • Yingming Li
  • Yaqing Zhang
  • Zhongfei Zhang
  • Xi Li

In this work, we present a Multi-Channel deep convolutional Pyramid Person Matching Network (MC-PPMN) based on the combination of semantic components and color-texture distributions to address the problem of person re-identification. In particular, we learn separate deep representations for the semantic components and the color-texture distributions of two person images and then employ a pyramid person matching network (PPMN) to obtain correspondence representations. These correspondence representations are fused to perform the re-identification task. Further, the proposed framework is optimized via a unified end-to-end deep learning scheme. Extensive experiments on several benchmark datasets demonstrate the effectiveness of our approach against the state-of-the-art literature, especially on the rank-1 recognition rate.

AAAI Conference 2017 Conference Paper

Learning with Feature Network and Label Network Simultaneously

  • Yingming Li
  • Ming Yang
  • Zenglin Xu
  • Zhongfei (Mark) Zhang

For many supervised learning problems, limited training samples and incomplete labels are two difficult challenges, which usually lead to degraded performance on label prediction. To improve generalization performance, in this paper we propose Doubly Regularized Multi-Label learning (DRML), which exploits feature-network and label-network regularization simultaneously. In more detail, the proposed algorithm first constructs a feature network and a label network with marginalized linear denoising autoencoders on the data feature set and the label set, respectively, and then learns a robust predictor under both the feature-network and the label-network regularization. While DRML is a general method for multi-label learning, in the evaluations we focus on the specific application of multi-label text tagging. Extensive evaluations on three benchmark datasets demonstrate that DRML outperforms several existing multi-label learning methods.
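The marginalized linear denoising autoencoder mentioned above has a well-known closed form: under random feature dropout, the expected linear reconstruction map can be computed analytically instead of by sampling corrupted copies. A minimal sketch, in which the function name and the small ridge term are illustrative additions:

```python
import numpy as np

def marginalized_denoising_map(X, p):
    """Closed-form marginalized linear denoising autoencoder (sketch).

    X: (d, n) data matrix, columns are samples; p: probability that each
    feature is independently zeroed out. Returns W minimizing the expected
    reconstruction error E||X - W * X_corrupted||^2 over all corruptions.
    """
    d = X.shape[0]
    q = np.full(d, 1.0 - p)          # survival probability per feature
    S = X @ X.T                      # scatter matrix
    # E[x_corr x_corr^T]: off-diagonal scaled by q_i*q_j, diagonal by q_i
    Q = S * np.outer(q, q)
    np.fill_diagonal(Q, q * np.diag(S))
    # E[x x_corr^T]: columns scaled by q_j
    P = S * q[np.newaxis, :]
    # small ridge term for numerical stability (illustrative)
    W = P @ np.linalg.inv(Q + 1e-6 * np.eye(d))
    return W
```

With corruption probability zero the map reduces to the identity, which is a useful sanity check; with p > 0 it re-weights features so that reconstruction is robust to dropout, the property DRML leverages to build its feature and label networks.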

AAAI Conference 2016 Conference Paper

Learning with Marginalized Corrupted Features and Labels Together

  • Yingming Li
  • Ming Yang
  • Zenglin Xu
  • Zhongfei Zhang

Tagging has become increasingly important in many real-world applications, notably web applications such as blogs and resource sharing systems. Despite this importance, tagging methods often face difficult challenges such as limited training samples and incomplete labels, which usually lead to degraded performance on tag prediction. To improve generalization performance, in this paper we propose Regularized Marginalized Cross-View learning (RMCV), which jointly models attribute noise and label noise. In more detail, the proposed model constructs infinite training examples with attribute noise drawn from known exponential-family distributions and exploits label noise via a marginalized denoising autoencoder. The model thus benefits from its robustness and alleviates the problem of tag sparsity. While RMCV is a general method for learning to tag, in the evaluations we focus on the specific application of multi-label text tagging. Extensive evaluations on three benchmark datasets demonstrate that RMCV achieves superior performance in comparison with state-of-the-art methods.

IJCAI Conference 2016 Conference Paper

Multi-View Learning with Limited and Noisy Tagging

  • Yingming Li
  • Ming Yang
  • Zenglin Xu
  • Zhongfei (Mark) Zhang

Multi-view tagging has become increasingly popular in applications where data are represented by multiple views. A robust multi-view tagging method must meet two challenging requirements: limited labeled training samples and noisy labeled training samples. In this paper, we investigate this challenging problem of learning with limited and noisy tagging and propose a discriminative model, called MSMC, that exploits both labeled and unlabeled data through semi-parametric regularization and incorporates multi-label space consistency into the optimization. While MSMC is a general method for learning with multi-view, limited, and noisy tagging, in the evaluations we focus on the specific application of noisy image tagging with limited labeled training samples on a benchmark dataset. Extensive evaluations in comparison with the state-of-the-art literature demonstrate that MSMC achieves superior performance.

ICML Conference 2013 Conference Paper

Multi-Task Learning with Gaussian Matrix Generalized Inverse Gaussian Model

  • Ming Yang 0012
  • Yingming Li
  • Zhongfei Zhang

In this paper, we study the multi-task learning problem from a new perspective, considering the structure of the residual error matrix and a low-rank approximation to the task covariance matrix simultaneously. In particular, we first introduce the Matrix Generalized Inverse Gaussian (MGIG) prior and define a Gaussian Matrix Generalized Inverse Gaussian (GMGIG) model for the low-rank approximation to the task covariance matrix. By combining the GMGIG model with the residual error structure assumption, we propose the GMGIG regression model for multi-task learning. To make the computation tractable, we use variational inference and sampling techniques together. In particular, we propose two sampling strategies for computing the statistics of the MGIG distribution. Experiments show that this model is superior to peer methods in regression and prediction.