Arrow Research search

Author name cluster

Yuming Shen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers
2 author rows

Possible papers

AAAI Conference 2023 Conference Paper

Deconstructed Generation-Based Zero-Shot Model

  • Dubing Chen
  • Yuming Shen
  • Haofeng Zhang
  • Philip H.S. Torr

Recent research on Generalized Zero-Shot Learning (GZSL) has focused primarily on generation-based methods. However, current literature has overlooked the fundamental principles of these methods and has made limited progress in a complex manner. In this paper, we aim to deconstruct the generator-classifier framework and provide guidance for its improvement and extension. We begin by breaking down the generator-learned unseen class distribution into class-level and instance-level distributions. Through our analysis of the role of these two types of distributions in solving the GZSL problem, we generalize the focus of the generation-based approach, emphasizing the importance of (i) attribute generalization in generator learning and (ii) independent classifier learning with partially biased data. We present a simple method based on this analysis that outperforms SotAs on four public GZSL datasets, demonstrating the validity of our deconstruction. Furthermore, our proposed method remains effective even without a generative model, representing a step towards simplifying the generator-classifier structure. Our code is available at https://github.com/cdb342/DGZ.

AAAI Conference 2022 Conference Paper

Boosting Generative Zero-Shot Learning by Synthesizing Diverse Features with Attribute Augmentation

  • Xiaojie Zhao
  • Yuming Shen
  • Shidong Wang
  • Haofeng Zhang

The recent advance in deep generative models outlines a promising perspective in the realm of Zero-Shot Learning (ZSL). Most generative ZSL methods use category semantic attributes plus a Gaussian noise to generate visual features. After generating unseen samples, this family of approaches effectively transforms the ZSL problem into a supervised classification scheme. However, the existing models use a single semantic attribute, which contains the complete attribute information of the category. The generated data also carry the complete attribute information, but in reality, visual samples usually have limited attributes. Therefore, the generated data from attribute could have incomplete semantics. Based on this fact, we propose a novel framework to boost ZSL by synthesizing diverse features. This method uses augmented semantic attributes to train the generative model, so as to simulate the real distribution of visual features. We evaluate the proposed model on four benchmark datasets, observing significant performance improvement against the state-of-the-art.

IJCAI Conference 2022 Conference Paper

Learning to Hash Naturally Sorts

  • Jiaguo Yu
  • Yuming Shen
  • Menghan Wang
  • Haofeng Zhang
  • Philip H. S. Torr

Learning to hash pictures a list-wise sorting problem. Its testing metrics, e.g., mean-average precision, count on a sorted candidate list ordered by pair-wise code similarity. However, scarcely does one train a deep hashing model with the sorted results end-to-end because of the non-differentiable nature of the sorting operation. This inconsistency in the objectives of training and test may lead to sub-optimal performance since the training loss often fails to reflect the actual retrieval metric. In this paper, we tackle this problem by introducing Naturally-Sorted Hashing (NSH). We sort the Hamming distances of samples' hash codes and accordingly gather their latent representations for self-supervised training. Thanks to the recent advances in differentiable sorting approximations, the hash head receives gradients from the sorter so that the hash encoder can be optimized along with the training procedure. Additionally, we describe a novel Sorted Noise-Contrastive Estimation (SortedNCE) loss that selectively picks positive and negative samples for contrastive learning, which allows NSH to mine data semantic relations during training in an unsupervised manner. Our extensive experiments show the proposed NSH model significantly outperforms the existing unsupervised hashing methods on three benchmarked datasets.
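The retrieval step this abstract refers to, ranking candidates by the Hamming distance between binary codes, can be sketched as below. This is a minimal illustration of the hard (non-differentiable) sort only; it is not the NSH model, and the differentiable sorting approximation and the SortedNCE loss are not shown. The function names `hamming_distance` and `rank_by_hamming` are hypothetical and chosen for this example.

```python
def hamming_distance(a, b):
    """Count the positions where two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query, database):
    """Return database indices ordered by Hamming distance to the query.

    Python's sort is stable, so ties keep their original database order.
    """
    distances = [hamming_distance(query, code) for code in database]
    return sorted(range(len(database)), key=lambda i: distances[i])


# Toy 4-bit codes (illustrative values, not from the paper).
query = [1, 0, 1, 1]
database = [
    [0, 1, 0, 0],  # distance 4
    [1, 0, 1, 0],  # distance 1
    [1, 0, 1, 1],  # distance 0
    [1, 1, 0, 1],  # distance 2
]
ranking = rank_by_hamming(query, database)
```

Because the ranking comes from a discrete sort over integer distances, no gradient flows through it, which is the obstacle the abstract describes.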

IJCAI Conference 2022 Conference Paper

Zero-Shot Logit Adjustment

  • Dubing Chen
  • Yuming Shen
  • Haofeng Zhang
  • Philip H. S. Torr

Semantic-descriptor-based Generalized Zero-Shot Learning (GZSL) poses challenges in recognizing novel classes in the test phase. The development of generative models enables current GZSL techniques to probe further into the semantic-visual link, culminating in a two-stage form that includes a generator and a classifier. However, existing generation-based methods focus on enhancing the generator's effect while neglecting the improvement of the classifier. In this paper, we first analyze two properties of the generated pseudo unseen samples: bias and homogeneity. Then, we perform variational Bayesian inference to back-derive the evaluation metrics, which reflects the balance of the seen and unseen classes. As a consequence of our derivation, the aforementioned two properties are incorporated into the classifier training as seen-unseen priors via logit adjustment. The Zero-Shot Logit Adjustment further puts semantic-based classifiers into effect in generation-based GZSL. Our experiments demonstrate that the proposed technique achieves state-of-the-art when combined with the basic generator, and it can improve various generative Zero-Shot Learning frameworks. Our codes are available on https://github.com/cdb342/IJCAI-2022-ZLA.
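As a rough illustration of what "logit adjustment" means in general, the sketch below adds the log of a per-class prior to each logit before the softmax. This is the generic technique only: the abstract does not state the exact form of the seen-unseen priors used by Zero-Shot Logit Adjustment, so the prior values and the function names `adjust_logits` and `softmax` here are assumptions made for the example.

```python
import math


def adjust_logits(logits, priors):
    """Add log-priors to raw logits (generic logit adjustment)."""
    if len(logits) != len(priors):
        raise ValueError("logits and priors must have the same length")
    return [z + math.log(p) for z, p in zip(logits, priors)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Two classes with equal raw logits; the assumed prior favours the first.
raw = [2.0, 2.0]
priors = [0.8, 0.2]  # hypothetical class priors, not taken from the paper
probs = softmax(adjust_logits(raw, priors))
```

With equal raw logits, the adjusted probabilities reduce to the priors themselves, which shows how the prior shifts the decision boundary between classes.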

NeurIPS Conference 2021 Conference Paper

You Never Cluster Alone

  • Yuming Shen
  • Ziyi Shen
  • Menghan Wang
  • Jie Qin
  • Philip Torr
  • Ling Shao

Recent advances in self-supervised learning with instance-level contrastive objectives facilitate unsupervised clustering. However, a standalone datum does not perceive the context of the holistic cluster and may undergo sub-optimal assignment. In this paper, we extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data subjected to the same cluster contribute to a unified representation that encodes the context of each data group. Contrastive learning with this representation then rewards the assignment of each datum. To implement this vision, we propose twin-contrast clustering (TCC). We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one. On one hand, with the corresponding assignment variables being the weight, a weighted aggregation along the data points implements the set representation of a cluster. We further propose heuristic cluster augmentation equivalents to enable cluster-level contrastive learning. On the other hand, we derive the evidence lower-bound of the instance-level contrastive objective with the assignments. By reparametrizing the assignment variables, TCC is trained end-to-end, requiring no alternating steps. Extensive experiments show that TCC outperforms the state-of-the-art on benchmarked datasets.
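The "weighted aggregation along the data points" mentioned in this abstract can be pictured with the sketch below, which builds one vector per cluster as an assignment-weighted average of instance features. It is an illustration under assumptions: dividing by the total weight is a choice made here, the abstract does not specify the normalisation, and `cluster_representations` is a hypothetical name rather than the TCC implementation.

```python
def cluster_representations(features, assignments):
    """Aggregate instance features into one vector per cluster.

    features:    N rows, each a list of D floats.
    assignments: N rows, each a list of K soft assignment weights.
    Returns K rows of D floats (weighted means; zeros for empty clusters).
    """
    if len(features) != len(assignments):
        raise ValueError("features and assignments must have the same length")
    n_clusters = len(assignments[0])
    dim = len(features[0])
    reps = []
    for k in range(n_clusters):
        weight_sum = sum(row[k] for row in assignments)
        rep = [0.0] * dim
        for feat, row in zip(features, assignments):
            for d in range(dim):
                rep[d] += row[k] * feat[d]
        if weight_sum > 0:
            # Normalising by total weight is an assumption of this sketch.
            rep = [v / weight_sum for v in rep]
        reps.append(rep)
    return reps


# Three toy 2-D points softly assigned to two clusters (illustrative values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
assignments = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
reps = cluster_representations(features, assignments)
```

Each cluster vector depends on every point assigned to it, which is how a single datum's assignment can be evaluated in the context of its whole group.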

IJCAI Conference 2020 Conference Paper

Set and Rebase: Determining the Semantic Graph Connectivity for Unsupervised Cross-Modal Hashing

  • Weiwei Wang
  • Yuming Shen
  • Haofeng Zhang
  • Yazhou Yao
  • Li Liu

The label-free nature of unsupervised cross-modal hashing hinders models from exploiting the exact semantic data similarity. Existing research typically simulates the semantics by a heuristic geometric prior in the original feature space. However, this introduces heavy bias into the model as the original features do not fully represent the underlying multi-view data relations. To address the problem above, in this paper, we propose a novel unsupervised hashing method called Semantic-Rebased Cross-modal Hashing (SRCH). A novel ‘Set-and-Rebase’ process is defined to initialize and update the cross-modal similarity graph of training data. In particular, we set the graph according to the intra-modal feature geometric basis and then alternately rebase it to update the edges within according to the hashing results. We develop an alternating optimization routine to rebase the graph and train the hashing auto-encoders with closed-form solutions so that the overall framework is efficiently trained. Our experimental results on benchmarked datasets demonstrate the superiority of our model against state-of-the-art algorithms.

AAAI Conference 2019 Conference Paper

Validation of Growing Knowledge Graphs by Abductive Text Evidences

  • Jianfeng Du
  • Jeff Z. Pan
  • Sylvia Wang
  • Kunxun Qi
  • Yuming Shen
  • Yu Deng

This paper proposes a validation mechanism for newly added triples in a growing knowledge graph. Given a logical theory, a knowledge graph, a text corpus, and a new triple to be validated, this mechanism computes a sorted list of explanations for the new triple to facilitate the validation of it, where an explanation, called an abductive text evidence, is a set of pairs of the form (triple, window) where appending the set of triples on the left to the knowledge graph enforces entailment of the new triple under the logical theory, while every sentence window on the right which is contained in the text corpus explains to some degree why the triple on the left is true. From the angle of practice, a special class of abductive text evidences called TEP-based abductive text evidence is proposed, which is constructed from explanation patterns seen before in the knowledge graph. Accordingly, a method for computing the complete set of TEP-based abductive text evidences is proposed. Moreover, a method for sorting abductive text evidences based on distantly supervised learning is proposed. To evaluate the proposed validation mechanism, four knowledge graphs with logical theories are constructed from the four great classical masterpieces of Chinese literature. Experimental results on these datasets demonstrate the efficiency and effectiveness of the proposed mechanism.

AAAI Conference 2018 Conference Paper

Towards Affordable Semantic Searching: Zero-Shot Retrieval via Dominant Attributes

  • Yang Long
  • Li Liu
  • Yuming Shen
  • Ling Shao

Instance-level retrieval has become an essential paradigm to index and retrieve images from large-scale databases. Conventional instance search requires at least an example of the query image to retrieve images that contain the same object instance. Existing semantic retrieval can only search semantically-related images, such as those sharing the same category or a set of tags, not the exact instances. Meanwhile, the unrealistic assumption is that all categories or tags are known beforehand. Training models for these semantic concepts highly rely on instance-level attributes or human captions which are expensive to acquire. Given the above challenges, this paper studies the Zero-shot Retrieval problem that aims for instance-level image search using only a few dominant attributes. The contributions are: 1) we utilise automatic word embedding to infer class-level attributes to circumvent expensive human labelling; 2) the inferred class-attributes can be extended into discriminative instance attributes through our proposed Latent Instance Attributes Discovery (LIAD) algorithm; 3) our method is not restricted to complete attribute signatures, query of dominant attributes can also be dealt with. On two benchmarks, CUB and SUN, extensive experiments demonstrate that our method can achieve promising performance for the problem. Moreover, our approach can also benefit conventional ZSL tasks.

ICRA Conference 2017 Conference Paper

Semi-supervised vision-language mapping via variational learning

  • Yuming Shen
  • Li Zhang 0013
  • Ling Shao 0001

Understanding the semantic relations between vision and language data has become a research trend in artificial intelligence and robotic systems. The lack of training data is an essential issue for vision-language understanding. We address the problem of image and sentence cross-modal retrieval when paired training samples are not sufficient. Inspired by recent works in variational inference, in this paper, the autoencoding variational Bayes framework is novelly extended to a semi-supervised model for image-sentence mapping task. Our method does not require all training images and sentences to be paired. The proposed model is an end-to-end system, and consists of a two-level variational embedding structure where unpaired data are involved in the first level embedding to give support to intra-modality statistics so that the lower bound of the joint marginal likelihood of paired data embeddings can be better approximated. The proposed retrieval model is evaluated on two popular datasets, i.e., Flickr30K and Flickr8K, producing superior performances compared with related state-of-the-art methods.