Arrow Research search

Author name cluster

Guangyi Lv

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
1 author row

Possible papers

IJCAI Conference 2021 Conference Paper

DeepME: Deep Mixture Experts for Large-scale Image Classification

  • Ming He
  • Guangyi Lv
  • Weidong He
  • Jianping Fan
  • Guihua Zeng

Although deep learning has demonstrated outstanding performance on image classification, most well-known deep networks optimize both their structures and their node weights for recognizing relatively few (e.g., no more than 1000) object classes. It is therefore attractive to extend or mix such well-known deep networks to support large-scale image classification. To the best of our knowledge, how to adaptively and effectively fuse multiple CNNs for large-scale image classification is still under-explored. On this basis, a deep mixture algorithm is developed in this paper to support large-scale image classification. First, a soft spectral clustering method is developed to construct a two-layer ontology (group layer and category layer) by assigning large numbers of image categories to a set of groups according to their inter-category semantic correlations, where the semantically related image categories under neighbouring group nodes may share similar learning complexities. Then, this two-layer ontology is used to generate the task groups, in which each task group contains a subset of image categories with similar learning complexities and one particular base deep network is learned for it. Finally, a gate network is learned to combine all base deep networks, each with fewer and diverse outputs, into a mixture network with a larger number of outputs. Our experimental results on ImageNet10K demonstrate that the proposed deep mixture algorithm achieves very competitive results (top-1 accuracy: 32.13%) on large-scale image classification tasks.
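The final fusion step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each base network emits probabilities over its own subset of global class ids and that the gate emits one weight per base network; the function name `mix_outputs` and the weighted-sum rule are hypothetical and are not taken from the paper.

```python
def mix_outputs(gate_weights, base_outputs, num_classes):
    """Fuse per-group predictions into one score vector over all classes.

    gate_weights: one weight per base network (assumed to sum to 1).
    base_outputs: one dict per base network, mapping a global class id
                  to the probability that network assigns to it.
    Assumption: fusion is a gate-weighted sum; the paper may differ.
    """
    mixed = [0.0] * num_classes
    for weight, output in zip(gate_weights, base_outputs):
        for class_id, prob in output.items():
            # A class shared by several groups (soft clustering)
            # accumulates evidence from each of them.
            mixed[class_id] += weight * prob
    return mixed


# Two base networks over three global classes; class 1 is shared.
scores = mix_outputs(
    gate_weights=[0.75, 0.25],
    base_outputs=[{0: 0.5, 1: 0.5}, {1: 0.5, 2: 0.5}],
    num_classes=3,
)
print(scores)
```

Because class 1 belongs to both groups in this toy setup, its fused score receives a contribution from each base network.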

AAAI Conference 2021 Conference Paper

Making the Relation Matters: Relation of Relation Learning Network for Sentence Semantic Matching

  • Kun Zhang
  • Le Wu
  • Guangyi Lv
  • Meng Wang
  • Enhong Chen
  • Shulan Ruan

Sentence semantic matching is one of the fundamental tasks in natural language processing, which requires an agent to determine the semantic relation among input sentences. Recently, deep neural networks have achieved impressive performance in this area, especially BERT. Despite their effectiveness, most of these models treat output labels as meaningless one-hot vectors, underestimating the semantic information and guidance of relations that these labels reveal, especially for tasks with a small number of labels. To address this problem, we propose a Relation of Relation Learning Network (R2-Net) for sentence semantic matching. Specifically, we first employ BERT to encode the input sentences from a global perspective. Then a CNN-based encoder is designed to capture keywords and phrase information from a local perspective. To fully leverage labels for better relation information extraction, we introduce a self-supervised relation of relation classification task for guiding R2-Net to consider more about relations. Meanwhile, a triplet loss is employed to distinguish the intra-class and inter-class relations in a finer granularity. Empirical experiments on two sentence semantic matching tasks demonstrate the superiority of our proposed model.
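For readers unfamiliar with the triplet loss mentioned above, a minimal sketch of its standard margin-based form follows. The abstract does not state the distance function or the margin, so the Euclidean distance and `margin=1.0` used here are assumptions, and the function names are illustrative only.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss.

    The loss is zero once the anchor is closer to the positive
    (same-class) example than to the negative (other-class) example
    by at least `margin`. Distance and margin are assumed choices.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


# Well-separated triplet: positive at distance 1, negative at distance 5.
print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))
# Violating triplet: the roles of the two examples are swapped.
print(triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0]))
```

The first call incurs no penalty, while the second is penalised because the same-class example lies farther from the anchor than the other-class example.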

AAAI Conference 2019 Conference Paper

DRr-Net: Dynamic Re-Read Network for Sentence Semantic Matching

  • Kun Zhang
  • Guangyi Lv
  • Linyuan Wang
  • Le Wu
  • Enhong Chen
  • Fangzhao Wu
  • Xing Xie

Sentence semantic matching requires an agent to determine the semantic relation between two sentences, and is widely used in natural language tasks such as Natural Language Inference (NLI) and Paraphrase Identification (PI). Among all matching methods, the attention mechanism plays an important role in capturing the semantic relations and properly aligning the elements of two sentences. Previous methods used the attention mechanism to select the important parts of sentences all at once. However, the important parts of a sentence during semantic matching change dynamically with the degree of sentence understanding, so selecting them all at once may be insufficient for semantic understanding. To this end, we propose a Dynamic Re-read Network (DRr-Net) approach for sentence semantic matching, which is able to pay close attention to a small region of the sentences at each step and re-read the important words for better sentence semantic understanding. To be specific, we first employ an Attention Stack-GRU (ASG) unit to model the original sentence repeatedly and preserve all the information from the bottom-most word embedding input to the up-most recurrent output. Second, we utilize a Dynamic Re-read (DRr) unit to pay close attention to one important word at a time, taking the learned information into consideration, and to re-read the important words for better sentence semantic understanding. Extensive experiments on three sentence matching benchmark datasets demonstrate that DRr-Net is able to model sentence semantics more precisely and significantly improve the performance of sentence semantic matching. In addition, it is very interesting that some of the findings in our experiments are consistent with the findings of psychological research.

AAAI Conference 2017 Conference Paper

A Context-Enriched Neural Network Method for Recognizing Lexical Entailment

  • Kun Zhang
  • Enhong Chen
  • Qi Liu
  • Chuanren Liu
  • Guangyi Lv

Recognizing lexical entailment (RLE), i.e., identifying whether one word entails another (for example, fox entails animal), plays an important role in natural language inference. In the literature, automatically recognizing lexical entailment for word pairs relies heavily on words’ contextual representations. However, as a “prototype” vector, a single representation cannot reveal the multifaceted aspects of words due to their homonymy and polysemy. In this paper, we propose a supervised Context-Enriched Neural Network (CENN) method for recognizing lexical entailment. To be specific, we first utilize multiple embedding vectors from different contexts to represent the input word pairs. Then, through different combination methods and an attention mechanism, we integrate the different embedding vectors and optimize their weights to predict whether there are entailment relations in word pairs. Moreover, our proposed framework is flexible and open to handling different word contexts and entailment perspectives in the text corpus. Extensive experiments on five datasets show that our approach significantly improves the performance of automatic RLE in comparison with several state-of-the-art methods.

AAAI Conference 2016 Conference Paper

Reading the Videos: Temporal Labeling for Crowdsourced Time-Sync Videos Based on Semantic Embedding

  • Guangyi Lv
  • Tong Xu
  • Enhong Chen
  • Qi Liu
  • Yi Zheng

Recent years have witnessed a boom in online shared media content, which raises significant challenges for effective management and retrieval. Though considerable effort has been made, precise retrieval of video shots with certain topics has been largely ignored. At the same time, due to the popularity of novel time-sync comments, or so-called “bullet-screen comments”, video semantics can now be combined with timestamps to support further research on temporal video labeling. In this paper, we propose a novel video understanding framework to assign temporal labels to highlighted video shots. To be specific, due to the informal expression of bullet-screen comments, we first propose a temporal deep structured semantic model (T-DSSM) to represent comments as semantic vectors by taking advantage of their temporal correlation. Then, video highlights are recognized and labeled via the semantic vectors in a supervised way. Extensive experiments on a real-world dataset show that our framework can effectively label video highlights, outperforming the baselines by a significant margin, which clearly validates the potential of our framework for video understanding as well as for interpreting bullet-screen comments.