Author name cluster

Haoqing Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers

1 author row

NeurIPS Conference 2023 Conference Paper

Focus Your Attention when Few-Shot Classification

Haoqing Wang
Shibo Jie
Zhihong Deng

Since many pre-trained vision transformers emerge and provide strong representation for various downstream tasks, we aim to adapt them to few-shot image classification tasks in this work. The input images typically contain multiple entities. The model may not focus on the class-related entities for the current few-shot task, even with fine-tuning on support samples, and the noise information from the class-independent ones harms performance. To this end, we first propose a method that uses the attention and gradient information to automatically locate the positions of key entities, denoted as position prompts, in the support images. Then we employ the cross-entropy loss between their many-hot presentation and the attention logits to optimize the model to focus its attention on the key entities during fine-tuning. This ability then can generalize to the query samples. Our method is applicable to different vision transformers (e.g., columnar or pyramidal ones), and also to different pre-training ways (e.g., single-modal or vision-language pre-training). Extensive experiments show that our method can improve the performance of full or parameter-efficient fine-tuning methods on few-shot tasks. Code is available at https://github.com/Haoqing-Wang/FORT.
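
The attention loss described in this abstract can be illustrated with a minimal sketch. It assumes the attention logits are a flat list with one score per image patch, and that the many-hot target is normalized to spread its mass equally over the prompted positions. The function names and the normalization are illustrative assumptions, not taken from the FORT code.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def many_hot_attention_loss(attn_logits, prompt_positions):
    """Cross-entropy between a normalized many-hot target and softmax(attn_logits).

    prompt_positions: indices of patches marked as key entities (hypothetical input).
    """
    positions = set(prompt_positions)
    if not positions:
        raise ValueError("at least one prompt position is required")
    log_probs = log_softmax(attn_logits)
    # The target puts weight 1/|positions| on each prompted patch and 0 elsewhere.
    return -sum(log_probs[p] for p in positions) / len(positions)

uniform = many_hot_attention_loss([0.0, 0.0, 0.0, 0.0], [0, 1])
focused = many_hot_attention_loss([2.0, 2.0, 0.0, 0.0], [0, 1])
print(uniform, focused)
```

In this sketch the loss drops as more attention mass lands on the prompted patches, which is the behaviour the abstract relies on during fine-tuning.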

AIJ Journal 2023 Journal Article

Towards well-generalizing meta-learning via adversarial task augmentation

Haoqing Wang
Huiyu Mai
Yuhang Gong
Zhi-Hong Deng

Meta-learning aims to use the knowledge from previous tasks to facilitate the learning of novel tasks. Many meta-learning models elaborately design various task-shared inductive bias, and learn it from a large number of tasks, so the generalization capability of the learned inductive bias depends on the diversity of the training tasks. A common assumption in meta-learning is that the training tasks and the test tasks come from the same or similar task distributions. However, this is usually not strictly satisfied in practice, so meta-learning models need to cope with various novel in-domain or cross-domain tasks. To this end, we propose to use task augmentation to increase the diversity of training tasks, thereby improving the generalization capability of meta-learning models. Concretely, we consider the worst-case problem around the base task distribution, and derive the adversarial task augmentation method which can generate inductive bias-adaptive ‘challenging’ tasks. Our method can be used as a simple plug-and-play module for various meta-learning models, and improve their generalization capability. We conduct extensive experiments under in-domain and cross-domain few-shot learning and unsupervised few-shot learning settings, and evaluate our method on different types of data (images and text). Experimental results show that our method can effectively improve the generalization capability of various meta-learning models under different settings.
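
The worst-case idea in this abstract can be illustrated with a toy sketch: gradient ascent on an input to raise the loss, with a quadratic penalty that keeps the perturbed input near the original. This is a single-sample linear example under assumed names and hyperparameters (`gamma`, `lr`, `steps`). It is not the paper's task-level procedure.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def squared_loss(w, x, y):
    return (dot(w, x) - y) ** 2

def adversarial_augment(w, x0, y, gamma=3.0, lr=0.1, steps=5):
    """Ascend loss(x) - gamma * ||x - x0||^2 starting from x0."""
    x = list(x0)
    for _ in range(steps):
        residual = dot(w, x) - y
        # Gradient of the penalized objective with respect to x.
        grad = [2.0 * residual * wi - 2.0 * gamma * (xi - x0i)
                for wi, xi, x0i in zip(w, x, x0)]
        x = [xi + lr * gi for xi, gi in zip(x, grad)]
    return x

w, x0, y = [1.0, -1.0], [1.0, 0.0], 0.0
x_adv = adversarial_augment(w, x0, y)
print(squared_loss(w, x0, y), squared_loss(w, x_adv, y))
```

The penalty weight is chosen here so that the objective is concave for this particular `w`, which keeps the ascent bounded. A smaller `gamma` would let the perturbation grow without limit.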

JBHI Journal 2022 Journal Article

SimH: A Novel Representation Learning Model With Activation and Projection Mechanisms for COVID-19 Knowledge Bases

Qing Su
Enhai Ou
Yuping Sun
Chunyan Lv
Guobo Xie
Haoqing Wang
Honglin Huang

The emergence of coronavirus disease 2019 (COVID-19) has had a significant impact on healthcare and the economy. With representation learning applied in constructing COVID-19 knowledge graphs, abundant COVID-19-related knowledge collected by clinicians and scientists all over the world can be utilized to deepen their understanding of the mechanism and related biological functions of the disease. However, most existing representation learning models cannot deal well with COVID-19 knowledge graph due to its low-connected star-like structure and various complex nonlinear relationships. Besides, lacking reliable negative triplets is also a difficult problem, yet to be adequately resolved. In this article, we propose a novel representation learning model called translation on hyperplanes with an activation operation and similar semantic sampling (SimH) for COVID-19 knowledge graphs. In our proposed SimH, an activation operation is designed to provide additional interaction features for low-in-degree entities. Then the hyperplane projection technique is introduced to the distance-based scoring function so that those complex nonlinear relationships can be modeled with lower complexity maintained in comparison with other nonlinear models. Moreover, a negative triplet sampling method that adaptively replaces entities with similar semantics is introduced to generate reliable negative triplets. To verify the effectiveness of SimH, extensive experiments are conducted on the COVID-19-Concepts dataset. The experimental results show that our SimH model achieves significant improvements in prediction and classification accuracy over existing knowledge representation learning models.
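
The hyperplane projection mentioned in this abstract follows the general pattern of translation-on-hyperplanes scoring. The sketch below shows the standard TransH-style distance score only. The activation operation and the semantic negative sampling of SimH are omitted, and all names are illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project(v, w):
    """Project v onto the hyperplane whose normal vector is w."""
    norm = math.sqrt(dot(w, w))
    w_unit = [x / norm for x in w]
    scale = dot(v, w_unit)
    return [x - scale * u for x, u in zip(v, w_unit)]

def hyperplane_score(head, tail, normal, translation):
    """Squared distance ||h_perp + d_r - t_perp||^2; lower means more plausible."""
    h_perp = project(head, normal)
    t_perp = project(tail, normal)
    return sum((h + d - t) ** 2 for h, d, t in zip(h_perp, translation, t_perp))

normal = [0.0, 0.0, 2.0]
translation = [1.0, 0.0, 0.0]
good = hyperplane_score([1.0, 2.0, 3.0], [2.0, 2.0, -5.0], normal, translation)
bad = hyperplane_score([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], normal, translation)
print(good, bad)
```

Because each relation has its own hyperplane, the same entity can take different projected positions under different relations, which is what lets this family of models handle one-to-many relations.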

IJCAI Conference 2021 Conference Paper

Cross-Domain Few-Shot Classification via Adversarial Task Augmentation

Haoqing Wang
Zhi-Hong Deng

Few-shot classification aims to recognize unseen classes with few labeled samples from each class. Many meta-learning models for few-shot classification elaborately design various task-shared inductive bias (meta-knowledge) to solve such tasks, and achieve impressive performance. However, when there exists the domain shift between the training tasks and the test tasks, the obtained inductive bias fails to generalize across domains, which degrades the performance of the meta-learning models. In this work, we aim to improve the robustness of the inductive bias through task augmentation. Concretely, we consider the worst-case problem around the source task distribution, and propose the adversarial task augmentation method which can generate the inductive bias-adaptive 'challenging' tasks. Our method can be used as a simple plug-and-play module for various meta-learning models, and improve their cross-domain generalization capability. We conduct extensive experiments under the cross-domain setting, using nine few-shot classification datasets: mini-ImageNet, CUB, Cars, Places, Plantae, CropDiseases, EuroSAT, ISIC and ChestX. Experimental results show that our method can effectively improve the few-shot classification performance of the meta-learning models under domain shift, and outperforms the existing works. Our code is available at https://github.com/Haoqing-Wang/CDFSL-ATA.

NeurIPS Conference 2019 Conference Paper

Fast Structured Decoding for Sequence Models

Zhiqing Sun
Zhuohan Li
Haoqing Wang
Di He
Zi Lin
Zhihong Deng

Autoregressive sequence models achieve state-of-the-art performance in domains like machine translation. However, due to the autoregressive factorization nature, these models suffer from heavy latency during inference. Recently, non-autoregressive sequence models were proposed to speed up the inference time. However, these models assume that the decoding process of each token is conditionally independent of others. Such a generation process sometimes makes the output sentence inconsistent, and thus the learned non-autoregressive models could only achieve inferior accuracy compared to their autoregressive counterparts. To improve the decoding consistency and reduce the inference cost at the same time, we propose to incorporate a structured inference module into the non-autoregressive models. Specifically, we design an efficient approximation for Conditional Random Fields (CRF) for non-autoregressive sequence models, and further propose a dynamic transition technique to model positional contexts in the CRF. Experiments in machine translation show that while increasing little latency (8~14ms), our model could achieve significantly better translation performance than previous non-autoregressive models on different translation datasets. In particular, for the WMT14 En-De dataset, our model obtains a BLEU score of 26.80, which largely outperforms the previous non-autoregressive baselines and is only 0.61 lower in BLEU than purely autoregressive models.
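
The structured inference step in this abstract can be illustrated with exact Viterbi decoding over a small linear-chain CRF. This sketch uses a full, position-independent transition matrix. The paper's efficient approximation and its dynamic transition technique are not reproduced, and the scores below are made-up toy values.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring label path and its score.

    emissions[t][j]: score of label j at position t.
    transitions[i][j]: score of moving from label i to label j.
    """
    n_labels = len(transitions)
    score = list(emissions[0])
    backptrs = []
    for emit in emissions[1:]:
        new_score, ptrs = [], []
        for j in range(n_labels):
            # Best previous label for ending in label j at this position.
            best_i = max(range(n_labels), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            ptrs.append(best_i)
        score = new_score
        backptrs.append(ptrs)
    last = max(range(n_labels), key=lambda j: score[j])
    path = [last]
    for ptrs in reversed(backptrs):
        path.append(ptrs[path[-1]])
    path.reverse()
    return path, score[last]

emissions = [[1.0, 0.0], [0.0, 0.6], [1.0, 0.0]]
transitions = [[0.0, -2.0], [-2.0, 0.0]]
independent = [max(range(2), key=lambda j: e[j]) for e in emissions]
path, best = viterbi(emissions, transitions)
print(independent, path, best)
```

Per-position argmax picks the locally best label at each step, while the transition scores let the joint decoder prefer a consistent sequence. That contrast mirrors the inconsistency problem the abstract attributes to conditionally independent decoding.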