Arrow Research search

Author name cluster

Ling Ge

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers
1 author row

Possible papers (4)

AAAI Conference 2024 · Conference Paper

DA-Net: A Disentangled and Adaptive Network for Multi-Source Cross-Lingual Transfer Learning

  • Ling Ge
  • Chunming Hu
  • Guanghui Ma
  • Jihong Liu
  • Hong Zhang

Multi-source cross-lingual transfer learning deals with the transfer of task knowledge from multiple labelled source languages to an unlabelled target language under language shift. Existing methods typically focus on weighting the predictions produced by language-specific classifiers of different sources that follow a shared encoder. However, because all source languages share and update the same encoder, the extracted representations inevitably mix information from different source languages, which may disturb the learning of the language-specific classifiers. Additionally, due to the language gap, language-specific classifiers trained with source labels are unable to make accurate predictions for the target language. Both facts impair the model's performance. To address these challenges, we propose a Disentangled and Adaptive Network (DA-Net). Firstly, we devise a feedback-guided collaborative disentanglement method that seeks to purify the input representations of the classifiers, thereby mitigating mutual interference from multiple sources. Secondly, we propose a class-aware parallel adaptation method that aligns class-level distributions for each source-target language pair, thereby alleviating the language gap within each pair. Experimental results on three different tasks involving 38 languages validate the effectiveness of our approach.
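A minimal sketch of what class-level alignment for one source-target language pair could look like, assuming a PyTorch-style setup; the function name and the use of pseudo-labels to group target samples by class are illustrative assumptions, not DA-Net's released code.

```python
import torch

def class_alignment_loss(src_feats, src_labels, tgt_feats, tgt_pseudo_labels, num_classes):
    """Align class-level centroids of one source-target language pair.

    src_feats / tgt_feats: (N, d) encoder outputs.
    src_labels: gold labels for the source language.
    tgt_pseudo_labels: predicted labels for the unlabelled target language
    (an assumption about how class membership is obtained on the target side).
    """
    loss = src_feats.new_zeros(())
    matched = 0
    for c in range(num_classes):
        src_mask = src_labels == c
        tgt_mask = tgt_pseudo_labels == c
        if src_mask.any() and tgt_mask.any():
            # Pull the class centroid of the target toward the source centroid.
            src_centroid = src_feats[src_mask].mean(dim=0)
            tgt_centroid = tgt_feats[tgt_mask].mean(dim=0)
            loss = loss + torch.norm(src_centroid - tgt_centroid, p=2)
            matched += 1
    return loss / max(matched, 1)
```

In a multi-source setting, such a loss would be computed once per source-target pair and summed alongside the task loss on the labelled sources.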

AAAI Conference 2024 · Conference Paper

Discrepancy and Uncertainty Aware Denoising Knowledge Distillation for Zero-Shot Cross-Lingual Named Entity Recognition

  • Ling Ge
  • Chunming Hu
  • Guanghui Ma
  • Jihong Liu
  • Hong Zhang

Knowledge distillation-based approaches have recently yielded state-of-the-art (SOTA) results for cross-lingual NER tasks in zero-shot scenarios. These approaches typically employ a teacher network trained with the labelled source (rich-resource) language to infer pseudo-soft labels for the unlabelled target (zero-shot) language, and force a student network to approximate these pseudo labels to achieve knowledge transfer. However, previous works have rarely discussed the issue of pseudo-label noise caused by the source-target language gap, which can mislead the training of the student network and result in negative knowledge transfer. This paper proposes a discrepancy- and uncertainty-aware Denoising Knowledge Distillation model (DenKD) to tackle this issue. Specifically, DenKD uses a discrepancy-aware denoising representation learning method to optimize the class representations of the target language produced by the teacher network, thus enhancing the quality of pseudo labels and reducing noisy predictions. Further, DenKD employs an uncertainty-aware denoising method to quantify the pseudo-label noise and adjust the focus of the student network on different samples during knowledge distillation, thereby mitigating the noise's adverse effects. We conduct extensive experiments on 28 languages, including 4 languages not covered by the pre-trained models, and the results demonstrate the effectiveness of our DenKD.
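One plausible reading of the uncertainty-aware denoising step is to down-weight each target sample's distillation loss by the entropy of the teacher's soft label. The sketch below assumes that reading and standard soft-label KL distillation; the weighting scheme is an assumption, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def uncertainty_weighted_kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label distillation where noisier (high-entropy) teacher
    predictions contribute less to the student's loss."""
    t_probs = F.softmax(teacher_logits / temperature, dim=-1)          # (N, C)
    s_log_probs = F.log_softmax(student_logits / temperature, dim=-1)  # (N, C)

    # Per-sample KL(teacher || student), the usual distillation objective.
    kl = (t_probs * (t_probs.clamp_min(1e-8).log() - s_log_probs)).sum(dim=-1)

    # Teacher entropy as an uncertainty proxy (assumed form of the weighting),
    # normalized by log(C) so the weights fall in [0, 1].
    entropy = -(t_probs * t_probs.clamp_min(1e-8).log()).sum(dim=-1)
    weights = 1.0 - entropy / torch.log(torch.tensor(float(t_probs.size(-1))))

    return (weights.detach() * kl).mean()
```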

IJCAI Conference 2023 · Conference Paper

Multi-View Robust Graph Representation Learning for Graph Classification

  • Guanghui Ma
  • Chunming Hu
  • Ling Ge
  • Hong Zhang

The robustness of graph classification models plays an essential role in providing highly reliable applications. Previous studies along this line primarily focus on seeking the stability of the model in terms of overall data metrics (e.g., accuracy) when facing data perturbations, such as removing edges. Empirically, we find that these graph classification models also suffer from semantic bias and confidence collapse issues, which substantially hinder their applicability in real-world scenarios. To address these issues, we present MGRL, a multi-view representation learning model for graph classification tasks that achieves robust results. Firstly, we propose an instance-view consistency representation learning method, which utilizes a multi-granularity contrastive learning technique to impose semantic constraints on instance representations at both the node and graph levels, thus alleviating the semantic bias issue. Secondly, we propose a class-view discriminative representation learning method, which employs a prototype-driven class distance optimization technique to adjust intra- and inter-class distances, thereby mitigating the confidence collapse issue. Finally, extensive experiments and visualizations on eight benchmark datasets demonstrate the effectiveness of MGRL.
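As a rough illustration of the graph-level half of multi-granularity contrastive learning, the sketch below computes a standard NT-Xent loss between two augmented views of a batch of graph embeddings. The node-level term and the prototype-driven class distance optimization are omitted, and none of this is MGRL's actual code.

```python
import torch
import torch.nn.functional as F

def graph_level_contrastive_loss(z1, z2, temperature=0.5):
    """NT-Xent over two views of the same graphs.

    z1, z2: (B, d) graph embeddings from two augmented views;
    the i-th rows form a positive pair, everything else is a negative.
    """
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    z = torch.cat([z1, z2], dim=0)                 # (2B, d)
    sim = z @ z.t() / temperature                  # (2B, 2B) cosine similarities
    sim.fill_diagonal_(float("-inf"))              # exclude self-similarity

    b = z1.size(0)
    # Each view's positive is the corresponding row of the other view.
    targets = torch.cat([torch.arange(b, 2 * b), torch.arange(0, b)])
    return F.cross_entropy(sim, targets)
```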

AAAI Conference 2023 · Conference Paper

ProKD: An Unsupervised Prototypical Knowledge Distillation Network for Zero-Resource Cross-Lingual Named Entity Recognition

  • Ling Ge
  • Chunming Hu
  • Guanghui Ma
  • Hong Zhang
  • Jihong Liu

For named entity recognition (NER) in zero-resource languages, knowledge distillation is an effective means of transferring language-independent knowledge from rich-resource source languages to zero-resource languages. Typically, these approaches adopt a teacher-student architecture, where the teacher network is trained in the source language, and the student network seeks to learn knowledge from the teacher network and is expected to perform well in the target language. Despite the impressive performance achieved by these methods, we argue that they have two limitations. Firstly, the teacher network fails to effectively learn language-independent knowledge shared across languages due to the differences in the feature distribution between the source and target languages. Secondly, the student network acquires all of its knowledge from the teacher network and ignores the learning of target language-specific knowledge. Undesirably, these limitations would hinder the model's performance in the target language. This paper proposes an unsupervised prototypical knowledge distillation network (ProKD) to address these issues. Specifically, ProKD presents a contrastive learning-based prototype alignment method that achieves class feature alignment by adjusting the distance between the prototypes of the source and target languages, boosting the teacher network's capacity to acquire language-independent knowledge. In addition, ProKD introduces a prototype self-training method to learn the intrinsic structure of the language by retraining the student network on the target data using the samples' distance information relative to the prototypes, thereby enhancing the student network's ability to acquire language-specific knowledge. Extensive experiments on three benchmark cross-lingual NER datasets demonstrate the effectiveness of our approach.
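To make the prototype self-training idea concrete, the sketch below builds class prototypes from labelled source features and assigns each target-language representation a pseudo-label from its nearest prototype. The prototype construction and the nearest-prototype assignment are assumptions for illustration, not ProKD's released code.

```python
import torch

def build_prototypes(feats, labels, num_classes):
    """Mean feature per class from labelled source data; returns (C, d)."""
    protos = feats.new_zeros(num_classes, feats.size(-1))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            protos[c] = feats[mask].mean(dim=0)
    return protos

def prototype_pseudo_labels(tgt_feats, protos):
    """Pseudo-label each target sample by its nearest prototype (Euclidean
    distance); also return the distances for optional confidence filtering."""
    dists = torch.cdist(tgt_feats, protos)   # (N, C) pairwise distances
    min_dist, pseudo = dists.min(dim=-1)
    return pseudo, min_dist
```

The student network would then be retrained on the target data with these pseudo-labels, optionally discarding samples whose distance to the nearest prototype is large.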