Author name cluster

Linh Le

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers
1 author row

Possible papers

EAAI Journal 2026 Journal Article

Enhancing multimodal emotion recognition with dynamic fuzzy membership and attention fusion

  • Nhut Minh Nguyen
  • Trung Minh Nguyen
  • Thanh Trung Nguyen
  • Phuong-Nam Tran
  • Nhat Truong Pham
  • Linh Le
  • Alice Othmani
  • Abdulmotaleb El Saddik

Multimodal learning has been shown to enhance classification outcomes in speech emotion recognition (SER). Despite this advantage, multimodal approaches in SER often face key challenges, including limited robustness to uncertainty, difficulty generalizing across diverse emotional contexts, and inefficiencies in integrating heterogeneous modalities. To overcome these constraints, we propose a multimodal emotion recognition architecture, named FleSER, which leverages dynamic fuzzy membership and attention-based fusion. Unlike most previous SER studies that apply fuzzy logic at the decision level, FleSER introduces a feature-level, rule-based dynamic fuzzy membership mechanism that adaptively refines modality representations prior to fusion. The FleSER architecture leverages audio and textual modalities, employing self-modality and cross-modality attention mechanisms with the α interpolation to capture complementary emotional cues. The α interpolation-based feature fusion mechanism adaptively emphasizes the more informative modality across varying contexts, ensuring robust multimodal integration. This comprehensive design enhances recognition accuracy. We evaluate FleSER on multiple benchmark datasets, surpassing previous state-of-the-art (SOTA) approaches and demonstrating superior effectiveness in emotion recognition. Ablation studies further validate the effectiveness of each key component, including unimodal and multimodal input effectiveness, fuzzy membership functions, fusion strategies, and the projection dimension, on the performance of the FleSER architecture.
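The two mechanisms named in this abstract, feature-level fuzzy membership and α interpolation between modalities, can be illustrated with a minimal sketch. The abstract does not define either one precisely, so everything here is an assumption: a Gaussian membership function, a refinement rule that scales each feature by its membership degree, and a plain convex combination with a fixed `alpha`. The function names are hypothetical and are not taken from FleSER.

```python
import math


def gaussian_membership(x, center, sigma):
    """Degree in (0, 1] to which x belongs to a fuzzy set centered at `center`.

    Assumption: a Gaussian membership function; the paper may use another form.
    """
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def fuzzy_refine(features, center=0.0, sigma=1.0):
    """Scale each feature by its membership degree (hypothetical refinement rule)."""
    return [x * gaussian_membership(x, center, sigma) for x in features]


def alpha_interpolate(audio, text, alpha):
    """Convex combination: alpha weights audio, (1 - alpha) weights text.

    Assumption: alpha is a fixed scalar here; an adaptive model would predict it.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(audio) != len(text):
        raise ValueError("both modalities must share one projection dimension")
    return [alpha * a + (1.0 - alpha) * t for a, t in zip(audio, text)]


# Toy feature vectors standing in for projected audio and text representations.
audio_feat = fuzzy_refine([0.0, 1.0, -2.0])
text_feat = fuzzy_refine([0.5, 0.0, 1.0])
fused = alpha_interpolate(audio_feat, text_feat, alpha=0.7)
```

With `alpha=1.0` the fused vector equals the audio features and with `alpha=0.0` it equals the text features, which is the sense in which such a weight can emphasize the more informative modality.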

AAAI Conference 2024 Conference Paper

CoLAL: Co-learning Active Learning for Text Classification

  • Linh Le
  • Genghong Zhao
  • Xia Zhang
  • Guido Zuccon
  • Gianluca Demartini

In the machine learning field, the challenge of effectively learning with limited data has become increasingly crucial. Active Learning (AL) algorithms play a significant role in this by enhancing model performance. We introduce a novel AL algorithm, termed Co-learning (CoLAL), designed to select the most diverse and representative samples within a training dataset. This approach utilizes noisy labels and predictions made by the primary model on unlabeled data. By leveraging a probabilistic graphical model, we combine two multi-class classifiers into a binary one. This classifier determines if both the main and the peer models agree on a prediction. If they do, the unlabeled sample is assumed to be easy to classify and is thus not beneficial for improving the target model's performance. We prioritize data that represents the unlabeled set without overlapping decision boundaries. The discrepancies between these boundaries can be estimated by the probability that two models produce the same prediction. Through theoretical analysis and experimental validation, we reveal that the integration of noisy labels into the peer model effectively identifies the target model's potential inaccuracies. We evaluated the CoLAL method across seven benchmark datasets: four text datasets (AGNews, DBPedia, PubMed, SST-2) with text-based state-of-the-art (SOTA) baselines, and three image datasets (CIFAR100, MNIST, OpenML-155) with computer vision SOTA baselines. The results show that our CoLAL method significantly outperforms existing SOTA in text-based AL, and is competitive with SOTA image-based AL techniques.
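The agreement-based selection idea in this abstract can be sketched as follows. This is a simplification under stated assumptions, not the CoLAL algorithm itself: it treats the main and peer models' predictions as independent, so the probability that they agree is the dot product of their class-probability vectors, and it omits the probabilistic graphical model, the noisy labels, and the diversity and representativeness criteria. The function names are hypothetical.

```python
def agreement_probability(p_main, p_peer):
    """Probability that two classifiers predict the same class.

    Assumption: the two predictions are independent, so
    P(agree) = sum over classes k of p_main[k] * p_peer[k].
    """
    return sum(a * b for a, b in zip(p_main, p_peer))


def select_for_labeling(main_probs, peer_probs, budget):
    """Return indices of the `budget` samples the models are least likely to agree on."""
    scores = [agreement_probability(p, q) for p, q in zip(main_probs, peer_probs)]
    # Low agreement suggests the sample is hard, so it is queried first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    return ranked[:budget]


# Toy class probabilities for three unlabeled samples and two classes.
main_probs = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
peer_probs = [[0.8, 0.2], [0.5, 0.5], [0.9, 0.1]]
queried = select_for_labeling(main_probs, peer_probs, budget=2)
```

In this toy input the third sample is ranked first because the two models favor different classes, while the first sample, where both models confidently agree, is left unqueried.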

EAAI Journal 2021 Journal Article

Transformers-based information extraction with limited data for domain-specific business documents

  • Minh-Tien Nguyen
  • Dung Tien Le
  • Linh Le

Information extraction plays an important role in data transformation for business cases. However, building extraction systems in actual cases faces two challenges: (i) the availability of labeled data is usually limited and (ii) highly detailed classification is required. This paper introduces a model for addressing the two challenges. Unlike prior studies that usually require a large number of training samples, our extraction model is trained with a small amount of data for extracting a large number of information types. To do that, the model takes into account the contextual aspect of pre-trained language models, trained on a huge amount of general-domain data, for word representation. To adapt to our downstream task, the model employs transfer learning by stacking Convolutional Neural Networks to learn hidden representations for classification. To confirm the efficiency of our method, we apply the model to two actual cases of document processing for bidding and sale documents of two Japanese companies. Experimental results on real testing sets show that, with a small amount of training data, our model achieves high accuracy accepted by our clients.