Arrow Research search
Back to EAAI

EAAI 2023

Imbalanced data classification: Using transfer learning and active sampling

Journal Article journal-article Applied Artificial Intelligence ยท Artificial Intelligence

Abstract

Recently, deep learning models have made great breakthroughs in the field of computer vision, relying on large-scale class-balanced datasets. However, most of them do not consider the class-imbalanced data. In reality, the class-imbalanced distribution can lead to the degradation of model performance, reducing the generalization of these models. In addition, in the era of big data, many applications need to use real-time visual data. These data come from different mobile devices, which continuously generate a huge number of visual data. However, there are few studies using real-time data from information systems, real-time data is easy to capture but difficult to use. In order to solve the above problems, we propose a new model (Transfer Learning Classifier, TLC) based on transfer learning to deal with class-imbalanced data. The model includes active sampling module, real-time data augmentation module and DenseNet module. Among them, (1) the newly proposed active sampling module can dynamically adjust the number of samples with skewed distribution; (2) the data augmentation module can expand the real-time data to avoid over-fitting and insufficient data; (3) the DenseNet module is a standard DenseNet network pre-trained on the ImageNet dataset and transferred to TLC for relearning, and then we adjust the memory usage of the standard DenseNet to make it more efficient. In addition, we have applied a new end-to-end real-time data storage and analysis system. A large number of experiments have been carried out on four different long mantissa data sets. Experimental results show that the proposed TLC model can effectively deal with the static data as well as the real-time data, and the classification effect of imbalanced data is better than that of existing models.

Authors

Keywords

  • Class-imbalanced data
  • Transfer learning
  • Active sampling
  • DenseNet
  • Real-time data

Context

Venue
Engineering Applications of Artificial Intelligence
Archive span
1988-2026
Indexed papers
13269
Paper id
284410832583746944