IS 2018

Instance-based Domain Adaptation via Multiclustering Logistic Approximation

Journal Article · Artificial Intelligence · Intelligent Systems

Abstract

With the explosive growth of online text, it is now easy to collect large amounts of labeled training data from different source domains. However, a basic assumption in building statistical machine learning models for sentiment analysis is that the training and test data are drawn from the same distribution. When the two distributions differ, directly training a statistical model usually results in poor performance. Given massive labeled data from different domains, it is therefore important to identify the source-domain training instances that are closely relevant to the target domain and to make better use of them. In this work, we propose a new approach, called multiclustering logistic approximation (MLA), to address this problem. MLA adapts the source-domain training data to the target domain within a multiclustering logistic approximation framework. Experimental results demonstrate that MLA has significant advantages over state-of-the-art instance adaptation methods, especially when the training data are multidistributional.
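
The abstract does not spell out the MLA procedure, so the following is not the authors' method. It is a minimal sketch of the generic idea behind logistic instance weighting, assuming a single logistic domain classifier and toy one-dimensional features: a classifier is trained to tell target instances from source instances, and its odds are used as an estimate of the density ratio p_target(x) / p_source(x). All function names, the toy data, and the hyperparameters are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_domain_classifier(source, target, lr=0.1, epochs=500):
    """Fit a logistic regression (target = 1, source = 0) by batch gradient descent."""
    data = [(x, 0.0) for x in source] + [(x, 1.0) for x in target]
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in data:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def instance_weights(source, target, w, b):
    """Estimate p_target(x) / p_source(x) for each source instance.

    By Bayes' rule the ratio equals (n_source / n_target) times the
    classifier odds P(target | x) / P(source | x), which is exp(w.x + b).
    """
    prior = len(source) / len(target)
    weights = []
    for x in source:
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        weights.append(prior * math.exp(min(z, 30.0)))  # clamp to avoid overflow
    return weights


# Toy data (an assumption for illustration): the source mixes two
# distributions, one far from the target and one matching it.
random.seed(0)
far_source = [[random.gauss(0.0, 0.5)] for _ in range(50)]
near_source = [[random.gauss(4.0, 0.5)] for _ in range(50)]
source = far_source + near_source
target = [[random.gauss(4.0, 0.5)] for _ in range(50)]

w, b = fit_domain_classifier(source, target)
weights = instance_weights(source, target, w, b)

far_mean = sum(weights[:50]) / 50
near_mean = sum(weights[50:]) / 50
print("mean weight, far cluster: ", far_mean)
print("mean weight, near cluster:", near_mean)
```

In this sketch the source instances drawn from the target-like cluster receive larger weights than those drawn from the distant cluster, so a downstream sentiment classifier trained with these weights would lean on the relevant instances. A single linear classifier is a simplification; handling multidistributional source data, which the abstract emphasizes, would need more than this.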

Keywords

  • Training data
  • Logistics
  • Training
  • Sentiment analysis
  • Adaptation models
  • Portable computers
  • Biological system modeling
  • Domain Adaptation
  • Machine Learning
  • Statistical Models
  • Poor Performance
  • Sampling Bias
  • Target Domain
  • Text Classification
  • Source Domain
  • Field Of Machine Learning
  • Training Instances
  • Domain Adaptation Methods
  • Subcategories
  • Class Labels
  • Single Domain
  • Density Ratio
  • Top Categories
  • Target Domain Data
  • Under Category
  • Kullback-Leibler Distance
  • One-class Classification
  • Instance Selection
  • Affective Computing
  • multiclustering logistic approximation
  • instance adaptation
  • multidistributional training data
  • Internet/Web technologies

Context

Venue
IEEE Intelligent Systems
Archive span
2001-2026
Indexed papers
2921
Paper id
925225061346186852