AAAI 2023

A Coreset Learning Reality Check

Conference Paper, AAAI Technical Track on Machine Learning II, Artificial Intelligence

Abstract

Subsampling algorithms are a natural approach to reduce data size before fitting models on massive datasets. In recent years, several works have proposed methods for subsampling rows from a data matrix while maintaining relevant information for classification. While these works are supported by theory and limited experiments, to date there has not been a comprehensive evaluation of these methods. In our work, we directly compare multiple methods for logistic regression drawn from the coreset and optimal subsampling literature and discover inconsistencies in their effectiveness. In many cases, methods do not outperform simple uniform subsampling.
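To make the comparison in the abstract concrete, the following is a minimal sketch of the general setup: draw a row subsample, attach weights, fit logistic regression on the weighted subsample, and evaluate the log loss on the full data. It is not the paper's code. The synthetic data generator, the function names, and the norm-based sampling probabilities are all assumptions made for illustration; the norm-based scheme is only a stand-in for the coreset and optimal subsampling methods the paper evaluates, none of which is reproduced here.

```python
import math
import random


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_data(n, rng):
    """Synthetic two-feature data from a known logistic model (hypothetical)."""
    true_w = [1.5, -2.0]
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]
        X.append(x)
        y.append(1 if rng.random() < sigmoid(dot(true_w, x)) else 0)
    return X, y


def fit_weighted_logreg(X, y, weights, steps=300, lr=0.5):
    """Plain gradient descent on the weighted mean log loss (no intercept)."""
    d = len(X[0])
    w = [0.0] * d
    total = sum(weights)
    for _ in range(steps):
        grad = [0.0] * d
        for xi, yi, wi in zip(X, y, weights):
            err = sigmoid(dot(w, xi)) - yi
            for j in range(d):
                grad[j] += wi * err * xi[j]
        w = [wj - lr * g / total for wj, g in zip(w, grad)]
    return w


def log_loss(X, y, w):
    """Mean log loss, written in a form that avoids overflow."""
    total = 0.0
    for xi, yi in zip(X, y):
        z = dot(w, xi)
        total += max(z, 0.0) - yi * z + math.log1p(math.exp(-abs(z)))
    return total / len(X)


def uniform_subsample(n, m, rng):
    """Baseline: m rows drawn uniformly with replacement, equal weights."""
    idx = [rng.randrange(n) for _ in range(m)]
    return idx, [1.0] * m


def norm_based_probs(X):
    """Illustrative non-uniform probabilities: half uniform, half row norm."""
    norms = [math.sqrt(dot(x, x)) for x in X]
    s = sum(norms)
    n = len(X)
    return [0.5 / n + 0.5 * r / s for r in norms]


def importance_subsample(probs, m, rng):
    """Draw m rows with the given probabilities; weight each by 1 / (m * p_i)."""
    idx = rng.choices(range(len(probs)), weights=probs, k=m)
    return idx, [1.0 / (m * probs[i]) for i in idx]


if __name__ == "__main__":
    rng = random.Random(0)
    X, y = make_data(2000, rng)
    m = 200
    samplers = {
        "uniform": uniform_subsample(len(X), m, rng),
        "norm-based": importance_subsample(norm_based_probs(X), m, rng),
    }
    for name, (idx, wts) in samplers.items():
        w_hat = fit_weighted_logreg([X[i] for i in idx], [y[i] for i in idx], wts)
        print(name, "full-data log loss:", round(log_loss(X, y, w_hat), 4))
```

Repeating the experiment over many seeds and subsample sizes, and comparing each method's full-data loss against the uniform baseline, is the kind of evaluation the abstract describes. A single run like this one says nothing about which sampler is better.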

Keywords

  • ML: Classification and Regression
  • ML: Dimensionality Reduction/Feature Selection
  • ML: Scalability of ML Systems

Context

Venue
AAAI Conference on Artificial Intelligence