AAAI 2024

Patch-Aware Sample Selection for Efficient Masked Image Modeling

Conference Paper, AAAI Technical Track on Machine Learning VI, Artificial Intelligence

Abstract

Sample selection is drawing increasing attention: by extracting and training on only the most informative subset of the data, it can effectively reduce training cost. Although sample selection is effective in conventional supervised learning, applying it to Masked Image Modeling (MIM) remains challenging because of the gap between sample-level selection and patch-level pre-training. In this paper, we examine sample selection in MIM pre-training and find that basic selection suffers from performance degradation. We attribute this degradation primarily to two factors: the random masking strategy and the simple averaging function. We then propose Patch-Aware Sample Selection (PASS), which comprises a low-cost Dynamic Trained Mask Predictor (DTMP) and a Weighted Selection Score (WSS). DTMP consistently masks the informative patches in each sample, ensuring that the selection score is relatively accurate. WSS enhances the selection score using patch-level disparity. Extensive experiments show that PASS is effective at selecting the most informative subset and accelerating pre-training. PASS exhibits superior performance across various datasets, MIM methods, and downstream tasks. In particular, PASS improves MAE by 0.7% on ImageNet-1K while using only 37% of the data budget, achieving a ~1.7x speedup.
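
To make the contrast between simple averaging and a patch-aware score concrete, here is a minimal Python sketch. It is not the paper's method: the abstract does not give the WSS formula, so the softmax weighting below is an assumed stand-in for "using patch-level disparity", the function names and the per-patch losses are hypothetical, and DTMP is not modeled at all.

```python
import numpy as np

def mean_score(patch_losses):
    """Baseline selection score: plain average of per-patch losses per sample."""
    return patch_losses.mean(axis=1)

def weighted_score(patch_losses, temperature=1.0):
    """Hypothetical patch-aware score (an assumption, not the paper's WSS).

    Softmax weights over each sample's patches emphasise high-loss patches,
    so one very informative patch is not washed out by many easy ones.
    """
    z = patch_losses / temperature
    z = z - z.max(axis=1, keepdims=True)  # shift for numerical stability
    w = np.exp(z)
    w = w / w.sum(axis=1, keepdims=True)
    return (w * patch_losses).sum(axis=1)

def select_top_fraction(scores, fraction):
    """Return sorted indices of the highest-scoring `fraction` of samples."""
    k = max(1, int(round(fraction * len(scores))))
    order = np.argsort(-scores, kind="stable")
    return np.sort(order[:k])

# Toy per-patch losses (made up): 3 samples, 4 masked patches each.
patch_losses = np.array([
    [1.0, 1.0, 1.0, 1.0],   # uniformly moderate
    [0.0, 0.0, 0.0, 3.6],   # one highly informative patch
    [0.5, 0.5, 0.5, 0.5],   # uniformly easy
])

by_mean = select_top_fraction(mean_score(patch_losses), 1 / 3)
by_weight = select_top_fraction(weighted_score(patch_losses), 1 / 3)
```

In this toy case the averaged score ranks the uniformly moderate sample first, while the weighted score ranks the sample with a single high-loss patch first, which is the kind of sample-level versus patch-level gap the abstract describes. The `fraction` argument plays the role of the data budget (for example 0.37).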

Authors

Keywords

  • CV: Learning & Optimization for CV
  • ML: Deep Learning Algorithms
  • ML: Deep Neural Architectures and Foundation Models
  • ML: Representation Learning
  • ML: Unsupervised & Self-Supervised Learning

Context

Venue
AAAI Conference on Artificial Intelligence
Paper id
36371730804796576