A Training-free Synthetic Data Selection Method for Semantic Segmentation

Hao Tang; Siyue Yu; Jian Pang; Bingfeng Zhang

doi:10.1609/aaai.v39i7.32777

Back to AAAI

AAAI 2025

A Training-free Synthetic Data Selection Method for Semantic Segmentation

Conference Paper AAAI Technical Track on Computer Vision VI Artificial Intelligence

PDF Details DOI

Abstract

Training semantic segmenter with synthetic data has been attracting great attention due to its easy accessibility and huge quantities. Most previous methods focused on producing large-scale synthetic image-annotation samples and then training the segmenter with all of them. However, such a solution remains a main challenge in that the poor-quality samples are unavoidable, and using them to train the model will damage the training process. In this paper, we propose a training-free Synthetic Data Selection (SDS) strategy with CLIP to select high-quality samples for building a reliable synthetic dataset. Specifically, given massive synthetic image-annotation pairs, we first design a Perturbation-based CLIP Similarity (PCS) to measure the reliability of synthetic image, thus removing samples with low-quality images. Then we propose a class-balance Annotation Similarity Filter (ASF) by comparing the synthetic annotation with the response of CLIP to remove the samples related to low-quality annotations. The experimental results show that using our method significantly reduces the data size by half, while the trained segmenter achieves higher performance.

A Training-free Synthetic Data Selection Method for Semantic Segmentation

Abstract

Authors

Keywords

Context