INT: Instance-Specific Negative Mining for Task-Generic Promptable Segmentation

Jian Hu; Zixu Cheng; Shaogang Gong

doi:10.24963/ijcai.2025/124

IJCAI 2025

Conference Paper Agent-based and Multi-agent Systems Artificial Intelligence

Abstract

Task-generic promptable image segmentation aims to segment diverse samples under a single task description using only one task-generic prompt. Current methods leverage the generalisation capabilities of Vision-Language Models (VLMs) to infer instance-specific prompts from these task-generic prompts in order to guide the segmentation process. However, when VLMs struggle to generalise to some image instances, the predicted instance-specific prompts become poor. To solve this problem, we introduce Instance-specific Negative Mining for Task-Generic Promptable Segmentation (INT). The key idea of INT is to adaptively reduce the influence of irrelevant (negative) prior knowledge whilst increasing the use of the most plausible prior knowledge, selected by negative mining with higher contrast, in order to optimise instance-specific prompt generation. Specifically, INT consists of two components: (1) instance-specific prompt generation, which progressively filters out incorrect information in prompt generation; (2) semantic mask generation, which ensures each image instance segmentation correctly matches the semantics of the instance-specific prompts. INT is validated on six datasets, including camouflaged objects and medical images, demonstrating its effectiveness, robustness and scalability.
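
The idea of progressively filtering out negative candidates can be illustrated with a toy sketch. This is not the authors' implementation: the function name `filter_candidates`, the candidate prompts, and the plausibility scores are all hypothetical stand-ins for whatever a VLM would produce, and the "drop the lowest score each round" rule is an assumed simplification of negative mining.

```python
from typing import Dict, List, Tuple


def filter_candidates(
    scores: Dict[str, float],
    rounds: int = 2,
    drop_per_round: int = 1,
) -> Tuple[Dict[str, float], List[str]]:
    """Progressively remove the least plausible candidate prompts.

    scores: hypothetical mapping from candidate prompt to plausibility score.
    Returns the surviving candidates with weights normalised to sum to 1,
    plus the list of candidates rejected as negatives (in rejection order).
    """
    kept = dict(scores)
    negatives: List[str] = []
    for _ in range(rounds):
        if len(kept) <= 1:
            break  # always keep at least one candidate
        ranked = sorted(kept, key=kept.get)  # ascending plausibility
        for name in ranked[:drop_per_round]:
            if len(kept) > 1:
                negatives.append(name)
                del kept[name]
    total = sum(kept.values())
    if total <= 0:
        # Degenerate scores: fall back to uniform weights.
        weights = {k: 1.0 / len(kept) for k in kept}
    else:
        weights = {k: v / total for k, v in kept.items()}
    return weights, negatives


if __name__ == "__main__":
    # Made-up scores for a camouflaged-object task prompt.
    candidates = {"leaf": 0.1, "frog": 0.6, "rock": 0.3}
    weights, negatives = filter_candidates(candidates, rounds=1)
    print(weights, negatives)
```

In this sketch each round shifts weight away from rejected candidates and towards the remaining, more plausible ones, which mirrors the abstract's description only at a very coarse level.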

Keywords

Computer Vision: CV: Multimodal learning
Computer Vision: CV: Segmentation, grouping and shape analysis
Computer Vision: CV: Transfer, low-shot, semi- and un- supervised learning
Computer Vision: CV: Vision, language and reasoning

Context

Venue: International Joint Conference on Artificial Intelligence
Archive span: 1969-2025
Indexed papers: 14525
Paper id: 1053719603129142621