
AAAI 2026

Disentangled Hypergraph-Guided Mamba Scanning for Fine-Grained Visual Recognition

Conference Paper AAAI Technical Track on Computer Vision X Artificial Intelligence

Abstract

Fine-grained Visual Recognition (FGVR) aims to distinguish between categories with subtle inter-class differences and large intra-class variations. Vision Transformers with attention mechanisms have been widely adopted for FGVR, but they usually suffer from high computational complexity and entangled global representations. Recent state-space models, exemplified by Mamba, have shown substantial potential in vision tasks owing to their linear scalability in sequence length and rich sequence modeling capacity. Motivated by this, we propose DHMamba, a novel Mamba-based FGVR method. DHMamba leverages a hypergraph to guide selective scanning and strengthen Mamba’s capability in modeling fine-grained semantics. Furthermore, we introduce a Disentangled Local Scanning (DLS) module that uses hyperedges to allocate distinct informative patches to independent channels, mitigating representational entanglement. Extensive experiments on multiple FGVR benchmarks demonstrate that DHMamba outperforms state-of-the-art methods, validating the efficacy of combining state-space modeling with hypergraph-based feature structuring.
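
The abstract does not specify how hyperedges are built or how the scan is parameterized, so the following is only an illustrative sketch of the general idea, not the DHMamba implementation. It assumes a hyperedge is the set of patches most similar to an anchor patch, and it replaces the selective scan with a fixed-decay linear recurrence. The function names (`build_hyperedges`, `linear_scan`, `hyperedge_guided_scan`) and all parameters are hypothetical.

```python
import numpy as np

def build_hyperedges(features, num_edges, top_k):
    """Assumed construction: each hyperedge links the top_k patches most
    similar (cosine) to one of num_edges evenly spaced anchor patches."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    anchors = np.linspace(0, len(features) - 1, num_edges).astype(int)
    sims = normed[anchors] @ normed.T  # shape: (num_edges, num_patches)
    # Patch indices per hyperedge, most similar first.
    return np.argsort(-sims, axis=1)[:, :top_k]

def linear_scan(x, decay=0.9):
    """Toy recurrence h_t = decay * h_{t-1} + x_t, computed in one pass.
    A stand-in for a selective scan; Mamba uses input-dependent parameters."""
    h = np.zeros(x.shape[1])
    out = np.empty_like(x, dtype=float)
    for t, x_t in enumerate(x):
        h = decay * h + x_t
        out[t] = h
    return out

def hyperedge_guided_scan(features, num_edges=4, top_k=8):
    """Scan the patches of each hyperedge separately, producing one
    independent output stream per hyperedge."""
    edges = build_hyperedges(features, num_edges, top_k)
    return np.stack([linear_scan(features[idx]) for idx in edges])

# Usage: 49 patches (a 7x7 grid) with 16-dimensional features.
rng = np.random.default_rng(0)
patches = rng.normal(size=(49, 16))
streams = hyperedge_guided_scan(patches)
print(streams.shape)  # (num_edges, top_k, feature_dim)
```

Each scan is a single pass over its patches, so the cost grows linearly with the number of patches scanned. Keeping one stream per hyperedge is one simple way to keep different patch groups from mixing in a shared representation.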

Context

Venue
AAAI Conference on Artificial Intelligence
Paper id
1113783273894030447