AAAI 2026
Disentangled Hypergraph-Guided Mamba Scanning for Fine-Grained Visual Recognition
Abstract
Fine-grained Visual Recognition (FGVR) aims to distinguish between categories with subtle inter-class differences and large intra-class variations. While Vision Transformers with attention mechanisms have been widely adopted for FGVR, they usually suffer from high computational complexity and entangled global representations. Recent advancements in state-space models, exemplified by Mamba, have showcased substantial potential in vision-related tasks due to their linear scalability and rich sequence modeling capacity. To this end, we propose DHMamba, a novel Mamba based FGVR method. The proposed method leverages hypergraph to guide selective scanning and strengthen Mamba’s capability in modeling fine-grained semantics. Furthermore, a Disentangled Local Scanning (DLS) module is introduced to utilize hyperedges to allocate distinct informative patches into independent channels for mitigating the representational entanglement. Extensive experiments conducted on multiple FGVR benchmarks demonstrate that the proposed DHMamba outperforms the state-of-the-art methods, validating the efficacy of combining state-space modeling with hypergraph-based feature structuring.
Authors
Keywords
No keywords are indexed for this paper.
Context
- Venue
- AAAI Conference on Artificial Intelligence
- Archive span
- 1980-2026
- Indexed papers
- 28718
- Paper id
- 1113783273894030447