EAAI Journal 2026 Journal Article
A hierarchical semantic collaboration-based network for infrared and visible image fusion
- Liuyan Shi
- Rencan Nie
- Jinde Cao
- Jiang Zuo
- Xiaoli Li
To address the inherent divergence between image fusion and downstream semantic tasks, this study proposes a Hierarchical Semantic Collaboration-Based Network (HSCNet) for infrared and visible image fusion. The framework jointly models cross-modal features across the pixel and semantic domains through a multi-level feature sharing strategy, reducing pixel-level information loss and enhancing semantic reconstruction. A semantic-driven feedback mechanism enables bidirectional optimization between the fusion and segmentation branches, improving the semantic expressiveness of the fused images. Furthermore, a Hierarchical Semantic Transformer (HST) decomposes image representations into global structural and local detail components, facilitating task-specific denoising and reconstruction. Extensive evaluations on three public datasets show that HSCNet achieves state-of-the-art (SOTA) performance, ranking first across all fusion metrics. For downstream applications, the model attains the highest segmentation accuracy, with a mean Intersection over Union (mIoU) of 79.48, and superior detection performance, with a mean Average Precision (mAP) of 0.559 averaged over IoU thresholds [0.5:0.95], outperforming existing methods. These results confirm that HSCNet not only produces perceptually superior fused images but also enhances their compatibility with high-level semantic understanding in real-world artificial intelligence (AI) applications.
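For readers unfamiliar with the downstream metrics quoted above, the following is a minimal sketch of how mIoU is conventionally computed from predicted and ground-truth label maps, together with the threshold grid implied by the [0.5:0.95] notation. It is not taken from the paper: the function name `mean_iou`, the toy label maps, the choice to skip classes absent from both maps, and the assumed step of 0.05 between thresholds (the common COCO-style convention) are all illustrative assumptions, and the authors' evaluation protocol may differ.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes present in the prediction or the ground truth.

    pred and target are 2-D lists of integer class labels of equal shape.
    This follows the standard definition; it is NOT the paper's evaluation code.
    """
    # confusion[t][p] counts pixels whose true label is t and predicted label is p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fp = sum(confusion[t][c] for t in range(num_classes)) - tp
        fn = sum(confusion[c]) - tp
        union = tp + fp + fn
        if union > 0:  # assumption: skip classes absent from both maps
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# IoU thresholds for mAP@[0.5:0.95], assuming the usual step of 0.05
iou_thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]

# Toy 2x2 example with two classes (hypothetical data)
pred = [[0, 0], [1, 1]]
target = [[0, 1], [1, 1]]
score = mean_iou(pred, target, num_classes=2)
```

In the toy example, class 0 has an IoU of 1/2 and class 1 has an IoU of 2/3, so `score` is their average. Detection mAP over [0.5:0.95] is then typically the mean of the per-threshold average precision values across `iou_thresholds`.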