EAAI Journal 2026 Journal Article
Multi-view simulation for robust polyp segmentation via cross-gated decoding and soft-attention fusion
- Linbo Wang
- Cong Chen
- Jinxian Qiu
- Zhengyi Liu
- Xianyong Fang
- Shaohua Wan
Polyp segmentation is crucial for early colorectal cancer detection but remains challenging due to significant shape variations and ambiguous boundaries. Existing methods often rely on single-view analysis, overlooking the potential of multi-view representations to provide complementary segmentation cues. To address this, we propose a novel multi-view simulation-based polyp segmentation network (MVSNet) that generates diverse views of an input image through directional flipping and extracts robust features using a shared Pyramid Vision Transformer (PVT). Two additional strategies are proposed to exploit the rich features from each view and from the whole view set, respectively: (1) a cross-gating-based view-aware multi-stage decoding method, which applies element-wise cross gating to both the coarse and fine features in each stage, yielding highly discriminative multi-stage decoded features specific to each individual view for its initial segmentation mask; and (2) a soft-attention-based cross-view prediction method, which applies soft attention across the views to adaptively weight the contribution of each view to the final prediction. Extensive experiments on five benchmark datasets (Kvasir-SEG, ClinicDB, ColonDB, ETIS, and Endoscene) validate the effectiveness of our approach, which achieves the highest mean Dice scores of 0.923, 0.946, 0.825, 0.821, and 0.902, respectively, consistently outperforming existing state-of-the-art competitors. Code is available at https://github.com/linbowang/MVSNet.
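The three ideas named in the abstract (view simulation by flipping, element-wise cross gating, and soft-attention fusion across views) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the choice of four flip views, the sigmoid gate, and the per-pixel softmax over views are all assumptions made for illustration, and the shared PVT encoder is replaced by a trivial channel mean.

```python
import numpy as np

# Hypothetical flip set: identity, horizontal, vertical, and both.
# Each flip is its own inverse, so the same function maps a view's
# prediction back to the original orientation.
FLIPS = [
    lambda x: x,
    lambda x: x[:, ::-1],
    lambda x: x[::-1, :],
    lambda x: x[::-1, ::-1],
]

def make_views(image):
    """Simulate multiple views of an (H, W, C) image by directional flipping."""
    return [flip(image) for flip in FLIPS]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cross_gate(coarse, fine):
    """Assumed form of element-wise cross gating: each feature map is
    scaled by a sigmoid gate computed from the other one."""
    return coarse * sigmoid(fine), fine * sigmoid(coarse)

def soft_attention_fuse(aligned_logits, scores):
    """Fuse per-view logits of shape (V, H, W) with a per-pixel softmax
    over the view axis; `scores` has the same shape and sets the weights."""
    shifted = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(shifted)
    weights /= weights.sum(axis=0, keepdims=True)
    return (weights * aligned_logits).sum(axis=0), weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8, 3))

    views = make_views(image)
    # Stand-in for the shared encoder and decoder: one logit map per view.
    view_logits = [view.mean(axis=-1) for view in views]
    # Undo each flip so all predictions share the original orientation.
    aligned = np.stack([flip(logit) for flip, logit in zip(FLIPS, view_logits)])

    fused, weights = soft_attention_fuse(aligned, aligned)
    print(fused.shape, weights.shape)
```

In a real network the attention scores would come from learned layers rather than from the logits themselves; reusing the logits here only keeps the sketch self-contained.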