EAAI 2026 Journal Article
A lightweight multi-window attention transformer for image super-resolution
- Yuqing Yang
- Hao Liu
- Jun Zhang
- Wenfei Luo
- Jiaqian Wang
- Yuxiang Shi
- Hongxia Deng
In recent years, Transformer-based models have achieved strong performance in image super-resolution (SR). However, their high computational complexity and parameter cost still limit deployment on resource-constrained devices. To better balance efficiency and representational capability, this paper proposes a lightweight Transformer for image SR, termed the Multi-Window Attention Transformer for Image Super-Resolution (MWAT-SR), which adopts a hierarchical multi-window attention strategy. In shallow layers, Local Dense Attention (LDA) with small windows preserves local high-frequency details. In deeper layers, larger windows are introduced together with a Hybrid Sparse-Channel Attention (HSCA) mechanism, which combines sparse spatial interaction with channel-wise semantic modeling to enlarge the effective receptive field at controlled computational cost. In addition, a Window-Adaptive Multi-Scale Convolutional Feed-Forward Network (WAMC-FFN) adjusts its convolution kernel sizes according to the window scale, enhancing multi-scale texture representation. Experiments on standard benchmark datasets show that MWAT-SR achieves competitive reconstruction performance at ×2, ×3, and ×4 scales while maintaining a favorable trade-off between reconstruction quality and computational complexity.
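The hierarchical strategy above rests on the standard window-partition operation used by window-attention Transformers: the feature map is split into non-overlapping windows, and attention is computed inside each one, so cost scales with window area rather than image area. The paper's exact window sizes and layer configuration are not given here; the sizes 4 and 16 below are illustrative placeholders for the "small windows in shallow layers, larger windows in deeper layers" idea, sketched in NumPy:

```python
import numpy as np

def window_partition(feat, win):
    """Split an (H, W, C) feature map into non-overlapping win x win windows.

    Returns an array of shape (num_windows, win, win, C); attention would
    then be computed independently within each window.
    """
    H, W, C = feat.shape
    assert H % win == 0 and W % win == 0, "spatial dims must divide by win"
    x = feat.reshape(H // win, win, W // win, win, C)
    # Bring the two window-grid axes together, then flatten them.
    x = x.transpose(0, 2, 1, 3, 4).reshape(-1, win, win, C)
    return x

# Hypothetical 32x32 feature map with 16 channels.
feat = np.random.rand(32, 32, 16)

# Shallow layers: small windows -> many local windows (dense local attention).
shallow = window_partition(feat, 4)    # shape (64, 4, 4, 16)

# Deeper layers: large windows -> few windows with a wide receptive field.
deep = window_partition(feat, 16)      # shape (4, 16, 16, 16)

print(shallow.shape, deep.shape)
```

Per-window self-attention costs O(w^4 C) per window for window size w, which is why the deeper, larger-window stages pair the bigger windows with the sparse spatial interaction of HSCA to keep the total cost controlled.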