AAAI Conference 2026 Conference Paper
S²Flow: Towards Fast and Authentic Training-Free High-Resolution Video Generation
- Chaoqun Wang
- Shaobo Min
- Xu Yang
Rectified flow models have shown strong potential in high-fidelity video generation, yet extending them to high resolution remains challenging due to the high cost of full attention and error accumulation in the ODE-solving process. In this paper, we propose S²Flow, a training-free framework that enables efficient and authentic high-resolution video generation by jointly exploring Flow-guided Sparse attention and a Second-order ODE solution. Specifically, S²Flow transfers semantic and structural information from the low-resolution flow trajectory to guide the high-resolution flow in two ways. First, S²Flow dynamically captures the sparse patterns of the spatio-temporal attention maps of low-resolution videos to construct localized 3D windows, enabling efficient window attention in high-resolution inference. This significantly reduces redundant computation while preserving contextual dependencies. Second, S²Flow adopts a second-order ODE solver based on Taylor expansion, in which the high-order derivative is approximated via central differences on the low-resolution flow, facilitating accurate high-resolution denoising. Extensive experiments on the VBench dataset demonstrate that S²Flow outperforms prior methods in both visual quality and inference speed, enabling 4× acceleration on 2560×1536 video generation.
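To make the second-order solver idea concrete, the following is a minimal sketch on a toy scalar ODE. It is not the authors' implementation. The velocity field `velocity`, the helper names, and the step count are all hypothetical, and a plain Euler pass stands in for the low-resolution flow trajectory. The sketch shows only the general pattern: record velocities along a cheap reference trajectory, estimate their time derivative by central differences, and reuse that estimate as the second-order Taylor term in a second pass.

```python
import numpy as np

def velocity(x, t):
    # Toy velocity field standing in for a learned flow model (assumption).
    # The exact solution of dx/dt = -x with x(0) = 1 is exp(-t).
    return -x

def reference_velocities(x0, n_steps):
    # Cheap reference pass (stand-in for the low-resolution trajectory):
    # plain Euler, recording the velocity at every grid point t_k = k * h.
    h = 1.0 / n_steps
    xs = [x0]
    for k in range(n_steps):
        xs.append(xs[-1] + h * velocity(xs[-1], k * h))
    return np.array([velocity(x, k * h) for k, x in enumerate(xs)])

def solve(x0, n_steps, ref_accel=None):
    # First-order Euler update, optionally corrected with the second-order
    # Taylor term 0.5 * h^2 * dv/dt taken from the reference trajectory.
    h = 1.0 / n_steps
    x = x0
    for k in range(n_steps):
        x_next = x + h * velocity(x, k * h)
        if ref_accel is not None:
            x_next += 0.5 * h**2 * ref_accel[k]
        x = x_next
    return x

n = 10
v_ref = reference_velocities(1.0, n)
# np.gradient uses central differences at interior points and
# one-sided differences at the two endpoints.
a_ref = np.gradient(v_ref, 1.0 / n)

exact = np.exp(-1.0)
err_euler = abs(solve(1.0, n) - exact)
err_taylor = abs(solve(1.0, n, a_ref) - exact)
print(f"Euler error: {err_euler:.5f}, second-order error: {err_taylor:.5f}")
```

In this toy setting the second pass needs no extra velocity evaluations for the derivative term, since it is read off the reference trajectory. How well this carries over to a real model depends on how closely the reference flow tracks the target flow, which the sketch does not address.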