AAAI Conference 2026 Conference Paper
WaveEx: Accelerating Flow Matching-based Speech Generation via Wavelet-guided Extrapolation
- Xiaoqian Liu
- Xiyan Gui
- Zhengkun Ge
- Yuan Ge
- Chang Zou
- Jiacheng Liu
- Zhikang Niu
- Qixi Zheng
Flow matching-based generative models offer a principled approach to modeling continuous-time dynamics in speech generation. However, inference is often computationally expensive due to repeated neural network evaluations required by ODE solvers. We propose WaveEx, a training-free and plug-in acceleration framework which replaces portions of ODE integration with wavelet-guided extrapolation. By leveraging the multi-scale structure of latent trajectories, WaveEx predicts future states directly in the frequency domain without additional model evaluations or architectural changes. WaveEx consistently accelerates inference across diverse speech generation tasks. The gains are especially pronounced in tasks like speech synthesis (up to 5.73× speedup) and music generation (2.75×), where flow matching plays a central role in alignment modeling and dense ODE integration. Even in tasks with simpler input-output mappings such as speech enhancement (4.55×) and voice conversion (2.75×), WaveEx still achieves notable acceleration, demonstrating the robustness and generalizability of the approach. These results highlight wavelet-guided extrapolation as a lightweight and broadly applicable alternative to full ODE solving for flow matching-based speech generation.