NeurIPS 2025 Conference Paper
Improving Bilinear RNNs with Closed-loop Control
- Jiaxi Hu
- Yongqi Pan
- Jusen Du
- Disen Lan
- Tang Tang
- Qingsong Wen
- Yuxuan Liang
- Weigao Sun
Recent efficient sequence modeling methods, such as Gated DeltaNet, TTT, and RWKV-7, have achieved performance improvements by supervising recurrent memory management through the Delta learning rule. Unlike previous state-space models (e.g., Mamba) and gated linear attentions (e.g., GLA), these models introduce interactions between the recurrent state and the key vector, resulting in a bilinear recursive structure. In this paper, we first introduce the concept of Bilinear RNNs with a comprehensive analysis of the advantages and limitations of these models. Then, based on closed-loop control theory, we propose a novel Bilinear RNN variant named Comba, which adopts a scalar-plus-low-rank state transition with both state feedback and output feedback corrections. We also implement a hardware-efficient chunk-wise parallel kernel in Triton and train models with 340M/1.3B parameters on a large-scale corpus. Comba demonstrates superior performance and computational efficiency on both language modeling and vision tasks.
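To make the bilinear recursive structure concrete, below is a minimal single-head sketch of a delta-rule recurrence in the style of Gated DeltaNet, written in sequential (unchunked) form. The function name, tensor shapes, and gate parameterization are illustrative assumptions, not the paper's implementation; Comba's scalar-plus-low-rank transition is indicated only as a hedged comment, and the chunk-wise Triton kernel and feedback-correction terms are omitted.

```python
import torch

def delta_rule_recurrence(q, k, v, alpha, beta):
    """Sequential reference recurrence, single head (illustrative shapes).

    q, k: (T, d_k); v: (T, d_v); alpha, beta: (T,) gates in (0, 1).
    The state S has shape (d_k, d_v). Because the transition matrix
    (I - beta_t k_t k_t^T) depends on the input key k_t, the recursion
    is bilinear in (S, k) rather than linear as in Mamba or GLA.
    """
    T, d_k = k.shape
    d_v = v.shape[1]
    S = k.new_zeros(d_k, d_v)
    outs = []
    for t in range(T):
        k_t = k[t:t + 1].T                       # (d_k, 1)
        v_t = v[t:t + 1]                         # (1, d_v)
        # Gated delta rule:
        # S_t = alpha_t (I - beta_t k_t k_t^T) S_{t-1} + beta_t k_t v_t^T
        S = alpha[t] * (S - beta[t] * k_t @ (k_t.T @ S)) + beta[t] * k_t @ v_t
        # A scalar-plus-low-rank transition (one reading of the abstract)
        # would instead apply the scalar decay only to the identity part:
        # S = alpha[t] * S - beta[t] * k_t @ (k_t.T @ S) + beta[t] * k_t @ v_t
        outs.append(q[t:t + 1] @ S)              # o_t = q_t^T S_t, (1, d_v)
    return torch.cat(outs, dim=0)                # (T, d_v)

# Illustrative usage with random inputs.
T, d_k, d_v = 16, 8, 8
q, k = torch.randn(T, d_k), torch.randn(T, d_k)
v = torch.randn(T, d_v)
alpha, beta = torch.rand(T), torch.rand(T)
print(delta_rule_recurrence(q, k, v, alpha, beta).shape)  # torch.Size([16, 8])
```

The sequential loop above is only a readability aid: it runs in O(T) steps, whereas the models described in the abstract rely on chunk-wise parallel kernels to make training hardware-efficient.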