IJCAI 2023 Conference Paper
DAMO-StreamNet: Optimizing Streaming Perception in Autonomous Driving
- Jun-Yan He
- Zhi-Qi Cheng
- Chenyang Li
- Wangmeng Xiang
- Binghui Chen
- Bin Luo
- Yifeng Geng
- Xuansong Xie
In autonomous driving, real-time perception, also called streaming perception, remains under-explored. This paper introduces DAMO-StreamNet, a framework that combines recent advances from the YOLO series with a detailed analysis of spatial and temporal perception mechanisms. Its key innovations are: (1) a robust neck structure that uses deformable convolution to strengthen the receptive field and feature alignment; (2) a dual-branch structure that fuses short-path semantic features with long-path temporal features, improving the accuracy of motion state prediction; (3) logits-level distillation, which aligns the logits of the teacher and student networks in semantic space for efficient optimization; and (4) a real-time prediction mechanism that updates the features of support frames with the current frame, providing smooth streaming perception during inference. Experiments show that DAMO-StreamNet surpasses existing state-of-the-art methods, achieving 37.8% sAP at the normal input size (600, 960) and 43.3% sAP at the large input size (1200, 1920) without requiring additional data. This work sets a new benchmark for real-time perception and offers insights for future research. The source code is available at https://github.com/zhiqic/DAMO-StreamNet.
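The fourth point, updating support-frame features with the current frame, can be illustrated with a small cache. This is a minimal sketch of one possible reading of the abstract, not the authors' implementation: the class name `SupportFeatureCache`, the `num_support` parameter, and the cold-start padding rule are all assumptions introduced for illustration. Features are treated as opaque objects, so the sketch only shows how per-frame features might be reused across time steps instead of being recomputed.

```python
from collections import deque


class SupportFeatureCache:
    """Hypothetical cache of per-frame features for streaming inference.

    Assumption: each frame's features are computed once and then reused
    as support-frame features at later time steps.
    """

    def __init__(self, num_support):
        if num_support < 1:
            raise ValueError("num_support must be at least 1")
        self.num_support = num_support
        # deque with maxlen drops the oldest entry automatically.
        self._cache = deque(maxlen=num_support)

    def step(self, current_features):
        """Return (current, supports), then store current for later steps."""
        if not self._cache:
            # Cold start (assumed rule): no history yet, so the current
            # features stand in for every support frame.
            supports = [current_features] * self.num_support
        else:
            # Pad with the oldest cached entry until the cache is full.
            missing = self.num_support - len(self._cache)
            supports = [self._cache[0]] * missing + list(self._cache)
        self._cache.append(current_features)
        return current_features, supports
```

In a real pipeline the stored objects would be the feature maps of each frame, and `supports` would feed the long-path temporal branch while `current_features` feeds the short-path semantic branch; that wiring is likewise an assumption here.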