EAAI Journal 2026 Journal Article
Asynchronous multithreading reinforcement learning with attention-based significance measurement for collision-free robot navigation
- Chao Sun
- Jiang Wang
- Xing Wu
- Chaoxu Mu
- Changyin Sun
Collision avoidance is a crucial technique for safe and efficient robotic vehicle navigation in unknown environments. However, unpredictable moving obstacles in dynamic scenarios usually increase the difficulty and complexity of collision avoidance for robotic vehicles. To enhance the stability of collision avoidance and boost its adaptability to uncertain dynamic scenes, a new attention-based significance measurement actor–critic (ASMAC) architecture is proposed. It is an end-to-end robot navigation model that uses imperfect local observations to directly plan precise collision-free motion commands. First, a significance-measured rollout replay buffer (SMRR) is presented to categorize experiences into different pools. It prevents the overfitting or bias that may result from repeatedly sampling experiences of a certain type during policy learning. Then, we enhance the traditional actor–critic network by integrating a multi-head local attention module to extract local information at the entity level. This way, the collision avoidance system can focus on key environmental features, making its computation more lightweight and its response to dynamic changes in the environment swifter. In addition, a multi-step lookahead prediction (MLP) reward function is designed in the ASMAC-based reinforcement learning (RL) framework to prevent the generation of unnatural, intrusive, and short-sighted motion decisions. Finally, the asynchronous multithreading (AM) mechanism and the proximal policy optimization (PPO) algorithm are extended to the ASMAC model to offload expensive online computation to an offline training process, enhancing exploration efficiency in navigation policy learning for robotic vehicles. Extensive simulation and real-world physical experiments show that our method can generate time-efficient and collision-free guide paths in complex dynamic scenes, successfully dodging collisions while moving towards the goal.
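The idea of sorting experiences into pools and sampling across them can be illustrated with a minimal Python sketch. It is not the authors' SMRR implementation: the abstract does not say how significance is measured or how many pools are used, so the scalar `significance` score, the `thresholds` cut points, the class name `PooledReplayBuffer`, and the equal-share sampling rule are all assumptions made for illustration.

```python
import random
from collections import deque


class PooledReplayBuffer:
    """Hypothetical sketch: bin experiences into pools by a scalar significance score."""

    def __init__(self, thresholds=(0.33, 0.66), capacity_per_pool=1000, seed=0):
        # The thresholds are assumed cut points, not values taken from the paper.
        self.thresholds = tuple(sorted(thresholds))
        self.pools = [
            deque(maxlen=capacity_per_pool)
            for _ in range(len(self.thresholds) + 1)
        ]
        self.rng = random.Random(seed)

    def pool_index(self, significance):
        # Return the index of the first threshold the score falls below;
        # scores at or above every threshold go to the last pool.
        for i, threshold in enumerate(self.thresholds):
            if significance < threshold:
                return i
        return len(self.thresholds)

    def add(self, experience, significance):
        # Oldest entries are dropped automatically once a pool is full.
        self.pools[self.pool_index(significance)].append(experience)

    def sample(self, batch_size):
        # Draw near-equal shares (with replacement) from every non-empty pool,
        # so a rare experience type is not drowned out by a common one.
        non_empty = [pool for pool in self.pools if pool]
        if not non_empty:
            return []
        share, extra = divmod(batch_size, len(non_empty))
        batch = []
        for i, pool in enumerate(non_empty):
            k = share + (1 if i < extra else 0)
            batch.extend(self.rng.choices(list(pool), k=k))
        return batch
```

With 100 low-significance experiences and only 2 high-significance ones stored, a batch of 10 drawn this way still contains 5 from each pool, whereas uniform sampling over a single buffer would draw almost entirely from the common type.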