Author name cluster

Hanming Deng

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers

1 author row

NeurIPS Conference 2025 Conference Paper

HCRMP: An LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving

Zhiwen Chen
Hanming Deng
Zhuoren Li
Huanxi Wen
Guizhe Jin
Ran Yu
Bo Leng

Integrating the understanding and reasoning capabilities of Large Language Models (LLMs) with the self-learning capabilities of Reinforcement Learning (RL) enables more reliable driving performance under complex driving conditions. Much prior work in autonomous driving motion planning explores LLM-Dominated RL methods. These methods, which use the LLM to directly generate policies or to provide decisive instructions during the RL agent's policy learning, are centrally characterized by an over-reliance on LLM outputs. However, LLM outputs are susceptible to hallucinations. Evaluations show that a state-of-the-art LLM achieves a non-hallucination rate of only approximately 57.95% when assessed on essential driving-related tasks. Thus, in these methods, hallucinations from the LLM can directly jeopardize the performance of driving policies. This paper argues that maintaining relative independence between the LLM and the RL agent is vital for solving the hallucination problem. Consequently, this paper proposes a novel LLM-Hinted RL paradigm. The LLM is used to generate semantic hints for state augmentation and policy optimization to assist the RL agent in motion planning, while the RL agent counteracts potentially erroneous semantic indications through policy learning to achieve excellent driving performance. Based on this paradigm, we propose the HCRMP (LLM-Hinted Contextual Reinforcement Learning Motion Planner) architecture, which comprises three modules: ① an Augmented Semantic Representation Module that extends the state space; ② a Contextual Stability Anchor Module that enhances the reliability of multi-critic weight hints by utilizing information from the knowledge base; and ③ a Semantic Cache Module that seamlessly integrates low-frequency LLM guidance with high-frequency RL control. Extensive experiments in CARLA validate HCRMP's strong overall driving performance. HCRMP achieves a task success rate of up to 80.3% under diverse driving conditions with different traffic densities. Under safety-critical driving conditions, HCRMP significantly reduces the collision rate by 11.4%, which effectively improves driving performance in complex scenarios.


NeurIPS Conference 2025 Conference Paper

NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints

Changyao Tian
Hao Li
Gen Luo
Xizhou Zhu
Weijie Su
Hanming Deng
Jinguo Zhu
Jie Shao

Compositional training has been the de facto paradigm in existing Multimodal Large Language Models (MLLMs), where pre-trained vision encoders are connected with pre-trained LLMs through continuous multimodal pre-training. However, the multimodal scaling property of this paradigm remains difficult to explore due to the separated training. In this paper, we focus on the native training of MLLMs in an end-to-end manner and systematically study its design space and scaling property under a practical setting, i.e., data constraints. Through careful study of various design choices in MLLMs, we obtain the optimal meta-architecture that best balances performance and training cost. We then further explore the scaling properties of the native MLLM and identify a positively correlated scaling relationship between visual encoders and LLMs. Based on these findings, we propose a native MLLM called NaViL, combined with a simple and cost-effective recipe. Experimental results on 14 multimodal benchmarks confirm the competitive performance of NaViL against existing MLLMs. In addition, our findings and results provide in-depth insights for the future study of native MLLMs.


AAAI Conference 2020 Conference Paper

Reinforcing Neural Network Stability with Attractor Dynamics

Hanming Deng
Yang Hua
Tao Song
Zhengui Xue
Ruhui Ma
Neil Robertson
Haibing Guan

Recent approaches interpret deep neural networks (DNNs) as dynamical systems, drawing the connection between stability in forward propagation and generalization of DNNs. In this paper, we take a step further and are the first to reinforce this stability of DNNs without changing their original structure, and we verify the impact of the reinforced stability on the network representation from various aspects. More specifically, we reinforce stability by modeling the attractor dynamics of a DNN and propose the relu-max attractor network (RMAN), a lightweight module ready to be deployed on state-of-the-art ResNet-like networks. RMAN is needed only during training, where it modifies a ResNet’s attractor dynamics by minimizing an energy function together with the loss of the original learning task. Through intensive experiments, we show that RMAN-modified attractor dynamics bring a more structured representation space to ResNet and its variants and, more importantly, improve the generalization ability of ResNet-like networks in supervised tasks due to reinforced stability.
