Author name cluster

Junbo Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

10 papers

2 author rows

IROS Conference 2025 Conference Paper

A Coarse-to-Fine Approach to Multi-Modality 3D Occupancy Grounding

Zhan Shi
Song Wang 0019
Junbo Chen
Jianke Zhu

Visual grounding aims at identifying objects or regions in a scene based on natural language descriptions, which is essential for spatially aware perception in autonomous driving. However, existing visual grounding tasks typically depend on bounding boxes that often fail to capture fine-grained details. Not all voxels within a bounding box are occupied, resulting in inaccurate object representations. To address this, we introduce a benchmark for 3D occupancy grounding in challenging outdoor scenes. Built on the nuScenes dataset, it fuses natural language with voxel-level occupancy annotations, offering more precise object perception compared to the traditional grounding task. Moreover, we propose GroundingOcc, an end-to-end model designed for 3D occupancy grounding through multimodal learning. It combines visual, textual, and point cloud features to predict object location and occupancy information from coarse to fine. Specifically, GroundingOcc comprises a multimodal encoder for feature extraction, an occupancy head for voxel-wise predictions, and a grounding head for refining localization. Additionally, a 2D grounding module and a depth estimation module enhance geometric understanding, thereby boosting model performance. Extensive experiments on the benchmark demonstrate that our method outperforms existing baselines on 3D occupancy grounding. The dataset is available at https://github.com/RONINGOD/GroundingOcc.


IROS Conference 2025 Conference Paper

MambaMap: Online Vectorized HD Map Construction using State Space Model

Ruizi Yang
Xiaolu Liu
Junbo Chen
Jianke Zhu

High-definition (HD) maps are essential for autonomous driving, as they provide precise road information for downstream tasks. Recent advances highlight the potential of temporal modeling in addressing challenges like occlusions and extended perception range. However, existing methods either fail to fully exploit temporal information or incur substantial computational overhead in handling extended sequences. To tackle these challenges, we propose MambaMap, a novel framework that efficiently fuses long-range temporal features in the state space to construct online vectorized HD maps. Specifically, MambaMap incorporates a memory bank to store and utilize information from historical frames, dynamically updating BEV features and instance queries to improve robustness against noise and occlusions. Moreover, we introduce a gating mechanism in the state space that selectively integrates dependencies of map elements with high computational efficiency. In addition, we design innovative multi-directional and spatial-temporal scanning strategies to enhance feature extraction at both BEV and instance levels. These strategies significantly boost the prediction accuracy of our approach while ensuring robust temporal consistency. Extensive experiments on the nuScenes and Argoverse2 datasets demonstrate that our proposed MambaMap approach outperforms state-of-the-art methods across various splits and perception ranges. Source code will be available at https://github.com/ZiziAmy/MambaMap.


IJCAI Conference 2025 Conference Paper

Reliable and Calibrated Semantic Occupancy Prediction by Hybrid Uncertainty Learning

Song Wang
Zhongdao Wang
Jiawei Yu
Wentong Li
Bailan Feng
Junbo Chen
Jianke Zhu

Vision-centric semantic occupancy prediction plays a crucial role in autonomous driving, which requires accurate and reliable predictions from low-cost sensors. Although camera-based methods have notably narrowed the accuracy gap with LiDAR, little research has explored reliability and calibration in predicting semantic occupancy from cameras. In this paper, we conduct a comprehensive evaluation of existing semantic occupancy prediction models from a reliability perspective for the first time. Despite the gradual alignment of camera-based models with LiDAR in terms of accuracy, a significant reliability gap still persists. To address this concern, we propose ReliOcc, a method designed to enhance the reliability of camera-based occupancy networks. ReliOcc provides a plug-and-play scheme for existing models, which integrates hybrid uncertainty from individual voxels with sampling-based noise and relative voxels through mix-up learning. Besides, an uncertainty-aware calibration strategy is devised to further improve model reliability in offline mode. Extensive experiments under various settings demonstrate that ReliOcc significantly enhances the reliability of the learned model while maintaining the accuracy of both geometric and semantic predictions. Notably, our proposed approach exhibits robustness to sensor failures and out-of-domain noise during inference.
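One common way to quantify the calibration discussed in this abstract is the expected calibration error (ECE). The sketch below is an illustration under assumptions, not ReliOcc's evaluation code: the abstract does not name its metric, and the function name, the equal-width binning, and the toy inputs are hypothetical.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin-weighted mean gap between accuracy and mean confidence.

    confidences: predicted probabilities in [0, 1], one per voxel (assumed).
    correct:     1 if the prediction was right, 0 otherwise.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Each bin accumulates [sample count, summed confidence, summed correctness].
    bins = [[0, 0.0, 0] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx][0] += 1
        bins[idx][1] += conf
        bins[idx][2] += ok

    ece = 0.0
    for count, conf_sum, correct_sum in bins:
        if count:
            gap = abs(correct_sum / count - conf_sum / count)
            ece += (count / total) * gap
    return ece


# Toy example: four predictions at 90% confidence, three of them correct.
toy_ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
```

A lower value means predicted confidences track observed accuracy more closely; a perfectly calibrated model would score 0 on this measure.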


IJCAI Conference 2024 Conference Paper

HVOFusion: Incremental Mesh Reconstruction Using Hybrid Voxel Octree

Shaofan Liu
Junbo Chen
Jianke Zhu

Incremental scene reconstruction is essential to navigation in robotics. Most conventional methods use either a TSDF (truncated signed distance function) volume or neural networks to implicitly represent the surface. Because of the voxel representation or the time-consuming sampling involved, they have difficulty balancing speed, memory storage, and surface quality. In this paper, we propose a novel hybrid voxel-octree approach that effectively fuses octree and voxel structures so that we can take advantage of both implicit surface and explicit triangular mesh representations. This sparse structure preserves triangular faces in the leaf nodes and produces partial meshes sequentially for incremental reconstruction. This storage scheme allows us to naturally optimize the mesh in explicit 3D space to achieve higher surface quality. We iteratively deform the mesh towards the target and recover vertex colors by optimizing a shading model. Experimental results on several datasets show that our proposed approach is capable of quickly and accurately reconstructing a scene with realistic colors. Code is available at https://github.com/Frankuzi/HVOFusion.


IJCAI Conference 2024 Conference Paper

Label-efficient Semantic Scene Completion with Scribble Annotations

Song Wang
Jiawei Yu
Wentong Li
Hao Shi
Kailun Yang
Junbo Chen
Jianke Zhu

Semantic scene completion aims to infer 3D geometric structures with semantic classes from camera or LiDAR input, which provides essential occupancy information in autonomous driving. Prior endeavors concentrate on constructing the network or benchmark in a fully supervised manner. However, dense occupancy grids need point-wise semantic annotations, which incur expensive and tedious labeling costs. In this paper, we build a new label-efficient benchmark, named ScribbleSC, where sparse scribble-based semantic labels are combined with dense geometric labels for semantic scene completion. In particular, we propose a simple yet effective approach called Scribble2Scene, which bridges the gap between sparse scribble annotations and full supervision. Our method consists of geometric-aware auto-labeler construction and online model training with an offline-to-online distillation module to enhance performance. Experiments on SemanticKITTI demonstrate that Scribble2Scene achieves competitive performance against its fully supervised counterparts, showing 99% of the performance of the fully supervised models with only 13.5% of voxels labeled. Both the annotations of ScribbleSC and our full implementation are available at https://github.com/songw-zju/Scribble2Scene.


ICRA Conference 2023 Conference Paper

FLYOVER: A Model-Driven Method to Generate Diverse Highway Interchanges for Autonomous Vehicle Testing

Yuan Zhou 0005
Gengjie Lin
Yun Tang 0003
Kairui Yang
Wei Jing
Ping Zhang
Junbo Chen
Liang Gong

It has become a consensus that autonomous vehicles (AVs) will first be widely deployed on highways. However, the complexity of highway interchanges has become the bottleneck for their deployment. An AV should be sufficiently tested under different highway interchanges, which is still challenging due to the lack of available datasets containing diverse highway interchanges. In this paper, we propose a model-driven method, Flyover, to generate a dataset of diverse interchanges with measurable diversity coverage. First, Flyover uses a labeled digraph to model interchange topology. Second, Flyover takes real-world interchanges as input to guarantee topology practicality and extracts different topology equivalence classes by classifying the corresponding topology models. Third, for each topology class, Flyover identifies the corresponding geometrical features for the ramps and generates concrete interchanges using k-way combinatorial coverage and differential evolution. To illustrate the diversity and applicability of the generated interchange dataset, we test the built-in traffic flow control algorithm in SUMO and the fuel-optimization trajectory tracking algorithm deployed to Alibaba's autonomous trucks on the dataset. The results show that, in addition to their geometrical differences, the interchanges are diverse in throughput and fuel consumption under the traffic flow control and trajectory tracking algorithms, respectively.
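The k-way combinatorial coverage mentioned in this abstract can be illustrated with a small coverage checker. This is a minimal sketch under assumptions, not Flyover's generator: the feature names (`ramp_count`, `curvature`, `lanes`), their values, and the function name are hypothetical, and the code only measures coverage of a given test set rather than constructing one.

```python
from itertools import combinations, product


def k_way_coverage(parameters, test_cases, k=2):
    """Fraction of all k-way value combinations hit by at least one test case.

    parameters: dict mapping a feature name to its list of discrete values.
    test_cases: list of dicts assigning one value to every feature.
    """
    names = sorted(parameters)

    # Every value combination that full k-way coverage would require.
    required = set()
    for group in combinations(names, k):
        for values in product(*(parameters[name] for name in group)):
            required.add((group, values))
    if not required:
        raise ValueError("k must not exceed the number of parameters")

    # Value combinations actually exercised by the test cases.
    covered = set()
    for case in test_cases:
        for group in combinations(names, k):
            covered.add((group, tuple(case[name] for name in group)))

    return len(covered & required) / len(required)


# Hypothetical ramp features, two values each.
features = {
    "ramp_count": [1, 2],
    "curvature": ["low", "high"],
    "lanes": [1, 2],
}

# Four cases that cover every pair of values (a pairwise covering array).
pairwise_cases = [
    {"ramp_count": 1, "curvature": "low", "lanes": 1},
    {"ramp_count": 1, "curvature": "high", "lanes": 2},
    {"ramp_count": 2, "curvature": "low", "lanes": 2},
    {"ramp_count": 2, "curvature": "high", "lanes": 1},
]

full = k_way_coverage(features, pairwise_cases, k=2)
partial = k_way_coverage(features, pairwise_cases[:1], k=2)
```

Here four cases reach full pairwise coverage, whereas exhaustive enumeration of the three binary features would need eight; that saving is the usual motivation for k-way designs.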


IROS Conference 2022 Conference Paper

Cola-HRL: Continuous-Lattice Hierarchical Reinforcement Learning for Autonomous Driving

Lingping Gao
Ziqing Gu
Cong Qiu
Lanxin Lei
Shengbo Eben Li
Sifa Zheng
Wei Jing
Junbo Chen

Reinforcement learning (RL) has shown promising performance in autonomous driving applications in recent years. The early end-to-end RL method is usually unexplainable and fails to generate stable actions, while the hierarchical RL (HRL) method can tackle the above issues by dividing complex problems into multiple sub-tasks. Prior HRL works either select discrete driving behaviors with continuous control commands, or generate expected goals for the low-level controller. However, they typically have strong scenario dependence or fail to generate goals with good quality. To address the above challenges, we propose a Continuous-Lattice Hierarchical RL (Cola-HRL) method for autonomous driving tasks to make high-quality decisions in various scenarios. We utilize the continuous-lattice module to generate reasonable goals, ensuring temporal and spatial reachability. Then, we train and evaluate our method under different traffic scenarios based on real-world High Definition maps. Experimental results show our method can handle multiple scenarios. In addition, our method also demonstrates better performance and driving behaviors compared to existing RL methods.


ICRA Conference 2022 Conference Paper

Domain Generalization for Vision-based Driving Trajectory Generation

Yunkai Wang
Dongkun Zhang
Yuxiang Cui
Zexi Chen
Wei Jing
Junbo Chen
Rong Xiong
Yue Wang 0020

One of the challenges in vision-based driving trajectory generation is dealing with out-of-distribution scenarios. In this paper, we propose a domain generalization method for vision-based driving trajectory generation for autonomous vehicles in urban environments, which can be seen as a way to extend the Invariant Risk Minimization (IRM) method to complex problems. We leverage an adversarial learning approach to train a trajectory generator as the decoder. Based on the pre-trained decoder, we infer the latent variables corresponding to the trajectories, and pre-train the encoder by regressing the inferred latent variables. Finally, we fix the decoder but fine-tune the encoder with the final trajectory loss. We compare our proposed method with the state-of-the-art trajectory generation method and some recent domain generalization methods on both datasets and simulation, demonstrating that our method has better generalization ability. Our project is available at https://sites.google.com/view/dg-traj-gen.


ICRA Conference 2022 Conference Paper

Learning Observation-Based Certifiable Safe Policy for Decentralized Multi-Robot Navigation

Yuxiang Cui
Longzhong Lin
Xiaolong Huang
Dongkun Zhang
Yunkai Wang
Wei Jing
Junbo Chen
Rong Xiong

Safety is of great importance in multi-robot navigation problems. In this paper, we propose a control barrier function (CBF) based optimizer that ensures robot safety with both high probability and flexibility, using only sensor measurements. The optimizer takes action commands from the policy network as initial values and refines them to drive potentially dangerous ones back into safe regions. With the help of a deep world model that predicts the evolution of surrounding dynamics and the consequences of different actions, the CBF module can guide the optimization within a reasonable time horizon. We also present a novel joint training framework that improves the cooperation between the Reinforcement Learning (RL) based policy and the CBF-based optimizer by utilizing reward feedback from the CBF module. We observe that our policy can achieve a higher success rate while maintaining the safety of multiple robots in significantly fewer episodes. Experiments are conducted in multiple scenarios both in simulation and the real world, and the results demonstrate the effectiveness of our method in maintaining the safety of multiple robots. Code is available at https://github.com/YuxiangCui/MARL-OCBF.
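The idea of refining a nominal action so that it stays inside a safe region can be illustrated with a textbook one-dimensional control barrier function. This sketch is an assumption-laden toy, not the paper's optimizer: it uses a known single-integrator model x' = u and a fixed wall at `x_max` instead of a learned world model and sensor observations, and all names and parameter values are hypothetical.

```python
def cbf_filter(u_nominal, x, x_max, alpha=1.0):
    """Minimally modify u_nominal so the barrier h(x) = x_max - x stays >= 0.

    For x' = u, the CBF condition dh/dt >= -alpha * h reduces to
    u <= alpha * (x_max - x), so the closest safe action is a simple clamp.
    """
    h = x_max - x
    return min(u_nominal, alpha * h)


def simulate(x0, u_nominal, x_max, alpha=1.0, dt=0.01, steps=1000):
    """Roll out the filtered controller with forward Euler integration."""
    x = x0
    for _ in range(steps):
        x += cbf_filter(u_nominal, x, x_max, alpha) * dt
    return x


# A policy that always pushes toward the wall at x_max = 1.0.
final_x = simulate(x0=0.0, u_nominal=2.0, x_max=1.0)
```

With `alpha * dt <= 1`, the discrete update shrinks the barrier value by at most a factor of `1 - alpha * dt` per step, so the state approaches the wall without crossing it; actions that already satisfy the bound pass through unchanged.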


IROS Conference 2021 Conference Paper

KB-Tree: Learnable and Continuous Monte-Carlo Tree Search for Autonomous Driving Planning

Lanxin Lei
Ruiming Luo
Renjie Zheng
Jingke Wang
Jianwei Zhang
Cong Qiu
Liulong Ma
Liyang Jin

In this paper, we present a novel learnable and continuous Monte-Carlo Tree Search method, named KB-Tree, for motion planning in autonomous driving. The proposed method utilizes an asymptotical PUCB based on Kernel Regression (KR-AUCB) as a novel UCB variant to improve exploitation and exploration performance. In addition, we further optimize the sampling in continuous space by adapting Bayesian Optimization (BO) in the selection process of MCTS. Moreover, we use a customized Graph Neural Network (GNN) as our feature extractor to improve the learning performance. To the best of our knowledge, we are the first to apply the continuous MCTS method in autonomous driving. To validate our method, we conduct extensive experiments under several weakly and strongly interactive scenarios. The results show that our proposed method performs well in all tasks, and outperforms the learning-based continuous MCTS method and the state-of-the-art Reinforcement Learning (RL) baseline.
