Arrow Research search

Author name cluster

Zewei Zhou

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
2 author rows

Possible papers

5

AAAI Conference 2026 Conference Paper

Driving with Regulation: Trustworthy and Interpretable Decision-Making for Autonomous Driving with Retrieval-Augmented Reasoning

  • Tianhui Cai
  • Yifan Liu
  • Zewei Zhou
  • Haoxuan Ma
  • Seth Z. Zhao
  • Zhiwen Wu
  • Xu Han
  • Zhiyu Huang

Understanding and adhering to traffic regulations is essential for autonomous vehicles to ensure safety and trustworthiness. However, traffic regulations are complex, context-dependent, and differ between regions, posing a major challenge to conventional rule-based decision-making approaches. We present an interpretable, regulation-aware decision-making framework, DriveReg, which enables autonomous vehicles to understand and adhere to region-specific traffic laws and safety guidelines. The framework integrates a Retrieval Augmented Generation (RAG)-based Traffic Regulation Retrieval Agent, which retrieves relevant rules from regulatory documents based on the current situation, and a Large Language Model (LLM)-powered Reasoning Agent that evaluates actions for legal compliance and safety. Our design emphasizes interpretability to enhance transparency and trustworthiness. To support systematic evaluation, we introduce DriveReg Scenarios Dataset, a comprehensive dataset of driving scenarios across Boston, Singapore, and Los Angeles, with both hypothesized text-based cases and real-world driving data, specifically constructed and annotated to evaluate models’ capacity for regulation understanding and reasoning. We validate our framework on the DriveReg Scenarios Dataset and real-world deployment, demonstrating strong performance and robustness across diverse environments.

NeurIPS Conference 2025 Conference Paper

AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning

  • Zewei Zhou
  • Tianhui Cai
  • Seth Zhao
  • Yun Zhang
  • Zhiyu Huang
  • Bolei Zhou
  • Jiaqi Ma

Recent advancements in Vision-Language-Action (VLA) models have shown promise for end-to-end autonomous driving by leveraging world knowledge and reasoning capabilities. However, current VLA models often struggle with physically infeasible action outputs, complex model structures, or unnecessarily long reasoning. In this paper, we propose AutoVLA, a novel VLA model that unifies reasoning and action generation within a single autoregressive generation model for end-to-end autonomous driving. AutoVLA performs semantic reasoning and trajectory planning directly from raw visual inputs and language instructions. We tokenize continuous trajectories into discrete, feasible actions, enabling direct integration into the language model. For training, we employ supervised fine-tuning to equip the model with dual thinking modes: fast thinking (trajectory-only) and slow thinking (enhanced with chain-of-thought reasoning). To further enhance planning performance and efficiency, we introduce a reinforcement fine-tuning method based on Group Relative Policy Optimization (GRPO), reducing unnecessary reasoning in straightforward scenarios. Extensive experiments across real-world and simulated datasets and benchmarks, including nuPlan, nuScenes, Waymo, and CARLA, demonstrate the competitive performance of AutoVLA in both open-loop and closed-loop settings. Qualitative results showcase the adaptive reasoning and accurate planning capabilities of AutoVLA in diverse scenarios.

ICRA Conference 2025 Conference Paper

Co-MTP: A Cooperative Trajectory Prediction Framework with Multi-Temporal Fusion for Autonomous Driving

  • Xinyu Zhang
  • Zewei Zhou
  • Zhaoyi Wang
  • Yangjie Ji
  • Yanjun Huang
  • Hong Chen

Vehicle-to-everything technologies (V2X) have become an ideal paradigm to extend the perception range and see through the occlusion. Exiting efforts focus on single-frame cooperative perception, however, how to capture the temporal cue between frames with V2X to facilitate the prediction task even the planning task is still underexplored. In this paper, we introduce the Co-MTP, a general cooperative trajectory prediction framework with multi-temporal fusion for autonomous driving, which leverages the V2X system to fully capture the interaction among agents in both history and future domains to benefit the planning. In the history domain, V2X can complement the incomplete history trajectory in single-vehicle perception, and we design a heterogeneous graph transformer to learn the fusion of the history feature from multiple agents and capture the history interaction. Moreover, the goal of prediction is to support future planning. Thus, in the future domain, V2X can provide the prediction results of surrounding objects, and we further extend the graph transformer to capture the future interaction among the ego planning and the other vehicles' intentions and obtain the final future scenario state under a certain planning action. We evaluate the Co-MTP framework on the real-world dataset V2X-Seq, and the results show that Co-MTP achieves state-of-the-art performance and that both history and future fusion can greatly benefit prediction. Our code is available on our project website: https://xiaomiaozhang.github.io/Co-MTP/

IROS Conference 2025 Conference Paper

CooperRisk: A Driving Risk Quantification Pipeline with Multi-Agent Cooperative Perception and Prediction

  • Mingyue Lei
  • Zewei Zhou
  • Hongchen Li
  • Jia Hu 0003
  • Jiaqi Ma 0003

Risk quantification is a critical component of safe autonomous driving, however, constrained by the limited perception range and occlusion of single-vehicle systems in complex and dense scenarios. Vehicle-to-everything (V2X) paradigm has been a promising solution to sharing complementary perception information, nevertheless, how to ensure the risk interpretability while understanding multi-agent interaction with V2X remains an open question. In this paper, we introduce the first V2X-enabled risk quantification pipeline, CooperRisk, to fuse perception information from multiple agents and quantify the scenario driving risk in future multiple timestamps. The risk is represented as a scenario risk map to ensure interpretability based on risk severity and exposure, and the multi-agent interaction is captured by the learning-based cooperative prediction model. We carefully design a risk-oriented transformer-based prediction model with multi-modality and multi-agent considerations. It aims to ensure scene-consistent future behaviors of multiple agents and avoid conflicting predictions that could lead to overly conservative risk quantification and cause the ego vehicle to become overly hesitant to drive. Then, the temporal risk maps could serve to guide a model predictive control planner. We evaluate the CooperRisk pipeline in a real-world V2X dataset V2XPnP, and the experiments demonstrate its superior performance in risk quantification, showing a 44. 35% decrease in conflict rate between the ego vehicle and background traffic participants.

ICRA Conference 2023 Conference Paper

Noise and Environmental Justice in Drone Fleet Delivery Paths: A Simulation-Based Audit and Algorithm for Fairer Impact Distribution

  • Zewei Zhou
  • Martim Brandão

Despite the growing interest in the use of drone fleets for delivery of food and parcels, the negative impact of such technology is still poorly understood. In this paper we investigate the impact of such fleets in terms of noise pollution and environmental justice. We use simulation with real population data to analyze the spatial distribution of noise, and find that: 1) noise increases rapidly with fleet size; and 2) drone fleets can produce noise hotspots that extend far beyond warehouses or charging stations, at levels that lead to annoyance and interference of human activities. This, we will show, leads to concerns of fairness of noise distribution. We then propose an algorithm that successfully balances the spatial distribution of noise across the city, and discuss the limitations of such purely technical approaches. We complement the work with a discussion of environmental justice, showing how careless UAV fleet development and regulation can lead to reinforcing well-being deficiencies of poor and marginalized communities.