Arrow Research search

Author name cluster

Han Zheng

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers
2 author rows

Possible papers


EAAI Journal 2026 Journal Article

A physics-informed neural network approach for marine turbocharger performance evaluation in ocean-going vessels: Under incomplete parameter conditions

  • Tianfeng Fang
  • Han Zheng
  • Yu Hong
  • Xinbo Zhu
  • Yifan Liu

Marine turbochargers are critical for improving fuel efficiency and reducing emissions in maritime transport, yet evaluating their performance under dynamic ocean conditions remains challenging due to incomplete sensor data and limitations of conventional models. While physics-based approaches lack adaptability, purely data-driven methods require complete datasets and may violate thermodynamic consistency. This study proposes a physics-informed neural network (PINN) framework that embeds mean-value thermodynamic equations into the loss function, integrating multi-source operational data with physical constraints. The method employs a two-stage progressive training strategy with adaptive weighting and a novel gradient coordination mechanism to balance competing objectives and ensure stable, thermodynamically consistent learning. Validated on 2.1 million data points from a 300,000 DWT bulk carrier, the PINN achieves high accuracy in predicting key turbocharger performance indicators, significantly outperforming traditional models in bench tests and maintaining robust performance even under incomplete parameter conditions. This framework bridges physics-based modeling and deep learning, enabling robust turbocharger evaluation in dynamic environments, advancing condition monitoring, and supporting condition-based maintenance of ship propulsion systems.
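As a rough illustration of what embedding physical equations into a loss function can look like, the sketch below combines a data term computed only over available sensor readings with a penalty on physics-equation residuals. It is a minimal sketch under stated assumptions, not the paper's method: the function names, the use of `None` to mark a missing reading, and the fixed `physics_weight` are all hypothetical, and the abstract's adaptive weighting and gradient coordination mechanism are not reproduced.

```python
def masked_mse(pred, target):
    """Mean squared error over entries whose target is available.

    Assumption: a missing sensor reading is encoded as None.
    """
    pairs = [(p, t) for p, t in zip(pred, target) if t is not None]
    if not pairs:
        return 0.0
    return sum((p - t) ** 2 for p, t in pairs) / len(pairs)


def physics_informed_loss(pred, target, residuals, physics_weight=1.0):
    """Data loss plus a weighted penalty on physics residuals.

    `residuals` holds the values of the governing equations evaluated at
    the model's predictions; each is zero when the equation is satisfied.
    The fixed `physics_weight` is a placeholder for illustration only.
    """
    data_term = masked_mse(pred, target)
    physics_term = (
        sum(r ** 2 for r in residuals) / len(residuals) if residuals else 0.0
    )
    return data_term + physics_weight * physics_term


# Second reading is missing, so only the first and third contribute to the data term.
loss = physics_informed_loss(
    pred=[1.0, 2.0, 3.0],
    target=[1.0, None, 5.0],
    residuals=[0.5, -0.5],
    physics_weight=2.0,
)
```

Masking the data term is one simple way to keep training defined when some parameters are unobserved; the physics term then constrains the predictions that have no direct supervision.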

AAAI Conference 2026 Conference Paper

CP-Router: An Uncertainty-Aware Router Between LLM and LRM

  • Jiayuan Su
  • Fulin Lin
  • Zhaopeng Feng
  • Han Zheng
  • Teng Wang
  • Zhenyu Xiao
  • Xinlong Zhao
  • Zuozhu Liu

Recent advances in large reasoning models (LRMs) have significantly enhanced long-chain reasoning capabilities over standard large language models (LLMs). However, LRMs often produce unnecessarily lengthy outputs even for simple queries, leading to inefficiencies or even accuracy degradation compared to LLMs. To address this, we propose CP-Router, a training-free, model-agnostic routing framework that dynamically selects between an LLM and an LRM, demonstrated with multiple-choice question answering (MCQA) prompts. The routing decision is guided by the prediction uncertainty estimates derived via Conformal Prediction (CP), which provides rigorous coverage guarantees. To improve uncertainty differentiation across inputs, we introduce Full and Binary Entropy (FBE), a novel entropy-based criterion that adaptively selects the appropriate CP threshold. Experiments across MCQA and QA benchmarks—including mathematics, logical reasoning, and Chinese chemistry—demonstrate that CP-Router efficiently reduces token usage while maintaining or even improving accuracy compared to using LRM alone. We further demonstrate the generality and robustness of CP-Router by extending it to diverse model pairings beyond the LLM–LRM setting.
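The routing idea can be sketched with split conformal prediction over multiple-choice option probabilities. This is a minimal sketch under stated assumptions, not the CP-Router implementation: the nonconformity score (one minus the probability of the true option), the rule "route to the LLM only when the prediction set contains exactly one option", and all function names are hypothetical, and the FBE criterion for choosing the threshold is not reproduced.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split-conformal quantile of the scores 1 - p(true option)."""
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank, clipped to the largest calibration score.
    k = min(math.ceil((n + 1) * (1 - alpha)), n)
    return scores[k - 1]


def prediction_set(probs, qhat):
    """Options whose nonconformity score does not exceed the threshold."""
    return [i for i, p in enumerate(probs) if 1.0 - p <= qhat]


def route(probs, qhat):
    """Assumed rule: a singleton set means the LLM is confident enough."""
    return "LLM" if len(prediction_set(probs, qhat)) == 1 else "LRM"


# Toy calibration data: option probabilities from the LLM and the true option index.
cal_probs = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]
cal_labels = [0, 0, 1, 0]
qhat = conformal_threshold(cal_probs, cal_labels, alpha=0.2)

confident = route([0.95, 0.05], qhat)
uncertain = route([0.55, 0.45], qhat)
```

Because the threshold is computed from held-out calibration data only, this kind of router needs no additional training, which matches the training-free framing in the abstract.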

JAIR Journal 2026 Journal Article

Learning-guided Prioritized Planning for Lifelong Multi-Agent Path Finding in Warehouse Automation

  • Han Zheng
  • Yining Ma
  • Brandon Araki
  • Jingkai Chen
  • Cathy Wu

Lifelong Multi-Agent Path Finding (MAPF) is critical for modern warehouse automation, which requires multiple robots to continuously navigate conflict-free paths to optimize the overall system throughput. However, the complexity of warehouse environments and the long-term dynamics of lifelong MAPF often demand costly adaptations to classical search-based solvers. While machine learning methods have been explored, their superiority over search-based methods remains inconclusive. In this paper, we introduce Reinforcement Learning (RL) guided Rolling Horizon Prioritized Planning (RL-RH-PP), the first framework integrating RL with search-based planning for lifelong MAPF. Specifically, we leverage classical Prioritized Planning (PP) as a backbone for its simplicity and flexibility in integrating with a learning-based priority assignment policy. By formulating dynamic priority assignment as a Partially Observable Markov Decision Process (POMDP), RL-RH-PP exploits the sequential decision-making nature of lifelong planning while delegating complex spatial-temporal interactions among agents to reinforcement learning. An attention-based neural network autoregressively decodes priority orders on-the-fly, enabling efficient sequential single-agent planning by the PP planner. Evaluations in realistic warehouse simulations show that RL-RH-PP achieves the highest total throughput among baselines and generalizes effectively across agent densities, planning horizons, and warehouse layouts. Our interpretive analyses reveal that RL-RH-PP proactively prioritizes congested agents and strategically redirects agents from congestion, easing traffic flow and boosting throughput. These findings highlight the potential of learning-guided approaches to augment traditional heuristics in modern warehouse automation.

AAAI Conference 2025 Conference Paper

D^2-DPM: Dual Denoising for Quantized Diffusion Probabilistic Models

  • Qian Zeng
  • Jie Song
  • Han Zheng
  • Hao Jiang
  • Mingli Song

Diffusion models have achieved cutting-edge performance in image generation. However, their lengthy denoising process and computationally intensive score estimation network impede their scalability in low-latency and resource-constrained scenarios. Post-training quantization (PTQ) compresses and accelerates diffusion models without retraining, but it inevitably introduces additional quantization noise, resulting in mean and variance deviations. In this work, we propose D2-DPM, a dual denoising mechanism aimed at precisely mitigating the adverse effects of quantization noise on the noise estimation network. Specifically, we first unravel the impact of quantization noise on the sampling equation into two components: the mean deviation and the variance deviation. The mean deviation alters the drift coefficient of the sampling equation, influencing the trajectory trend, while the variance deviation magnifies the diffusion coefficient, impacting the convergence of the sampling trajectory. The proposed D2-DPM is thus devised to denoise the quantization noise at each time step, and then denoise the noisy sample through the inverse diffusion iterations. Experimental results demonstrate that D2-DPM achieves superior generation quality, yielding a 1.42 lower FID than the full-precision model while achieving 3.99x compression and 11.67x bit-operation acceleration.

IROS Conference 2025 Conference Paper

Embodied Escaping: End-to-End Reinforcement Learning for Robot Navigation in Narrow Environment

  • Han Zheng
  • Jiale Zhang
  • Mingyang Jiang
  • Peiyuan Liu
  • Danni Liu
  • Tong Qin 0001
  • Ming Yang 0002

Autonomous navigation is a fundamental task for robot vacuum cleaners in indoor environments. Since their core function is to clean entire areas, robots inevitably encounter dead zones in cluttered and narrow scenarios. Existing planning methods often fail to escape due to complex environmental constraints, high-dimensional search spaces, and high-difficulty maneuvers. To address these challenges, this paper proposes an embodied escaping model that leverages a reinforcement learning-based policy with an efficient action mask for dead zone escaping. To alleviate the issue of sparse rewards in training, we introduce a hybrid training policy that improves learning efficiency. To handle redundant and ineffective action options, we design a novel action representation to reshape the discrete action space with a uniform turning radius. Furthermore, we develop an action mask strategy to select valid actions quickly, balancing precision and efficiency. In real-world experiments, our robot is equipped with a Lidar, an IMU, and two wheel encoders. Extensive quantitative and qualitative experiments across varying difficulty levels demonstrate that our robot can consistently escape from challenging dead zones. Moreover, our approach significantly outperforms the compared path planning and reinforcement learning methods in terms of success rate and collision avoidance. A video showcasing our methodology and real-world demonstrations is available at https://youtu.be/kBaaYWGhNuE.

ICRA Conference 2025 Conference Paper

Planning-Oriented Cooperative Perception Among Heterogeneous Vehicles

  • Han Zheng
  • Fan Ye 0003
  • Yuanyuan Yang 0001

Vehicle-to-vehicle (V2V) based cooperative perception enhances autonomous driving by overcoming single-agent perception limitations such as occlusions, without relying on extensive infrastructure. However, most existing methods have two key limitations. They treat cooperative perception in isolation, with little consideration for downstream tasks such as planning, leading to poor coordination and inefficient planning decisions. They also assume perception model homogeneity across all vehicles, which can be impractical among vehicles from different manufacturers. To bridge such gaps, we propose Scout, an early-fusion framework for planning-oriented cooperative perception among vehicles of heterogeneous models. Specifically, we formalize a notion of $\Delta \theta$-Risk Increment Distribution (RID) to capture the distribution of the risk increment by incomplete perception to the current trajectory plan, and define a Priority Index (PI) metric for prioritizing cooperative perception on riskier regions. We develop algorithms to estimate $\Delta \theta$-RID and PI at run-time with theoretical bounds. Empirical results demonstrate that Scout surpasses state-of-the-art methods and strong baselines on challenging benchmarks, achieving higher success rates with only 3-10% of their communication volume.

EAAI Journal 2024 Journal Article

Identification of product definition patterns in mass customization by multi-information fusion weighted support vector machine

  • Ruoda Wang
  • Yu Sun
  • Jun Ni
  • Han Zheng

In mass customization, companies have built product families to enhance design efficiency and meet customer requirements. However, the complex and diverse customer requirements make the traditional process of mapping customer needs to product families challenging and heavily reliant on prior knowledge. To address this challenge, the mapping task is treated as a classification problem, with customer requirements as classification features and product families as category labels. Based on information theory, this study considers the information gain (IG) and mutual information (MI) between the classification features and the labels. The uncertainty relationship between the two is explored using grey relational analysis (GRA). A hybrid weighting matrix is constructed by combining the effects of these three aspects, which is then used to improve the calculation of the classical support vector machine (CSVM) kernel function, forming a multi-information fusion weighted support vector machine (MIFWSVM) model. This model can take new requirements as input and output product variants that may satisfy the customer. To demonstrate the effectiveness of the proposed method, a case study of a mechanical press company was reported, comparing the MIFWSVM model with classical classifiers and exploring the impact of different weighting methods on the performance of CSVM. The MIFWSVM model achieved an average accuracy of 0.9205 with a standard deviation of 0.0506 and a macro F1 score of 0.9032 with a standard deviation of 0.0589, outperforming other methods. These results indicate that the MIFWSVM model significantly improves the accuracy and stability of customer demand mapping.
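One common way to let feature weights influence an SVM kernel is to scale each feature's squared difference inside an RBF kernel. The sketch below shows that general idea only. It assumes a diagonal weighting (one non-negative weight per feature) and an RBF form; the abstract does not state how the IG, MI, and GRA weights are combined or which kernel is used, so `weights`, `gamma`, and the function name are hypothetical.

```python
import math


def weighted_rbf_kernel(x, z, weights, gamma=1.0):
    """RBF kernel in which each feature's squared difference is scaled by its weight.

    Assumption: `weights` is a per-feature list of non-negative importances,
    i.e. the diagonal of a weighting matrix.
    """
    weighted_sq_dist = sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, z))
    return math.exp(-gamma * weighted_sq_dist)


# A feature with weight 0 does not affect similarity at all.
same = weighted_rbf_kernel([0.0, 0.0], [0.0, 9.0], weights=[1.0, 0.0])
# A feature with weight 1 contributes its full squared difference.
different = weighted_rbf_kernel([0.0, 0.0], [1.0, 5.0], weights=[1.0, 0.0])
```

With this form, requirements judged more informative about the product family dominate the similarity between two customer orders, while uninformative ones are damped.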

ICRA Conference 2024 Conference Paper

Multi-agent Path Finding for Cooperative Autonomous Driving

  • Zhongxia Yan 0001
  • Han Zheng
  • Cathy Wu 0002

Anticipating possible future deployment of connected and automated vehicles (CAVs), cooperative autonomous driving at intersections has been studied by many works in control theory and intelligent transportation across decades. Simultaneously, recent parallel works in robotics have devised efficient algorithms for multi-agent path finding (MAPF), though often in environments with simplified kinematics. In this work, we hybridize insights and algorithms from MAPF with the structure and heuristics of optimizing the crossing order of CAVs at signal-free intersections. We devise an optimal and complete algorithm, Order-based Search with Kinematics Arrival Time Scheduling (OBS-KATS), which significantly outperforms existing algorithms, fixed heuristics, and prioritized planning with KATS. The performance is maintained under different vehicle arrival rates, lane lengths, crossing speeds, and control horizon. Through ablations and dissections, we offer insight on the contributing factors to OBS-KATS’s performance. Our work is directly applicable to many similarly scaled traffic and multi-robot scenarios with directed lanes.

IROS Conference 2024 Conference Paper

Multi-agent Path Finding for Mixed Autonomy Traffic Coordination

  • Han Zheng
  • Zhongxia Yan 0001
  • Cathy Wu 0002

In the evolving landscape of urban mobility, the prospective integration of Connected and Automated Vehicles (CAVs) with Human-Driven Vehicles (HDVs) presents a complex array of challenges and opportunities for autonomous driving systems. While recent advancements in robotics have yielded Multi-Agent Path Finding (MAPF) algorithms tailored for agent coordination tasks characterized by simplified kinematics and complete control over agent behaviors, these solutions are inapplicable in mixed-traffic environments where uncontrollable HDVs must coexist and interact with CAVs. Addressing this gap, we propose the Behavior Prediction Kinematic Priority Based Search (BK-PBS), which leverages an offline-trained conditional prediction model to forecast HDV responses to CAV maneuvers, integrating these insights into a Priority Based Search (PBS) where the A* search proceeds over motion primitives to accommodate kinematic constraints. We compare BK-PBS with CAV planning algorithms derived from rule-based car-following models and from reinforcement learning. Through comprehensive simulation of a highway merging scenario across diverse CAV penetration rates and traffic densities, BK-PBS outperforms these baselines in reducing collision rates and improving system-level travel delay. Our work is directly applicable to many scenarios of multi-human multi-robot coordination.

AAAI Conference 2023 Conference Paper

Adaptive Policy Learning for Offline-to-Online Reinforcement Learning

  • Han Zheng
  • Xufang Luo
  • Pengfei Wei
  • Xuan Song
  • Dongsheng Li
  • Jing Jiang

Conventional reinforcement learning (RL) needs an environment to collect fresh data, which is impractical when online interactions are costly. Offline RL provides an alternative solution by directly learning from a previously collected dataset. However, it will yield unsatisfactory performance if the quality of the offline dataset is poor. In this paper, we consider an offline-to-online setting where the agent is first trained on the offline dataset and then trained online, and propose a framework called Adaptive Policy Learning for effectively taking advantage of offline and online data. Specifically, we explicitly consider the difference between the online and offline data and apply an adaptive update scheme accordingly, that is, a pessimistic update strategy for the offline dataset and an optimistic/greedy update scheme for the online dataset. Such a simple and effective method provides a way to mix offline and online RL and achieve the best of both worlds. We further provide two detailed algorithms for implementing the framework by embedding value-based or policy-based RL algorithms into it. Finally, we conduct extensive experiments on popular continuous control tasks, and results show that our algorithm can learn the expert policy with high sample efficiency even when the quality of the offline dataset is poor, e.g., a random dataset.

ICRA Conference 2023 Conference Paper

DAMS-LIO: A Degeneration-Aware and Modular Sensor-Fusion LiDAR-inertial Odometry

  • Fuzhang Han
  • Han Zheng
  • Wenjun Huang
  • Rong Xiong
  • Yue Wang 0020
  • Yanmei Jiao

With robots being deployed in increasingly complex environments like underground mines and planetary surfaces, multi-sensor fusion, a promising solution to state estimation in such scenes, has gained increasing attention. The fusion scheme is a central component of these methods. In this paper, a light-weight iEKF-based LiDAR-inertial odometry system is presented, which utilizes a degeneration-aware and modular sensor-fusion pipeline that takes both LiDAR points and relative pose from another odometry as the measurement in the update process only when degeneration is detected. Both the Cramer-Rao Lower Bound (CRLB) theory and a simulation test are used to demonstrate the higher accuracy of our method compared to methods using a single observation. Furthermore, the proposed system is evaluated on perceptually challenging datasets against various state-of-the-art sensor-fusion methods. The results show that the proposed system achieves real-time performance and high estimation accuracy despite the challenging environment and poor observations.

NeurIPS Conference 2020 Conference Paper

Cooperative Heterogeneous Deep Reinforcement Learning

  • Han Zheng
  • Pengfei Wei
  • Jing Jiang
  • Guodong Long
  • Qinghua Lu
  • Chengqi Zhang

Numerous deep reinforcement learning agents have been proposed, and each of them has its strengths and flaws. In this work, we present a Cooperative Heterogeneous Deep Reinforcement Learning (CHDRL) framework that can learn a policy by integrating the advantages of heterogeneous agents. Specifically, we propose a cooperative learning framework that classifies heterogeneous agents into two classes: global agents and local agents. Global agents are off-policy agents that can utilize experiences from the other agents. Local agents are either on-policy agents or population-based evolutionary algorithms (EAs) agents that can explore the local area effectively. We employ global agents, which are sample-efficient, to guide the learning of local agents so that local agents can benefit from the sample-efficient agents and simultaneously maintain their advantages, e.g., stability. Global agents also benefit from effective local searches. Experimental studies on a range of continuous control tasks from the MuJoCo benchmark show that CHDRL achieves better performance compared with state-of-the-art baselines.

JBHI Journal 2019 Journal Article

Label-Efficient Breast Cancer Histopathological Image Classification

  • Qi Qi
  • Yanlong Li
  • Jitian Wang
  • Han Zheng
  • Yue Huang
  • Xinghao Ding
  • Gustavo Kunde Rohde

The automatic classification of breast cancer histopathological images has great significance in computer-aided diagnosis. Recently, deep learning via neural networks has enabled pattern detection and prediction using large, labeled datasets; however, collecting and annotating sufficient histological data using professional pathologists is time consuming, tedious, and extremely expensive. In this paper, a deep active learning framework is designed and implemented for classification of breast cancer histopathological images, with the goal of maximizing the learning accuracy from very limited labeling. This method involves manual annotation of the most valuable unlabeled samples, which are then integrated into the training set. The model is then iteratively updated with an increasing training set. Here, two selection strategies are discussed for the proposed deep active learning framework: an entropy-based strategy and a confidence-boosting strategy. The proposed method has been validated using a publicly available breast cancer histopathological image dataset, wherein each image patch is binarily classified as benign or malignant. The experimental results demonstrate that, compared with a random selection, our proposed framework can reduce annotation costs up to 66.67%, with higher accuracy and less expensive annotation than standard query strategy.
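An entropy-based selection step of the kind mentioned above can be sketched as follows: score each unlabeled sample by the Shannon entropy of the model's predicted class probabilities and send the highest-scoring ones for annotation. This is a generic active-learning sketch, not the paper's exact procedure; the function names and the fixed batch size `k` are assumptions, and the confidence-boosting strategy is not shown.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, k):
    """Indices of the k unlabeled samples with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Predicted (benign, malignant) probabilities for three unlabeled patches.
pool = [[0.99, 0.01], [0.5, 0.5], [0.7, 0.3]]
to_annotate = select_most_uncertain(pool, k=2)
```

In an iterative loop, the selected patches would be labeled by a pathologist, added to the training set, and the model retrained before the next selection round.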

JBHI Journal 2017 Journal Article

Epithelium-Stroma Classification via Convolutional Neural Networks and Unsupervised Domain Adaptation in Histopathological Images

  • Yue Huang
  • Han Zheng
  • Chi Liu
  • Xinghao Ding
  • Gustavo K. Rohde

Epithelium-stroma classification is a necessary preprocessing step in histopathological image analysis. Current deep learning based recognition methods for histology data require collection of large volumes of labeled data in order to train a new neural network when there are changes to the image acquisition procedure. However, it is extremely expensive for pathologists to manually label sufficient volumes of data for each pathology study in a professional manner, which results in limitations in real-world applications. A very simple but effective deep learning method, that introduces the concept of unsupervised domain adaptation to a simple convolutional neural network (CNN), has been proposed in this paper. Inspired by transfer learning, our paper assumes that the training data and testing data follow different distributions, and there is an adaptation operation to more accurately estimate the kernels in CNN in feature extraction, in order to enhance performance by transferring knowledge from labeled data in source domain to unlabeled data in target domain. The model has been evaluated using three independent public epithelium-stroma datasets by cross-dataset validations. The experimental results demonstrate that for epithelium-stroma classification, the proposed framework outperforms the state-of-the-art deep neural network model, and it also achieves better performance than other existing deep domain adaptation methods. The proposed model can be considered to be a better option for real-world applications in histopathological image analysis, since there is no longer a requirement for large-scale labeled data in each specified domain.