Arrow Research search

Author name cluster

Zhirui Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers
2 author rows

Possible papers (7)

AAAI Conference 2026 Conference Paper

scCluBench: Comprehensive Benchmarking of Clustering Algorithms for Single-Cell RNA Sequencing

  • Ping Xu
  • Zaitian Wang
  • Zhirui Wang
  • Pengjiang Li
  • Jiajia Wang
  • Ran Zhang
  • Pengfei Wang
  • Yuanchun Zhou

Cell clustering is crucial for uncovering cellular heterogeneity in single-cell RNA sequencing (scRNA-seq) data by identifying cell types and marker genes. Despite its importance, existing benchmarks for scRNA-seq clustering remain fragmented, lacking standardized protocols and often omitting recent advances in artificial intelligence. To fill these gaps, we present scCluBench, a comprehensive benchmark of clustering algorithms for scRNA-seq data. scCluBench provides 36 scRNA-seq datasets collected from diverse public sources, covering multiple tissues, which are uniformly processed to ensure consistency for systematic evaluation and downstream analyses. To assess performance, we collect and reproduce a range of scRNA-seq clustering methods, including traditional, deep learning-based, graph-based, and biological foundation models. We comprehensively evaluate each method both quantitatively and qualitatively, using core performance metrics and visualization analyses. Furthermore, we construct representative downstream biological tasks, such as marker gene identification and cell type annotation, to further assess practical utility. scCluBench then investigates the performance differences and applicability boundaries of various clustering models across diverse analytical tasks, systematically assessing their robustness and scalability in real-world scenarios. Overall, scCluBench offers a standardized and user-friendly benchmark for scRNA-seq clustering, with standardized datasets, unified evaluation protocols, and transparent analyses, facilitating informed method selection and providing valuable insights into model generalizability and application scope.

AAAI Conference 2025 Conference Paper

SemStereo: Semantic-Constrained Stereo Matching Network for Remote Sensing

  • Chen Chen
  • Liangjin Zhao
  • Yuanchun He
  • Yingxuan Long
  • Kaiqiang Chen
  • Zhirui Wang
  • Yanfeng Hu
  • Xian Sun

Semantic segmentation and 3D reconstruction are two fundamental tasks in remote sensing, typically treated as separate or loosely coupled tasks. Despite attempts to integrate them into a unified network, the constraints between the two heterogeneous tasks are not explicitly modeled, since the pioneering studies either utilize a loosely coupled parallel structure or engage in only implicit interactions, failing to capture the inherent connections. In this work, we explore the connections between the two tasks and propose a new network that imposes semantic constraints on the stereo matching task, both implicitly and explicitly. Implicitly, we transform the traditional parallel structure into a new cascade structure termed the Semantic-Guided Cascade structure, where deep features enriched with semantic information are utilized for the computation of initial disparity maps, enhancing semantic guidance. Explicitly, we propose a Semantic Selective Refinement (SSR) module and a Left-Right Semantic Consistency (LRSC) module. The SSR refines the initial disparity map under the guidance of the semantic map. The LRSC ensures semantic consistency between two views by reducing the semantic divergence after transforming the semantic map from one view to the other using the disparity map. Experiments on the US3D and WHU datasets demonstrate that our method achieves state-of-the-art performance for both semantic segmentation and stereo matching.

IROS Conference 2025 Conference Paper

STEP Planner: Constructing cross-hierarchical subgoal tree as an embodied long-horizon task planner

  • Tianxing Zhou
  • Zhirui Wang
  • Ao Haojia
  • Guangyan Chen
  • Boyang Xing
  • Cheng Jingwen
  • Yi Yang 0009
  • Yufeng Yue

The ability to perform reliable long-horizon task planning is crucial for deploying robots in real-world environments. However, directly employing Large Language Models (LLMs) as action sequence generators often results in low success rates due to their limited reasoning ability for long-horizon embodied tasks. In the STEP framework, we construct a subgoal tree through a pair of closed-loop models: a subgoal decomposition model and a leaf node termination model. Within this framework, we develop a hierarchical tree structure that spans from coarse to fine resolutions. The subgoal decomposition model leverages a foundation LLM to break down complex goals into manageable subgoals, thereby growing the subgoal tree. The leaf node termination model provides real-time feedback based on environmental states, determining when to terminate tree growth and ensuring each leaf node can be directly converted into a primitive action. Experiments conducted both in the VirtualHome WAH-NL benchmark and on real robots demonstrate that STEP achieves long-horizon embodied task completion with success rates of up to 34% (WAH-NL) and 25% (real robot), outperforming SOTA methods.

NeurIPS Conference 2024 Conference Paper

Drones Help Drones: A Collaborative Framework for Multi-Drone Object Trajectory Prediction and Beyond

  • Zhechao Wang
  • Peirui Cheng
  • Mingxin Chen
  • Pengju Tian
  • Zhirui Wang
  • Xinming Li
  • Xue Yang
  • Xian Sun

Collaborative trajectory prediction can comprehensively forecast the future motion of objects through multi-view complementary information. However, it encounters two main challenges in multi-drone collaboration settings. The expansive aerial observations make it difficult to generate precise Bird's Eye View (BEV) representations. Besides, excessive interactions cannot meet real-time prediction requirements within the constrained drone-based communication bandwidth. To address these problems, we propose a novel framework named "Drones Help Drones" (DHD). Firstly, we incorporate the ground priors provided by the drone's inclined observation to estimate the distance between objects and drones, leading to more precise BEV generation. Secondly, we design a selective mechanism based on the local feature discrepancy to prioritize the critical information contributing to prediction tasks during inter-drone interactions. Additionally, we create the first dataset for multi-drone collaborative prediction, named "Air-Co-Pred", and conduct quantitative and qualitative experiments to validate the effectiveness of our DHD framework. The results demonstrate that compared to state-of-the-art approaches, DHD reduces position deviation in BEV representations by over 20% and requires only a quarter of the transmission ratio for interactions while achieving comparable prediction performance. Moreover, DHD also shows promising generalization to collaborative 3D object detection in CoPerception-UAVs.

AAAI Conference 2023 Conference Paper

GraphPrompt: Graph-Based Prompt Templates for Biomedical Synonym Prediction

  • Hanwen Xu
  • Jiayou Zhang
  • Zhirui Wang
  • Shizhuo Zhang
  • Megh Bhalerao
  • Yucong Liu
  • Dawei Zhu
  • Sheng Wang

As biomedical datasets expand, the same concept may be labeled with different terms, making it tedious and onerous to curate these terms. Automatically mapping synonymous terms onto ontologies is therefore desirable, a task we name biomedical synonym prediction. Unlike biomedical concept normalization (BCN), no clues from context can be used to enhance synonym prediction, making it essential to extract graph features from the ontology. We introduce an expert-curated dataset, OBO-syn, encompassing 70 different types of concepts and 2 million curated concept-term pairs for evaluating synonym prediction methods. We find that BCN methods perform poorly on this task because they do not make full use of graph information. We therefore propose GraphPrompt, a prompt-based learning approach that creates prompt templates according to the graphs. GraphPrompt obtained 37.2% and 28.5% improvements in zero-shot and few-shot settings respectively, indicating the effectiveness of these graph-based prompt templates. We envision that our method GraphPrompt and the OBO-syn dataset can be broadly applied to graph-based NLP tasks and serve as a basis for analyzing diverse and accumulating biomedical data. All data and code are available at: https://github.com/HanwenXuTHU/GraphPrompt

UAI Conference 2022 Conference Paper

PathFlow: A normalizing flow generator that finds transition paths

  • Tianyi Liu
  • Weihao Gao
  • Zhirui Wang
  • Chong Wang 0002

Sampling from a Boltzmann distribution to calculate important macro statistics is one of the central tasks in the study of large atomic and molecular systems. Recently, a one-shot configuration sampler, the Boltzmann generator [Noé et al., 2019], was introduced. Though a Boltzmann generator can directly generate independent metastable states, it lacks the ability to find transition pathways and describe the whole transition process. In this paper, we propose PathFlow, which can function as a one-shot generator as well as a transition pathfinder. More specifically, a normalizing flow model is constructed to map the base distribution and a linearly interpolated path in the latent space to the Boltzmann distribution and a minimum (free) energy path in the configuration space simultaneously. PathFlow can be trained by standard gradient-based optimizers using the proposed gradient estimator with a theoretical guarantee. PathFlow, validated on extensively studied examples including the synthetic Müller potential and alanine dipeptide, shows remarkable performance.

ICRA Conference 2020 Conference Paper

Online calibration of exterior orientations of a vehicle-mounted surround-view camera system

  • Zhanpeng Ouyang
  • Lan Hu
  • Yukan Lu
  • Zhirui Wang
  • Xin Peng 0005
  • Laurent Kneip

The increasing availability of surround-view camera systems in passenger vehicles motivates their use as an exterior perception modality for intelligent vehicle behaviour. An important problem within this context is the extrinsic calibration between the cameras, which is challenging due to the often reduced overlap between the fields of view of neighbouring views. Our work is motivated by two insights. First, we argue that the accuracy of vision-based vehicle motion estimation depends crucially on the quality of exterior orientation calibration, while design parameters for camera positions typically provide sufficient accuracy. Second, we demonstrate how planar vehicle motion related direction vectors can be used to accurately identify individual camera-to-vehicle rotations, which are more useful than the commonly and tediously derived camera-to-camera transformations. We present a complete and highly practicable online optimisation strategy to obtain the exterior orientation parameters and conclude with successful tests on simulated, indoor, and large-scale outdoor experiments.