Arrow Research search

Author name cluster

Zhu Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers
2 author rows

Possible papers

7

AAAI Conference 2026 Conference Paper

Benchmarking LLMs’ Mathematical Reasoning with Unseen Random Variables Questions

  • Zijin Hong
  • Hao Wu
  • Su Dong
  • Junnan Dong
  • Yilin Xiao
  • Yujing Zhang
  • Zhu Wang
  • Feiran Huang

Recent studies have raised significant concerns regarding the reliability of current mathematical benchmarks, highlighting key limitations such as simplistic design and potential data contamination that undermine evaluation accuracy. Consequently, developing a reliable benchmark that effectively evaluates large language models' (LLMs) genuine capabilities in mathematical reasoning remains a critical challenge. To address these concerns, we propose RV-Bench, a novel evaluation methodology for Benchmarking LLMs with Random Variables in mathematical reasoning. Specifically, we develop question-generating functions to produce random variable questions (RVQs), whose background content mirrors the original benchmark problems but with randomized variable combinations, rendering them "unseen" to LLMs. Models must fully understand the inherent question pattern to correctly answer RVQs with diverse variable combinations. Thus, an LLM's genuine reasoning capability is reflected through its accuracy and robustness on RV-Bench. We conducted extensive experiments on over 30 representative LLMs across more than 1,000 RVQs. Our findings reveal that LLMs exhibit a proficiency imbalance between encountered and "unseen" data distributions. Furthermore, RV-Bench reveals that proficiency generalization across similar mathematical reasoning tasks is limited, but we verified that it can still be effectively elicited through test-time scaling.
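The question-generating-function idea from this abstract can be sketched as follows. This is a minimal illustration of the mechanism, not RV-Bench's actual generator; the template, variable ranges, and answer formula are our assumptions:

```python
import random

def make_rvq(seed=None):
    """Hypothetical RVQ generator in the spirit of RV-Bench: the question
    template (the 'background content') is fixed, but the numeric variables
    are resampled, so each instance is 'unseen' while the underlying
    reasoning pattern stays the same."""
    rng = random.Random(seed)
    speed = 10 * rng.randint(2, 9)    # randomized variable combination
    minutes = rng.randint(10, 99)
    question = (f"A train travels at {speed} km/h for {minutes} minutes. "
                f"How many kilometers does it cover?")
    answer = speed * minutes / 60     # ground truth derived from the same variables
    return question, answer
```

Because the ground truth is computed from the sampled variables, any variable combination yields a checkable answer, which is what lets accuracy and robustness be measured across diverse instances of the same pattern.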

AAAI Conference 2026 Conference Paper

Conditional Distribution Learning for Graph Classification

  • Jie Chen
  • Hua Mao
  • Chuanbin Liu
  • Zhu Wang
  • Xi Peng

Leveraging the diversity and quantity of data provided by various graph-structured data augmentations while preserving intrinsic semantic information is challenging. Additionally, successive layers in a graph neural network (GNN) tend to produce increasingly similar node embeddings, while graph contrastive learning aims to increase the dissimilarity between negative pairs of node embeddings. This inevitably results in a conflict between the message-passing mechanism (MPM) of GNNs and the contrastive learning (CL) of negative pairs via intra-views. In this paper, we propose a conditional distribution learning (CDL) method that learns graph representations from graph-structured data for semisupervised graph classification. Specifically, we present an end-to-end graph representation learning model to align the conditional distributions of weakly and strongly augmented features over the original features. This alignment enables the CDL model to effectively preserve intrinsic semantic information when both weak and strong augmentations are applied to graph-structured data. To avoid the conflict between the MPM and the CL of negative pairs, positive pairs of node representations are retained for measuring the similarity between the original features and the corresponding weakly augmented features. Extensive experiments with several benchmark graph datasets demonstrate the effectiveness of the proposed CDL method.
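The conditional-distribution alignment described here can be illustrated with a small divergence penalty. This is our reading of the abstract rather than the paper's exact loss; treating the augmented features, conditioned on the original ones, as discrete distributions is an assumption:

```python
import numpy as np

def kl_alignment(p_weak, p_strong, eps=1e-8):
    """Hedged sketch of the alignment idea: given the (conditional)
    distributions of weakly and strongly augmented features, penalize
    their KL divergence so that strong augmentation does not destroy
    the semantics preserved by the weak one."""
    p = np.asarray(p_weak, dtype=float) + eps     # smooth to avoid log(0)
    q = np.asarray(p_strong, dtype=float) + eps
    return float(np.sum(p * np.log(p / q)))       # KL(p_weak || p_strong)
```

The loss is zero when both augmented views induce the same distribution and grows as strong augmentation drifts away from the weak view.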

TMLR Journal 2025 Journal Article

Targeted Unlearning Using Perturbed Sign Gradient Methods With Applications On Medical Images

  • George R. Nahass
  • Zhu Wang
  • Homa Rashidisabet
  • Won Hwa Kim
  • Sasha Hubschman
  • Jeffrey C. Peterson
  • Pete Setabutr
  • Chad A. Purnell

Machine unlearning aims to remove the influence of specific training samples from a trained model without full retraining. While prior work has largely focused on privacy-motivated settings, we recast unlearning as a general-purpose tool for post-deployment model revision. Specifically, we focus on utilizing unlearning in clinical contexts where data shifts, device deprecation, and policy changes are common. To this end, we propose a bilevel optimization formulation of boundary-based unlearning that can be solved using iterative algorithms. We provide convergence guarantees when first-order algorithms are used to unlearn and introduce a tunable loss design for controlling the forgetting–retention tradeoff. Across benchmark and real-world clinical imaging datasets, our approach outperforms baselines on both forgetting and retention metrics, including scenarios involving imaging devices and anatomical outliers. This work demonstrates the feasibility of unlearning on clinical imaging datasets and proposes it as a tool for model maintenance in scenarios that require removing the influence of specific data points without full model retraining. Code is available at https://github.com/monkeygobah/unlearning_langevin.
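A perturbed sign-gradient update, as the title suggests, can be sketched in a few lines. This is our interpretation of the name, not the authors' algorithm; the learning rate, noise scale, and ascent-on-forget-loss framing are assumptions:

```python
import numpy as np

def perturbed_sign_step(w, grad_forget, lr=0.01, noise=0.0, rng=None):
    """Illustrative perturbed sign-gradient step for unlearning: ascend the
    loss on the forget set using only the gradient's sign (bounded step per
    coordinate), plus an optional random perturbation."""
    rng = np.random.default_rng() if rng is None else rng
    return w + lr * np.sign(grad_forget) + noise * rng.standard_normal(w.shape)
```

Using the sign rather than the raw gradient keeps every coordinate's step bounded by `lr`, which is one common reason sign methods are chosen for controlled model edits.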

NeurIPS Conference 2024 Conference Paper

IMPACT: A Large-scale Integrated Multimodal Patent Analysis and Creation Dataset for Design Patents

  • Homaira H. Shomee
  • Zhu Wang
  • Sourav Medya
  • Sathya N. Ravi

In this paper, we introduce IMPACT (Integrated Multimodal Patent Analysis and Creation Dataset for Design Patents), a large-scale multimodal patent dataset with detailed captions for design patent figures. Our dataset includes half a million design patents comprising 3.61 million figures along with captions from patents granted by the United States Patent and Trademark Office (USPTO) over a 16-year period from 2007 to 2022. We incorporate the metadata of each patent application with elaborate captions that are coherent with multiple viewpoints of designs. Even though patents themselves contain a variety of design figures, titles, and descriptions of viewpoints, we find that they lack the detailed descriptions necessary to perform multimodal tasks such as classification and retrieval. IMPACT closes this gap, thereby providing researchers with the necessary ingredients to instantiate a variety of multimodal tasks. Our dataset has huge potential for novel design inspiration and can be used in tandem with advanced computer vision models. We perform preliminary evaluations on the dataset on popular patent analysis tasks such as classification and retrieval. Our results indicate that integrating images with generated captions significantly improves the performance of different models on the corresponding tasks. Given that design patents offer various benefits for modeling novel tasks, we propose two standard computer vision tasks that have not been investigated in analyzing patents as future directions using IMPACT as a benchmark, viz., 3D Image Construction and Visual Question Answering (VQA). To facilitate research in these directions, we make our IMPACT dataset and the code/models used in this work publicly available at https://github.com/AI4Patents/IMPACT.

IROS Conference 2024 Conference Paper

SOAR: Simultaneous Exploration and Photographing with Heterogeneous UAVs for Fast Autonomous Reconstruction

  • Mingjie Zhang
  • Chen Feng 0006
  • Zengzhi Li
  • Guiyong Zheng
  • Yiming Luo
  • Zhu Wang
  • Jinni Zhou
  • Shaojie Shen

Unmanned Aerial Vehicles (UAVs) have gained significant popularity in scene reconstruction. This paper presents SOAR, a LiDAR-Visual heterogeneous multi-UAV system specifically designed for fast autonomous reconstruction of complex environments. Our system comprises a LiDAR-equipped explorer with a large field-of-view (FoV), alongside photographers equipped with cameras. To ensure rapid acquisition of the scene’s surface geometry, we employ a surface frontier-based exploration strategy for the explorer. As the surface is progressively explored, we identify the uncovered areas and generate viewpoints incrementally. These viewpoints are then assigned to photographers through solving a Consistent Multiple Depot Multiple Traveling Salesman Problem (Consistent-MDMTSP), which optimizes scanning efficiency while ensuring task consistency. Finally, photographers utilize the assigned viewpoints to determine optimal coverage paths for acquiring images. We present extensive benchmarks in the realistic simulator, which validates the performance of SOAR compared with classical and state-of-the-art methods. For more details, please see our project page at sysu-star.github.io/SOAR.

NeurIPS Conference 2023 Conference Paper

Implicit Differentiable Outlier Detection Enable Robust Deep Multimodal Analysis

  • Zhu Wang
  • Sourav Medya
  • Sathya Ravi

Deep network models are often purely inductive during both training and inference on unseen data. When these models are used for prediction, they may fail to capture important semantic information and implicit dependencies within datasets. Recent advancements have shown that combining multiple modalities in large-scale vision and language settings can improve understanding and generalization performance. However, as the model size increases, fine-tuning and deployment become computationally expensive, even for a small number of downstream tasks. Moreover, it is still unclear how domain or prior modal knowledge can be specified in a backpropagation-friendly manner, especially in large-scale and noisy settings. To address these challenges, we propose a simplified alternative of combining features from pretrained deep networks and freely available semantic explicit knowledge. In order to remove irrelevant explicit knowledge that does not correspond well to the images, we introduce an implicit Differentiable Out-of-Distribution (OOD) detection layer. This layer addresses outlier detection by solving for fixed points of a differentiable function and using the last iterate of the fixed-point solver to backpropagate. In practice, we apply our model on several vision and language downstream tasks including visual question answering, visual reasoning, and image-text retrieval on different datasets. Our experiments show that it is possible to design models that perform similarly to state-of-the-art results but with significantly fewer samples and less training time. Our models and code are available here: https://github.com/ellenzhuwang/implicit_vkood
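The "last iterate" trick mentioned in the abstract can be illustrated with a toy fixed-point layer. The contraction map, its size, and the tanh update below are our assumptions, not the paper's layer; the point is only that the solver iterates to a fixed point and then differentiates the final update alone, treating the previous iterate as a constant:

```python
import numpy as np

def fixed_point_last_iterate(W, x, n_iter=50):
    """Sketch of an implicit layer: solve z = tanh(W z + x) by fixed-point
    iteration, then form the Jacobian of only the LAST update w.r.t. x
    (previous iterate held fixed), mimicking last-iterate backprop."""
    z = np.zeros_like(x)
    for _ in range(n_iter):
        z = np.tanh(W @ z + x)
    # d tanh(u)/du at the fixed point, with the previous iterate frozen:
    dz_dx = np.diag(1.0 - z**2)
    return z, dz_dx

rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((4, 4))   # small weights -> contraction, so iteration converges
x = rng.standard_normal(4)
z, dz_dx = fixed_point_last_iterate(W, x)
```

Differentiating only the last iterate avoids unrolling (and storing) all solver steps, which is what makes such implicit layers cheap to train.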

TIST Journal 2016 Journal Article

Recognizing Parkinsonian Gait Pattern by Exploiting Fine-Grained Movement Function Features

  • Tianben Wang
  • Zhu Wang
  • Daqing Zhang
  • Tao Gu
  • Hongbo Ni
  • Jiangbo Jia
  • Xingshe Zhou
  • Jing Lv

Parkinson's disease (PD) is one of the typical movement disorder diseases among elderly people, which has a serious impact on their daily lives. In this article, we propose a novel computation framework to recognize gait patterns in patients with PD. The key idea of our approach is to distinguish gait patterns in PD patients from healthy individuals by accurately extracting gait features that capture all three aspects of movement functions, that is, stability, symmetry, and harmony. The proposed framework contains three steps: gait phase discrimination, feature extraction and selection, and pattern classification. In the first step, we put forward a sliding-window-based method to discriminate four gait phases from plantar pressure data. Based on the gait phases, we extract and select gait features that characterize stability, symmetry, and harmony of movement functions. Finally, we recognize PD gait patterns by applying a hybrid classification model. We evaluate the framework using an open dataset that contains real plantar pressure data of 93 PD patients and 72 healthy individuals. Experimental results demonstrate that our framework significantly outperforms the four baseline approaches.
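The sliding-window step described in this abstract can be sketched on a 1-D plantar-pressure trace. The window size, threshold, and two-way stance/swing labeling below are illustrative assumptions; the paper discriminates four gait phases:

```python
import numpy as np

def sliding_window_phases(pressure, window=5, threshold=0.2):
    """Illustrative sliding-window gait labeling: smooth a plantar-pressure
    trace with a moving average, then mark each sample as 'stance' (foot
    loaded) or 'swing' (foot unloaded). A real four-phase discriminator
    would split these further (e.g., heel strike, toe off)."""
    kernel = np.ones(window) / window
    smoothed = np.convolve(pressure, kernel, mode="same")  # sliding-window average
    return np.where(smoothed > threshold, "stance", "swing")
```

Smoothing before thresholding suppresses single-sample sensor spikes, so phase boundaries track sustained loading rather than noise.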