Author name cluster

Yuqi Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

17 papers

2 author rows

EAAI Journal 2026 Journal Article

A vision mamba-enhanced network with frequency-directional feature fusion for pavement crack segmentation

Xi Li
Yuqi Wang
Qiang Zhou
Jianhui Zhan
Deng Zuo
Weichao Chen

Driven by the rapid advancement of intelligent transportation and infrastructure digitalization, pavement crack detection has emerged as a research hotspot bridging civil engineering and computer vision. It plays a vital role in improving road safety and optimizing maintenance operations. However, the irregular morphology and susceptibility to background interference pose significant challenges to achieving accurate and robust automatic detection. To effectively address these issues, a Vision Mamba-Enhanced Network with Frequency-Directional Feature Fusion, VMFDF-Net, was proposed in this work for attaining efficient crack segmentation. The network integrates global modelling and local feature extraction capabilities, combining a Vision Mamba-inspired mechanism with convolutional neural networks (CNNs). Multi-frequency, multi-directional feature modelling, and an adaptive gated fusion module were introduced to enhance crack representation. The proposed Vision Mamba Convolutional (VMambaConv) module integrates local texture enhancement with long-range dependency modelling to improve crack feature extraction. In the bottleneck layer, a Wavelet and Directional-aware feature Cascade (WDcascade) module adaptively fuses multi-source features through a gated mechanism, enhancing the representation of diverse crack patterns in complex backgrounds. The effectiveness of the developed model was evaluated on four publicly available crack datasets: CRACK500, DeepCrack, CFD, and EdmCrack600. Six evaluation metrics were used for fair and objective assessment. The experimental results demonstrated that the model can generally achieve superior performance. Our code is open source on GitHub: https://github.com/Acruelsummer/VMFDF-Net/tree/main.


EAAI Journal 2026 Journal Article

Data-augmented prediction of laser shock peening residual stress fields via attention-integrated generative adversarial networks

Jiajun Sun
Yuchen Liang
Yuqi Wang
Anping Li
Kaiyu Luo
Jinzhong Lu

Accurately and efficiently predicting residual stress fields after laser shock peening (LSP) is critical for optimizing material performance and extending component service life. Traditional finite element simulations, however, are computationally heavy and time-consuming, limiting their use in iterative design and real-time optimization. This paper proposes an innovative method integrating finite element simulation, generative modeling, and attention-enhanced learning for rapid residual stress field prediction, with key results: A conventional attention-enhanced graph convolutional generative adversarial network model (AGCGAN) is designed to predict residual stress fields from LSP parameters, boosting accuracy and cross-domain generalization. A finite element platform based on a specific material constitutive model is developed to implement data augmentation, generating enriched training data under varying laser energy and spot overlap rates to ensure physical authenticity. The proposed neural network addresses limited LSP simulation samples and training difficulties through effective utilization of augmented data, achieving 95.98% prediction accuracy, 445x faster computation than traditional methods, and highly realistic generated images, efficiently solving design challenges in LSP surface treatment at low cost.


AAAI Conference 2026 Conference Paper

Trustworthy Classification for Complex Social Surveys: A Memory-Enhanced Hierarchical Framework with Calibrated Uncertainty

Zeqiang Wang
Rebecca Oldroyd
Yuqi Wang
Jiageng Wu
Jie Yang
Wei Wang
Nishanth R. Sastry
Jon Johnson

Automated classification of complex social survey questionnaires is crucial for large-scale social science research but faces significant reliability challenges due to intricate hierarchical label structures, severe class imbalance, semantic ambiguity, and incomplete data coverage. Conventional classification methods often struggle with these combined complexities, yielding results that lack trustworthiness. We introduce HOCM, a framework designed for trustworthy classification in complex, real-world taxonomies. It features two synergistic components: (1) memory-enhanced contrastive learning, tailored to learn robust representations from noisy, imbalanced data by leveraging quality-aware category memory banks; and (2) hierarchical uncertainty calibration, which enforces taxonomic consistency while providing reliable confidence estimates and identifying inputs falling outside well-represented known categories. Our evaluation on a large-scale, real-world social survey dataset—a challenging exemplar of our target problem class—demonstrates that HOCM maintains strong accuracy on known classes while effectively identifying uncertain cases, significantly boosting accuracy on confident predictions. Furthermore, it adeptly detects low-resource/unknown categories. HOCM provides a more reliable automated classification tool, enabling efficient expert review and enhancing the trustworthiness of analysis in domains with complex, hierarchical data.


JBHI Journal 2025 Journal Article

A Phase-Enhanced Neural Network With Dual-Path Transformer for Single-Channel Chest Sound Separation

Yuqi Wang
Han Yang
Zhixing Gao
Zhiwei Dai
Kang Yu
Tingting Song
Hao Yang
Yunfeng Wang

Auscultation of the chest is a fundamental diagnostic tool for cardiovascular and pulmonary diseases. However, the two main chest sound parts, heart sound (HS) and lung sound (LS), are often mixed, limiting diagnostic accuracy. This paper presents a novel Phase-Enhanced Neural Network (PENN) for HS and LS separation. To address the under-utilization of phase information, PENN integrates a feedforward connection that feeds the input spectrum into the Restorer, enabling phase recovery based on the local inference feature of phase. A time-frequency Dual-Path Transformer (DPT) is employed to expand the network's receptive field and enhance performance. To interpret the effectiveness of PENN, two new metrics, mSI-SDRi and pSI-SDRi, are proposed to separately evaluate the contributions of magnitude and phase. Experiments show that PENN achieves pSI-SDRi improvements of 1.44 dB for HS and 2.25 dB for LS under an LS cutoff frequency ($f_{c\text{lung}}$) of 60 Hz. Extensive experimental results demonstrate the effectiveness and robustness of PENN, offering a promising solution to improve the accuracy of auscultation.


AIIM Journal 2025 Journal Article

Concordance-based Predictive Uncertainty (CPU)-Index: Proof-of-concept with application towards improved specificity of lung cancers on low dose screening CT

Yuqi Wang
Aarzu Gupta
Fakrul Islam Tushar
Breylon Riley
Avivah Wang
Tina D. Tailor
Stacy Tantum
Jian-Guo Liu

In this paper, we introduce a novel concordance-based predictive uncertainty (CPU)-Index, which integrates insights from subgroup analysis and personalized AI time-to-event models. Through its application in refining lung cancer screening (LCS) predictions generated by an individualized AI time-to-event model trained with fused data of low dose CT (LDCT) radiomics with patient demographics, we demonstrate its effectiveness, resulting in improved risk assessment compared to the Lung CT Screening Reporting & Data System (Lung-RADS). Subgroup-based Lung-RADS faces challenges in representing individual variations and relies on a limited set of predefined characteristics, resulting in variable predictions. Conversely, personalized AI time-to-event models are hindered by transparency issues and biases from censored data. By measuring the prediction consistency between subgroup analysis and AI time-to-event models, the CPU-Index framework offers a nuanced evaluation of the bias–variance trade-off and improves the transparency and reliability of predictions. Consistency was estimated by the concordance index of subgroup analysis-based similarity rank and model prediction similarity rank. Subgroup analysis-based similarity loss was defined as the sum-of-the-difference between Lung-RADS and feature-level 0-1 loss. Model prediction similarity loss was defined as squared loss. To test our approach, we identified 3,326 patients who underwent LDCT for LCS from 1/1/2015 to 6/30/2020 with confirmation of lung cancer on pathology within one year. For each LDCT image, the lesion associated with a Lung-RADS score was detected using a pretrained deep learning model from Medical Open Network for AI (MONAI), from which radiomic features were extracted. Radiomics were optimally fused with patient demographics via a positional encoding scheme and used to train a neural multi-task logistic regression time-to-event model that predicts malignancy.
Performance was maximized when radiomics features were fused with positionally encoded demographic features. In this configuration, our algorithm raised the AUC from 0.81 ± 0.04 to 0.89 ± 0.02. Compared to standard Lung-RADS, our approach reduced the False-Positive-Rate from 0.41 ± 0.02 to 0.30 ± 0.12 while maintaining the same False-Negative-Rate. Our methodology enhances lung cancer risk assessment by estimating prediction uncertainty and adjusting accordingly. Furthermore, the optimal integration of radiomics and patient demographics improved overall diagnostic performance, indicating their complementary nature.
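
The consistency measure in this abstract is a concordance index between two rankings. As a rough illustration only, the sketch below computes a generic pairwise concordance between two equal-length score lists. The function name and the arguments `a` and `b` are hypothetical, and the paper's own definitions of the subgroup-based and model-based similarity ranks are not reproduced here.

```python
from itertools import combinations


def concordance_index(a, b):
    """Fraction of comparable pairs that a and b order the same way.

    Pairs tied in `a` are skipped; pairs tied in `b` count as half-concordant.
    """
    if len(a) != len(b):
        raise ValueError("score lists must have the same length")
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(a)), 2):
        if a[i] == a[j]:
            continue  # not comparable under the reference ranking
        comparable += 1
        da = a[i] - a[j]
        db = b[i] - b[j]
        if db == 0:
            concordant += 0.5
        elif (da > 0) == (db > 0):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value near 1 means the two rankings largely agree, a value near 0 means they are largely reversed, and 0.5 corresponds to no systematic agreement.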


EAAI Journal 2025 Journal Article

Cutting tool life prediction and extension through generative model-augmented deep learning and laser remanufacturing techniques

Yuchen Liang
Yuqi Wang
Raymond Chiong
Anping Li
Jinzhong Lu

Predicting and extending the remaining life of cutting tools during machining processes is essential for sustainable manufacturing. Traditional prognosis methods often struggle to adapt to different working conditions over the machining process lifecycle. This paper proposes a novel framework that effectively addresses the challenges by integrating multi-source data and using deep learning techniques. The system integrates augmented-power and vibration data collected from computer numerical control machines with the following innovations: (1) A hybrid temporal convolutional network (TCN)-attention model is developed for cutting tool remaining life prognosis, which achieves the best accuracy of 98.51% and average of 97.62%. In addition, optimal laser shock peening parameters are selected using a deep neural network and enhanced ternary bees algorithm. (2) A time-series generative adversarial network is used for data augmentation, which increases data quantity for TCN model training. (3) Data quality is evaluated using the t-distributed stochastic neighbor embedding, Fréchet inception distance, and root mean squared error to ensure similarity between real and generated data. (4) The effectiveness of the remanufacturing approach is validated with a 28.95% and 30.77% increase in tool life based on finite element analysis and experimental testing, respectively. This comprehensive approach contributes to enhancing tool life prediction accuracy and optimizing sustainable remanufacturing processes, thereby enhancing production efficiency and reducing waste in machining operations.


NeurIPS Conference 2025 Conference Paper

DriveDPO: Policy Learning via Safety DPO For End-to-End Autonomous Driving

Shuyao Shang
Yuntao Chen
Yuqi Wang
Yingyan Li
Zhao-Xiang Zhang

End-to-end autonomous driving has substantially progressed by directly predicting future trajectories from raw perception inputs, which bypasses traditional modular pipelines. However, mainstream methods trained via imitation learning suffer from critical safety limitations, as they fail to distinguish between trajectories that appear human-like but are potentially unsafe. Some recent approaches attempt to address this by regressing multiple rule-driven scores but decoupling supervision from policy optimization, resulting in suboptimal performance. To tackle these challenges, we propose DriveDPO, a Safety Direct Preference Optimization Policy Learning framework. First, we distill a unified policy distribution from human imitation similarity and rule-based safety scores for direct policy optimization. Further, we introduce an iterative Direct Preference Optimization stage formulated as trajectory-level preference alignment. Extensive experiments on the NAVSIM benchmark demonstrate that DriveDPO achieves a new state-of-the-art PDMS of 90.0. Furthermore, qualitative results across diverse challenging scenarios highlight DriveDPO’s ability to produce safer and more reliable driving behaviors.
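
For orientation, the generic Direct Preference Optimization loss for a single preferred/rejected pair can be sketched as below. This is the standard DPO objective, not DriveDPO's trajectory-level formulation or its iterative stage, and all argument names (the log-probabilities under the trained policy and a frozen reference policy, and the temperature `beta`) are assumptions made for illustration.

```python
import math


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair.

    loss = -log(sigmoid(beta * ((logp_chosen - ref_logp_chosen)
                                - (logp_rejected - ref_logp_rejected))))
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss equals log 2 when the policy and reference agree, and it shrinks as the policy raises the preferred sample's likelihood relative to the rejected one.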


EAAI Journal 2025 Journal Article

LCRTR-Net: A low-cost real-time recognition network for rail corrugation in railway transportation

Xueyang Tang
Xiaopei Cai
Yuqi Wang
Yue Hou

Rail corrugation has a significant impact on the safety of high-speed railway operations, making its identification particularly important. Traditional manual inspection methods are infeasible for large-scale identification within limited time frames, while existing methods based on machine vision or axle box acceleration face challenges such as high costs, complex equipment installation and maintenance, as well as difficulties in achieving real-time performance. To address these challenges, this study proposes an innovative low-cost real-time recognition network (LCRTR-Net), which utilizes accelerometers installed on the underside of the train body and combines wavelet packet decomposition with dilated causal convolution in a residual neural network. Specifically, the approach first extracts the latent features of train body acceleration caused by rail corrugation through wavelet packet decomposition and reconstruction. Next, dilated causal convolution is employed to capture the temporal causal relationships and long-term dependencies of these latent features. Finally, the integration of residual connections further enhances the feature extraction performance and computational efficiency of LCRTR-Net. Experimental results demonstrate that LCRTR-Net exhibits significant generalization ability and real-time performance, achieving an average recognition accuracy exceeding 97.0%, with a recognition time of only 0.17 ms per rail corrugation sample, significantly outperforming existing rail corrugation recognition methods. This indicates that LCRTR-Net has broad application potential in practical railway operations. Future research directions will focus on unsupervised or few-shot learning algorithms and multi-sensor integration to further improve recognition accuracy and real-time performance, promoting the practical application of this technology.
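
The dilated causal convolution named in this abstract can be illustrated with a single-channel sketch in plain Python. This is not the LCRTR-Net layer: the function name, the zero padding on the left, and the toy weights are assumptions, and the wavelet packet and residual parts are omitted.

```python
def dilated_causal_conv1d(x, weights, dilation=1):
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation].

    Samples before t = 0 are treated as zero, so y[t] never depends on
    future inputs (causality), and the dilation widens the receptive field.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y
```

With `weights=[1, 1]` and `dilation=2`, each output is the current sample plus the sample two steps earlier, so stacking layers with growing dilation covers long histories with few parameters.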


NeurIPS Conference 2025 Conference Paper

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Jorge (Zhoujun) Cheng
Shibo Hao
Tianyang Liu
Fan Zhou
Yutao Xie
Feng Yao
Yuexin Bian
Nilabjo Dey

Reinforcement learning (RL) has shown promise in enhancing large language model (LLM) reasoning, yet progress towards broader capabilities is limited by the availability of high-quality, multi-domain datasets. This work introduces Guru, a 92K RL-for-reasoning dataset designed to address this gap, covering six reasoning domains: Math, Code, Science, Logic, Simulation, and Tabular, each with corresponding verifiers. We build Guru via a careful data-curation pipeline, including sourcing, deduplication, reward design, and domain-specific and difficulty-based filtering, to facilitate the systematic investigation of cross-domain RL generalization. Our study using Guru suggests the efficacy of a simple mixed-domain RL training approach and reveals several key aspects affecting cross-domain transferability. We further train two models, Guru-7B and Guru-32B, purely with RL on our curated data and observe largely improved performance over leading open RL reasoning model baselines, with gains of 7.3% and 7.8% respectively on an extensive 17-task, six-domain evaluation suite. We are releasing our dataset, code, and evaluation suite to the community, aiming to support further research and development of more general RL-enhanced reasoning models.


IROS Conference 2025 Conference Paper

Teacher Motion Priors: Enhancing Robot Locomotion over Challenging Terrain

Fangcheng Jin
Yuqi Wang
Peixin Ma
Guodong Yang
Pan Zhao
En Li
Zhengtao Zhang

Achieving robust locomotion on complex terrains remains a challenge due to high-dimensional control and environmental uncertainties. This paper introduces a teacher-prior framework based on the teacher-student paradigm, integrating imitation and auxiliary task learning to improve learning efficiency and generalization. Unlike traditional paradigms that strongly rely on encoder-based state embeddings, our framework decouples the network design, simplifying the policy network and deployment. A high-performance teacher policy is first trained using privileged information to acquire generalizable motion skills. The teacher’s motion distribution is transferred to the student policy, which relies only on noisy proprioceptive data, via a generative adversarial mechanism to mitigate performance degradation caused by distributional shifts. Additionally, auxiliary task learning enhances the student policy’s feature representation, speeding up convergence and improving adaptability to varying terrains. The framework is validated on a humanoid robot, showing a great improvement in locomotion stability on dynamic terrains and significant reductions in development costs. This work provides a practical solution for deploying robust locomotion strategies in humanoid robots.


TIST Journal 2024 Journal Article

Biomedical Information Retrieval with Positive-Unlabeled Learning and Knowledge Graphs

Yuqi Wang
Qiuyi Chen
Haiyang Zhang
Wei Wang
Qiufeng Wang
Yushan Pan
Liangru Xie
Kaizhu Huang

The rapid growth of biomedical publications has presented significant challenges in the field of information retrieval. Most existing work focuses on document retrieval given explicit queries. However, in real applications such as curated biomedical database maintenance, explicit queries are missing. In this paper, we propose a two-step model for biomedical information retrieval in the case that only a small set of example documents is available without explicit queries. Initially, we extract keywords from the observed documents using large pre-trained language models and biomedical knowledge graphs. These keywords are then enriched with domain-specific entities. Information retrieval techniques can subsequently use the collected entities to rank the documents. Following this, we introduce an iterative Positive-Unlabeled learning method to classify all unlabeled documents. Experiments conducted on the PubMed dataset demonstrate that the proposed technique outperforms the state-of-the-art positive-unlabeled learning methods. The results underscore the effectiveness of integrating large language models and biomedical knowledge graphs in improving zero-shot information retrieval performance in the biomedical domain.


NeurIPS Conference 2024 Conference Paper

DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model

Yuqi Wang
Ke Cheng
Jiawei He
Qitai Wang
Hengchen Dai
Yuntao Chen
Fei Xia
Zhaoxiang Zhang

Driving world models have gained increasing attention due to their ability to model complex physical dynamics. However, their superb modeling capability is yet to be fully unleashed due to the limited video diversity in current driving datasets. We introduce DrivingDojo, the first dataset tailor-made for training interactive world models with complex driving dynamics. Our dataset features video clips with a complete set of driving maneuvers, diverse multi-agent interplay, and rich open-world driving knowledge, laying a stepping stone for future world model development. We further define an action instruction following (AIF) benchmark for world models and demonstrate the superiority of the proposed dataset for generating action-controlled future predictions.


YNIMG Journal 2024 Journal Article

Mapping morphological cortical networks with joint probability distributions from multiple morphological features

Yuqi Wang
Junle Li
Suhui Jin
Jing Wang
Yating Lv
Qihong Zou
Jinhui Wang

Morphological features sourced from structural magnetic resonance imaging can be used to infer human brain connectivity. Although integrating different morphological features may theoretically be beneficial for obtaining more precise morphological connectivity networks (MCNs), the empirical evidence to support this supposition is scarce. Moreover, the incorporation of different morphological features remains an open question. In this study, we proposed a method to construct cortical MCNs based on multiple morphological features. Specifically, we adopted a multi-dimensional kernel density estimation algorithm to fit regional joint probability distributions (PDs) from different combinations of four morphological features, and estimated inter-regional similarity in the joint PDs via Jensen-Shannon divergence. We evaluated the method by comparing the resultant MCNs with those built based on different single morphological features in terms of topological organization, test-retest reliability, biological plausibility, and behavioral and cognitive relevance. We found that, compared to MCNs built based on different single morphological features, MCNs derived from multiple morphological features displayed less segregated, but more integrated network architecture and different hubs, had higher test-retest reliability, encompassed larger proportions of inter-hemispheric edges and edges between brain regions within the same cytoarchitectonic class, and explained more inter-individual variance in behavior and cognition. These findings were largely reproducible when different brain atlases were used for cortical parcellation. Further analysis of macaque MCNs revealed weak, but significant correlations with axonal connectivity from tract-tracing, independent of the number of morphological features. Altogether, this paper proposes a new method for integrating different morphological features, which will be beneficial for constructing MCNs.
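
The inter-regional similarity step in this abstract rests on the Jensen-Shannon divergence between two probability distributions. A minimal sketch follows, assuming the joint PDs have already been discretized into probability vectors on a shared grid. The function name and inputs are illustrative, the kernel density estimation step is omitted, and how the paper converts divergence into a similarity value is not shown.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Mixture distribution M = (P + Q) / 2.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; terms with a_i == 0 contribute nothing.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

With base-2 logarithms the divergence is symmetric and bounded between 0 (identical distributions) and 1 (non-overlapping distributions), which makes it convenient as an edge weight between brain regions.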


NeurIPS Conference 2022 Conference Paper

4D Unsupervised Object Discovery

Yuqi Wang
Yuntao Chen
Zhao-Xiang Zhang

Object discovery is a core task in computer vision. While rapid progress has been made in supervised object detection, its unsupervised counterpart remains largely unexplored. With the growth of data volume, the high cost of annotation is the major limitation hindering further study. Therefore, discovering objects without annotations has great significance. However, this task seems impractical on still images or point clouds alone due to the lack of discriminative information. Previous studies overlook the crucial temporal information and constraints naturally behind multi-modal inputs. In this paper, we propose 4D unsupervised object discovery, jointly discovering objects from 4D data -- 3D point clouds and 2D RGB images with temporal information. We present the first practical approach for this task by proposing a ClusterNet on 3D point clouds, which is jointly and iteratively optimized with a 2D localization network. Extensive experiments on the large-scale Waymo Open Dataset suggest that the localization network and ClusterNet achieve competitive performance on both class-agnostic 2D object detection and 3D instance segmentation, bridging the gap between unsupervised methods and fully supervised ones. Codes and models will be made available at https://github.com/Robertwyq/LSMOL.


JBHI Journal 2022 Journal Article

MDADP: A Webserver Integrating Database and Prediction Tools for Microbe-Disease Associations

Lei Wang
Hao Li
Yuqi Wang
Yihong Tan
Zhiping Chen
Tingrui Pei
Quan Zou

More and more evidence has demonstrated that microbiota play important roles in the life processes of the human body. In recent years, various computational methods have been proposed for identifying potentially disease-associated microbes to save costs in traditional biological experiments. However, the prediction performance of these methods is generally limited by outdated and incomplete datasets. Moreover, few studies so far provide visual predictive tools for inferring possible microbe-disease associations (MDAs). Hence, in this manuscript, a novel webserver called MDADP is proposed to identify latent MDAs, in which a new MDA database and interactive prediction tools for MDA studies are designed together. In the newly constructed MDA database, 2019 known MDAs between 58 diseases and 703 microbes were first manually collected. Then, by adopting the average ranking method and the co-confidence method respectively, eight representative computational models were integrated to identify potential disease-related microbes. As a result, MDADP provides not only interactive features for users to access and capture MDA entities, but also effective tools for users to identify candidate microbes for different diseases. To our knowledge, MDADP is the first online platform that incorporates a new MDA database with comprehensive MDA prediction tools. Therefore, we believe that it will be a valuable source of information for research in microbiology and disease-related fields. MDADP can be accessed at http://mdadp.leelab2997.cn.


IJCAI Conference 2019 Conference Paper

Metadata-driven Task Relation Discovery for Multi-task Learning

Zimu Zheng
Yuqi Wang
Quanyu Dai
Huadi Zheng
Dan Wang

Task Relation Discovery (TRD), i.e., revealing the relations among tasks, has notable value: it is the key concept underlying Multi-task Learning (MTL) and provides a principled way of identifying redundancies across tasks. However, task relations are usually determined manually by data scientists, which requires additional human effort, while transfer based on brute-force methods or on training samples alone may cause negative effects that degrade learning performance. To avoid negative transfer automatically, our idea is to leverage the context attributes commonly available in today's systems, i.e., the metadata. In this paper, we introduce metadata into TRD for MTL for the first time and propose a novel Metadata Clustering method, which jointly uses historical samples and additional metadata to automatically exploit the true relatedness. It also avoids negative transfer by identifying reusable samples between related tasks. Experimental results on five real-world datasets demonstrate that the proposed method is effective for MTL with TRD, and particularly useful in complicated systems with diverse metadata but insufficient data samples. In general, this study helps in automatic relation discovery among partially related tasks and sheds new light on the development of TRD in MTL through the use of metadata as a priori information.


AAAI Conference 2016 Conference Paper

Active Learning with Cross-Class Knowledge Transfer

Yuchen Guo
Guiguang Ding
Yuqi Wang
Xiaoming Jin

When there are insufficient labeled samples for training a supervised model, we can adopt active learning to select the most informative samples for human labeling, or transfer learning to transfer knowledge from a related labeled data source. Combining transfer learning with active learning has attracted much research interest in recent years. Most existing works follow the setting where the class labels in the source domain are the same as those in the target domain. In this paper, we focus on a more challenging cross-class setting where the class labels are totally different in the two domains but related to each other in an intermediary attribute space, which has barely been investigated before. We propose a novel and effective method that utilizes the attribute representation as the seed parameters to generate the classification models for classes. We also propose a joint learning framework that takes into account the knowledge from the related classes in the source domain and the information in the target domain. Besides, it is simple to perform uncertainty sampling, a fundamental technique for active learning, based on the framework. We conduct experiments on three benchmark datasets and the results demonstrate the efficacy of the proposed method.
