Author name cluster

Weidong Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

13 papers

2 author rows

EAAI Journal 2026 Journal Article

Automated crack measurement in slab tracks using deformable instance segmentation and boundary augmentation with unsupervised style transfer

Wenbo Hu
Zheng Wu
Weidong Wang
Xianhua Liu
Jun Peng

Crack detection and measurement in slab tracks are critical for maintenance decision-making. Pre-trained deep learning segmentation models often struggle with cracking instances due to domain adaptation and data scarcity. This study proposes an instance segmentation framework incorporating dynamic snake convolution (DSConv) modules, combined with an unsupervised style transfer-based boundary augmentation strategy. The DSConv-enhanced architecture prioritizes linear crack features in cluttered backgrounds, while the augmentation introduces controlled perturbations to global pixels and local crack boundaries, generating structurally consistent, diversified training samples. The results demonstrate that the deformable DSConv-enhanced architecture achieves the best mean average precision (mAP), improving segmentation performance by nearly 13% compared to a fine-tuned Segment Anything Model. Its segmentation capability surpasses eight state-of-the-art models, especially for multiple intermittent microcracks. Furthermore, augmented data generated by unsupervised style transfer enhances crack instance segmentation performance by 10% compared to non-augmented baselines, surpassing conventional methods including horizontal flipping and color jittering. Quantitative crack width distributions from segmentation-quantification analysis provide more comprehensive structural health insights than manual discrete-point measurements, facilitating precise maintenance decisions for railway infrastructure.

AAAI Conference 2026 Conference Paper

E³SAM2: Entropy-Aware and Edge-Guided Adaptation of SAM2 for Echocardiography Video Segmentation

Long Zheng
Zhi Li
Weidong Wang
Zhenyu Dai
Shuyun Li

Foundation segmentation models, such as SAM and its video-oriented variant SAM2, have achieved remarkable success in natural image and video segmentation. However, their direct application to echocardiography video is challenged by structural uncertainty arising from severe speckle noise and blurry anatomical boundaries. To address this, we propose E³SAM2, a lightweight adaptation framework that introduces a novel entropy-based methodology to explicitly model and mitigate such uncertainty. Specifically, an entropy-guided attention mechanism is introduced to steer the model’s focus toward structurally reliable features, particularly in speckle-dominated regions. Additionally, an entropy regularization loss is introduced to further enhance target-background discrimination. To better resolve indistinct anatomical contours, an edge-aware supervision module is incorporated to inject explicit boundary priors for sharper delineation. These components are efficiently integrated through a global-local feature adapter. Experiments on CAMUS and EchoNet-Dynamic datasets demonstrate that E³SAM2 achieves state-of-the-art segmentation and clinical estimation performance, while maintaining high computational efficiency.

EAAI Journal 2026 Journal Article

Knowledge graph-based operation and maintenance risk analysis and early warning approach for railway traction power supply systems

Shi Qiu
Xiaojian Li
Yongjun Chen
Weidong Wang
Jin Wang
Runan Cheng
Qasim Zaheer

The railway traction power supply system (RTPSS) is a critical component in the operation of electrified railways. However, as the network expands and maintenance cycles lengthen, it faces increasing operational risk. To enhance the accuracy of risk management and the timeliness of decision-making, this paper presents a risk analysis framework for the operation and maintenance (O&M) of RTPSS based on knowledge graph technology. First, natural language processing (NLP) techniques are applied to massive fault data to construct a systematic model that comprehensively represents multi-risk coupling mechanisms and cross-system cascade failures. Next, a method for evaluating the early warning levels of risk events is proposed; it integrates multidimensional data, assessing early warning levels from risk probability, risk loss data, and network topology data. Finally, the study outlines the process of mapping the early warning levels of RTPSS O&M risk onto knowledge graphs by dynamically integrating physical data with graph-based approaches. This approach enables maintenance personnel to quickly identify and comprehend the operational status of the RTPSS. Case study results demonstrate that the proposed method significantly enhances systematization, comprehensiveness, and observability, providing a more accurate and holistic tool for managing RTPSS O&M risk.

EAAI Journal 2026 Journal Article

Multimodal graph neural network framework for railway fastener tightness assessment from high-resolution point clouds

Qasim Zaheer
S Muhammad Ahmed Hassan Shah
Weidong Wang
Haleema Ehsan
Chengbo Ai
Jin Wang
Shi Qiu

Railway fasteners are critical to the safety and structural integrity of railway infrastructure, yet conventional tightness assessment methods based on manual inspection are labor-intensive, subjective, and difficult to scale. This paper presents a hybrid dual-phase framework for automated fastener tightness estimation that integrates multimodal self-supervised contrastive learning with graph-based feature analysis. The framework exploits complementary information from images, depth maps, point clouds, and mesh representations to learn mechanically meaningful features without requiring manual annotations. Experimental results demonstrate stable cross-modal feature alignment, with cosine similarity remaining consistent across configurations, and show that incorporating three-dimensional geometric information increases representational richness, as reflected by higher feature-space separation. Inference-time analysis indicates that multimodal feature extraction requires approximately 3 s for image-depth processing and 2 s for graph-based three-dimensional processing per fastener, supporting practical deployment through offline or batch-based operation. Overall, the proposed framework provides a robust and physically interpretable approach for railway fastener tightness monitoring and establishes a foundation for scalable intelligent maintenance systems.

EAAI Journal 2025 Journal Article

A strongly supervised hyperspectral unmixing framework for precise mineral composition and coal ash content estimation

Yao Cui
Ziqi Lv
Ying Gao
Yuxin Wu
Xuan Zhao
Qingxuan Meng
Jun Dong
Zhiqiang Xu

Accurate coal ash content detection is essential for advancing intelligent clean coal processing and holds significant practical value across mining, washing, combustion, and conversion technologies. This paper introduces a strongly supervised hyperspectral unmixing (SSHU) framework designed to estimate mineral composition proportions and ash content. We conducted systematic ablation experiments on concentrated coal and tailings coal datasets to evaluate the method's effectiveness and analyze the mechanisms of proportional prior information and reconstruction decoders. Results demonstrate that proportional prior information effectively constrains the proportional encoder, making estimated mineral and pure coal distributions closer to actual material distributions. The reconstruction decoder enhances the proportional encoder's feature extraction ability, guides model convergence, and improves both proportion and ash content estimation accuracy. Compared to existing hyperspectral unmixing methods, our approach incorporates pure substance spectral information during model training and combines proportional prior constraints. This provides a robust solution for complex mixture analysis and demonstrates significant potential in hyperspectral unmixing applications.

EAAI Journal 2025 Journal Article

Adaptive adversarial pattern contrast algorithm for black-box model and domain attack

Weidong Wang
Yi Wang
Zhi Li
Long Zheng
Li Zhang

The transferability of adversarial attacks in deep neural networks (DNNs) is a significant challenge, especially for achieving effective attacks across models and data domains. Unfortunately, existing attack approaches primarily focus on cross-model transferability, often overlooking the potential for black-box attacks across diverse data domains. This paper proposes the Adaptive adversarial Pattern Contrast (APEC) algorithm, designed to achieve cross-model and domain adversarial attacks with high transferability. Firstly, APEC generates transferable adversarial examples by leveraging spatial characteristics such as regional homogeneity, repetition, and density, thereby increasing classifier misclassification rates. Secondly, a key innovation in APEC is the similarity contrast loss inspired by contrastive learning. It guides the model to learn discriminative adversarial features by aligning adversarial examples with adversarial patterns and distancing them from clean examples. Importantly, this optimization is performed label-free, enhancing APEC’s practicality in real-world black-box scenarios. Additionally, we introduce a Gaussian low-pass filter in APEC to generate adversarial perturbation patterns adaptively. This operation suppresses high-frequency information while preserving the low-frequency characteristics of natural examples, enhancing APEC’s attack capabilities. The APEC algorithm shows relative improvement across models and data domains compared to state-of-the-art transferability attacks. Our code is available at https://github.com/cs-igps/APEC-TransferAttack.

ICML Conference 2025 Conference Paper

EPIC: Efficient Position-Independent Caching for Serving Large Language Models

Junhao Hu
Wenrui Huang
Weidong Wang
Haoyi Wang
Tiancheng Hu
Qin Zhang
Hao Feng
Xusheng Chen

Large Language Models (LLMs) show great capabilities in a wide range of applications, but serving them efficiently becomes increasingly challenging as requests (prompts) become more complex. Context caching improves serving performance by reusing Key-Value (KV) vectors, the intermediate representations of tokens that are repeated across requests. However, existing context caching requires exact prefix matches across requests, limiting reuse cases in settings such as few-shot learning and retrieval-augmented generation, where immutable content (e.g., documents) remains unchanged across requests but is preceded by varying prefixes. Position-Independent Caching (PIC) addresses this issue by enabling modular reuse of the KV vectors regardless of prefixes. We formalize PIC and advance prior work by introducing EPIC, a serving system incorporating our new LegoLink algorithm, which mitigates the inappropriate “attention sink” effect at every document beginning, to maintain accuracy with minimal computation. Experiments show that EPIC achieves up to 8× improvements in Time-To-First-Token (TTFT) and 7× throughput gains over existing systems, with negligible or no accuracy loss.

EAAI Journal 2025 Journal Article

Learning-centered multi-modal federated learning complimented with filter rotation for clinical image diagnosis

Isaac Adjei-Mensah
Qingxian Wang
Isaac Osei Agyemang
Adu Asare Baffour
Sophyani Banaamwini Yussif
Collins Sey
Linda Delali Fiasam
Ijeoma Amuche Chikwendu

The demand for accurate and efficient diagnostic tools in medical imaging, highlighted by the COronaVIrus Disease of 2019 (COVID-19) pandemic, has emphasized the need for robust computer-aided diagnostic systems. However, the challenge of privacy preservation in data sharing across institutions remains a hurdle. To address this, this study introduces Federated Network (FedNet), a federated learning (FL)-based framework designed to facilitate decentralized training on medical data while preserving privacy. FedNet incorporates innovative methodologies: a filter rotation technique to enhance feature extraction by capturing image features at varying orientations, and Dark-mixer image augmentation to improve image contrast and feature visibility. To improve learning efficiency, a novel learning-centered multi-modal learning (L-CML) approach is introduced, focusing on learning of modalities rather than traditional data modalities. These techniques are implemented within an FL setup, using the PySyft encryption and decryption technique for privacy protection. The objective of FedNet is to improve accuracy and efficiency in medical imaging through decentralized and privacy-preserving learning. This study aims to demonstrate the adaptability of FedNet to diverse medical imaging applications. Through extensive experiments, FedNet demonstrated promising results, achieving training and validation accuracies of 96.47% and 95.50%, respectively. FedNet outperformed frameworks such as Federated Averaging (FedAvg), Federated Proximal (FedProx), and Federated Batch Normalization (FedBN) in key metrics, such as precision, recall, and computational efficiency, attaining accuracies of 94.01%, 93.50%, and 95.92%, respectively. FedNet's filter rotation, Dark-mixer, and L-CML integration enhanced feature extraction and model performance, offering a promising solution for advancing medical image analysis toward accurate, efficient, and privacy-preserving diagnostics.

EAAI Journal 2024 Journal Article

A high-confidence instance boundary regression approach and its application in coal-gangue separation

Ziqi Lv
Weidong Wang
Kanghui Zhang
Rui Tian
Yonghan Lv
Meijie Sun
Zhiqiang Xu

EAAI Journal 2024 Journal Article

Automated detection and quantification of pavement cracking around manhole

Jun Peng
Weidong Wang
Wenbo Hu
Chengbo Ai
Xinyue Xu
Youyin Shi
Jin Wang
Zhifa Ran

EAAI Journal 2024 Journal Article

GFNet: A pioneering approach for precisely estimating ash content in coal through the fusion of graph convolution and feedforward network

Kanghui Zhang
Weidong Wang
Yao Cui
Ziqi Lv
Yuhan Fan

EAAI Journal 2021 Journal Article

Computer vision detection of foreign objects in coal processing using attention CNN

Kanghui Zhang
Weidong Wang
Ziqi Lv
Yuhan Fan
Yang Song

IROS Conference 2021 Conference Paper

DSVP: Dual-Stage Viewpoint Planner for Rapid Exploration by Dynamic Expansion

Hongbiao Zhu
Chao Cao
Yukun Xia
Sebastian A. Scherer
Ji Zhang
Weidong Wang

We present a method for efficiently exploring highly convoluted environments. The method incorporates two planning stages: an exploration stage for extending the boundary of the map, and a relocation stage for explicitly transiting the robot to different sub-areas in the environment. The exploration stage develops a local Rapidly-exploring Random Tree (RRT) in the free space of the environment, and the relocation stage maintains a global graph through the mapped environment; both are dynamically expanded over replanning steps. The method is compared to existing state-of-the-art methods in various challenging simulation and real environments. Experimental comparisons show that our method explores spaces twice as efficiently as existing methods while using less processing. Further, we release a benchmark environment to evaluate exploration algorithms and to facilitate the development of autonomous navigation systems. The benchmark environment and our method are open-sourced.
