Arrow Research search

Author name cluster

Jie Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

98 papers
2 author rows

Possible papers (98)

TAAS Journal 2026 Journal Article

A Novel Physics-Informed Federated Learning Framework for Robust Bearing Fault Diagnosis

  • Jiaqi Chen
  • Jie Wang
  • Yongquan Jiang
  • ZhengHong Wang
  • Fan Zhang
  • Yan Yang

Rolling bearing failures are a primary cause of catastrophic machinery breakdowns, posing significant economic and safety risks. Effective fault diagnosis is frequently hindered by challenges inherent to modern industrial settings, including data privacy constraints, statistical heterogeneity across Non-Independent and Identically Distributed (Non-IID) datasets, and the prevalence of few-shot learning scenarios. To address these challenges, this paper introduces CARR-MgNet, a novel physics-informed federated learning framework. The framework utilizes a Multi-granularity fusion Network (MgNet) backbone, which enhances feature robustness by embedding physical fault characteristics directly into its convolutional kernels. To ensure stable federated training across heterogeneous clients, we then introduce a Class-Average Representation Regularization (CARR) mechanism to effectively mitigate client drift. Extensive experiments on four public industrial datasets validate the state-of-the-art performance of our proposed framework. Under challenging non-IID conditions, CARR-MgNet surpasses established baselines, including FedProx and MOON, by up to 8.2% in accuracy. Furthermore, it reduces the number of communication rounds required to reach 95% accuracy by 40% compared to FedAvg and reduces total communication overhead by 35%. These results demonstrate that our physics-informed federated approach provides a robust, communication-efficient, and privacy-preserving solution for real-world industrial fault diagnosis.
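The abstract does not spell out the CARR formulation; as a rough numpy sketch of the general idea only, a class-average representation penalty might compare each class's local mean feature against a server-provided global class mean (the function name, penalty form, and weight below are assumptions, not the paper's definition):

```python
import numpy as np

def carr_penalty(features, labels, global_class_means, weight=0.1):
    """Illustrative class-average representation regularizer.

    Penalizes the squared distance between each class's local mean
    feature and the global class mean supplied by the server, which
    discourages client drift under non-IID data.
    """
    penalty = 0.0
    for cls, global_mean in global_class_means.items():
        mask = labels == cls
        if mask.any():
            local_mean = features[mask].mean(axis=0)
            penalty += np.sum((local_mean - global_mean) ** 2)
    return weight * penalty
```

In a federated round, each client would add this term to its local task loss before sending model updates back to the server.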

EAAI Journal 2026 Journal Article

A short-term water demand forecasting method integrating wavelet stepwise decomposition and spatial-temporal features

  • Chenlei Xie
  • Jie Wang
  • Tao Chen
  • Qiansheng Fang
  • Shanshou Li
  • Xuelei Yang

Accurate short-term water demand forecasting is crucial for the management and scheduling of water distribution systems. However, existing decomposition-based prediction models face two major challenges: prevalent data leakage during global decomposition, which distorts model evaluation, and the inherent shift-variance in methods designed to avoid leakage, resulting in poor forecasting accuracy. To address these issues, this paper proposes an innovative forecasting framework integrating Wavelet stepwise decomposition (WSD) with spatial-temporal features. The core contributions of this work are threefold: First, the proposed WSD method employs a fixed-length sliding window for decomposition, fundamentally eliminating data leakage. Second, correlation analysis is introduced to optimize the selection of the mother wavelet, thereby minimizing errors caused by shift-variance. Third, a hybrid prediction model is constructed, where Extreme gradient boosting (XGBoost) fits the stable trends of low-frequency subseries, and an inverted Transformer (iTransformer) captures the dynamic dependencies within multi-dimensional spatial-temporal features of high-frequency subseries, significantly enhancing their prediction accuracy. Experimental results on a real-world water distribution network (WDN) demonstrate that the proposed method outperforms benchmark models, including Long short-term memory (LSTM) and graph-based models.
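The anti-leakage idea (decomposing only a fixed-length window of past samples at each time step, so no future value contaminates the features for time t) can be sketched with a one-level Haar split. This is illustrative only: the paper's WSD selects the mother wavelet via correlation analysis, whereas the Haar choice, window length, and function names here are assumptions.

```python
import numpy as np

def haar_step(window):
    """One-level Haar split of a window into low/high-frequency parts."""
    pairs = window[: len(window) // 2 * 2].reshape(-1, 2)
    approx = pairs.sum(axis=1) / np.sqrt(2.0)                # trend
    detail = np.diff(pairs, axis=1).ravel() / np.sqrt(2.0)   # fluctuation
    return approx, detail

def stepwise_decompose(series, window=8):
    """Decompose causally: each step sees only the trailing window,
    so features for time t never depend on samples after t."""
    feats = []
    for t in range(window - 1, len(series)):
        approx, detail = haar_step(series[t - window + 1 : t + 1])
        feats.append((approx[-1], detail[-1]))  # most recent coefficients
    return np.array(feats)
```

A global decomposition of the full series would, by contrast, let boundary handling mix future samples into past coefficients, which is exactly the leakage the stepwise scheme avoids.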

EAAI Journal 2026 Journal Article

Cross-domain attention guided multi-source domain adaptation method for machinery fault diagnosis

  • Jie Wang
  • Jianning Gou
  • Haidong Shao
  • Yiming Xiao
  • Ying Peng
  • Bin Liu

Compared with single-source approaches, multi-source domain adaptation (MSDA) for fault diagnosis integrates complementary information from various domains. This avoids the subjectivity and arbitrariness associated with selecting a single source. However, existing MSDA methods for fault diagnosis typically enforce global distribution alignment between the features of the source and target domains. Such alignment often leads to the loss of discriminative fault features in the target domain, resulting in negative transfer. To address the aforementioned issues, a cross-domain attention guided MSDA model (CDA-MSDA) is proposed in this paper. In this framework, a cross-domain attention module is constructed to dynamically fuse source and target domain features. This module effectively enhances the transfer of task-relevant features in the source domain and preserves discriminative features in the target domain. Then, a fault knowledge distillation module is developed to guide the feature extractor and classifier in achieving cross-domain fault category alignment. Finally, a multi-model dynamic collaborative decision module is designed. By aggregating prediction results from multiple classifiers, it addresses prediction conflicts arising from the varying reliability of different source domains. Extensive experiments on three benchmark datasets across 16 transfer tasks validate the effectiveness of the proposed method. Specifically, CDA-MSDA achieves an average diagnostic accuracy of 94.99%, outperforming state-of-the-art baselines by 2–10%, demonstrating superior robustness and stability in complex fault diagnosis scenarios.
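At its core, the cross-domain attention idea (target-domain features attending over source-domain features) resembles scaled dot-product attention. A minimal numpy sketch, without the learned query/key/value projections a real module would include, might look like:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_domain_attention(target_feats, source_feats):
    """Fuse domains: each target sample attends over source samples
    via scaled dot-product attention, pulling in task-relevant source
    information weighted by feature similarity (illustrative sketch)."""
    d = target_feats.shape[-1]
    attn = softmax(target_feats @ source_feats.T / np.sqrt(d))
    return attn @ source_feats  # source knowledge weighted per target row
```

The attention weights concentrate on source samples whose features resemble the target sample, which is one way to transfer relevant source features without forcing a global distribution alignment.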

EAAI Journal 2026 Journal Article

Dual-stage interpretable domain generalization fault diagnosis: integrating prior knowledge and gradient-weighted class activation mapping

  • Ying Peng
  • Haidong Shao
  • Yiming Xiao
  • Jie Wang
  • Bin Liu

Recent advancements in domain generalization methods for fault diagnosis have achieved excellent performance. However, their inherent black-box characteristics seriously hinder practical deployment in critical industrial scenarios. In addition, current cross-domain interpretability research often focuses on a single stage, resulting in an incomplete and unreliable understanding of model behavior. To overcome the above bottlenecks, this article proposes a dual-stage interpretable domain generalization fault diagnosis framework. In the first stage, a prior knowledge-guided feature extractor is constructed to extract steady-state and transient features from low- and high-frequency directions, thereby improving the model's ante-hoc interpretability. In the second stage, gradient-weighted class activation mapping is employed to visualize the class activation maps, revealing the attention regions during signal processing and enabling post-hoc interpretability analysis. The proposed method is validated using two distinct gearbox datasets, demonstrating superior performance in diagnostic accuracy and model interpretability compared to conventional domain generalization fault diagnosis approaches. In addition, the prior knowledge-guided feature extractor proves effective when integrated into other domain generalization models, and gradient-weighted class activation mapping proves to be a valuable tool for post-hoc interpretability assessment in the field of domain generalization fault diagnosis.

AAAI Conference 2026 Conference Paper

Mitigating Hallucinations in Large Language Models via Causal Reasoning

  • Yuangang Li
  • Yiqing Shen
  • Yi Nian
  • Jiechao Gao
  • Ziyi Wang
  • Chenxiao Yu
  • Li Li
  • Jie Wang

Large language models (LLMs) exhibit logically inconsistent hallucinations that appear coherent yet violate reasoning principles, with recent research suggesting an inverse relationship between causal reasoning capabilities and such hallucinations. However, existing reasoning approaches in LLMs, such as Chain-of-Thought (CoT) and its graph-based variants, operate at the linguistic token level rather than modeling the underlying causal relationships between variables, lacking the ability to represent conditional independencies or satisfy causal identification assumptions. To bridge this gap, we introduce causal-DAG construction and reasoning (CDCR-SFT), a supervised fine-tuning framework that trains LLMs to explicitly construct a variable-level directed acyclic graph (DAG) and then perform reasoning over it. Moreover, we present a dataset comprising 25,368 samples (CausalDR), where each sample includes an input question, explicit causal DAG, graph-based reasoning trace, and validated answer. Experiments on four LLMs across eight tasks show that CDCR-SFT improves the causal reasoning capability with the state-of-the-art 95.33% accuracy on CLADDER (surpassing human performance of 94.8% for the first time) and reduces hallucination on HaluEval by 10%. It demonstrates that explicit causal structure modeling in LLMs can effectively mitigate logical inconsistencies in LLM outputs.

JBHI Journal 2026 Journal Article

RT-SAM: Visual-Prompt Fusion and Uncertainty Enhancement for Nasopharyngeal Carcinoma Radiotherapy Target Delineation

  • Hee Guan Khor
  • Xin Yang
  • Yihua Sun
  • Sijuan Huang
  • Yingni Wang
  • Jie Wang
  • Shaobin Wang
  • Lu Bai

Precise delineation of the clinical target volume (CTV) and nodal CTV (CTV$_{\mathit{nd}}$) is crucial for effective radiotherapy planning in nasopharyngeal carcinoma (NPC). Manual contouring is labor-intensive and subject to substantial inter-observer variability, particularly in regions with complex anatomy and indistinct boundaries. This study presents RT-SAM, a novel framework that adapts the Medical Segment Anything Model 2 (MedSAM-2) for automated CTV (i.e., primary CTV and CTV$_{\mathit{nd}}$) contouring in NPC computed tomography (CT) images. The framework synergistically integrates a generalist foundation model (MedSAM-2) with a domain-specific specialist network (2D U-Net) through three principal contributions: (1) automated generation of multi-modal prompts—comprising mask, bounding box, and point representations—derived from specialist network predictions to guide the generalist model; (2) a Visual-Prompt Fusion Attention (ViPFA) mechanism that optimizes feature-prompt interactions through bidirectional cross-modal attention; and (3) an Uncertainty-Enhanced Prediction Adjustment (UEPA) mechanism that enhances model robustness via confidence-based refinement and selective domain adaptation. Comprehensive evaluation on a multi-center cohort of 256 clinical NPC cases from Sun Yat-sen University Cancer Center and 212 public NPC cases from the SegRap2025 lymph node CTV dataset using 5-fold cross-validation demonstrates that RT-SAM achieves a mean DICE coefficient of 0.796 $\pm$ 0.033 (mean $\pm$ standard deviation), significantly outperforming current state-of-the-art methods. Clinical validation by eight radiation oncologists demonstrates that RT-SAM contours are clinically indistinguishable from expert delineations in blinded Turing assessments, achieve superior quality ratings in 75% of comparisons with mean scores of 2.73 for RT-SAM versus 2.66 for manual expert contours, and attain clinically acceptable ratings in over 97% of cases. These results demonstrate that RT-SAM is a clinically feasible solution for automated CTV contouring, with strong potential to standardize treatment planning and mitigate inter-observer variability in NPC radiotherapy.

AAAI Conference 2026 Conference Paper

S2-Boost: Synergistic Semantic Boosting for Coarse-to-Fine Ensemble Learning

  • Guanxiong He
  • Zheng Wang
  • Jie Wang
  • Liaoyuan Tang
  • Rong Wang
  • Feiping Nie

Neuroscientific evidence reveals that human visual recognition is not an instantaneous event but a hierarchical process, where the brain constructs a holistic perception by progressively integrating simple features like edges or texture into complex scenes. Ensemble learning successfully utilizes this principle, yet existing methods typically integrate models at the decision level, neglecting the rich, complementary information within the feature space itself and thus fundamentally limiting their potential. To address this, we introduce Synergistic Semantic Boosting (S2-Boosting), a framework that employs a self-supervised hierarchical semantic learning module to decompose an image into complementary, semantically meaningful parts autonomously. These parts guide a boosting procedure where a sequence of specialized learners, each focusing on a specific semantic partition, collaboratively corrects the ensemble's errors. We further present encouraging results on real-world image datasets, highlighting the framework's intrinsic interpretability and paving the way for more robust and transparent models.

AAAI Conference 2026 Conference Paper

Towards Federated Clustering: A Client-wise Private Graph Aggregation Framework

  • Guanxiong He
  • Zheng Wang
  • Jie Wang
  • Liaoyuan Tang
  • Rong Wang
  • Feiping Nie

Federated clustering addresses the critical challenge of extracting patterns from decentralized, unlabeled data. However, current approaches are forced into a compromise between performance and privacy: transmitting embedding representations risks sensitive data leakage, while sharing only abstract cluster prototypes leads to diminished model accuracy. To resolve this dilemma, we propose Structural Privacy-Preserving Federated Graph Clustering (SPP-FGC), a novel algorithm that innovatively leverages local structural graphs as the primary medium for privacy-preserving knowledge sharing, thus moving beyond the limitations of conventional techniques. Our framework operates on a clear client–server logic: on the client side, each participant constructs a private structural graph that captures intrinsic data relationships, which the server then securely aggregates and aligns to form a comprehensive global graph from which a unified clustering structure is derived. The framework offers two distinct modes to suit different needs. SPP-FGC is designed as an efficient one-shot method that completes its task in a single communication round, ideal for rapid analysis. For more complex, unstructured data like images, SPP-FGC+ employs an iterative process where clients and the server collaboratively refine feature representations to achieve superior downstream performance. Extensive experiments demonstrate that our framework achieves state-of-the-art performance, improving clustering accuracy by up to 10% (NMI) over federated baselines while maintaining provable privacy guarantees.
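The one-shot mode (clients share structural graphs, the server aggregates them and derives a clustering) can be caricatured in numpy under the simplifying assumption that clients describe a shared node set; the paper's alignment step, privacy machinery, and actual clustering routine are omitted, and the function names are invented for illustration:

```python
import numpy as np

def aggregate_graphs(client_graphs):
    """Server side: average the clients' structural (affinity) graphs
    into one global affinity matrix. Assumes a shared node set, a
    simplification of the paper's secure aggregation and alignment."""
    return np.mean(client_graphs, axis=0)

def fiedler_bipartition(affinity):
    """Derive a two-way clustering from the global graph: the sign of
    the graph Laplacian's second eigenvector (Fiedler vector) splits
    the nodes. A toy stand-in for the unified clustering structure."""
    laplacian = np.diag(affinity.sum(axis=1)) - affinity
    _, vecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    return (vecs[:, 1] > 0).astype(int)
```

Because only pairwise affinities leave each client, raw features or embeddings never reach the server, which is the privacy intuition behind sharing structural graphs.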

AAAI Conference 2025 System Paper

A Multi-Style Chinese Characters Writing Intelligent Tool Based on Small-scale Training Data

  • Zhen Zeng
  • Jie Wang
  • Xi Lyu

Chinese characters are a unique blend of language and art, featuring diverse artistic styles. Mastering these styles requires extensive practice and limits public participation. To encourage broader participation, we developed a real-time, interactive tool that supports multiple Chinese character art styles. This tool uses a diffusion model and several LoRA models to capture the diversity of Chinese character art. It generates personalized, visually striking Chinese character artworks in real-time by utilizing handwritten input, allowing users to adjust various stylistic parameters.

NeurIPS Conference 2025 Conference Paper

Accurate KV Cache Eviction via Anchor Direction Projection for Efficient LLM Inference

  • Zijie Geng
  • Jie Wang
  • Ziqi Liu
  • Feng Ju
  • Yiming Li
  • Xing Li
  • Mingxuan Yuan
  • Jianye Hao

Key-Value (KV) cache eviction---which retains the KV pairs of the most important tokens while discarding less important ones---is a critical technique for optimizing both memory usage and inference latency in large language models (LLMs). However, existing approaches often rely on simple heuristics---such as attention weights---to measure token importance, overlooking the spatial relationships between token value states in the vector space. This often leads to suboptimal token selections and thus performance degradation. To tackle this problem, we propose a novel method, namely **AnDPro** (**An**chor **D**irection **Pro**jection), which introduces a projection-based scoring function to more accurately measure token importance. Specifically, AnDPro operates in the space of value vectors and leverages the projections of these vectors onto an *``Anchor Direction''*---the direction of the pre-eviction output---to measure token importance and guide more accurate token selection. Experiments on $16$ datasets from the LongBench benchmark demonstrate that AnDPro can maintain $96.07\%$ of the full cache accuracy using only $3.44\%$ of the KV cache budget, reducing KV cache budget size by $46.0\%$ without compromising quality compared to previous state-of-the-art methods.
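The projection-based scoring is concrete enough to sketch: score each cached token by the projection of its value vector onto the normalized anchor direction and keep the top-scoring tokens. A toy numpy version, with function names and the top-k selection rule assumed for illustration rather than taken from the paper:

```python
import numpy as np

def andpro_scores(values, anchor):
    """Projection of each token's value vector onto the anchor
    direction (here, the normalized pre-eviction output)."""
    direction = anchor / np.linalg.norm(anchor)
    return values @ direction  # one projection length per token

def evict(values, anchor, budget):
    """Keep the `budget` tokens with the largest projections;
    return their indices in original order."""
    scores = andpro_scores(values, anchor)
    keep = np.argsort(scores)[-budget:]
    return np.sort(keep)
```

The intuition is that tokens whose value vectors contribute most along the direction of the pre-eviction output are the ones whose removal would perturb that output the most.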

NeurIPS Conference 2025 Conference Paper

ArchCAD-400K: A Large-Scale CAD drawings Dataset and New Baseline for Panoptic Symbol Spotting

  • Ruifeng Luo
  • Zhengjie Liu
  • Tianxiao Cheng
  • Jie Wang
  • Tongjie Wang
  • Fei Cheng
  • Fu Chai
  • Yanpeng Li

Recognizing symbols in architectural CAD drawings is critical for various advanced engineering applications. In this paper, we propose a novel CAD data annotation engine that leverages intrinsic attributes from systematically archived CAD drawings to automatically generate high-quality annotations, thus significantly reducing manual labeling efforts. Utilizing this engine, we construct ArchCAD-400K, a large-scale CAD dataset consisting of 413,062 chunks from 5538 highly standardized drawings, making it over 26 times larger than the largest existing CAD dataset. ArchCAD-400K boasts an extended drawing diversity and broader categories, offering line-grained annotations. Furthermore, we present a new baseline model for panoptic symbol spotting, termed Dual-Pathway Symbol Spotter (DPSS). It incorporates an adaptive fusion module to enhance primitive features with complementary image features, achieving state-of-the-art performance and enhanced robustness. Extensive experiments validate the effectiveness of DPSS, demonstrating the value of ArchCAD-400K and its potential to drive innovation in architectural design and construction.

NeurIPS Conference 2025 Conference Paper

AttentionPredictor: Temporal Patterns Matter for KV Cache Compression

  • Qingyue Yang
  • Jie Wang
  • Xing Li
  • Zhihai Wang
  • Chen Chen
  • Lei Chen
  • Xianzhi Yu
  • Wulong Liu

With the development of large language models (LLMs), efficient inference through Key-Value (KV) cache compression has attracted considerable attention, especially for long-context generation. To compress the KV cache, recent methods identify critical KV tokens through static modeling of attention scores. However, these methods often struggle to accurately determine critical tokens as they neglect the *temporal patterns* in attention scores, resulting in a noticeable degradation in LLM performance. To address this challenge, we propose **AttentionPredictor**, which is the **first learning-based method to directly predict attention patterns for KV cache compression and critical token identification**. Specifically, AttentionPredictor learns a lightweight, unified convolution model to dynamically capture spatiotemporal patterns and predict the next-token attention scores. An appealing feature of AttentionPredictor is that it accurately predicts the attention score and shares the unified prediction model, which consumes negligible memory, among all transformer layers. Moreover, we propose a cross-token critical cache prefetching framework that hides the token estimation time overhead to accelerate the decoding stage. By retaining most of the attention information, AttentionPredictor achieves **13$\times$** KV cache compression and **5.6$\times$** speedup in a cache offloading scenario with comparable LLM performance, significantly outperforming state-of-the-art methods. The code is available at https://github.com/MIRALab-USTC/LLM-AttentionPredictor.

NeurIPS Conference 2025 Conference Paper

Benchmarking End-To-End Performance of AI-Based Chip Placement Algorithms

  • Zhihai Wang
  • Zijie Geng
  • Zhaojie Tu
  • Jie Wang
  • Yuxi Qian
  • Zhexuan Xu
  • Ziyan Liu
  • Siyuan Xu

Chip placement is a critical step in the Electronic Design Automation (EDA) workflow, which aims to arrange chip modules on the canvas to optimize the performance, power, and area (PPA) metrics of final designs. Recent advances show great potential of AI-based algorithms in chip placement. However, due to the lengthy EDA workflow, evaluations of these algorithms often focus on intermediate surrogate metrics, which are computationally efficient but often misalign with the final end-to-end performance (i.e., the final design PPA). To address this challenge, we propose to build ChiPBench, a comprehensive benchmark specifically designed to evaluate the effectiveness of AI-based algorithms in final design PPA metrics. Specifically, we generate a diverse evaluation dataset from $20$ circuits across various domains, such as CPUs, GPUs, and NPUs. We then evaluate six state-of-the-art AI-based chip placement algorithms on the dataset and conduct a thorough analysis of their placement behavior. Extensive experiments show that AI-based chip placement algorithms produce unsatisfactory final PPA results, highlighting the significant influence of often-overlooked factors like regularity and dataflow. We believe ChiPBench will effectively bridge the gap between academia and industry.

NeurIPS Conference 2025 Conference Paper

Can Class-Priors Help Single-Positive Multi-Label Learning?

  • Biao Liu
  • Ning Xu
  • Jie Wang
  • Xin Geng

Single-positive multi-label learning (SPMLL) is a weakly supervised multi-label learning problem, where each training example is annotated with only one positive label. Existing SPMLL methods typically assign pseudo-labels to unannotated labels with the assumption that prior probabilities of all classes are identical. However, the class-prior of each category may differ significantly in real-world scenarios, which prevents the predictive model from performing as well as expected because the assumption rarely holds in real-world applications. To alleviate this issue, a novel framework named Crisp, i.e., Class-pRiors Induced Single-Positive multi-label learning, is proposed. Specifically, a class-priors estimator is introduced, which can estimate the class-priors that are theoretically guaranteed to converge to the ground-truth class-priors. In addition, based on the estimated class-priors, an unbiased risk estimator for classification is derived, and the corresponding risk minimizer can be guaranteed to approximately converge to the optimal risk minimizer on fully supervised data. Experimental results on ten MLL benchmark datasets demonstrate the effectiveness and superiority of our method over existing SPMLL approaches.

AIIM Journal 2025 Journal Article

CATI: A medical context-enhanced framework for diagnosis code assignment in the UK Biobank study

  • Yue Shen
  • Jie Wang
  • Zhe Wang
  • Zhihao Shi
  • Hanzhu Chen
  • Zheng Wang
  • Yukang Jiang
  • Xiaopu Wang

Diagnosis codes are the standard format for coding diseases and medical conditions. This study is aimed at assigning diagnosis codes to patients in large-scale biobanks, particularly addressing the issue of missing codes for some patients. This is crucial for downstream disease-related tasks. While recent methods primarily rely on structured biobank data for code assignment, they often overlook the valuable medical context provided by textual information in the biobanks and hierarchical structure of the disease coding system. To address this gap, we have developed CATI, a medical context-enhanced framework for diagnosis Code Assignment by integrating Textual details derived from key features and disease hIerarchy. The study is based on the UK Biobank data and considers Phecodes and ICD-10 codes as standard disease formats. We start by representing ten informative codified features using their formal names and then integrate them into CATI as text embeddings, achieved through prompt tuning on the pre-trained language model BioBERT. Recognizing the hierarchical structure of diagnosis codes, we have developed a novel convolution layer in our method that effectively propagates logits between adjacent diagnosis codes. Evaluation results demonstrate that CATI outperforms existing state-of-the-art methods in terms of both Phecodes and ICD-10 codes, boasting at least a 5.16% improvement in average AUROC for unseen disease codes and an 8.68% rise in average AUPRC for disease codes with training instances ranging in (1000, 10000]. This framework contributes to the formation of well-defined cohorts for downstream studies and offers a unique perspective for addressing complex healthcare tasks by incorporating vital medical context.

EAAI Journal 2025 Journal Article

Comprehensive performance evaluation of valuable medical equipment based on cloud modelling and combined weighting methodologies

  • Xingtong Zhang
  • Saifeng Fang
  • Yongchun Jin
  • Ying Huang
  • Shucheng Wang
  • Jie Wang
  • Yunhua Xu

The construction of the performance evaluation index system for valuable medical equipment is the basis for measuring the use of medical equipment in hospitals. It is crucial to the management and evaluation of equipment. This study aimed to develop a comprehensive evaluation model that thoroughly assesses the operational status, utilization efficiency, and service quality of valuable medical equipment in hospitals. The performance evaluation index system for valuable medical equipment was constructed using four dimensions. The subjective weights were determined using the Analytical Hierarchy Process (AHP), while the objective weights were calculated using the Entropy Weight Method (EWM). The combined weights of the index system were derived by integrating both subjective and objective weights through game theory. The Delphi method was employed to establish a standard cloud model, which was subsequently integrated with the combined weights to construct a comprehensive evaluation cloud model. This research evaluated the performance of nine newly acquired pieces of valuable medical equipment that were operational in a hospital after 2020, thereby validating the reliability of the proposed model. The outcomes indicate that the model effectively addresses the problem of data uncertainty in fuzzy evaluations while alleviating the limitations associated with single weighting methods. The performance evaluation model for medical equipment proposed in this study provided an innovative and effective strategy for assessing valuable medical equipment in hospitals, thereby enhancing the scientific and effective management of medical equipment.

NeurIPS Conference 2025 Conference Paper

Dynamic Configuration for Cutting Plane Separators via Reinforcement Learning on Incremental Graph

  • Mingxuan Ye
  • Jie Wang
  • Fangzhou Fangzhou
  • Zhihai Wang
  • Yufei Kuang
  • Xijun Li
  • Weilin Luo
  • Jianye Hao

Cutting planes (cuts) are essential for solving mixed-integer linear programming (MILP) problems, as they tighten the feasible solution space and accelerate the solving process. Modern MILP solvers offer diverse cutting plane separators to generate cuts, enabling users to leverage their potential complementary strengths to tackle problems with different structures. Recent machine learning approaches learn to configure separators based on problem-specific features, selecting effective separators and deactivating ineffective ones to save unnecessary computing time. However, they ignore the dynamics of separator efficacy at different stages of cut generation and struggle to adapt the configurations for the evolving problems after multiple rounds of cut generation. To address this challenge, we propose a novel dynamic separator configuration (DynSep) method that models separator configuration in different rounds as a reinforcement learning task, making decisions based on an incremental triplet graph updated by iteratively added cuts. Specifically, we tokenize the incremental subgraphs and utilize a decoder-only Transformer as our policy to autoregressively predict when to halt separation and which separators to activate at each round. Evaluated on synthetic and large-scale real-world MILP problems, DynSep speeds up average solving time by 64% on easy and medium datasets, and reduces primal-dual gap integral within the given time limit by 16% on hard datasets. Moreover, experiments demonstrate that DynSep well generalizes to MILP instances of significantly larger sizes than those seen during training.

JBHI Journal 2025 Journal Article

Explainable End-to-End Seizure Prediction via Stationary Wavelet Transform-Driven Dynamic Multiscale Fuzzy Clustering

  • Jie Wang
  • Yingchao Wang
  • Weiwei Nie
  • Qi Yuan

Epileptic seizure prediction holds critical clinical significance for enhancing the quality of life in patients. Despite technological advances, existing approaches face persistent challenges arising from inter-subject variability in electroencephalogram (EEG) dynamics and the complex spatiotemporal coupling associated with ictal transitions. These issues compromise both feature discriminability and model explainability. To address these dual limitations, we propose a novel stationary wavelet transform (SWT)-driven dynamic multiscale fuzzy clustering (SD-MFC) framework, an explainable prediction pipeline that integrates EEG signal analysis with transparent clinical decision-making. Methodologically, the spectral-temporal decomposition of EEG signals via SWT is combined with a geometric attention mechanism to model cross-channel dependencies. To capture the dynamic nature of EEG signals, we develop a Riemannian manifold-based fuzzy clustering algorithm through covariance matrix optimization and time-variant manifold metrics. Hierarchical feature fusion is achieved through multiscale convolutional kernels within a three-layer convolutional network. Model training incorporates contrastive learning with a hybrid supervised/self-supervised strategy to enhance robustness. Notably, two explainability methods, a joint feature visualization strategy and an efficient feature ablation study, are proposed to bridge the adaptation gap between the “black-box” nature of deep learning models and the requirements of clinical diagnostics. Experimental results on both intracranial and extracranial datasets demonstrate that the SD-MFC framework not only exhibits superior predictive performance but also maintains a low FPR, offering a feasible scheme for clinical application of EEG-based seizure prediction. The code is available at https://github.com/JW-Image/SD-MFC.

AAAI Conference 2025 Conference Paper

Exploring Temporal Event Cues for Dense Video Captioning in Cyclic Co-Learning

  • Zhuyang Xie
  • Yan Yang
  • Yankai Yu
  • Jie Wang
  • Yongquan Jiang
  • Xiao Wu

Dense video captioning aims to detect and describe all events in untrimmed videos. This paper presents a dense video captioning network called Multi-Concept Cyclic Learning (MCCL), which aims to: (1) detect multiple concepts at the frame level and leverage these concepts to provide temporal event cues; and (2) establish cyclic co-learning between the generator and the localizer within the captioning network to promote semantic perception and event localization. Specifically, weakly supervised concept detection is performed for each frame, and the detected concept embeddings are integrated into the video features to provide event cues. Additionally, video-level concept contrastive learning is introduced to produce more discriminative concept embeddings. In the captioning network, a cyclic co-learning strategy is proposed, where the generator guides the localizer for event localization through semantic matching, while the localizer enhances the generator’s event semantic perception through location matching, making semantic perception and event localization mutually beneficial. MCCL achieves state-of-the-art performance on the ActivityNet Captions and YouCook2 datasets. Extensive experiments demonstrate its effectiveness and interpretability.

EAAI Journal 2025 Journal Article

Hierarchical multi-scale matched masked autoencoder for industrial multi-rate time series modeling

  • Changqing Yuan
  • Yongfang Xie
  • Shiwen Xie
  • Jie Wang

In practical industrial processes, due to sensor hardware limitations, the sampling rates of different variables often vary, leading to multi-rate time series (MRTS) data. However, the distribution of multi-scale dynamics in MRTS data typically follows a step-like pattern, with intricate scale transitions from fine to coarse and complex scale-consistent dependencies across rates. Additionally, the inherent characteristics of MRTS data often result in label scarcity. Both factors present significant challenges for MRTS modeling. To address these issues, we propose a novel self-supervised learning strategy, called Hierarchical Multi-Scale Matched Masked Autoencoder (H3MAE). Specifically, we design a scale-matching input fusion mechanism where each layer is hierarchically aligned to a specific scale, with the scale-matching integration from two sources, effectively capturing the multi-scale dynamics and cross-rate scale-consistent dependencies in MRTS data. Besides, we introduce a novel auxiliary task that imputes masked positions in the encoded representation space at each layer, aiming to achieve MRTS representation learning and mitigate label scarcity. Furthermore, we propose a unique encoder-imputer structure in each layer to enable multi-scale self-supervised learning while generating temporally aligned features satisfying the input requirements of the next layer. Experimental results on three benchmark datasets and two industrial multi-rate tasks demonstrate that our framework yields better performance in MRTS modeling. The code is publicly available at https://github.com/monolithycq/H3MAE.

NeurIPS Conference 2025 Conference Paper

High-Performance Arithmetic Circuit Optimization via Differentiable Architecture Search

  • Xilin Xia
  • Jie Wang
  • Wanbo Zhang
  • Zhihai Wang
  • Mingxuan Yuan
  • Jianye Hao
  • Feng Wu

Arithmetic circuit optimization remains a fundamental challenge in modern integrated circuit design. Recent advances have cast this problem within the Learning to Optimize (L2O) paradigm, where intelligent agents autonomously explore high-performance design spaces with encouraging results. However, existing approaches predominantly target coarse-grained architectural configurations, while the crucial interconnect optimization stage is often relegated to oversimplified proxy models or heuristic approaches. This disconnect undermines design quality, leading to suboptimal solutions in the circuit topology search space. To bridge this gap, we present Arith-DAS, a Differentiable Architecture Search framework for Arithmetic circuits. To the best of our knowledge, Arith-DAS is the first to formulate interconnect optimization within arithmetic circuits as a differentiable edge prediction problem over a multi-relational directed acyclic graph, enabling fine-grained, proxy-free optimization at the interconnection level. We evaluate Arith-DAS on a suite of representative arithmetic circuits, including multipliers and multiply-accumulate units. Experiments show substantial improvements over state-of-the-art L2O and conventional methods, achieving up to a 27.05% gain in hypervolume of the area-delay Pareto front, a standard metric for evaluating multi-objective optimization performance. Moreover, integrating our optimized arithmetic units into large-scale AI accelerators yields up to a 6.59% delay reduction, demonstrating both scalability and real-world applicability.

TIST Journal 2025 Journal Article

Hire: Hybrid-Modal Interaction with Multiple Relational Enhancements for Image-Text Matching

  • Xuri Ge
  • Fuhai Chen
  • Songpei Xu
  • Fuxiang Tao
  • Jie Wang
  • Joemon M. Jose

Image-Text Matching (ITM) is a fundamental problem in computer vision. The key issue lies in jointly learning the visual and textual representation to estimate their similarity accurately. Most existing methods focus on feature enhancement within modality or feature interaction across modalities, which, however, neglects the contextual information of the object representation based on the inter-object relationships that match the corresponding sentences with rich contextual semantics. In this article, we propose a Hybrid-modal Interaction with multiple Relational Enhancements (termed Hire) for ITM, which correlates the intra- and inter-modal semantics between objects and words with implicit and explicit relationship modeling. In particular, the explicit intra-modal spatial-semantic graph-based reasoning network is designed to improve the contextual representation of visual objects with salient spatial and semantic relational connectivities, guided by the explicit relationships of the objects’ spatial positions and their scene graph. We use implicit relationship modeling for potential relationship interactions before explicit modeling to improve the fault tolerance of explicit relationship detection. Then the visual and textual semantic representations are refined jointly via inter-modal interactive attention and cross-modal alignment. To correlate the context of objects with the textual context, we further refine the visual semantic representation via cross-level object-sentence and word-image-based interactive attention. Extensive experiments validate that the proposed hybrid-modal interaction with implicit and explicit modeling is more beneficial for ITM, and the proposed Hire obtains new state-of-the-art results on the MS-COCO and Flickr30K benchmarks.

AAAI Conference 2025 System Paper

InstantPainting: Expanding GANs for Efficient Text-Conditioned Image Generation Platform

  • Bing-Kun Bao
  • Yefei Sheng
  • Jie Wang
  • Yaning Li
  • Sisi You

Text-conditioned image generation enables cross-modal comprehension. Many recently emerged platforms have found applications in diverse domains such as assisted design and video gaming. However, existing platforms still face challenges due to their expensive training and time-consuming generation processes. In this paper, we introduce an efficient text-conditioned image generation platform, termed InstantPainting. Unlike existing platforms based on large-scale pre-trained diffusion models, InstantPainting expands generative adversarial networks (GANs) to achieve efficient generation by using only about three percent of the pre-training data of other platforms. Compared to existing platforms, InstantPainting achieves the following functions at a very low deployment cost and approximately 4 to 5 times faster generation speeds: (1) multi-category and multi-size image generation; (2) image stylization and controlled generation; and (3) creative generation, including the generation of poetry pictures and counterfactual images. The proposed platform provides web application implementations for PC and mobile; users can create high-quality images directly through the user interface.

JBHI Journal 2025 Journal Article

Interpretable End to End Epileptic Seizure Detection via Linear and Nonlinear Filtering Networks

  • Jie Wang
  • Xianlei Zeng
  • Yingchao Wang
  • Jie Xu
  • Defu Zhai
  • Han Xiao
  • Weiwei Nie
  • Qi Yuan

Epilepsy is a prevalent neurological disorder marked by recurrent, unpredictable seizures. Electroencephalogram (EEG)-based seizure detection has become a key focus in clinical research due to its potential for identifying abnormal brain activity patterns. However, most current approaches rely on single-modal feature analysis and struggle to disentangle the complex linear and nonlinear dynamics of EEG signals, limiting their clinical utility. To address this limitation, we propose a novel contrastive learning framework with linear and nonlinear filtering networks (CL-LNFNet) for interpretable seizure detection. CL-LNFNet enhances explainability by tracing the full decision-making pathway from raw EEG signals to diagnostic outcomes. Through comparative analysis of feature evolution across six seizure types and non-seizure states, the model bridges the gap between the “black-box” nature of deep learning and the transparency required in clinical diagnostics. The framework first employs a recursive residual decomposition scheme to extract linear and nonlinear components using dual-branch decoupling networks. These features are then refined via two adaptive filtering networks equipped with feature selection gating mechanisms. A multi-scale convolutional module within a three-layer convolutional architecture hierarchically integrates the dual-stream outputs to improve classification performance. Furthermore, we introduce a hybrid learning strategy that combines supervised and self-supervised contrastive learning to enhance feature representation through the joint optimization of both loss functions. Experimental evaluations on both scalp and intracranial EEG datasets demonstrate that CL-LNFNet achieves over 95% accuracy in both cross-patient and patient-specific scenarios, outperforming existing state-of-the-art methods. The code is available at https://github.com/JW-Image/CL-LNFNet.

AAAI Conference 2025 Conference Paper

Language Pre-training Guided Masking Representation Learning for Time Series Classification

  • Liaoyuan Tang
  • Zheng Wang
  • Jie Wang
  • Guanxiong He
  • Zhezheng Hao
  • Rong Wang
  • Feiping Nie

The representation learning of time series has a wide range of downstream tasks and applications in many practical scenarios. However, due to the complexity, spatiotemporality, and continuity of sequential stream data, self-supervised representation learning for time series is even more challenging than for structured data such as images and videos. Moreover, directly applying existing contrastive learning and masked-autoencoder-based approaches to time series representation learning encounters inherent theoretical limitations, such as ineffective augmentation and masking strategies. To this end, we propose Language Pre-training guided Masking Representation Learning (LPMRL) for time series classification. Specifically, we first propose a novel language-pre-training-guided masking encoder that adaptively samples semantic spatiotemporal patches via natural language descriptions and improves the discriminability of latent representations. Furthermore, we present a dual-information contrastive learning mechanism that explores both local and global information by meticulously designing high-quality hard negative samples of time series data. We also design various experiments, such as visualizing masking positions and distributions and analyzing reconstruction errors, to verify the rationale of the proposed language-guided masking technique. Finally, we evaluate the learned representations on classification tasks over 106 time series datasets, demonstrating the effectiveness of the proposed method.

NeurIPS Conference 2025 Conference Paper

LogicTree: Improving Complex Reasoning of LLMs via Instantiated Multi-step Synthetic Logical Data

  • Zehao Wang
  • Lin Yang
  • Jie Wang
  • Kehan Wang
  • Hanzhu Chen
  • Bin Wang
  • Jianye Hao
  • Defu Lian

Despite their remarkable performance on various tasks, Large Language Models (LLMs) still struggle with logical reasoning, particularly in complex and multi-step reasoning processes. Among various efforts to enhance LLMs' reasoning capabilities, synthesizing large-scale, high-quality logical reasoning datasets has emerged as a promising direction. However, existing methods often rely on predefined templates for logical reasoning data generation, limiting their adaptability to real-world scenarios. To address the limitation, we propose LogicTree, a novel framework for efficiently synthesizing multi-step logical reasoning datasets that excel in both complexity and instantiation. By iteratively searching for applicable logic rules based on structural pattern matching to perform backward deduction, LogicTree constructs multi-step logic trees that capture complex reasoning patterns. Furthermore, we employ a two-stage LLM-based approach to instantiate various real-world scenarios for each logic tree, generating consistent real-world reasoning processes that carry contextual significance. This helps LLMs develop generalizable logical reasoning abilities across diverse scenarios rather than merely memorizing templates. Experiments on multiple benchmarks demonstrate that our approach achieves an average improvement of 9.4% in accuracy on complex logical reasoning tasks.

NeurIPS Conference 2025 Conference Paper

MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds

  • Bingquan Dai
  • Luo Li
  • Qihong Tang
  • Jie Wang
  • Xinyu Lian
  • Hao Xu
  • Minghan Qin
  • Xudong XU

Reconstructing 3D objects into editable programs is pivotal for applications like reverse engineering and shape editing. However, existing methods often rely on limited domain-specific languages (DSLs) and small-scale datasets, restricting their ability to model complex geometries and structures. To address these challenges, we introduce MeshCoder, a novel framework that reconstructs complex 3D objects from point clouds into editable Blender Python scripts. We develop a comprehensive set of expressive Blender Python APIs capable of synthesizing intricate geometries. Leveraging these APIs, we construct a large-scale paired object-code dataset, where the code for each object is decomposed into distinct semantic parts. Subsequently, we train a multimodal large language model (LLM) that translates 3D point clouds into executable Blender Python scripts. Our approach not only achieves superior performance in shape-to-code reconstruction tasks but also facilitates intuitive geometric and topological editing through convenient code modifications. Furthermore, our code-based representation enhances the reasoning capabilities of LLMs in 3D shape understanding tasks. Together, these contributions establish MeshCoder as a powerful and flexible solution for programmatic 3D shape reconstruction and understanding.

NeurIPS Conference 2025 Conference Paper

Mixture-of-Experts Operator Transformer for Large-Scale PDE Pre-Training

  • Hong Wang
  • Haiyang Xin
  • Jie Wang
  • Xuanze Yang
  • Fei Zha
  • huanshuo dong
  • Yan Jiang

Pre-training has proven effective in addressing data scarcity and performance limitations in solving PDE problems with neural operators. However, challenges remain due to the heterogeneity of PDE datasets in equation types, which leads to high errors in mixed training. Additionally, dense pre-training models that scale parameters by increasing network width or depth incur significant inference costs. To tackle these challenges, we propose a novel Mixture-of-Experts Pre-training Operator Transformer (MoE-POT), a sparse-activated architecture that scales parameters efficiently while controlling inference costs. Specifically, our model adopts a layer-wise router-gating network to dynamically select 4 routed experts from 16 expert networks during inference, enabling the model to focus on equation-specific features. Meanwhile, we also integrate 2 shared experts, aiming to capture common properties of PDE and reduce redundancy among routed experts. The final output is computed as the weighted average of the results from all activated experts. We pre-train models with parameters from 30M to 0.5B on 6 public PDE datasets. Our model with 90M activated parameters achieves up to a 40% reduction in zero-shot error compared with existing models with 120M activated parameters. Additionally, we conduct interpretability analysis, showing that dataset types can be inferred from router-gating network decisions, which validates the rationality and effectiveness of the MoE architecture.

NeurIPS Conference 2025 Conference Paper

OptiTree: Hierarchical Thoughts Generation with Tree Search for LLM Optimization Modeling

  • Haoyang Liu
  • Jie Wang
  • Yuyang Cai
  • Xiongwei Han
  • Yufei Kuang
  • Jianye Hao

Optimization modeling is one of the most crucial but technical parts of operations research (OR). To automate the modeling process, existing works have leveraged large language models (LLMs), prompting them to break down tasks into steps for generating variables, constraints, and objectives. However, due to the highly complex mathematical structures inherent in OR problems, standard fixed-step decomposition often fails to achieve high performance. To address this challenge, we introduce OptiTree, a novel tree search approach designed to enhance modeling capabilities for complex problems through adaptive problem decomposition into simpler subproblems. Specifically, we develop a modeling tree that organizes a wide range of OR problems based on their hierarchical problem taxonomy and complexity, with each node representing a problem category and containing relevant high-level modeling thoughts. Given a problem to model, we recurrently search the tree to identify a series of simpler subproblems and synthesize the global modeling thoughts by adaptively integrating the hierarchical thoughts. Experiments show that OptiTree significantly improves the modeling accuracy compared to the state-of-the-art, achieving over 10% improvements on the challenging benchmarks.

NeurIPS Conference 2025 Conference Paper

Real-World Reinforcement Learning of Active Perception Behaviors

  • Edward Hu
  • Jie Wang
  • Xingfang Yuan
  • Fiona Luo
  • Muyao Li
  • Gaspard Lambrechts
  • Oleh Rybkin
  • Dinesh Jayaraman

A robot's instantaneous sensory observations do not always reveal task-relevant state information. Under such partial observability, optimal behavior typically involves explicitly acting to gain the missing information. Today's standard robot learning techniques struggle to produce such active perception behaviors. We propose a simple real-world robot learning recipe to efficiently train active perception policies. Our approach, asymmetric advantage weighted regression (AAWR), exploits access to "privileged" extra sensors at training time. The privileged sensors enable training high-quality privileged value functions that aid in estimating the advantage of the target policy. Bootstrapping from a small number of potentially suboptimal demonstrations and an easy-to-obtain coarse policy initialization, AAWR quickly acquires active perception behaviors and boosts task performance. In evaluations on 8 manipulation tasks on 3 robots spanning varying degrees of partial observability, AAWR synthesizes reliable active perception behaviors that outperform all prior approaches. When initialized with a "generalist" robot policy that struggles with active perception tasks, AAWR efficiently generates information-gathering behaviors that allow it to operate under severe partial observability for manipulation tasks. Website: https://penn-pal-lab.github.io/aawr/

NeurIPS Conference 2025 Conference Paper

RoME: Domain-Robust Mixture-of-Experts for MILP Solution Prediction across Domains

  • Tianle Pu
  • Zijie Geng
  • Haoyang Liu
  • Shixuan Liu
  • Jie Wang
  • Li Zeng
  • Chao Chen
  • Changjun Fan

Mixed-Integer Linear Programming (MILP) is a fundamental and powerful framework for modeling complex optimization problems across diverse domains. Recently, learning-based methods have shown great promise in accelerating MILP solvers by predicting high-quality solutions. However, most existing approaches are developed and evaluated in single-domain settings, limiting their ability to generalize to unseen problem distributions. This limitation poses a major obstacle to building scalable and general-purpose learning-augmented solvers. To address this challenge, we introduce RoME, a domain-Robust Mixture-of-Experts (MoE) framework for predicting MILP solutions across domains. RoME dynamically routes problem instances to specialized experts based on learned task embeddings. The model is trained using a two-level distributionally robust optimization strategy: inter-domain to mitigate global shifts across domains, and intra-domain to enhance local robustness by introducing perturbations on task embeddings. We reveal that cross-domain training not only enhances the model's generalization capability to unseen domains but also improves performance within each individual domain by encouraging the model to capture more general intrinsic combinatorial patterns. Specifically, a single RoME model trained on three domains achieves an average improvement of 67.7% when evaluated on five diverse domains. We further test the pretrained model on MIPLIB in a zero-shot setting, demonstrating its ability to deliver measurable performance gains on challenging real-world instances where existing learning-based approaches often struggle to generalize.

JBHI Journal 2025 Journal Article

Short-Term Longitudinal Study on Brain Network Informatics of Stroke Patients Under Acupuncture and Motor Imagery Intervention

  • Jing Qu
  • Yijun Du
  • Jing Jing
  • Jie Wang
  • Lingguo Bu
  • Yonghui Wang

Objective: The quest for scientifically effective rehabilitation methods for stroke recovery constitutes an urgent need. However, due to the inadequacies of longitudinal studies and multimodal assessment methods, the rehabilitation mechanisms of methods such as Acupuncture Treatment (AT) and Motor Imagery (MI) remain unclear. Consequently, this study presents both AT and Acupuncture Synchronized with MI (ASMI) therapies, utilizing a combination of subjective and objective approaches to evaluate the long-term impacts of these two treatment modalities. Methods: A longitudinal design was adopted for a duration of two weeks. Clinical improvement in patients was assessed using scale data, while Functional Near-infrared Spectroscopy (fNIRS) and Electroencephalogram (EEG) data were collected to analyze changes in brain function. This study proposes the Cluster-Span Threshold for Directed Networks (CSTDN) algorithm for identifying key connections within the brain network and conducts an in-depth analysis using graph theory metrics. Results: Scale data indicated improvements in behavioral capabilities in both groups post-treatment. EEG and fNIRS data revealed significant variations in specific frequency bands between the two groups. Conclusion: This study not only validates the efficacy of AT and ASMI in stroke rehabilitation but also unveils the underlying neurobiological mechanisms through multimodal data analysis. The proposed CSTDN algorithm and graph theory analysis offer new perspectives for understanding changes in the brain network. Significance: This research contributes to the optimization of future rehabilitation treatment strategies and the formulation of personalized treatment plans.

NeurIPS Conference 2025 Conference Paper

STNet: Spectral Transformation Network for Solving Operator Eigenvalue Problem

  • Hong Wang
  • Yixuan Jiang
  • Jie Wang
  • Xinyi Li
  • Jian Luo
  • huanshuo dong

Operator eigenvalue problems play a critical role in various scientific fields and engineering applications, yet numerical methods are hindered by the curse of dimensionality. Recent deep learning methods provide an efficient approach to address this challenge by iteratively updating neural networks. These methods' performance relies heavily on the spectral distribution of the given operator: larger gaps between the operator's eigenvalues will improve precision, thus tailored spectral transformations that leverage the spectral distribution can enhance their performance. Based on this observation, we propose the Spectral Transformation Network (STNet). During each iteration, STNet uses approximate eigenvalues and eigenfunctions to perform spectral transformations on the original operator, turning it into an equivalent but easier problem. Specifically, we employ deflation projection to exclude the subspace corresponding to already solved eigenfunctions, thereby reducing the search space and avoiding converging to existing eigenfunctions. Additionally, our filter transform magnifies eigenvalues in the desired region and suppresses those outside, further improving performance. Extensive experiments demonstrate that STNet consistently outperforms existing learning-based methods, achieving state-of-the-art performance in accuracy.

NeurIPS Conference 2025 Conference Paper

SymMaP: Improving Computational Efficiency in Linear Solvers through Symbolic Preconditioning

  • Hong Wang
  • Jie Wang
  • Minghao Ma
  • Haoran Shao
  • Haoyang Liu

Matrix preconditioning is a critical technique to accelerate the solution of linear systems, where performance heavily depends on the selection of preconditioning parameters. Traditional parameter selection approaches often define fixed constants for specific scenarios. However, they rely on domain expertise and fail to consider the instance-wise features for individual problems, limiting their performance. In contrast, machine learning (ML) approaches, though promising, are hindered by high inference costs and limited interpretability. To combine the strengths of both approaches, we propose a symbolic discovery framework, namely Symbolic Matrix Preconditioning (SymMaP), to learn efficient symbolic expressions for preconditioning parameters. Specifically, we employ a neural network to search the high-dimensional discrete space for expressions that can accurately predict the optimal parameters. The learned expression allows for high inference efficiency and excellent interpretability (expressed in concise symbolic formulas), making it simple and reliable for deployment. Experimental results show that SymMaP consistently outperforms traditional strategies across various benchmarks.

EAAI Journal 2025 Journal Article

Symmetric non-negative matrix factorization-based deep representation algorithm for multi-view clustering

  • Ping Deng
  • Xinying Zhou
  • Ji Xu
  • Wei Huang
  • Jie Wang
  • Dexian Wang
  • Tianrui Li

Symmetric Non-negative Matrix Factorization (SNMF) shows significant advantages in clustering tasks due to its unique mathematical properties. However, it still has several key limitations: (1) the single optimization scheme of the traditional multiplicative update rule limits the flexibility of the algorithm; (2) linear factorization leads to insufficient representation ability for complex nonlinear features; (3) there is no learning rate guidance mechanism. These factors together constrain the algorithm's representation learning ability on complex data. To address these issues, this paper proposes a SNMF-based Deep Representation algorithm for Multi-view Clustering (SNDRMvC). First, the matrix elements are decoupled, and stochastic gradient descent as well as a nonlinear activation function are used to implement the non-negative matrix update. Then, based on the corresponding gradients of the elements and the nonlinear function, the neural network learning mechanism is introduced into the SNMF update rule to construct a novel framework, the SNMF-based deep representation network, for optimizing SNMF. This network aims to update the elements in the low-dimensional matrix of each view and fuse the low-dimensional matrices of multiple views to derive a consensus matrix. Finally, extensive experiments conducted on several public datasets demonstrate that the proposed algorithm exhibits notable advantages in clustering performance. We provide the code at: https://github.com/Code706/SNDRMvC.

IJCAI Conference 2025 Conference Paper

Uncertainty-guided Graph Contrastive Learning from a Unified Perspective

  • Zhiqiang Li
  • Jie Wang
  • Jianqing Liang
  • Junbiao Cui
  • Xingwang Zhao
  • Jiye Liang

The success of current graph contrastive learning methods largely relies on the choice of data augmentation and contrastive objectives. However, most existing methods tend to optimize these two components independently, neglecting their potential interplay, which leads to suboptimal quality of the learned embeddings. To address this issue, we propose Uncertainty-guided Graph Contrastive Learning (UGCL) from a unified perspective. The core of our method is the introduction of sample uncertainty, a critical metric that quantifies the degree of class ambiguity within individual samples. On this basis, we design a novel multi-scale data augmentation strategy and a weighted graph contrastive loss function, both of which significantly enhance the quality of embeddings. Theoretically, we demonstrate that UGCL can coordinate overall optimization objectives through uncertainty, and through experiments, we show that it improves the performance of tasks such as node classification, node clustering, and link prediction, thereby verifying the effectiveness of our method.

UAI Conference 2025 Conference Paper

VADIS: Investigating Inter-View Representation Biases for Multi-View Partial Multi-Label Learning

  • Jie Wang
  • Ning Xu 0009
  • Xin Geng 0001

Multi-view partial multi-label learning (MVPML) deals with training data where each example is represented by multiple feature vectors and associated with a set of candidate labels, only a subset of which are correct. The diverse representation biases present in different views complicate the annotation process in MVPML, leading to the inclusion of incorrect labels in the candidate label set. Existing methods typically merge features from different views to identify the correct labels in the training data without addressing the representation biases inherent in different views. In this paper, we propose a novel MVPML method called Vadis, which investigates view-aware representations for disambiguation and predictive model learning. Specifically, we exploit the global common representation shared by all views, aligning it with a local semantic similarity matrix to estimate ground-truth labels via a low-rank mapping matrix. Additionally, to identify incorrect labels, the view-specific inconsistent representation is recovered by leveraging the sparsity assumption. Experiments on real-world datasets validate the superiority of our approach over other state-of-the-art methods.

ICRA Conference 2025 Conference Paper

ZeroMimic: Distilling Robotic Manipulation Skills from Web Videos

  • Junyao Shi
  • Zhuolun Zhao
  • Tianyou Wang
  • Ian Pedroza
  • Amy Luo
  • Jie Wang
  • Yecheng Jason Ma 0001
  • Dinesh Jayaraman

Many recent advances in robotic manipulation have come through imitation learning, yet these rely largely on mimicking a particularly hard-to-acquire form of demonstrations: those collected on the same robot, in the same room, with the same objects that the trained policy must handle at test time. In contrast, large pre-recorded human video datasets demonstrating manipulation skills in-the-wild already exist, which contain valuable information for robots. Is it possible to distill a repository of useful robotic skill policies out of such data without any additional requirements on robot-specific demonstrations or exploration? We present ZeroMimic, the first such system, which generates immediately deployable image goal-conditioned skill policies for several common categories of manipulation tasks (opening, closing, pouring, pick&place, cutting, and stirring), each capable of acting upon diverse objects and across diverse unseen task setups. ZeroMimic is carefully designed to exploit recent advances in semantic and geometric visual understanding of human videos, together with modern grasp affordance detectors and imitation policy classes. After training ZeroMimic on the popular EpicKitchens dataset of egocentric human videos, we evaluate its out-of-the-box performance in varied real-world and simulated kitchen settings with two different robot embodiments, demonstrating its impressive abilities to handle these varied tasks. To enable plug-and-play reuse of ZeroMimic policies on other task setups and robots, we release software and policy checkpoints of our skill policies.

EAAI Journal 2024 Journal Article

A deep evidence fusion framework for apple leaf disease classification

  • Hang Wang
  • Jiaxu Zhang
  • Zhu Yin
  • Liucheng Huang
  • Jie Wang
  • Xiaojian Ma

Apple leaf disease is one of the main culprits of apple yield reduction, so accurate classification of apple leaf diseases is essential to reduce economic losses. However, current methods struggle to distinguish diseases with similar visual symptoms. This study provides a new solution from the perspective of multi-source evidence fusion. Specifically, we propose a deep evidence fusion framework that uses both multi-saliency maps in the Hue Saturation Value (HSV) color space and a belief Cauchy–Schwarz divergence. A new evidence fusion method based on the belief Cauchy–Schwarz divergence is then proposed, which fills the gap between evidence theory and apple leaf disease classification. Experimental results show that the proposed method can boost the accuracy of different classification backbone networks, achieving the best performance of 98.1% with the EfficientNetV2-S network and the highest improvement of 4.8% with the Van-T network. In addition, a series of experiments is conducted to evaluate the proposal's effectiveness and superiority. The proposed method is a suitable alternative for classifying apple leaf diseases with similar visual symptoms, and in the future more plant diseases will be extended to this fusion framework.

EAAI Journal 2024 Journal Article

Development of data-knowledge-driven predictive model and multi-objective optimization for intelligent optimal control of aluminum electrolysis process

  • Jie Wang
  • Yongfang Xie
  • Shiwen Xie
  • Xiaofang Chen

Operational optimization of the Hall-Héroult cell is essential for achieving high efficiency and cost-effectiveness in the aluminum electrolysis process. Due to the complicated mechanism and variable working conditions, manual operational decision-making is still widely used in practice, which challenges the reliable and optimal operation of the aluminum electrolysis process. In this paper, we develop a data-knowledge-driven decision-making support system (DMSS) to achieve operational optimization for the aluminum electrolysis process. DMSS consists of a prediction model, a multi-objective optimizer, and a knowledge-guided decision-making module. Specifically, we propose a working-conditions-based attention with exogenous inputs auto-regressive neural network (WCA-NARX) to construct a data-driven heat balance indicator (HBI) prediction model, where working-condition-related variables serve as covariates to enhance predictability. In addition, the designed structure of introducing working-condition information through an attention mechanism can decouple covariates from operational variables and autoregressive variables, facilitating subsequent operational optimization. Then, a novel knowledge-assigned reference vector evolutionary algorithm (KRVEA) is designed to solve the multi-objective optimization problem of the aluminum electrolysis process, in which Pareto front solutions can be found in the preferred region. Finally, we utilize a knowledge base that stores historical optimization cases to select a practical-requirement-based control scheme from the Pareto set. Real-world industrial experiments demonstrate that DMSS can effectively enhance control performance and achieve superior results compared to other competitive methods. The source code is available at https://github.com/wjiecsu/WCA-NARX.

IROS Conference 2024 Conference Paper

Efficient-PIP: Large-scale Pixel-level Aligned Image Pair Generation for Cross-time Infrared-RGB Translation

  • Jian Li 0003
  • Kexin Fei
  • Yi Sun
  • Jie Wang
  • Bokai Liu
  • Zongtan Zhou
  • Yongbin Zheng
  • Zhenping Sun

Generative models are gaining momentum in both academic and industrial applications, driven by the availability of large-scale datasets, especially in tasks involving Image-to-Image Translation. Meanwhile, poor human perception of nighttime environments has created demand for translation from night-vision infrared to day-vision RGB images. However, collecting such cross-modal training data at the same time is impossible due to the thermal imaging properties of infrared cameras; the challenge therefore lies in constructing image pairs captured during the day and at night respectively, where the requirement for data alignment poses significant difficulties. In this paper, we propose a Pixel-level aligned Image Pair generation framework, PIP, to explore efficient colorization of high-resolution infrared images. Specifically, we first construct a 3D high-precision point cloud map to establish the correlation between day and night scenes. Point clouds corresponding to the modal images are collected simultaneously during data acquisition to obtain image sensor poses via Global Matching with the map, which allows us to calculate the transformation from infrared to RGB image coordinate systems based on the sensor parameters and the depth information of the map. Leveraging this relationship, the pixel values of the RGB image are projected onto the infrared image and then optimized to produce the colored image. Accordingly, we present NUDT-PIP, the first dataset of its kind containing large-scale pixel-level aligned cross-time infrared-RGB image pairs of complicated real road scenes. Experimental results demonstrate the reliability and strong applicability of our dataset in Image-to-Image Translation. Our code will be released at https://github.com/wjjjjyourFA/NUDT-PIP.

NeurIPS Conference 2024 Conference Paper

FUSU: A Multi-temporal-source Land Use Change Segmentation Dataset for Fine-grained Urban Semantic Understanding

  • Shuai Yuan
  • Guancong Lin
  • Lixian Zhang
  • Runmin Dong
  • Jinxiao Zhang
  • Shuang Chen
  • Juepeng Zheng
  • Jie Wang

Fine urban change segmentation using multi-temporal remote sensing images is essential for understanding human-environment interactions in urban areas. Although there have been advances in high-quality land cover datasets that reveal the physical features of urban landscapes, the lack of fine-grained land use datasets hinders a deeper understanding of how human activities are distributed across landscapes and the impact of these activities on the environment, thus constraining proper technique development. To address this, we introduce FUSU, the first fine-grained land use change segmentation dataset for Fine-grained Urban Semantic Understanding. FUSU features the most detailed land use classification system to date, with 17 classes and 30 billion pixels of annotations. It includes bi-temporal high-resolution satellite images with 0.2-0.5 m ground sample distance and monthly optical and radar satellite time series, covering 847 km^2 across five urban areas in southern and northern China with different geographical features. The fine-grained land use pixel-wise annotations and high spatial-temporal resolution data provide a robust foundation for developing proper deep learning models to provide contextual insights on human activities and urbanization. To fully leverage FUSU, we propose a unified time-series architecture for both change detection and segmentation. We benchmark FUSU on various methods for several tasks. Dataset and code are available at: https://github.com/yuanshuai0914/FUSU.

AAAI Conference 2024 Conference Paper

Learning to Stop Cut Generation for Efficient Mixed-Integer Linear Programming

  • Haotian Ling
  • Zhihai Wang
  • Jie Wang

Cutting planes (cuts) play an important role in solving mixed-integer linear programs (MILPs), as they significantly tighten the dual bounds and improve solving performance. A key problem for cuts is when to stop cut generation, which is important for the efficiency of solving MILPs. However, many modern MILP solvers employ hard-coded heuristics to tackle this problem, which tend to neglect underlying patterns among MILPs from certain applications. To address this challenge, we formulate the cut generation stopping problem as a reinforcement learning problem and propose a novel hybrid graph representation model (HYGRO) to learn effective stopping strategies. An appealing feature of HYGRO is that it can effectively capture both the dynamic and static features of MILPs, enabling dynamic decision-making for the stopping strategies. To the best of our knowledge, HYGRO is the first data-driven method to tackle the cut generation stopping problem. By integrating our approach with modern solvers, experiments demonstrate that HYGRO significantly improves the efficiency of solving MILPs compared to competitive baselines, achieving up to 31% improvement.

NeurIPS Conference 2024 Conference Paper

MILP-StuDio: MILP Instance Generation via Block Structure Decomposition

  • Haoyang Liu
  • Jie Wang
  • Wanbo Zhang
  • Zijie Geng
  • Yufei Kuang
  • Xijun Li
  • Yongdong Zhang
  • Bin Li

Mixed-integer linear programming (MILP) is one of the most popular mathematical formulations with numerous applications. In practice, improving the performance of MILP solvers often requires a large amount of high-quality data, which can be challenging to collect. Researchers thus turn to generation techniques to generate additional MILP instances. However, existing approaches do not take into account specific block structures—which are closely related to the problem formulations—in the constraint coefficient matrices (CCMs) of MILPs. Consequently, they are prone to generate computationally trivial or infeasible instances due to the disruption of block structures and thus of the problem formulations. To address this challenge, we propose a novel MILP generation framework, called Block Structure Decomposition (MILP-StuDio), to generate high-quality instances by preserving the block structures. Specifically, MILP-StuDio begins by identifying the blocks in CCMs and decomposing the instances into block units, which serve as the building blocks of MILP instances. We then design three operators to construct new instances by removing, substituting, and appending block units in the original instances, enabling us to generate instances with flexible sizes. An appealing feature of MILP-StuDio is its strong ability to preserve the feasibility and computational hardness of the generated instances. Experiments on the commonly-used benchmarks demonstrate that using instances generated by MILP-StuDio is able to significantly reduce the solving time of learning-based solvers by over 10%.
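The block-unit operators can be pictured on a toy constraint coefficient matrix. The sketch below is purely illustrative (the block contents, the `block_diag` helper, and the way "append" reuses a block are assumptions, not the paper's implementation):

```python
import numpy as np

# Toy CCM made of block units along the diagonal; the "append" operator
# grows the instance by adding another block unit, keeping block structure.
def block_diag(blocks):
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

b1 = np.array([[1.0, 2.0], [0.0, 3.0]])   # made-up block unit
b2 = np.array([[4.0], [5.0]])             # made-up block unit
ccm = block_diag([b1, b2])                # original instance: two block units
appended = block_diag([b1, b2, b1])       # "append": reuse a block unit to grow

assert ccm.shape == (4, 3)
assert appended.shape == (6, 5)
```

"Remove" and "substitute" would drop or swap entries in the block list in the same way, which is why the generated instances keep the structure that the original formulation implies.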

NeurIPS Conference 2024 Conference Paper

Neural Krylov Iteration for Accelerating Linear System Solving

  • Jian Luo
  • Jie Wang
  • Hong Wang
  • huanshuo dong
  • Zijie Geng
  • Hanzhu Chen
  • Yufei Kuang

Solving large-scale sparse linear systems is essential in fields like mathematics, science, and engineering. Traditional numerical solvers, mainly based on the Krylov subspace iteration algorithm, suffer from the low-efficiency problem, which primarily arises from the less-than-ideal iteration. To tackle this problem, we propose a novel method, namely Neural Krylov Iteration (NeurKItt), for accelerating linear system solving. Specifically, NeurKItt employs a neural operator to predict the invariant subspace of the linear system and then leverages the predicted subspace to accelerate linear system solving. To enhance the subspace prediction accuracy, we utilize QR decomposition for the neural operator outputs and introduce a novel projection loss function for training. NeurKItt benefits the solving by using the predicted subspace to guide the iteration process, significantly reducing the number of iterations. We provide extensive experiments and comprehensive theoretical analyses to demonstrate the feasibility and efficiency of NeurKItt. In our main experiments, NeurKItt accelerates the solving of linear systems across various settings and datasets, achieving up to a 5.5× speedup in computation time and a 16.1× speedup in the number of iterations.
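The QR-then-project step described in the abstract can be sketched in a few lines. This is an illustrative outline only: the toy system, the random stand-in for the neural operator's output, and the Galerkin projection used to seed the solver are assumptions, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 10

A = np.diag(np.linspace(1.0, 100.0, n))   # toy SPD system
b = rng.standard_normal(n)

V_pred = rng.standard_normal((n, k))      # stand-in for the neural prediction
Q, _ = np.linalg.qr(V_pred)               # orthonormalize the predicted basis

# Galerkin projection: solve the small k-by-k projected system and lift
# the solution back to R^n as an initial guess for the Krylov iteration.
x0 = Q @ np.linalg.solve(Q.T @ A @ Q, Q.T @ b)

# By construction, the residual is orthogonal to the predicted subspace.
r0 = b - A @ x0
assert np.allclose(Q.T @ r0, 0.0, atol=1e-8)
```

If the predicted subspace is close to an invariant subspace of `A`, an iterative solver started from `x0` (or deflated against `Q`) has far less work left to do, which is the mechanism behind the reported iteration-count reductions.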

EAAI Journal 2024 Journal Article

Non-autoregressive transformer with fine-grained optimization for user-specified indoor layout

  • Chao Song
  • Jie Wang
  • Shujie Chen
  • Haidong Li
  • Zhaoyi Jiang
  • Bailin Yang

We present a novel framework that can generate plausible and diverse 3D (three-dimensional) indoor scenes based on room floor plans and user-specified layout objects. In the framework, we construct a generative neural network with a non-autoregressive transformer to generate a reasonable distribution of layout objects, and then apply a fine-grained optimization process to adjust the sampled layout objects to optimal positions. Our non-autoregressive generative network addresses the issue of error-chain accumulation, and the fine-grained optimization mitigates potential small collisions between layout objects. Furthermore, we trained the generative network on a publicly labeled 3D indoor dataset without additional manual processing, and provide detailed information on the capabilities of the generative network and the fine-grained optimization scheme. Extensive experiments demonstrate that our framework outperforms existing methods in learning the relationships between layout objects and layout rationality.

EAAI Journal 2024 Journal Article

Pos-DANet: A dual-branch awareness network for small object segmentation within high-resolution remote sensing images

  • Qianpeng Chong
  • Mengying Ni
  • Jianjun Huang
  • Zongbao Liang
  • Jie Wang
  • Ziyi Li
  • Jindong Xu

Advances in satellite and sensor optical imaging technology have enabled more detailed and accurate Earth observation, which poses both an opportunity and a challenge for the small object segmentation task. However, the inherent difficulty of the task and its inadequate consideration in prior work mean that small object segmentation still encounters a performance bottleneck. We analyze the longstanding but underestimated challenges in this task and give a targeted solution to address them. Specifically, we design a dual-branch awareness structure dedicated to small object segmentation, named Pos-DANet, which is composed of a small object activation branch and a fuzzy refinement branch. The small object activation branch attends to small objects and avoids the negative influence of redundant background. The fuzzy refinement branch utilizes fuzzy modeling to improve the segmentation accuracy of small objects. These two branches work collaboratively to make the whole structure focus more on small objects and achieve satisfying segmentation results. Finally, we propose a hierarchical unbiased loss to eliminate the bias against small objects in the regression process. Extensive experiments demonstrate that Pos-DANet exhibits higher qualitative and quantitative performance on small objects than advanced methods, achieving the best results in mIoU (71.12%, 83.33%) and sIoU (63.23%, 68.89%) on two datasets.

EAAI Journal 2024 Journal Article

Short-term wind power prediction framework using numerical weather predictions and residual convolutional long short-term memory attention network

  • Chenlei Xie
  • Xuelei Yang
  • Tao Chen
  • Qiansheng Fang
  • Jie Wang
  • Yan Shen

As a prominent global source of renewable energy, wind power generation has been experiencing rapid growth. More precise prediction of short-term wind power is essential to ensure the stable and cost-effective operation of power systems. In response, a wind power prediction framework using numerical weather predictions (NWPs) and a Residual Convolutional Long Short-Term Memory Attention (Res-ConvLSTM-Attention) network is proposed in this study. Addressing the issue of significant errors in individual NWPs, a Weighted Naive Bayes (WNB) model and a Multivariate Quadratic Nonlinear Regression (NR) model are employed to fuse the wind speed and wind direction characteristics of four NWPs respectively, aiming to obtain more accurate weather forecast data. Given the difficulty of accurate prediction due to the randomness of wind power, a Res-ConvLSTM-Attention network is proposed for short-term wind power prediction. The Res-ConvLSTM unit extracts deep spatiotemporal features while effectively alleviating the network degradation and gradient vanishing issues caused by network deepening. The Attention unit allocates higher weights to key features, and their combination enhances the accuracy of wind power prediction. Finally, using the data provided by Challenge Data for experimental analysis, the results show that the mean absolute error (MAE), root mean square error (RMSE), mean arctangent absolute percentage error (MAAPE) and coefficient of determination (R2) values are 0.0758, 0.1163, 0.4364 and 0.946 respectively, affirming the effectiveness of the wind power prediction framework.
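The four reported metrics have standard definitions; the sketch below computes them on made-up toy data (the arrays are illustrative, not the paper's results). MAAPE is the least common of the four: it replaces the unbounded percentage error with its arctangent, so it stays finite when the true value is near zero:

```python
import numpy as np

# Toy normalized wind-power values (illustrative only)
y_true = np.array([0.5, 0.8, 0.3, 0.9])
y_pred = np.array([0.45, 0.85, 0.35, 0.80])

mae = np.mean(np.abs(y_true - y_pred))
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
# MAAPE: mean of arctan(|relative error|), bounded in [0, pi/2]
maape = np.mean(np.arctan(np.abs((y_true - y_pred) / y_true)))
# R2: 1 minus residual sum of squares over total sum of squares
r2 = 1.0 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true.mean()) ** 2)

assert rmse >= mae                 # always holds for these two norms
assert 0.0 <= maape <= np.pi / 2   # boundedness is MAAPE's selling point
```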

NeurIPS Conference 2024 Conference Paper

Target-Guided Adversarial Point Cloud Transformer Towards Recognition Against Real-world Corruptions

  • Jie Wang
  • Tingfa Xu
  • Lihe Ding
  • Jianan Li

Achieving robust 3D perception in the face of corrupted data presents a challenging hurdle within 3D vision research. Contemporary transformer-based point cloud recognition models, albeit advanced, tend to overfit to specific patterns, consequently undermining their robustness against corruption. In this work, we introduce the Target-Guided Adversarial Point Cloud Transformer, termed APCT, a novel architecture designed to augment global structure capture through an adversarial feature erasing mechanism predicated on patterns discerned at each step during training. Specifically, APCT integrates an Adversarial Significance Identifier and a Target-guided Promptor. The Adversarial Significance Identifier is tasked with discerning token significance by integrating global contextual analysis, utilizing a structural salience index algorithm alongside an auxiliary supervisory mechanism. The Target-guided Promptor is responsible for accentuating the propensity for token discard within the self-attention mechanism, utilizing the value derived above, consequently directing the model's attention towards alternative segments in subsequent stages. By iteratively applying this strategy in multiple steps during training, the network progressively identifies and integrates an expanded array of object-associated patterns. Extensive experiments demonstrate that our method achieves state-of-the-art results on multiple corruption benchmarks.

NeurIPS Conference 2024 Conference Paper

Towards Next-Generation Logic Synthesis: A Scalable Neural Circuit Generation Framework

  • Zhihai Wang
  • Jie Wang
  • Qingyue Yang
  • Yinqi Bai
  • Xing Li
  • Lei Chen
  • Jianye Hao
  • Mingxuan Yuan

Logic Synthesis (LS) aims to generate an optimized logic circuit satisfying a given functionality, which generally consists of circuit translation and optimization. It is a challenging and fundamental combinatorial optimization problem in integrated circuit design. Traditional LS approaches rely on manually designed heuristics to tackle the LS task, while machine learning recently offers a promising approach towards next-generation logic synthesis via neural circuit generation and optimization. In this paper, we first revisit the application of differentiable neural architecture search (DNAS) methods to circuit generation and find from extensive experiments that existing DNAS methods struggle to exactly generate circuits, scale poorly to large circuits, and exhibit high sensitivity to hyper-parameters. We then provide three major insights into these challenges from extensive empirical analysis: 1) DNAS tends to overfit to too many skip-connections, consequently wasting a significant portion of the network's expressive capabilities; 2) DNAS suffers from a structure bias between the network architecture and the circuit's inherent structure, leading to inefficient search; 3) the learning difficulty of different input-output examples varies significantly, leading to severely imbalanced learning. To address these challenges in a systematic way, we propose a novel regularized triangle-shaped circuit network generation framework, which leverages our key insights for completely accurate and scalable circuit generation. Furthermore, we propose an evolutionary algorithm assisted by a reinforcement-learning-agent restarting technique for efficient and effective neural circuit optimization. Extensive experiments on four different circuit benchmarks demonstrate that our method can precisely generate circuits with up to 1200 nodes.
Moreover, our synthesized circuits significantly outperform the state-of-the-art results from several competitive winners in IWLS 2022 and 2023 competitions.

NeurIPS Conference 2024 Conference Paper

Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions

  • Rui Yang
  • Jie Wang
  • Guoping Wu
  • Bin Li

Real-world offline datasets are often subject to data corruptions (such as noise or adversarial attacks) due to sensor failures or malicious attacks. Despite advances in robust offline reinforcement learning (RL), existing methods struggle to learn robust agents under high uncertainty caused by the diverse corrupted data (i.e., corrupted states, actions, rewards, and dynamics), leading to performance degradation in clean environments. To tackle this problem, we propose a novel robust variational Bayesian inference for offline RL (TRACER). It introduces Bayesian inference for the first time to capture the uncertainty via offline data for robustness against all types of data corruptions. Specifically, TRACER first models all corruptions as the uncertainty in the action-value function. Then, to capture such uncertainty, it uses all offline data as the observations to approximate the posterior distribution of the action-value function under a Bayesian inference framework. An appealing feature of TRACER is that it can distinguish corrupted data from clean data using an entropy-based uncertainty measure, since corrupted data often induces higher uncertainty and entropy. Based on the aforementioned measure, TRACER can regulate the loss associated with corrupted data to reduce its influence, thereby enhancing robustness and performance in clean environments. Experiments demonstrate that TRACER significantly outperforms several state-of-the-art approaches across both individual and simultaneous data corruptions.

EAAI Journal 2024 Journal Article

Unsupervised heat balance indicator construction based on variational autoencoder and its application to aluminum electrolysis process monitoring

  • Jie Wang
  • Shiwen Xie
  • Yongfang Xie
  • Xiaofang Chen

Heat balance plays a significant role in reflecting the health state of the aluminum electrolysis process (AEP). However, current methods hardly consider quantitative Heat Balance Indicator (HBI) construction from unlabeled data. In addition, constructing the HBI by learning the complex relationship between degraded features and large-scale HBI labels in a supervised manner is limited, because labeled data are scarce and annotations are expensive in practice. To quantitatively construct the HBI from unlabeled data, this paper proposes an unsupervised HBI construction method based on a variational autoencoder (VAE). Firstly, we propose a fuzzy evaluation strategy to estimate the tendency of cell temperature to highlight the trend of heat balance. Rather than simply using the latent features, we extract a feature representation of the heat balance state that considers not only the latent features but also the reconstruction error. Finally, the HBI is constructed by calculating the distance between the feature representations of the normal heat balance state and the degraded state. Applications to heat balance monitoring in a real-world aluminum electrolysis plant verify its effectiveness. The experimental results demonstrate that our proposed HBI construction method can better represent the heat balance state of the AEP; the average fault detection rate reaches 80% for the monitored electrolytic cells, an increase of more than 3% over traditional monitoring statistics.

NeurIPS Conference 2023 Conference Paper

A Deep Instance Generative Framework for MILP Solvers Under Limited Data Availability

  • Zijie Geng
  • Xijun Li
  • Jie Wang
  • Xiao Li
  • Yongdong Zhang
  • Feng Wu

In the past few years, there has been an explosive surge in the use of machine learning (ML) techniques to address combinatorial optimization (CO) problems, especially mixed-integer linear programs (MILPs). Despite the achievements, the limited availability of real-world instances often leads to sub-optimal decisions and biased solver assessments, which motivates a suite of synthetic MILP instance generation techniques. However, existing methods either rely heavily on expert-designed formulations or struggle to capture the rich features of real-world instances. To tackle this problem, we propose G2MILP, the first deep generative framework for MILP instances. Specifically, G2MILP represents MILP instances as bipartite graphs, and applies a masked variational autoencoder to iteratively corrupt and replace parts of the original graphs to generate new ones. An appealing feature of G2MILP is that it can learn to generate novel and realistic MILP instances without prior expert-designed formulations, while simultaneously preserving the structures and computational hardness of real-world datasets. Thus the generated instances can facilitate downstream tasks for enhancing MILP solvers under limited data availability. We design a suite of benchmarks to evaluate the quality of the generated MILP instances. Experiments demonstrate that our method can produce instances that closely resemble real-world datasets in terms of both structures and computational hardness. The deliverables are released at https://miralab-ustc.github.io/L2O-G2MILP.
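The bipartite representation mentioned above is the standard one for MILPs: one node set for constraints, one for variables, with an edge per nonzero coefficient. A minimal sketch on a made-up constraint matrix (node naming is illustrative, not from the paper):

```python
import numpy as np

# Toy constraint matrix: 2 constraints, 3 variables (values are made up)
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 3.0, 4.0]])

# Bipartite edge list: (constraint node, variable node, coefficient)
edges = [(f"c{i}", f"x{j}", A[i, j])
         for i in range(A.shape[0])
         for j in range(A.shape[1])
         if A[i, j] != 0.0]

# One edge per nonzero coefficient
assert len(edges) == np.count_nonzero(A)
```

A generative model over such graphs can corrupt and re-predict constraint nodes and their incident edges, which is how the masked-VAE scheme produces new instances without an explicit problem formulation.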

NeurIPS Conference 2023 Conference Paper

Contextual Stochastic Bilevel Optimization

  • Yifan Hu
  • Jie Wang
  • Yao Xie
  • Andreas Krause
  • Daniel Kuhn

We introduce contextual stochastic bilevel optimization (CSBO) -- a stochastic bilevel optimization framework with the lower-level problem minimizing an expectation conditioned on some contextual information and the upper-level decision variable. This framework extends classical stochastic bilevel optimization when the lower-level decision maker responds optimally not only to the decision of the upper-level decision maker but also to some side information and when there are multiple or even infinitely many followers. It captures important applications such as meta-learning, personalized federated learning, end-to-end learning, and Wasserstein distributionally robust optimization with side information (WDRO-SI). Due to the presence of contextual information, existing single-loop methods for classical stochastic bilevel optimization are unable to converge. To overcome this challenge, we introduce an efficient double-loop gradient method based on the Multilevel Monte-Carlo (MLMC) technique and establish its sample and computational complexities. When specialized to stochastic nonconvex optimization, our method matches existing lower bounds. For meta-learning, the complexity of our method does not depend on the number of tasks. Numerical experiments further validate our theoretical results.
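In symbols, a formulation consistent with this description (the notation is illustrative, not copied from the paper) is:

```latex
\min_{x \in \mathcal{X}} \; \mathbb{E}_{\xi}\!\left[ f\big(x,\, y^{*}(x,\xi),\, \xi\big) \right]
\quad \text{s.t.} \quad
y^{*}(x,\xi) \in \operatorname*{arg\,min}_{y} \; \mathbb{E}_{\eta \mid \xi}\!\left[ g(x, y, \eta) \right].
```

Here $\xi$ is the contextual (side) information: each realization of $\xi$ induces its own lower-level problem, which is why the framework covers multiple or even infinitely many followers, and why single-loop methods built for a single lower-level problem break down.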

AAAI Conference 2023 Conference Paper

Efficient Exploration in Resource-Restricted Reinforcement Learning

  • Zhihai Wang
  • Taoxing Pan
  • Qi Zhou
  • Jie Wang

In many real-world applications of reinforcement learning (RL), performing actions requires consuming certain types of resources that are non-replenishable in each episode. Typical applications include robotic control with limited energy and video games with consumable items. In tasks with non-replenishable resources, we observe that popular RL methods such as soft actor critic suffer from poor sample efficiency. The major reason is that they tend to exhaust resources fast, and thus the subsequent exploration is severely restricted due to the absence of resources. To address this challenge, we first formalize the aforementioned problem as resource-restricted reinforcement learning, and then propose a novel resource-aware exploration bonus (RAEB) to make reasonable usage of resources. An appealing feature of RAEB is that it can significantly reduce unnecessary resource-consuming trials while effectively encouraging the agent to explore unvisited states. Experiments demonstrate that the proposed RAEB significantly outperforms state-of-the-art exploration strategies in resource-restricted reinforcement learning environments, improving the sample efficiency by up to an order of magnitude.

NeurIPS Conference 2023 Conference Paper

Learning Rule-Induced Subgraph Representations for Inductive Relation Prediction

  • Tianyu Liu
  • Qitan Lv
  • Jie Wang
  • Shuling Yang
  • Hanzhu Chen

Inductive relation prediction (IRP)---where entities can be different during training and inference---has shown great power for completing evolving knowledge graphs. Existing works mainly focus on using graph neural networks (GNNs) to learn the representation of the subgraph induced from the target link, which can be seen as an implicit rule-mining process to measure the plausibility of the target link. However, these methods are not able to differentiate the target link and other links during message passing, hence the final subgraph representation will contain irrelevant rule information to the target link, which reduces the reasoning performance and severely hinders the applications for real-world scenarios. To tackle this problem, we propose a novel $\textit{single-source edge-wise}$ GNN model to learn the $\textbf{R}$ule-induc$\textbf{E}$d $\textbf{S}$ubgraph represen$\textbf{T}$ations ($\textbf{REST}$), which encodes relevant rules and eliminates irrelevant rules within the subgraph. Specifically, we propose a $\textit{single-source}$ initialization approach to initialize edge features only for the target link, which guarantees the relevance of mined rules and target link. Then we propose several RNN-based functions for $\textit{edge-wise}$ message passing to model the sequential property of mined rules. REST is a simple and effective approach with theoretical support to learn the $\textit{rule-induced subgraph representation}$. Moreover, REST does not need node labeling, which significantly accelerates the subgraph preprocessing time by up to $\textbf{11.66}\times$. Experiments on inductive relation prediction benchmarks demonstrate the effectiveness of our REST.

AAAI Conference 2023 Conference Paper

Robust Representation Learning by Clustering with Bisimulation Metrics for Visual Reinforcement Learning with Distractions

  • Qiyuan Liu
  • Qi Zhou
  • Rui Yang
  • Jie Wang

Recent work has shown that representation learning plays a critical role in sample-efficient reinforcement learning (RL) from pixels. Unfortunately, in real-world scenarios, representation learning is usually fragile to task-irrelevant distractions such as variations in background or viewpoint. To tackle this problem, we propose a novel clustering-based approach, namely Clustering with Bisimulation Metrics (CBM), which learns robust representations by grouping visual observations in the latent space. Specifically, CBM alternates between two steps: (1) grouping observations by measuring their bisimulation distances to the learned prototypes; (2) learning a set of prototypes according to the current cluster assignments. Computing cluster assignments with bisimulation metrics enables CBM to capture task-relevant information, as bisimulation metrics quantify the behavioral similarity between observations. Moreover, CBM encourages the consistency of representations within each group, which facilitates filtering out task-irrelevant information and thus induces robust representations against distractions. An appealing feature is that CBM can achieve sample-efficient representation learning even if multiple distractions exist simultaneously. Experiments demonstrate that CBM significantly improves the sample efficiency of popular visual RL algorithms and achieves state-of-the-art performance on both multiple and single distraction settings. The code is available at https://github.com/MIRALab-USTC/RL-CBM.
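The two alternating steps described above follow a generic prototype-clustering loop. The sketch below shows only that alternation; the real method measures bisimulation distances in a learned latent space, and plain Euclidean distance is substituted here purely for a runnable illustration (all data and names are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
obs = rng.standard_normal((100, 4))                   # stand-in latent observations
protos = obs[rng.choice(100, 3, replace=False)].copy()  # 3 initial prototypes

for _ in range(10):
    # Step 1: assign each observation to its nearest prototype
    # (CBM would use a bisimulation distance here, not Euclidean)
    d = np.linalg.norm(obs[:, None, :] - protos[None, :, :], axis=-1)
    assign = d.argmin(axis=1)
    # Step 2: refit each prototype from its current cluster members
    for k in range(3):
        if np.any(assign == k):
            protos[k] = obs[assign == k].mean(axis=0)

assert assign.shape == (100,)
assert protos.shape == (3, 4)
```

The point of swapping in a behavioral (bisimulation) distance at step 1 is that observations differing only in task-irrelevant details (background, viewpoint) end up in the same cluster, so the representation consistency enforced within each group filters out the distraction.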

NeurIPS Conference 2023 Conference Paper

State Sequences Prediction via Fourier Transform for Representation Learning

  • Mingxuan Ye
  • Yufei Kuang
  • Jie Wang
  • Yang Rui
  • Wengang Zhou
  • Houqiang Li
  • Feng Wu

While deep reinforcement learning (RL) has been demonstrated effective in solving complex control tasks, sample efficiency remains a key challenge due to the large amounts of data required for remarkable performance. Existing research explores the application of representation learning for data-efficient RL, e.g., learning predictive representations by predicting long-term future states. However, many existing methods do not fully exploit the structural information inherent in sequential state signals, which can potentially improve the quality of long-term decision-making but is difficult to discern in the time domain. To tackle this problem, we propose State Sequences Prediction via Fourier Transform (SPF), a novel method that exploits the frequency domain of state sequences to extract the underlying patterns in time series data for learning expressive representations efficiently. Specifically, we theoretically analyze the existence of structural information in state sequences, which is closely related to policy performance and signal regularity, and then propose to predict the Fourier transform of infinite-step future state sequences to extract such information. One of the appealing features of SPF is that it is simple to implement while not requiring storage of infinite-step future states as prediction targets. Experiments demonstrate that the proposed method outperforms several state-of-the-art algorithms in terms of both sample efficiency and performance.
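The core idea of the abstract above — supervising a representation with frequency-domain features of a state sequence rather than raw future states — can be sketched minimally in numpy. This is an illustrative approximation, not the paper's method: a finite window of future states stands in for the infinite-step sequence, and `fourier_prediction_target` is a hypothetical helper name.

```python
import numpy as np

def fourier_prediction_target(states, k=8):
    """Build a frequency-domain prediction target from a (T, d) window of
    future states: keep the first k real-FFT coefficients per dimension and
    split them into real/imaginary parts so a network can regress them."""
    coeffs = np.fft.rfft(states, axis=0)[:k]                    # (k, d) complex
    return np.concatenate([coeffs.real, coeffs.imag], axis=0)   # (2k, d) real

# toy 1-D trajectory: a smooth periodic signal, as in regular control tasks
states = np.sin(np.linspace(0, 4 * np.pi, 64))[:, None]
target = fourier_prediction_target(states, k=8)
print(target.shape)  # (16, 1)
```

A fixed number of low-frequency coefficients gives a compact target that captures the periodic structure of the trajectory without storing the full future sequence.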

AAAI Conference 2023 Conference Paper

Tagging before Alignment: Integrating Multi-Modal Tags for Video-Text Retrieval

  • Yizhen Chen
  • Jie Wang
  • Lijian Lin
  • Zhongang Qi
  • Jin Ma
  • Ying Shan

Vision-language alignment learning for video-text retrieval has attracted much attention in recent years. Most existing methods either transfer the knowledge of an image-text pretraining model to the video-text retrieval task without fully exploring the multi-modal information of videos, or simply fuse multi-modal features in a brute-force manner without explicit guidance. In this paper, we integrate multi-modal information in an explicit manner by tagging, and use the tags as anchors for better video-text alignment. Various pretrained experts are utilized to extract the information of multiple modalities, including object, person, motion, audio, etc. To take full advantage of this information, we propose the TABLE (TAgging Before aLignmEnt) network, which consists of a visual encoder, a tag encoder, a text encoder, and a tag-guiding cross-modal encoder for jointly encoding multi-frame visual features and multi-modal tag information. Furthermore, to strengthen the interaction between video and text, we build a joint cross-modal encoder with the triplet input of [vision, tag, text] and perform two additional supervised tasks, Video Text Matching (VTM) and Masked Language Modeling (MLM). Extensive experimental results demonstrate that the TABLE model achieves State-Of-The-Art (SOTA) performance on various video-text retrieval benchmarks, including MSR-VTT, MSVD, LSMDC and DiDeMo.

YNIMG Journal 2022 Journal Article

A novel technology for in vivo detection of cell type-specific neural connection with AQP1-encoding rAAV2-retro vector and metal-free MRI

  • Ning Zheng
  • Mei Li
  • Yang Wu
  • Challika Kaewborisuth
  • Zhen Li
  • Zhu Gui
  • Jinfeng Wu
  • Aoling Cai

A mammalian brain contains numerous neurons with distinct cell types for complex neural circuits. Virus-based circuit tracing tools are powerful in tracking the interaction among different brain regions. However, detecting brain-wide neural networks in vivo remains challenging since most viral tracing systems rely on postmortem optical imaging. We developed a novel approach that enables in vivo detection of brain-wide neural connections based on metal-free magnetic resonance imaging (MRI). The recombinant adeno-associated virus (rAAV) with retrograde ability, the rAAV2-retro, encoding the human water channel aquaporin 1 (AQP1) MRI reporter gene was generated to label neural connections. The mouse was micro-injected with the virus at the Caudate Putamen (CPU) region and subjected to detection with diffusion-weighted MRI (DWI). The prominent structure of the CPU-connected network was clearly defined. In combination with a Cre-loxP system, rAAV2-retro expressing Cre-dependent AQP1 provides a CPU-connected network of specific types of neurons. Here, we established a sensitive, metal-free MRI-based strategy for in vivo detection of cell type-specific neural connections in the whole brain, which could visualize the dynamic changes of neural networks in rodents and potentially in non-human primates.

IJCAI Conference 2022 Conference Paper

C3-STISR: Scene Text Image Super-resolution with Triple Clues

  • Minyi Zhao
  • Miao Wang
  • Fan Bai
  • Bingjia Li
  • Jie Wang
  • Shuigeng Zhou

Scene text image super-resolution (STISR) has been regarded as an important pre-processing task for text recognition from low-resolution scene text images. Most recent approaches use the recognizer's feedback as clues to guide super-resolution. However, directly using the recognition clue has two problems: 1) Compatibility: it is in the form of a probability distribution and has an obvious modal gap with STISR, a pixel-level task; 2) Inaccuracy: it usually contains wrong information, which misleads the main task and degrades super-resolution performance. In this paper, we present a novel method, C3-STISR, that jointly exploits the recognizer's feedback, visual, and linguistic information as clues to guide super-resolution. Here, the visual clue comes from the images of texts predicted by the recognizer, which is informative and more compatible with the STISR task; the linguistic clue is generated by a pre-trained character-level language model, which is able to correct the predicted texts. We design effective extraction and fusion mechanisms for the triple cross-modal clues to generate a comprehensive and unified guidance for super-resolution. Extensive experiments on TextZoom show that C3-STISR outperforms the SOTA methods in fidelity and recognition performance. Code is available at https://github.com/zhaominyiz/C3-STISR.

AAAI Conference 2022 Conference Paper

Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization

  • Yufei Kuang
  • Miao Lu
  • Jie Wang
  • Qi Zhou
  • Bin Li
  • Houqiang Li

Deep reinforcement learning algorithms can perform poorly in real-world tasks due to the discrepancy between source and target environments. This discrepancy is commonly viewed as the disturbance in transition dynamics. Many existing algorithms learn robust policies by modeling the disturbance and applying it to source environments during training, which usually requires prior knowledge about the disturbance and control of simulators. However, these algorithms can fail in scenarios where the disturbance from target environments is unknown or is intractable to model in simulators. To tackle this problem, we propose a novel model-free actor-critic algorithm—namely, State-Conservative Policy Optimization (SCPO)—to learn robust policies without modeling the disturbance in advance. Specifically, SCPO reduces the disturbance in transition dynamics to that in state space and then approximates it by a simple gradient-based regularizer. The appealing features of SCPO include that it is simple to implement and does not require additional knowledge about the disturbance or specially designed simulators. Experiments in several robot control tasks demonstrate that SCPO learns robust policies against the disturbance in transition dynamics.
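The "simple gradient-based regularizer" mentioned in the SCPO abstract can be illustrated with a minimal numpy sketch, assuming (as the abstract suggests) that sensitivity of the value estimate to state perturbations is what gets penalized. The finite-difference approximation and the name `state_gradient_penalty` are illustrative choices, not the paper's implementation.

```python
import numpy as np

def state_gradient_penalty(q_func, state, eps=1e-4):
    """Approximate the gradient of Q with respect to the state via central
    finite differences and return its squared norm as a regularization term.
    Penalizing this term discourages policies whose values swing sharply
    under small state perturbations (i.e., under dynamics disturbance)."""
    grad = np.zeros_like(state)
    for i in range(state.size):
        e = np.zeros_like(state)
        e[i] = eps
        grad[i] = (q_func(state + e) - q_func(state - e)) / (2 * eps)
    return float(grad @ grad)

# toy value function Q(s) = ||s||^2, whose exact state gradient is 2s
q = lambda s: float(s @ s)
penalty = state_gradient_penalty(q, np.array([1.0, 2.0]))
print(penalty)  # ≈ ||2s||^2 = 20.0
```

Because the penalty needs only evaluations of the critic at perturbed states, it requires no model of the disturbance itself, matching the abstract's claim of being model-free.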

IJCAI Conference 2022 Conference Paper

Learning Unforgotten Domain-Invariant Representations for Online Unsupervised Domain Adaptation

  • Cheng Feng
  • Chaoliang Zhong
  • Jie Wang
  • Ying Zhang
  • Jun Sun
  • Yasuto Yokota

Existing unsupervised domain adaptation (UDA) studies focus on transferring knowledge in an offline manner. However, many tasks involve online requirements, especially in real-time systems. In this paper, we discuss Online UDA (OUDA), which assumes that the target samples arrive sequentially as small batches. OUDA tasks are challenging for prior UDA methods since online training suffers from catastrophic forgetting, which leads to poor generalization. Intuitively, a good memory is a crucial factor in the success of OUDA. We formalize this intuition theoretically with a generalization bound where the OUDA target error can be bounded by the source error, the domain discrepancy distance, and a novel metric on forgetting in continuous online learning. Our theory illustrates the tradeoffs inherent in learning and remembering representations for OUDA. To minimize the proposed forgetting metric, we propose a novel source feature distillation (SFD) method which utilizes the source-only model as a teacher to guide the online training. In the experiments, we modify three UDA algorithms, i.e., DANN, CDAN, and MCC, and evaluate their performance on OUDA tasks with real-world datasets. By applying SFD, the performance of all baselines is significantly improved.

NeurIPS Conference 2022 Conference Paper

MsSVT: Mixed-scale Sparse Voxel Transformer for 3D Object Detection on Point Clouds

  • Shaocong Dong
  • Lihe Ding
  • Haiyang Wang
  • Tingfa Xu
  • Xinli Xu
  • Jie Wang
  • Ziyang Bian
  • Ying Wang

3D object detection from the LiDAR point cloud is fundamental to autonomous driving. Large-scale outdoor scenes usually feature significant variance in instance scales, thus requiring features rich in long-range and fine-grained information to support accurate detection. Recent detectors leverage the power of window-based transformers to model long-range dependencies but tend to blur out fine-grained details. To mitigate this gap, we present a novel Mixed-scale Sparse Voxel Transformer, named MsSVT, which can well capture both types of information simultaneously by the divide-and-conquer philosophy. Specifically, MsSVT explicitly divides attention heads into multiple groups, each in charge of attending to information within a particular range. The outputs of all groups are merged to obtain the final mixed-scale features. Moreover, we provide a novel chessboard sampling strategy to reduce the computational complexity of applying a window-based transformer in 3D voxel space. To improve efficiency, we also implement the voxel sampling and gathering operations sparsely with a hash map. Endowed with the powerful capability and high efficiency of modeling mixed-scale information, our single-stage detector built on top of MsSVT surprisingly outperforms state-of-the-art two-stage detectors on Waymo. Our project page: https://github.com/dscdyc/MsSVT.

AAAI Conference 2022 Conference Paper

Sample-Efficient Reinforcement Learning via Conservative Model-Based Actor-Critic

  • Zhihai Wang
  • Jie Wang
  • Qi Zhou
  • Bin Li
  • Houqiang Li

Model-based reinforcement learning algorithms, which aim to learn a model of the environment to make decisions, are more sample efficient than their model-free counterparts. The sample efficiency of model-based approaches relies on whether the model can well approximate the environment. However, learning an accurate model is challenging, especially in complex and noisy environments. To tackle this problem, we propose the conservative model-based actor-critic (CMBAC), a novel approach that achieves high sample efficiency without the strong reliance on accurate learned models. Specifically, CMBAC learns multiple estimates of the Q-value function from a set of inaccurate models and uses the average of the bottom-k estimates—a conservative estimate—to optimize the policy. An appealing feature of CMBAC is that the conservative estimates effectively encourage the agent to avoid unreliable "promising actions"—whose values are high in only a small fraction of the models. Experiments demonstrate that CMBAC significantly outperforms state-of-the-art approaches in terms of sample efficiency on several challenging tasks, and the proposed method is more robust than previous methods in noisy environments.
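The bottom-k averaging described above is straightforward to sketch. This is a minimal illustration of the aggregation step only (the ensemble values `q_ens` below are made-up numbers, and `conservative_q` is an illustrative name, not the paper's API):

```python
import numpy as np

def conservative_q(q_estimates, k):
    """Average the bottom-k Q-value estimates across an ensemble axis.
    Shape: (n_models, n_actions) -> (n_actions,)."""
    q = np.sort(np.asarray(q_estimates), axis=0)  # ascending over models
    return q[:k].mean(axis=0)

# five model-induced Q estimates for two candidate actions; action 1 looks
# "promising" in a single model (9.0) but poor in the rest
q_ens = np.array([[1.0, 9.0],
                  [1.2, 0.5],
                  [0.9, 0.4],
                  [1.1, 0.6],
                  [1.0, 0.5]])
print(conservative_q(q_ens, k=3))  # bottom-3 mean per action
```

Note how the single optimistic outlier for action 1 is excluded by the bottom-k average, so the conservative estimate prefers action 0, which is consistently valued across the ensemble.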

AIIM Journal 2022 Journal Article

The interactive fuzzy linguistic term set and its application in multi-attribute decision making

  • Dan Peng
  • Jie Wang
  • Donghai Liu
  • Yu Cheng

In multi-attribute decision making problems, different pieces of decision information interact with each other. This paper proposes an interactive fuzzy linguistic term set to describe such interactive information in multi-attribute decision making problems. The properties of the interactive fuzzy linguistic term set and its advantages for improving the consistency of decision information are discussed and interpreted from a geometric point of view. Meanwhile, some numerical examples are given to illustrate its application in dealing with interactive information in multi-attribute decision making problems, which can improve the effectiveness of the decision results and promote the development of artificial intelligence.

NeurIPS Conference 2022 Conference Paper

Towards Video Text Visual Question Answering: Benchmark and Baseline

  • Minyi Zhao
  • Bingjia Li
  • Jie Wang
  • Wanqing Li
  • Wenjing Zhou
  • Lan Zhang
  • Shijie Xuyang
  • Zhihang Yu

Several text-based visual question answering (TextVQA) benchmarks have been proposed in recent years for developing machines' ability to answer questions based on texts in images. However, models developed on these benchmarks cannot work effectively in many real-life scenarios (e.g., traffic monitoring, shopping ads and e-learning videos) where temporal reasoning ability is required. To this end, we propose a new task named Video Text Visual Question Answering (ViteVQA in short) that aims at answering questions by spatiotemporally reasoning over texts and visual information in a given video. In particular, on the one hand, we build the first ViteVQA benchmark dataset named M4-ViteVQA --- the abbreviation of Multi-category Multi-frame Multi-resolution Multi-modal benchmark for ViteVQA, which contains 7,620 video clips of 9 categories (i.e., shopping, traveling, driving, vlog, sport, advertisement, movie, game and talking) and 3 kinds of resolutions (i.e., 720p, 1080p and 1176x664), and 25,123 question-answer pairs. On the other hand, we develop a baseline method named T5-ViteVQA for the ViteVQA task. T5-ViteVQA consists of five transformers. It first extracts optical character recognition (OCR) tokens, question features, and video representations via two OCR transformers, one language transformer and one video-language transformer, respectively. Then, a multimodal fusion transformer and an answer generation module are applied to fuse multimodal information and generate the final prediction. Extensive experiments on M4-ViteVQA demonstrate the superiority of T5-ViteVQA over the existing approaches to TextVQA and VQA tasks. The ViteVQA benchmark is available at https://github.com/bytedance/VTVQA.

NeurIPS Conference 2021 Conference Paper

ConE: Cone Embeddings for Multi-Hop Reasoning over Knowledge Graphs

  • Zhanqiu Zhang
  • Jie Wang
  • Jiajun Chen
  • Shuiwang Ji
  • Feng Wu

Query embedding (QE)---which aims to embed entities and first-order logical (FOL) queries in low-dimensional spaces---has shown great power in multi-hop reasoning over knowledge graphs. Recently, embedding entities and queries with geometric shapes has become a promising direction, as geometric shapes can naturally represent answer sets of queries and logical relationships among them. However, existing geometry-based models have difficulty in modeling queries with negation, which significantly limits their applicability. To address this challenge, we propose a novel query embedding model, namely \textbf{Con}e \textbf{E}mbeddings (ConE), which is the first geometry-based QE model that can handle all the FOL operations, including conjunction, disjunction, and negation. Specifically, ConE represents entities and queries as Cartesian products of two-dimensional cones, where the intersection and union of cones naturally model the conjunction and disjunction operations. By further noticing that the closure of the complement of cones remains cones, we design geometric complement operators in the embedding space for the negation operations. Experiments demonstrate that ConE significantly outperforms existing state-of-the-art methods on benchmark datasets.

AAAI Conference 2021 Conference Paper

Topology-Aware Correlations Between Relations for Inductive Link Prediction in Knowledge Graphs

  • Jiajun Chen
  • Huarui He
  • Feng Wu
  • Jie Wang

Inductive link prediction—where entities during training and inference stages can be different—has been shown to be promising for completing continuously evolving knowledge graphs. Existing models of inductive reasoning mainly focus on predicting missing links by learning logical rules. However, many existing approaches do not take into account semantic correlations between relations, which are commonly seen in real-world knowledge graphs. To address this challenge, we propose a novel inductive reasoning approach, namely TACT, which can effectively exploit Topology-Aware CorrelaTions between relations in an entity-independent manner. TACT is inspired by the observation that the semantic correlation between two relations is highly correlated to their topological structure in knowledge graphs. Specifically, we categorize all relation pairs into several topological patterns, and then propose a Relational Correlation Network (RCN) to learn the importance of the different patterns for inductive link prediction. Experiments demonstrate that TACT can effectively model semantic correlations between relations, and significantly outperforms existing state-of-the-art methods on benchmark datasets for the inductive link prediction task.

AAAI Conference 2020 Conference Paper

An Iterative Polishing Framework Based on Quality Aware Masked Language Model for Chinese Poetry Generation

  • Liming Deng
  • Jie Wang
  • Hangming Liang
  • Hui Chen
  • Zhiqiang Xie
  • Bojin Zhuang
  • Shaojun Wang
  • Jing Xiao

Owing to its unique literary and aesthetic characteristics, automatic generation of Chinese poetry remains challenging in artificial intelligence and can hardly be realized straightforwardly by end-to-end methods. In this paper, we propose a novel iterative polishing framework for high-quality Chinese poetry generation. In the first stage, an encoder-decoder structure is utilized to generate a poem draft. Afterwards, our proposed Quality-Aware Masked Language Model (QA-MLM) is employed to polish the draft towards higher quality in terms of linguistics and literalness. Based on a multi-task learning scheme, QA-MLM is able to determine whether polishing is needed based on the poem draft. Furthermore, QA-MLM is able to localize improper characters of the poem draft and substitute them with newly predicted ones. Benefiting from the masked language model structure, QA-MLM incorporates global context information into the polishing process, which yields more appropriate polishing results than unidirectional sequential decoding. Moreover, the iterative polishing process terminates automatically when QA-MLM regards the processed poem as qualified. Both human and automatic evaluations have been conducted, and the results demonstrate that our approach effectively improves the performance of the encoder-decoder structure.

AAAI Conference 2020 Conference Paper

D-SPIDER-SFO: A Decentralized Optimization Algorithm with Faster Convergence Rate for Nonconvex Problems

  • Taoxing Pan
  • Jun Liu
  • Jie Wang

Decentralized optimization algorithms have attracted intensive interest recently, as they have balanced communication patterns, especially when solving large-scale machine learning problems. The Stochastic Path Integrated Differential Estimator Stochastic First-Order method (SPIDER-SFO) nearly achieves the algorithmic lower bound in certain regimes for nonconvex problems. However, whether we can find a decentralized algorithm that achieves a similar convergence rate to SPIDER-SFO is still unclear. To tackle this problem, we propose a decentralized variant of SPIDER-SFO, called decentralized SPIDER-SFO (D-SPIDER-SFO). We show that D-SPIDER-SFO achieves a similar gradient computation cost—that is, $O(\epsilon^{-3})$ for finding an $\epsilon$-approximate first-order stationary point—to its centralized counterpart. To the best of our knowledge, D-SPIDER-SFO achieves the state-of-the-art performance for solving nonconvex optimization problems on decentralized networks in terms of the computational cost. Experiments on different network configurations demonstrate the efficiency of the proposed method.

AAAI Conference 2020 Conference Paper

Deep Model-Based Reinforcement Learning via Estimated Uncertainty and Conservative Policy Optimization

  • Qi Zhou
  • Houqiang Li
  • Jie Wang

Model-based reinforcement learning algorithms tend to achieve higher sample efficiency than model-free methods. However, due to the inevitable errors of learned models, model-based methods struggle to achieve the same asymptotic performance as model-free methods. In this paper, we propose Policy Optimization with Model-Based Uncertainty (POMBU)—a novel model-based approach—that can effectively improve the asymptotic performance using the uncertainty in Q-values. We derive an upper bound of the uncertainty, based on which we can approximate the uncertainty accurately and efficiently for model-based methods. We further propose an uncertainty-aware policy optimization algorithm that optimizes the policy conservatively to encourage performance improvement with high probability. This can significantly alleviate the overfitting of the policy to inaccurate models. Experiments show that POMBU can outperform existing state-of-the-art policy optimization algorithms in terms of sample efficiency and asymptotic performance. Moreover, the experiments demonstrate the excellent robustness of POMBU compared to previous model-based approaches.

NeurIPS Conference 2020 Conference Paper

Duality-Induced Regularizer for Tensor Factorization Based Knowledge Graph Completion

  • Zhanqiu Zhang
  • Jianyu Cai
  • Jie Wang

Tensor factorization based models have shown great power in knowledge graph completion (KGC). However, their performance usually suffers seriously from the overfitting problem. This motivates various regularizers---such as the squared Frobenius norm and tensor nuclear norm regularizers---but their limited applicability significantly restricts their practical usage. To address this challenge, we propose a novel regularizer---namely, \textbf{DU}ality-induced \textbf{R}egul\textbf{A}rizer (DURA)---which is not only effective in improving the performance of existing models but also widely applicable to various methods. The major novelty of DURA is based on the observation that, for an existing tensor factorization based KGC model (\textit{primal}), there is often another distance based KGC model (\textit{dual}) closely associated with it.

AAAI Conference 2020 Conference Paper

Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction

  • Zhanqiu Zhang
  • Jianyu Cai
  • Yongdong Zhang
  • Jie Wang

Knowledge graph embedding, which aims to represent entities and relations as low-dimensional vectors (or matrices, tensors, etc.), has been shown to be a powerful technique for predicting missing links in knowledge graphs. Existing knowledge graph embedding models mainly focus on modeling relation patterns such as symmetry/antisymmetry, inversion, and composition. However, many existing approaches fail to model semantic hierarchies, which are common in real-world applications. To address this challenge, we propose a novel knowledge graph embedding model—namely, Hierarchy-Aware Knowledge Graph Embedding (HAKE)—which maps entities into the polar coordinate system. HAKE is inspired by the fact that concentric circles in the polar coordinate system can naturally reflect the hierarchy. Specifically, the radial coordinate aims to model entities at different levels of the hierarchy, and entities with smaller radii are expected to be at higher levels; the angular coordinate aims to distinguish entities at the same level of the hierarchy, and these entities are expected to have roughly the same radii but different angles. Experiments demonstrate that HAKE can effectively model the semantic hierarchies in knowledge graphs, and significantly outperforms existing state-of-the-art methods on benchmark datasets for the link prediction task.
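The radial/angular split described above can be sketched as a triple-scoring function. The exact form below is an assumption for illustration (a modulus term for hierarchy levels plus a phase term for same-level separation); consult the paper for HAKE's actual score.

```python
import numpy as np

def hake_style_score(h_mod, h_phase, r_mod, r_phase, t_mod, t_phase, lam=1.0):
    """HAKE-style polar-coordinate score (assumed form, for illustration):
    a radial (modulus) distance penalizes hierarchy-level mismatch and an
    angular (phase) distance separates entities at the same level."""
    d_radial = np.linalg.norm(h_mod * r_mod - t_mod)                 # level term
    d_angular = np.abs(np.sin((h_phase + r_phase - t_phase) / 2)).sum()
    return -(d_radial + lam * d_angular)  # higher score = more plausible triple

# a triple whose moduli and phases match exactly scores 0; mismatches score < 0
perfect = hake_style_score(np.ones(4), np.zeros(4),
                           np.ones(4), np.zeros(4),
                           np.ones(4), np.zeros(4))
```

Using `sin` of the half phase difference makes the angular term periodic in 2π, so phases that differ by a full turn are treated as identical.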

AAAI Conference 2020 Short Paper

Learning Sense Representation from Word Representation for Unsupervised Word Sense Disambiguation (Student Abstract)

  • Jie Wang
  • Zhenxin Fu
  • Moxin Li
  • Haisong Zhang
  • Dongyan Zhao
  • Rui Yan

Unsupervised WSD methods do not rely on annotated training datasets and can use WordNet. Since each ambiguous word in the WSD task exists in WordNet and each sense of the word has a gloss, we propose SGM and MGM to learn sense representations for words in WordNet using the glosses. In the WSD task, we calculate the similarity between each sense of the ambiguous word and its context to select the sense with the highest similarity. We evaluate our method on several benchmark WSD datasets and achieve better performance than the state-of-the-art unsupervised WSD systems.

NeurIPS Conference 2020 Conference Paper

Promoting Stochasticity for Expressive Policies via a Simple and Efficient Regularization Method

  • Qi Zhou
  • Yufei Kuang
  • Zherui Qiu
  • Houqiang Li
  • Jie Wang

Many recent reinforcement learning (RL) methods learn stochastic policies with entropy regularization for exploration and robustness. However, in continuous action spaces, integrating entropy regularization with expressive policies is challenging and usually requires complex inference procedures. To tackle this problem, we propose a novel regularization method that is compatible with a broad range of expressive policy architectures. An appealing feature is that the estimation of our regularization terms is simple and efficient even when the policy distributions are unknown. We show that our approach can effectively promote exploration in continuous action spaces. Based on our regularization, we propose an off-policy actor-critic algorithm. Experiments demonstrate that the proposed algorithm outperforms state-of-the-art regularized RL methods in continuous control tasks.

YNIMG Journal 2019 Journal Article

Detection of neural connections with ex vivo MRI using a ferritin-encoding trans-synaptic virus

  • Ning Zheng
  • Peng Su
  • Yue Liu
  • Huadong Wang
  • Binbin Nie
  • Xiaohui Fang
  • Yue Xu
  • Kunzhang Lin

The elucidation of neural networks is essential to understanding the mechanisms of brain functions and brain disorders. Neurotropic virus-based trans-synaptic tracing tools have become an effective method for dissecting the structure and analyzing the function of neural circuitry. However, these tracing systems rely on fluorescent signals, making it hard to visualize the panorama of the labeled networks in the mammalian brain in vivo. One MRI method, Diffusion Tensor Imaging (DTI), is capable of imaging the networks of the whole brain in live animals but provides no information about anatomical connections through synapses. In this report, a chimeric gene coding for ferritin and enhanced green fluorescent protein (EGFP) was integrated into Vesicular stomatitis virus (VSV), a neurotropic virus that is able to spread anterogradely in synaptically connected networks. Four days after the animal was injected with the recombinant VSV (rVSV), rVSV-Ferritin-EGFP, into the somatosensory cortex (SC), the labeled neural network was visualized in the postmortem whole brain with a T2-weighted MRI sequence. The modified virus transmitted from the SC to synaptically connected downstream regions. The results demonstrate that rVSV-Ferritin-EGFP could be used as a bimodal imaging vector for detecting synaptically connected neural networks with both ex vivo MRI and fluorescent imaging. The strategy in the current study has the potential to longitudinally monitor the global structure of a given neural network in living animals.

AAAI Conference 2019 Conference Paper

Gated Residual Recurrent Graph Neural Networks for Traffic Prediction

  • Cen Chen
  • Kenli Li
  • Sin G. Teo
  • Xiaofeng Zou
  • Kang Wang
  • Jie Wang
  • Zeng Zeng

Traffic prediction is of great importance to traffic management and public safety, and very challenging as it is affected by many complex factors, such as the spatial dependency of complicated road networks and temporal dynamics. These factors make traffic prediction a challenging task due to the uncertainty and complexity of traffic states. In the literature, many research works have applied deep learning methods to traffic prediction by combining convolutional neural networks (CNNs) with recurrent neural networks (RNNs), in which CNNs are utilized for spatial dependency and RNNs for temporal dynamics. However, such combinations cannot capture the connectivity and globality of traffic networks. In this paper, we first propose to adopt residual recurrent graph neural networks (Res-RGNN), which can capture graph-based spatial dependencies and temporal dynamics jointly. Due to gradient vanishing, RNNs struggle to capture periodic temporal correlations. Hence, we further propose a novel hop scheme within Res-RGNN to exploit the periodic temporal dependencies. Based on Res-RGNN and hop Res-RGNN, we finally propose a novel end-to-end framework of multiple Res-RGNNs, referred to as “MRes-RGNN”, for traffic prediction. Experimental results on two traffic datasets demonstrate that the proposed MRes-RGNN significantly outperforms state-of-the-art methods.

JMLR Journal 2019 Journal Article

Scaling Up Sparse Support Vector Machines by Simultaneous Feature and Sample Reduction

  • Bin Hong
  • Weizhong Zhang
  • Wei Liu
  • Jieping Ye
  • Deng Cai
  • Xiaofei He
  • Jie Wang

Sparse support vector machine (SVM) is a popular classification technique that can simultaneously learn a small set of the most interpretable features and identify the support vectors. It has achieved great successes in many real-world applications. However, for large-scale problems involving a huge number of samples and ultra-high dimensional features, solving sparse SVMs remains challenging. By noting that sparse SVMs induce sparsities in both feature and sample spaces, we propose a novel approach, which is based on accurate estimations of the primal and dual optima of sparse SVMs, to simultaneously identify the inactive features and samples that are guaranteed to be irrelevant to the outputs. Thus, we can remove the identified inactive samples and features from the training phase, leading to substantial savings in the computational cost without sacrificing the accuracy. Moreover, we show that our method can be extended to multi-class sparse support vector machines. To the best of our knowledge, the proposed method is the first static feature and sample reduction method for sparse SVMs and multi-class sparse SVMs. Experiments on both synthetic and real data sets demonstrate that our approach significantly outperforms state-of-the-art methods and the speedup gained by our approach can be orders of magnitude.

JMLR Journal 2019 Journal Article

Two-Layer Feature Reduction for Sparse-Group Lasso via Decomposition of Convex Sets

  • Jie Wang
  • Zhanqiu Zhang
  • Jieping Ye

Sparse-Group Lasso (SGL) has been shown to be a powerful regression technique for simultaneously discovering group and within-group sparse patterns by using a combination of the $\ell_1$ and $\ell_2$ norms. However, in large-scale applications, the complexity of the regularizers entails great computational challenges. In this paper, we propose a novel two-layer feature reduction method (TLFre) for SGL via a decomposition of its dual feasible set. The two-layer reduction is able to quickly identify the inactive groups and the inactive features, respectively, which are guaranteed to be absent from the sparse representation and can be removed from the optimization. Existing feature reduction methods are only applicable to sparse models with one sparsity-inducing regularizer. To the best of our knowledge, TLFre is the first method capable of dealing with multiple sparsity-inducing regularizers. Moreover, TLFre has a very low computational cost and can be integrated with any existing solvers. We also develop a screening method---called DPC (decomposition of convex set)---for nonnegative Lasso. Experiments on both synthetic and real data sets show that TLFre and DPC improve the efficiency of SGL and nonnegative Lasso by several orders of magnitude.

JMLR Journal 2015 Journal Article

Lasso Screening Rules via Dual Polytope Projection

  • Jie Wang
  • Peter Wonka
  • Jieping Ye

Lasso is a widely used regression technique to find sparse representations. When the dimension of the feature space and the number of samples are extremely large, solving the Lasso problem remains challenging. To improve the efficiency of solving large-scale Lasso problems, El Ghaoui and his colleagues proposed the SAFE rules, which are able to quickly identify the inactive predictors, i.e., predictors that have $0$ components in the solution vector. Then, the inactive predictors or features can be removed from the optimization problem to reduce its scale. By transforming the standard Lasso to its dual form, it can be shown that the inactive predictors correspond to inactive constraints on the optimal dual solution. In this paper, we propose an efficient and effective screening rule via Dual Polytope Projections (DPP), which is mainly based on the uniqueness and nonexpansiveness of the optimal dual solution due to the fact that the feasible set in the dual space is a convex and closed polytope. Moreover, we show that our screening rule can be extended to identify inactive groups in group Lasso. To the best of our knowledge, there is currently no exact screening rule for group Lasso. We have evaluated our screening rule using synthetic and real data sets. Results show that our rule is more effective in identifying inactive predictors than existing state-of-the-art screening rules for Lasso.
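The geometric idea behind DPP fits in a few lines of NumPy. Because the dual optimum is the projection of $y/\lambda$ onto the dual feasible polytope, nonexpansiveness of projections confines $\theta^*(\lambda)$ to a ball around the known optimum at $\lambda_{\max}$. The sketch below is a hedged illustration of that ball-based test, not a verbatim reproduction of the paper's rules; the helper name `dpp_screen` and the toy data are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 200
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

# Smallest lambda for which the all-zero vector already solves the Lasso.
lam_max = np.abs(X.T @ y).max()

# At lam_max the dual optimum is known in closed form: theta0 = y / lam_max.
theta0 = y / lam_max

def dpp_screen(lam):
    """Boolean mask of features certified inactive at regularization lam.

    theta*(lam) is the projection of y/lam onto the dual feasible polytope,
    so by nonexpansiveness of projections onto convex sets:
        ||theta*(lam) - theta0|| <= ||y|| * |1/lam - 1/lam_max|.
    Feature j is safely discarded whenever the implied upper bound on
    |x_j^T theta*(lam)| stays strictly below 1 (the KKT inactivity test).
    """
    r = np.linalg.norm(y) * abs(1.0 / lam - 1.0 / lam_max)
    upper = np.abs(X.T @ theta0) + np.linalg.norm(X, axis=0) * r
    return upper < 1.0

for frac in (0.95, 0.8, 0.5):
    mask = dpp_screen(frac * lam_max)
    print(f"lambda = {frac:.2f}*lambda_max: discarded {mask.sum()} of {p} features")
```

As lambda moves away from lambda_max the ball radius grows and the rule discards fewer features, which is why sequential variants of such rules re-center the ball at the previous solution along the regularization path.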

NeurIPS Conference 2015 Conference Paper

Multi-Layer Feature Reduction for Tree Structured Group Lasso via Hierarchical Projection

  • Jie Wang
  • Jieping Ye

Tree structured group Lasso (TGL) is a powerful technique for uncovering tree-structured sparsity over the features, where each node encodes a group of features. It has been applied successfully in many real-world applications. However, with extremely large feature dimensions, solving TGL remains a significant challenge due to its highly complicated regularizer. In this paper, we propose a novel Multi-Layer Feature reduction method (MLFre) to quickly identify the inactive nodes (the groups of features with zero coefficients in the solution) hierarchically in a top-down fashion; these nodes are guaranteed to be irrelevant to the response and can thus be removed from the optimization without sacrificing accuracy. The major challenge in developing such testing rules arises from the overlap between parent and child nodes. Via a novel hierarchical projection algorithm, MLFre is able to test each node independently of its ancestor nodes. Moreover, we can integrate MLFre---which has a low computational cost---with any existing solvers. Experiments on both synthetic and real data sets demonstrate that the speedup gained by MLFre can be orders of magnitude.

NeurIPS Conference 2014 Conference Paper

A Safe Screening Rule for Sparse Logistic Regression

  • Jie Wang
  • Jiayu Zhou
  • Jun Liu
  • Peter Wonka
  • Jieping Ye

The l1-regularized logistic regression (or sparse logistic regression) is a widely used method for simultaneous classification and feature selection. Although many recent efforts have been devoted to its efficient implementation, its application to high-dimensional data still poses significant challenges. In this paper, we present a fast and effective sparse logistic regression screening rule (Slores) to identify the zero components in the solution vector, which may lead to a substantial reduction in the number of features entered into the optimization. An appealing feature of Slores is that the data set needs to be scanned only once to run the screening, and its computational cost is negligible compared to that of solving the sparse logistic regression problem. Moreover, Slores is independent of solvers for sparse logistic regression and can thus be integrated with any existing solver to improve efficiency. We have evaluated Slores using high-dimensional data sets from different applications. Extensive experimental results demonstrate that Slores outperforms existing state-of-the-art screening rules and that the efficiency of solving sparse logistic regression is generally improved by an order of magnitude.
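Screening rules of this kind rest on the KKT condition for the l1-regularized logistic loss: a coefficient is zero whenever the corresponding gradient coordinate is strictly inside the interval $[-\lambda, \lambda]$, and the zero vector solves the whole problem once $\lambda \ge \lambda_{\max} = \|\nabla \ell(0)\|_\infty$. The NumPy sketch below illustrates only this underlying condition, not Slores itself; the toy data and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 40, 100
X = rng.standard_normal((n, p))
y = (rng.random(n) < 0.5).astype(float)   # labels in {0, 1}

# Gradient of the logistic loss at beta = 0: the predicted probability
# sigma(0) = 0.5 for every sample, so grad0 = X^T (0.5 - y).
grad0 = X.T @ (0.5 - y)

# KKT: beta = 0 solves the l1-regularized problem iff lam >= ||grad0||_inf.
lam_max = np.abs(grad0).max()

# Features whose gradient coordinate is far below lam can only become active
# if the gradient drifts enough at the optimum; rules like Slores bound that
# drift from a single pass over the data. Here we just rank the features.
scores = np.abs(grad0)
print("lam_max =", lam_max)
print("10 weakest features:", np.argsort(scores)[:10])
```

A quick sanity check of the claim: for any lambda slightly above `lam_max`, perturbing any single coordinate of the zero vector cannot decrease the regularized objective, since the l1 penalty grows faster than the loss can fall.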

NeurIPS Conference 2014 Conference Paper

Two-Layer Feature Reduction for Sparse-Group Lasso via Decomposition of Convex Sets

  • Jie Wang
  • Jieping Ye

Sparse-Group Lasso (SGL) has been shown to be a powerful regression technique for simultaneously discovering group and within-group sparse patterns by using a combination of the l1 and l2 norms. However, in large-scale applications, the complexity of the regularizers entails great computational challenges. In this paper, we propose a novel two-layer feature reduction method (TLFre) for SGL via a decomposition of its dual feasible set. The two-layer reduction is able to quickly identify the inactive groups and the inactive features, respectively, which are guaranteed to be absent from the sparse representation and can be removed from the optimization. Existing feature reduction methods are only applicable to sparse models with one sparsity-inducing regularizer. To the best of our knowledge, TLFre is the first method capable of dealing with multiple sparsity-inducing regularizers. Moreover, TLFre has a very low computational cost and can be integrated with any existing solvers. Experiments on both synthetic and real data sets show that TLFre improves the efficiency of SGL by orders of magnitude.

IJCAI Conference 2013 Conference Paper

Cross-Domain Collaborative Filtering via Bilinear Multilevel Analysis

  • Liang Hu
  • Jian Cao
  • Guandong Xu
  • Jie Wang
  • Zhiping Gu
  • Longbing Cao

Cross-domain collaborative filtering (CDCF), which aims to leverage data from multiple domains to relieve the data sparsity issue, has become an emerging research topic in recent years. However, current CDCF methods mainly consider user and item factors but largely neglect the heterogeneity of domains, which may lead to improper knowledge transfer. To address this problem, we propose a novel CDCF model, the Bilinear Multilevel Analysis (BLMA), which seamlessly introduces multilevel analysis theory to the most successful collaborative filtering method, matrix factorization (MF). Specifically, BLMA addresses the determinants of ratings from a hierarchical view by jointly considering domain, community, and user effects, thereby overcoming the issues caused by traditional MF approaches. Moreover, a parallel Gibbs sampler is provided to learn these effects. Finally, experiments conducted on a real-world dataset demonstrate the superiority of BLMA over other state-of-the-art methods.

NeurIPS Conference 2013 Conference Paper

Lasso Screening Rules via Dual Polytope Projection

  • Jie Wang
  • Jiayu Zhou
  • Peter Wonka
  • Jieping Ye

Lasso is a widely used regression technique to find sparse representations. When the dimension of the feature space and the number of samples are extremely large, solving the Lasso problem remains challenging. To improve the efficiency of solving large-scale Lasso problems, El Ghaoui and his colleagues proposed the SAFE rules, which are able to quickly identify the inactive predictors, i.e., predictors that have $0$ components in the solution vector. Then, the inactive predictors or features can be removed from the optimization problem to reduce its scale. By transforming the standard Lasso to its dual form, it can be shown that the inactive predictors correspond to inactive constraints on the optimal dual solution. In this paper, we propose an efficient and effective screening rule via Dual Polytope Projections (DPP), which is mainly based on the uniqueness and nonexpansiveness of the optimal dual solution due to the fact that the feasible set in the dual space is a convex and closed polytope. Moreover, we show that our screening rule can be extended to identify inactive groups in group Lasso. To the best of our knowledge, there is currently no "exact" screening rule for group Lasso. We have evaluated our screening rule using many real data sets. Results show that our rule is more effective in identifying inactive predictors than existing state-of-the-art screening rules for Lasso.

ICRA Conference 2004 Conference Paper

Calibrating Human Hand for Teleoperating the HIT/DLR Hand

  • Haiying Hu
  • Xiaohui Gao
  • Jiawei Li
  • Jie Wang
  • Hong Liu 0002

Using human actions to guide robot execution can greatly reduce planning complexity. We calibrate a human hand model and map its motion to a four-finger dexterous robot hand. The parameters of the human hand model are determined by an open-loop kinematic calibration method based on a vision system. We analyze the kinematic differences between the human hand and the dexterous robot hand, and present a modified fingertip mapping to handle the partial overlap of the fingertip workspaces. 3D graphic simulation and manipulation experiments show that the human hand model and the mapping method are sufficiently accurate for teleoperation tasks.