Author name cluster

Cheng Yan

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers

2 author rows

EAAI Journal 2026 Journal Article

A cross-domain rotating machinery fault diagnosis based on multi-source information progressive domain adaptation network

Linlin Xue
Wanyang Zhang
Yukun Huang
Zhengkun Xue
Guangpeng Xing
Huantong Lu
Cheng Yan
Huageng Luo

In industrial equipment fault diagnosis, significant differences in sample distribution between domains severely limit a model's generalization capability. To address this, this paper proposes a multi-source information progressive domain adaptation network. Unlike conventional methods that rely solely on feature-level alignment, the proposed method constructs a hierarchical alignment framework. First, to eliminate the root cause of time-scale distortion, a pulse-aware data segmentation mechanism based on resampling and the Teager-Kaiser energy operator is proposed. Similar vibration segments are extracted around the pulse positions of local energy bursts, which enhances sample consistency across domains. Next, because a single shared feature extractor struggles to model the features common to multiple domains, a shared-private feature extractor is designed. The private extractor processes each domain's data independently, thereby improving cross-domain feature alignment. In addition, a multi-granularity alignment loss is introduced during training to optimize domain-invariant feature extraction. Finally, a progressive domain adaptation strategy based on graph-structured pseudo-label spread optimization and uncertainty measurement is proposed. By leveraging the data manifold structure, this strategy gradually incorporates high-quality target-domain samples into training, which facilitates stable convergence of the model on the target domain. The proposed method is experimentally verified on two vibration datasets. The results show that it achieves more than 99% diagnostic accuracy in several cross-domain transfer tasks, significantly outperforming other state-of-the-art methods.
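As a rough illustration of the pulse-aware segmentation idea mentioned in this abstract, the sketch below applies the standard discrete Teager-Kaiser energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], and cuts fixed-width windows around energy peaks. It is not the authors' implementation: the function names, the threshold-based peak rule, and the window half-width are hypothetical choices made for illustration, and the resampling step is omitted.

```python
def teager_kaiser(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The result is two samples shorter than x; psi[0] corresponds to x[1].
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def pulse_positions(x, threshold):
    """Indices into x where the energy has a local maximum above threshold.

    The simple threshold rule is an assumption made for this sketch.
    """
    psi = teager_kaiser(x)
    peaks = []
    for i in range(1, len(psi) - 1):
        if psi[i] > threshold and psi[i] >= psi[i - 1] and psi[i] > psi[i + 1]:
            peaks.append(i + 1)  # shift by one: psi[0] aligns with x[1]
    return peaks


def segments_around(x, positions, half_width):
    """Cut windows of length 2 * half_width + 1 centred on each pulse.

    Windows that would run past either end of the signal are skipped.
    """
    out = []
    for p in positions:
        lo, hi = p - half_width, p + half_width + 1
        if lo >= 0 and hi <= len(x):
            out.append(x[lo:hi])
    return out


# Toy signal with a single impulsive burst at index 4.
signal = [0, 0, 0, 0, 5, 0, 0, 0, 0, 0]
pulses = pulse_positions(signal, threshold=1.0)
windows = segments_around(signal, pulses, half_width=2)
```

Aligning windows on detected pulses, rather than slicing at fixed offsets, is one plausible way to make segments from different operating conditions comparable.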


EAAI Journal 2026 Journal Article

Application of machine learning to mechanical properties of composite materials: A ten-year review (2015–2025)

Jinjing Zhu
Ruixiang Bai
Yaoxing Xu
Jiantong Wang
Longchao He
Zhenkun Lei
Cheng Yan

Artificial intelligence (AI), particularly machine learning (ML), has advanced rapidly and is increasingly applied across scientific domains. This paper presents a review of ML applications in predicting mechanical properties and optimizing designs of composite materials, analyzing 165 studies published between 2015 and 2025. Valued in aerospace for their lightweight and high-strength properties, composites present modeling challenges due to nonlinear behavior, multi-scale characteristics, and manufacturing sensitivity that limit the accuracy and efficiency of traditional methods. ML techniques have demonstrated remarkable potential in overcoming these limitations through data-driven paradigms, particularly in multi-scale modeling. This study examines the evolutionary trajectory of supervised, unsupervised, reinforcement, and deep learning (DL) within algorithmic frameworks, with a focus on applications in multi-scale analysis of composite materials. At the microscale, ML enables modulus prediction, strength evaluation, and interface analysis via representative volume elements (RVE). For macroscale mechanical analysis, it improves predictions of nonlinear responses, damage progression, and impact behavior in laminates, supporting rapid design optimization of composite stiffened plates based on buckling and collapse loads. Special attention is given to damage evolution, health monitoring, multi-physics fields, and multi-scale mechanical properties of composite materials. Future research directions are also outlined, including the development of hybrid data-physics models, digital twin integration, and cross-disciplinary collaboration to enable breakthroughs in intelligent composite design.


AAAI Conference 2026 Conference Paper

Breaking Model Lock-in: Cost-Efficient Zero-Shot LLM Routing via a Universal Latent Space

Cheng Yan
Wuyang Zhang
Zhiyuan Ning
Fan Xu
Ziyang Tao
Lu Zhang
Bing Yin
Yanyong Zhang

The rapid proliferation of Large Language Models (LLMs) has led to a fragmented and inefficient ecosystem, a state of "model lock-in" where seamlessly integrating novel models remains a significant bottleneck. Current routing frameworks require exhaustive, costly retraining, hindering scalability and adaptability. We introduce ZeroRouter, a new paradigm for LLM routing that breaks this lock-in. Our approach is founded on a universal latent space, a model-agnostic representation of query difficulty that fundamentally decouples the characterization of a query from the profiling of a model. This allows for zero-shot onboarding of new models without full-scale retraining. ZeroRouter features a context-aware predictor that maps queries to this universal space and a dual-mode optimizer that balances accuracy, cost, and latency. Our framework consistently outperforms all baselines, delivering higher accuracy at lower cost and latency.


JBHI Journal 2025 Journal Article

Drug Repositioning Based on Expert Knowledge Augmented Graph Neural Network

Zhenpeng Wu
Cheng Yan
Jiamin Chen
Siyang Xiao
Jianliang Gao

Drug repositioning, which identifies new indications for existing drugs by modeling drug–disease associations, is critical in accelerating drug discovery. Compared to traditional methods, graph neural networks (GNNs) have recently gained widespread attention due to their ability to effectively aggregate information from neighboring nodes in drug–disease heterogeneous graphs. GNN-based methods need effective node embeddings for information aggregation. However, they generate the node embeddings by random initialization, rather than incorporating the high-quality expert knowledge about biological mechanisms available in databases. This limits their capacity to generate interpretable node embeddings aligned with expert knowledge. To bridge this gap, we develop a novel framework dubbed DReKGNN (Drug Repositioning based on expert Knowledge augmented Graph Neural Network). Specifically, DReKGNN first adopts large language models (LLMs) as a semantic bridge between expert knowledge and GNNs. To ensure the accuracy of expert knowledge, DReKGNN does not rely on prompt templates in LLMs to generate knowledge descriptions for drugs and diseases. Instead, it extracts expert knowledge directly from the DrugBank and OMIM databases. LLMs then generate effective node embeddings with interpretable semantic information from these expert knowledge descriptions of biological mechanisms. Then, we demonstrate that noise needs to be mitigated when LLM node embeddings serve drug repositioning prediction tasks. To meet this design need, we integrate GNNs with LLM node embeddings through a mean aggregation strategy. Experimental results from the performance comparison and case study show the effectiveness of DReKGNN in predicting drug–disease associations. The code is available at https://github.com/csubigdata-Organization/DReKGNN.


ICML Conference 2025 Conference Paper

Information Bottleneck-guided MLPs for Robust Spatial-temporal Forecasting

Min Chen
Guansong Pang
Wenjun Wang 0002
Cheng Yan

Spatial-temporal forecasting (STF) plays a pivotal role in urban planning and computing. Spatial-Temporal Graph Neural Networks (STGNNs) excel at modeling spatial-temporal dynamics, thus being robust against noise perturbations. However, they often suffer from relatively poor computational efficiency. Simplifying the architectures can improve efficiency but also weakens robustness with respect to noise interference. In this study, we investigate the problem: can simple neural networks such as Multi-Layer Perceptrons (MLPs) achieve robust spatial-temporal forecasting while remaining efficient? To this end, we first reveal the dual noise effect in spatial-temporal data and propose a theoretically grounded principle termed Robust Spatial-Temporal Information Bottleneck (RSTIB), which holds strong potential for improving model robustness. We then design an implementation named RSTIB-MLP, together with a new training regime incorporating a knowledge distillation module, to enhance the robustness of MLPs for STF while maintaining their efficiency. Comprehensive experiments demonstrate that RSTIB-MLP achieves an excellent trade-off between robustness and efficiency, outperforming state-of-the-art STGNNs and MLP-based models. Our code is publicly available at: https://github.com/mchen644/RSTIB.


NeurIPS Conference 2025 Conference Paper

LCDB 1.1: A Database Illustrating Learning Curves Are More Ill-Behaved Than Previously Thought

Cheng Yan
Felix Mohr
Tom Viering

Sample-wise learning curves plot performance versus training set size. They are useful for studying scaling laws and speeding up hyperparameter tuning and model selection. Learning curves are often assumed to be well-behaved: monotone (i.e., improving with more data) and convex. By constructing the Learning Curves Database 1.1 (LCDB 1.1), a large-scale database with high-resolution learning curves including more modern learners (CatBoost, TabNet, RealMLP, and TabPFN), we show that learning curves are less often well-behaved than previously thought. Using statistically rigorous methods, we observe significant ill-behavior in approximately 15% of the learning curves, almost twice as much as in previous estimates. We also identify which learners are to blame and show that specific learners are more ill-behaved than others. Additionally, we demonstrate that different feature scalings rarely resolve ill-behavior. We evaluate the impact of ill-behavior on downstream tasks, such as learning curve fitting and model selection, and find it poses significant challenges, underscoring the relevance and potential of LCDB 1.1 as a challenging benchmark for future research.
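To make the two "well-behaved" properties named in this abstract concrete, the sketch below checks a single error-rate learning curve for monotonicity and convexity. It is a naive deterministic check with an optional tolerance, not the statistically rigorous procedure the abstract refers to; the function names and the example numbers are hypothetical.

```python
def monotonicity_violations(sizes, errors, tol=0.0):
    """Pairs of consecutive training sizes between which the error rises.

    A monotone error curve never gets worse as the training set grows.
    """
    return [
        (sizes[i], sizes[i + 1])
        for i in range(len(errors) - 1)
        if errors[i + 1] > errors[i] + tol
    ]


def convexity_violations(sizes, errors, tol=0.0):
    """Training sizes at which the slope of the error curve decreases.

    For a convex curve the slopes between consecutive anchors are
    non-decreasing, even when the anchors are unevenly spaced.
    """
    slopes = [
        (errors[i + 1] - errors[i]) / (sizes[i + 1] - sizes[i])
        for i in range(len(errors) - 1)
    ]
    return [
        sizes[i + 1]
        for i in range(len(slopes) - 1)
        if slopes[i + 1] < slopes[i] - tol
    ]


# Made-up curve with a bump in the error at 64 training samples.
sizes = [16, 32, 64, 128]
bumpy = [0.40, 0.30, 0.32, 0.25]
smooth = [0.40, 0.30, 0.25, 0.23]

bumpy_mono = monotonicity_violations(sizes, bumpy)
bumpy_conv = convexity_violations(sizes, bumpy)
```

On noisy measured curves a check like this would flag many spurious violations, which is presumably why a statistical test is preferable in practice.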


NeurIPS Conference 2021 Conference Paper

Joint Semantic Mining for Weakly Supervised RGB-D Salient Object Detection

Jingjing Li
Wei Ji
Qi Bi
Cheng Yan
Miao Zhang
Yongri Piao
Huchuan Lu
Li Cheng

Training saliency detection models with weak supervision, e.g., image-level tags or captions, is appealing as it removes the costly demand for per-pixel annotations. Despite the rapid progress of RGB-D saliency detection in the fully-supervised setting, it remains unexplored territory when only weak supervision signals are available. This paper sets out to tackle the problem of weakly-supervised RGB-D salient object detection. The key insight in this effort is the idea of maintaining per-pixel pseudo-labels with iterative refinements by reconciling the multimodal input signals in our joint semantic mining (JSM). Considering the large variations in the raw depth map and the lack of explicit pixel-level supervision, we propose spatial semantic modeling (SSM) to capture saliency-specific depth cues from the raw depth and produce depth-refined pseudo-labels. Moreover, tags and captions are incorporated via fill-in-the-blank training in our textual semantic modeling (TSM) to estimate the confidences of competing pseudo-labels. At test time, our model involves only a light-weight sub-network of the training pipeline, i.e., it requires only an RGB image as input, thus allowing efficient inference. Extensive evaluations demonstrate the effectiveness of our approach under the weakly-supervised setting. Importantly, our method could also be adapted to work in both fully-supervised and unsupervised paradigms. In each of these scenarios, our approach attains superior performance compared to the state-of-the-art dedicated methods. As a by-product, a CapS dataset is constructed by augmenting an existing benchmark training set with additional image tags and captions.


YNIMG Journal 2013 Journal Article

Continuously tracing brain-wide long-distance axonal projections in mice at a one-micron voxel resolution

Hui Gong
Shaoqun Zeng
Cheng Yan
Xiaohua Lv
Zhongqin Yang
Tonghui Xu
Zhao Feng
Wenxiang Ding

Revealing neural circuit mechanisms is critical for understanding brain functions. Significant progress in dissecting neural connections has been made using optical imaging with fluorescence labels, especially in dissecting local connections. However, acquiring and tracing brain-wide, long-distance neural circuits at the neurite level remains a substantial challenge. Here, we describe a whole-brain approach to systematically obtaining continuous neuronal pathways in a fluorescent protein transgenic mouse at a one-micron voxel resolution. This goal is achieved by combining a novel resin-embedding method for maintaining fluorescence, an automated fluorescence micro-optical sectioning tomography system for long-term stable imaging, and a digital reconstruction-registration-annotation pipeline for tracing the axonal pathways in the mouse brain. With the unprecedented ability to image a whole mouse brain at a one-micron voxel resolution, the long-distance pathways were traced minutely and without interruption for the first time. With advancing labeling techniques, our method is believed to open an avenue to exploring both local and long-distance neural circuits that are related to brain functions and brain diseases down to the neurite level.
