Arrow Research search

Author name cluster

Cheng Yan

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers
2 author rows

Possible papers

8

AAAI Conference 2026 Conference Paper

Breaking Model Lock-in: Cost-Efficient Zero-Shot LLM Routing via a Universal Latent Space

  • Cheng Yan
  • Wuyang Zhang
  • Zhiyuan Ning
  • Fan Xu
  • Ziyang Tao
  • Lu Zhang
  • Bing Yin
  • Yanyong Zhang

The rapid proliferation of Large Language Models (LLMs) has led to a fragmented and inefficient ecosystem, a state of "model lock-in" where seamlessly integrating novel models remains a significant bottleneck. Current routing frameworks require exhaustive, costly retraining, hindering scalability and adaptability. We introduce ZeroRouter, a new paradigm for LLM routing that breaks this lock-in. Our approach is founded on a universal latent space, a model-agnostic representation of query difficulty that fundamentally decouples the characterization of a query from the profiling of a model. This allows for zero-shot onboarding of new models without full-scale retraining. ZeroRouter features a context-aware predictor that maps queries to this universal space and a dual-mode optimizer that balances accuracy, cost, and latency. Our framework consistently outperforms all baselines, delivering higher accuracy at lower cost and latency.
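The decoupling the abstract describes can be pictured with a toy sketch: each model is summarized by a fixed profile vector in a shared latent space, a query encoder maps queries into the same space, and onboarding a new model only means registering one more profile vector. All model names, features, and scoring rules below are invented for illustration; they are not ZeroRouter's actual components.

```python
import numpy as np

# Hypothetical model profiles in a shared 4-d "difficulty" latent space,
# plus a per-call cost. A new model is onboarded by adding one profile
# vector -- no retraining of the query encoder is needed.
MODELS = {
    "small":  {"profile": np.array([0.9, 0.2, 0.1, 0.3]), "cost": 1.0},
    "medium": {"profile": np.array([0.8, 0.6, 0.5, 0.6]), "cost": 3.0},
    "large":  {"profile": np.array([0.9, 0.9, 0.9, 0.9]), "cost": 10.0},
}

def encode_query(query: str) -> np.ndarray:
    """Stand-in for a learned context-aware encoder that maps a query to
    the latent space. Here we fake it with a crude length-based feature."""
    difficulty = min(len(query) / 200.0, 1.0)
    return np.array([1.0, difficulty, difficulty, difficulty])

def predicted_score(model: dict, z: np.ndarray) -> float:
    # Model-agnostic scoring: how well the model's profile covers the
    # query's latent difficulty (1.0 = fully covered, lower = shortfall).
    return float(np.clip(model["profile"] - z, None, 0).sum() + 1.0)

def route(query: str, min_score: float = 0.8) -> str:
    """Pick the cheapest model whose predicted score clears the bar."""
    z = encode_query(query)
    ok = [(m["cost"], name) for name, m in MODELS.items()
          if predicted_score(m, z) >= min_score]
    # Fall back to the most capable model if nothing qualifies.
    return min(ok)[1] if ok else "large"

print(route("What is 2+2?"))                # easy query -> "small"
print(route("Prove the theorem ..." * 20))  # hard query -> "large"
```

The point of the sketch is the shape of the interface: query characterization (`encode_query`) never looks at the model list, so adding a model touches only `MODELS`.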

JBHI Journal 2025 Journal Article

Drug Repositioning Based on Expert Knowledge Augmented Graph Neural Network

  • Zhenpeng Wu
  • Cheng Yan
  • Jiamin Chen
  • Siyang Xiao
  • Jianliang Gao

Drug repositioning is critical in accelerating drug discovery: it identifies new indications for existing drugs by modeling drug–disease associations. Compared to traditional methods, graph neural networks (GNNs) have recently gained widespread attention due to their ability to effectively aggregate information from neighboring nodes in drug–disease heterogeneous graphs. GNN-based methods need effective node embeddings for information aggregation. However, they generate node embeddings by random initialization rather than incorporating the high-quality expert knowledge about biological mechanisms available in curated databases. This limits their capacity to generate interpretable node embeddings aligned with expert knowledge. To bridge this gap, we develop a novel framework dubbed DReKGNN (Drug Repositioning based on expert Knowledge augmented Graph Neural Network). Specifically, DReKGNN first adopts large language models (LLMs) as a semantic bridge between expert knowledge and GNNs. To ensure the accuracy of the expert knowledge, DReKGNN does not rely on LLM prompt templates to generate knowledge descriptions for drugs and diseases; instead, it extracts expert knowledge directly from the DrugBank and OMIM databases. Effective node embeddings with interpretable semantic information are then generated by LLMs from these expert-knowledge descriptions of biological mechanisms. We further demonstrate the need to mitigate noise when LLM node embeddings serve drug repositioning prediction tasks; considering this design need, we integrate GNNs with LLM node embeddings via a mean aggregation strategy. Experimental results from performance comparisons and a case study show the effectiveness of DReKGNN in predicting drug–disease associations. The code is available at https://github.com/csubigdata-Organization/DReKGNN.
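The mean aggregation strategy mentioned in the abstract can be sketched in a few lines: node embeddings are seeded from vectors that stand in for LLM-generated embeddings of DrugBank/OMIM descriptions (random here, purely for illustration) and then averaged over graph neighborhoods. The toy graph, dimensions, and scoring below are hypothetical and are not DReKGNN's actual architecture.

```python
import numpy as np

# Toy drug-disease bipartite graph given as adjacency lists.
# llm_embed stands in for per-node vectors an LLM would produce from
# DrugBank/OMIM expert-knowledge descriptions; here it is random.
rng = np.random.default_rng(0)
num_nodes, dim = 5, 8
llm_embed = rng.normal(size=(num_nodes, dim))
neighbors = {0: [3, 4], 1: [3], 2: [4], 3: [0, 1], 4: [0, 2]}

def mean_aggregate(h: np.ndarray) -> np.ndarray:
    """One GNN layer with mean aggregation: each node's new embedding is
    the average of its own embedding and its neighbors' embeddings."""
    out = np.empty_like(h)
    for v, nbrs in neighbors.items():
        out[v] = np.mean(h[[v, *nbrs]], axis=0)
    return out

# Instead of random task-specific initialization, seed the GNN with the
# LLM embeddings so aggregated representations stay anchored to the
# expert-knowledge descriptions they came from.
h1 = mean_aggregate(llm_embed)

# A drug-disease association score could then be a simple dot product
# between a drug node and a disease node (an invented readout).
score = float(h1[0] @ h1[3])
print(h1.shape, round(score, 3))
```

Mean aggregation (rather than, say, attention) is the simplest way to blend a node's LLM-derived semantics with its neighborhood without letting any single noisy embedding dominate.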

ICML Conference 2025 Conference Paper

Information Bottleneck-guided MLPs for Robust Spatial-temporal Forecasting

  • Min Chen
  • Guansong Pang
  • Wenjun Wang 0002
  • Cheng Yan

Spatial-temporal forecasting (STF) plays a pivotal role in urban planning and computing. Spatial-Temporal Graph Neural Networks (STGNNs) excel at modeling spatial-temporal dynamics, thus being robust against noise perturbations. However, they often suffer from relatively poor computational efficiency. Simplifying the architectures can improve efficiency but also weakens robustness with respect to noise interference. In this study, we investigate the problem: can simple neural networks such as Multi-Layer Perceptrons (MLPs) achieve robust spatial-temporal forecasting while remaining efficient? To this end, we first reveal the dual noise effect in spatial-temporal data and propose a theoretically grounded principle termed Robust Spatial-Temporal Information Bottleneck (RSTIB), which holds strong potential for improving model robustness. We then design an implementation named RSTIB-MLP, together with a new training regime incorporating a knowledge distillation module, to enhance the robustness of MLPs for STF while maintaining their efficiency. Comprehensive experiments demonstrate that RSTIB-MLP achieves an excellent trade-off between robustness and efficiency, outperforming state-of-the-art STGNNs and MLP-based models. Our code is publicly available at: https://github.com/mchen644/RSTIB.
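The knowledge-distillation part of the training regime can be illustrated with a minimal linear "MLP" student fitted against a blend of ground truth and a teacher's predictions. This sketch uses invented toy data and a plain mixed MSE loss; it does not reproduce the RSTIB objective or the paper's actual training code.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy spatial-temporal batch: 4 sensors x 12 past steps -> next-step values.
x = rng.normal(size=(32, 4 * 12))
teacher_pred = rng.normal(size=(32, 4))            # stand-in for STGNN output
y = teacher_pred + 0.1 * rng.normal(size=(32, 4))  # noisy ground truth

# A single-linear-layer student trained by gradient descent on a
# distillation objective:
#   loss = (1 - alpha) * MSE(student, y) + alpha * MSE(student, teacher)
W = np.zeros((4 * 12, 4))
alpha, lr = 0.5, 1e-3
for _ in range(200):
    pred = x @ W
    grad = (1 - alpha) * (pred - y) + alpha * (pred - teacher_pred)
    W -= lr * x.T @ grad / len(x)

final = float(np.mean((x @ W - y) ** 2))
print(round(final, 4))  # well below the untrained loss mean(y**2)
```

The `alpha` knob is the usual distillation trade-off: higher values pull the student toward the (smoother) teacher signal rather than the noisy labels, which is the intuition behind distilling a robust teacher into an efficient MLP.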

NeurIPS Conference 2025 Conference Paper

LCDB 1.1: A Database Illustrating Learning Curves Are More Ill-Behaved Than Previously Thought

  • Cheng Yan
  • Felix Mohr
  • Tom Viering

Sample-wise learning curves plot performance versus training set size. They are useful for studying scaling laws and speeding up hyperparameter tuning and model selection. Learning curves are often assumed to be well-behaved: monotone (i.e., improving with more data) and convex. By constructing the Learning Curves Database 1.1 (LCDB 1.1), a large-scale database with high-resolution learning curves including more modern learners (CatBoost, TabNet, RealMLP, and TabPFN), we show that learning curves are less often well-behaved than previously thought. Using statistically rigorous methods, we observe significant ill-behavior in approximately 15% of the learning curves, almost twice as much as in previous estimates. We also identify which learners are to blame and show that specific learners are more ill-behaved than others. Additionally, we demonstrate that different feature scalings rarely resolve ill-behavior. We evaluate the impact of ill-behavior on downstream tasks, such as learning curve fitting and model selection, and find it poses significant challenges, underscoring the relevance and potential of LCDB 1.1 as a challenging benchmark for future research.
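The well-behavedness notions above have simple finite-difference formulations on an error-versus-size curve: monotone means the first differences of the error are non-positive (error never rises with more data), convex means the second differences are non-negative (improvements shrink). The toy curves below are invented for illustration and are not LCDB 1.1 data, and these point checks are not the paper's statistically rigorous tests.

```python
import numpy as np

def is_monotone(err: np.ndarray) -> bool:
    """Well-behaved curves improve with more data: error never increases."""
    return bool(np.all(np.diff(err) <= 0))

def is_convex(err: np.ndarray) -> bool:
    """Convexity: consecutive improvements shrink, i.e. second differences
    of the error are non-negative."""
    return bool(np.all(np.diff(err, n=2) >= 0))

# Error rate vs. increasing training-set size (toy values).
well_behaved = np.array([0.40, 0.25, 0.17, 0.13, 0.11])
peaking      = np.array([0.40, 0.25, 0.30, 0.20, 0.18])  # ill-behaved bump

print(is_monotone(well_behaved), is_convex(well_behaved))  # True True
print(is_monotone(peaking), is_convex(peaking))            # False False
```

The second curve shows the kind of "peaking" ill-behavior the abstract refers to: error temporarily rises at an intermediate sample size, violating both assumptions at once.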

NeurIPS Conference 2021 Conference Paper

Joint Semantic Mining for Weakly Supervised RGB-D Salient Object Detection

  • Jingjing Li
  • Wei Ji
  • Qi Bi
  • Cheng Yan
  • Miao Zhang
  • Yongri Piao
  • Huchuan Lu
  • Li Cheng

Training saliency detection models with weak supervision, e.g., image-level tags or captions, is appealing as it removes the costly demand of per-pixel annotations. Despite the rapid progress of RGB-D saliency detection in the fully-supervised setting, it remains an unexplored territory when only weak supervision signals are available. This paper is set to tackle the problem of weakly-supervised RGB-D salient object detection. The key insight in this effort is the idea of maintaining per-pixel pseudo-labels with iterative refinements by reconciling the multimodal input signals in our joint semantic mining (JSM). Considering the large variations in the raw depth map and the lack of explicit pixel-level supervision, we propose spatial semantic modeling (SSM) to capture saliency-specific depth cues from the raw depth and produce depth-refined pseudo-labels. Moreover, tags and captions are incorporated via fill-in-the-blank training in our textual semantic modeling (TSM) to estimate the confidences of competing pseudo-labels. At test time, our model involves only a light-weight sub-network of the training pipeline, i.e., it requires only an RGB image as input, thus allowing efficient inference. Extensive evaluations demonstrate the effectiveness of our approach under the weakly-supervised setting. Importantly, our method can also be adapted to work in both fully-supervised and unsupervised paradigms. In each of these scenarios, our approach attains superior performance compared to state-of-the-art dedicated methods. As a by-product, a CapS dataset is constructed by augmenting an existing benchmark training set with additional image tags and captions.
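The iterative pseudo-label refinement idea can be caricatured in a few lines: a noisy initial mask is repeatedly reconciled with a second modality's cue (here, a depth-like region) and re-binarized. The array sizes, mixing weights, and cue below are all invented; this is a crude stand-in for intuition, not JSM's actual update rule.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy 8x8 "image": a noisy initial pseudo-label mask (as might come from
# image-level tags) and a depth-derived saliency cue to reconcile it with.
coarse = (rng.random((8, 8)) > 0.5).astype(float)  # noisy initial mask
depth_cue = np.zeros((8, 8))
depth_cue[2:6, 2:6] = 1.0                          # depth-cued salient region

pseudo = coarse.copy()
for _ in range(10):
    # Reconcile the two signals: pull the pseudo-labels toward the depth
    # cue, then re-binarize -- one round of "iterative refinement".
    pseudo = 0.3 * pseudo + 0.7 * depth_cue
    pseudo = (pseudo > 0.5).astype(float)

print(int(pseudo.sum()))  # -> 16, the 4x4 depth-cued region
```

Even this caricature shows the mechanism's appeal: per-pixel labels emerge from coarse, image-level supervision plus a complementary modality, with no pixel annotations in the loop.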