Author name cluster

Kuo Yang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers

2 author rows

AAAI Conference 2025 Conference Paper

KGCRR: An Effective Metric-Driven Knowledge Graph Completion Framework by Designing a Novel Upper Bound Function with Adaptive Approximation to Reciprocal Rank

Kuan Xu
Kuo Yang
Jian Liu
Xiangkui Lu
Jun Wu
Xuezhong Zhou

Knowledge Graph Embedding (KGE) methods have achieved great success in predicting missing links in knowledge graphs, a task also known as Knowledge Graph Completion (KGC). Under this task, the Reciprocal Rank (RR) of ground-truth items serves as a key indicator for evaluating a method’s performance. However, most existing studies have overlooked the inconsistency between the ranking metric, RR, and the optimization objective functions, resulting in sub-optimal KGC performance. To address this issue, we propose a KGC framework called KGCRR by designing a novel upper bound function named CRR. By introducing the parameter-pressure ρ to shift the sigmoid function, CRR achieves a better approximation to RR compared with existing objective functions. We theoretically prove that by adjusting ρ, CRR can achieve a more effective approximation to RR. By narrowing the discrepancy with RR and alleviating the gradient vanishing issue associated with the direct optimization of RR loss, CRR demonstrates an advantage in optimizing RR. CRR serves as a plug-and-play objective, capable of seamless integration into various KGE methods. Through extensive experiments conducted on the FB15k-237 and WN18RR datasets, we have obtained promising results, with an average improvement of 19.06% in MRR, indicating that CRR significantly enhances the performance of existing methods.
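For readers unfamiliar with the metric this abstract refers to, the following is a minimal sketch of the standard reciprocal rank and of a generic sigmoid-smoothed rank. It is not the paper's CRR function, whose exact form is not given on this page. The function names and the `shift` parameter are hypothetical, chosen only for illustration.

```python
import math


def reciprocal_rank(scores, true_index):
    """Exact RR: 1 / (1 + number of candidates scored strictly above the true item)."""
    true_score = scores[true_index]
    rank = 1 + sum(
        1 for i, s in enumerate(scores) if i != true_index and s > true_score
    )
    return 1.0 / rank


def smoothed_reciprocal_rank(scores, true_index, shift=0.0):
    """Differentiable stand-in: each hard comparison becomes a sigmoid.

    `shift` is an illustrative offset on the sigmoid input; it is an
    assumption of this sketch and not the paper's definition of rho.
    """
    true_score = scores[true_index]
    soft_rank = 1.0 + sum(
        1.0 / (1.0 + math.exp(-(s - true_score + shift)))
        for i, s in enumerate(scores)
        if i != true_index
    )
    return 1.0 / soft_rank


def mean_reciprocal_rank(queries):
    """Average RR over (scores, true_index) pairs."""
    return sum(reciprocal_rank(s, t) for s, t in queries) / len(queries)
```

The hard rank is piecewise constant in the scores, so its gradient is zero almost everywhere. The smoothed version trades exactness for a usable gradient, which is the general gap between metric and objective that the abstract describes.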


NeurIPS Conference 2025 Conference Paper

Less but More: Linear Adaptive Graph Learning Empowering Spatiotemporal Forecasting

Jiaming Ma
Binwu Wang
Guanjun Wang
Kuo Yang
Zhengyang Zhou
Pengkun Wang
Xu Wang
Yang Wang

The effectiveness of Spatiotemporal Graph Neural Networks (STGNNs) critically hinges on the quality of the underlying graph topology. While end-to-end adaptive graph learning methods have demonstrated promising results in capturing latent spatiotemporal dependencies, they often suffer from high computational complexity and limited expressive capacity. In this paper, we propose MAGE for efficient spatiotemporal forecasting. We first conduct a theoretical analysis demonstrating that the ReLU activation function employed in existing methods amplifies edge-level noise during graph topology learning, thereby compromising the fidelity of the learned graph structures. To enhance model expressiveness, we introduce a sparse yet balanced mixture-of-experts strategy, where each expert perceives the unique underlying graph through kernel-based functions and operates with linear complexity relative to the number of nodes. The sparsity mechanism ensures that each node interacts exclusively with compatible experts, while the balancing mechanism promotes uniform activation across all experts, enabling diverse and adaptive graph representations. Furthermore, we theoretically establish that a single graph convolution using the learned graph in MAGE is mathematically equivalent to multiple convolutional steps under conventional graphs. We evaluate MAGE against advanced baselines on multiple real-world spatiotemporal datasets. MAGE achieves competitive performance while maintaining strong computational efficiency.


NeurIPS Conference 2025 Conference Paper

Many Minds, One Goal: Time Series Forecasting via Sub-task Specialization and Inter-agent Cooperation

Qihe Huang
Zhengyang Zhou
Yangze Li
Kuo Yang
Binwu Wang
Yang Wang

Time series forecasting is a critical and complex task, characterized by diverse temporal patterns, varying statistical properties, and different prediction horizons across datasets and domains. Conventional approaches typically rely on a single, unified model architecture to handle all forecasting scenarios. However, such monolithic models struggle to generalize across dynamically evolving time series with shifting patterns. In reality, different types of time series may require distinct modeling strategies. Some benefit from homogeneous multi-scale forecasting awareness, while others rely on more complex and heterogeneous signal perception. Relying on a single model to capture all temporal diversity and structural variations leads to limited performance and poor interpretability. To address this challenge, we propose a Multi-Agent Forecasting System (MAFS) that abandons the one-size-fits-all paradigm. MAFS decomposes the forecasting task into multiple sub-tasks, each handled by a dedicated agent trained on specific temporal perspectives (e.g., different forecasting resolutions or signal characteristics). Furthermore, to achieve holistic forecasting, agents share and refine information through different communication topologies, enabling cooperative reasoning across different temporal views. A lightweight voting aggregator then integrates their outputs into consistent final predictions. Extensive experiments across 11 benchmarks demonstrate that MAFS significantly outperforms traditional single-model approaches, yielding more robust and adaptable forecasts.


IJCAI Conference 2025 Conference Paper

Revealing Concept Shift in Spatio-Temporal Graphs via State Learning

Kuo Yang
Yunhe Guo
Qihe Huang
Zhengyang Zhou
Yang Wang

Dynamic graphs are ubiquitous in the real world, presenting the temporal evolution of individuals within spatial associations. Recently, dynamic graph learning research is flourishing, striving to more effectively capture evolutionary patterns and spatial correlations. However, existing methods still fail to address the issue of concept shift in dynamic graphs. Concept shift manifests as a distribution shift in the mapping pattern between historical observations and future evolution. The reason is that some environment variables in dynamic graphs exert varying effects on evolution patterns, but these variables are not effectively captured by the models, leading to the intractable concept shift issue. To tackle this issue, we propose a State-driven environment inference framework (Samen) to achieve a dynamic graph learning framework equipped with concept generalization ability. Firstly, we propose a two-stage environment inference and compression strategy. From the perspective of state space, we introduce a prefix-suffix collaborative state learning mechanism to bidirectionally model the spatio-temporal states. A hierarchical state compressor is further designed to refine the state information resulting in concept shift. Secondly, we propose a skip-connection spatio-temporal prediction module, which effectively utilizes the inferred environments to improve the model's generalization capability. Finally, we select seven datasets from different domains to validate the effectiveness of our model. By comparing the performance of different models on samples with concept shift, we verify that our Samen gains generalization capacity that existing methods fail to capture.


NeurIPS Conference 2025 Conference Paper

The Indra Representation Hypothesis for Multimodal Alignment

Jianglin Lu
Hailing Wang
Kuo Yang
Yitian Zhang
Simon Jenni
Yun Fu

Recent studies have uncovered an interesting phenomenon: unimodal foundation models tend to learn convergent representations, regardless of differences in architecture, training objectives, or data modalities. However, these representations are essentially internal abstractions of samples that characterize samples independently, leading to limited expressiveness. In this paper, we propose The Indra Representation Hypothesis, inspired by the philosophical metaphor of Indra’s Net. We argue that representations from unimodal foundation models are converging to implicitly reflect a shared relational structure underlying reality, akin to the relational ontology of Indra’s Net. We formalize this hypothesis using the V-enriched Yoneda embedding from category theory, defining the Indra representation as a relational profile of each sample with respect to others. This formulation is shown to be unique, complete, and structure-preserving under a given cost function. We instantiate the Indra representation using angular distance and evaluate it in cross-model and cross-modal scenarios involving vision, language, and audio. Extensive experiments demonstrate that Indra representations consistently enhance robustness and alignment across architectures and modalities, providing a theoretically grounded and practical framework for training-free alignment of unimodal foundation models. Our code is available at https://github.com/Jianglin954/Indra.


ICLR Conference 2024 Conference Paper

Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis

Kai Chen 0023
Chunwei Wang
Kuo Yang
Jianhua Han
Lanqing Hong
Fei Mi
Hang Xu 0004
Zhengying Liu

The rapid development of large language models (LLMs) has not only provided numerous opportunities but also presented significant challenges. This becomes particularly evident when LLMs inadvertently generate harmful or toxic content, either unintentionally or because of intentional inducement. Existing alignment methods usually direct LLMs toward the favorable outcomes by utilizing human-annotated, flawless instruction-response pairs. Conversely, this study proposes a novel alignment technique based on mistake analysis, which deliberately exposes LLMs to erroneous content to learn the reasons for mistakes and how to avoid them. In this case, mistakes are repurposed into valuable data for alignment, effectively helping to avoid the production of erroneous responses. Without external models or human annotations, our method leverages a model's intrinsic ability to discern undesirable mistakes and improves the safety of its generated responses. Experimental results reveal that our method outperforms existing alignment approaches in enhancing model safety while maintaining the overall utility.


NeurIPS Conference 2024 Conference Paper

Improving Generalization of Dynamic Graph Learning via Environment Prompt

Kuo Yang
Zhengyang Zhou
Qihe Huang
Limin Li
Yuxuan Liang
Yang Wang

The out-of-distribution (OOD) generalization issue is a well-known challenge within deep learning tasks. In dynamic graphs, the change of temporal environments is regarded as the main cause of data distribution shift. While numerous OOD studies focusing on environment factors have achieved remarkable performance, they still fail to systematically solve the two issues of environment inference and utilization. In this work, we propose a novel dynamic graph learning model named EpoD, based on prompt learning and a structural causal model, to comprehensively enhance both environment inference and utilization. Inspired by the superior performance of prompt learning in understanding underlying semantic and causal associations, we first design a self-prompted learning mechanism to infer unseen environment factors. We then rethink the role of the environment variable within the spatio-temporal structural causal model, and introduce a novel causal pathway where dynamic subgraphs serve as mediating variables. The extracted dynamic subgraph can effectively capture the data distribution shift by incorporating the inferred environment variables into the node-wise dependencies. Theoretical discussions and intuitive analysis support the generalizability and interpretability of EpoD. Extensive experiments on seven real-world datasets across domains showcase the superiority of EpoD against baselines, and toy example experiments further verify the powerful interpretability and rationality of EpoD.


IJCAI Conference 2024 Conference Paper

LeRet: Language-Empowered Retentive Network for Time Series Forecasting

Qihe Huang
Zhengyang Zhou
Kuo Yang
Gengyu Lin
Zhongchao Yi
Yang Wang

Time series forecasting (TSF) plays a pivotal role in many real-world applications. Recently, the utilization of Large Language Models (LLMs) in TSF has demonstrated exceptional predictive performance, surpassing most task-specific forecasting models. The success of LLM-based forecasting methods underscores the importance of causal dependence modeling and pre-trained knowledge transfer. However, challenges persist in directly applying LLMs to TSF, i.e., the unacceptable parameter scales for resource-intensive model optimization, and the significant gap in feature space between structural numerical time series and natural language. To this end, we propose LeRet, a Language-empowered Retentive network for TSF. Technically, inspired by the causal extraction in LLMs, we propose a causal dependence learner, enhanced by a patch-level pre-training task, to capture sequential causal evolution. To minimize the gap between numeric data and language, we initialize a language description protocol for time series and design a TS-related language knowledge extractor to learn from language descriptions, avoiding training with large-scale parameters. Finally, we design a dedicated Language-TS Modality Integrator to fuse the two types of data and enable language-empowered sequence forecasting. Extensive evaluations demonstrate the effectiveness of LeRet, especially its superiority on few-shot and zero-shot forecasting tasks. Code is available at https://github.com/hqh0728/LeRet.


EAAI Journal 2024 Journal Article

Photovoltaic power forecasting: A dual-attention gated recurrent unit framework incorporating weather clustering and transfer learning strategy

Yugui Tang
Kuo Yang
Shujing Zhang
Zhen Zhang

Accurate forecasting of photovoltaic power is essential in the integration, operation, and scheduling of hybrid energy systems. However, modeling for newly built photovoltaic sites is restricted by insufficient training data and computational burden. In this study, a weather clustering-based photovoltaic power forecasting framework incorporating an attention mechanism and a transfer learning strategy is proposed. By clustering historical days into multiple weather types, gated recurrent unit-based encoder-decoders with a dual-attention mechanism are designed to predict photovoltaic power generation. The input attention and temporal attention mechanisms are responsible for rebuilding the input variables and context vectors of the encoder-decoder structure, respectively. Furthermore, a knowledge-transferring strategy, which focuses on establishing an alignment mapping module between the pre-trained structure and the target domain data, is designed to overcome the insufficient data of newly built sites. Data from an actual photovoltaic system are acquired to validate the proposed framework. The proposed forecasting model performs better than other benchmark models, and the knowledge-transferring strategy not only addresses data shortage but also significantly accelerates the training process. With the introduction of knowledge-transferring, the maximum improvements in forecasting accuracy and training efficiency reach 67.40% and 59.10%, respectively.


ICLR Conference 2022 Conference Paper

Memory Replay with Data Compression for Continual Learning

Liyuan Wang
Xingxing Zhang 0001
Kuo Yang
Longhui Yu
Chongxuan Li
Lanqing Hong
Shifeng Zhang
Zhenguo Li

Continual learning needs to overcome catastrophic forgetting of the past. Memory replay of representative old training samples has been shown as an effective solution, and achieves the state-of-the-art (SOTA) performance. However, existing work is mainly built on a small memory buffer containing a few original data, which cannot fully characterize the old data distribution. In this work, we propose memory replay with data compression to reduce the storage cost of old training samples and thus increase their amount that can be stored in the memory buffer. Observing that the trade-off between the quality and quantity of compressed data is highly nontrivial for the efficacy of memory replay, we propose a novel method based on determinantal point processes (DPPs) to efficiently determine an appropriate compression quality for currently-arrived training samples. In this way, using a naive data compression algorithm with a properly selected quality can largely boost recent strong baselines by saving more compressed data in a limited storage space. We extensively validate this across several benchmarks of class-incremental learning and in a realistic scenario of object detection for autonomous driving.


AIIM Journal 2020 Journal Article

Disease phenotype synonymous prediction through network representation learning from PubMed database

Shiwen Ma
Kuo Yang
Ning Wang
Qiang Zhu
Zhuye Gao
Runshun Zhang
Baoyan Liu
Xuezhong Zhou

Synonym mapping between phenotype concepts from different terminologies is difficult because terminology databases have been developed largely independently. Existing maps of synonymous phenotype concepts from different terminology databases are highly incomplete, and manual mapping is time consuming and laborious. Therefore, building an automatic method for predictive mapping of synonymous phenotypes is of special importance. We propose a classifier-based phenotype mapping prediction model (CPM) to predict synonymous relationships between phenotype concepts from different terminology databases. The model takes network semantic representations of phenotypes as input and predicts synonymous relationships by training binary classifiers with a voting strategy. We compared the performance of the CPM with a similarity-based phenotype mapping prediction model (SPM), which predicts mapping based on the ranked cosine similarity of candidate mapping concepts. Based on a network representation N2V-TFIDF, with a majority voting strategy method MV, the CPM achieved accuracy of 0.943, which was 15.4% higher than that of the SPM using the cosine similarity method (0.789) and 23.8% higher than that of the SSDTM method (0.724) proposed in our previous work.
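The two decision rules named in this abstract, ranking candidates by cosine similarity and combining binary classifiers by majority vote, can be sketched generically as follows. This is only an illustration of the standard techniques, not the authors' CPM or SPM implementation. All function names and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(query, candidates):
    """Return candidate names sorted by descending cosine similarity to `query`.

    `candidates` maps a (hypothetical) concept name to its embedding vector.
    """
    return sorted(
        candidates,
        key=lambda name: cosine_similarity(query, candidates[name]),
        reverse=True,
    )


def majority_vote(votes):
    """True if strictly more than half of the binary classifier votes are positive."""
    return 2 * sum(1 for v in votes if v) > len(votes)
```

A similarity-based model would take the top-ranked candidate as the predicted synonym, while a voting model would accept a candidate pair only when most classifiers agree.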


JBHI Journal 2019 Journal Article

HerGePred: Heterogeneous Network Embedding Representation for Disease Gene Prediction

Kuo Yang
Ruyu Wang
Guangming Liu
Zixin Shu
Ning Wang
Runshun Zhang
Jian Yu
Jianxin Chen

The discovery of disease-causing genes is a critical step towards understanding the nature of a disease and determining a possible cure for it. In recent years, many computational methods to identify disease genes have been proposed. However, making full use of disease-related (e.g., symptoms) and gene-related (e.g., gene ontology and protein-protein interactions) information to improve the performance of disease gene prediction is still an issue. Here, we develop a heterogeneous disease-gene-related network (HDGN) embedding representation framework for disease gene prediction (called HerGePred). Based on this framework, a low-dimensional vector representation (LVR) of the nodes in the HDGN can be obtained. Then, we propose two specific algorithms, namely, an LVR-based similarity prediction and a random walk with restart on a reconstructed heterogeneous disease-gene network (RWRDGN), to predict disease genes with high performance. First, to validate the rationality of the framework, we analyze the similarity-based overlap distribution of disease pairs and design an experiment for disease-gene association recovery, the results of which revealed that the LVR of nodes performs well at preserving the local and global network structure of the HDGN. Then, we apply tenfold cross validation and external validation to compare our methods with other well-known disease gene prediction algorithms. The experimental results show that the RWRDGN performs better than the state-of-the-art algorithm. The prediction results of disease candidate genes are essential for molecular mechanism investigation and experimental validation. The source codes of HerGePred and experimental data are available at https://github.com/yangkuoone/HerGePred.
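Random walk with restart, which this abstract builds on, can be sketched on a plain unweighted graph as below. This is the textbook iteration only. It does not reproduce the reconstructed heterogeneous network or any other detail of RWRDGN, and the function name, parameters, and toy graph are assumptions made for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that restarts at the seeds.

    adjacency: dict mapping each node to a list of its neighbours
               (an undirected edge must be listed in both directions).
    seeds:     collection of start nodes, which share the restart mass equally.
    """
    nodes = list(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    prob = dict(start)
    for _ in range(max_iter):
        # Restart step: a fraction `restart` of the mass jumps back to the seeds.
        nxt = {n: restart * start[n] for n in nodes}
        # Walk step: the remaining mass spreads uniformly over neighbours.
        for n in nodes:
            neighbours = adjacency[n]
            moving = (1.0 - restart) * prob[n]
            if not neighbours:
                nxt[n] += moving  # an isolated node keeps its own mass
                continue
            for m in neighbours:
                nxt[m] += moving / len(neighbours)
        change = sum(abs(nxt[n] - prob[n]) for n in nodes)
        prob = nxt
        if change < tol:
            break
    return prob
```

Nodes that end up with higher probability are closer, in a graph sense, to the seed set. In a disease-gene setting the seeds would be known associations and the ranking would suggest candidates.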
