Author name cluster

Han Yu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

78 papers

2 author rows

AAAI Conference 2026 Conference Paper

Enhancing Logical Expressiveness in Graph Neural Networks via Path-Neighbor Aggregation

Han Yu
Xiaojuan Zhao
Aiping Li
Kai Chen
Ziniu Liu
Zhichao Peng

Graph neural networks (GNNs) can effectively model structural information of graphs, making them widely used in knowledge graph (KG) reasoning. However, existing studies on the expressive power of GNNs mainly focuses on simple single-relation graphs, and there is still insufficient discussion on the power of GNN to express logical rules in KGs. How to enhance the logical expressive power of GNNs is still a key issue. Motivated by this, we propose Path-Neighbor enhanced GNN (PN-GNN), a method to enhance the logical expressive power of GNN by aggregating node-neighbor embeddings on the reasoning path. First, we analyze the logical expressive power of existing GNN-based methods and point out the shortcomings of the expressive power of these methods. Then, we theoretically investigate the logical expressive power of PN-GNN, showing that it not only has strictly stronger expressive power than C-GNN but also that its (k+1)-hop logical expressiveness is strictly superior to that of k-hop. Finally, we evaluate the logical expressive power of PN-GNN on six synthetic datasets and two real-world datasets. Both theoretical analysis and extensive experiments confirm that PN-GNN enhances the expressive power of logical rules without compromising generalization, as evidenced by its competitive performance in KG reasoning tasks.

PDF Details DOI

AAAI Conference 2026 Conference Paper

Error Slice Discovery via Manifold Compactness

Han Yu
Hao Zou
Jiashuo Liu
Renzhe Xu
Yue He
Xingxuan Zhang
Peng Cui

Despite the great performance of deep learning models in many areas, they still make mistakes and underperform on certain subsets of data, i.e. error slices. Given a trained model, it is important to identify its semantically coherent error slices that are easy to interpret, which is referred to as the error slice discovery problem. However, there is no proper metric of slice coherence without relying on extra information like predefined slice labels. Current evaluation of slice coherence requires access to predefined slices formulated by metadata like attributes or subclasses. Its validity heavily relies on the quality and abundance of metadata, where some possible patterns could be ignored. Besides, current algorithms cannot directly incorporate the constraint of coherence into their optimization objective due to absence of an explicit coherence metric, which could potentially hinder their effectiveness. In this paper, we propose manifold compactness, a coherence metric without reliance on extra information by incorporating the data geometry property into its design, and experiments on typical datasets empirically validate the rationality of the metric. Then we develop Manifold Compactness based error Slice Discovery (MCSD), a novel algorithm that directly treats risk and coherence as the optimization objective, and is flexible to be applied to models of various tasks. Extensive experiments on the benchmark and case studies on other typical datasets demonstrate the superiority of MCSD.

PDF Details DOI

AAAI Conference 2026 System Paper

Federated Learning Playground

Bryan Shan Guanrong
Alysa Ziying Tan
Han Yu

We present Federated Learning Playground, an interactive browser-based platform inspired by and extends TensorFlow Playground that teaches core Federated Learning (FL) concepts. Users can experiment with heterogeneous client data distributions, model hyperparameters, and aggregation algorithms directly in the browser without coding or system setup, and observe their effects on client and global models through real-time visualizations, gaining intuition for challenges such as non-IID data, local overfitting, and scalability. The playground serves as an easy to use educational tool, lowering the entry barrier for newcomers to distributed AI while also offering a sandbox for rapidly prototyping and comparing FL methods. By democratizing exploration of FL, it promotes broader understanding and adoption of this important paradigm.

PDF Details DOI

AAAI Conference 2026 Conference Paper

Generating Risky Samples with Conformity Constraints via Diffusion Models

Han Yu
Hao Zou
Xingxuan Zhang
Zhengyi Wang
Yue He
Kehan Li
Peng Cui

Although neural networks achieve promising performance in many tasks, they may still fail when encountering some examples and bring about risks to applications. To discover risky samples, previous literature attempts to search for patterns of risky samples within existing datasets or inject perturbation into them. Yet in this way the diversity of risky samples is limited by the coverage of existing datasets. To overcome this limitation, recent works adopt diffusion models to produce new risky samples beyond the coverage of existing datasets. However, these methods struggle in the conformity between generated samples and expected categories, which could introduce label noise and severely limit their effectiveness in applications. To address this issue, we propose RiskyDiff that incorporates the embeddings of both texts and images as implicit constraints of category conformity. We also design a conformity score to further explicitly strengthen the category conformity, as well as introduce the mechanisms of embedding screening and risky gradient guidance to boost the risk of generated samples. Extensive experiments reveal that RiskyDiff greatly outperforms existing methods in terms of the degree of risk, generation quality, and conformity with conditioned categories. We also empirically show the generalization ability of the models can be enhanced by augmenting training data with generated samples of high conformity.

PDF Details DOI

NeurIPS Conference 2025 Conference Paper

A Reinforcement Learning-based Bidding Strategy for Data Consumers in Auction-based Federated Learning

Xiaoli Tang
Han Yu
Xiaoxiao Li

Auction-based Federated Learning (AFL) fosters collaboration among self-interested data consumers (DCs) and data owners (DOs). A major challenge in AFL pertains to how DCs select and bid for DOs. Existing methods are generally static, making them ill-suited for dynamic AFL markets. To address this issue, we propose the R}einforcement Learning-based Bidding Strategy for DCs in Auction-based Federated Learning (RLB-AFL). We incorporate historical states into a Deep Q-Network to capture sequential information critical for bidding decisions. To mitigate state space sparsity, where specific states rarely reoccur for each DC during auctions, we incorporate the Gaussian Mixture Model into RLB-AFL. This facilitates soft clustering on sequential states, reducing the state space dimensionality and easing exploration and action-value function approximation. In addition, we enhance the $\epsilon$-greedy policy to help the RLB-AFL agent balance exploitation and exploration, enabling it to be more adaptable in the AFL decision-making process. Extensive experiments under 6 widely used benchmark datasets demonstrate that RLB-AFL achieves superior performance compared to 8 state-of-the-art approaches. It outperforms the best baseline by 10. 56% and 3. 15% in terms of average total utility