Author name cluster

Xihong Yang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers

2 author rows

ICML Conference 2025 Conference Paper

Automatically Identify and Rectify: Robust Deep Contrastive Multi-view Clustering in Noisy Scenarios

Xihong Yang
Siwei Wang 0001
Fangdi Wang
Jiaqi Jin
Suyuan Liu
Yue Liu 0008
En Zhu
Xinwang Liu 0002

Leveraging the powerful representation learning capabilities, deep multi-view clustering methods have demonstrated reliable performance by effectively integrating multi-source information from diverse views in recent years. Most existing methods rely on the assumption of clean views. However, noise is pervasive in real-world scenarios, leading to a significant degradation in performance. To tackle this problem, we propose a novel multi-view clustering framework for the automatic identification and rectification of noisy data, termed AIRMVC. Specifically, we reformulate noise identification as an anomaly identification problem using GMM. We then design a hybrid rectification strategy to mitigate the adverse effects of noisy data based on the identification results. Furthermore, we introduce a noise-robust contrastive mechanism to generate reliable representations. Additionally, we provide a theoretical proof demonstrating that these representations can discard noisy information, thereby improving the performance of downstream tasks. Extensive experiments on six benchmark datasets demonstrate that AIRMVC outperforms state-of-the-art algorithms in terms of robustness in noisy scenarios. The code of AIRMVC is available at https://github.com/xihongyang1999/AIRMVC on GitHub.
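
The GMM-based identification step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each sample has a scalar score (for example a per-sample loss; the abstract does not say which quantity AIRMVC models) and fits a two-component 1D Gaussian mixture with EM, treating the higher-mean component as the noisy one. The function names and the choice of score are hypothetical and are not taken from the AIRMVC code.

```python
from math import exp, pi, sqrt


def gaussian_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return exp(-(x - mean) ** 2 / (2.0 * var)) / sqrt(2.0 * pi * var)


def noise_probabilities(scores, n_iter=50, min_var=1e-6):
    """Fit a two-component 1D GMM with EM and return, for each score,
    the posterior probability of the higher-mean ("noisy") component.

    Assumption: a larger score means the sample is more likely noisy.
    """
    lo, hi = min(scores), max(scores)
    means = [lo, hi]
    variances = [max(((hi - lo) / 4.0) ** 2, min_var)] * 2
    weights = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in scores:
            likes = [weights[k] * gaussian_pdf(x, means[k], variances[k])
                     for k in range(2)]
            total = sum(likes)
            if total == 0.0:
                # Both densities underflowed: assign to the nearest mean.
                nearest = 0 if abs(x - means[0]) <= abs(x - means[1]) else 1
                resp.append([1.0 if k == nearest else 0.0 for k in range(2)])
            else:
                resp.append([like / total for like in likes])
        # M-step: re-estimate weight, mean, and variance per component.
        for k in range(2):
            mass = sum(r[k] for r in resp)
            if mass == 0.0:
                continue
            weights[k] = mass / len(scores)
            means[k] = sum(r[k] * x for r, x in zip(resp, scores)) / mass
            var = sum(r[k] * (x - means[k]) ** 2
                      for r, x in zip(resp, scores)) / mass
            variances[k] = max(var, min_var)
    noisy = 0 if means[0] > means[1] else 1
    return [r[noisy] for r in resp]


# Toy scores: four low values and two clearly larger ones.
scores = [0.10, 0.12, 0.09, 0.11, 0.95, 1.05]
probs = noise_probabilities(scores)
flagged = [i for i, p in enumerate(probs) if p > 0.5]
```

Thresholding the posterior at 0.5 is one simple decision rule; a rectification step could instead use the probabilities as soft weights.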

ICML Conference 2025 Conference Paper

Hgformer: Hyperbolic Graph Transformer for Collaborative Filtering

Xin Yang 0041
Xingrun Li
Heng Chang
Jinze Yang
Xihong Yang
Shengyu Tao
Maiko Shigeno
Ningkang Chang

Recommender systems are increasingly spreading to different areas like e-commerce or video streaming to alleviate information overload. One of the most fundamental methods for recommendation is Collaborative Filtering (CF), which leverages historical user-item interactions to infer user preferences. In recent years, Graph Neural Networks (GNNs) have been extensively studied to capture graph structures in CF tasks. Despite this remarkable progress, local structure modeling and embedding distortion still remain two notable limitations in the majority of GNN-based CF methods. Therefore, in this paper, we propose a novel Hyperbolic Graph Transformer architecture to tackle the long-tail problems in CF tasks. Specifically, the proposed framework is comprised of two essential modules: 1) Local Hyperbolic Graph Convolutional Network (LHGCN), which performs graph convolution entirely in the hyperbolic manifold and captures the local structure of each node; 2) Hyperbolic Transformer, which is comprised of hyperbolic cross-attention mechanisms to capture global information. Furthermore, to enable its feasibility on large-scale data, we introduce an unbiased approximation of the cross-attention for linear computational complexity, with a theoretical guarantee in approximation errors. Empirical experiments demonstrate that our proposed model outperforms the leading collaborative filtering methods and significantly mitigates the long-tail issue in CF tasks. Our implementations are available at https://github.com/EnkiXin/Hgformer.

AAAI Conference 2024 Conference Paper

Cross-Gate MLP with Protein Complex Invariant Embedding Is a One-Shot Antibody Designer

Cheng Tan
Zhangyang Gao
Lirong Wu
Jun Xia
Jiangbin Zheng
Xihong Yang
Yue Liu
Bozhen Hu

Antibodies are crucial proteins produced by the immune system in response to foreign substances or antigens. The specificity of an antibody is determined by its complementarity-determining regions (CDRs), which are located in the variable domains of the antibody chains and form the antigen-binding site. Previous studies have utilized complex techniques to generate CDRs, but they suffer from inadequate geometric modeling. Moreover, the common iterative refinement strategies lead to an inefficient inference. In this paper, we propose a simple yet effective model that can co-design 1D sequences and 3D structures of CDRs in a one-shot manner. To achieve this, we decouple the antibody CDR design problem into two stages: (i) geometric modeling of protein complex structures and (ii) sequence-structure co-learning. We develop a novel macromolecular structure invariant embedding, typically for protein complexes, that captures both intra- and inter-component interactions among the backbone atoms, including Cα, N, C, and O atoms, to achieve comprehensive geometric modeling. Then, we introduce a simple cross-gate MLP for sequence-structure co-learning, allowing sequence and structure representations to implicitly refine each other. This enables our model to design desired sequences and structures in a one-shot manner. Extensive experiments are conducted to evaluate our results at both the sequence and structure level, which demonstrate that our model achieves superior performance compared to the state-of-the-art antibody CDR design methods.

NeurIPS Conference 2024 Conference Paper

Evaluate then Cooperate: Shapley-based View Cooperation Enhancement for Multi-view Clustering

Fangdi Wang
Jiaqi Jin
Jingtao Hu
Suyuan Liu
Xihong Yang
Siwei Wang
Xinwang Liu
En Zhu

The fundamental goal of deep multi-view clustering is to achieve preferable task performance through inter-view cooperation. Although numerous DMVC approaches have been proposed, the collaborative role of individual views has not been well investigated in existing literature. Moreover, how to further enhance view cooperation for better fusion still needs to be explored. In this paper, we first consider DMVC as an unsupervised cooperative game where each view can be regarded as a participant. Then, we introduce the Shapley value and propose a novel MVC framework termed Shapley-based Cooperation Enhancing Multi-view Clustering (SCE-MVC), which evaluates view cooperation with game theory. Specifically, we employ the optimal transport distance between fused cluster distributions and single view component as the utility function for computing Shapley values. Afterwards, we apply Shapley values to assess the contribution of each view and utilize these contributions to promote view cooperation. Comprehensive experimental results well support the effectiveness of our framework when applied to existing DMVC frameworks, demonstrating the importance and necessity of enhancing the cooperation among views.
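
As a small illustration of how a Shapley value assigns a contribution to each view, the sketch below computes exact Shapley values by averaging marginal contributions over all orderings of the views. The utility function here is a made-up toy score, not the optimal-transport-based utility described in the abstract, and all names are hypothetical.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, utility):
    """Exact Shapley values: average each player's marginal contribution
    over every ordering in which the coalition can be assembled."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += utility(with_p) - utility(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


def toy_utility(coalition):
    """Hypothetical utility: additive per-view scores plus a synergy bonus
    when view_a and view_b are both present."""
    base = {"view_a": 0.5, "view_b": 0.3, "view_c": 0.1}
    score = sum(base[v] for v in coalition)
    if {"view_a", "view_b"} <= coalition:
        score += 0.2
    return score


views = ["view_a", "view_b", "view_c"]
values = shapley_values(views, toy_utility)
```

The synergy bonus is split equally between the two views that create it, and the values sum to the utility of the full coalition. Exact enumeration is factorial in the number of views, so it is only practical for a handful of views.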

AAAI Conference 2024 Conference Paper

Sample-Level Cross-View Similarity Learning for Incomplete Multi-View Clustering

Suyuan Liu
Junpu Zhang
Yi Wen
Xihong Yang
Siwei Wang
Yi Zhang
En Zhu
Chang Tang

Incomplete multi-view clustering has attracted much attention due to its ability to handle partial multi-view data. Recently, similarity-based methods have been developed to explore the complete relationship among incomplete multi-view data. Although widely applied to partial scenarios, most of the existing approaches are still faced with two limitations. Firstly, fusing similarities constructed individually on each view fails to yield a complete unified similarity. Moreover, incomplete similarity generation may lead to anomalous similarity values with column sum constraints, affecting the final clustering results. To solve the above challenging issues, we propose a Sample-level Cross-view Similarity Learning (SCSL) method for Incomplete Multi-view Clustering. Specifically, we project all samples to the same dimension and simultaneously construct a complete similarity matrix across views based on the inter-view sample relationship and the intra-view sample relationship. In addition, a simultaneously learned consensus representation ensures the validity of the projection, which further enhances the quality of the similarity matrix through the graph Laplacian regularization. Experimental results on six benchmark datasets demonstrate the ability of SCSL in processing incomplete multi-view clustering tasks. Our code is publicly available at https://github.com/Tracesource/SCSL.

AAAI Conference 2023 Conference Paper

Benefiting from the intrinsic supervision information exploitation capability, contrastive learning has achieved promising performance in the field of deep graph clustering recently. However, we observe that two drawbacks of the positive and negative sample construction mechanisms limit the performance of existing algorithms from further improvement. 1) The quality of positive samples heavily depends on the carefully designed data augmentations, while inappropriate data augmentations would easily lead to semantic drift and indiscriminative positive samples. 2) The constructed negative samples are not reliable for ignoring important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC) by mining the intrinsic supervision information in the high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct the positive samples from the same high-confidence cluster in two views. Moreover, to construct semantically meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function to pull close the samples from the same cluster while pushing away those from other clusters by maximizing and minimizing the cross-view cosine similarity between positive and negative samples. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with the existing state-of-the-art algorithms. The code of CCGC is available at https://github.com/xihongyang1999/CCGC on GitHub.

ICML Conference 2023 Conference Paper

Dink-Net: Neural Clustering on Large Graphs

Yue Liu 0008
Ke Liang 0006
Jun Xia 0001
Sihang Zhou 0001
Xihong Yang
Xinwang Liu 0002
Stan Z. Li

Deep graph clustering, which aims to group the nodes of a graph into disjoint clusters with deep neural networks, has achieved promising progress in recent years. However, the existing methods fail to scale to the large graph with million nodes. To solve this problem, a scalable deep graph clustering method (Dink-Net) is proposed with the idea of dilation and shrink. Firstly, by discriminating nodes, whether being corrupted by augmentations, representations are learned in a self-supervised manner. Meanwhile, the cluster centers are initialized as learnable neural parameters. Subsequently, the clustering distribution is optimized by minimizing the proposed cluster dilation loss and cluster shrink loss in an adversarial manner. By these settings, we unify the two-step clustering, i.e., representation learning and clustering optimization, into an end-to-end framework, guiding the network to learn clustering-friendly features. Besides, Dink-Net scales well to large graphs since the designed loss functions adopt the mini-batch data to optimize the clustering distribution even without performance drops. Both experimental results and theoretical analyses demonstrate the superiority of our method. Compared to the runner-up, Dink-Net achieves 9.62% NMI improvement on the ogbn-papers100M dataset with 111 million nodes and 1.6 billion edges. The source code is released: https://github.com/yueliu1999/Dink-Net. Besides, a collection (papers, codes, and datasets) of deep graph clustering is shared on GitHub https://github.com/yueliu1999/Awesome-Deep-Graph-Clustering.

AAAI Conference 2023 Conference Paper

Hard Sample Aware Network for Contrastive Deep Graph Clustering

Yue Liu
Xihong Yang
Sihang Zhou
Xinwang Liu
Zhen Wang
Ke Liang
Wenxuan Tu
Liang Li

Contrastive deep graph clustering, which aims to divide nodes into disjoint groups via contrastive mechanisms, is a challenging research spot. Among the recent works, hard sample mining-based algorithms have achieved great attention for their promising performance. However, we find that the existing hard sample mining methods have two problems as follows. 1) In the hardness measurement, the important structural information is overlooked for similarity calculation, degrading the representativeness of the selected hard negative samples. 2) Previous works merely focus on the hard negative sample pairs while neglecting the hard positive sample pairs. Nevertheless, samples within the same cluster but with low similarity should also be carefully learned. To solve the problems, we propose a novel contrastive deep graph clustering method dubbed Hard Sample Aware Network (HSAN) by introducing a comprehensive similarity measure criterion and a general dynamic sample weighting strategy. Concretely, in our algorithm, the similarities between samples are calculated by considering both the attribute embeddings and the structure embeddings, better revealing sample relationships and assisting hardness measurement. Moreover, under the guidance of the carefully collected high-confidence clustering information, our proposed weight modulating function will first recognize the positive and negative samples and then dynamically up-weight the hard sample pairs while down-weighting the easy ones. In this way, our method can mine not only the hard negative samples but also the hard positive samples, thus improving the discriminative capability of the samples further. Extensive experiments and analyses demonstrate the superiority and effectiveness of our proposed method. The source code of HSAN is shared at https://github.com/yueliu1999/HSAN and a collection (papers, codes, and datasets) of deep graph clustering is shared at https://github.com/yueliu1999/Awesome-Deep-Graph-Clustering on GitHub.
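
The up-weighting of hard pairs and down-weighting of easy ones can be sketched as follows. This is an illustrative assumption about the form of such a weight function, not the function from the HSAN code: similarities are assumed to lie in [0, 1], the mixing coefficient `alpha` and exponent `beta` are hypothetical parameters, and pairs without high-confidence cluster information keep a neutral weight of 1.

```python
def combined_similarity(attr_sim, struct_sim, alpha=0.5):
    """Blend attribute-based and structure-based similarity.
    alpha is a hypothetical mixing coefficient in [0, 1]."""
    return alpha * attr_sim + (1.0 - alpha) * struct_sim


def pair_weight(similarity, same_cluster, high_confidence, beta=2.0):
    """Weight a sample pair by how far its similarity is from its target.

    Positive pairs (same cluster) target similarity 1, negative pairs
    target 0. Pairs far from their target are "hard" and get a larger
    weight; pairs near their target are "easy" and get a smaller one.
    """
    if not high_confidence:
        return 1.0  # no reliable pseudo-label: leave the pair unweighted
    target = 1.0 if same_cluster else 0.0
    return abs(target - similarity) ** beta


# A positive pair that looks dissimilar is hard; one that looks similar is easy.
hard_positive = pair_weight(combined_similarity(0.3, 0.1), True, True)
easy_positive = pair_weight(combined_similarity(0.9, 0.9), True, True)
# A negative pair that looks similar is hard; one that looks dissimilar is easy.
hard_negative = pair_weight(combined_similarity(0.8, 1.0), False, True)
easy_negative = pair_weight(combined_similarity(0.1, 0.1), False, True)
unlabelled = pair_weight(0.5, True, False)
```

A larger `beta` sharpens the contrast between hard and easy pairs, since weights below 1 shrink faster as the exponent grows.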

AAAI Conference 2022 Conference Paper

Deep Graph Clustering via Dual Correlation Reduction

Yue Liu
Wenxuan Tu
Sihang Zhou
Xinwang Liu
Linxuan Song
Xihong Yang
En Zhu

Deep graph clustering, which aims to reveal the underlying graph structure and divide the nodes into different groups, has attracted intensive attention in recent years. However, we observe that, in the process of node encoding, existing methods suffer from representation collapse which tends to map all data into the same representation. Consequently, the discriminative capability of the node representation is limited, leading to unsatisfactory clustering performance. To address this issue, we propose a novel self-supervised deep graph clustering method termed Dual Correlation Reduction Network (DCRN) by reducing information correlation in a dual manner. Specifically, in our method, we first design a siamese network to encode samples. Then by forcing the cross-view sample correlation matrix and cross-view feature correlation matrix to approximate two identity matrices, respectively, we reduce the information correlation in the dual-level, thus improving the discriminative capability of the resulting features. Moreover, in order to alleviate representation collapse caused by over-smoothing in GCN, we introduce a propagation regularization term to enable the network to gain long-distance information with the shallow network structure. Extensive experimental results on six benchmark datasets demonstrate the effectiveness of the proposed DCRN against the existing state-of-the-art methods. The code of DCRN is available at https://github.com/yueliu1999/DCRN and a collection (papers, codes, and datasets) of deep graph clustering is shared at https://github.com/yueliu1999/Awesome-Deep-Graph-Clustering on GitHub.
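
The idea of pushing both a cross-view sample correlation matrix and a cross-view feature correlation matrix toward the identity can be sketched in a few lines. The version below is a simplified illustration that assumes cosine similarity as the correlation measure and a mean squared error against the identity as the penalty; the function names are hypothetical and the exact loss used by DCRN may differ.

```python
from math import sqrt


def normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned as-is)."""
    norm = sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cross_view_correlation(rows_a, rows_b):
    """Cosine similarity between every row of view A and every row of view B."""
    a = [normalize(r) for r in rows_a]
    b = [normalize(r) for r in rows_b]
    return [[sum(x * y for x, y in zip(ra, rb)) for rb in b] for ra in a]


def identity_loss(corr):
    """Mean squared difference between a square matrix and the identity."""
    n = len(corr)
    total = 0.0
    for i in range(n):
        for j in range(n):
            target = 1.0 if i == j else 0.0
            total += (corr[i][j] - target) ** 2
    return total / (n * n)


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def dual_correlation_loss(z1, z2):
    """Sample-level (N x N) plus feature-level (d x d) correlation penalty
    for two views z1 and z2, each given as N rows of d features."""
    sample_corr = cross_view_correlation(z1, z2)
    feature_corr = cross_view_correlation(transpose(z1), transpose(z2))
    return identity_loss(sample_corr) + identity_loss(feature_corr)


# Distinct, decorrelated embeddings incur no penalty...
distinct = [[1.0, 0.0], [0.0, 1.0]]
loss_distinct = dual_correlation_loss(distinct, distinct)
# ...while collapsed embeddings (every node mapped to the same vector) do.
collapsed = [[1.0, 1.0], [1.0, 1.0]]
loss_collapsed = dual_correlation_loss(collapsed, collapsed)
```

The collapsed case shows why the penalty discourages representation collapse: when all rows are identical, every off-diagonal correlation is close to 1 rather than 0.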
