Author name cluster

Bingheng Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers
2 author rows

Possible papers

AAAI Conference 2026 Conference Paper

Learning the Latent Structure: A Feature-Centric Approach to Graph Data Augmentation

  • Yu Song
  • Zhigang Hua
  • Yan Xie
  • Bingheng Li
  • Jingzhe Liu
  • Bo Long
  • Jiliang Tang
  • Hui Liu

Graph-structured data plays a pivotal role in modeling complex relationships. However, real-world graphs are often incomplete due to data collection and observational constraints, severely limiting the effectiveness of modern graph learning pipelines. While existing Graph Data Augmentation (GDA) methods attempt to refine graph structures for improved downstream performance, they are typically label-dependent, computationally expensive, and inherently transductive, limiting their applicability in practical scenarios. In this work, we present a novel feature-centric graph data augmentation framework that bypasses explicit structure modeling by operating directly in the embedding space. Through a self-supervised inverse masking process, our method captures latent ties between observed and complete graphs, enabling recovery of unobserved structural signals through refined node representations. To enhance robustness under noisy and sparse supervision, we introduce a message regularizer and a bootstrap strategy for effective training and generalization. Evaluated on ten graph datasets spanning multiple domains, our approach, SelfAug, consistently outperforms state-of-the-art methods in both accuracy and efficiency across inductive and cold-start settings, highlighting its potential as a scalable and generalizable solution for real-world graph learning scenarios.

ICML Conference 2025 Conference Paper

Homophily Enhanced Graph Domain Adaptation

  • Ruiyi Fang
  • Bingheng Li
  • Jingyu Zhao
  • Ruizhi Pu
  • Qiuhao Zeng
  • Gezheng Xu
  • Charles X. Ling
  • Boyu Wang 0004

Graph Domain Adaptation (GDA) transfers knowledge from labeled source graphs to unlabeled target graphs, addressing the challenge of label scarcity. In this paper, we highlight the significance of graph homophily, a pivotal factor for graph domain alignment, which, however, has long been overlooked in existing approaches. Specifically, our analysis first reveals that homophily discrepancies exist in benchmarks. Moreover, we also show that homophily discrepancies degrade GDA performance from both empirical and theoretical aspects, which further underscores the importance of homophily alignment in GDA. Inspired by this finding, we propose a novel homophily alignment algorithm that employs mixed filters to smooth graph signals, thereby effectively capturing and mitigating homophily discrepancies between graphs. Experimental results on a variety of benchmarks verify the effectiveness of our method.
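
As a rough illustration of what a homophily discrepancy between two graphs looks like, the sketch below computes the edge homophily ratio (the fraction of edges whose endpoints share a label) for a toy source graph and a toy target graph. The edge homophily ratio is a common definition assumed here; the abstract does not say which measure the paper uses, and the function name, graphs, and labels are all hypothetical.

```python
def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints carry the same label."""
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Hypothetical toy graphs: the same 4-cycle with two different labelings.
source_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
source_labels = [0, 0, 1, 1]  # two of the four edges join same-label nodes
target_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
target_labels = [0, 1, 0, 1]  # every edge joins nodes with different labels

h_src = edge_homophily(source_edges, source_labels)
h_tgt = edge_homophily(target_edges, target_labels)
gap = abs(h_src - h_tgt)  # a simple proxy for the homophily discrepancy
print(h_src, h_tgt, gap)
```

Measuring homophily this way needs labels on both graphs, so it suits benchmark analysis rather than an unlabeled target graph at training time.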

ICLR Conference 2025 Conference Paper

On the Benefits of Attribute-Driven Graph Domain Adaptation

  • Ruiyi Fang
  • Bingheng Li
  • Zhao Kang 0001
  • Qiuhao Zeng
  • Nima Hosseini Dashtbayaz
  • Ruizhi Pu
  • Charles X. Ling
  • Boyu Wang 0004

Graph Domain Adaptation (GDA) addresses a pressing challenge in cross-network learning, particularly pertinent due to the absence of labeled data in real-world graph datasets. Recent studies attempted to learn domain invariant representations by eliminating structural shifts between graphs. In this work, we show that existing methodologies have overlooked the significance of the graph node attribute, a pivotal factor for graph domain alignment. Specifically, we first reveal the impact of node attributes for GDA by theoretically proving that in addition to the graph structural divergence between the domains, the node attribute discrepancy also plays a critical role in GDA. Moreover, we also empirically show that the attribute shift is more substantial than the topology shift, which further underscores the importance of node attribute alignment in GDA. Inspired by this finding, a novel cross-channel module is developed to fuse and align both views between the source and target graphs for GDA. Experimental results on a variety of benchmarks verify the effectiveness of our method.

AAAI Conference 2025 Conference Paper

One Node One Model: Featuring the Missing-Half for Graph Clustering

  • Xuanting Xie
  • Bingheng Li
  • Erlin Pan
  • Zhaochen Guo
  • Zhao Kang
  • Wenyu Chen

Most existing graph clustering methods primarily focus on exploiting topological structure, often neglecting the "missing-half" node feature information, especially how these features can enhance clustering performance. This issue is further compounded by the challenges associated with high-dimensional features. Feature selection in graph clustering is particularly difficult because it requires simultaneously discovering clusters and identifying the relevant features for these clusters. To address this gap, we introduce a novel paradigm called "one node one model", which builds an exclusive model for each node and defines the node label as a combination of predictions for node groups. Specifically, the proposed "Feature Personalized Graph Clustering (FPGC)" method identifies cluster-relevant features for each node using a squeeze-and-excitation block, integrating these features into each model to form the final representations. Additionally, the concept of feature cross is developed as a data augmentation technique to learn low-order feature interactions. Extensive experimental results demonstrate that FPGC outperforms state-of-the-art clustering methods. Moreover, the plug-and-play nature of our method provides a versatile solution to enhance GNN-based models from the feature perspective.

AAAI Conference 2024 Conference Paper

PC-Conv: Unifying Homophily and Heterophily with Two-Fold Filtering

  • Bingheng Li
  • Erlin Pan
  • Zhao Kang

Recently, many carefully designed graph representation learning methods have achieved impressive performance on either strong heterophilic or homophilic graphs, but not both. Therefore, they are incapable of generalizing well across real-world graphs with different levels of homophily. This is attributed to their neglect of homophily in heterophilic graphs, and vice versa. In this paper, we propose a two-fold filtering mechanism to mine homophily in heterophilic graphs, and vice versa. In particular, we extend the graph heat equation to perform heterophilic aggregation of global information from a long distance. The resultant filter can be exactly approximated by the Poisson-Charlier (PC) polynomials. To further exploit information at multiple orders, we introduce a powerful graph convolution PC-Conv and its instantiation PCNet for the node classification task. Compared to the state-of-the-art GNNs, PCNet shows competitive performance on well-known homophilic and heterophilic graphs. Our implementation is available at https://github.com/uestclbh/PC-Conv.

ICML Conference 2024 Conference Paper

PDHG-Unrolled Learning-to-Optimize Method for Large-Scale Linear Programming

  • Bingheng Li
  • Linxin Yang
  • Yupeng Chen
  • Senmiao Wang
  • Haitao Mao
  • Qian Chen
  • Yao Ma 0001
  • Akang Wang

Solving large-scale linear programming (LP) problems is an important task in various areas such as communication networks, power systems, finance and logistics. Recently, two distinct approaches have emerged to expedite LP solving: (i) First-order methods (FOMs); (ii) Learning to optimize (L2O). In this work, we propose an FOM-unrolled neural network (NN) called PDHG-Net, and propose a two-stage L2O method to solve large-scale LP problems. The new architecture PDHG-Net is designed by unrolling the recently emerged PDHG method into a neural network, combined with channel-expansion techniques borrowed from graph neural networks. We prove that the proposed PDHG-Net can recover the PDHG algorithm and can thus approximate optimal solutions of LP instances with a polynomial number of neurons. We propose a two-stage inference approach: first use PDHG-Net to generate an approximate solution, and then apply the PDHG algorithm to further improve the solution. Experiments show that our approach can significantly accelerate LP solving, achieving up to a 3× speedup compared to FOMs for large-scale LP problems.
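
For readers unfamiliar with the first-order method being unrolled, the sketch below runs plain primal-dual hybrid gradient (PDHG) iterations on a tiny equality-constrained LP, min c^T x subject to Ax = b, x >= 0. It is only the classical iteration, not PDHG-Net: there are no learned parameters and no channel expansion. The function name `pdhg_lp`, the toy problem, the step sizes, and the iteration count are assumptions made for illustration.

```python
def pdhg_lp(A, b, c, tau, sigma, iters=500):
    """Plain PDHG for: min c^T x  s.t.  A x = b, x >= 0.

    A is a list of rows. The step sizes should satisfy
    tau * sigma * ||A||_2^2 < 1 for convergence.
    """
    m, n = len(A), len(c)
    x = [0.0] * n  # primal iterate
    y = [0.0] * m  # dual iterate (multipliers for A x = b)
    for _ in range(iters):
        # Primal step: gradient step on c - A^T y, then project onto x >= 0.
        aty = [sum(A[i][j] * y[i] for i in range(m)) for j in range(n)]
        x_new = [max(0.0, x[j] - tau * (c[j] - aty[j])) for j in range(n)]
        # Extrapolation: x_bar = 2 * x_new - x.
        x_bar = [2.0 * x_new[j] - x[j] for j in range(n)]
        # Dual step: ascent on the constraint residual b - A x_bar.
        ax = [sum(A[i][j] * x_bar[j] for j in range(n)) for i in range(m)]
        y = [y[i] + sigma * (b[i] - ax[i]) for i in range(m)]
        x = x_new
    return x, y


# Hypothetical toy LP: min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0.
A = [[1.0, 1.0]]
b = [1.0]
c = [1.0, 2.0]
# ||A||_2^2 = 2, so tau * sigma * ||A||_2^2 = 0.5 < 1.
x, y = pdhg_lp(A, b, c, tau=0.5, sigma=0.5)
objective = sum(ci * xi for ci, xi in zip(c, x))
print(x, y, objective)
```

On this toy problem the iterates approach the optimum x = (1, 0) with objective value 1. In the two-stage scheme the abstract describes, a learned network would supply the starting point in place of the zero initialization used here.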

ICLR Conference 2024 Conference Paper

Revisiting Link Prediction: a data perspective

  • Haitao Mao
  • Juanhui Li
  • Harry Shomer
  • Bingheng Li
  • Wenqi Fan
  • Yao Ma 0001
  • Tong Zhao 0003
  • Neil Shah

Link prediction, a fundamental task on graphs, has proven indispensable in various applications, e.g., friend recommendation, protein analysis, and drug interaction prediction. However, since datasets span a multitude of domains, they could have distinct underlying mechanisms of link formation. Evidence in existing literature underscores the absence of a universally best algorithm suitable for all datasets. In this paper, we endeavor to explore principles of link prediction across diverse datasets from a data-centric perspective. We recognize three fundamental factors critical to link prediction: local structural proximity, global structural proximity, and feature proximity. We then unearth relationships among those factors: (i) global structural proximity is effective only when local structural proximity is deficient, and (ii) feature proximity and structural proximity can be incompatible. Such incompatibility leads to GNNs for Link Prediction (GNN4LP) consistently underperforming on edges where the feature proximity factor dominates. Inspired by these new insights from a data perspective, we offer practical instruction for GNN4LP model design and guidelines for selecting appropriate benchmark datasets for more comprehensive evaluations.
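
The three factors can be made concrete with simple heuristics. The sketch below scores candidate node pairs with a common-neighbor count (local structural proximity), shortest-path length (global structural proximity), and cosine similarity of node features (feature proximity). These heuristics are stand-ins chosen for illustration; the abstract does not say which measures the paper uses, and the toy graph and features are hypothetical.

```python
import math
from collections import deque


def common_neighbors(adj, u, v):
    """Local structural proximity: number of shared neighbors."""
    return len(adj[u] & adj[v])


def shortest_path_length(adj, u, v):
    """Global structural proximity: hop distance found by BFS (None if unreachable)."""
    seen, queue = {u}, deque([(u, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == v:
            return dist
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def cosine_similarity(a, b):
    """Feature proximity: cosine of the angle between two feature vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm else 0.0


# Hypothetical toy graph: a triangle 0-1-2 with a tail 2-3-4.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 4: [1.0, 0.0]}

cn_near = common_neighbors(adj, 0, 1)      # pair inside the triangle
cn_far = common_neighbors(adj, 0, 4)       # distant pair with no shared neighbor
sp_far = shortest_path_length(adj, 0, 4)   # path 0-2-3-4
feat_near = cosine_similarity(features[0], features[1])
feat_far = cosine_similarity(features[0], features[4])
print(cn_near, cn_far, sp_far, feat_near, feat_far)
```

In this toy example, nodes 0 and 4 share no neighbors yet have identical features, the kind of pair where feature proximity and structural proximity disagree.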

NeurIPS Conference 2024 Conference Paper

Text-space Graph Foundation Models: Comprehensive Benchmarks and New Insights

  • Zhikai Chen
  • Haitao Mao
  • Jingzhe Liu
  • Yu Song
  • Bingheng Li
  • Wei Jin
  • Bahare Fatemi
  • Anton Tsitsulin

Given the ubiquity of graph data and its applications in diverse domains, building a Graph Foundation Model (GFM) that can work well across different graphs and tasks with a unified backbone has recently garnered significant interest. A major obstacle to achieving this goal stems from the fact that graphs from different domains often exhibit diverse node features. Inspired by multi-modal models that align different modalities with natural language, text has recently been adopted to provide a unified feature space for diverse graphs. Despite the great potential of these text-space GFMs, current research in this field is hampered by two problems. First, the absence of a comprehensive benchmark with unified problem settings hinders a clear understanding of the comparative effectiveness and practical value of different text-space GFMs. Second, there is a lack of sufficient datasets to thoroughly explore the methods' full potential and verify their effectiveness across diverse settings. To address these issues, we conduct a comprehensive benchmark providing novel text-space datasets and comprehensive evaluation under unified problem settings. Empirical results provide new insights and inspire future research directions. Our code and data are publicly available from https://github.com/CurryTang/TSGFM.

AAAI Conference 2024 Conference Paper

Upper Bounding Barlow Twins: A Novel Filter for Multi-Relational Clustering

  • Xiaowei Qian
  • Bingheng Li
  • Zhao Kang

Multi-relational clustering is a challenging task due to the fact that diverse semantic information conveyed in multi-layer graphs is difficult to extract and fuse. Recent methods integrate topology structure and node attribute information through graph filtering. However, they often use a low-pass filter without fully considering the correlation among multiple graphs. To overcome this drawback, we propose to learn a graph filter motivated by the theoretical analysis of Barlow Twins. We find that input with a negative semi-definite inner product provides a lower bound for Barlow Twins loss, which prevents it from reaching a better solution. We thus learn a filter that yields an upper bound for Barlow Twins. Afterward, we design a simple clustering architecture and demonstrate its state-of-the-art performance on four benchmark datasets. The source code is available at https://github.com/XweiQ/BTGF.
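
For context, the standard Barlow Twins objective that the analysis refers to can be sketched as follows: standardize two embedding views per feature, form their cross-correlation matrix C, and penalize diagonal entries that differ from 1 and off-diagonal entries that differ from 0. This is the generic loss only, not the learned graph filter or the bounds derived in the paper. The function names, the weight `lam`, and the toy embeddings are assumptions made for illustration.

```python
import math


def standardize(z):
    """Center and scale each feature column of an N x D list of rows."""
    n, d = len(z), len(z[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in z]
        mean = sum(col) / n
        # Population standard deviation; fall back to 1.0 for constant columns.
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n) or 1.0
        for i in range(n):
            out[i][j] = (z[i][j] - mean) / std
    return out


def barlow_twins_loss(z1, z2, lam=0.005):
    """Sum_i (1 - C_ii)^2 + lam * Sum_{i != j} C_ij^2, with C the cross-correlation."""
    a, b = standardize(z1), standardize(z2)
    n, d = len(a), len(a[0])
    loss = 0.0
    for i in range(d):
        for j in range(d):
            c_ij = sum(a[k][i] * b[k][j] for k in range(n)) / n
            loss += (1.0 - c_ij) ** 2 if i == j else lam * c_ij ** 2
    return loss


# Hypothetical embeddings: 4 samples, 2 mutually uncorrelated features.
view = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
flipped = [[-v for v in row] for row in view]

loss_same = barlow_twins_loss(view, view)        # identical, decorrelated views
loss_flipped = barlow_twins_loss(view, flipped)  # perfectly anti-correlated views
print(loss_same, loss_flipped)
```

Identical, decorrelated views give a loss of zero, while sign-flipped views push each diagonal entry of C to -1 and incur the maximum diagonal penalty of 4 per feature.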