Arrow Research search

Author name cluster

Yu Rong

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

23 papers
1 author row

Possible papers (23)

NeurIPS Conference 2025 Conference Paper

MOF-BFN: Metal-Organic Frameworks Structure Prediction via Bayesian Flow Networks

  • Rui Jiao
  • Hanlin Wu
  • Wenbing Huang
  • Yuxuan Song
  • Yawen Ouyang
  • Yu Rong
  • Tingyang Xu
  • Pengju Wang

Metal-Organic Frameworks (MOFs) have attracted considerable attention due to their unique properties including high surface area and tunable porosity, and promising applications in catalysis, gas storage, and drug delivery. Structure prediction for MOFs is a challenging task, as these frameworks are intrinsically periodic and hierarchically organized, where the entire structure is assembled from building blocks like metal nodes and organic linkers. To address this, we introduce MOF-BFN, a novel generative model for MOF structure prediction based on Bayesian Flow Networks (BFNs). Given the local geometry of building blocks, MOF-BFN jointly predicts the lattice parameters, as well as the positions and orientations of all building blocks within the unit cell. In particular, the positions are modeled in the fractional coordinate system to naturally incorporate the periodicity. Meanwhile, the orientations are modeled as unit quaternions sampled from learned Bingham distributions via the proposed Bingham BFN, enabling effective orientation generation on the 4D unit hypersphere. Experimental results demonstrate that MOF-BFN achieves state-of-the-art performance across multiple tasks, including structure prediction, geometric property evaluation, and de novo generation, offering a promising tool for designing complex MOF materials.
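
As a minimal illustration of the pose parameterization this abstract describes (not the authors' code), the sketch below wraps fractional coordinates into the unit cell and projects raw 4-vectors onto the unit quaternion sphere; all names and shapes are hypothetical:

    import torch

    def wrap_fractional(pos: torch.Tensor) -> torch.Tensor:
        """Wrap fractional coordinates into [0, 1) to respect lattice periodicity."""
        return pos - torch.floor(pos)

    def normalize_quaternion(q: torch.Tensor) -> torch.Tensor:
        """Project raw 4-vectors onto the unit hypersphere S^3 (orientation space)."""
        return q / q.norm(dim=-1, keepdim=True).clamp_min(1e-8)

    # hypothetical pose of 8 building blocks in one unit cell
    pos = wrap_fractional(torch.randn(8, 3))        # fractional positions
    quat = normalize_quaternion(torch.randn(8, 4))  # orientations on S^3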

NeurIPS Conference 2025 Conference Paper

Non-stationary Equivariant Graph Neural Networks for Physical Dynamics Simulation

  • Chaohao Yuan
  • Maoji Wen
  • Ercan KURUOGLU
  • Yang Liu
  • Jia Li
  • Tingyang Xu
  • Deli Zhao
  • Hong Cheng

To enhance the generalization ability of graph neural networks (GNNs) in learning and simulating physical dynamics, a series of equivariant GNNs have been developed to incorporate the symmetric inductive bias. However, the existing methods do not take into account the non-stationary nature of physical dynamics, where the joint distribution changes over time. Moreover, previous approaches for modeling non-stationary time series typically involve normalizing the data, which disrupts the symmetry assumption inherent in physical dynamics. To model non-stationary physical dynamics while preserving the symmetric inductive bias, we introduce the Non-Stationary Equivariant Graph Neural Network (NS-EGNN). Specifically, NS-EGNN applies the Fourier transform to segments of physical dynamics to extract time-varying frequency information from the trajectories. It then uses first- and second-order differences to mitigate non-stationarity, followed by pooling for future predictions. By capturing varying frequency characteristics and alleviating the linear and quadratic trends in the raw physical dynamics, NS-EGNN better models the temporal dependencies in the physical dynamics. NS-EGNN has been applied to various types of physical dynamics, including molecular, motion, and protein dynamics. In various scenarios, NS-EGNN consistently surpasses the performance of existing state-of-the-art algorithms, underscoring its effectiveness. The implementation of NS-EGNN is available at https://github.com/MaojiWEN/NS-EGNN.
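
The preprocessing the abstract outlines can be sketched in a few lines; this is a generic rendition under assumed shapes, not the released NS-EGNN code:

    import torch

    def segment_spectra(traj: torch.Tensor, seg_len: int) -> torch.Tensor:
        """traj: (T, N, 3) trajectory. Per-segment FFT magnitudes capture
        time-varying frequency content."""
        T = traj.shape[0] - traj.shape[0] % seg_len
        segs = traj[:T].reshape(-1, seg_len, *traj.shape[1:])
        return torch.fft.rfft(segs, dim=1).abs()

    def detrend(traj: torch.Tensor):
        """First- and second-order differences suppress linear and quadratic trends."""
        d1 = traj[1:] - traj[:-1]
        d2 = d1[1:] - d1[:-1]
        return d1, d2

    traj = torch.randn(64, 5, 3)   # hypothetical 64-step trajectory of 5 particles
    spectra = segment_spectra(traj, seg_len=16)
    vel, acc = detrend(traj)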

NeurIPS Conference 2025 Conference Paper

Scaling Language-centric Omnimodal Representation Learning

  • Chenghao Xiao
  • Hou Pong (Ken) Chan
  • Hao Zhang
  • Weiwen Xu
  • Mahani Aljunied
  • Yu Rong

Recent multimodal embedding approaches leveraging multimodal large language models (MLLMs) fine-tuned with contrastive learning (CL) have shown promising results, yet the underlying reasons behind their superiority remain underexplored. This work argues that a crucial advantage of MLLM-based approaches stems from implicit cross-modal alignment achieved during generative pretraining, where the language decoder learns to exploit multimodal signals within a shared representation space for generating unimodal outputs. Through analysis of anisotropy and kernel similarity structure, we empirically confirm that latent alignment emerges within MLLM representations, allowing CL to serve as a lightweight refinement stage. Leveraging this insight, we propose a Language-Centric Omnimodal Embedding framework, termed LCO-Embed. Extensive experiments across diverse backbones and benchmarks demonstrate its effectiveness, achieving state-of-the-art performance across modalities. Furthermore, we identify a Generation-Representation Scaling Law (GRSL), showing that the representational capabilities gained through contrastive refinement scale positively with the MLLM's generative capabilities. This suggests that improving generative abilities emerges as an effective paradigm for enhancing representation quality. We provide a theoretical explanation of the GRSL, which formally links the MLLM's generative quality to the upper bound on its representation performance, and validate it on a challenging, low-resource visual-document retrieval task, showing that continual generative pretraining before CL can further enhance a model's embedding capabilities. Code, models, and resources are available at https://github.com/LCO-Embedding/LCO-Embedding.
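
Since the paper frames contrastive learning as a lightweight refinement stage on top of MLLM embeddings, a generic InfoNCE objective conveys the idea; this is a textbook sketch, not the LCO-Embed loss (see the linked repository for that):

    import torch
    import torch.nn.functional as F

    def info_nce(text_emb, other_emb, temperature=0.05):
        """Symmetric contrastive loss over a batch of paired embeddings."""
        t = F.normalize(text_emb, dim=-1)
        o = F.normalize(other_emb, dim=-1)
        logits = t @ o.t() / temperature       # (B, B) similarity matrix
        labels = torch.arange(t.shape[0])      # positives on the diagonal
        return 0.5 * (F.cross_entropy(logits, labels)
                      + F.cross_entropy(logits.t(), labels))

    loss = info_nce(torch.randn(32, 768), torch.randn(32, 768))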

NeurIPS Conference 2025 Conference Paper

The Rise of Parameter Specialization for Knowledge Storage in Large Language Models

  • Yihuai Hong
  • Yiran Zhao
  • Wei Tang
  • Yang Deng
  • Yu Rong
  • Wenxuan Zhang

Over time, a growing wave of large language models from various series has been introduced to the community. Researchers are striving to maximize the performance of language models with constrained parameter sizes. However, from a microscopic perspective, there has been limited research on how to better store knowledge in model parameters, particularly within MLPs, to enable more effective utilization of this knowledge by the model. In this work, we analyze twenty publicly available open-source large language models to investigate the relationship between their strong performance and the way knowledge is stored in their corresponding MLP parameters. Our findings reveal that as language models become more advanced and demonstrate stronger knowledge capabilities, their parameters exhibit increased specialization. Specifically, parameters in the MLPs tend to be more focused on encoding similar types of knowledge. We experimentally validate that this specialized distribution of knowledge contributes to improving the efficiency of knowledge utilization in these models. Furthermore, by conducting causal training experiments, we confirm that this specialized knowledge distribution plays a critical role in improving the model's efficiency in leveraging stored knowledge.

NeurIPS Conference 2025 Conference Paper

Universally Invariant Learning in Equivariant GNNs

  • Jiacheng Cen
  • Anyi Li
  • Ning Lin
  • Tingyang Xu
  • Yu Rong
  • Deli Zhao
  • Zihe Wang
  • Wenbing Huang

Equivariant Graph Neural Networks (GNNs) have demonstrated significant success across various applications. To achieve completeness---that is, the universal approximation property over the space of equivariant functions---the network must effectively capture the intricate multi-body interactions among different nodes. Prior methods attain this via deeper architectures, augmented body orders, or increased degrees of steerable features, often at high computational cost and without polynomial-time solutions. In this work, we present a theoretically grounded framework for constructing complete equivariant GNNs that is both efficient and practical. We prove that a complete equivariant GNN can be achieved through two key components: 1) a complete scalar function, referred to as the canonical form of the geometric graph; and 2) a full-rank steerable basis set. Leveraging this finding, we propose an efficient algorithm for constructing complete equivariant GNNs based on two common models: EGNN and TFN. Empirical results show that our model achieves superior completeness and excellent performance with only a few layers, thereby significantly reducing computational overhead while maintaining strong practical efficacy.

TMLR Journal 2024 Journal Article

Equivariant Graph Learning for High-density Crowd Trajectories Modeling

  • Yang Liu
  • Zinan Zheng
  • Yu Rong
  • Jia Li

Understanding the high-density crowd dynamics of urbanization plays an important role in architectural design and urban planning, helping to prevent crowd crushes. Most traditional methods rely on formulas designed from expert knowledge, which are too inflexible and incomplete to model complex real-world crowd trajectories. To address this issue, recent studies propose to simulate crowds via data-driven models. However, these models fail to learn the inherent symmetry of high-density crowd trajectories, leading to insufficient generalization ability. For example, existing models cannot predict left-to-right trajectories after learning right-to-left trajectories, even though they share similar patterns. In this work, we propose a novel Equivariant Graph Learning framework for high-density crowd dynamics modeling, called CrowdEGL. It utilizes an additional objective that encourages the model to predict the correspondingly transformed output when the input undergoes the same transformation. We summarize three types of transformation groups, which are determined by the symmetry of the environment. To explicitly incorporate these augmented data, a multi-channel GNN is employed to learn the latent graph embedding of pedestrian patterns. Finally, to model dense crowd interactions, the future positions of original and transformed inputs are obtained by multiple independent graph decoders. Extensive experiments on 8 datasets from 5 different environments show that CrowdEGL outperforms existing models by a large margin.
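
The equivariance objective described above can be sketched as a consistency penalty between transformed inputs and transformed outputs; the linear model and rotation below are placeholders for whichever architecture and symmetry group the environment admits:

    import torch

    def equivariance_loss(model, x, R):
        """x: (N, 2) pedestrian positions; R: (2, 2) transformation matrix."""
        out_of_transformed = model(x @ R.t())   # predict on the transformed input
        transformed_out = model(x) @ R.t()      # transform the prediction instead
        return ((out_of_transformed - transformed_out) ** 2).mean()

    theta = torch.tensor(torch.pi / 2)
    R = torch.tensor([[theta.cos(), -theta.sin()],
                      [theta.sin(),  theta.cos()]])
    model = torch.nn.Linear(2, 2, bias=False)   # stand-in trajectory predictor
    loss = equivariance_loss(model, torch.randn(10, 2), R)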

NeurIPS Conference 2023 Conference Paper

Deep Insights into Noisy Pseudo Labeling on Graph Data

  • Botao Wang
  • Jia Li
  • Yang Liu
  • Jiashun Cheng
  • Yu Rong
  • Wenjia Wang
  • Fugee Tsung

Pseudo labeling (PL) is a widely applied strategy to enlarge the labeled dataset by self-annotating potential samples during the training process. Several works have shown that it can improve graph learning model performance in general. However, we notice that incorrect labels can be fatal to the graph training process. Inappropriate PL may result in performance degradation, especially on graph data where noise can propagate. Surprisingly, the corresponding error is seldom analyzed theoretically in the literature. In this paper, we aim to give deep insights into PL on graph learning models. We first present an error analysis of the PL strategy by showing that the error is bounded by the confidence of the PL threshold and the consistency of multi-view predictions. Then, we theoretically illustrate the effect of PL on convergence properties. Based on the analysis, we propose a cautious pseudo-labeling methodology in which we pseudo-label the samples with the highest confidence and multi-view consistency. Finally, extensive experiments demonstrate that the proposed strategy improves the graph learning process and outperforms other PL strategies on link prediction and node classification tasks.
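
A toy version of the cautious selection rule, under our own naming: keep only nodes whose two augmented views agree on the class and whose averaged confidence clears a threshold:

    import torch

    def cautious_pseudo_labels(logits_v1, logits_v2, tau=0.9):
        p1, p2 = logits_v1.softmax(-1), logits_v2.softmax(-1)
        conf, labels = ((p1 + p2) / 2).max(dim=-1)
        consistent = p1.argmax(-1) == p2.argmax(-1)   # multi-view consistency
        keep = consistent & (conf > tau)              # high confidence only
        return labels[keep], keep

    labels, keep = cautious_pseudo_labels(torch.randn(100, 7), torch.randn(100, 7))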

AAAI Conference 2023 Conference Paper

DrugOOD: Out-of-Distribution Dataset Curator and Benchmark for AI-Aided Drug Discovery – a Focus on Affinity Prediction Problems with Noise Annotations

  • Yuanfeng Ji
  • Lu Zhang
  • Jiaxiang Wu
  • Bingzhe Wu
  • Lanqing Li
  • Long-Kai Huang
  • Tingyang Xu
  • Yu Rong

AI-aided drug discovery (AIDD) is gaining popularity due to its potential to make the search for new pharmaceuticals faster, less expensive, and more effective. Despite its extensive use in numerous fields (e.g., ADMET prediction, virtual screening), little research has been conducted on the out-of-distribution (OOD) learning problem with noise. We present DrugOOD, a systematic OOD dataset curator and benchmark for AIDD. In particular, we focus on the drug-target binding affinity prediction problem, which involves both macromolecules (protein targets) and small molecules (drug compounds). DrugOOD offers an automated dataset curator with user-friendly customization scripts, rich domain annotations aligned with biochemistry knowledge, realistic noise-level annotations, and rigorous benchmarking of SOTA OOD algorithms, as opposed to only providing fixed datasets. Since molecular data are often modeled as irregular graphs using graph neural network (GNN) backbones, DrugOOD also serves as a valuable testbed for graph OOD learning problems. Extensive empirical studies have revealed a significant performance gap between in-distribution and out-of-distribution experiments, emphasizing the need for more effective schemes that permit OOD generalization under noise for AIDD.

AAAI Conference 2023 Conference Paper

Energy-Motivated Equivariant Pretraining for 3D Molecular Graphs

  • Rui Jiao
  • Jiaqi Han
  • Wenbing Huang
  • Yu Rong
  • Yang Liu

Pretraining molecular representation models without labels is fundamental to various applications. Conventional methods mainly process 2D molecular graphs and focus solely on 2D tasks, making their pretrained models incapable of characterizing 3D geometry and thus defective for downstream 3D tasks. In this work, we tackle 3D molecular pretraining in a complete and novel sense. In particular, we first propose to adopt an equivariant energy-based model as the backbone for pretraining, which enjoys the merit of fulfilling the symmetry of 3D space. Then we develop a node-level pretraining loss for force prediction, where we further exploit the Riemann-Gaussian distribution to ensure that the loss is E(3)-invariant, improving robustness. Moreover, a graph-level noise scale prediction task is also leveraged to further promote the eventual performance. We evaluate our model, pretrained on the large-scale 3D dataset GEOM-QM9, on two challenging 3D benchmarks: MD17 and QM9. Experimental results demonstrate the efficacy of our method against current state-of-the-art pretraining approaches, and verify the validity of our design for each proposed component. Code is available at https://github.com/jiaor17/3D-EMGP.
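
To make the node-level pretraining idea concrete, here is a plain coordinate-denoising objective: perturb 3D positions and train the network to recover the perturbation. The E(3)-invariant Riemann-Gaussian weighting of the actual method is omitted, and the model signature is assumed:

    import torch

    def denoising_loss(model, coords, feats, sigma=0.1):
        """coords: (N, 3) positions, feats: (N, d) node features."""
        noise = sigma * torch.randn_like(coords)
        pred = model(feats, coords + noise)    # per-node vector prediction
        return ((pred - noise) ** 2).mean()    # recover the injected perturbation

    dummy = lambda h, x: torch.zeros_like(x)   # stand-in equivariant network
    loss = denoising_loss(dummy, torch.randn(5, 3), torch.randn(5, 8))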

NeurIPS Conference 2023 Conference Paper

Equivariant Spatio-Temporal Attentive Graph Networks to Simulate Physical Dynamics

  • Liming Wu
  • Zhichao Hou
  • Jirui Yuan
  • Yu Rong
  • Wenbing Huang

Learning to represent and simulate the dynamics of physical systems is a crucial yet challenging task. Existing equivariant Graph Neural Network (GNN) based methods have encapsulated the symmetry of physics, e.g., translations, rotations, etc., leading to better generalization ability. Nevertheless, their frame-to-frame formulation of the task overlooks the non-Markov property mainly incurred by unobserved dynamics in the environment. In this paper, we reformulate dynamics simulation as a spatio-temporal prediction task, employing the trajectory over the past period to recover the non-Markovian interactions. We propose Equivariant Spatio-Temporal Attentive Graph Networks (ESTAG), an equivariant version of spatio-temporal GNNs, to fulfil this purpose. At its core, we design a novel Equivariant Discrete Fourier Transform (EDFT) to extract periodic patterns from the history frames, then construct an Equivariant Spatial Module (ESM) to accomplish spatial message passing, and an Equivariant Temporal Module (ETM) with forward attention and equivariant pooling mechanisms to aggregate temporal messages. We evaluate our model on three real datasets corresponding to the molecular, protein, and macro levels. Experimental results verify the effectiveness of ESTAG compared to typical spatio-temporal GNNs and equivariant GNNs.
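
The EDFT idea can be approximated at a high level: subtract the center of mass for translation invariance, then apply a DFT over the time axis, which is linear and therefore commutes with rotations. This illustrates the principle, not the exact ESTAG operator:

    import torch

    def centered_dft(history: torch.Tensor) -> torch.Tensor:
        """history: (T, N, 3) past frames; returns complex spectra (F, N, 3)."""
        centered = history - history.mean(dim=1, keepdim=True)  # drop translation
        return torch.fft.rfft(centered, dim=0)   # linear map: rotation-equivariant

    spectra = centered_dft(torch.randn(20, 10, 3))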

AAAI Conference 2023 Conference Paper

Human Mobility Modeling during the COVID-19 Pandemic via Deep Graph Diffusion Infomax

  • Yang Liu
  • Yu Rong
  • Zhuoning Guo
  • Nuo Chen
  • Tingyang Xu
  • Fugee Tsung
  • Jia Li

Non-Pharmaceutical Interventions (NPIs), such as social gathering restrictions, have shown effectiveness to slow the transmission of COVID-19 by reducing the contact of people. To support policy-makers, multiple studies have first modelled human mobility via macro indicators (e.g., average daily travel distance) and then study the effectiveness of NPIs. In this work, we focus on mobility modelling and, from a micro perspective, aim to predict locations that will be visited by COVID-19 cases. Since NPIs generally cause economic and societal loss, such a prediction benefits governments when they design and evaluate them. However, in real-world situations, strict privacy data protection regulations result in severe data sparsity problems (i.e., limited case and location information). To address these challenges and jointly model variables including a geometric graph, a set of diffusions and a set of locations, we propose a model named Deep Graph Diffusion Infomax (DGDI). We show the maximization of DGDI can be bounded by two tractable components: a univariate Mutual Information (MI) between geometric graph and diffusion representation, and a univariate MI between diffusion representation and location representation. To facilitate the research of COVID-19 prediction, we present two benchmarks that contain geometric graphs and location histories of COVID-19 cases. Extensive experiments on the two benchmarks show that DGDI significantly outperforms other competing methods.

TMLR Journal 2023 Journal Article

Noise-robust Graph Learning by Estimating and Leveraging Pairwise Interactions

  • Xuefeng Du
  • Tian Bian
  • Yu Rong
  • Bo Han
  • Tongliang Liu
  • Tingyang Xu
  • Wenbing Huang
  • Yixuan Li

Teaching Graph Neural Networks (GNNs) to accurately classify nodes under severely noisy labels is an important problem in real-world graph learning applications, but it is currently underexplored. Although pairwise training methods have demonstrated promise in supervised metric learning and unsupervised contrastive learning, they remain less studied on noisy graphs, where the structural pairwise interactions (PI) between nodes are abundant and might therefore benefit label-noise learning more than pointwise methods do. This paper bridges the gap by proposing a pairwise framework for noisy node classification on graphs, which relies on the PI as a primary learning proxy in addition to pointwise learning from the noisy node class labels. Our proposed framework, PI-GNN, contributes two novel components: (1) a confidence-aware PI estimation model that adaptively estimates the PI labels, which are defined as whether two nodes share the same label, and (2) a decoupled training approach that leverages the estimated PI labels to regularize a node classification model for robust node classification. Extensive experiments on different datasets and GNN architectures demonstrate the effectiveness of PI-GNN, yielding a promising improvement over state-of-the-art methods. Code is publicly available at https://github.com/TianBian95/pi-gnn.
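
The PI labels have a one-line definition once node labels (possibly noisy) are given: a pair of connected nodes is positive iff the endpoints share a class. A toy construction, with our own variable names:

    import torch

    def pi_labels(edge_index: torch.Tensor, node_labels: torch.Tensor) -> torch.Tensor:
        """edge_index: (2, E) edges; returns a 0/1 PI label per edge."""
        src, dst = edge_index
        return (node_labels[src] == node_labels[dst]).float()

    edge_index = torch.tensor([[0, 1, 2], [1, 2, 0]])
    y = torch.tensor([0, 0, 1])
    print(pi_labels(edge_index, y))   # tensor([1., 0., 0.])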

NeurIPS Conference 2022 Conference Paper

Equivariant Graph Hierarchy-Based Neural Networks

  • Jiaqi Han
  • Wenbing Huang
  • Tingyang Xu
  • Yu Rong

Equivariant Graph Neural Networks (EGNs) are powerful in characterizing the dynamics of multi-body physical systems. Existing EGNs conduct flat message passing, which is unable to capture the spatial/dynamical hierarchy of complex systems, limiting substructure discovery and global information fusion. In this paper, we propose Equivariant Hierarchy-based Graph Networks (EGHNs), which consist of three key components: generalized Equivariant Matrix Message Passing (EMMP), E-Pool, and E-UnPool. In particular, EMMP improves the expressivity of conventional equivariant message passing, E-Pool assigns the quantities of low-level nodes to high-level clusters, and E-UnPool leverages the high-level information to update the dynamics of the low-level nodes. As their names imply, both E-Pool and E-UnPool are guaranteed to be equivariant, respecting physical symmetry. Extensive experimental evaluations verify the effectiveness of our EGHN on several applications, including multi-object dynamics simulation, motion capture, and protein dynamics modeling.
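
E-Pool's equivariance can be seen in a simplified form: if pooled cluster coordinates are convex combinations of node coordinates (assignment columns summing to one over nodes), rotations and translations commute with the pooling. The assignment network below is a placeholder, not the paper's EMMP:

    import torch

    def e_pool(x: torch.Tensor, h: torch.Tensor, assign: torch.nn.Module):
        """x: (N, 3) coordinates, h: (N, d) features."""
        s = assign(h).softmax(dim=0)   # (N, K); each column sums to 1 over nodes
        x_hi = s.t() @ x               # (K, 3) equivariant cluster coordinates
        h_hi = s.t() @ h               # (K, d) cluster features
        return x_hi, h_hi

    x_hi, h_hi = e_pool(torch.randn(10, 3), torch.randn(10, 16),
                        torch.nn.Linear(16, 4))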

IJCAI Conference 2022 Conference Paper

Fine-Tuning Graph Neural Networks via Graph Topology Induced Optimal Transport

  • Jiying Zhang
  • Xi Xiao
  • Long-Kai Huang
  • Yu Rong
  • Yatao Bian

Recently, the pretrain-finetuning paradigm has attracted considerable attention in the graph learning community due to its ability to alleviate the label-scarcity problem in many real-world applications. Current studies use existing techniques, such as weight constraints and representation constraints derived from image or text data, to transfer invariant knowledge from the pre-training stage to the fine-tuning stage. However, these methods fail to preserve the invariances of graph structure and Graph Neural Network (GNN) style models. In this paper, we present a novel optimal transport-based fine-tuning framework called GTOT-Tuning, namely Graph Topology induced Optimal Transport fine-Tuning, for GNN-style backbones. GTOT-Tuning exploits properties of graph data to enhance the preservation of the representations produced by the fine-tuned network. Toward this goal, we formulate graph local knowledge transfer as an Optimal Transport (OT) problem with a structural prior and construct the GTOT regularizer to constrain the fine-tuned model's behavior. By using the adjacency relationships among nodes, the GTOT regularizer achieves node-level optimal transport and reduces redundant transport procedures, resulting in efficient knowledge transfer from the pre-trained models. We evaluate GTOT-Tuning on eight downstream tasks with various GNN backbones and demonstrate that it achieves state-of-the-art fine-tuning performance for GNNs.
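
Restricting transport with a structural prior can be sketched with a masked entropic OT solver; this is a generic Sinkhorn loop under uniform marginals, not the released GTOT regularizer. The mask should include self-loops so every row and column has support:

    import torch

    def masked_sinkhorn(cost, mask, eps=0.1, iters=50):
        """cost: (N, N) feature distances; mask: (N, N) 0/1 adjacency prior."""
        K = torch.exp(-cost / eps) * mask         # forbid non-adjacent transport
        n = cost.shape[0]
        a = b = torch.full((n,), 1.0 / n)         # uniform marginals
        u, v = torch.ones(n), torch.ones(n)
        for _ in range(iters):
            u = a / (K @ v).clamp_min(1e-9)
            v = b / (K.t() @ u).clamp_min(1e-9)
        return u[:, None] * K * v[None, :]        # the transport plan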

AAAI Conference 2021 Conference Paper

Hierarchical Graph Capsule Network

  • Jinyu Yang
  • Peilin Zhao
  • Yu Rong
  • Chaochao Yan
  • Chunyuan Li
  • Hehuan Ma
  • Junzhou Huang

Graph Neural Networks (GNNs) draw their strength from explicitly modeling the topological information of structured data. However, existing GNNs suffer from limited capability in capturing the hierarchical graph representation which plays an important role in graph classification. In this paper, we propose a hierarchical graph capsule network (HGCN) that can jointly learn node embeddings and extract graph hierarchies. Specifically, disentangled graph capsules are established by identifying heterogeneous factors underlying each node, such that their instantiation parameters represent different properties of the same entity. To learn the hierarchical representation, HGCN characterizes the part-whole relationship between lower-level capsules (parts) and higher-level capsules (whole) by explicitly considering the structural information among the parts. Experimental studies demonstrate the effectiveness of HGCN and the contribution of each component. Code: https://github.com/uta-smile/HGCN

NeurIPS Conference 2021 Conference Paper

Not All Low-Pass Filters are Robust in Graph Convolutional Networks

  • Heng Chang
  • Yu Rong
  • Tingyang Xu
  • Yatao Bian
  • Shiji Zhou
  • Xin Wang
  • Junzhou Huang
  • Wenwu Zhu

Graph Convolutional Networks (GCNs) are promising deep learning approaches for learning representations of graph-structured data. Despite the proliferation of such methods, it is well known that they are vulnerable to carefully crafted adversarial attacks on the graph structure. In this paper, we first conduct an adversarial vulnerability analysis based on matrix perturbation theory. We prove that the low-frequency components of the symmetric normalized Laplacian, which is usually used as the convolutional filter in GCNs, can be more robust against structural perturbations when their eigenvalues fall into a certain robust interval. Our results indicate that not all low-frequency components are robust to adversarial attacks and provide a deeper understanding of the relationship between the graph spectrum and the robustness of GCNs. Motivated by the theory, we present GCN-LFR, a general robust co-training paradigm for GCN-based models that encourages transferring the robustness of low-frequency components with an auxiliary neural network. In this way, GCN-LFR can enhance the robustness of various kinds of GCN-based models against poisoning structural attacks in a plug-and-play manner. Extensive experiments across five benchmark datasets and five GCN-based models confirm that GCN-LFR is resistant to adversarial attacks without compromising performance in benign settings.
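
The objects in the analysis are easy to materialize: the symmetric normalized Laplacian and its low-frequency eigencomponents. A generic sketch (the eigenvalue cutoff here is arbitrary, not the paper's robust interval):

    import torch

    def sym_norm_laplacian(adj: torch.Tensor) -> torch.Tensor:
        deg = adj.sum(dim=1)
        d = torch.where(deg > 0, deg.pow(-0.5), torch.zeros_like(deg))
        return torch.eye(len(adj)) - d[:, None] * adj * d[None, :]

    adj = torch.tensor([[0., 1., 1.], [1., 0., 0.], [1., 0., 0.]])
    eigvals, eigvecs = torch.linalg.eigh(sym_norm_laplacian(adj))
    low_freq = eigvecs[:, eigvals < 1.0]   # low-frequency components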

IJCAI Conference 2021 Conference Paper

On Self-Distilling Graph Neural Network

  • Yuzhao Chen
  • Yatao Bian
  • Xi Xiao
  • Yu Rong
  • Tingyang Xu
  • Junzhou Huang

Recently, the teacher-student knowledge distillation framework has demonstrated its potential in training Graph Neural Networks (GNNs). However, due to the difficulty of training over-parameterized GNN models, one may not easily obtain a satisfactory teacher model for distillation. Furthermore, the inefficient training process of teacher-student knowledge distillation also impedes its applications in GNN models. In this paper, we propose the first teacher-free knowledge distillation method for GNNs, termed GNN Self-Distillation (GNN-SD), that serves as a drop-in replacement of the standard training process. The method is built upon the proposed neighborhood discrepancy rate (NDR), which quantifies the non-smoothness of the embedded graph in an efficient way. Based on this metric, we propose the adaptive discrepancy retaining (ADR) regularizer to empower the transferability of knowledge that maintains high neighborhood discrepancy across GNN layers. We also summarize a generic GNN-SD framework that could be exploited to induce other distillation strategies. Experiments further prove the effectiveness and generalization of our approach, as it brings: 1) state-of-the-art GNN distillation performance with less training cost, 2) consistent and considerable performance enhancement for various popular backbones.
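
A minimal reading of the neighborhood discrepancy rate: one minus the cosine similarity between a node's embedding and the mean of its neighbors' embeddings at the same layer. Names are ours; consult the paper for the exact definition:

    import torch
    import torch.nn.functional as F

    def ndr(h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        """h: (N, d) layer embeddings; adj: (N, N) 0/1 adjacency."""
        neigh = adj @ h / adj.sum(dim=1, keepdim=True).clamp_min(1)
        return 1 - F.cosine_similarity(h, neigh, dim=-1)   # per-node discrepancy

    scores = ndr(torch.randn(6, 16), torch.bernoulli(torch.full((6, 6), 0.5)))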

AAAI Conference 2020 Conference Paper

A Restricted Black-Box Adversarial Framework Towards Attacking Graph Embedding Models

  • Heng Chang
  • Yu Rong
  • Tingyang Xu
  • Wenbing Huang
  • Honglei Zhang
  • Peng Cui
  • Wenwu Zhu
  • Junzhou Huang

With the great success of graph embedding models in both academia and industry, the robustness of graph embedding against adversarial attack inevitably becomes a central problem in the graph learning domain. Despite the fruitful progress, most of the current works perform the attack in a white-box fashion: they need to access the model predictions and labels to construct their adversarial loss. However, the inaccessibility of model predictions in real systems makes the white-box attack impractical for real graph learning systems. This paper generalizes current frameworks in a more flexible sense: we aim to attack various kinds of graph embedding models in a black-box fashion. To this end, we begin by investigating the theoretical connections between graph signal processing and graph embedding models in a principled way and formulate the graph embedding model as a general graph signal process with a corresponding graph filter. As such, a generalized adversarial attacker, GF-Attack, is constructed from the graph filter and the feature matrix. Instead of accessing any knowledge of the target classifiers used on top of the graph embedding, GF-Attack performs the attack on the graph filter alone, in a black-box fashion. To validate the generalization of GF-Attack, we construct the attacker on four popular graph embedding models. Extensive experimental results validate the effectiveness of our attacker on several benchmark datasets. In particular, even small graph perturbations such as a one-edge flip consistently mount strong attacks against different graph embedding models.
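
The "embedding model as graph filter" view reduces, for a linear K-layer GCN, to powers of the normalized adjacency applied to the features, which is the only object the black-box attacker needs to reason about. Purely illustrative:

    import torch

    def gcn_filter(adj: torch.Tensor, X: torch.Tensor, K: int = 2) -> torch.Tensor:
        A = adj + torch.eye(len(adj))              # add self-loops
        d = A.sum(dim=1).pow(-0.5)
        A_hat = d[:, None] * A * d[None, :]        # symmetric normalization
        return torch.linalg.matrix_power(A_hat, K) @ X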

NeurIPS Conference 2020 Conference Paper

Deep Multimodal Fusion by Channel Exchanging

  • Yikai Wang
  • Wenbing Huang
  • Fuchun Sun
  • Tingyang Xu
  • Yu Rong
  • Junzhou Huang

Deep multimodal fusion by using multiple sources of data for classification or regression has exhibited a clear advantage over the unimodal counterpart in various applications. Yet, current methods, including aggregation-based and alignment-based fusion, are still inadequate in balancing the trade-off between inter-modal fusion and intra-modal processing, incurring a bottleneck of performance improvement. To this end, this paper proposes the Channel-Exchanging-Network (CEN), a parameter-free multimodal fusion framework that dynamically exchanges channels between sub-networks of different modalities. Specifically, the channel exchanging process is self-guided by individual channel importance, which is measured by the magnitude of the Batch-Normalization (BN) scaling factor during training. The validity of such an exchanging process is also guaranteed by sharing convolutional filters yet keeping separate BN layers across modalities, which, as an add-on benefit, allows our multimodal architecture to be almost as compact as a unimodal network. Extensive experiments on semantic segmentation via RGB-D data and image translation through multi-domain input verify the effectiveness of our CEN compared to current state-of-the-art methods. Detailed ablation studies have also been carried out, which affirm the advantage of each component we propose. Our code is available at https://github.com/yikaiw/CEN.
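
The exchanging rule can be paraphrased in a few lines: channels whose BN scaling factor has small magnitude are deemed unimportant and are overwritten by the other modality's channels. Threshold and shapes below are illustrative, not the released CEN configuration:

    import torch

    def exchange_channels(x_a, x_b, gamma_a, gamma_b, thresh=0.02):
        """x_*: (B, C, H, W) features; gamma_*: (C,) BN scaling factors."""
        weak_a = gamma_a.abs() < thresh           # unimportant channels in A
        weak_b = gamma_b.abs() < thresh
        out_a, out_b = x_a.clone(), x_b.clone()
        out_a[:, weak_a] = x_b[:, weak_a]         # borrow B where A is weak
        out_b[:, weak_b] = x_a[:, weak_b]
        return out_a, out_b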

NeurIPS Conference 2020 Conference Paper

Dirichlet Graph Variational Autoencoder

  • Jia Li
  • Jianwei Yu
  • Jiajin Li
  • Honglei Zhang
  • Kangfei Zhao
  • Yu Rong
  • Hong Cheng
  • Junzhou Huang

Graph Neural Networks (GNNs) and Variational Autoencoders (VAEs) have been widely used in modeling and generating graphs with latent factors. However, there is no clear explanation of what these latent factors are and why they perform well. In this work, we present the Dirichlet Graph Variational Autoencoder (DGVAE) with graph cluster memberships as latent factors. Our study connects VAE-based graph generation and balanced graph cut, and provides a new way to understand and improve the internal mechanism of VAE-based graph generation. Specifically, we first interpret the reconstruction term of DGVAE as balanced graph cut in a principled way. Furthermore, motivated by the low-pass characteristics of balanced graph cut, we propose a new variant of GNN named Heatts to encode the input graph into cluster memberships. Heatts utilizes the Taylor series for fast computation of heat kernels and has better low-pass characteristics than Graph Convolutional Networks (GCNs). Through experiments on graph generation and graph clustering, we demonstrate the effectiveness of our proposed framework.
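
The Taylor-series heat kernel that Heatts builds on is straightforward to sketch: e^{-sL} h is accumulated term by term via the recurrence (-sL)^k / k!. The learnable weights of the actual encoder are omitted:

    import torch

    def heat_kernel_propagate(L, h, s=1.0, order=3):
        """L: (N, N) graph Laplacian; h: (N, d) node features."""
        out, term = h.clone(), h.clone()
        for k in range(1, order + 1):
            term = (-s / k) * (L @ term)   # next Taylor term (-sL)^k / k! applied to h
            out = out + term
        return out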

AAAI Conference 2020 Conference Paper

Rumor Detection on Social Media with Bi-Directional Graph Convolutional Networks

  • Tian Bian
  • Xi Xiao
  • Tingyang Xu
  • Peilin Zhao
  • Wenbing Huang
  • Yu Rong
  • Junzhou Huang

Social media has developed rapidly thanks to its ability to spread new information, which also allows rumors to circulate. Meanwhile, detecting rumors among such massive amounts of information on social media is becoming an arduous challenge. Therefore, deep learning methods such as the Recursive Neural Network (RvNN) have been applied to discover rumors through the way they spread. However, these deep learning methods only take into account the patterns of deep propagation but ignore the structures of wide dispersion in rumor detection. Actually, propagation and dispersion are two crucial characteristics of rumors. In this paper, we propose a novel bi-directional graph model, named Bi-Directional Graph Convolutional Networks (Bi-GCN), to explore both characteristics by operating on both top-down and bottom-up propagation of rumors. It leverages a GCN with a top-down directed graph of rumor spreading to learn the patterns of rumor propagation, and a GCN with the opposite directed graph of rumor diffusion to capture the structures of rumor dispersion. Moreover, the information from the source post is involved in each layer of the GCN to enhance the influence of the roots of rumors. Encouraging empirical results on several benchmarks confirm the superiority of the proposed method over state-of-the-art approaches.
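
The bi-directional construction in miniature: one GCN layer on the propagation graph A (top-down) and one on its transpose (bottom-up), concatenated. Root-feature enhancement and normalization from the paper are omitted, and shapes are assumed:

    import torch

    def bigcn_layer(A, h, w_td, w_bu):
        """A: (N, N) directed propagation matrix; h: (N, d) post features."""
        top_down = torch.relu(A @ h @ w_td)        # rumor propagation direction
        bottom_up = torch.relu(A.t() @ h @ w_bu)   # rumor dispersion direction
        return torch.cat([top_down, bottom_up], dim=-1)

    A = torch.zeros(4, 4)
    A[0, 1] = A[0, 2] = A[1, 3] = 1.0              # a small retweet cascade
    out = bigcn_layer(A, torch.randn(4, 8), torch.randn(8, 16), torch.randn(8, 16))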

NeurIPS Conference 2020 Conference Paper

Self-Supervised Graph Transformer on Large-Scale Molecular Data

  • Yu Rong
  • Yatao Bian
  • Tingyang Xu
  • Weiyang Xie
  • Ying Wei
  • Wenbing Huang
  • Junzhou Huang

How to obtain informative representations of molecules is a crucial prerequisite in AI-driven drug design and discovery. Recent research abstracts molecules as graphs and employs Graph Neural Networks (GNNs) for molecular representation learning. Nevertheless, two issues impede the usage of GNNs in real scenarios: (1) insufficient labeled molecules for supervised training; (2) poor generalization capability to newly synthesized molecules. To address them both, we propose a novel framework, GROVER, which stands for Graph Representation frOm self-superVised mEssage passing tRansformer. With carefully designed self-supervised tasks at the node, edge, and graph levels, GROVER can learn rich structural and semantic information of molecules from enormous unlabelled molecular data. Moreover, to encode such complex information, GROVER integrates Message Passing Networks into the Transformer-style architecture to deliver a class of more expressive molecular encoders. The flexibility of GROVER allows it to be trained efficiently on large-scale molecular datasets without requiring any supervision, thus being immune to the two issues mentioned above. We pre-train GROVER with 100 million parameters on 10 million unlabelled molecules---the biggest GNN and the largest training dataset in molecular representation learning. We then leverage the pre-trained GROVER for molecular property prediction with task-specific fine-tuning, where we observe a huge improvement (more than 6% on average) over current state-of-the-art methods on 11 challenging benchmarks. Our insight is that well-designed self-supervision losses and highly expressive pre-trained models hold significant potential for boosting performance.

NeurIPS Conference 2018 Conference Paper

Adaptive Sampling Towards Fast Graph Representation Learning

  • Wenbing Huang
  • Tong Zhang
  • Yu Rong
  • Junzhou Huang

Graph Convolutional Networks (GCNs) have become a crucial tool for learning representations of graph vertices. The main challenge of adapting GCNs to large-scale graphs is scalability: they incur heavy computation and memory costs due to uncontrollable neighborhood expansion across layers. In this paper, we accelerate the training of GCNs by developing an adaptive layer-wise sampling method. By constructing the network layer by layer in a top-down manner, we sample the lower layer conditioned on the top one, where the sampled neighborhoods are shared by different parent nodes and over-expansion is avoided owing to the fixed-size sampling. More importantly, the proposed sampler is adaptive and applicable for explicit variance reduction, which in turn enhances the training of our method. Furthermore, we propose a novel and economical approach to promote message passing over distant nodes by applying skip connections. Intensive experiments on several benchmarks verify the effectiveness of our method regarding classification accuracy while enjoying faster convergence.
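
A toy layer-wise sampler in the spirit described above: the lower layer is drawn conditioned on the nodes already chosen for the layer above, so sampled neighbors are shared across parents and the per-layer size stays fixed. Our own naming, without the paper's variance-reduction term:

    import torch

    def sample_lower_layer(adj, upper_nodes, n_samples):
        """adj: (N, N) 0/1 adjacency; upper_nodes: indices of the layer above."""
        probs = adj[upper_nodes].sum(dim=0)        # mass from shared parents
        probs = probs / probs.sum().clamp_min(1e-9)
        # assumes n_samples does not exceed the number of candidate neighbors
        return torch.multinomial(probs, n_samples, replacement=False)

    adj = (torch.rand(8, 8) < 0.4).float()
    adj.fill_diagonal_(1.0)                        # guarantee candidates exist
    lower = sample_lower_layer(adj, torch.tensor([0, 3]), n_samples=3)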