Arrow Research search

Author name cluster

Zhen Cui

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

29 papers
1 author row

Possible papers

29

IJCAI Conference 2025 Conference Paper

Going Beyond Consistency: Target-oriented Multi-view Graph Neural Network

  • Sujia Huang
  • Lele Fu
  • Shuman Zhuang
  • Yide Qiu
  • Bo Huang
  • Zhen Cui
  • Tong Zhang

Multi-view learning has emerged as a pivotal research area driven by the growing heterogeneity of real-world data, and graph neural network-based models, which model multi-view data as multi-view graphs, have achieved remarkable performance by revealing its deep semantics. However, by assuming cross-view consistency, most approaches capture not only task-relevant (determinative) semantics but also symbiotic yet task-irrelevant (incidental) factors that obscure model inference. Furthermore, these approaches often lack rigorous theoretical analysis that bridges training data to test data. To address these issues, we propose the Target-oriented Graph Neural Network (TGNN), a novel framework that goes beyond traditional consistency by prioritizing task-relevant information, ensuring alignment with the target. Specifically, TGNN employs a class-level dual-objective loss to minimize the classification similarity between determinative and incidental factors, accentuating the former while suppressing the latter during model inference. Meanwhile, to ensure consistency between the learned semantics and predictions in representation learning, we introduce a penalty term that amplifies the divergence between these two types of factors. Furthermore, we derive an upper bound on the loss discrepancy between training and test data, providing formal guarantees of generalization to test domains. Extensive experiments conducted on three types of multi-view datasets validate the superiority of TGNN.

NeurIPS Conference 2025 Conference Paper

Learn and Ensemble Bridge Adapters for Multi-domain Task Incremental Learning

  • Ziqi Gu
  • Chunyan Xu
  • Wenxuan Fang
  • Xin Liu
  • Yide Qiu
  • Zhen Cui

Multi-domain task incremental learning (MTIL) demands that models master domain-specific expertise while preserving generalization capabilities. Inspired by human lifelong learning, which relies on revisiting, aligning, and integrating past experiences, we propose a Learning and Ensembling Bridge Adapters (LEBA) framework. Specifically, to facilitate cohesive knowledge transfer across domains, we propose a continuous-domain bridge adaptation module that leverages the distribution-transfer capabilities of the Schrödinger bridge for stable progressive learning. To strengthen memory consolidation, we further propose a progressive knowledge ensemble strategy that revisits past task representations via a diffusion model and dynamically integrates historical adapters. For efficiency, LEBA maintains a compact adapter pool through similarity-based selection and employs learnable weights to align replayed samples with current task semantics. Together, these components effectively mitigate catastrophic forgetting and enhance generalization across tasks. Extensive experiments across multiple benchmarks validate the effectiveness and superiority of LEBA over state-of-the-art methods.

AAAI Conference 2025 Conference Paper

Multi-clue Consistency Learning to Bridge Gaps Between General and Oriented Object in Semi-supervised Detection

  • Chenxu Wang
  • Chunyan Xu
  • Xiang Li
  • Yuxuan Li
  • Xu Guo
  • Ziqi Gu
  • Zhen Cui

While existing semi-supervised object detection (SSOD) methods perform well in general scenes, they encounter challenges in handling oriented objects in aerial images. We experimentally find three gaps between general and oriented object detection in semi-supervised learning: 1) Sampling inconsistency: common center sampling is not suitable for oriented objects with larger aspect ratios when selecting positive labels from labeled data. 2) Assignment inconsistency: balancing the precision and localization quality of oriented pseudo-boxes poses greater challenges, introducing more noise when selecting positive labels from unlabeled data. 3) Confidence inconsistency: there is a greater mismatch between the predicted classification and localization qualities for oriented objects, affecting the selection of pseudo-labels. Therefore, we propose a Multi-clue Consistency Learning (MCL) framework to bridge the gaps between general and oriented objects in semi-supervised detection. Specifically, considering the various shapes of rotated objects, the Gaussian Center Assignment is specially designed to select pixel-level positive labels from labeled data. We then introduce the Scale-aware Label Assignment to select pixel-level pseudo-labels instead of unreliable pseudo-boxes, a divide-and-rule strategy suited to objects of various scales. The Consistent Confidence Soft Label is adopted to further boost the detector by maintaining the alignment of the predicted results. Comprehensive experiments on the DOTA-v1.5 and DOTA-v1.0 benchmarks demonstrate that our proposed MCL achieves state-of-the-art performance in the semi-supervised oriented object detection task.

JBHI Journal 2025 Journal Article

NPENN: A Noise Perturbation Ensemble Neural Network for Microbiome Disease Phenotype Prediction

  • Zhen Cui
  • Yan Wu
  • Qin-Hu Zhang
  • Si-Guo Wang
  • Zhen-Hao Guo

With advances in microbiomics, the crucial role of microbes in disease progression is increasingly recognized. However, predicting disease phenotypes using microbiome data remains challenging due to data complexity, heterogeneity, and limited model generalization. Current methods often depend on specific datasets and are vulnerable to adversarial attacks. To address these issues, this paper introduces a novel Noise Perturbation Ensemble Neural Network model (NPENN), which combines noise mechanisms with Gradient Boosting (GB) techniques for robust neural network ensemble learning. NPENN, validated on multiple microbiome datasets, shows superior accuracy and generalization compared to traditional methods, effectively handling data complexity and variability. This approach enhances model robustness and feature learning by integrating GB prior knowledge. Additionally, the study explores microbial community roles in various diseases, providing insights into disease mechanisms and potential biomarkers for personalized precision diagnosis and treatment strategies.

NeurIPS Conference 2025 Conference Paper

One for All: Universal Topological Primitive Transfer for Graph Structure Learning

  • Yide Qiu
  • Tong Zhang
  • Xing Cai
  • Hui Yan
  • Zhen Cui

The non-Euclidean geometry inherent in graph structures fundamentally impedes cross-graph knowledge transfer. Drawing inspiration from texture transfer in computer vision, we pioneer topological primitives as transferable semantic units for graph structural knowledge. To address three critical barriers - the absence of specialized benchmarks, aligned semantic representations, and systematic transfer methodologies - we present G²SN-Transfer, a unified framework comprising: (i) TopoGraph-Mapping, which transforms non-Euclidean graphs into transferable sequences via topological primitive distribution dictionaries; (ii) G²SN, a dual-stream architecture learning text-topology aligned representations through contrastive alignment; and (iii) AdaCross-Transfer, a data-adaptive knowledge transfer mechanism leveraging cross-attention for both full-parameter and parameter-frozen scenarios. In particular, G²SN is a dual-stream sequence network driven by ordinary differential equations, and our theoretical analysis establishes its convergence guarantee. We construct STA-18, the first large-scale benchmark with aligned topological primitive-text pairs across 18 diverse graph datasets. Comprehensive evaluations demonstrate that G²SN achieves state-of-the-art performance on four structural learning tasks (average 3.2% F1-score improvement), while our transfer method yields consistent enhancements across 13 downstream tasks (5.2% average gains), including 10 large-scale graph datasets. The datasets and code are available at https://anonymous.4open.science/r/UGSKT-C10E/.

AAAI Conference 2025 Conference Paper

Re-Attentional Controllable Video Diffusion Editing

  • Yuanzhi Wang
  • Yong Li
  • Mengyi Liu
  • Xiaoya Zhang
  • Xin Liu
  • Zhen Cui
  • Antoni B. Chan

Editing videos with textual guidance has gained popularity due to its streamlined process, which requires users only to edit the text prompt corresponding to the source video. Recent studies have explored and exploited large-scale text-to-image diffusion models for text-guided video editing, resulting in remarkable video editing capabilities. However, they may still suffer from limitations such as mislocated objects and an incorrect number of objects. Therefore, the controllability of video editing remains a formidable challenge. In this paper, we aim to overcome the above limitations by proposing a Re-Attentional Controllable Video Diffusion Editing (ReAtCo) method. Specifically, to align the spatial placement of the target objects with the edited text prompt in a training-free manner, we propose Re-Attentional Diffusion (RAD) to refocus the cross-attention activation responses between the edited text prompt and the target video during the denoising stage, resulting in a spatially location-aligned and semantically high-fidelity manipulated video. In particular, to faithfully preserve the invariant region content with fewer border artifacts, we propose an Invariant Region-guided Joint Sampling (IRJS) strategy to mitigate the intrinsic sampling errors w.r.t. the invariant regions at each denoising timestep and constrain the generated content to be harmonized with the invariant region content. Experimental results verify that ReAtCo consistently improves the controllability of video diffusion editing and achieves superior video editing performance.

AAAI Conference 2025 Conference Paper

Scene Graph-Grounded Image Generation

  • Fuyun Wang
  • Tong Zhang
  • Yuanzhi Wang
  • Xiaoya Zhang
  • Xin Liu
  • Zhen Cui

With the benefit of the explicit object-oriented reasoning capabilities of scene graphs, scene graph-to-image generation has made remarkable advancements in comprehending object coherence and interactive relations. Recent state-of-the-art methods typically predict scene layouts as an intermediate representation of a scene graph before synthesizing the image. Nevertheless, transforming a scene graph into an exact layout may restrict its representation capabilities, leading to discrepancies in interactive relationships (such as standing on, wearing, or covering) between the generated image and the input scene graph. In this paper, we propose a Scene Graph-Grounded Image Generation (SGG-IG) method to mitigate the above issues. Specifically, to enhance the scene graph representation, we design a masked auto-encoder module and a relation embedding learning module to integrate structural knowledge and contextual information of the scene graph in a masked self-supervised manner. Subsequently, to bridge the scene graph with visual content, we introduce a spatial constraint and an image-scene alignment constraint to capture the fine-grained visual correlation between the scene graph symbol representation and the corresponding image representation, thereby generating semantically consistent and high-quality images. Extensive experiments demonstrate the effectiveness of the method both quantitatively and qualitatively.

NeurIPS Conference 2025 Conference Paper

UniHG: A Large-scale Universal Heterogeneous Graph Dataset and Benchmark for Representation Learning and Cross-Domain Transferring

  • Yide Qiu
  • Tong Zhang
  • Shaoxiang Ling
  • Xing Cai
  • Ziqi Gu
  • Zhen Cui

Irregular data in the real world are usually organized as heterogeneous graphs consisting of multiple types of nodes and edges. However, current heterogeneous graph research confronts three fundamental challenges: i) Benchmark Deficiency, ii) Semantic Disalignment, and iii) Propagation Degradation. In this paper, we construct a large-scale, universal, and joint multi-domain heterogeneous graph dataset named UniHG to facilitate heterogeneous graph representation learning and cross-domain knowledge mining. Overall, UniHG contains 77.31 million nodes and 564 million directed edges with thousands of labels and attributes, making it, to the best of our knowledge, the largest universal heterogeneous graph dataset currently available. To perform effective learning and provide comprehensive benchmarks on UniHG, two key measures are taken: i) a semantic alignment strategy for multi-attribute entities, which projects the feature descriptions of multi-attribute nodes and edges into a common embedding space to facilitate information aggregation; and ii) a novel Heterogeneous Graph Decoupling (HGD) framework with a specifically designed Anisotropy Feature Propagation (AFP) module for learning effective multi-hop anisotropic propagation kernels. These two strategies enable efficient information propagation among a tremendous number of multi-attribute entities while adaptively mining multi-attribute associations through multi-hop aggregation in large-scale heterogeneous graphs. Comprehensive benchmark results demonstrate that our model significantly outperforms existing methods with an accuracy improvement of 28.93%. UniHG also facilitates downstream tasks, achieving NDCG@20 improvement rates of 11.48% and 11.71%. The UniHG dataset and benchmark codes have been released at https://anonymous.4open.science/r/UniHG-AA78.

NeurIPS Conference 2025 Conference Paper

Value Diffusion Reinforcement Learning

  • Xiaoliang Hu
  • Fuyun Wang
  • Tong Zhang
  • Zhen Cui

Model-free reinforcement learning (RL) combined with diffusion models has achieved significant progress in addressing complex continuous control tasks. However, a persistent challenge in RL remains the accurate estimation of Q-values, which critically governs the efficacy of policy optimization. Although recent advances employ parametric distributions to model value distributions for enhanced estimation accuracy, current methodologies predominantly rely on unimodal Gaussian assumptions or quantile representations. These constraints introduce distributional bias between the learned and true value distributions, particularly in tasks with nonstationary policies, ultimately degrading performance. To address these limitations, we propose value diffusion reinforcement learning (VDRL), a novel model-free online RL method that utilizes the generative capacity of diffusion models to represent multimodal value distributions. The core innovation of VDRL lies in the use of the variational loss of the diffusion-based value distribution, which is theoretically proven to be a tight lower bound for the optimization objective under the KL-divergence measure. Furthermore, we introduce double value diffusion learning with sample selection to enhance training stability and further improve value estimation accuracy. Extensive experiments conducted on the MuJoCo benchmark demonstrate that VDRL significantly outperforms state-of-the-art model-free online RL baselines, showcasing its effectiveness and robustness.

NeurIPS Conference 2024 Conference Paper

MMM-RS: A Multi-modal, Multi-GSD, Multi-scene Remote Sensing Dataset and Benchmark for Text-to-Image Generation

  • Jialin Luo
  • Yuanzhi Wang
  • Ziqi Gu
  • Yide Qiu
  • Shuaizhen Yao
  • Fuyun Wang
  • Chunyan Xu
  • Wenhua Zhang

Recently, the diffusion-based generative paradigm has achieved impressive general image generation capabilities with text prompts due to its accurate distribution modeling and stable training process. However, generating diverse remote sensing (RS) images that differ tremendously from general images in terms of scale and perspective remains a formidable challenge due to the lack of a comprehensive remote sensing image generation dataset with various modalities, ground sample distances (GSD), and scenes. In this paper, we propose a Multi-modal, Multi-GSD, Multi-scene Remote Sensing (MMM-RS) dataset and benchmark for text-to-image generation in diverse remote sensing scenarios. Specifically, we first collect nine publicly available RS datasets and conduct standardization for all samples. To bridge RS images to textual semantic information, we utilize a large-scale pretrained vision-language model to automatically output text prompts and perform hand-crafted rectification, resulting in information-rich text-image pairs (including multi-modal images). In particular, we design several methods to obtain images with different GSDs and various environments (e.g., low-light, foggy) in a single sample. With extensive manual screening and annotation refinement, we ultimately obtain an MMM-RS dataset that comprises approximately 2.1 million text-image pairs. Extensive experimental results verify that our proposed MMM-RS dataset allows off-the-shelf diffusion models to generate diverse RS images across various modalities, scenes, weather conditions, and GSDs. The dataset is available at https://github.com/ljl5261/MMM-RS.

JBHI Journal 2024 Journal Article

Occluded Facial Pain Assessment in the ICU Using Action Units Guided Network

  • Xin Yuan
  • Zhen Cui
  • Dingfan Xu
  • Shuai Zhang
  • Cancan Zhao
  • Xinbao Wu
  • Tongyu Jia
  • Bo Ouyang

Untreated pain in critically ill patients can lead to immunosuppression and increased metabolic activity, with severe clinical consequences such as tachypnea and delirium. Continuous pain assessment is challenging due to nursing shortages and intensive care unit (ICU) workload. Mechanical ventilation equipment obscures the facial features of many patients in the ICU, making previous facial pain detection methods based on full-face images inapplicable. This paper proposes a facial Action Units (AUs) guided pain assessment network for faces under occlusion. The network consists of an AU-guided (AUG) module, a texture feature extraction (TFE) module, and a pain assessment (PA) module. The AUG module automatically detects AUs in the non-occluded areas of the face. Meanwhile, the TFE module detects facial landmarks and crops prior-knowledge patches, a random exploration patch, and a global feature patch. These patches are then fed into two convolutional networks to extract texture features. Afterward, the designed AU guidance and texture features are fused in the PA module to assess the pain state. Extensive validation is conducted on a public dataset and two datasets created in this work. The proposed network architecture achieves superior performance on binary classification, four-class classification, and intensity regression tasks. In addition, we have successfully applied the network to actual data collected in a laboratory environment with excellent results.

NeurIPS Conference 2024 Conference Paper

Progressive Exploration-Conformal Learning for Sparsely Annotated Object Detection in Aerial Images

  • Zihan Lu
  • Chenxu Wang
  • Chunyan Xu
  • Xiangwei Zheng
  • Zhen Cui

The ability to detect aerial objects with limited annotation is pivotal to the development of real-world aerial intelligence systems. In this work, we focus on demanding but practical sparsely annotated object detection (SAOD) in aerial images, which encompasses a wider variety of aerial scenes with the same number of annotated objects. Although most existing SAOD methods rely on fixed thresholding to filter pseudo-labels for enhancing detector performance, adapting to aerial objects proves challenging due to the imbalanced probabilities/confidences associated with predicted aerial objects. To address this problem, we propose a novel Progressive Exploration-Conformal Learning (PECL) framework, which can adaptively select high-quality pseudo-labels in aerial images. Specifically, pseudo-label exploration is formulated as a decision-making paradigm with a conformal pseudo-label explorer and a multi-clue selection evaluator. The conformal pseudo-label explorer learns an adaptive policy by maximizing the cumulative reward, deciding how to select high-quality candidates by leveraging their essential characteristics and inter-instance contextual information. The multi-clue selection evaluator assesses the explorer-guided pseudo-label selections by providing instructive feedback for policy optimization. Finally, the explored pseudo-labels are adopted to guide the optimization of the aerial object detector in a closed-loop progressive fashion. Comprehensive evaluations on two public datasets demonstrate the superiority of our PECL compared with other state-of-the-art methods on the sparsely annotated aerial object detection task.

AAAI Conference 2023 Conference Paper

Deep Graph Structural Infomax

  • Wenting Zhao
  • Gongping Xu
  • Zhen Cui
  • Siqiang Luo
  • Cheng Long
  • Tong Zhang

In the setting of self-supervised graph learning, Mutual Information (MI) was recently introduced for graph encoding to generate robust node embeddings. A successful representative is Deep Graph Infomax (DGI), which essentially operates on the space of node features but ignores topological structures and considers only a global graph summary. In this paper, we present an effective model called Deep Graph Structural Infomax (DGSI) to learn node representations. We derive structural mutual information from the perspective of the Information Bottleneck (IB), which defines a trade-off between the sufficiency and minimality of a representation under the condition of topological structure preservation. Intuitively, the derived constraints formally maximize the structural mutual information both edge-wise and local-neighborhood-wise. In addition, we develop a general framework that incorporates global representational mutual information, local representational mutual information, and sufficient structural information into the node representation. Essentially, our DGSI extends DGI and can capture more fine-grained semantic information as well as beneficial structural information in a self-supervised manner, thereby improving node representation and further boosting learning performance. Extensive experiments on different types of datasets demonstrate the effectiveness and superiority of the proposed method.

JBHI Journal 2023 Journal Article

DeepTPpred: A Deep Learning Approach With Matrix Factorization for Predicting Therapeutic Peptides by Integrating Length Information

  • Zhen Cui
  • Si-Guo Wang
  • Ying He
  • Zhan-Heng Chen
  • Qin-Hu Zhang

The abuse of traditional antibiotics has led to increased resistance in bacteria and viruses. Efficient therapeutic peptide prediction is critical for peptide drug discovery. However, most existing methods make effective predictions for only one class of therapeutic peptides. It is worth noting that no current predictive method considers sequence length information as a distinct feature of therapeutic peptides. In this article, a novel deep learning approach with matrix factorization for predicting therapeutic peptides (DeepTPpred) by integrating length information is proposed. The matrix factorization layer can learn the potential features of the encoded sequence through a mechanism of first compression and then restoration. The length features of therapeutic peptide sequences are embedded together with the encoded amino acid sequences. To automatically learn therapeutic peptide predictions, these latent features are input into neural networks with a self-attention mechanism. DeepTPpred achieved excellent prediction results on eight therapeutic peptide datasets. Based on these datasets, we first integrated the eight datasets to obtain a full therapeutic peptide integration dataset. Then, we obtained two functional integration datasets based on the functional similarity of the peptides. Finally, we also conducted experiments on the latest versions of the ACP and CPP datasets. Overall, the experimental results show that our work is effective for the identification of therapeutic peptides.

AAAI Conference 2023 Conference Paper

Exploratory Inference Learning for Scribble Supervised Semantic Segmentation

  • Chuanwei Zhou
  • Zhen Cui
  • Chunyan Xu
  • Cao Han
  • Jian Yang

Scribble-supervised semantic segmentation has achieved great advances in pseudo-label exploitation, yet suffers from insufficient label exploration over the mass of unannotated regions. In this work, we propose a novel exploratory inference learning (EIL) framework, which facilitates efficient probing of unlabeled pixels and promotes the selection of confident candidates for boosting the evolved segmentation. The exploration of unannotated regions is formulated as an iterative decision-making process, where a policy searcher learns to infer in the unknown space and the reward to the exploratory policy is based on a contrastive measurement of candidates. In particular, we devise the contrastive reward with intra-class attraction and inter-class repulsion in the feature space w.r.t. the pseudo-labels. The unlabeled exploration and the labeled exploitation are jointly balanced to improve the segmentation, and framed in a closed-loop end-to-end network. Comprehensive evaluations on the benchmark datasets (PASCAL VOC 2012 and PASCAL Context) demonstrate the superiority of our proposed EIL compared with other state-of-the-art methods on the scribble-supervised semantic segmentation problem.

NeurIPS Conference 2023 Conference Paper

Incomplete Multimodality-Diffused Emotion Recognition

  • Yuanzhi Wang
  • Yong Li
  • Zhen Cui

Human multimodal emotion recognition (MER) aims to perceive and understand human emotions via various heterogeneous modalities, such as language, vision, and acoustics. Compared with unimodality, the complementary information in multimodalities facilitates robust emotion understanding. Nevertheless, in real-world scenarios, missing modalities hinder multimodal understanding and result in degraded MER performance. In this paper, we propose an Incomplete Multimodality-Diffused emotion recognition (IMDer) method to mitigate the challenge of MER under incomplete multimodalities. To recover the missing modalities, IMDer exploits a score-based diffusion model that maps input Gaussian noise into the desired distribution space of the missing modalities and recovers missing data in accordance with their original distributions. Specifically, to reduce semantic ambiguity between the missing and the recovered modalities, the available modalities are embedded as a condition to guide and refine the diffusion-based recovery process. In contrast to previous work, the diffusion-based modality recovery mechanism in IMDer simultaneously achieves both distribution consistency and semantic disambiguation. Feature visualization of the recovered modalities illustrates the consistent modality-specific distribution and semantic alignment. Moreover, quantitative experimental results verify that IMDer obtains state-of-the-art MER accuracy under various missing-modality patterns.

AAAI Conference 2023 Conference Paper

Progressive Bayesian Inference for Scribble-Supervised Semantic Segmentation

  • Chuanwei Zhou
  • Chunyan Xu
  • Zhen Cui

The scribble-supervised semantic segmentation is an important yet challenging task in computer vision. To deal with the pixel-wise sparse annotation problem, we propose a Progressive Bayesian Inference (PBI) framework to boost the performance of scribble-supervised semantic segmentation, which can effectively infer the semantic distribution of unlabeled pixels to guide the optimization of the segmentation network. PBI dynamically improves model learning from two aspects: the Bayesian inference module (i.e., semantic distribution learning) and the pixel-wise segmenter (i.e., model updating). Specifically, we effectively infer the semantic probability distribution of unlabeled pixels with our designed Bayesian inference module, where its guidance is estimated through Bayesian expectation maximization under partially observed data. The segmenter can be progressively improved under the joint guidance of the original scribble information and the learned semantic distribution. Segmenter optimization and semantic distribution promotion are encapsulated into a unified architecture in which they improve each other through mutual evolution in a progressive fashion. Comprehensive evaluations on several benchmark datasets demonstrate the effectiveness and superiority of our proposed PBI compared with other state-of-the-art methods on the scribble-supervised semantic segmentation task.

AAAI Conference 2021 Conference Paper

Deep Wasserstein Graph Discriminant Learning for Graph Classification

  • Tong Zhang
  • Yun Wang
  • Zhen Cui
  • Chuanwei Zhou
  • Baoliang Cui
  • Haikuan Huang
  • Jian Yang

Graph topological structures are crucial to distinguishing graphs of different classes. In this work, we propose a deep Wasserstein graph discriminant learning (WGDL) framework to learn discriminative embeddings of graphs in a Wasserstein-metric (W-metric) matching space. In order to bypass the calculation of W-metric class centers in discriminant analysis, as well as to better support batch-process learning, we introduce a reference set of graphs (aka a graph dictionary) to express representative graph samples (aka dictionary keys). Through the graph dictionary, every input graph can be projected into the latent dictionary space via our proposed Wasserstein graph transformation (WGT). In WGT, we formulate inter-graph distance in W-metric space by virtue of the optimal transport (OT) principle, which effectively expresses the correlations of cross-graph structures. To give WGDL better representation ability, we dynamically update the graph dictionary during training by maximizing the Wasserstein discriminant loss, i.e., the ratio of inter-class to intra-class Wasserstein distance. To evaluate our WGDL method, comprehensive experiments are conducted on six graph classification datasets. Experimental results demonstrate the effectiveness of our WGDL and its state-of-the-art performance.

IJCAI Conference 2021 Conference Paper

Graph Deformer Network

  • Wenting Zhao
  • Yuan Fang
  • Zhen Cui
  • Tong Zhang
  • Jian Yang

Convolution learning on graphs has drawn increasing attention recently due to its potential applications to large amounts of irregular data. Most graph convolution methods leverage plain summation/average aggregation to avoid the discrepancy of responses from isomorphic graphs. However, such an extreme collapsing approach results in structural loss and signal entanglement of nodes, which further degrade the learning ability. In this paper, we propose a simple yet effective Graph Deformer Network (GDN) to fulfill anisotropic convolution filtering on graphs, analogous to the standard convolution operation on images. Local neighborhood subgraphs (acting like receptive fields) with different structures are deformed into a unified virtual space, coordinated by several anchor nodes. In the deformation process, we transfer components of the nodes therein into affinitive anchors by learning their correlations, and build a multi-granularity feature space calibrated with the anchors. Anisotropic convolutional kernels can then be applied over the anchor-coordinated space to encode local variations of receptive fields. By parameterizing anchors and stacking coarsening layers, we build a graph deformer network in an end-to-end fashion. Theoretical analysis indicates its connection to previous work and shows a promising property for graph isomorphism testing. Extensive experiments on widely used datasets validate the effectiveness of GDN in graph and node classification.

AAAI Conference 2021 Conference Paper

Graph Game Embedding

  • Xiaobin Hong
  • Tong Zhang
  • Zhen Cui
  • Yuge Huang
  • Pengcheng Shen
  • Shaoxin Li
  • Jian Yang

Graph embedding aims to encode nodes/edges into low-dimensional continuous features, and has become a crucial tool for graph analysis, including graph/node classification, link prediction, etc. In this paper, we propose a novel graph learning framework, named graph game embedding, to learn discriminative node representations as well as encode graph structures. Inspired by the spirit of game learning, node embedding is converted into the selection/searching process of player strategies, where each node corresponds to one player and each edge corresponds to the interaction of two players. Then, a utility function, which theoretically satisfies the Nash equilibrium, is defined to measure the benefit/loss of players during graph evolution. Furthermore, a collaboration and competition mechanism is introduced to increase the discriminant learning ability. Under this graph game embedding framework, considering different interaction manners of nodes, we propose two specific models, named paired game embedding for paired nodes and group game embedding for group interaction. Compared with existing graph embedding methods, our algorithm possesses two advantages: (1) the designed utility function ensures stable graph evolution with theoretical convergence and Nash equilibrium satisfaction; (2) the introduced collaboration and competition mechanism endows the graph game embedding framework with discriminative feature learning ability by guiding each node to learn an optimal strategy distinguished from the others. We test the proposed method on three public datasets of citation networks, and the experimental results verify the effectiveness of our method.
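The game-style view, each node iteratively improving a utility that rewards closeness to neighbours (collaboration) and penalizes closeness to non-neighbours (competition), can be sketched with best-response updates. This toy utility and its normalization are assumptions for illustration, not the paper's Nash-guaranteed function.

```python
import numpy as np

def best_response_embedding(A, d=2, lr=0.1, n_rounds=100, seed=0):
    """Best-response sketch of game-style node embedding (illustrative only).

    Each node is a player whose strategy is its (unit-norm) embedding; its
    utility rewards similarity to neighbours and penalizes similarity to
    non-neighbours. Nodes take turns improving their own utility.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    Z = rng.standard_normal((n, d))
    Z /= np.linalg.norm(Z, axis=1, keepdims=True)
    comp = 1.0 - A - np.eye(n)                # non-neighbour (competition) mask
    for _ in range(n_rounds):
        for i in range(n):                    # player i improves its utility
            grad = A[i] @ Z - comp[i] @ Z     # attract neighbours, repel others
            Z[i] = Z[i] + lr * grad
            Z[i] /= np.linalg.norm(Z[i]) + 1e-12  # keep strategies bounded
    return Z
```

On a graph with two disjoint edges, the two connected pairs end up with aligned embeddings while the pairs repel each other.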

JBHI Journal 2021 Journal Article

Multi-Label Fusion Collaborative Matrix Factorization for Predicting LncRNA-Disease Associations

  • Ming-Ming Gao
  • Zhen Cui
  • Ying-Lian Gao
  • Juan Wang
  • Jin-Xing Liu

Many experts and scholars have demonstrated that human diseases are related to lncRNAs, but only a few associations have been confirmed, and many unknown associations remain to be found. Since discovering associations experimentally takes a lot of time, finding an efficient way to predict the associations between lncRNAs and diseases is particularly important. In this paper, we propose a multi-label fusion collaborative matrix factorization (MLFCMF) approach for predicting lncRNA-disease associations (LDAs). Firstly, the lncRNA space and disease space are optimized by multi-label learning to enhance the intrinsic link between lncRNA and disease and to tap potential information. Multi-label learning can encode a variety of data information from the sample space. Secondly, to learn multi-label information in the data space, a fusion method is used to handle the relationships between multiple labels. More comprehensive information is obtained by weighing the effects of different labels. The addition of the Gaussian interaction profile (GIP) kernel increases the network similarity. Finally, the lncRNA-disease associations are predicted by collaborative matrix factorization. Ten-fold cross-validation is used to evaluate the MLFCMF method, and it finally obtains an AUC value of 0.8612. Ovarian cancer, colorectal cancer, and lung cancer are analyzed in detail in the simulation experiments. These results show that MLFCMF is an effective model for predicting lncRNA-disease associations.
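The Gaussian interaction profile (GIP) kernel mentioned above has a standard closed form: the similarity of two entities is a Gaussian of the distance between their interaction profiles, with the bandwidth normalized by the mean profile norm. A minimal sketch follows; the function name and the bandwidth convention are assumptions, not the paper's code.

```python
import numpy as np

def gip_kernel(A, gamma_prime=1.0):
    """Gaussian interaction profile (GIP) kernel similarity.

    A: binary association matrix; rows are interaction profiles
    (e.g. one row per lncRNA over all diseases).
    """
    norms = (A ** 2).sum(axis=1)              # squared norm of each profile
    gamma = gamma_prime / norms.mean()        # bandwidth normalized by mean norm
    # Squared distances between all pairs of profiles.
    sq_dist = norms[:, None] + norms[None, :] - 2 * A @ A.T
    return np.exp(-gamma * sq_dist)
```

Applied to the lncRNA rows it yields a lncRNA-lncRNA similarity network; applied to the transposed matrix, a disease-disease network.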

JBHI Journal 2021 Journal Article

WGRCMF: A Weighted Graph Regularized Collaborative Matrix Factorization Method for Predicting Novel LncRNA-Disease Associations

  • Jin-Xing Liu
  • Zhen Cui
  • Ying-Lian Gao
  • Xiang-Zhen Kong

In recent years, many human diseases have been determined to be associated with certain lncRNAs. Only a small percentage of all lncRNA-disease associations (LDAs) have been discovered by researchers, and discovering novel LDAs experimentally is time-consuming and costly. It is therefore crucial to propose a method that can effectively identify potential LDAs from the available datasets. Although some current methods can effectively predict potential LDAs, the prediction accuracy needs to be improved, and few known associations are available. Moreover, there are notable errors in how these methods construct the network and the bipartite graph, which interfere with the final results. A weighted graph regularized collaborative matrix factorization (WGRCMF) method is proposed to predict novel LDAs. We introduce graph regularization terms into collaborative matrix factorization. Since manifold learning can recover low-dimensional manifold structures from high-dimensional sampled data, we can find low-dimensional manifolds in high-dimensional space. In addition, a weight matrix is introduced into the method, whose significance is to prevent unknown associations from contributing to the final prediction matrix. Finally, the prediction accuracy of this method is better than that of other methods. We implemented the corresponding simulation experiments on several cancer cases, and according to the experimental results, the proposed method is feasible and effective.
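The objective described above, weighted reconstruction plus graph regularization on both factor matrices, can be minimized by plain gradient descent. The sketch below is illustrative only: it uses generic gradient steps rather than the paper's update rules, and all hyperparameter values and the binary-weight convention are assumptions.

```python
import numpy as np

def wgrcmf_sketch(Y, W, Sl, Sd, k=2, lam=0.1, mu=0.1, lr=0.05, n_iters=500):
    """Gradient-descent sketch of weighted graph-regularized collaborative MF.

    Y:  (n, m) known lncRNA-disease association matrix.
    W:  (n, m) binary weight matrix (1 for trusted entries, 0 for unknown ones).
    Sl: (n, n) lncRNA similarity; Sd: (m, m) disease similarity.
    Minimizes ||W*(U V^T - Y)||^2 + lam (||U||^2 + ||V||^2)
              + mu (tr(U^T Ll U) + tr(V^T Ld V)), with Ll/Ld graph Laplacians.
    """
    rng = np.random.default_rng(0)
    n, m = Y.shape
    U = 0.1 * rng.standard_normal((n, k))
    V = 0.1 * rng.standard_normal((m, k))
    Ll = np.diag(Sl.sum(1)) - Sl              # Laplacian of lncRNA graph
    Ld = np.diag(Sd.sum(1)) - Sd              # Laplacian of disease graph
    for _ in range(n_iters):
        R = W * (U @ V.T - Y)                 # weighted residual
        gU = R @ V + lam * U + mu * (Ll @ U)
        gV = R.T @ U + lam * V + mu * (Ld @ V)
        U -= lr * gU
        V -= lr * gV
    return U @ V.T                            # completed association scores
```

The Laplacian terms pull the latent vectors of similar lncRNAs (and similar diseases) together, which is how unseen associations receive nonzero scores.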

AAAI Conference 2020 Conference Paper

Instance-Adaptive Graph for EEG Emotion Recognition

  • Tengfei Song
  • Suyuan Liu
  • Wenming Zheng
  • Yuan Zong
  • Zhen Cui

To tackle individual differences and characterize the dynamic relationships among different EEG regions for EEG emotion recognition, in this paper we propose a novel instance-adaptive graph method (IAG), which employs a more flexible way to construct graphic connections so as to produce different graphic representations determined by different input instances. To fit different EEG patterns, we employ an additional branch to characterize the intrinsic dynamic relationships between different EEG channels. To give a more precise graphic representation, we design the multi-level and multi-graph convolutional operation and the graph coarsening. Furthermore, we present a type of sparse graphic representation to extract more discriminative features. Experiments on two widely-used EEG emotion recognition datasets are conducted to evaluate the proposed model, and the experimental results show that our method achieves state-of-the-art performance.
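An instance-adaptive adjacency, recomputed from each input's channel features instead of being fixed in advance, can be sketched as a learned bilinear affinity followed by row-normalization; the parameterization `P` is a hypothetical stand-in for IAG's additional branch.

```python
import numpy as np

def instance_adaptive_graph(X, P):
    """Build a graph whose connections depend on the input instance itself.

    X: (c, d) per-channel EEG features of ONE instance; P: (d, d) learnable
    projection (hypothetical parameterization). Different instances yield
    different adjacency matrices, unlike a fixed pre-defined graph.
    """
    logits = X @ P @ X.T                      # pairwise channel affinities
    e = np.exp(logits - logits.max(1, keepdims=True))
    return e / e.sum(1, keepdims=True)        # row-normalized adjacency
```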

JBHI Journal 2020 Journal Article

LncRNA-Disease Associations Prediction Using Bipartite Local Model With Nearest Profile-Based Association Inferring

  • Zhen Cui
  • Jin-Xing Liu
  • Ying-Lian Gao
  • Rong Zhu
  • Sha-Sha Yuan

There is much evidence that long non-coding RNA (lncRNA) is associated with many diseases. However, it is time-consuming and expensive to identify meaningful lncRNA-disease associations (LDAs) through medical or biological experiments. Therefore, investigating how to identify more meaningful LDAs is necessary and is conducive to the prevention, diagnosis, and treatment of complex diseases. Considering the limitations of some current prediction models, a novel model based on a bipartite local model with nearest profile-based association inferring, BLM-NPAI, is developed for predicting LDAs. This model predicts novel LDAs from the lncRNA side and the disease side, respectively. More importantly, for lncRNAs and diseases without any known association, predictions can still be made from their nearest neighbors. Leave-one-out cross validation (LOOCV) and 5-fold cross validation are implemented to evaluate the performance of BLM-NPAI. Our model is superior to current advanced methods in most cases. In addition, to verify the validity and reliability of BLM-NPAI, three disease cases and three lncRNA cases are analyzed for further evaluation. Finally, these predicted novel LDAs are confirmed by using the LncRNA-disease database.
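The nearest-profile idea, borrowing the association profile of the most similar neighbour for an entity with no known associations, can be sketched as follows. This is an assumption-laden illustration from the lncRNA side only, not the BLM-NPAI implementation.

```python
import numpy as np

def nearest_profile_scores(A, S, i):
    """Nearest-profile association inference for a query lncRNA i.

    A: (n, m) known lncRNA-disease associations; S: (n, n) lncRNA similarity.
    Returns predicted association scores for lncRNA i, borrowed from its
    most similar neighbour and scaled by that similarity.
    """
    sims = S[i].copy()
    sims[i] = -np.inf                         # exclude the query itself
    j = int(np.argmax(sims))                  # nearest neighbour by similarity
    return S[i, j] * A[j]                     # weighted neighbour profile
```

The symmetric step, inferring from the disease side with a disease similarity matrix, works the same way on the transposed association matrix.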

AAAI Conference 2020 Conference Paper

Variational Pathway Reasoning for EEG Emotion Recognition

  • Tong Zhang
  • Zhen Cui
  • Chunyan Xu
  • Wenming Zheng
  • Jian Yang

Research on human emotion cognition revealed that connections and pathways exist between spatially-adjacent and functional-related areas during emotion expression (Adolphs 2002a; Bullmore and Sporns 2009). Deeply inspired by this mechanism, we propose a heuristic Variational Pathway Reasoning (VPR) method to deal with EEG-based emotion recognition. We introduce random walk to generate a large number of candidate pathways along electrodes. To encode each pathway, the dynamic sequence model is further used to learn between-electrode dependencies. The encoded pathways around each electrode are aggregated to produce a pseudo maximum-energy pathway, which consists of the most important pair-wise connections. To find those most salient connections, we propose a sparse variational scaling (SVS) module to learn scaling factors of pseudo pathways by using the Bayesian probabilistic process and sparsity constraint, where the former endows good generalization ability while the latter favors adaptive pathway selection. Finally, the salient pathways from those candidates are jointly decided by the pseudo pathways and scaling factors. Extensive experiments on EEG emotion recognition demonstrate that the proposed VPR is superior to those state-of-the-art methods, and could find some interesting pathways w.r.t. different emotions.
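Candidate pathway generation by random walk over the electrode adjacency can be sketched as below; the adjacency encoding and walk length are hypothetical, and the sketch omits VPR's sequence encoding and SVS scaling entirely.

```python
import random

def sample_pathways(adj, length=4, n_paths=3, seed=0):
    """Sample candidate electrode pathways via random walk.

    adj: dict mapping each electrode to its spatially-adjacent electrodes.
    Returns n_paths walks of the given length, each starting from a
    randomly-chosen electrode.
    """
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        node = rng.choice(sorted(adj))
        path = [node]
        for _ in range(length - 1):
            node = rng.choice(adj[node])      # step to a random neighbour
            path.append(node)
        paths.append(path)
    return paths
```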

AAAI Conference 2019 Conference Paper

Gaussian-Induced Convolution for Graphs

  • Jiatao Jiang
  • Zhen Cui
  • Chunyan Xu
  • Jian Yang

Learning representations on graphs plays a crucial role in numerous tasks of pattern recognition. Different from grid-shaped images/videos, on which local convolution kernels can be lattices, graphs are fully coordinate-free on vertices and edges. In this work, we propose a Gaussian-induced convolution (GIC) framework to conduct local convolution filtering on irregular graphs. Specifically, an edge-induced Gaussian mixture model is designed to encode variations of a subgraph region by integrating edge information into weighted Gaussian models, each of which implicitly characterizes one component of subgraph variations. In order to coarsen a graph, we derive a vertex-induced Gaussian mixture model to cluster vertices dynamically according to the connection of edges, which is approximately equivalent to the weighted graph cut. We conduct our multi-layer graph convolution network on several public datasets of graph classification. The extensive experiments demonstrate that our GIC is effective and can achieve state-of-the-art results.
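The vertex-induced clustering used for coarsening can be illustrated with a plain Gaussian soft assignment of vertices to mixture components; this fixed-covariance, fixed-means sketch is an assumption and omits the edge-induced weighting of GIC.

```python
import numpy as np

def gaussian_coarsen(X, centers, sigma=1.0):
    """Soft-cluster vertices with fixed Gaussian components (coarsening sketch).

    X: (n, d) vertex features; centers: (k, d) Gaussian means.
    Returns soft assignments (n, k) and the pooled coarse-graph features (k, d).
    """
    sq = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    resp = np.exp(-sq / (2 * sigma ** 2))     # unnormalized responsibilities
    resp /= resp.sum(1, keepdims=True)        # posterior over components
    pooled = (resp / resp.sum(0)).T @ X       # responsibility-weighted pooling
    return resp, pooled
```

Each mixture component becomes one vertex of the coarsened graph, carrying the responsibility-weighted average of its member vertices.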

IJCAI Conference 2018 Conference Paper

A Novel Neural Network Model based on Cerebral Hemispheric Asymmetry for EEG Emotion Recognition

  • Yang Li
  • Wenming Zheng
  • Zhen Cui
  • Tong Zhang
  • Yuan Zong

In this paper, we propose a novel neural network model, called the bi-hemispheres domain adversarial neural network (BiDANN), for EEG emotion recognition. BiDANN is motivated by the neuroscience finding that the emotional brain is asymmetric between the left and right hemispheres. The basic idea of BiDANN is to map the EEG feature data of the left and right hemispheres into discriminative feature spaces separately, in which the data representations can be classified easily. To further precisely predict the class labels of testing data, we narrow the distribution shift between training and testing data by using a global and two local domain discriminators, which work adversarially against the classifier to encourage domain-invariant data representations to emerge. After that, the classifier learned from labeled training data can be applied naturally to unlabeled testing data. We conduct two experiments to verify the performance of our BiDANN model on the SEED database. The experimental results show that the proposed model achieves state-of-the-art performance.
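Domain-adversarial training of the kind BiDANN builds on is usually implemented with a gradient reversal layer: identity in the forward pass, sign-flipped gradient in the backward pass, so the discriminator and the feature extractor pull in opposite directions. A minimal framework-agnostic sketch follows; the class and its `lam` scaling are illustrative, not BiDANN's code.

```python
import numpy as np

class GradientReversal:
    """Gradient reversal layer used in domain-adversarial (DANN-style) training.

    The forward pass is the identity; the backward pass flips the gradient
    sign and scales it, so the feature extractor is pushed toward domain-
    invariant features while the domain discriminator still learns to
    separate domains.
    """
    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        return x                              # identity in the forward pass

    def backward(self, grad_output):
        return -self.lam * grad_output        # reversed, scaled gradient
```

In an autograd framework the same behavior is obtained by registering this pair as a custom function between the feature extractor and each discriminator.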

AAAI Conference 2018 Conference Paper

Spatio-Temporal Graph Convolution for Skeleton Based Action Recognition

  • Chaolong Li
  • Zhen Cui
  • Wenming Zheng
  • Chunyan Xu
  • Jian Yang

Variations of human body skeletons may be considered dynamic graphs, a generic data representation for numerous real-world applications. In this paper, we propose a spatio-temporal graph convolution (STGC) approach that assembles the success of local convolutional filtering with the sequence-learning ability of autoregressive moving average models. To encode dynamic graphs, the constructed multi-scale local graph convolution filters, consisting of matrices of local receptive fields and signal mappings, are recursively performed on structured graph data of the temporal and spatial domains. The proposed model is generic and principled, as it can be generalized into other dynamic models. We theoretically prove the stability of STGC and provide an upper bound on the signal transformation to be learned. Further, the proposed recursive model can be stacked into a multi-layer architecture. To evaluate our model, we conduct extensive experiments on four benchmark skeleton-based action datasets, including the large-scale challenging NTU RGB+D. The experimental results demonstrate the effectiveness of our proposed model and its improvement over the state of the art.
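A single local graph-convolution step, normalized neighborhood aggregation followed by a signal mapping, can be sketched as below. This is the generic symmetric-normalized form, not STGC's multi-scale recursive filters.

```python
import numpy as np

def graph_conv(X, A, W):
    """One local graph-convolution step: normalized aggregation then mapping.

    X: (n, d) node signals; A: (n, n) adjacency; W: (d, k) signal mapping.
    Uses the symmetric normalization D^{-1/2}(A+I)D^{-1/2} with self-loops.
    """
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(1))
    A_norm = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]
    return A_norm @ X @ W
```

Applying such a step per frame and feeding the outputs into a recursive temporal model gives the general shape of a spatio-temporal graph layer.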

NeurIPS Conference 2014 Conference Paper

Generalized Unsupervised Manifold Alignment

  • Zhen Cui
  • Hong Chang
  • Shiguang Shan
  • Xilin Chen

In this paper, we propose a generalized Unsupervised Manifold Alignment (GUMA) method to build connections between different but correlated datasets without any known correspondences. Based on the assumption that datasets of the same theme usually have similar manifold structures, GUMA is formulated as an explicit integer optimization problem considering the structure matching and preserving criteria, as well as the feature comparability of the corresponding points in the mutual embedding space. The main benefits of this model include: (1) simultaneous discovery and alignment of manifold structures; (2) fully unsupervised matching without any pre-specified correspondences; (3) efficient iterative alignment without enumerating all permutations. Experimental results on dataset matching and real-world applications demonstrate the effectiveness and practicability of our manifold alignment method.
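The structure-matching criterion at the heart of such alignment can be made concrete by brute force on tiny datasets: find the permutation that best preserves pairwise distances. GUMA's efficient iterative solver exists precisely to avoid this enumeration; the sketch below only illustrates the objective.

```python
import numpy as np
from itertools import permutations

def align_manifolds(Dx, Dy):
    """Brute-force structure matching for tiny datasets (illustrative only).

    Dx, Dy: pairwise-distance matrices of two datasets of equal size.
    Finds the reordering of Y that best preserves pairwise structure,
    i.e. minimizes ||Dx - P Dy P^T||_F over permutation matrices P.
    """
    n = Dx.shape[0]
    best, best_perm = np.inf, None
    for perm in permutations(range(n)):
        p = list(perm)
        cost = np.linalg.norm(Dx - Dy[np.ix_(p, p)])
        if cost < best:
            best, best_perm = cost, p
    return best_perm, best
```

Enumerating all n! permutations is feasible only for a handful of points, which is why scalable methods relax or iterate over the matching instead.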