Arrow Research search

Author name cluster

Mingxuan Ju

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers
2 author rows

Possible papers

12

NeurIPS 2025 Conference Paper

A Pre-training Framework for Relational Data with Information-theoretic Principles

  • Quang Truong
  • Zhikai Chen
  • Mingxuan Ju
  • Tong Zhao
  • Neil Shah
  • Jiliang Tang

Relational databases underpin critical infrastructure across a wide range of domains, yet the design of generalizable pre-training strategies for learning from relational databases remains an open challenge due to task heterogeneity. Specifically, there exist many possible downstream tasks, as tasks are defined based on relational schema graphs, temporal dependencies, and SQL-defined label logics. An effective pre-training framework is desired to take these factors into account in order to obtain task-aware representations. By incorporating knowledge of the underlying distribution that drives label generation, downstream tasks can benefit from relevant side-channel information. To bridge this gap, we introduce Task Vector Estimation (TVE), a novel pre-training framework that constructs predictive supervisory signals via set-based aggregation over schema traversal graphs, explicitly modeling next-window relational dynamics. We formalize our approach through an information-theoretic lens, demonstrating that task-informed representations retain more relevant signals than those obtained without task priors. Extensive experiments on the RelBench benchmark show that TVE consistently outperforms traditional pre-training baselines. Our findings advocate for pre-training objectives that encode task heterogeneity and temporal structure as design principles for predictive modeling on relational databases. Our code is publicly available at https://github.com/quang-truong/task-vector-estimation.

ICML 2025 Conference Paper

Learning Along the Arrow of Time: Hyperbolic Geometry for Backward-Compatible Representation Learning

  • Ngoc Bui
  • Menglin Yang 0001
  • Runjin Chen
  • Leonardo Neves
  • Mingxuan Ju
  • Rex Ying
  • Neil Shah
  • Tong Zhao 0003

Backward-compatible representation learning enables updated models to integrate seamlessly with existing ones, avoiding the need to reprocess stored data. Despite recent advances, existing compatibility approaches in Euclidean space neglect the uncertainty in the old embedding models and force the new model to replicate outdated representations regardless of their quality, thereby hindering the learning process. In this paper, we switch perspectives to hyperbolic geometry, where we treat time as a natural axis for capturing a model's confidence and evolution. By lifting embeddings into hyperbolic space and constraining updated embeddings to lie within the entailment cone of the old ones, we maintain generational consistency across models while accounting for uncertainties in the representations. To further enhance compatibility, we introduce a robust contrastive alignment loss that dynamically adjusts alignment weights based on the uncertainty of the old embeddings. Experiments validate the superiority of the proposed method in achieving compatibility, paving the way for more resilient and adaptable machine learning systems.

TMLR 2025 Journal Article

Node Duplication Improves Cold-start Link Prediction

  • Zhichun Guo
  • Tong Zhao
  • Yozen Liu
  • Kaiwen Dong
  • William Shiao
  • Mingxuan Ju
  • Neil Shah
  • Nitesh V Chawla

Graph Neural Networks (GNNs) are prominent in graph machine learning and have shown state-of-the-art performance in Link Prediction (LP) tasks. Nonetheless, recent studies show that GNNs struggle to produce good results on low-degree nodes despite their overall strong performance. In practical applications of LP, like recommendation systems, improving performance on low-degree nodes is critical, as it amounts to tackling the cold-start problem of improving the experiences of users with few observed interactions. In this paper, we investigate improving GNNs' LP performance on low-degree nodes while preserving their performance on high-degree nodes and propose a simple yet surprisingly effective augmentation technique called NodeDup. Specifically, NodeDup duplicates low-degree nodes and creates links between nodes and their own duplicates before following the standard supervised LP training scheme. By leveraging a "multi-view" perspective for low-degree nodes, NodeDup shows significant LP performance improvements on low-degree nodes without compromising any performance on high-degree nodes. Additionally, as a plug-and-play augmentation module, NodeDup can be easily applied on existing GNNs with very light computational cost. Extensive experiments show that NodeDup achieves 38.49%, 13.34%, and 6.76% relative improvements on isolated, low-degree, and warm nodes, respectively, on average across all datasets compared to GNNs and the existing cold-start methods.
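The core NodeDup augmentation described in the abstract — duplicate each low-degree node and link it to its own copy before standard LP training — can be sketched as a small preprocessing step. This is a hedged illustration on an undirected edge list, not the authors' implementation; the function name, degree threshold, and data representation are all assumptions:

```python
from collections import Counter

def node_dup(num_nodes, edges, degree_threshold=2):
    """Duplicate nodes whose degree falls below a threshold and
    link each duplicated node to its duplicate (illustrative sketch)."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    new_edges = list(edges)
    duplicates = {}
    next_id = num_nodes  # fresh ids for the duplicate nodes
    for node in range(num_nodes):
        if degree[node] < degree_threshold:
            duplicates[node] = next_id
            # In practice the duplicate would inherit the node's features;
            # here we only record the new node-to-duplicate edge.
            new_edges.append((node, next_id))
            next_id += 1
    return next_id, new_edges, duplicates
```

After this step, the augmented edge list is fed to an unmodified supervised LP training loop, which is what makes the technique plug-and-play.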

ICML 2024 Conference Paper

From Coarse to Fine: Enable Comprehensive Graph Self-supervised Learning with Multi-granular Semantic Ensemble

  • Qianlong Wen
  • Mingxuan Ju
  • Zhongyu Ouyang
  • Chuxu Zhang
  • Yanfang Ye 0001

Self-supervised learning (SSL) has gained increasing attention in the graph learning community, owing to its capability of enabling powerful models pre-trained on large unlabeled graphs for general purposes, facilitating quick adaptation to specific domains. Though promising, existing graph SSL frameworks often struggle to capture both high-level abstract features and fine-grained features simultaneously, leading to sub-optimal generalization abilities across different downstream tasks. To bridge this gap, we present Multi-granularity Graph Semantic Ensemble via Knowledge Distillation, namely MGSE, a plug-and-play graph knowledge distillation framework that can be applied to any existing graph SSL framework to enhance its performance by incorporating the concept of multi-granularity. Specifically, MGSE captures multi-granular knowledge by employing multiple student models to learn from a single teacher model, conditioned by probability distributions with different granularities. We apply it to six state-of-the-art graph SSL frameworks and evaluate their performances over multiple graph datasets across different domains; the experimental results show that MGSE can consistently boost the performance of these existing graph SSL frameworks with up to 9.2% improvement.

ICLR 2023 Conference Paper

Chasing All-Round Graph Representation Robustness: Model, Training, and Optimization

  • Chunhui Zhang
  • Yijun Tian 0001
  • Mingxuan Ju
  • Zheyuan Liu 0010
  • Yanfang Ye 0001
  • Nitesh V. Chawla
  • Chuxu Zhang

Graph Neural Networks (GNNs) have achieved state-of-the-art results on a variety of graph learning tasks; however, it has been demonstrated that they are vulnerable to adversarial attacks, raising serious security concerns. Many studies have been developed to train GNNs in a noisy environment and increase their robustness against adversarial attacks. However, existing methods have not uncovered a principled difficulty: the convoluted mixture distribution between clean and attacked data samples, which leads to sub-optimal model design and limits their frameworks' robustness. In this work, we begin by identifying the root cause of the mixture distribution; then, to tackle it, we propose a novel method, GAME (Graph Adversarial Mixture of Experts), to enlarge the model capacity and enrich the representation diversity of adversarial samples, from the three perspectives of model, training, and optimization. Specifically, we first propose a plug-and-play GAME layer that can be easily incorporated into any GNNs and enhance their adversarial learning capabilities. Second, we design a decoupling-based graph adversarial training in which the component of the model used to generate adversarial graphs is separated from the component used to update weights. Third, we introduce a graph diversity regularization that enables the model to learn diverse representations and further improves model performance. Extensive experiments demonstrate the effectiveness and advantages of GAME over the state-of-the-art adversarial training methods across various datasets given different attacks.

ICLR 2023 Conference Paper

Generate rather than Retrieve: Large Language Models are Strong Context Generators

  • Wenhao Yu 0002
  • Dan Iter
  • Shuohang Wang
  • Yichong Xu
  • Mingxuan Ju
  • Soumya Sanyal 0001
  • Chenguang Zhu 0001
  • Michael Zeng 0001

Knowledge-intensive tasks, such as open-domain question answering (QA), require access to a large amount of world or domain knowledge. A common approach for knowledge-intensive tasks is to employ a retrieve-then-read pipeline that first retrieves a handful of relevant contextual documents from an external corpus such as Wikipedia and then predicts an answer conditioned on the retrieved documents. In this paper, we present a novel perspective for solving knowledge-intensive tasks by replacing document retrievers with large language model generators. We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer. Furthermore, we propose a novel clustering-based prompting method that selects distinct prompts, in order to generate diverse documents that cover different perspectives, leading to better recall over acceptable answers. We conduct extensive experiments on three different knowledge-intensive tasks, including open-domain QA, fact checking, and dialogue system. Notably, GenRead achieves 71.6 and 54.4 exact match scores on TriviaQA and WebQ, significantly outperforming the state-of-the-art retrieve-then-read pipeline DPR-FiD by +4.0 and +3.9, without retrieving any documents from any external knowledge source. Lastly, we demonstrate the model performance can be further improved by combining retrieval and generation. Our code and generated documents can be found at https://github.com/wyu97/GenRead.
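The generate-then-read control flow described in the abstract can be sketched abstractly. The `generate` and `read` callables below stand in for LLM calls and are purely illustrative assumptions; the paper's clustering-based prompt selection is elided:

```python
def generate_then_read(question, generate, read, num_documents=3):
    """Generate contextual documents for a question, then answer
    conditioned on them (skeleton of the generate-then-read idea)."""
    # Step 1: prompt a generator multiple times to obtain diverse documents
    # (the paper selects distinct prompts via clustering; here the seed
    # merely distinguishes the calls).
    documents = [generate(question, seed=i) for i in range(num_documents)]
    # Step 2: the reader produces the final answer from the generated docs.
    return read(question, documents)

# Toy stand-ins that demonstrate the control flow only.
fake_generate = lambda q, seed: f"doc-{seed} about {q}"
fake_read = lambda q, docs: f"answer({q}) from {len(docs)} docs"
```

The key contrast with retrieve-then-read is that step 1 queries a generator rather than an external corpus; step 2 is unchanged.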

NeurIPS 2023 Conference Paper

GraphPatcher: Mitigating Degree Bias for Graph Neural Networks via Test-time Augmentation

  • Mingxuan Ju
  • Tong Zhao
  • Wenhao Yu
  • Neil Shah
  • Yanfang Ye

Recent studies have shown that graph neural networks (GNNs) exhibit strong biases towards the node degree: they usually perform satisfactorily on high-degree nodes with rich neighbor information but struggle with low-degree nodes. Existing works tackle this problem by deriving either designated GNN architectures or training strategies specifically for low-degree nodes. Though effective, these approaches unintentionally create an artificial out-of-distribution scenario, where models mainly or even only observe low-degree nodes during the training, leading to a downgraded performance for high-degree nodes that GNNs originally perform well at. In light of this, we propose a test-time augmentation framework, namely GraphPatcher, to enhance test-time generalization of any GNNs on low-degree nodes. Specifically, GraphPatcher iteratively generates virtual nodes to patch artificially created low-degree nodes via corruptions, aiming at progressively reconstructing target GNN's predictions over a sequence of increasingly corrupted nodes. Through this scheme, GraphPatcher not only learns how to enhance low-degree nodes (when the neighborhoods are heavily corrupted) but also preserves the original superior performance of GNNs on high-degree nodes (when lightly corrupted). Additionally, GraphPatcher is model-agnostic and can also mitigate the degree bias for either self-supervised or supervised GNNs. Comprehensive experiments are conducted over seven benchmark datasets and GraphPatcher consistently enhances common GNNs' overall performance by up to 3.6% and low-degree performance by up to 6.5%, significantly outperforming state-of-the-art baselines. The source code is publicly available at https://github.com/jumxglhf/GraphPatcher.
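The "sequence of increasingly corrupted nodes" used for training above can be sketched as progressively dropping neighbors of a target node. This is a hedged illustration of the corruption schedule only (the function name and keep-fraction schedule are assumptions, not the paper's exact procedure):

```python
import random

def corruption_sequence(neighbors, num_steps, seed=0):
    """Return neighborhoods of a node from lightly to heavily corrupted,
    by keeping progressively fewer neighbors (illustrative sketch)."""
    rng = random.Random(seed)
    sequence = []
    pool = list(neighbors)
    for step in range(num_steps):
        # Keep a shrinking fraction of the neighborhood at each step.
        keep = max(1, len(neighbors) * (num_steps - step) // (num_steps + 1))
        rng.shuffle(pool)
        sequence.append(sorted(pool[:keep]))
    return sequence
```

During training, a GNN's prediction on each corrupted view would be patched with generated virtual nodes so that it matches the prediction on the uncorrupted node.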

AAAI 2023 Conference Paper

Let Graph Be the Go Board: Gradient-Free Node Injection Attack for Graph Neural Networks via Reinforcement Learning

  • Mingxuan Ju
  • Yujie Fan
  • Chuxu Zhang
  • Yanfang Ye

Graph Neural Networks (GNNs) have drawn significant attention over the years and have been broadly applied to essential applications requiring solid robustness or vigorous security standards, such as product recommendation and user behavior modeling. Under these scenarios, exploiting GNNs' vulnerabilities and further downgrading their performance becomes highly attractive to adversaries. Previous attackers mainly focus on structural perturbations or node injections to the existing graphs, guided by gradients from surrogate models. Although they deliver promising results, several limitations still exist. For the structural perturbation attack, adversaries need to manipulate the existing graph topology, which is impractical in most circumstances. For the node injection attack, though it is more practical, current approaches require training surrogate models to simulate a white-box setting, which results in significant performance degradation when the surrogate architecture diverges from the actual victim model. To bridge these gaps, in this paper, we study the problem of black-box node injection attack, without training a potentially misleading surrogate model. Specifically, we model the node injection attack as a Markov decision process and propose Gradient-free Graph Advantage Actor Critic, namely G2A2C, a reinforcement learning framework in the fashion of advantage actor critic. By directly querying the victim model, G2A2C learns to inject highly malicious nodes with extremely limited attacking budgets, while maintaining a similar node feature distribution. Through our comprehensive experiments over eight acknowledged benchmark datasets with different characteristics, we demonstrate the superior performance of our proposed G2A2C over the existing state-of-the-art attackers. Source code is publicly available at: https://github.com/jumxglhf/G2A2C.

ICLR 2023 Conference Paper

Multi-task Self-supervised Graph Neural Networks Enable Stronger Task Generalization

  • Mingxuan Ju
  • Tong Zhao 0003
  • Qianlong Wen
  • Wenhao Yu 0002
  • Neil Shah
  • Yanfang Ye 0001
  • Chuxu Zhang

Self-supervised learning (SSL) for graph neural networks (GNNs) has attracted increasing attention from the graph machine learning community in recent years, owing to its capability to learn performant node embeddings without costly label information. One weakness of conventional SSL frameworks for GNNs is that they learn through a single philosophy, such as mutual information maximization or generative reconstruction. When applied to various downstream tasks, these frameworks rarely perform equally well for every task, because one philosophy may not span the extensive knowledge required for all tasks. To enhance generalization across tasks, as an important first step forward in exploring fundamental graph models, we introduce PARETOGNN, a multi-task SSL framework for node representation learning over graphs. Specifically, PARETOGNN is self-supervised by multiple pretext tasks observing different philosophies. To reconcile these philosophies, we explore a multiple-gradient descent algorithm, such that PARETOGNN actively learns from every pretext task while minimizing potential conflicts. We conduct comprehensive experiments over four downstream tasks (i.e., node classification, node clustering, link prediction, and partition prediction), and our proposal achieves the best overall performance across tasks on 11 widely adopted benchmark datasets. Besides, we observe that learning from multiple philosophies enhances not only the task generalization but also the single-task performances, demonstrating that PARETOGNN achieves better task generalization via the disjoint yet complementary knowledge learned from different philosophies. Our code is publicly available at https://github.com/jumxglhf/ParetoGNN.

AAAI 2022 Conference Paper

Adaptive Kernel Graph Neural Network

  • Mingxuan Ju
  • Shifu Hou
  • Yujie Fan
  • Jianan Zhao
  • Yanfang Ye
  • Liang Zhao

Graph neural networks (GNNs) have demonstrated great success in representation learning for graph-structured data. The layer-wise graph convolution in GNNs is shown to be powerful at capturing graph topology. During this process, GNNs are usually guided by pre-defined kernels such as the Laplacian matrix, the adjacency matrix, or their variants. However, the adoption of pre-defined kernels may restrain the generality to different graphs: a mismatch between graph and kernel would entail sub-optimal performance. For example, GNNs that focus on low-frequency information may not achieve satisfactory performance when high-frequency information is significant for the graphs, and vice versa. To solve this problem, in this paper, we propose a novel framework, namely Adaptive Kernel Graph Neural Network (AKGNN), which, for the first time, learns to adapt to the optimal graph kernel in a unified manner. In the proposed AKGNN, we first design a data-driven graph kernel learning mechanism, which adaptively modulates the balance between all-pass and low-pass filters by modifying the maximal eigenvalue of the graph Laplacian. Through this process, AKGNN learns the optimal threshold between high- and low-frequency signals to relieve the generality problem. Later, we further reduce the number of parameters by a parameterization trick and enhance the expressive power by a global readout function. Extensive experiments are conducted on acknowledged benchmark datasets and promising results demonstrate the outstanding performance of our proposed AKGNN by comparison with state-of-the-art GNNs. The source code is publicly available at: https://github.com/jumxglhf/AKGNN.
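The balance between all-pass and low-pass filtering described above can be sketched as a convex-style blend governed by a modulated maximal eigenvalue. This is a rough illustration of the idea only: `lam_max` stands in for the learned eigenvalue, and the exact parameterization in the paper differs:

```python
import numpy as np

def adaptive_kernel(adj, lam_max):
    """Blend an all-pass (identity) and a low-pass (normalized adjacency)
    filter; lam_max plays the role of the modulated maximal eigenvalue.
    Illustrative sketch, not the paper's exact kernel."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
    a_norm = d_inv_sqrt @ adj @ d_inv_sqrt      # low-pass component
    n = adj.shape[0]
    w_all = 2.0 * (lam_max - 1.0) / lam_max     # weight on all-pass identity
    w_low = 2.0 / lam_max                       # weight on low-pass term
    return w_all * np.eye(n) + w_low * a_norm
```

At `lam_max = 1` the kernel reduces to a pure low-pass filter, and as `lam_max` grows it approaches an all-pass (identity-like) filter, which is the trade-off the learned eigenvalue modulates.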

AAAI 2021 Conference Paper

Disentangled Representation Learning in Heterogeneous Information Network for Large-scale Android Malware Detection in the COVID-19 Era and Beyond

  • Shifu Hou
  • Yujie Fan
  • Mingxuan Ju
  • Yanfang Ye
  • Wenqiang Wan
  • Kui Wang
  • Yinming Mei
  • Qi Xiong

In the fight against the COVID-19 pandemic, many social activities have moved online; society's overwhelming reliance on the complex cyberspace makes its security more important than ever. In this paper, we propose and develop an intelligent system named Dr. HIN to protect users against the evolving Android malware attacks in the COVID-19 era and beyond. In Dr. HIN, besides app content, we propose to consider higher-level semantics and social relations among apps, developers and mobile devices to comprehensively depict Android apps; and then we introduce a structured heterogeneous information network (HIN) to model the complex relations and exploit a meta-path guided strategy to learn node (i.e., app) representations from the HIN. As the representations of malware could be highly entangled with benign apps in the complex ecosystem of development, it poses a new challenge of learning the latent explanatory factors hidden in the HIN embeddings to detect the evolving malware. To address this challenge, we propose to integrate domain priors generated from different views (i.e., app content, app authorship, app installation) to devise an adversarial disentangler to separate the distinct, informative factors of variation hidden in the HIN embeddings for large-scale Android malware detection. This is the first attempt at disentangled representation learning on HIN data. Promising experimental results based on real sample collections from the security industry demonstrate the performance of Dr. HIN in evolving Android malware detection, by comparison with baselines and popular mobile security products.

JBHI 2020 Journal Article

α-Satellite: An AI-Driven System and Benchmark Datasets for Dynamic COVID-19 Risk Assessment in the United States

  • Yanfang Ye
  • Shifu Hou
  • Yujie Fan
  • Yiming Zhang
  • Yiyue Qian
  • Shiyu Sun
  • Qian Peng
  • Mingxuan Ju

The fast evolving and deadly outbreak of coronavirus disease (COVID-19) has posed grand challenges to human society. To slow the spread of virus infections and better respond for community mitigation, by advancing capabilities of artificial intelligence (AI) and leveraging the large-scale and up-to-date data generated from heterogeneous sources (e.g., disease related data, demographic, mobility and social media data), in this work, we propose and develop an AI-driven system (named α-Satellite), as an initial offering, to provide dynamic COVID-19 risk assessment in the United States. More specifically, given a point of interest (POI), the system will automatically provide risk indices associated with it in a hierarchical manner (e.g., state, county, POI) to enable people to select appropriate actions for protection while minimizing disruptions to daily life. To comprehensively evaluate our system for dynamic COVID-19 risk assessment, we first conduct a set of empirical studies; and then we validate it based on a real-world dataset consisting of 5,060 annotated POIs, on which it achieves an area under the curve (AUC) of 0.9202. As of June 18, 2020, α-Satellite has had 56,980 users. Based on the feedback from its large-scale users, we perform further analysis and have three key findings: i) people from more severe regions (i.e., with larger numbers of COVID-19 cases) have stronger interest in using our system to obtain actionable information; ii) users are more concerned about their nearby areas in terms of COVID-19 risks; iii) the user feedback about perceived COVID-19 risks of their query POIs indicates the challenge of balancing public safety concerns against the negative effects on society and the economy. Our system and generated datasets have been made publicly accessible via our website.