Author name cluster

Huiyuan Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

11 papers

2 author rows

ICLR Conference 2025 Conference Paper

A Large-scale Training Paradigm for Graph Generative Models

Yu Wang 0160
Ryan A. Rossi
Namyong Park 0001
Huiyuan Chen
Nesreen K. Ahmed
Puja Trivedi
Franck Dernoncourt
Danai Koutra

Large Generative Models (LGMs) such as GPT, Stable Diffusion, Sora, and Suno are trained on a huge amount of texts, images, videos, and audio that are extremely diverse from numerous domains. This large-scale training paradigm on diverse well-curated data enhances the creativity and diversity of the generated content. However, all previous graph-generative models (e.g., GraphRNN, MDVAE, MoFlow, GDSS, and DiGress) have been trained only on one dataset each time, which cannot replicate the revolutionary success achieved by LGMs in other fields. To remedy this crucial gap, we propose a large-scale training paradigm that uses a large corpus of graphs (over 5000 graphs) from 13 domains, leading to the development of large graph generative models (LGGMs). We empirically demonstrate that the pre-trained LGGMs have superior zero-shot generative capability to existing graph generative models. Furthermore, our pre-trained LGGMs can be easily fine-tuned with graphs from target domains and demonstrate even better performance than those directly trained from scratch, behaving as a solid starting point for real-world customization. Inspired by Stable Diffusion, we further equip LGGMs with the Text-to-Graph generation capability, such as providing the description of the network name and domain (i.e., "The power-1138-bus graph represents a network of buses in a power distribution system.") and network statistics (i.e., "The graph has a low average degree, suitable for modeling social media interactions."). This Text-to-Graph capability integrates the extensive world knowledge in the underlying language model, offering users fine-grained control of the generated graphs. We release the code, the model checkpoint, and the datasets at https://github.com/KINDLab-Fly/LGGM.

Details

ICML Conference 2025 Conference Paper

Understanding and Mitigating Memorization in Diffusion Models for Tabular Data

Zhengyu Fang
Zhimeng Jiang
Huiyuan Chen
Xiao Li
Jing Li 0002

Tabular data generation has attracted significant research interest in recent years, with the tabular diffusion models greatly improving the quality of synthetic data. However, while memorization—where models inadvertently replicate exact or near-identical training data—has been thoroughly investigated in image and text generation, its effects on tabular data remain largely unexplored. In this paper, we conduct the first comprehensive investigation of memorization phenomena in diffusion models for tabular data. Our empirical analysis reveals that memorization appears in tabular diffusion models and increases with larger training epochs. We further examine the influence of factors such as dataset sizes, feature dimensions, and different diffusion models on memorization. Additionally, we provide a theoretical explanation for why memorization occurs in tabular diffusion models. To address this issue, we propose TabCutMix, a simple yet effective data augmentation technique that exchanges randomly selected feature segments between random same-class training sample pairs. Building upon this, we introduce TabCutMixPlus, an enhanced method that clusters features based on feature correlations and ensures that features within the same cluster are exchanged together during augmentation. This clustering mechanism mitigates out-of-distribution (OOD) generation issues by maintaining feature coherence. Experimental results across various datasets and diffusion models demonstrate that TabCutMix effectively mitigates memorization while maintaining high-quality data generation. Our code is available at https: //github. com/fangzy96/TabCutMix.

Details

NeurIPS Conference 2024 Conference Paper

Discrete-state Continuous-time Diffusion for Graph Generation

Zhe Xu
Ruizhong Qiu
Yuzhong Chen
Huiyuan Chen
Xiran Fan
Menghai Pan
Zhichen Zeng
Mahashweta Das

Graph is a prevalent discrete data structure, whose generation has wide applications such as drug discovery and circuit design. Diffusion generative models, as an emerging research focus, have been applied to graph generation tasks. Overall, according to the space of states and time steps, diffusion generative models can be categorized into discrete-/continuous-state discrete-/continuous-time fashions. In this paper, we formulate the graph diffusion generation in a discrete-state continuous-time setting, which has never been studied in previous graph diffusion models. The rationale of such a formulation is to preserve the discrete nature of graph-structured data and meanwhile provide flexible sampling trade-offs between sample quality and efficiency. Analysis shows that our training objective is closely related to the generation quality and our proposed generation framework enjoys ideal invariant/equivariant properties concerning the permutation of node ordering. Our proposed model shows competitive empirical performance against other state-of-the-art graph generation solutions on various benchmarks while at the same time can flexibly trade off the generation quality and efficiency in the sampling phase.

PDF Details DOI

AAAI Conference 2024 Conference Paper

Fairness without Demographics through Shared Latent Space-Based Debiasing

Rashidul Islam
Huiyuan Chen
Yiwei Cai

Ensuring fairness in machine learning (ML) is crucial, particularly in applications that impact diverse populations. The majority of existing works heavily rely on the availability of protected features like race and gender. However, practical challenges such as privacy concerns and regulatory restrictions often prohibit the use of this data, limiting the scope of traditional fairness research. To address this, we introduce a Shared Latent Space-based Debiasing (SLSD) method that transforms data from both the target domain, which lacks protected features, and a separate source domain, which contains these features, into correlated latent representations. This allows for joint training of a cross-domain protected group estimator on the representations. We then debias the downstream ML model with an adversarial learning technique that leverages the group estimator. We also present a relaxed variant of SLSD, the R-SLSD, that occasionally accesses a small subset of protected features from the target domain during its training phase. Our extensive experiments on benchmark datasets demonstrate that our methods consistently outperform existing state-of-the-art models in standard group fairness metrics.

PDF Details DOI

ICML Conference 2024 Conference Paper

LLM Maybe LongLM: SelfExtend LLM Context Window Without Tuning

Hongye Jin
Xiaotian Han
Jingfeng Yang 0001
Zhimeng Jiang
Zirui Liu 0001
Chia-Yuan Chang 0002
Huiyuan Chen
Xia Hu 0001

It is well known that LLMs cannot generalize well to long contexts whose lengths are larger than the training sequence length. This poses challenges when employing LLMs for processing long input sequences during inference. In this work, we argue that LLMs themselves have inherent capabilities to handles s long contexts without fine-tuning. To achieve this goal, we propose SelfExtend to extend the context window of LLMs by constructing bi-level attention information: the grouped attention and the neighbor attention. The grouped attention captures the dependencies among tokens that are far apart, while neighbor attention captures dependencies among adjacent tokens within a specified range. The two-level attentions are computed based on the original model’s self-attention mechanism during inference. With minor code modification, our SelfExtend can effortlessly extend existing LLMs’ context window without any fine-tuning. We conduct comprehensive experiments on multiple benchmarks and the results show that our SelfExtend can effectively extend existing LLMs’ context window length.

Details

ICML Conference 2024 Conference Paper

TabLog: Test-Time Adaptation for Tabular Data Using Logic Rules

Weijieying Ren
Xiaoting Li 0001
Huiyuan Chen
Vineeth Rakesh
Zhuoyi Wang
Mahashweta Das
Vasant G. Honavar

We consider the problem of test-time adaptation of predictive models trained on tabular data. Effective solution of this problem requires adaptation of predictive models trained on the source domain to a target domain, using only unlabeled target domain data, without access to source domain data. Existing test-time adaptation methods for tabular data have difficulty coping with the heterogeneous features and their complex dependencies inherent in tabular data. To overcome these limitations, we consider test-time adaptation in the setting wherein the logical structure of the rules is assumed to remain invariant despite distribution shift between source and target domains whereas the numerical parameters associated with the rules and the weights assigned to them can vary to accommodate distribution shift. TabLog discretizes numerical features, models dependencies between heterogeneous features, introduces a novel contrastive loss for coping with distribution shift, and presents an end-to-end framework for efficient training and test-time adaptation by taking advantage of a logical neural network representation of a rule ensemble. We present results of experiments using several benchmark data sets that demonstrate TabLog is competitive with or improves upon the state-of-the-art methods for test-time adaptation of predictive models trained on tabular data. Our code is available at https: //github. com/WeijieyingRen/TabLog.

Details

NeurIPS Conference 2023 Conference Paper

From Trainable Negative Depth to Edge Heterophily in Graphs

Yuchen Yan
Yuzhong Chen
Huiyuan Chen
Minghua Xu
Mahashweta Das
Hao Yang
Hanghang Tong

Finding the proper depth $d$ of a graph convolutional network (GCN) that provides strong representation ability has drawn significant attention, yet nonetheless largely remains an open problem for the graph learning community. Although noteworthy progress has been made, the depth or the number of layers of a corresponding GCN is realized by a series of graph convolution operations, which naturally makes $d$ a positive integer ($d \in \mathbb{N}+$). An interesting question is whether breaking the constraint of $\mathbb{N}+$ by making $d$ a real number ($d \in \mathbb{R}$) can bring new insights into graph learning mechanisms. In this work, by redefining GCN's depth $d$ as a trainable parameter continuously adjustable within $(-\infty, +\infty)$, we open a new door of controlling its signal processing capability to model graph homophily/heterophily (nodes with similar/dissimilar labels/attributes tend to be inter-connected). A simple and powerful GCN model TEDGCN, is proposed to retain the simplicity of GCN and meanwhile automatically search for the optimal $d$ without the prior knowledge regarding whether the input graph is homophilic or heterophilic. Negative-valued $d$ intrinsically enables high-pass frequency filtering functionality via augmented topology for graph heterophily. Extensive experiments demonstrate the superiority of TEDGCN on node classification tasks for a variety of homophilic and heterophilic graphs.

PDF Details

ICLR Conference 2023 Conference Paper

Interpretable Debiasing of Vectorized Language Representations with Iterative Orthogonalization

Prince Osei Aboagye
Yan Zheng 0001
Jack Shunn
Chin-Chia Michael Yeh
Junpeng Wang 0001
Zhongfang Zhuang
Huiyuan Chen
Liang Wang 0047

We propose a new mechanism to augment a word vector embedding representation that offers improved bias removal while retaining the key information—resulting in improved interpretability of the representation. Rather than removing the information associated with a concept that may induce bias, our proposed method identifies two concept subspaces and makes them orthogonal. The resulting representation has these two concepts uncorrelated. Moreover, because they are orthogonal, one can simply apply a rotation on the basis of the representation so that the resulting subspace corresponds with coordinates. This explicit encoding of concepts to coordinates works because they have been made fully orthogonal, which previous approaches do not achieve. Furthermore, we show that this can be extended to multiple subspaces. As a result, one can choose a subset of concepts to be represented transparently and explicitly, while the others are retained in the mixed but extremely expressive format of the representation.

Details

IJCAI Conference 2023 Conference Paper

Probabilistic Masked Attention Networks for Explainable Sequential Recommendation

Huiyuan Chen
Kaixiong Zhou
Zhimeng Jiang
Chin-Chia Michael Yeh
Xiaoting Li
Menghai Pan
Yan Zheng
Xia Hu

Transformer-based models are powerful for modeling temporal dynamics of user preference in sequential recommendation. Most of the variants adopt the Softmax transformation in the self-attention layers to generate dense attention probabilities. However, real-world item sequences are often noisy, containing a mixture of true-positive and false-positive interactions. Such dense attentions inevitably assign probability mass to noisy or irrelevant items, leading to sub-optimal performance and poor explainability. Here we propose a Probabilistic Masked Attention Network (PMAN) to identify the sparse pattern of attentions, which is more desirable for pruning noisy items in sequential recommendation. Specifically, we employ a probabilistic mask to achieve sparse attentions under a constrained optimization framework. As such, PMAN allows to select which information is critical to be retained or dropped in a data-driven fashion. Experimental studies on real-world benchmark datasets show that PMAN is able to improve the performance of Transformers significantly.

PDF Details DOI

IJCAI Conference 2020 Conference Paper

Learning Data-Driven Drug-Target-Disease Interaction via Neural Tensor Network

Huiyuan Chen
Jing Li

Precise medicine recommendations provide more effective treatments and cause fewer drug side effects. A key step is to understand the mechanistic relationships among drugs, targets, and diseases. Tensor-based models have the ability to explore relationships of drug-target-disease based on large amount of labeled data. However, existing tensor models fail to capture complex nonlinear dependencies among tensor data. In addition, rich medical knowledge are far less studied, which may lead to unsatisfied results. Here we propose a Neural Tensor Network (NeurTN) to assist personalized medicine treatments. NeurTN seamlessly combines tensor algebra and deep neural networks, which offers a more powerful way to capture the nonlinear relationships among drugs, targets, and diseases. To leverage medical knowledge, we augment NeurTN with geometric neural networks to capture the structural information of both drugs’ chemical structures and targets’ sequences. Extensive experiments on real-world datasets demonstrate the effectiveness of the NeurTN model.

PDF Details DOI

IJCAI Conference 2020 Conference Paper

Neural Tensor Model for Learning Multi-Aspect Factors in Recommender Systems

Huiyuan Chen
Jing Li

Recommender systems often involve multi-aspect factors. For example, when shopping for shoes online, consumers usually look through their images, ratings, and product's reviews before making their decisions. To learn multi-aspect factors, many context-aware models have been developed based on tensor factorizations. However, existing models assume multilinear structures in the tensor data, thus failing to capture nonlinear feature interactions. To fill this gap, we propose a novel nonlinear tensor machine, which combines deep neural networks and tensor algebra to capture nonlinear interactions among multi-aspect factors. We further consider adversarial learning to assist the training of our model. Extensive experiments demonstrate the effectiveness of the proposed model.

PDF Details DOI