Arrow Research search

Author name cluster

Tianxin Wei

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

15 papers
2 author rows

Possible papers

TMLR Journal 2026 Journal Article

DiffKGW: Stealthy and Robust Diffusion Model Watermarking

  • Tianxin Wei
  • Ruizhong Qiu
  • Yifan Chen
  • Yunzhe Qi
  • Jiacheng Lin
  • Wenxuan Bao
  • Wenju Xu
  • Sreyashi Nag

Diffusion models are known for their remarkable ability to generate realistic images. However, ethical concerns, such as copyright protection and the generation of inappropriate content, pose significant challenges for their practical deployment. Recent work has proposed a flurry of watermarking techniques that inject artificial patterns into the initial latent representations of diffusion models, offering a promising solution to these issues. However, enforcing a specific pattern on selected elements can disrupt the Gaussian distribution of the initial latent representation. Inspired by watermarks for large language models (LLMs), we generalize the LLM KGW watermark to image diffusion models and propose DiffKGW, a stealthy probability-adjustment approach that preserves the Gaussian distribution of the initial latent representation. In addition, we dissect the design principles of state-of-the-art watermarking techniques and introduce a unified framework. We identify a set of dimensions that explain the manipulation enforced by watermarking methods, including the distribution of individual elements, the specification of watermark shapes within each channel, and the choice of channels for watermark embedding. Through empirical studies on regular text-to-image applications and the first systematic attempt at watermarking image-to-image diffusion models, we thoroughly verify the effectiveness of the proposed framework. Across all evaluated diffusion models, including Stable Diffusion, the approach derived from our framework not only preserves image quality but also outperforms existing methods in robustness against a wide range of attacks.
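
The paper's exact probability-adjustment scheme is not reproduced in this listing; the sketch below only illustrates the general idea behind distribution-preserving latent watermarks: keyed pseudorandom bits fix element signs, so the latent stays marginally Gaussian to anyone without the key. All names are hypothetical.

```python
import numpy as np

def embed_watermark(shape, key=0):
    """Draw an initial latent whose element magnitudes are half-normal and
    whose signs are fixed by a keyed pseudorandom bitstream. Because the
    bits are a fair pseudorandom coin, the latent remains marginally
    N(0, 1) to anyone without the key."""
    rng = np.random.default_rng(key)
    bits = rng.integers(0, 2, size=shape)        # keyed watermark bits
    mag = np.abs(rng.standard_normal(shape))     # half-normal magnitudes
    return np.where(bits == 1, mag, -mag)

def detect_watermark(latent, key=0):
    """Sign-agreement z-score against chance (0.5), a KGW-style statistic."""
    rng = np.random.default_rng(key)
    bits = rng.integers(0, 2, size=latent.shape)
    matches = np.sum((latent > 0) == (bits == 1))
    n = latent.size
    return (matches - n / 2) / np.sqrt(n / 4)
```

A real detector would first recover the latent from a generated image (e.g. via DDIM inversion) before computing the statistic; that step is omitted here.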

AAAI Conference 2026 Conference Paper

Panda: Test-Time Adaptation with Negative Data Augmentation

  • Ruxi Deng
  • Wenxuan Bao
  • Tianxin Wei
  • Jingrui He

Pretrained vision-language models exhibit strong zero-shot classification capabilities, but their predictions degrade significantly under common image corruptions. To improve robustness, many test-time adaptation (TTA) methods adopt positive data augmentation (PDA), which generates multiple views of each test sample to reduce prediction variance. However, these methods suffer from two key limitations. First, PDA introduces considerable computational overhead due to the large number of augmentations required per image. Second, PDA fails to mitigate prediction bias, where the model tends to predict certain classes disproportionately under corruption, as PDA operates on corrupted inputs and typically does not remove the corruption itself. To address these challenges, we propose Panda, a novel TTA method based on negative data augmentation (NDA). Unlike positive augmentations that preserve object semantics, Panda generates negative augmentations by disrupting semantic content. It divides images into patches and randomly assembles them from a shared patch pool. These negatively augmented images retain corruption-specific features while discarding object-relevant signals. We then subtract the mean feature of these negative samples from the original image feature, effectively suppressing corruption-related components while preserving class-relevant information. This mitigates prediction bias under distribution shifts. Importantly, Panda allows augmentation to be shared across samples within a batch, resulting in minimal computational overhead. Panda can be seamlessly integrated into existing test-time adaptation frameworks and substantially improves their robustness. Our experiments indicate that Panda delivers superior performance compared to PDA methods, and a wide range of TTA methods exhibit significantly enhanced performance when integrated with Panda.
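
A minimal sketch of the NDA idea on toy data, not Panda's actual CLIP pipeline: a frozen encoder is faked with simple image statistics, and the shared "corruption" is a brightness shift.

```python
import numpy as np

def patchify(img, p):
    """Split an H x W image into non-overlapping p x p patches."""
    H, W = img.shape
    return [img[i:i + p, j:j + p] for i in range(0, H, p) for j in range(0, W, p)]

def negative_augment(batch, p, n_aug, rng):
    """Negative augmentation: pool patches from the whole batch and
    reassemble random images. Object semantics are destroyed, but
    corruption-level statistics (shared by every patch) survive."""
    pool = [pt for img in batch for pt in patchify(img, p)]
    H, W = batch[0].shape
    negs = []
    for _ in range(n_aug):
        idx = rng.choice(len(pool), size=(H // p) * (W // p), replace=True)
        rows = [np.hstack([pool[k] for k in idx[r * (W // p):(r + 1) * (W // p)]])
                for r in range(H // p)]
        negs.append(np.vstack(rows))
    return negs

def feature(img):
    """Stand-in for a frozen vision encoder: simple image statistics."""
    return np.array([img.mean(), img.std()])

rng = np.random.default_rng(0)
# A shared "corruption": every test image is brightness-shifted by +2.
batch = [rng.standard_normal((8, 8)) + 2.0 for _ in range(16)]
negs = negative_augment(batch, p=2, n_aug=8, rng=rng)
neg_mean = np.mean([feature(n) for n in negs], axis=0)
# Subtracting the mean negative feature removes the corruption component.
debiased = [feature(x) - neg_mean for x in batch]
```

Because the negatives are built once from the shared patch pool, the same `neg_mean` debiases every sample in the batch, which is where the low overhead comes from.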

TMLR Journal 2026 Journal Article

Pave Your Own Path: Graph Gradual Domain Adaptation on Fused Gromov-Wasserstein Geodesics

  • Zhichen Zeng
  • Ruizhong Qiu
  • Wenxuan Bao
  • Tianxin Wei
  • Xiao Lin
  • Yuchen Yan
  • Tarek F. Abdelzaher
  • Jiawei Han

Graph neural networks, despite their impressive performance, are highly vulnerable to distribution shifts on graphs. Existing graph domain adaptation (graph DA) methods often implicitly assume a mild shift between source and target graphs, limiting their applicability to real-world scenarios with large shifts. Gradual domain adaptation (GDA) has emerged as a promising approach for addressing large shifts by gradually adapting the source model to the target domain via a path of unlabeled intermediate domains. Existing GDA methods exclusively focus on independent and identically distributed (IID) data with a predefined path, leaving their extension to non-IID graphs without a given path an open challenge. To bridge this gap, we present Gadget, the first GDA framework for non-IID graph data. First (theoretical foundation), the Fused Gromov-Wasserstein (FGW) distance is adopted as the domain discrepancy for non-IID graphs, based on which we derive an error bound for node-, edge-, and graph-level tasks, showing that the target domain error is proportional to the length of the path. Second (optimal path), guided by the error bound, we identify the FGW geodesic as the optimal path, which can be efficiently generated by our proposed algorithm. The generated path can be seamlessly integrated with existing graph DA methods to handle large shifts on graphs, improving state-of-the-art graph DA methods by up to 6.8% in accuracy on real-world datasets.
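
Gadget's FGW machinery is beyond a short sketch, but the gradual-adaptation premise the abstract builds on can be shown on 1-D toy data: self-training along a path of intermediate domains succeeds where direct adaptation fails. Everything below (domains, path, classifier) is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_domain(shift, n=200):
    """Two 1-D Gaussian classes whose means drift with the domain shift."""
    x = np.concatenate([rng.normal(-1 + shift, 0.3, n), rng.normal(1 + shift, 0.3, n)])
    y = np.concatenate([np.zeros(n, int), np.ones(n, int)])
    return x, y

def self_train(boundary, x):
    """One gradual-adaptation step: pseudo-label the unlabeled intermediate
    domain with the current model, then refit the boundary as the midpoint
    of the pseudo-class means."""
    pseudo = (x > boundary).astype(int)
    return (x[pseudo == 0].mean() + x[pseudo == 1].mean()) / 2

boundary = 0.0                                   # source model, shift = 0
for shift in np.arange(0.5, 4.01, 0.5):          # path of intermediate domains
    x, _ = make_domain(shift)
    boundary = self_train(boundary, x)

x_t, y_t = make_domain(4.0)                      # target domain, large shift
acc_gradual = float(np.mean((x_t > boundary).astype(int) == y_t))
acc_direct = float(np.mean((x_t > 0.0).astype(int) == y_t))
```

The whole point of the paper is that on graphs this path is not given; Gadget constructs it along the FGW geodesic instead of assuming it, as this toy does.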

ICML Conference 2025 Conference Paper

Breaking Silos: Adaptive Model Fusion Unlocks Better Time Series Forecasting

  • Zhining Liu 0002
  • Ze Yang
  • Xiao Lin 0016
  • Ruizhong Qiu
  • Tianxin Wei
  • Yada Zhu
  • Hendrik F. Hamann
  • Jingrui He

Time-series forecasting plays a critical role in many real-world applications. Although increasingly powerful models have been developed and achieved superior results on benchmark datasets, through a fine-grained sample-level inspection, we find that (i) no single model consistently outperforms others across different test samples, but instead (ii) each model excels in specific cases. These findings prompt us to explore how to adaptively leverage the distinct strengths of various forecasting models for different samples. We introduce TimeFuse, a framework for collective time-series forecasting with sample-level adaptive fusion of heterogeneous models. TimeFuse utilizes meta-features to characterize input time series and trains a learnable fusor to predict optimal model fusion weights for any given input. The fusor can leverage samples from diverse datasets for joint training, allowing it to adapt to a wide variety of temporal patterns and thus generalize to new inputs, even from unseen datasets. Extensive experiments demonstrate the effectiveness of TimeFuse in various long-/short-term forecasting tasks, achieving near-universal improvement over the state-of-the-art individual models. Code is available at https://github.com/ZhiningLiu1998/TimeFuse.
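
A minimal sketch of sample-level fusion with two toy base forecasters; the fusor here is a hand-set linear scorer rather than TimeFuse's learned one, and all features and weights are illustrative.

```python
import numpy as np

def model_trend(hist):
    """Base forecaster A: linear extrapolation of the last step."""
    return hist[-1] + (hist[-1] - hist[-2])

def model_naive(hist):
    """Base forecaster B: repeat the last value."""
    return hist[-1]

def meta_features(hist):
    """Sample-level meta-features: recent slope magnitude plus a bias term."""
    return np.array([abs(hist[-1] - hist[-2]), 1.0])

def fuse(hist, W):
    """The fusor maps meta-features to softmax fusion weights over the base
    models. W would be learned jointly across datasets; here it is hand-set."""
    scores = W @ meta_features(hist)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w[0] * model_trend(hist) + w[1] * model_naive(hist)

# Hand-set fusor: trust the trend model only when the recent slope is large.
W = np.array([[4.0, -2.0],
              [0.0,  0.0]])

trending = np.array([0.0, 1.0, 2.0, 3.0])
flat = np.array([1.0, 1.02, 0.98, 1.0])
```

On the trending series the fused forecast follows the extrapolation (close to 4.0); on the flat noisy series it stays near the last value, which is the per-sample adaptivity the abstract describes.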

NeurIPS Conference 2025 Conference Paper

CLIMB: Class-imbalanced Learning Benchmark on Tabular Data

  • Zhining Liu
  • Zihao Li
  • Ze Yang
  • Tianxin Wei
  • Jian Kang
  • Yada Zhu
  • Hendrik Hamann
  • Jingrui He

Class-imbalanced learning (CIL) on tabular data is important in many real-world applications where the minority class holds the critical but rare outcomes. In this paper, we present CLIMB, a comprehensive benchmark for class-imbalanced learning on tabular data. CLIMB includes 73 real-world datasets across diverse domains and imbalance levels, along with unified implementations of 29 representative CIL algorithms. Built on a high-quality open-source Python package with unified API designs, detailed documentation, and rigorous code quality controls, CLIMB supports easy implementation and comparison between different CIL algorithms. Through extensive experiments, we provide practical insights on method accuracy and efficiency, highlighting the limitations of naive rebalancing, the effectiveness of ensembles, and the importance of data quality. Our code, documentation, and examples are available at https://github.com/ZhiningLiu1998/imbalanced-ensemble.
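
This is not the CLIMB/imbalanced-ensemble API; it is a self-contained toy illustrating the "effectiveness of ensembles" finding, pitting an under-sampling ensemble against a naive imbalance-blind learner on synthetic 1-D data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Imbalanced 1-D toy data: 500 majority samples vs. 25 minority samples.
x = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(2.5, 1.0, 25)])
y = np.concatenate([np.zeros(500, int), np.ones(25, int)])

def fit_stump(x, y):
    """Base learner: the threshold minimizing plain error, which is
    imbalance-blind and therefore biased toward the majority class."""
    ts = np.unique(x)
    errs = [np.mean((x > t).astype(int) != y) for t in ts]
    t = ts[int(np.argmin(errs))]
    return lambda q: (q > t).astype(int)

def undersample_ensemble(x, y, n_estimators, rng):
    """Balanced-bagging-style CIL: each member sees all minority samples plus
    an equal-sized random subset of the majority, then members vote."""
    minority = np.where(y == 1)[0]
    majority = np.where(y == 0)[0]
    members = []
    for _ in range(n_estimators):
        sub = rng.choice(majority, size=len(minority), replace=False)
        idx = np.concatenate([minority, sub])
        members.append(fit_stump(x[idx], y[idx]))
    return lambda q: (np.mean([m(q) for m in members], axis=0) > 0.5).astype(int)

def balanced_acc(pred, y):
    return float(np.mean([np.mean(pred[y == c] == c) for c in (0, 1)]))

naive = fit_stump(x, y)
ensemble = undersample_ensemble(x, y, n_estimators=11, rng=rng)
```

The naive stump sacrifices minority recall to maximize overall accuracy; the balanced ensemble recovers it, which mirrors the benchmark's headline insight.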

ICML Conference 2025 Conference Paper

Graph4MM: Weaving Multimodal Learning with Structural Information

  • Xuying Ning
  • Dongqi Fu
  • Tianxin Wei
  • Wujiang Xu
  • Jingrui He

Real-world multimodal data usually exhibit complex structural relationships beyond traditional one-to-one mappings like image-caption pairs. Entities across modalities interact in intricate ways, with images and text forming diverse interconnections through contextual dependencies and co-references. Graphs provide powerful structural information for modeling intra-modal and inter-modal relationships. However, previous works fail to distinguish multi-hop neighbors and treat the graph as a standalone modality, which fragments the overall understanding. This limitation presents two key challenges in multimodal learning: (1) integrating structural information from multi-hop neighbors into foundational models, and (2) fusing modality-specific information in a principled manner. To address these challenges, we revisit the role of graphs in multimodal learning within the era of foundation models and propose Graph4MM, a graph-based multimodal learning framework. To be specific, we introduce Hop-Diffused Attention, which integrates multi-hop structural information into self-attention through causal masking and hop diffusion. Furthermore, we design MM-QFormer, a multi-mapping querying transformer for cross-modal fusion. Through theoretical and empirical analysis, we show that leveraging structures to integrate both intra- and inter-modal interactions improves multimodal understanding beyond treating them as a standalone modality. Experiments on both generative and discriminative tasks show that Graph4MM outperforms larger VLMs, LLMs, and multimodal graph baselines, achieving a 6.93% average improvement.
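
Hop-Diffused Attention itself combines hop diffusion with causal masking inside a pretrained backbone; the sketch below only shows the core mechanic of biasing self-attention logits by hop distance. The decay schedule and constants are hypothetical.

```python
import numpy as np
from collections import deque

def hop_distances(adj):
    """All-pairs hop distance by BFS over an adjacency matrix (nested lists)."""
    n = len(adj)
    dist = np.full((n, n), np.inf)
    for s in range(n):
        dist[s, s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in range(n):
                if adj[u][v] and np.isinf(dist[s, v]):
                    dist[s, v] = dist[s, u] + 1
                    q.append(v)
    return dist

def hop_diffused_attention(X, adj, decay=0.5, max_hop=2):
    """Self-attention with hop-aware logits: hop-h neighbors get a
    log(decay) * h bias, and nodes beyond max_hop are masked out."""
    d = hop_distances(adj)
    scores = X @ X.T / np.sqrt(X.shape[1])
    bias = np.where(d <= max_hop, np.log(decay) * np.minimum(d, max_hop), -np.inf)
    logits = scores + bias
    w = np.exp(logits - logits.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ X, w
```

With `max_hop=1` on a path graph, node 0 attends only to itself and its direct neighbor, so multi-hop structure is controlled explicitly rather than flattened away.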

UAI Conference 2025 Conference Paper

i²VAE: Interest Information Augmentation with Variational Regularizers for Cross-Domain Sequential Recommendation

  • Xuying Ning
  • Wujiang Xu
  • Tianxin Wei
  • Xiaolei Liu

Cross-Domain Sequential Recommendation (CDSR) leverages user behaviors across multiple domains to mitigate data sparsity and cold-start challenges in Single-Domain Sequential Recommendation. Existing methods primarily rely on shared users (overlapping users) to learn transferable interest representations. However, these approaches have limited information propagation, benefiting mainly overlapping users and those with rich interaction histories while neglecting non-overlapping (cold-start) and long-tailed users, who constitute the majority in real-world scenarios. To address this issue, we propose i²VAE, a novel variational autoencoder (VAE)-based framework that enhances user interest learning with mutual information-based regularizers. i²VAE improves recommendations for cold-start and long-tailed users while maintaining strong performance across all user groups. Specifically, cross-domain and disentangling regularizers extract transferable features for cold-start users, while a pseudo-sequence generator synthesizes interactions for long-tailed users, refined by a denoising regularizer to filter noise and preserve meaningful interest signals. Extensive experiments demonstrate that i²VAE outperforms state-of-the-art methods, underscoring its effectiveness in real-world CDSR applications. Code and datasets are available at https://github.com/WujiangXu/IM-VAE.

ICML Conference 2024 Conference Paper

Graph Mixup on Approximate Gromov-Wasserstein Geodesics

  • Zhichen Zeng 0001
  • Ruizhong Qiu
  • Zhe Xu 0007
  • Zhining Liu 0002
  • Yuchen Yan
  • Tianxin Wei
  • Lei Ying 0001
  • Jingrui He

Mixup, which generates synthetic training samples on the data manifold, has been shown to be highly effective in augmenting Euclidean data. However, finding a proper data manifold for graph data is non-trivial, as graphs are non-Euclidean data in disparate spaces. Though efforts have been made, most of the existing graph mixup methods neglect the intrinsic geodesic guarantee, thereby generating inconsistent sample-label pairs. To address this issue, we propose GeoMix to mixup graphs on the Gromov-Wasserstein (GW) geodesics. A joint space over input graphs is first defined based on the GW distance, and graphs are then transformed into the GW space through equivalence-preserving transformations. We further show that the linear interpolation of the transformed graph pairs defines a geodesic connecting the original pairs on the GW manifold, hence ensuring the consistency between generated samples and labels. An accelerated mixup algorithm on the approximate low-dimensional GW manifold is further proposed. Extensive experiments show that the proposed GeoMix promotes the generalization and robustness of GNN models.
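
As a toy illustration of the mixup step only: once two graphs are node-aligned (GeoMix obtains the alignment by transporting both graphs into a shared GW space, which is omitted here, with an identity coupling assumed instead), the generated sample is a plain linear interpolation of adjacency matrices and labels.

```python
import numpy as np

def graph_mixup(A1, y1, A2, y2, lam):
    """Mixup of two node-aligned graphs: linear interpolation of adjacency
    matrices and labels. GeoMix's transformations ensure this interpolation
    traces a geodesic on the GW manifold; here the alignment is assumed."""
    return (1 - lam) * A1 + lam * A2, (1 - lam) * y1 + lam * y2

A1 = np.array([[0.0, 1.0, 1.0],
               [1.0, 0.0, 0.0],
               [1.0, 0.0, 0.0]])          # a star on 3 nodes
A2 = np.array([[0.0, 1.0, 0.0],
               [1.0, 0.0, 1.0],
               [0.0, 1.0, 0.0]])          # a path on 3 nodes
y1, y2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
A_mix, y_mix = graph_mixup(A1, y1, A2, y2, lam=0.25)
```

The paper's contribution is precisely that without the geodesic guarantee this kind of interpolation can produce inconsistent sample-label pairs.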

ICML Conference 2024 Conference Paper

Language Models as Semantic Indexers

  • Bowen Jin
  • Hansi Zeng
  • Guoyin Wang 0001
  • Xiusi Chen
  • Tianxin Wei
  • Ruirui Li 0002
  • Zhengyang Wang
  • Zheng Li 0018

Semantic identifier (ID) is an important concept in information retrieval that aims to preserve the semantics of objects such as documents and items inside their IDs. Previous studies typically adopt a two-stage pipeline to learn semantic IDs by first procuring embeddings using off-the-shelf text encoders and then deriving IDs based on the embeddings. However, each step introduces potential information loss, and there is usually an inherent mismatch between the distribution of embeddings within the latent space produced by text encoders and the anticipated distribution required for semantic indexing. It is non-trivial to design a method that can learn the document’s semantic representations and its hierarchical structure simultaneously, given that semantic IDs are discrete and sequentially structured, and the semantic supervision is deficient. In this paper, we introduce LMIndexer, a self-supervised framework to learn semantic IDs with a generative language model. We tackle the challenge of sequential discrete ID by introducing a semantic indexer capable of generating neural sequential discrete representations with progressive training and contrastive learning. In response to the semantic supervision deficiency, we propose to train the model with a self-supervised document reconstruction objective. We show the high quality of the learned IDs and demonstrate their effectiveness on three tasks including recommendation, product search, and document retrieval on five datasets from various domains. Code is available at https://github.com/PeterGriffinJin/LMIndexer.
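
The two-stage pipeline the abstract contrasts LMIndexer with can be made concrete via residual quantization over off-the-shelf embeddings. This sketch (parameters hypothetical) shows how coarse-to-fine sequential discrete IDs arise; it is the baseline, not LMIndexer's self-supervised indexer.

```python
import numpy as np

def kmeans(X, k, rng, iters=25):
    """Plain Lloyd k-means."""
    C = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        assign = np.argmin(((X[:, None, :] - C[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(assign == j):
                C[j] = X[assign == j].mean(axis=0)
    return C, assign

def residual_semantic_ids(X, k, levels, rng=None):
    """Stage two of the two-stage baseline: quantize embeddings level by
    level, each level clustering the residual left by the previous one,
    yielding one discrete token per level (a sequential semantic ID)."""
    if rng is None:
        rng = np.random.default_rng(0)
    ids, R = [], X.astype(float).copy()
    for _ in range(levels):
        C, assign = kmeans(R, k, rng)
        ids.append(assign)
        R = R - C[assign]          # quantization residual feeds the next level
    return np.stack(ids, axis=1)   # shape (n_docs, levels)
```

Each stage here can lose information, which is exactly the mismatch LMIndexer's end-to-end generative indexer is designed to avoid.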

AAAI Conference 2024 Conference Paper

TAU: Trajectory Data Augmentation with Uncertainty for Next POI Recommendation

  • Zhuang Zhuang
  • Tianxin Wei
  • Lingbo Liu
  • Heng Qi
  • Yanming Shen
  • Baocai Yin

Next Point-of-Interest (POI) recommendation has been proven effective at utilizing sparse, intricate spatial-temporal trajectory data to recommend subsequent POIs to users. While existing methods commonly alleviate the problem of data sparsity by integrating spatial-temporal context information, POI category features, and social relationships, they largely overlook the fact that the trajectory sequences collected in the datasets are often incomplete. This oversight limits the model’s potential to fully leverage historical context. In light of this background, we propose Trajectory Data Augmentation with Uncertainty (TAU) for Next POI Recommendation. TAU is a general graph-based trajectory data augmentation method designed to complete user mobility patterns by marrying uncertainty estimation into the next POI recommendation task. More precisely, TAU taps into the global transition pattern graph to identify sets of intermediate nodes located between every pair of locations, effectively leveraging edge weights as transition probabilities. During trajectory sequence construction, TAU selectively prompts intermediate nodes, chosen based on their likelihood of occurrence as pseudo-labels, to establish comprehensive trajectory sequences. Furthermore, to gauge the certainty and impact of pseudo-labels on the target location, we introduce a novel confidence-aware calibration strategy using evidence deep learning (EDL) for improved performance and reliability. The experimental results clearly indicate that our TAU method achieves consistent performance improvements over existing techniques across two real-world datasets, verifying its effectiveness as the state-of-the-art approach to the task.
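
A minimal sketch of TAU's pseudo-labeling idea with an invented transition graph; the evidential-deep-learning confidence calibration is omitted, and all counts are illustrative.

```python
import numpy as np

# Global POI transition counts pooled over all users (numbers illustrative).
counts = np.array([
    [0, 8, 2, 0],
    [0, 0, 9, 1],
    [0, 0, 0, 7],
    [5, 0, 0, 0],
], dtype=float)
P = counts / counts.sum(axis=1, keepdims=True)  # edge weights as transition probs

def augment(seq, P, tau=0.5):
    """Complete a sparse check-in sequence: for each consecutive pair (a, b),
    insert the most likely intermediate POI m on the global transition graph
    as a pseudo-label when its two-step probability clears tau."""
    out = [seq[0]]
    for a, b in zip(seq, seq[1:]):
        two_step = P[a] * P[:, b]               # P(a -> m) * P(m -> b) for each m
        m = int(np.argmax(two_step))
        if two_step[m] > tau and m not in (a, b):
            out.append(m)                       # confident pseudo-label
        out.append(b)
    return out
```

Sequences with a plausible missing stop get it filled in, while direct transitions are left untouched, mimicking how TAU completes incomplete trajectories.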

ICLR Conference 2024 Conference Paper

Towards Unified Multi-Modal Personalization: Large Vision-Language Models for Generative Recommendation and Beyond

  • Tianxin Wei
  • Bowen Jin
  • Ruirui Li 0002
  • Hansi Zeng
  • Zhengyang Wang
  • Jianhui Sun
  • Qingyu Yin
  • Hanqing Lu

Developing a universal model that can effectively harness heterogeneous resources and respond to a wide range of personalized needs has been a longstanding community aspiration. Our daily choices, especially in domains like fashion and retail, are substantially shaped by multi-modal data, such as pictures and textual descriptions. These modalities not only offer intuitive guidance but also cater to personalized user preferences. However, the predominant personalization approaches mainly focus on ID or text-based recommendation problems, failing to comprehend the information spanning various tasks or modalities. In this paper, our goal is to establish a Unified paradigm for Multi-modal Personalization systems (UniMP), which effectively leverages multi-modal data while eliminating the complexities associated with task- and modality-specific customization. We argue that the advancements in foundational generative modeling have provided the flexibility and effectiveness necessary to achieve the objective. In light of this, we develop a generic and extensible personalization generative framework that can handle a wide range of personalized needs including item recommendation, product search, preference prediction, explanation generation, and further user-guided image generation. Our methodology enhances the capabilities of foundational language models for personalized tasks by seamlessly ingesting interleaved cross-modal user history information, ensuring a more precise and customized experience for users. To train and evaluate the proposed multi-modal personalized tasks, we also introduce a novel and comprehensive benchmark covering a variety of user requirements. Our experiments on the real-world benchmark showcase the model's potential, outperforming competitive methods specialized for each task.

NeurIPS Conference 2023 Conference Paper

Adaptive Test-Time Personalization for Federated Learning

  • Wenxuan Bao
  • Tianxin Wei
  • Haohan Wang
  • Jingrui He

Personalized federated learning algorithms have shown promising results in adapting models to various distribution shifts. However, most of these methods require labeled data on testing clients for personalization, which is usually unavailable in real-world scenarios. In this paper, we introduce a novel setting called test-time personalized federated learning (TTPFL), where clients locally adapt a global model in an unsupervised way without relying on any labeled data during test-time. While traditional test-time adaptation (TTA) can be used in this scenario, most TTA methods inherently assume that training data come from a single domain, whereas in FL the data come from multiple clients (source domains) with different distributions. Overlooking these domain interrelationships can result in suboptimal generalization. Moreover, most TTA algorithms are designed for a specific kind of distribution shift and lack the flexibility to handle multiple kinds of distribution shifts in FL. In this paper, we find that this lack of flexibility partially results from their pre-defining which modules to adapt in the model. To tackle this challenge, we propose a novel algorithm called ATP that adaptively learns the adaptation rates for each module in the model from distribution shifts among source domains. Theoretical analysis proves the strong generalization of ATP. Extensive experiments demonstrate its superiority in handling various distribution shifts including label shift, image corruptions, and domain shift, outperforming existing TTA methods across multiple datasets and model architectures. Our code is available at https://github.com/baowenxuan/ATP.

NeurIPS Conference 2023 Conference Paper

Meta-Learning with Neural Bandit Scheduler

  • Yunzhe Qi
  • Yikun Ban
  • Tianxin Wei
  • Jiaru Zou
  • Huaxiu Yao
  • Jingrui He

Meta-learning has been proven an effective learning paradigm for training machine learning models with good generalization ability. Apart from the common practice of uniformly sampling the meta-training tasks, existing methods working on task scheduling strategies are mainly based on pre-defined sampling protocols or the assumed task-model correlations, and greedily make scheduling decisions, which can lead to sub-optimal performance bottlenecks of the meta-model. In this paper, we propose a novel task scheduling framework under Contextual Bandits settings, named BASS, which directly optimizes the task scheduling strategy based on the status of the meta-model. By balancing the exploitation and exploration in meta-learning task scheduling, BASS can help tackle the challenge of limited knowledge about the task distribution during the early stage of meta-training, while simultaneously exploring potential benefits for forthcoming meta-training iterations through an adaptive exploration strategy. Theoretical analysis and extensive experiments are presented to show the effectiveness of our proposed framework.
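
BASS's neural-bandit scheduler is not reproduced here; this toy UCB loop only illustrates the exploration-exploitation trade-off in task scheduling. The per-task gains, noise level, and exploration scale are all invented for illustration.

```python
import math, random

random.seed(0)

# Unknown expected meta-model improvement from training on each task
# (illustrative; in BASS this depends on the current meta-model state).
true_gain = [0.2, 0.5, 0.8]
counts = [0, 0, 0]
means = [0.0, 0.0, 0.0]

def pick_task(t):
    """UCB-style scheduling: exploit tasks that improved the meta-model so
    far, but keep exploring under-sampled ones. The 0.1 factor scales the
    exploration width to the (assumed) reward noise."""
    for i in range(len(counts)):
        if counts[i] == 0:
            return i                       # pull every task at least once
    return max(range(len(counts)),
               key=lambda i: means[i] + 0.1 * math.sqrt(2 * math.log(t) / counts[i]))

for t in range(1, 301):
    i = pick_task(t)
    reward = true_gain[i] + random.gauss(0, 0.1)   # noisy observed gain
    counts[i] += 1
    means[i] += (reward - means[i]) / counts[i]    # running mean update
```

Early on the scheduler samples all tasks (limited knowledge of the task distribution); as evidence accumulates it concentrates on the task with the highest estimated gain, rather than committing greedily.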

ICML Conference 2023 Conference Paper

NTK-approximating MLP Fusion for Efficient Language Model Fine-tuning

  • Tianxin Wei
  • Zeming Guo
  • Yifan Chen 0004
  • Jingrui He

Fine-tuning a pre-trained language model (PLM) emerges as the predominant strategy in many natural language processing applications. However, even fine-tuning PLMs and running inference are expensive, especially on edge devices with low computing power. Some general approaches (e.g., quantization and distillation) have been widely studied to reduce the compute/memory of PLM fine-tuning, while very few one-shot compression techniques are explored. In this paper, we investigate the neural tangent kernel (NTK)–which reveals the gradient descent dynamics of neural networks–of the multilayer perceptron (MLP) modules in a PLM and propose to derive a lightweight PLM through NTK-approximating MLP fusion. To achieve this, we reconsider the MLP as a bundle of sub-MLPs and cluster them into a given number of centroids, which can then be restored as a compressed MLP that, surprisingly, approximates the NTK of the original PLM well. Extensive experiments of PLM fine-tuning on both natural language understanding (NLU) and generation (NLG) tasks are provided to verify the effectiveness of the proposed method, MLP fusion. Our code is available at https://github.com/weitianxin/MLP_Fusion.
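
A rough sketch of the fusion step: treat each hidden neuron as a sub-MLP and cluster the neurons. Plain Euclidean k-means stands in for the paper's NTK-guided objective, and all shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, W1, b1, W2):
    """A transformer-style MLP block: ReLU(x @ W1 + b1) @ W2."""
    return np.maximum(x @ W1 + b1, 0.0) @ W2

def fuse_mlp(W1, b1, W2, k, iters=30):
    """Treat each hidden neuron as a sub-MLP (its input weights, bias, and
    output weights), k-means the neurons, and rebuild a width-k MLP whose
    output weights sum each cluster, approximating the original sum over
    neurons."""
    F = np.concatenate([W1.T, b1[:, None], W2], axis=1)    # one row per neuron
    C = F[rng.choice(len(F), size=k, replace=False)].copy()
    for _ in range(iters):
        assign = np.argmin(((F[:, None, :] - C[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(assign == j):
                C[j] = F[assign == j].mean(axis=0)
    d = W1.shape[0]
    W1c, b1c = C[:, :d].T, C[:, d]                          # centroid sub-MLPs
    W2c = np.array([W2[assign == j].sum(axis=0) for j in range(k)])
    return W1c, b1c, W2c
```

When clustered neurons have similar input weights, their summed ReLU outputs are well approximated by one centroid neuron with the cluster's summed output weights; with `k` equal to the hidden width the reconstruction is exact.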

NeurIPS Conference 2022 Conference Paper

Augmentations in Hypergraph Contrastive Learning: Fabricated and Generative

  • Tianxin Wei
  • Yuning You
  • Tianlong Chen
  • Yang Shen
  • Jingrui He
  • Zhangyang Wang

This paper targets at improving the generalizability of hypergraph neural networks in the low-label regime, through applying the contrastive learning approach from images/graphs (we refer to it as HyperGCL). We focus on the following question: How to construct contrastive views for hypergraphs via augmentations? We provide the solutions in two folds. First, guided by domain knowledge, we fabricate two schemes to augment hyperedges with higher-order relations encoded, and adopt three vertex augmentation strategies from graph-structured data. Second, in search of more effective views in a data-driven manner, we for the first time propose a hypergraph generative model to generate augmented views, and then an end-to-end differentiable pipeline to jointly learn hypergraph augmentations and model parameters. Our technical innovations are reflected in designing both fabricated and generative augmentations of hypergraphs. The experimental findings include: (i) Among fabricated augmentations in HyperGCL, augmenting hyperedges provides the most numerical gains, implying that higher-order information in structures is usually more downstream-relevant; (ii) Generative augmentations do better in preserving higher-order information to further benefit generalizability; (iii) HyperGCL also boosts robustness and fairness in hypergraph representation learning. Codes are released at https: //github. com/weitianxin/HyperGCL.