Arrow Research search

Author name cluster

Xiao Fu

Papers possibly associated with this exact author name in Arrow. This page groups case-insensitive exact name matches; it is not a full identity-disambiguation profile.

11 papers
2 author rows

Possible papers

11

AAAI Conference 2026 Conference Paper

CHASE: Contextual History for Adaptive and Simple Exploitation in Large Language Model Jailbreaking

  • Zhiqiang Hao
  • Chuanyi Li
  • Ye Fan
  • Jun Cai
  • Xiao Fu
  • Shangqi Wang
  • Hao Shen
  • Jiao Yin

We propose Contextual History for Adaptive and Simple Exploitation (CHASE), a novel multi-turn method for Large Language Model (LLM) jailbreaking. Rather than directly attacking an LLM that may be difficult to jailbreak, CHASE first collects jailbroken histories from an easy-to-jailbreak LLM and then transfers them to the target LLM. Through this history-transfer process, CHASE misleads the target LLM into believing that it produced the jailbroken histories itself, and increases the chances of a successful jailbreak by prompting it to continue the conversation. Extensive evaluations on mainstream LLMs show that CHASE consistently achieves higher attack success rates and demands fewer computational resources than existing methods.

ICLR Conference 2025 Conference Paper

3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation

  • Xiao Fu
  • Xian Liu
  • Xintao Wang 0002
  • Sida Peng
  • Menghan Xia
  • Xiaoyu Shi 0002
  • Ziyang Yuan
  • Pengfei Wan 0001

This paper aims to manipulate multi-entity 3D motions in video generation. Previous methods for controllable video generation primarily leverage 2D control signals to manipulate object motions and have achieved remarkable synthesis results. However, 2D control signals are inherently limited in expressing the 3D nature of object motions. To overcome this problem, we introduce 3DTrajMaster, a robust controller that regulates multi-entity dynamics in 3D space, given user-desired 6DoF pose (location and rotation) sequences of entities. At the core of our approach is a plug-and-play 3D-motion grounded object injector that fuses multiple input entities with their respective 3D trajectories through a gated self-attention mechanism. In addition, we exploit an injector architecture to preserve the video diffusion prior, which is crucial for generalization ability. To mitigate video quality degradation, we introduce a domain adaptor during training and employ an annealed sampling strategy during inference. To address the lack of suitable training data, we construct a 360-Motion Dataset, which first correlates collected 3D human and animal assets with GPT-generated trajectories and then captures their motion with 12 evenly surrounding cameras on diverse 3D UE platforms. Extensive experiments show that 3DTrajMaster sets a new state of the art in both accuracy and generalization for controlling multi-entity 3D motions. Project page: http://fuxiao0719.github.io/projects/3dtrajmaster
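The gated self-attention injector the abstract describes can be illustrated with a minimal numpy sketch of a tanh-gated self-attention block. The dimensions, initialization, and the zero-initialized gate below are illustrative assumptions, not details from the paper:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def gated_self_attention(x, Wq, Wk, Wv, gate):
    """Self-attention whose attention branch is scaled by tanh(gate):
    with gate = 0 the block is an exact identity, so new conditioning
    can be injected without disturbing a pretrained prior."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return x + np.tanh(gate) * (attn @ v)

rng = np.random.default_rng(0)
d = 8                                  # feature dimension (illustrative)
tokens = rng.normal(size=(5, d))       # e.g. entity + trajectory tokens
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
identity_out = gated_self_attention(tokens, Wq, Wk, Wv, gate=0.0)
fused_out = gated_self_attention(tokens, Wq, Wk, Wv, gate=1.0)
```

The zero-gate identity property is one common way such injectors preserve a pretrained diffusion prior at the start of training.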

NeurIPS Conference 2025 Conference Paper

Diverse Influence Component Analysis: A Geometric Approach to Nonlinear Mixture Identifiability

  • Hoang Son Nguyen
  • Xiao Fu

Latent component identification from unknown nonlinear mixtures is a foundational challenge in machine learning, with applications in tasks such as self-supervised learning and causal representation learning. Prior work in nonlinear independent component analysis (nICA) has shown that auxiliary signals---such as weak supervision---can support identifiability of conditionally independent latent components. More recent approaches explore structural assumptions, like sparsity in the Jacobian of the mixing function, to relax such requirements. In this work, we introduce Diverse Influence Component Analysis (DICA), a framework that exploits the convex geometry of the mixing function’s Jacobian. We propose a Jacobian Volume Maximization (J-VolMax) criterion, which enables latent component identification by encouraging diversity in their influence on the observed variables. Under suitable conditions, this approach achieves identifiability without relying on auxiliary information, latent component independence, or Jacobian sparsity assumptions. These results extend the scope of identifiability analysis and offer a complementary perspective to existing methods.
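The volume-maximization intuition behind J-VolMax can be seen in a toy linear case: for a linear map the Jacobian is constant, and its log-absolute-determinant is larger when the component influence directions are diverse (orthogonal) than when they are redundant. A hypothetical numpy illustration, not the paper's actual estimator:

```python
import numpy as np

def log_jacobian_volume(W):
    # for a linear map x -> Wx the Jacobian is W itself; its
    # log-absolute-determinant measures the volume spanned by the
    # component influence directions
    return np.linalg.slogdet(W)[1]

# unit-row maps: orthogonal rows = diverse influence on the observed
# variables, nearly parallel rows = redundant influence
diverse = np.eye(2)
redundant = np.array([[1.0, 0.0],
                      [np.cos(0.1), np.sin(0.1)]])
vol_diverse = log_jacobian_volume(diverse)
vol_redundant = log_jacobian_volume(redundant)
```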

IJCAI Conference 2025 Conference Paper

Enhancing Multimodal Model Robustness Under Missing Modalities via Memory-Driven Prompt Learning

  • Yihan Zhao
  • Wei Xi
  • Xiao Fu
  • Jizhong Zhao

Existing multimodal models typically assume the availability of all modalities, leading to significant performance degradation when certain modalities are missing. Recent methods have introduced prompt learning to adapt pretrained models to incomplete data, achieving remarkable performance when the missing cases are consistent during training and inference. However, these methods rely heavily on distribution consistency and fail to compensate for missing modalities, limiting their ability to generalize to unseen missing cases. To address this issue, we propose Memory-Driven Prompt Learning, a framework that adaptively compensates for missing modalities through prompt learning. The compensation strategies are achieved by two types of prompts: generative prompts and shared prompts. Generative prompts retrieve semantically similar samples from a predefined prompt memory that stores modality-specific semantic information, while shared prompts leverage available modalities to provide cross-modal compensation. Extensive experiments demonstrate the effectiveness of the proposed model, achieving significant improvements across diverse missing-modality scenarios, with average performance increasing from 34.76% to 40.40% on MM-IMDb, 62.71% to 77.06% on Food101, and 60.40% to 62.77% on Hateful Memes. The code is available at https://github.com/zhao-yh20/MemPrompt.
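The retrieval step behind the generative prompts can be sketched as a cosine-similarity lookup into a prompt memory. The function below is a hypothetical stand-in for the paper's mechanism; the names, shapes, and scoring rule are illustrative:

```python
import numpy as np

def retrieve_prompts(query, memory_keys, memory_prompts, k=1):
    """Return the k stored prompts whose keys are most cosine-similar
    to a query embedding built from the available modalities."""
    q = query / np.linalg.norm(query)
    keys = memory_keys / np.linalg.norm(memory_keys, axis=1, keepdims=True)
    scores = keys @ q                       # cosine similarities
    top = np.argsort(scores)[::-1][:k]      # indices of best matches
    return memory_prompts[top]

memory_keys = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
memory_prompts = np.array([10, 20, 30])     # stand-ins for prompt vectors
best = retrieve_prompts(np.array([0.9, 0.1]), memory_keys, memory_prompts)
```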

NeurIPS Conference 2024 Conference Paper

Identifiable Shared Component Analysis of Unpaired Multimodal Mixtures

  • Subash Timilsina
  • Sagar Shrestha
  • Xiao Fu

A core task in multi-modal learning is to integrate information from multiple feature spaces (e.g., text and audio), offering modality-invariant essential representations of data. Recent research showed that classical tools such as canonical correlation analysis (CCA) provably identify the shared components up to minor ambiguities, when samples in each modality are generated from a linear mixture of shared and private components. Such identifiability results were obtained under the condition that the cross-modality samples are aligned/paired according to their shared information. This work takes a step further, investigating shared component identifiability from multi-modal linear mixtures where cross-modality samples are unaligned. A distribution divergence minimization-based loss is proposed, under which a suite of sufficient conditions ensuring identifiability of the shared components is derived. Our conditions are based on cross-modality distribution discrepancy characterization and density-preserving transform removal, which are much milder than those in existing studies relying on independent component analysis. More relaxed conditions are also provided by adding reasonable structural constraints, motivated by side information available in various applications. The identifiability claims are thoroughly validated using synthetic and real-world data.
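The aligned-sample CCA baseline the abstract builds on can be sketched in a few lines: orthonormalize each centered view and take singular values of their cross-product. With paired samples, an identical shared component drives the top canonical correlation to 1 (the toy data and dimensions below are illustrative):

```python
import numpy as np

def canonical_correlations(X, Y):
    """Classical CCA for paired views (samples x dims): orthonormalize
    each centered view, then take the singular values of the
    cross-product; these are the canonical correlations."""
    Qx = np.linalg.qr(X - X.mean(axis=0))[0]
    Qy = np.linalg.qr(Y - Y.mean(axis=0))[0]
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)

rng = np.random.default_rng(0)
n = 2000
shared = rng.normal(size=(n, 1))            # one shared component
# each view: linear mixture of the shared and a private component
X = np.c_[shared, rng.normal(size=(n, 1))] @ rng.normal(size=(2, 2))
Y = np.c_[shared, rng.normal(size=(n, 1))] @ rng.normal(size=(2, 2))
corrs = canonical_correlations(X, Y)
```

The paper's setting removes exactly this pairing assumption, which is why a distribution-matching loss replaces the sample-wise cross-product.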

NeurIPS Conference 2024 Conference Paper

Noisy Label Learning with Instance-Dependent Outliers: Identifiability via Crowd Wisdom

  • Tri Nguyen
  • Shahana Ibrahim
  • Xiao Fu

The generation of label noise is often modeled as a process involving a probability transition matrix (also interpreted as the annotator confusion matrix) imposed onto the label distribution. Under this model, learning the "ground-truth classifier"---i.e., the classifier that could be learned if no noise were present---and the confusion matrix boils down to a model identification problem. Prior works along this line demonstrated appealing empirical performance, yet identifiability of the model was mostly established by assuming an instance-invariant confusion matrix. Having an (occasionally) instance-dependent confusion matrix across data samples is apparently more realistic, but inevitably introduces outliers to the model. Our interest lies in confusion matrix-based noisy label learning with such outliers taken into consideration. We begin by pointing out that under the model of interest, using labels produced by only one annotator is fundamentally insufficient to detect the outliers or identify the ground-truth classifier. Then, we prove that by employing a crowdsourcing strategy involving multiple annotators, a carefully designed loss function can establish the desired model identifiability under reasonable conditions. Our development builds upon a link between the noisy label model and a column-corrupted matrix factorization model---based on which we show that crowdsourced annotations distinguish nominal data and instance-dependent outliers using a low-dimensional subspace. Experiments show that our learning scheme substantially improves outlier detection and the classifier's testing accuracy.
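The instance-invariant transition-matrix model that prior work assumes can be simulated directly: each noisy label is drawn from the row of a confusion matrix indexed by the true class, and with enough samples (and no outliers) the matrix is recoverable empirically. The matrix values and sample size below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# instance-invariant confusion matrix: row c is the distribution of the
# reported label given that the true label is c
T = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])

true_labels = rng.integers(0, 3, size=5000)
noisy_labels = np.array([rng.choice(3, p=T[c]) for c in true_labels])

# with no outliers, the empirical transition matrix recovers T
T_hat = np.zeros((3, 3))
for c, y in zip(true_labels, noisy_labels):
    T_hat[c, y] += 1
T_hat /= T_hat.sum(axis=1, keepdims=True)
```

Instance-dependent noise would corrupt a subset of these counts, which is the outlier regime the paper addresses.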

NeurIPS Conference 2022 Conference Paper

Provable Subspace Identification Under Post-Nonlinear Mixtures

  • Qi Lyu
  • Xiao Fu

Unsupervised mixture learning (UML) aims at identifying linearly or nonlinearly mixed latent components in a blind manner. UML is known to be challenging: even learning linear mixtures requires highly nontrivial analytical tools, e.g., independent component analysis or nonnegative matrix factorization. In this work, the post-nonlinear (PNL) mixture model---where unknown element-wise nonlinear functions are imposed onto a linear mixture---is revisited. The PNL model is widely employed in fields ranging from brain signal classification, speech separation, and remote sensing to causal discovery. To identify and remove the unknown nonlinear functions, existing works often assume different properties of the latent components (e.g., statistical independence or probability-simplex structures). This work shows that under a carefully designed UML criterion, the existence of a nontrivial null space associated with the underlying mixing system suffices to guarantee identification/removal of the unknown nonlinearity. Compared to prior works, our finding largely relaxes the conditions for attaining PNL identifiability, and thus may benefit applications where no strong structural information on the latent components is known. A finite-sample analysis is offered to characterize the performance of the proposed approach under realistic settings. To implement the proposed learning criterion, a block coordinate descent algorithm is proposed. A series of numerical experiments corroborates our theoretical claims.
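Generating data from a PNL model clarifies its structure: an invertible element-wise nonlinearity is applied to each channel of a linear mixture. The mixing matrix and nonlinearities below are illustrative choices, not ones from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 2
S = rng.uniform(-1, 1, size=(n, d))        # latent components
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])                  # linear mixing (illustrative)
linear_mix = S @ A.T
# unknown invertible element-wise nonlinearities, one per channel
g = [np.tanh, lambda z: z + 0.2 * z ** 3]
X = np.column_stack([g[i](linear_mix[:, i]) for i in range(d)])
# each observed channel is an invertible distortion of one linear mixture,
# so undoing g channel-wise reduces PNL to a linear mixture problem
```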

AAAI Conference 2021 Conference Paper

StatEcoNet: Statistical Ecology Neural Networks for Species Distribution Modeling

  • Eugene Seo
  • Rebecca A. Hutchinson
  • Xiao Fu
  • Chelsea Li
  • Tyler A. Hallman
  • John Kilbride
  • W. Douglas Robinson

This paper focuses on a core task in computational sustainability and statistical ecology: species distribution modeling (SDM). In SDM, the occurrence pattern of a species on a landscape is predicted from environmental features, based on observations at a set of locations. At first glance, SDM may appear to be a binary classification problem, and one might be inclined to employ classic tools (e.g., logistic regression, support vector machines, neural networks) to tackle it. However, wildlife surveys introduce structured noise (especially under-counting) into the species observations. If unaccounted for, these observation errors systematically bias SDMs. To address the unique challenges of SDM, this paper proposes a framework called StatEcoNet. Specifically, this work employs a graphical generative model from statistical ecology as the skeleton of the proposed computational framework and carefully integrates neural networks under it. The advantages of StatEcoNet over related approaches are demonstrated on simulated datasets as well as on bird species data. Since SDMs are critical tools for ecological science and natural resource management, StatEcoNet may offer boosted computational and analytical power to a wide range of applications with significant social impact, e.g., the study and conservation of threatened species.
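The structured under-counting the abstract describes is the behavior of the classical occupancy-detection model in statistical ecology: a species is recorded only when a site is both occupied and detected, so raw detection rates systematically underestimate occupancy. A small simulation with illustrative parameter values (in the full model, these probabilities would be functions of site and survey features):

```python
import numpy as np

rng = np.random.default_rng(0)
n_sites, n_visits = 2000, 3
psi = 0.6   # occupancy probability (illustrative constant)
p = 0.4     # per-visit detection probability (illustrative constant)

occupied = rng.random(n_sites) < psi
# a species is recorded only when the site is occupied AND the visit
# detects it, so detections under-count true occupancy
detections = occupied[:, None] & (rng.random((n_sites, n_visits)) < p)
naive_rate = detections.any(axis=1).mean()   # fraction of sites ever detected
true_rate = occupied.mean()
```

A classifier trained on the raw detections would be fit to `naive_rate`-style targets, which is the bias StatEcoNet's generative skeleton corrects for.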

NeurIPS Conference 2019 Conference Paper

Crowdsourcing via Pairwise Co-occurrences: Identifiability and Algorithms

  • Shahana Ibrahim
  • Xiao Fu
  • Nikolaos Kargas
  • Kejun Huang

The data deluge comes with high demands for data labeling. Crowdsourcing (or, more generally, ensemble learning) techniques aim to produce accurate labels by integrating noisy, non-expert labeling from annotators. The classic Dawid-Skene estimator and its accompanying expectation-maximization (EM) algorithm have been widely used, but their theoretical properties are not fully understood. Tensor methods were proposed to guarantee identification of the Dawid-Skene model, but their sample complexity is a hurdle for applying such approaches---since the tensor methods hinge on the availability of third-order statistics that are hard to estimate reliably given limited data. In this paper, we propose a framework using pairwise co-occurrences of the annotator responses, which naturally admits lower sample complexity. We show that the approach can identify the Dawid-Skene model under realistic conditions. We propose an algebraic algorithm reminiscent of convex geometry-based structured matrix factorization to solve the model identification problem efficiently, and an identifiability-enhanced algorithm for handling more challenging and critical scenarios. Experiments show that the proposed algorithms outperform state-of-the-art algorithms under a variety of scenarios.
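Estimating a pairwise co-occurrence matrix from two annotators' responses only requires counting, which is why second-order statistics are easier to estimate reliably than third-order ones. A minimal sketch (the -1 missing-label convention is an illustrative assumption):

```python
import numpy as np

def cooccurrence(labels_m, labels_n, K):
    """Empirical pairwise co-occurrence of two annotators' responses:
    entry (i, j) estimates P(annotator m says i, annotator n says j).
    Inputs are integer label vectors over the same items; -1 marks an
    item the annotator did not label."""
    mask = (labels_m >= 0) & (labels_n >= 0)   # items both labeled
    M = np.zeros((K, K))
    for a, b in zip(labels_m[mask], labels_n[mask]):
        M[a, b] += 1
    return M / mask.sum()

ann1 = np.array([0, 1, 1, 2, 0, -1])
ann2 = np.array([0, 1, 2, 2, 0, 1])
M = cooccurrence(ann1, ann2, 3)
```

Under the Dawid-Skene model such matrices factor through the annotator confusion matrices, which is what the algebraic identification algorithm exploits.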

AAAI Conference 2018 Conference Paper

On Convergence of Epanechnikov Mean Shift

  • Kejun Huang
  • Xiao Fu
  • Nicholas Sidiropoulos

Epanechnikov Mean Shift is a simple yet empirically very effective algorithm for clustering. It localizes the centroids of data clusters by estimating modes of the probability distribution that generates the data points, using the 'optimal' Epanechnikov kernel density estimator. However, since the procedure involves non-smooth kernel density functions, the convergence behavior of Epanechnikov Mean Shift lacks theoretical support as of this writing: most existing analyses are based on smooth functions and thus cannot be applied to Epanechnikov Mean Shift. In this work, we first show that the original Epanechnikov Mean Shift may indeed terminate at a non-critical point, due to its non-smooth nature. Based on our analysis, we propose a simple remedy to fix it. The modified Epanechnikov Mean Shift is guaranteed to terminate at a local maximum of the estimated density, which corresponds to a cluster centroid, within a finite number of iterations. We also propose a way to avoid running the Mean Shift iterates from every data point, while maintaining good clustering accuracy under non-overlapping spherical Gaussian mixture models. This further pushes Epanechnikov Mean Shift to handle very large and high-dimensional data sets. Experiments show surprisingly good performance compared to Lloyd's K-means algorithm and the EM algorithm.
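The vanilla iterates the paper analyzes reduce to repeated local averaging: because the Epanechnikov kernel's shadow is flat, each update moves the point to the mean of the data within the bandwidth. A minimal sketch of the unmodified procedure (bandwidth and data are illustrative; this is not the paper's proposed fix):

```python
import numpy as np

def epanechnikov_mean_shift(x, data, h, max_iter=100):
    """Vanilla mean-shift iterates under the Epanechnikov kernel:
    each update replaces x by the mean of the data points lying
    within bandwidth h of x."""
    for _ in range(max_iter):
        inside = np.linalg.norm(data - x, axis=1) < h
        new_x = data[inside].mean(axis=0)
        if np.allclose(new_x, x):   # fixed point reached
            break
        x = new_x
    return x

rng = np.random.default_rng(0)
# two well-separated spherical Gaussian clusters
data = np.vstack([rng.normal(0.0, 0.1, size=(200, 2)),
                  rng.normal(5.0, 0.1, size=(200, 2))])
mode = epanechnikov_mean_shift(data[0].copy(), data, h=1.0)
```

The paper's point is that, on adversarial configurations, a fixed point of this update need not be a critical point of the estimated density; their remedy certifies termination at a local maximum.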

NeurIPS Conference 2016 Conference Paper

Anchor-Free Correlated Topic Modeling: Identifiability and Algorithm

  • Kejun Huang
  • Xiao Fu
  • Nikolaos Sidiropoulos

In topic modeling, many algorithms that guarantee identifiability of the topics have been developed under the premise that there exist anchor words -- i.e., words that only appear (with positive probability) in one topic. Follow-up work has resorted to third- or higher-order statistics of the data corpus to relax the anchor-word assumption. Reliable estimates of higher-order statistics are hard to obtain, however, and the identification of topics under those models hinges on uncorrelatedness of the topics, which can be unrealistic. This paper revisits topic modeling based on second-order moments and proposes an anchor-free topic mining framework. The proposed approach guarantees identification of the topics under a much milder condition than the anchor-word assumption, thereby exhibiting much better robustness in practice. The associated algorithm involves only one eigen-decomposition and a few small linear programs. This makes it easy to implement and scale up to very large problem instances. Experiments using the TDT2 and Reuters-21578 corpora demonstrate that the proposed anchor-free approach exhibits very favorable performance (measured by coherence, similarity count, and clustering accuracy metrics) compared to the prior art.
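The second-order moment that the single eigen-decomposition operates on is just the averaged word-word co-occurrence of the corpus. A toy numpy construction (the corpus and normalization below are illustrative, not the paper's exact estimator):

```python
import numpy as np

# toy corpus: 3 documents over a 4-word vocabulary, as word counts
counts = np.array([[2, 1, 0, 0],
                   [0, 0, 3, 1],
                   [1, 1, 1, 1]], dtype=float)
freqs = counts / counts.sum(axis=1, keepdims=True)   # per-doc frequencies
# second-order moment: average word-word co-occurrence across documents
C = freqs.T @ freqs / freqs.shape[0]
# the single eigen-decomposition step; C is symmetric positive semidefinite
eigvals, eigvecs = np.linalg.eigh(C)
```

Because only pairwise statistics are needed, such a moment can be estimated reliably from far less data than the third-order tensors used in follow-up anchor-free work.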