Arrow Research

Author name cluster

Qiang Sun

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

13 papers
2 author rows

Possible papers

13

TMLR Journal 2026 Journal Article

Order from Chaos: Physical World Understanding from Glitchy Gameplay Videos

  • Meng Cao
  • Haoran Tang
  • Haoze Zhao
  • Mingfei Han
  • Ruyang Liu
  • Qiang Sun
  • Xiaojun Chang
  • Ian Reid

Understanding the physical world, including object dynamics, material properties, and causal interactions, remains a core challenge in artificial intelligence. Although recent multi-modal large language models (MLLMs) have demonstrated impressive general reasoning capabilities, they still fall short of achieving human-level understanding of physical principles. Existing datasets for physical reasoning either rely on real-world videos, which incur high annotation costs, or on synthetic simulations, which suffer from limited realism and diversity. In this paper, we propose a novel paradigm that leverages glitches in gameplay videos, referring to visual anomalies that violate predefined physical laws, as a rich and scalable supervision source for physical world understanding. We introduce PhysGame, an instruction-tuning dataset containing 140,057 glitch-centric question–answer pairs across five physical domains and sixteen fine-grained categories. To ensure data accuracy, we design a meta-information–guided prompting strategy that utilizes gameplay metadata such as titles and descriptions to guide high-quality QA generation. Complementing PhysGame, we construct GameBench, an expert-annotated benchmark with 880 glitch-identified gameplay videos designed to evaluate physical reasoning capabilities. Extensive experiments show that PhysGame significantly enhances both Game2Real transferability, improving the real-world physical reasoning performance of Qwen2.5-VL by 2.5% on PhysBench, and Game2General transferability, yielding a 1.9% gain on the MVBench benchmark. Moreover, PhysGame-tuned models achieve a 3.7% absolute improvement on GameBench, demonstrating enhanced robustness in detecting physical implausibilities. These results indicate that learning from gameplay anomalies offers a scalable and effective pathway toward advancing physical world understanding in multimodal intelligence.
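
As a concrete illustration of the meta-information–guided prompting idea, here is a hedged Python sketch: gameplay metadata (title, uploader description) is folded into the prompt that asks a language model to draft a glitch-centric QA pair. All function and field names are hypothetical, not taken from the PhysGame release.

```python
# Hypothetical sketch of meta-information-guided QA generation: gameplay
# metadata (title, description) grounds the prompt so the generated QA
# pair targets the actual glitch. Names are illustrative assumptions.

def build_glitch_qa_prompt(title: str, description: str, category: str) -> str:
    """Compose a QA-generation prompt grounded in gameplay metadata."""
    return (
        f"The following gameplay video is titled '{title}'.\n"
        f"Uploader description: {description}\n"
        f"The clip contains a glitch in the physical domain: {category}.\n"
        "Write one multiple-choice question that tests whether a viewer "
        "can identify which physical law the glitch violates, followed by "
        "four answer options and the correct choice."
    )

prompt = build_glitch_qa_prompt(
    title="Ragdoll launches into the sky after a light bump",
    description="Car barely touches the NPC and it flies off the map",
    category="object dynamics",
)
```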

UAI Conference 2025 Conference Paper

Corruption-Robust Variance-aware Algorithms for Generalized Linear Bandits under Heavy-tailed Rewards

  • Qingyuan Yu
  • Euijin Baek
  • Xiang Li
  • Qiang Sun

Stochastic linear bandits have recently received significant attention in sequential decision-making. However, real-world challenges such as heavy-tailed noise, reward corruption, and nonlinear reward functions remain difficult to address. To tackle these difficulties, we propose GAdaOFUL, a novel algorithm that leverages adaptive Huber regression to achieve robustness in generalized linear models (GLMs), where rewards can be nonlinear functions of features. GAdaOFUL achieves a state-of-the-art variance-aware regret bound, scaling with the square root of the cumulative reward variance over time, plus an additional term proportional to the level of corruption. The algorithm adapts to problem complexity, yielding improved regret when the cumulative variance is small. Simulation results demonstrate the robustness and effectiveness of GAdaOFUL in practice. The code is available at \url{https://github.com/NeXAIS/GAdaOFUL}.
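
For readers unfamiliar with the robustness primitive here, below is a minimal Python sketch of Huber regression solved by iteratively reweighted least squares. The truncation level tau is the quantity GAdaOFUL adapts to observed variances; the fixed tau and the IRLS solver are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

# Minimal Huber regression via iteratively reweighted least squares.
# Residuals inside [-tau, tau] are treated quadratically (weight 1);
# larger residuals get down-weighted linearly, which is what confers
# robustness to heavy-tailed rewards. tau is fixed here; GAdaOFUL's
# adaptive, variance-aware choice of tau is the paper's contribution.

def huber_regression(X, y, tau, n_iter=50):
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        r = y - X @ theta
        # Huber weights: 1 in the quadratic region, tau/|r| in the linear one.
        w = np.where(np.abs(r) <= tau, 1.0, tau / np.maximum(np.abs(r), 1e-12))
        WX = X * w[:, None]
        # Weighted normal equations (X^T W X) theta = X^T W y, ridged for stability.
        theta = np.linalg.solve(X.T @ WX + 1e-8 * np.eye(X.shape[1]), WX.T @ y)
    return theta
```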

ICLR Conference 2025 Conference Paper

Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs

  • Feilong Tang
  • Zile Huang
  • Chengzhi Liu
  • Qiang Sun
  • Harry Yang
  • Ser-Nam Lim

Multimodal large language models (MLLMs) offer a powerful mechanism for interpreting visual information. However, they often suffer from hallucinations, which impede the real-world usage of these models. Existing methods attempt to alleviate this issue by designing special decoding strategies that penalize the summary tokens. However, these methods lack an analysis of the relationship between hallucination and the summarization mechanism of LLMs. Interestingly, we find that penalizing summary tokens is not necessary: merely intervening on the variance of the query-key parameters, at no extra inference cost, still alleviates hallucinations. Specifically, we explore the causes of hallucinations by analyzing localized self-attention patterns around "anchor" tokens and define the model's degree of attention localization via token propagation probabilities. Our analysis reveals that anchor tokens over-propagate when the eigenvalue distribution of the query and key matrices has a non-zero mean and a polarized variance, leading to excessive dependence on anchor tokens, neglect of visual information, and hallucinated descriptions of the image content. Based on this observation, we propose a versatile plug-and-play decoding strategy, the Dynamic Token Propagation Mechanism (TAME), which curbs excessive propagation by dynamically intervening on the eigenspectrum variance of the attention weights, thereby alleviating hallucinations without relying on complex decoding strategies. Extensive experiments reveal a correlation between the eigenspectrum and hallucinations across various MLLMs, and show that TAME reduces the percentage of hallucinated objects.
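
One hedged reading of "intervening on the eigenspectrum variance" is to shrink the dispersion of the query-key interaction spectrum so that no single direction (anchor token) dominates propagation. The NumPy sketch below applies this to the singular spectrum of the attention logits; the actual TAME intervention may differ in detail, and the function name and gamma parameter are illustrative.

```python
import numpy as np

# A speculative sketch, not TAME itself: pull the singular values of the
# attention logits toward their mean, reducing spectrum variance so no
# direction dominates token propagation. gamma = 1 leaves logits unchanged.

def shrink_spectrum_variance(logits: np.ndarray, gamma: float = 0.5) -> np.ndarray:
    """Rescale singular values toward their mean by factor gamma in [0, 1]."""
    U, s, Vt = np.linalg.svd(logits, full_matrices=False)
    s_shrunk = s.mean() + gamma * (s - s.mean())
    return (U * s_shrunk) @ Vt
```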

NeurIPS Conference 2025 Conference Paper

PCA++: How Uniformity Induces Robustness to Background Noise in Contrastive Learning

  • Mingqi Wu
  • Qiang Sun
  • Archer Yang

High-dimensional data often conceal low-dimensional signals beneath structured background noise, limiting standard PCA. Motivated by contrastive learning, we address the problem of recovering shared signal subspaces from positive pairs – paired observations sharing the same signal but differing in background. Our baseline, PCA+, uses alignment-only contrastive learning and succeeds when background variation is mild, but fails under strong noise or in high-dimensional regimes. To address this, we introduce PCA++, a hard uniformity-constrained contrastive PCA that enforces identity covariance on projected features. PCA++ has a closed-form solution via a generalized eigenproblem, remains stable in high dimensions, and provably regularizes against background interference. We provide exact high-dimensional asymptotics in both fixed-aspect-ratio and growing-spike regimes, showing uniformity’s role in robust signal recovery. Empirically, PCA++ outperforms standard PCA and alignment-only PCA+ on simulations, corrupted-MNIST, and single-cell transcriptomics, reliably recovering condition-invariant structure. More broadly, we clarify uniformity’s role in contrastive learning, showing that explicit feature dispersion defends against structured noise and enhances robustness.
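
The closed-form structure described above can be sketched in a few lines: maximizing alignment across positive pairs subject to an identity-covariance (uniformity) constraint reduces to a generalized eigenproblem. The specific alignment and constraint matrices below (symmetrized cross-covariance and pooled covariance) are plausible assumptions, not necessarily the paper's exact construction.

```python
import numpy as np
from scipy.linalg import eigh

# Hedged sketch of the PCA++ structure: alignment objective under an
# identity-covariance constraint on projected features, solved as the
# generalized eigenproblem A v = lambda S v.

def pca_plus_plus(X1, X2, k):
    """X1, X2: (n, d) positive pairs; returns a (d, k) projection basis."""
    X1 = X1 - X1.mean(0)
    X2 = X2 - X2.mean(0)
    A = (X1.T @ X2 + X2.T @ X1) / (2 * len(X1))   # alignment across pairs
    S = (X1.T @ X1 + X2.T @ X2) / (2 * len(X1))   # pooled feature covariance
    # Generalized eigenproblem; ridge keeps the constraint matrix positive definite.
    vals, vecs = eigh(A, S + 1e-8 * np.eye(S.shape[0]))
    return vecs[:, np.argsort(vals)[::-1][:k]]    # top-k generalized eigenvectors
```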

UAI Conference 2024 Conference Paper

Gradient descent in matrix factorization: Understanding large initialization

  • Hengchao Chen
  • Xin Chen
  • Mohamad Elmasri
  • Qiang Sun

Gradient Descent (GD) has been proven effective in solving various matrix factorization problems. However, its optimization behavior with large initial values remains less understood. To address this gap, this paper presents a novel theoretical framework for examining the convergence trajectory of GD with a large initialization. The framework is grounded in signal-to-noise ratio concepts and inductive arguments. The results uncover an implicit incremental learning phenomenon in GD and offer a deeper understanding of its performance in large initialization scenarios.
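
A small NumPy experiment in the spirit of the paper: factorize a low-rank target by gradient descent from a deliberately large random initialization and watch the residual evolve. The dimensions, initialization scale, and step size are illustrative; the step size must stay small for the trajectory to remain stable at a large initialization.

```python
import numpy as np

# Illustrative setup only: rank-3 target, factors initialized well above
# the signal scale, plain simultaneous gradient descent on 0.5*||UV^T - M||^2.

rng = np.random.default_rng(0)
d, r = 50, 3
M = rng.standard_normal((d, r)) @ rng.standard_normal((r, d))  # rank-3 target

scale = 5.0                       # "large" initialization scale
U = scale * rng.standard_normal((d, r))
V = scale * rng.standard_normal((d, r))

eta = 2e-4                        # small step size keeps the large-init trajectory stable
for step in range(5000):
    R = U @ V.T - M               # residual
    U, V = U - eta * (R @ V), V - eta * (R.T @ U)
    if step % 1000 == 0:
        print(step, np.linalg.norm(R))
```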

ICLR Conference 2024 Conference Paper

Rethinking the Uniformity Metric in Self-Supervised Learning

  • Xianghong Fang
  • Jian Li 0054
  • Qiang Sun
  • Benyou Wang

Uniformity plays an important role in evaluating learned representations, providing insights into self-supervised learning. In our quest for effective uniformity metrics, we pinpoint four principled properties that such metrics should possess. Namely, an effective uniformity metric should remain invariant to instance permutations and sample replications while accurately capturing feature redundancy and dimensional collapse. Surprisingly, we find that the uniformity metric proposed by \citet{Wang2020UnderstandingCR} fails to satisfy the majority of these properties. Specifically, their metric is sensitive to sample replications and cannot correctly account for feature redundancy and dimensional collapse. To overcome these limitations, we introduce a new uniformity metric based on the Wasserstein distance, which satisfies all the aforementioned properties. Integrating this new metric into existing self-supervised learning methods effectively mitigates dimensional collapse and consistently improves their performance on downstream tasks on the CIFAR-10 and CIFAR-100 datasets. Code is available at \url{https://github.com/statsle/WassersteinSSL}.
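
To make the proposal concrete, here is a hedged NumPy sketch of a Wasserstein-based uniformity metric: approximate the distribution of L2-normalized features by a Gaussian and compute the closed-form 2-Wasserstein distance to N(0, I/d), the Gaussian whose mass matches uniformity on the unit sphere. The estimator in the paper may differ in detail; this shows the general recipe.

```python
import numpy as np

# Gaussian approximation of the feature distribution, then the closed-form
# W2 distance to N(0, I/d):
#   W2^2 = ||mu||^2 + tr(Sigma) + 1 - (2/sqrt(d)) * tr(Sigma^{1/2}).
# Lower values indicate features spread more uniformly on the sphere.

def wasserstein_uniformity(features: np.ndarray) -> float:
    Z = features / np.linalg.norm(features, axis=1, keepdims=True)
    d = Z.shape[1]
    mu = Z.mean(0)
    Sigma = np.cov(Z, rowvar=False)
    vals = np.linalg.eigvalsh(Sigma)                      # Sigma is symmetric PSD
    sqrt_trace = np.sqrt(np.clip(vals, 0, None)).sum()    # tr(Sigma^{1/2})
    w2_sq = mu @ mu + np.trace(Sigma) + 1.0 - (2.0 / np.sqrt(d)) * sqrt_trace
    return float(max(w2_sq, 0.0))
```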

TMLR Journal 2024 Journal Article

Variance-aware decision making with linear function approximation under heavy-tailed rewards

  • Xiang Li
  • Qiang Sun

This paper studies how to achieve variance-aware regrets for online decision-making in the presence of heavy-tailed rewards with only finite variances. For linear stochastic bandits, we address the issue of heavy-tailed rewards by modifying the adaptive Huber regression and proposing AdaOFUL. AdaOFUL achieves a state-of-the-art regret bound of $\widetilde{\mathcal{O}}\big(d\big(\sum_{t=1}^T \nu_{t}^2\big)^{1/2}+d\big)$ as if the rewards were uniformly bounded, where $\nu_{t}^2$ is the conditional variance of the reward at round $t$, $d$ is the feature dimension, and $T$ is the number of online rounds. Building upon AdaOFUL, we propose VARA for linear MDPs, which achieves a variance-aware regret bound of $\widetilde{\mathcal{O}}(d\sqrt{H\mathcal{G}^*K})$. Here, $H$ is the length of episodes, $K$ is the number of episodes, and $\mathcal{G}^*$ is a smaller instance-dependent quantity that can be bounded by other instance-dependent quantities when additional structural conditions on the MDP are satisfied. Overall, our modified adaptive Huber regression algorithm may serve as a useful building block in the design of algorithms for online problems with heavy-tailed rewards.
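
For reference, a hedged LaTeX sketch of the Huber loss that adaptive Huber regression minimizes; the round-dependent truncation schedule $\tau_s$ is the paper's key modification and is not reproduced here.

```latex
% Huber loss with truncation level tau: quadratic near zero, linear in
% the tails, which is what tolerates heavy-tailed rewards. AdaOFUL's
% per-round choice of tau_s is the paper's contribution.
\[
  \ell_{\tau}(x) =
  \begin{cases}
    x^2 / 2, & |x| \le \tau,\\
    \tau |x| - \tau^2 / 2, & |x| > \tau,
  \end{cases}
  \qquad
  \widehat{\theta}_t \in \arg\min_{\theta}
  \sum_{s=1}^{t} \ell_{\tau_s}\!\big(y_s - \langle x_s, \theta \rangle\big).
\]
```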

NeurIPS Conference 2023 Conference Paper

Directional diffusion models for graph representation learning

  • Run Yang
  • Yuling Yang
  • Fan Zhou
  • Qiang Sun

Diffusion models have achieved remarkable success in diverse domains such as image synthesis, super-resolution, and 3D molecule generation. Surprisingly, the application of diffusion models in graph learning has garnered little attention. In this paper, we aim to bridge this gap by exploring the use of diffusion models for unsupervised graph representation learning. Our investigation commences with the identification of anisotropic structures within graphs and the recognition of a crucial limitation in the vanilla forward diffusion process when dealing with these anisotropic structures. The original forward diffusion process continually adds isotropic Gaussian noise to the data, which may excessively dilute anisotropic signals, leading to rapid signal-to-noise conversion. This rapid conversion poses challenges for training denoising neural networks and obstructs the acquisition of semantically meaningful representations during the reverse process. To overcome this challenge, we introduce a novel class of models termed {\it directional diffusion models}. These models adopt data-dependent, anisotropic, and directional noises in the forward diffusion process. In order to assess the effectiveness of our proposed models, we conduct extensive experiments on 12 publicly available datasets, with a particular focus on two distinct graph representation learning tasks. The experimental results unequivocally establish the superiority of our models over state-of-the-art baselines, underscoring their effectiveness in capturing meaningful graph representations. Our research not only sheds light on the intricacies of the forward process in diffusion models but also underscores the vast potential of these models in addressing a wide spectrum of graph-related tasks. Our code is available at \url{https://github.com/statsle/DDM}.
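
A hedged NumPy sketch of the directional forward noising described above: noise that is data-dependent and anisotropic, scaled by per-feature batch statistics and sign-aligned with the input so anisotropic signals are not washed out. This follows one reading of the idea; the released implementation may differ.

```python
import numpy as np

# Speculative sketch of data-dependent, anisotropic, directional noise:
# per-feature batch statistics set the noise scale, and the noise sign
# is aligned with the input so the forward process respects anisotropy.

def directional_noise(x: np.ndarray, batch: np.ndarray) -> np.ndarray:
    """x: (d,) node features; batch: (n, d) features used for statistics."""
    mu, sigma = batch.mean(0), batch.std(0) + 1e-8
    eps = np.random.standard_normal(x.shape)
    aniso = mu + sigma * eps            # match per-feature scale of the data
    return np.sign(x) * np.abs(aniso)   # align noise direction with the signal

def forward_step(x, batch, alpha_bar_t):
    # Standard DDPM-style interpolation, with directional noise swapped in.
    return np.sqrt(alpha_bar_t) * x + np.sqrt(1 - alpha_bar_t) * directional_noise(x, batch)
```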

AIIM Journal 2021 Journal Article

Fenchel duality of Cox partial likelihood with an application in survival kernel learning

  • Christopher M. Wilson
  • Kaiqiao Li
  • Qiang Sun
  • Pei Fen Kuan
  • Xuefeng Wang

The Cox proportional hazard model is one of the most widely used methods in modeling time-to-event data in the health sciences. Due to the simplicity of the Cox partial likelihood function, many machine learning algorithms use it for survival data. However, due to the nature of censored data, the optimization problem becomes intractable when more complicated regularization is employed, which is necessary when dealing with high dimensional omic data. In this paper, we show that a convex conjugate function of the Cox loss function based on Fenchel duality exists, and provide an alternative framework to optimization based on the primal form. Furthermore, the dual form suggests an efficient algorithm for solving the kernel learning problem with censored survival outcomes. We illustrate performance and properties of the derived duality form of Cox partial likelihood loss in multiple kernel learning problems with simulated data and the Skin Cutaneous Melanoma TCGA dataset.
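
For context, the primal object being dualized is the log Cox partial likelihood, written below in LaTeX; $\delta_i$ is the event indicator and $R(t_i)$ the risk set at time $t_i$. The Fenchel conjugate itself is the paper's result and is not restated here.

```latex
% Log Cox partial likelihood: the paper derives the Fenchel conjugate
% of the corresponding loss -l(beta), enabling dual-form optimization.
\[
  \ell(\beta) \;=\; \sum_{i:\,\delta_i = 1}
  \left[ x_i^{\top}\beta \;-\;
  \log \sum_{j \in R(t_i)} \exp\!\big(x_j^{\top}\beta\big) \right].
\]
```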

JMLR Journal 2021 Journal Article

Hoeffding's Inequality for General Markov Chains and Its Applications to Statistical Learning

  • Jianqing Fan
  • Bai Jiang
  • Qiang Sun

This paper establishes Hoeffding's lemma and inequality for bounded functions of general-state-space and not necessarily reversible Markov chains. The sharpness of these results is characterized by the optimality of the ratio between variance proxies in the Markov-dependent and independent settings. The boundedness of functions is shown necessary for such results to hold in general. To showcase the usefulness of the new results, we apply them for non-asymptotic analyses of MCMC estimation, respondent-driven sampling and high-dimensional covariance matrix estimation on time series data with a Markovian nature. In addition to statistical problems, we also apply them to study the time-discounted rewards in econometric models and the multi-armed bandit problem with Markovian rewards arising from the field of machine learning.
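
The shape of the result, sketched in LaTeX with hedged constants: for bounded functions along a Markov chain with spectral expansion $\lambda < 1$, the independent-case variance proxy inflates by roughly $(1+\lambda)/(1-\lambda)$. Exact constants and conditions are in the paper.

```latex
% Hedged sketch: f_i(X_i) in [a_i, b_i] along a Markov chain whose
% transition operator has L2 spectral expansion lambda < 1. At lambda = 0
% this recovers classical Hoeffding; constants should be checked against
% the paper.
\[
  \mathbb{P}\Big( \Big|\sum_{i=1}^{n} \big(f_i(X_i) - \mathbb{E} f_i(X_i)\big)\Big| \ge t \Big)
  \;\le\; 2 \exp\!\left( - \frac{2(1-\lambda)\, t^2}{(1+\lambda) \sum_{i=1}^{n} (b_i - a_i)^2} \right).
\]
```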

ICML Conference 2018 Conference Paper

Graphical Nonconvex Optimization via an Adaptive Convex Relaxation

  • Qiang Sun
  • Kean Ming Tan
  • Han Liu 0001
  • Tong Zhang 0001

We consider the problem of learning high-dimensional Gaussian graphical models. The graphical lasso is one of the most popular methods for estimating Gaussian graphical models. However, it does not achieve the oracle rate of convergence. In this paper, we propose a graphical nonconvex optimization approach for optimal estimation in Gaussian graphical models, which we then approximate by a sequence of convex programs. Our proposal is computationally tractable and produces an estimator that achieves the oracle rate of convergence. The statistical error introduced by the sequential approximation using a sequence of convex programs is clearly demonstrated via a contraction property. The proposed methodology is then extended to modeling semiparametric graphical models. We show via numerical studies that the proposed estimator outperforms other popular methods for estimating Gaussian graphical models.
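
A compact Python sketch of the adaptive convex relaxation: solve a sequence of weighted graphical-lasso problems, refreshing the entrywise weights from the derivative of a nonconvex (SCAD) penalty at the previous iterate. The inner proximal-gradient solver, step size, and SCAD choice are simplifying assumptions; a positive-definiteness safeguard and line search are omitted for brevity.

```python
import numpy as np

# Each stage solves a weighted graphical lasso
#   min_Theta  tr(S @ Theta) - logdet(Theta) + sum_jk W_jk * |Theta_jk|,
# then refreshes W from the SCAD derivative at the current iterate, so the
# sequence of convex programs mimics the nonconvex penalty.

def scad_deriv(t, lam, a=3.7):
    return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1))

def weighted_glasso(S, W, eta=0.01, n_iter=500):
    """Proximal gradient; small eta stands in for a PD safeguard/line search."""
    Theta = np.eye(S.shape[0])
    for _ in range(n_iter):
        G = S - np.linalg.inv(Theta)              # gradient of the smooth part
        Z = Theta - eta * G
        Theta = np.sign(Z) * np.maximum(np.abs(Z) - eta * W, 0.0)
        np.fill_diagonal(Theta, np.diag(Z))       # leave the diagonal unpenalized
    return Theta

def adaptive_relaxation(S, lam, n_stages=3):
    W = np.full(S.shape, lam)                     # stage 1 = ordinary graphical lasso
    for _ in range(n_stages):
        Theta = weighted_glasso(S, W)
        W = scad_deriv(np.abs(Theta), lam)        # reweight from the iterate
    return Theta
```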

AAAI Conference 2006 Conference Paper

Explanation-Based Learning for Image Understanding

  • Qiang Sun

Existing prior domain knowledge represents a valuable source of information for image interpretation problems such as classifying handwritten characters. Such domain knowledge must be translated into a form understandable by the learner. Translation can be realized with Explanation-Based Learning (EBL), which provides a kind of dynamic inductive bias, combining domain knowledge and training examples. The dynamic bias formed by the interaction of domain knowledge with training examples can yield solution knowledge of potentially higher quality than can be anticipated by the static bias designer without seeing training examples. We detail how EBL can be used to dynamically integrate domain knowledge, training examples, and the learning mechanism, and describe the two EBL approaches in (Sun & DeJong 2005a) and (Sun & DeJong 2005b).

ICML Conference 2005 Conference Paper

Explanation-Augmented SVM: an approach to incorporating domain knowledge into SVM learning

  • Qiang Sun
  • Gerald DeJong

We introduce a novel approach to incorporating domain knowledge into Support Vector Machines to improve their example efficiency. Domain knowledge is used in an Explanation Based Learning fashion to build justifications or explanations for why the training examples are assigned their given class labels. Explanations bias the large margin classifier through the interaction of training examples and domain knowledge. We develop a new learning algorithm for this Explanation-Augmented SVM (EA-SVM). It naturally extends to imperfect knowledge, a stumbling block to conventional EBL. Experimental results confirm desirable properties predicted by the analysis and demonstrate the approach on three domains.