Arrow Research search

Author name cluster

Minnan Luo

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

16 papers
2 author rows

Possible papers

16

AAAI Conference 2026 Conference Paper

Bot Meets Shortcut: How Can LLMs Aid in Handling Unknown Invariance OOD Scenarios?

  • Shiyan Zheng
  • Herun Wan
  • Minnan Luo
  • Junhang Huang

While existing social bot detectors perform well on benchmarks, their robustness across diverse real-world scenarios remains limited due to unclear ground truth and varied misleading cues. In particular, the impact of shortcut learning, where models rely on spurious correlations instead of capturing causal task-relevant features, has received limited attention. To address this gap, we conduct an in-depth study to assess how detectors are influenced by potential shortcuts based on textual features, which are most susceptible to manipulation by social bots. We design a series of shortcut scenarios by constructing spurious associations between user labels and superficial textual cues to evaluate model robustness. Results show that shifts in irrelevant feature distributions significantly degrade social bot detector performance, with an average relative accuracy drop of 32% in the baseline models. To tackle this challenge, we propose mitigation strategies based on large language models, leveraging counterfactual data augmentation. These methods mitigate the problem from data and model perspectives across three levels, including data distribution at both the individual user text and overall dataset levels, as well as the model's ability to extract causal information. Our strategies achieve an average relative performance improvement of 56% under shortcut scenarios.

AAAI Conference 2026 Conference Paper

Correspondence Coverage Matters for Multi-Modal Dataset Distillation

  • Zhuohang Dang
  • Minnan Luo
  • Chengyou Jia
  • Hangwei Qian
  • Xinyu Zhang
  • Xiaojun Chang
  • Ivor Tsang

Multi-modal dataset distillation (DD) condenses large datasets into compact ones that retain task efficacy by capturing correspondence patterns, i.e., shared semantics between paired modalities. However, such patterns rely on cross-modal similarity and cannot be faithfully captured by intra-modal similarity of current unimodal strategies. As a result, current multi-modal DD methods tend to over-concentrate, redundantly encoding similar correspondence patterns and thus limiting generalizability. To this end, we propose a novel multi-modal DD framework to systematically Promote Correspondence coverage, i.e., ProCo. Initially, we develop a correspondence consistency metric based on cross-modal retrieval distributions to cluster correspondence patterns. These clusters capture the underlying correspondence distribution, enabling ProCo to initialize distilled data with representative patterns while regularizing optimization to promote correspondence representativeness and diversity. Moreover, we employ conditional neural fields for efficient distilled data parameterization, enhancing fine-grained pattern capture while allowing more distilled data under a fixed budget to boost correspondence coverage. Extensive experiments verify that our ProCo achieves superior and elastic budget-efficacy trade-offs, surpassing prior methods by over 15% with 10x distillation budget reduction, highlighting its real-world practicality.

AAAI Conference 2025 Conference Paper

Each Fake News Is Fake in Its Own Way: An Attribution Multi-Granularity Benchmark for Multimodal Fake News Detection

  • Hao Guo
  • Zihan Ma
  • Zhi Zeng
  • Minnan Luo
  • Weixin Zeng
  • Jiuyang Tang
  • Xiang Zhao

Social platforms, while facilitating access to information, have also become saturated with a plethora of fake news, resulting in negative consequences. Automatic multimodal fake news detection is a worthwhile pursuit. Existing multimodal fake news datasets only provide binary labels of real or fake. However, real news is alike, while each fake news is fake in its own way. These datasets fail to reflect the mixed nature of various types of multimodal fake news. To bridge the gap, we construct AMG, an attribution multi-granularity multimodal fake news detection dataset that reveals the inherent fake patterns. Furthermore, we propose MGCA, a multi-granularity clue alignment model, to achieve multimodal fake news detection and attribution. Experimental results demonstrate that AMG is a challenging dataset, and its attribution setting opens up new avenues for future research.

NeurIPS Conference 2025 Conference Paper

Rethinking Verification for LLM Code Generation: From Generation to Testing

  • Zihan Ma
  • Taolin Zhang
  • Junnan Liu
  • Wenwei Zhang
  • Minnan Luo
  • Songyang Zhang
  • Kai Chen

Large language models (LLMs) have recently achieved notable success in code-generation benchmarks such as HumanEval and LiveCodeBench. However, a detailed examination reveals that these evaluation suites often comprise only a limited number of homogeneous test cases, resulting in subtle faults going undetected. This not only artificially inflates measured performance but also compromises accurate reward estimation in reinforcement learning frameworks utilizing verifiable rewards (RLVR). To address these critical shortcomings, we systematically investigate the test-case generation (TCG) task by proposing multi-dimensional metrics designed to rigorously quantify test-suite thoroughness. Furthermore, we introduce a human-LLM collaborative method (SAGA), leveraging human programming expertise with LLM reasoning capability, aimed at significantly enhancing both the coverage and the quality of generated test cases. In addition, we develop TCGBench to facilitate the study of the TCG task. Experiments show that SAGA achieves a detection rate of 90.62% and a verifier accuracy of 32.58% on TCGBench. The verifier accuracy (Verifier Acc) of the code generation evaluation benchmark synthesized by SAGA is 10.78% higher than that of LiveCodeBench-v6. These results demonstrate the effectiveness of our proposed method. We hope this work contributes to building a scalable foundation for reliable LLM code evaluation, further advancing RLVR in code generation, and paving the way for automated adversarial test synthesis and adaptive benchmark integration.
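For context on the abstract's "detection rate" metric, a standard way to score a test suite in test-case generation is the fraction of known-buggy candidate solutions that the suite rejects. A minimal illustrative sketch (the function name, toy task, and data are hypothetical, not taken from SAGA):

```python
def detection_rate(test_suite, buggy_solutions):
    """Fraction of buggy solutions caught by at least one failing test.

    test_suite: list of (input, expected_output) pairs.
    buggy_solutions: list of callables known to be incorrect.
    """
    caught = 0
    for solve in buggy_solutions:
        # A bug is "detected" if any test case disagrees with its output.
        if any(solve(inp) != expected for inp, expected in test_suite):
            caught += 1
    return caught / len(buggy_solutions)

# Toy task: absolute value. Two buggy implementations.
suite = [(3, 3), (-4, 4), (0, 0)]
bugs = [lambda x: x,          # misses negative inputs
        lambda x: max(x, 0)]  # clamps instead of taking abs
rate = detection_rate(suite, bugs)  # both bugs fail on x = -4, so rate = 1.0
```

Dropping the negative-input case `(-4, 4)` from the suite lets both bugs pass, which is exactly the homogeneous-test-case failure mode the abstract describes.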

NeurIPS Conference 2025 Conference Paper

Truth over Tricks: Measuring and Mitigating Shortcut Learning in Misinformation Detection

  • Herun Wan
  • Jiaying Wu
  • Minnan Luo
  • Zhi Zeng
  • Zhixiong Su

Misinformation detectors often rely on superficial cues (i.e., shortcuts) that correlate with misinformation in training data but fail to generalize to the diverse and evolving nature of real-world misinformation. This issue is exacerbated by large language models (LLMs), which can easily generate convincing misinformation using simple prompts. We introduce TruthOverTricks, a unified evaluation paradigm for measuring shortcut learning in misinformation detection. TruthOverTricks categorizes shortcut behaviors into intrinsic shortcut induction and extrinsic shortcut injection, and evaluates seven representative detectors across 14 popular benchmarks, along with two new factual misinformation datasets, NQ-Misinfo and Streaming-Misinfo. Empirical results reveal that existing detectors suffer severe performance degradation when exposed to both naturally occurring and adversarially crafted shortcuts. To address this, we propose the Shortcut Mitigation Framework (SMF), an LLM-augmented data augmentation framework that mitigates shortcut reliance through paraphrasing, factual summarization, and sentiment normalization. SMF consistently enhances robustness across 16 benchmarks, forcing models to rely on deeper semantic understanding rather than shortcut cues.

ICLR Conference 2024 Conference Paper

Adversarial Attacks on Fairness of Graph Neural Networks

  • Binchi Zhang
  • Yushun Dong
  • Chen Chen 0022
  • Yada Zhu
  • Minnan Luo
  • Jundong Li

Fairness-aware graph neural networks (GNNs) have gained a surge of attention as they can reduce the bias of predictions on any demographic group (e.g., female) in graph-based applications. Although these methods greatly improve the algorithmic fairness of GNNs, the fairness can be easily corrupted by carefully designed adversarial attacks. In this paper, we investigate the problem of adversarial attacks on fairness of GNNs and propose G-FairAttack, a general framework for attacking various types of fairness-aware GNNs in terms of fairness with an unnoticeable effect on prediction utility. In addition, we propose a fast computation technique to reduce the time complexity of G-FairAttack. The experimental study demonstrates that G-FairAttack successfully corrupts the fairness of different types of GNNs while keeping the attack unnoticeable. Our study on fairness attacks sheds light on potential vulnerabilities in fairness-aware GNNs and guides further research on the robustness of GNNs in terms of fairness.

ICLR Conference 2024 Conference Paper

Masked Distillation Advances Self-Supervised Transformer Architecture Search

  • Caixia Yan
  • Xiaojun Chang
  • Zhihui Li 0001
  • Lina Yao 0001
  • Minnan Luo
  • Qinghua Zheng

Transformer architecture search (TAS) has achieved remarkable progress in automating the neural architecture design process of vision transformers. Recent TAS advancements have discovered outstanding transformer architectures while saving tremendous labor from human experts. However, it is still cumbersome to deploy these methods in real-world applications due to the expensive costs of data labeling under the supervised learning paradigm. To this end, this paper proposes a masked image modelling (MIM) based self-supervised neural architecture search method specifically designed for vision transformers, termed MaskTAS, which completely avoids the expensive costs of data labeling inherited from supervised learning. Based on the one-shot NAS framework, MaskTAS needs to train various weight-sharing subnets, which can easily diverge without strong supervision in MIM-based self-supervised learning. To address this issue, we design the search space of MaskTAS as a Siamese teacher-student architecture to distill knowledge from pre-trained networks, allowing for efficient training of the transformer supernet. To achieve self-supervised transformer architecture search, we further design a novel unsupervised evaluation metric for the evolutionary search algorithm, where each candidate of the student branch is rated by measuring its consistency with the larger teacher network. Extensive experiments demonstrate that the searched architectures can achieve state-of-the-art accuracy on CIFAR-10, CIFAR-100, and ImageNet datasets even without using manual labels. Moreover, the proposed MaskTAS can generalize well to various data domains and tasks by searching specialized transformer architectures in a self-supervised manner.

AAAI Conference 2024 Conference Paper

Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation

  • Zhuohang Dang
  • Minnan Luo
  • Chengyou Jia
  • Guang Dai
  • Xiaojun Chang
  • Jingdong Wang

Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to build in practice. Recently, to alleviate expensive data collection, co-occurring pairs from the Internet are automatically harvested for training. However, this inevitably includes mismatched pairs, i.e., noisy correspondences, undermining supervision reliability and degrading performance. Current methods leverage the memorization effect of deep neural networks to address noisy correspondences; they overconfidently focus on similarity-guided training with hard negatives and suffer from self-reinforcing errors. In light of the above, we introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM). Specifically, by viewing sample matching as classification tasks within the batch, we generate classification logits for the given sample. Instead of a single similarity score, we refine sample filtration through energy uncertainty and estimate the model's sensitivity of selected clean samples using swapped classification entropy, in view of the overall prediction distribution. Additionally, we propose cross-modal biased complementary learning to leverage negative matches overlooked in hard-negative training, further improving model optimization stability and curbing self-reinforcing errors. Extensive experiments on challenging benchmarks affirm the efficacy and efficiency of SREM.
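The "energy uncertainty" mentioned in the abstract refers to an energy score computed from classification logits. A common formulation (shown here as an illustrative sketch, not necessarily SREM's exact filtering rule) is E(x) = -log Σⱼ exp(logitⱼ), where lower energy indicates a more peaked, confident prediction and thus a likely-clean pair:

```python
import numpy as np

def energy_score(logits):
    """Free energy of a logit vector: E = -log sum_j exp(logit_j).

    Low energy  -> peaked logits, confident match (likely clean pair).
    High energy -> flat logits, uncertain prediction (candidate noisy pair).
    """
    logits = np.asarray(logits, dtype=float)
    m = logits.max()  # shift by the max to keep log-sum-exp numerically stable
    return -(m + np.log(np.exp(logits - m).sum()))

confident = energy_score([8.0, 0.1, -1.0])  # peaked logits -> low energy
uncertain = energy_score([0.3, 0.2, 0.1])   # flat logits   -> higher energy
```

Ranking in-batch pairs by this score and keeping the low-energy ones is one simple way to filter noisy correspondences from the overall prediction distribution rather than from a single similarity value.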

AAAI Conference 2024 Conference Paper

SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-Form Layout-to-Image Generation

  • Chengyou Jia
  • Minnan Luo
  • Zhuohang Dang
  • Guang Dai
  • Xiaojun Chang
  • Mengmeng Wang
  • Jingdong Wang

Despite significant progress in Text-to-Image (T2I) generative models, even lengthy and complex text descriptions still struggle to convey detailed controls. In contrast, Layout-to-Image (L2I) generation, aiming to generate realistic and complex scene images from user-specified layouts, has risen to prominence. However, existing methods transform layout information into tokens or RGB images for conditional control in the generative process, leading to insufficient spatial and semantic controllability of individual instances. To address these limitations, we propose a novel Spatial-Semantic Map Guided (SSMG) diffusion model that adopts the feature map, derived from the layout, as guidance. Owing to rich spatial and semantic information encapsulated in well-designed feature maps, SSMG achieves superior generation quality with sufficient spatial and semantic controllability compared to previous works. Additionally, we propose the Relation-Sensitive Attention (RSA) and Location-Sensitive Attention (LSA) mechanisms. The former aims to model the relationships among multiple objects within scenes while the latter is designed to heighten the model's sensitivity to the spatial information embedded in the guidance. Extensive experiments demonstrate that SSMG achieves highly promising results, setting a new state-of-the-art across a range of metrics encompassing fidelity, diversity, and controllability.

AAAI Conference 2022 Conference Paper

Heterogeneity-Aware Twitter Bot Detection with Relational Graph Transformers

  • Shangbin Feng
  • Zhaoxuan Tan
  • Rui Li
  • Minnan Luo

Twitter bot detection has become an important and challenging task to combat misinformation and protect the integrity of the online discourse. State-of-the-art approaches generally leverage the topological structure of the Twittersphere, while they neglect the heterogeneity of relations and influence among users. In this paper, we propose a novel bot detection framework to alleviate this problem, which leverages the topological structure of user-formed heterogeneous graphs and models varying influence intensity between users. Specifically, we construct a heterogeneous information network with users as nodes and diversified relations as edges. We then propose relational graph transformers to model heterogeneous influence between users and learn node representations. Finally, we use semantic attention networks to aggregate messages across users and relations and conduct heterogeneity-aware Twitter bot detection. Extensive experiments demonstrate that our proposal outperforms state-of-the-art methods on a comprehensive Twitter bot detection benchmark. Additional studies also bear out the effectiveness of our proposed relational graph transformers, semantic attention networks and the graph-based approach in general.

NeurIPS Conference 2022 Conference Paper

TwiBot-22: Towards Graph-Based Twitter Bot Detection

  • Shangbin Feng
  • Zhaoxuan Tan
  • Herun Wan
  • Ningnan Wang
  • Zilong Chen
  • Binchi Zhang
  • Qinghua Zheng
  • Wenqian Zhang

Twitter bot detection has become an increasingly important task to combat misinformation, facilitate social media moderation, and preserve the integrity of the online discourse. State-of-the-art bot detection methods generally leverage the graph structure of the Twitter network, and they exhibit promising performance when confronting novel Twitter bots that traditional methods fail to detect. However, very few of the existing Twitter bot detection datasets are graph-based, and even these few graph-based datasets suffer from limited dataset scale, incomplete graph structure, as well as low annotation quality. In fact, the lack of a large-scale graph-based Twitter bot detection benchmark that addresses these issues has seriously hindered the development and evaluation of novel graph-based bot detection approaches. In this paper, we propose TwiBot-22, a comprehensive graph-based Twitter bot detection benchmark that presents the largest dataset to date, provides diversified entities and relations on the Twitter network, and has considerably better annotation quality than existing datasets. In addition, we re-implement 35 representative Twitter bot detection baselines and evaluate them on 9 datasets, including TwiBot-22, to promote a fair comparison of model performance and a holistic understanding of research progress. To facilitate further research, we consolidate all implemented codes and datasets into the TwiBot-22 evaluation framework, where researchers can consistently evaluate new models and datasets. The TwiBot-22 Twitter bot detection benchmark and evaluation framework are publicly available at https://twibot22.github.io/.

TIST Journal 2020 Journal Article

Self-weighted Robust LDA for Multiclass Classification with Edge Classes

  • Caixia Yan
  • Xiaojun Chang
  • Minnan Luo
  • Qinghua Zheng
  • Xiaoqin Zhang
  • Zhihui Li
  • Feiping Nie

Linear discriminant analysis (LDA) is a popular technique to learn the most discriminative features for multi-class classification. A vast majority of existing LDA algorithms are prone to be dominated by the class with very large deviation from the others, i.e., the edge class, which occurs frequently in multi-class classification. First, the existence of edge classes often makes the total mean biased in the calculation of the between-class scatter matrix. Second, the exploitation of the ℓ2-norm based between-class distance criterion magnifies the extremely large distance corresponding to the edge class. In this regard, a novel self-weighted robust LDA with an ℓ2,1-norm based pairwise between-class distance criterion, called SWRLDA, is proposed for multi-class classification especially with edge classes. SWRLDA can automatically avoid the optimal mean calculation and simultaneously learn adaptive weights for each class pair without setting any additional parameter. An efficient re-weighted algorithm is exploited to derive the global optimum of the challenging ℓ2,1-norm maximization problem. The proposed SWRLDA is easy to implement and converges fast in practice. Extensive experiments demonstrate that SWRLDA performs favorably against other compared methods on both synthetic and real-world datasets while presenting superior computational efficiency in comparison with other techniques.
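To make the abstract's criterion concrete, one plausible form of an ℓ2,1-style pairwise between-class objective (a sketch of the general idea, not necessarily SWRLDA's exact formulation) replaces the squared distances of classical LDA with unsquared ℓ2 norms over class-mean pairs, so no single edge-class pair can dominate the sum:

```latex
\max_{W^{\top} W = I} \; \sum_{i < j} \left\lVert W^{\top} (\mu_i - \mu_j) \right\rVert_2
```

Here $\mu_i$ denotes the mean of class $i$ and $W$ the projection matrix. The classical ℓ2 criterion sums $\lVert W^{\top}(\mu_i - \mu_j)\rVert_2^2$, whose squaring magnifies the already-large edge-class distances, which is precisely the bias the abstract describes.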

IJCAI Conference 2018 Conference Paper

ANOMALOUS: A Joint Modeling Approach for Anomaly Detection on Attributed Networks

  • Zhen Peng
  • Minnan Luo
  • Jundong Li
  • Huan Liu
  • Qinghua Zheng

The key point of anomaly detection on attributed networks lies in the seamless integration of network structure information and attribute information. A vast majority of existing works are mainly based on the Homophily assumption that implies the nodal attribute similarity of connected nodes. Nonetheless, this assumption is untenable in practice as the existence of noisy and structurally irrelevant attributes may adversely affect the anomaly detection performance. Despite the fact that recent attempts perform subspace selection to address this issue, these algorithms treat subspace selection and anomaly detection as two separate steps which often leads to suboptimal solutions. In this paper, we investigate how to fuse attribute and network structure information more synergistically to avoid the adverse effects brought by noisy and structurally irrelevant attributes. Methodologically, we propose a novel joint framework to conduct attribute selection and anomaly detection as a whole based on CUR decomposition and residual analysis. By filtering out noisy and irrelevant node attributes, we perform anomaly detection with the remaining representative attributes. Experimental results on both synthetic and real-world datasets corroborate the effectiveness of the proposed framework.

IJCAI Conference 2017 Conference Paper

Adaptive Semi-Supervised Learning with Discriminative Least Squares Regression

  • Minnan Luo
  • Lingling Zhang
  • Feiping Nie
  • Xiaojun Chang
  • Buyue Qian
  • Qinghua Zheng

Semi-supervised learning plays a significant role in multi-class classification, where a small number of labeled data are more deterministic while substantial unlabeled data might cause large uncertainties and potential threats. In this paper, we distinguish the label fitting of labeled and unlabeled training data through a probabilistic vector with an adaptive parameter, which always ensures the significant importance of labeled data and characterizes the contribution of each unlabeled instance according to its uncertainty. Instead of using traditional least squares regression (LSR) for classification, we develop a new discriminative LSR by equipping each label with an adjustment vector. This strategy avoids incorrect penalization on samples that are far away from the boundary and simultaneously facilitates multi-class classification by enlarging the geometrical distance of instances belonging to different classes. An efficient alternating algorithm is exploited to solve the proposed model with a closed-form solution for each updating rule. We also analyze the convergence and complexity of the proposed algorithm theoretically. Experimental results on several benchmark datasets demonstrate the effectiveness and superiority of the proposed model for multi-class classification tasks.

IJCAI Conference 2017 Conference Paper

How Unlabeled Web Videos Help Complex Event Detection?

  • Huan Liu
  • Qinghua Zheng
  • Minnan Luo
  • Dingwen Zhang
  • Xiaojun Chang
  • Cheng Deng

The lack of labeled exemplars is an important factor that makes the task of multimedia event detection (MED) complicated and challenging. Utilizing artificially picked and labeled external sources is an effective way to enhance the performance of MED. However, building these data usually requires professional human annotators, and the procedure is too time-consuming and costly to scale. In this paper, we propose a new robust dictionary learning framework for complex event detection, which is able to handle both labeled and easy-to-get unlabeled web videos by sharing the same dictionary. By employing the ℓq-norm based loss jointly with the structured sparsity based regularization, our model shows strong robustness against the substantial noise and outlier videos from open sources. We exploit an effective optimization algorithm to solve the proposed highly non-smooth and non-convex problem. Extensive experiment results over the standard datasets of TRECVID MEDTest 2013 and TRECVID MEDTest 2014 demonstrate the effectiveness and superiority of the proposed framework on complex event detection.

AAAI Conference 2017 Conference Paper

Probabilistic Non-Negative Matrix Factorization and Its Robust Extensions for Topic Modeling

  • Minnan Luo
  • Feiping Nie
  • Xiaojun Chang
  • Yi Yang
  • Alexander Hauptmann
  • Qinghua Zheng

Traditional topic models with maximum likelihood estimation inevitably suffer from the conditional independence of words given the document's topic distribution. In this paper, we follow the generative procedure of topic models and learn the topic-word distribution and topic distribution by directly approximating the word-document co-occurrence matrix with matrix decomposition techniques. These methods include: (1) approximating the normalized document-word conditional distribution with the document probability matrix and word probability matrix based on probabilistic non-negative matrix factorization (NMF); (2) since standard NMF is well known to be non-robust to noise and outliers, extending the probabilistic NMF of the topic model to its robust versions using ℓ2,1-norm and capped ℓ2,1-norm based loss functions, respectively. The proposed framework inherits the explicit probabilistic meaning of factors in topic models and simultaneously makes the conditional independence assumption on words unnecessary. Straightforward and efficient algorithms are exploited to solve the corresponding non-smooth and non-convex problems. Experimental results over several benchmark datasets illustrate the effectiveness and superiority of the proposed methods.
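As background for the factorization the abstract builds on, a minimal sketch of plain squared-error NMF on a column-normalized word-document matrix follows (the classic Lee-Seung multiplicative updates; the paper's probabilistic and ℓ2,1-robust variants modify the loss, and all names here are illustrative):

```python
import numpy as np

def nmf_topic_sketch(X, k, iters=200, seed=0):
    """Approximate a word-document matrix X (words x docs) as W @ H,
    with W >= 0 (word-topic factors) and H >= 0 (topic-document factors),
    via multiplicative updates for the squared-error objective ||X - WH||^2.
    """
    rng = np.random.default_rng(seed)
    n_words, n_docs = X.shape
    W = rng.random((n_words, k)) + 1e-3
    H = rng.random((k, n_docs)) + 1e-3
    for _ in range(iters):
        # Multiplicative updates keep every entry non-negative by construction.
        H *= (W.T @ X) / (W.T @ W @ H + 1e-12)
        W *= (X @ H.T) / (W @ H @ H.T + 1e-12)
    return W, H

# Toy corpus: 4 words, 6 documents, 2 latent topics.
rng = np.random.default_rng(1)
X = rng.random((4, 6))
X /= X.sum(axis=0, keepdims=True)  # normalize each document to a distribution
W, H = nmf_topic_sketch(X, k=2)
err = np.linalg.norm(X - W @ H)    # reconstruction error of the rank-2 fit
```

Normalizing the factors column-wise afterwards recovers the probabilistic reading the abstract mentions: W's columns become word distributions per topic and H's columns become topic mixtures per document.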