Author name cluster

Li Su

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

13 papers

1 author row

YNIMG Journal 2026 Journal Article

Converse or reverse? Machine-learning modeling for disease progression: A study based on Alzheimer’s disease continuum cohort

Yujing Huang
Hao Zhang
Buqing Ma
Zhe Yu
Shenyi Dai
Lu Cheng
Li Su
Gaoyi Yang

Details DOI

AAAI Conference 2026 Conference Paper

STaR: Sensitive Trajectory Regulation for Unlearning in Large Reasoning Models

Jingjing Zhou
Gaoxiang Cong
Li Su
Liang Li

Large Reasoning Models (LRMs) have advanced automated multi-step reasoning, but their ability to generate complex Chain-of-Thought (CoT) trajectories introduces severe privacy risks, as sensitive information may be deeply embedded throughout the reasoning process. Existing Large Language Models (LLMs) unlearning approaches that typically focus on modifying only final answers are insufficient for LRMs, as they fail to remove sensitive content from intermediate steps, leading to persistent privacy leakage and degraded security. To address these challenges, we propose Sensitive Trajectory Regulation (STaR), a parameter-free, inference-time unlearning framework that achieves robust privacy protection throughout the reasoning process. Specifically, we first identify sensitive content via semantic-aware detection. Then, we inject global safety constraints through secure prompt encoder. Next, we perform trajectory-aware suppression to dynamically block sensitive content across the entire reasoning chain. Finally, we apply token-level adaptive filtering to prevent both exact and paraphrased sensitive tokens during generation. Furthermore, to overcome the inadequacies of existing evaluation protocols, we introduce two metrics: Multi-Decoding Consistency Assessment (MCS), which measures the consistency of unlearning across diverse decoding strategies, and Multi-Granularity Membership Inference Attack (MIA) Evaluation, which quantifies privacy protection at both answer and reasoning-chain levels. Experiments on the R-TOFU benchmark demonstrate that STaR achieves comprehensive and stable unlearning with minimal utility loss, setting a new standard for privacy-preserving reasoning in LRMs.

PDF Details DOI

NeurIPS Conference 2025 Conference Paper

Generalizing Single-Frame Supervision to Event-Level Understanding for Video Anomaly Detection

Junxi Chen
Liang Li
Yunbin Tu
Li Su
Zhe Xue
Qingming Huang

Video Anomaly Detection (VAD) aims to identify abnormal frames from discrete events within video sequences. Existing VAD methods suffer from heavy annotation burdens in fully-supervised paradigm, insensitivity to subtle anomalies in semi-supervised paradigm, and vulnerability to noise in weakly-supervised paradigm. To address these limitations, we propose a novel paradigm: Single-Frame supervised VAD (SF-VAD), which uses a single annotated abnormal frame per abnormal video. SF-VAD ensures annotation efficiency while offering precise anomaly reference, facilitating robust anomaly modeling, and enhancing the detection of subtle anomalies in complex visual contexts. To validate its effectiveness, we construct three SF-VAD benchmarks by manually re-annotating the ShanghaiTech, UCF-Crime, and XD-Violence datasets in a practical procedure. Further, we devise Frame-guided Progressive Learning (FPL), to generalize sparse frame supervision to event-level anomaly understanding. FPL first leverages evidential learning to estimate anomaly relevance guided by annotated frames. Then it extends anomaly supervision by mining discrete abnormal events based on anomaly relevance and feature similarity. Meanwhile, FPL decouples normal patterns by isolating distinct normal frames outside abnormal events, reducing false alarms. Extensive experiments show SF-VAD achieves state-of-the-art detection results while offering a favorable trade-off between performance and annotation cost.

PDF Details

AAAI Conference 2025 Conference Paper

Query-centric Audio-Visual Cognition Network for Moment Retrieval, Segmentation and Step-Captioning

Yunbin Tu
Liang Li
Li Su
Qingming Huang

Video has emerged as a favored multimedia format on the internet. To better gain video contents, a new topic HIREST is presented, including video retrieval, moment retrieval, moment segmentation, and step-captioning. The pioneering work chooses the pre-trained CLIP-based model for video retrieval, and leverages it as a feature extractor for other three challenging tasks solved in a multi-task learning paradigm. Nevertheless, this work struggles to learn the comprehensive cognition of user-preferred content, due to disregarding the hierarchies and association relations across modalities. In this paper, guided by the shallow-to-deep principle, we propose a query-centric audio-visual cognition (QUAG) network to construct a reliable multi-modal representation for moment retrieval, segmentation and step-captioning. Specifically, we first design the modality-synergistic perception to obtain rich audio-visual content, by modeling global contrastive alignment and local fine-grained interaction between visual and audio modalities. Then, we devise the query-centric cognition that uses the deep-level query to perform the temporal-channel filtration on the shallow-level audio-visual representation. This can cognize user-preferred content and thus attain a query-centric audio-visual representation for three tasks. Extensive experiments show QUAG achieves the SOTA results on HIREST. Further, we test QUAG on the query-based video summarization task and verify its good generalization.

PDF Details DOI

NeurIPS Conference 2024 Conference Paper

Leveraging Catastrophic Forgetting to Develop Safe Diffusion Models against Malicious Finetuning

Jiadong Pan
Hongcheng Gao
Zongyu Wu
Taihang Hu
Li Su
Qingming Huang
Liang Li

Diffusion models (DMs) have demonstrated remarkable proficiency in producing images based on textual prompts. Numerous methods have been proposed to ensure these models generate safe images. Early methods attempt to incorporate safety filters into models to mitigate the risk of generating harmful images but such external filters do not inherently detoxify the model and can be easily bypassed. Hence, model unlearning and data cleaning are the most essential methods for maintaining the safety of models, given their impact on model parameters. However, malicious fine-tuning can still make models prone to generating harmful or undesirable images even with these methods. Inspired by the phenomenon of catastrophic forgetting, we propose a training policy using contrastive learning to increase the latent space distance between clean and harmful data distribution, thereby protecting models from being fine-tuned to generate harmful images due to forgetting. The experimental results demonstrate that our methods not only maintain clean image generation capabilities before malicious fine-tuning but also effectively prevent DMs from producing harmful images after malicious fine-tuning. Our method can also be combined with other safety methods to maintain their safety against malicious fine-tuning further.

PDF Details DOI

IJCAI Conference 2022 Conference Paper

A Sparse-Motif Ensemble Graph Convolutional Network against Over-smoothing

Xuan Jiang
Zhiyong Yang
Peisong Wen
Li Su
Qingming Huang

The over-smoothing issue is a well-known challenge for Graph Convolutional Networks (GCN). Specifically, it is often observed that increasing the depth of GCN ends up in a trivial embedding subspace where the difference among node embeddings belonging to the same cluster tends to vanish. This paper believes that the main cause lies in the limited diversity along the message passing pipeline. Inspired by this, we propose a Sparse-Motif Ensemble Graph Convolutional Network (SMEGCN). We argue that merely employing the original graph Laplacian as the spectrum of the graph cannot capture the diversified local structure of complex graphs. Hence, to improve the diversity of the graph spectrum, we introduce local topological structures of complex graphs into GCN by employing the so-called graph motifs or the small network subgraphs. Moreover, we find that the motif connections are much denser than the edge connections, which might converge to an all-one matrix within a few times of message-passing. To fix this, we first propose the notion of sparse motif to avoid spurious motif connections. Subsequently, we propose a hierarchical motif aggregation mechanism to integrate the graph spectral information from a series of different sparse-motif message passing paths. Finally, we conduct a series of theoretical and experimental analyses to demonstrate the superiority of the proposed method.

PDF Details DOI

YNIMG Journal 2021 Journal Article

Proximity to dementia onset and multi-modal neuroimaging changes: The prevent-dementia study

Elijah Mak
Maria-Eleni Dounavi
Audrey Low
Stephen F. Carter
Elizabeth McKiernan
Guy B Williams
P Simon Jones
Isabelle Carriere

Details DOI

YNICL Journal 2020 Journal Article

Correlation of microglial activation with white matter changes in dementia with Lewy bodies

Nicolas Nicastro
Elijah Mak
Guy B. Williams
Ajenthan Surendranathan
W Richard Bevan-Jones
Luca Passamonti
Patricia Vàzquez Rodrìguez
Li Su

Details DOI

AAAI Conference 2019 Conference Paper

Learning Attribute-Specific Representations for Visual Tracking

Yuankai Qi
Shengping Zhang
Weigang Zhang
Li Su
Qingming Huang
Ming-Hsuan Yang

In recent years, convolutional neural networks (CNNs) have achieved great success in visual tracking. Most of existing methods train or fine-tune a binary classifier to distinguish the target from its background. However, they may suffer from the performance degradation due to insufficient training data. In this paper, we show that attribute information (e. g. , illumination changes, occlusion and motion) in the context facilitates training an effective classifier for visual tracking. In particular, we design an attribute-based CNN with multiple branches, where each branch is responsible for classifying the target under a specific attribute. Such a design reduces the appearance diversity of the target under each attribute and thus requires less data to train the model. We combine all attributespecific features via ensemble layers to obtain more discriminative representations for the final target/background classification. The proposed method achieves favorable performance on the OTB100 dataset compared to state-of-the-art tracking methods. After being trained on the VOT datasets, the proposed network also shows a good generalization ability on the UAV-Traffic dataset, which has significantly different attributes and target appearances with the VOT datasets.

PDF Details

YNIMG Journal 2019 Journal Article

Normative pathways in the functional connectome

Matthew Leming
Li Su
Shayanti Chattopadhyay
John Suckling

Details DOI

AAAI Conference 2019 Conference Paper

Play as You Like: Timbre-Enhanced Multi-Modal Music Style Transfer

Chien-Yu Lu
Min-Xin Xue
Chia-Che Chang
Che-Rung Lee
Li Su

Style transfer of polyphonic music recordings is a challenging task when considering the modeling of diverse, imaginative, and reasonable music pieces in the style different from their original one. To achieve this, learning stable multi-modal representations for both domain-variant (i. e. , style) and domaininvariant (i. e. , content) information of music in an unsupervised manner is critical. In this paper, we propose an unsupervised music style transfer method without the need for parallel data. Besides, to characterize the multi-modal distribution of music pieces, we employ the Multi-modal Unsupervised Image-to-Image Translation (MUNIT) framework in the proposed system. This allows one to generate diverse outputs from the learned latent distributions representing contents and styles. Moreover, to better capture the granularity of sound, such as the perceptual dimensions of timbre and the nuance in instrument-specific performance, cognitively plausible features including mel-frequency cepstral coefficients (MFCC), spectral difference, and spectral envelope, are combined with the widely-used mel-spectrogram into a timbreenhanced multi-channel input representation. The Relativistic average Generative Adversarial Networks (RaGAN) is also utilized to achieve fast convergence and high stability. We conduct experiments on bilateral style transfer tasks among three different genres, namely piano solo, guitar solo, and string quartet. Results demonstrate the advantages of the proposed method in music style transfer with improved sound quality and in allowing users to manipulate the output.

PDF Details

YNICL Journal 2015 Journal Article

Longitudinal assessment of global and regional atrophy rates in Alzheimer's disease and dementia with Lewy bodies

Elijah Mak
Li Su
Guy B. Williams
Rosie Watson
Michael Firbank
Andrew M. Blamire
John T. O'Brien

Details DOI

YNIMG Journal 2014 Journal Article

Self-regulation of the anterior insula: Reinforcement learning using real-time fMRI neurofeedback

Emma J. Lawrence
Li Su
Gareth J. Barker
Nick Medford
Jeffrey Dalton
Steve C.R. Williams
Niels Birbaumer
Ralf Veit

Details DOI