Author name cluster

Sen Su

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers

2 author rows

AAAI Conference 2026 Conference Paper

PrivSV: Differentially Private Steering Vector for Large Language Models

Haocheng Yang
Xiang Cheng
Chenhao Sun
Pengfei Zhang
Sen Su

Steering Vector (SV) is a powerful technique for controlling Large Language Models (LLMs) by manipulating their activations without altering model weights. However, when constructed from sensitive data, SV poses significant privacy risks, as it may leak private information. Existing differential privacy (DP) techniques for constructing SV cannot be directly applied to training-based SV construction paradigms, which offer higher task performance. In this work, we present **PrivSV**, a general privacy-preserving approach for constructing SV with DP guarantees, compatible with arbitrary SV construction paradigms while maintaining high utility. In PrivSV, we propose three novel methods: a Layer-wise Noise-Resilient Reduction (LNR²) method to reduce the injected noise in high-dimensional SV; a Directional Prior Compensation (DPC) method to recover utility degraded by noise perturbation; and a Privacy-Aware Optimal Parameter Determination (POPD) method to adaptively maximize the performance of the final compensated SV. Extensive experiments on open-source LLMs of different families (i.e., Llama, Qwen, Mistral, and Gemma) demonstrate that PrivSV outperforms several existing techniques across various privacy budgets.
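
To make the setting concrete, here is a minimal sketch of how a DP guarantee can be attached to a simple mean-difference steering vector. It uses the classic Gaussian mechanism with per-example clipping, which is a generic baseline and not the LNR², DPC, or POPD methods named in the abstract. The function names, the clipping bound `max_norm`, and the replace-one neighbouring assumption are all illustrative.

```python
import math
import random


def clip_l2(vec, max_norm):
    """Scale vec down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= max_norm or norm == 0.0:
        return list(vec)
    return [v * max_norm / norm for v in vec]


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classic Gaussian mechanism (assumes 0 < epsilon < 1)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def private_mean_vector(diffs, max_norm, epsilon, delta, rng):
    """Noisy mean of per-example activation differences (hypothetical helper)."""
    n = len(diffs)
    dim = len(diffs[0])
    clipped = [clip_l2(d, max_norm) for d in diffs]
    mean = [sum(c[j] for c in clipped) / n for j in range(dim)]
    # Assumption: replacing one example moves the mean by at most 2 * max_norm / n.
    sigma = gaussian_sigma(2.0 * max_norm / n, epsilon, delta)
    return [m + rng.gauss(0.0, sigma) for m in mean]


rng = random.Random(0)
diffs = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(100)]
steering_vector = private_mean_vector(diffs, max_norm=1.0, epsilon=0.5, delta=1e-5, rng=rng)
```

Because the noise is added independently to every coordinate, the total perturbation grows with the vector dimension, which is the problem the abstract's noise-reduction step is meant to address.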


ICML Conference 2025 Conference Paper

Going Deeper into Locally Differentially Private Graph Neural Networks

Longzhu He
Chaozhuo Li
Peng Tang 0002
Sen Su

Graph Neural Networks (GNNs) have demonstrated superior performance in a variety of graph mining and learning tasks. However, when node representations involve sensitive personal information or variables related to individuals, learning from graph data can raise significant privacy concerns. Although recent studies have explored local differential privacy (LDP) to address these concerns, they often introduce significant distortions to graph data, severely degrading private learning utility (e.g., node classification accuracy). In this paper, we present UPGNET, an LDP-based privacy-preserving graph learning framework that enhances utility while protecting user data privacy. Specifically, we propose a three-stage pipeline that generalizes the LDP protocols for node features, targeting privacy-sensitive scenarios. Our analysis identifies two key factors that affect the utility of privacy-preserving graph learning: feature dimension and neighborhood size. Based on this analysis, UPGNET enhances utility by introducing two core layers: the High-Order Aggregator (HOA) layer and the Node Feature Regularization (NFR) layer. Extensive experiments on real-world datasets indicate that UPGNET significantly outperforms existing methods in terms of both privacy protection and learning utility.
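
The two factors named in the abstract, feature dimension and neighborhood size, can be illustrated with a generic Laplace mechanism applied locally to node features. This is a sketch under stated assumptions and not UPGNET's protocol: features are assumed to lie in `[lo, hi]`, the budget `epsilon` is split evenly across dimensions, and all function names are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_features(x, epsilon, rng, lo=0.0, hi=1.0):
    """Each user perturbs their own feature vector before sharing it."""
    d = len(x)
    # Splitting epsilon evenly over d features makes the per-feature noise scale grow with d.
    scale = d * (hi - lo) / epsilon
    return [min(max(v, lo), hi) + laplace_noise(scale, rng) for v in x]


def mean_aggregate(neighbor_features):
    """Averaging over neighbors shrinks the noise variance by the neighborhood size."""
    n = len(neighbor_features)
    d = len(neighbor_features[0])
    return [sum(f[j] for f in neighbor_features) / n for j in range(d)]


rng = random.Random(0)
true_features = [0.2, 0.8]
noisy = [perturb_features(true_features, 1.0, rng) for _ in range(20000)]
aggregate = mean_aggregate(noisy)  # close to true_features for large neighborhoods
```

A single perturbed vector is nearly useless at this budget, while the mean over many neighbors recovers the signal, which is why aggregating over larger (for example, higher-order) neighborhoods helps utility.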


NeurIPS Conference 2025 Conference Paper

LIFEBENCH: Evaluating Length Instruction Following in Large Language Models

Wei Zhang
Zhenhong Zhou
Kun Wang
Junfeng Fang
Rongwu Xu
Yuanhe Zhang
Rui Wang
Ge Zhang

While large language models (LLMs) can solve PhD-level reasoning problems over long context inputs, they still struggle with a seemingly simpler task: following explicit length instructions, e.g., write a 10,000-word novel. Additionally, models often generate far too short outputs, terminate prematurely, or even refuse the request. Existing benchmarks focus primarily on evaluating generation quality, but often overlook whether the generations meet length constraints. To this end, we introduce the Length Instruction Following Evaluation Benchmark (LIFEBench) to comprehensively evaluate LLMs' ability to follow length instructions across diverse tasks and a wide range of specified lengths. LIFEBench consists of 10,800 instances across 4 task categories in both English and Chinese, covering length constraints ranging from 16 to 8192 words. We evaluate 26 widely-used LLMs and find that most models reasonably follow short-length instructions but deteriorate sharply beyond a certain threshold. Surprisingly, almost all models fail to reach the vendor-claimed maximum output lengths in practice, as further confirmed by our evaluations extending up to 32K words. Even long-context LLMs, despite their extended input-output windows, counterintuitively fail to improve length instruction following. Notably, reasoning LLMs outperform even specialized long-text generation models, achieving state-of-the-art length following. Overall, LIFEBench uncovers fundamental limitations in current LLMs' ability to follow length instructions, offering critical insights for future progress.
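
As a rough illustration of what checking a length instruction involves, the sketch below scores an output by its signed relative deviation from the requested word count. The metric, the 10% tolerance, and the whitespace-based word count are assumptions made for this example and are not claimed to be LIFEBench's scoring; whitespace splitting in particular would not work for Chinese text.

```python
def word_count(text):
    """Count whitespace-separated words (an English-only simplification)."""
    return len(text.split())


def length_deviation(output, target_words):
    """Signed relative deviation: negative means too short, positive means too long."""
    return (word_count(output) - target_words) / target_words


def follows_length(output, target_words, tolerance=0.1):
    """True when the output is within the (hypothetical) relative tolerance."""
    return abs(length_deviation(output, target_words)) <= tolerance


draft = "one two three four"
print(length_deviation(draft, 8))   # half as long as requested
print(follows_length(draft, 8))
```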


NeurIPS Conference 2025 Conference Paper

Residual Stream Analysis of Overfitting And Structural Disruptions

Quan Liu
Han Zhou
Wenquan Wu
Hua Wu
Sen Su

Ensuring that large language models (LLMs) remain both helpful and harmless poses a significant challenge: fine-tuning on repetitive safety datasets (where unsafe prompts are paired with standard refusal templates) often leads to false refusals, in which benign queries are declined. We first quantify this effect, showing that safety data exhibits substantially lower token entropy (H_1 ≈ 9.18) and 2-gram diversity (≈ 0.048) compared to general instruction data (H_1 ≈ 12.05, 2-gram ≈ 0.205). To uncover the root cause, we introduce FlowLens, a stable PCA-based tool for residual-stream geometry analysis, and reveal that higher proportions of safety examples concentrate variance along a few components, reducing representational smoothness and driving false refusals (false refusal rate rises from 63% to 84% as safety data increases from 0% to 40%). Guided by these insights, we propose Variance Concentration Loss (VCL), an auxiliary regularizer that penalizes excessive variance concentration in mid-layer residuals. Empirical results demonstrate that VCL reduces false refusals by over 35 percentage points while maintaining or improving performance on general benchmarks such as MMLU and GSM8K.
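
The two diversity statistics quoted in the abstract can be sketched as follows. The abstract does not define them, so this example assumes that the token entropy is the Shannon entropy of the unigram distribution in bits and that 2-gram diversity is the ratio of distinct bigrams to total bigrams; both function names are hypothetical.

```python
import math
from collections import Counter


def unigram_entropy(tokens):
    """Shannon entropy (in bits) of the empirical token distribution."""
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def distinct_ngram_ratio(tokens, n=2):
    """Number of distinct n-grams divided by the total number of n-grams."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


refusals = "I cannot help with that . I cannot help with that .".split()
varied = "Sort the list , then print every second item slowly .".split()
print(unigram_entropy(refusals), distinct_ngram_ratio(refusals))
print(unigram_entropy(varied), distinct_ngram_ratio(varied))
```

Templated refusals repeat the same tokens and bigrams, so both statistics come out lower than for varied instruction text.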


AAAI Conference 2024 Conference Paper

Quantifying and Analyzing Entity-Level Memorization in Large Language Models

Zhenhong Zhou
Jiuyang Xiang
Chaomeng Chen
Sen Su

Large language models (LLMs) have been proven capable of memorizing their training data, which can be extracted through specifically designed prompts. As the scale of datasets continues to grow, privacy risks arising from memorization have attracted increasing attention. Quantifying language model memorization helps evaluate potential privacy risks. However, prior works on quantifying memorization require access to the precise original data or incur substantial computational overhead, making them difficult to apply to real-world language models. To this end, we propose a fine-grained, entity-level definition to quantify memorization with conditions and metrics closer to real-world scenarios. We also present an approach for efficiently extracting sensitive entities from autoregressive language models. We conduct extensive experiments based on the proposed approach, probing language models' ability to reconstruct sensitive entities under different settings. We find that language models have strong memorization at the entity level and are able to reproduce the training data even with partial leakages. The results demonstrate that LLMs not only memorize their training data but also understand associations between entities. These findings necessitate that trainers of LLMs exercise greater prudence regarding model memorization, adopting memorization mitigation techniques to preclude privacy violations.


TIST Journal 2023 Journal Article

MC²: Unsupervised Multiple Social Network Alignment

Li Sun
Zhongbao Zhang
Gen Li
Pengxin Ji
Sen Su
Philip S. Yu

Social network alignment, identifying social accounts of the same individual across different social networks, is of fundamental importance in a wide spectrum of applications, such as link prediction and information diffusion. Individuals more often than not join multiple social networks, and it is in fact much too expensive or even impossible to acquire supervision for guiding the alignment. To the best of our knowledge, few methods in the literature can align multiple social networks without supervision. In this article, we propose to study the problem of unsupervised multiple social network alignment. To address this problem, we propose a novel unsupervised model of joint Matrix factorization with a diagonal Cone under orthogonal Constraint, referred to as MC². Its core idea is to embed and align multiple social networks in the common subspace via an unsupervised approach. Specifically, in the MC² model, we first design a matrix optimization to infer the common subspace from different social networks. To address the nonconvex optimization, we then design an efficient alternating algorithm by leveraging its inherent functional property. Through extensive experiments on real-world datasets, we demonstrate that the proposed MC² model significantly outperforms the state-of-the-art methods.


AAAI Conference 2022 Conference Paper

A Self-Supervised Mixed-Curvature Graph Neural Network

Li Sun
Zhongbao Zhang
Junda Ye
Hao Peng
Jiawei Zhang
Sen Su
Philip S. Yu

Graph representation learning has received increasing attention in recent years. Most of the existing methods ignore the complexity of graph structures and restrict graphs to a single constant-curvature representation space, which is suitable only for particular kinds of graph structure. Additionally, these methods follow the supervised or semi-supervised learning paradigm, which notably limits their deployment on unlabeled graphs in real applications. To address these limitations, we make the first attempt to study self-supervised graph representation learning in mixed-curvature spaces. In this paper, we present a novel Self-Supervised Mixed-Curvature Graph Neural Network (SELFMGNN). To capture the complex graph structures, we construct a mixed-curvature space via the Cartesian product of multiple Riemannian component spaces, and design hierarchical attention mechanisms for learning and fusing graph representations across these component spaces. To enable self-supervised learning, we propose a novel dual contrastive approach. The constructed mixed-curvature space provides multiple Riemannian views for contrastive learning. We introduce a Riemannian projector to reveal these views, and utilize a well-designed Riemannian discriminator for single-view and cross-view contrastive learning within and across the Riemannian views. Finally, extensive experiments show that SELFMGNN captures the complex graph structures and outperforms state-of-the-art baselines.
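
The Cartesian product construction mentioned in the abstract has a simple consequence for distances: the squared distance in the product space is the sum of squared distances in the components. The sketch below illustrates this with three fixed-curvature components (Euclidean, unit sphere, and the Poincaré ball of curvature -1). It is an illustration only; SELFMGNN's actual component models, curvatures, and attention mechanisms are not reproduced here, and all function names are assumptions.

```python
import math


def euclidean_dist(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def spherical_dist(x, y):
    """Geodesic distance on the unit sphere; x and y must have unit norm."""
    dot = sum(a * b for a, b in zip(x, y))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against rounding error


def poincare_dist(x, y):
    """Geodesic distance in the Poincare ball (curvature -1); norms must be below 1."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    nx = sum(a * a for a in x)
    ny = sum(b * b for b in y)
    return math.acosh(1.0 + 2.0 * sq / ((1.0 - nx) * (1.0 - ny)))


def product_dist(x_parts, y_parts, dist_fns):
    """Distance in the product space: root of the summed squared component distances."""
    return math.sqrt(sum(fn(x, y) ** 2 for fn, x, y in zip(dist_fns, x_parts, y_parts)))


fns = [euclidean_dist, spherical_dist, poincare_dist]
x = [[0.0, 0.0], [1.0, 0.0], [0.0, 0.0]]
y = [[3.0, 4.0], [0.0, 1.0], [0.5, 0.0]]
print(product_dist(x, y, fns))
```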


AAAI Conference 2021 Conference Paper

Hyperbolic Variational Graph Neural Network for Modeling Dynamic Graphs

Li Sun
Zhongbao Zhang
Jiawei Zhang
Feiyang Wang
Hao Peng
Sen Su
Philip S. Yu

Learning representations for graphs plays a critical role in a wide spectrum of downstream applications. In this paper, we summarize the limitations of prior works in three aspects: representation space, modeling dynamics, and modeling uncertainty. To bridge this gap, we propose, for the first time, to learn dynamic graph representations in hyperbolic space, aiming to infer stochastic node representations. Working with hyperbolic space, we present a novel Hyperbolic Variational Graph Neural Network, referred to as HVGNN. In particular, to model the dynamics, we introduce a Temporal GNN (TGNN) based on a theoretically grounded time encoding approach. To model the uncertainty, we devise a hyperbolic graph variational autoencoder built upon the proposed TGNN to generate stochastic node representations of hyperbolic normal distributions. Furthermore, we introduce a reparameterisable sampling algorithm for the hyperbolic normal distribution to enable the gradient-based learning of HVGNN. Extensive experiments show that HVGNN outperforms state-of-the-art baselines on real-world datasets.


IJCAI Conference 2020 Conference Paper

BANANA: when Behavior ANAlysis meets social Network Alignment

Fuxin Ren
Zhongbao Zhang
Jiawei Zhang
Sen Su
Li Sun
Guozhen Zhu
Congying Guo

Recently, aligning users among different social networks has received significant attention. However, most of the existing studies do not consider users’ behavior information during the aligning procedure and thus still suffer from poor learning performance. In fact, we observe that social network alignment and behavior analysis can benefit from each other. Motivated by this observation, we propose to jointly study the social network alignment problem and the user behavior analysis problem. We design a novel end-to-end framework named BANANA. In this framework, to leverage behavior analysis for social network alignment at the distribution level, we design an earth mover’s distance based alignment model to fuse users’ behavior information for more comprehensive user representations. To further leverage social network alignment for behavior analysis, in turn, we design a temporal graph neural network model to fuse behavior information in different social networks based on the alignment result. The two models above work together in an end-to-end manner. Through extensive experiments on real-world datasets, we demonstrate that our proposed approach outperforms the state-of-the-art methods in both the social network alignment task and the user behavior analysis task.
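
For readers unfamiliar with the earth mover's distance, the sketch below computes it in the simplest possible case: two one-dimensional samples of equal size with uniform weights, where the optimal transport plan just matches sorted values. This special case is an assumption chosen for brevity and is not BANANA's alignment model, which the abstract describes as operating on users' behavior distributions.

```python
def emd_1d(xs, ys):
    """Earth mover's distance between two equally sized 1-D samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally sized samples")
    # In one dimension, matching the i-th smallest x to the i-th smallest y is optimal.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


print(emd_1d([0, 1, 3], [5, 6, 8]))  # every unit of mass moves 5 steps
```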


AAAI Conference 2020 Conference Paper

Multi-Task Learning with Generative Adversarial Training for Multi-Passage Machine Reading Comprehension

Qiyu Ren
Xiang Cheng
Sen Su

Multi-passage machine reading comprehension (MRC) aims to answer a question from multiple passages. Existing multi-passage MRC approaches have shown that employing passages with and without golden answers (i.e., labeled and unlabeled passages) for model training can improve prediction accuracy. In this paper, we present MG-MRC, a novel approach for multi-passage MRC via multi-task learning with generative adversarial training. MG-MRC adopts the extract-then-select framework, where an extractor is first used to predict answer candidates, then a selector is used to choose the final answer. In MG-MRC, we adopt multi-task learning to train the extractor by using both labeled and unlabeled passages. In particular, we use labeled passages to train the extractor by supervised learning, while using unlabeled passages to train the extractor by generative adversarial training, where the extractor is regarded as the generator and a discriminator is introduced to evaluate the generated answer candidates. Moreover, to train the extractor by backpropagation in the generative adversarial training process, we propose a hybrid method which combines boundary-based and content-based extracting methods to produce the answer candidate set and its representation. The experimental results on three open-domain QA datasets confirm the effectiveness of our approach.


IJCAI Conference 2018 Conference Paper

Exploring Encoder-Decoder Model for Distant Supervised Relation Extraction

Sen Su
Ningning Jia
Xiang Cheng
Shuguang Zhu
Ruiping Li

In this paper, we present an encoder-decoder model for distant supervised relation extraction. Given an entity pair and its sentence bag as input, in the encoder component, we employ the convolutional neural network to extract the features of the sentences in the sentence bag and merge them into a bag representation. In the decoder component, we utilize the long short-term memory network to model relation dependencies and predict the target relations in a sequential manner. In particular, to enable the sequential prediction of relations, we introduce a measure to quantify the amounts of information the relations take in their sentence bag, and use such information to determine the order of the relations of a sentence bag during model training. Moreover, we incorporate the attention mechanism into our model to dynamically adjust the bag representation to reduce the impact of sentences whose corresponding relations have been predicted. Extensive experiments on a popular dataset show that our model achieves significant improvement over state-of-the-art methods.


IJCAI Conference 2018 Conference Paper

MASTER: across Multiple social networks, integrate Attribute and STructure Embedding for Reconciliation

Sen Su
Li Sun
Zhongbao Zhang
Gen Li
Jielun Qu

Recently, reconciling social networks has received significant attention. Most of the existing studies have limitations in the following three aspects: multiplicity, comprehensiveness, and robustness. To address these three limitations, we rethink this problem and propose the MASTER framework, i.e., across Multiple social networks, integrate Attribute and STructure Embedding for Reconciliation. In this framework, we first design a novel Constrained Dual Embedding model by simultaneously embedding and reconciling multiple social networks to formulate our problem into a unified optimization. To address this optimization, we then design an effective algorithm called NS-Alternating. We also prove that this algorithm converges to KKT points. Through extensive experiments on real-world datasets, we demonstrate that MASTER outperforms the state-of-the-art approaches.


IJCAI Conference 2017 Conference Paper

Deep Supervised Hashing with Nonlinear Projections

Sen Su
Gang Chen
Xiang Cheng
Rong Bi

Hashing has attracted broad research interest in large-scale image retrieval due to its high search speed and efficient storage. Recently, many deep hashing methods have been proposed to perform simultaneous nonlinear feature learning and hash projection learning, which have shown superior performance compared to hand-crafted feature based hashing methods. Nonlinear projection functions have shown their advantages over linear ones due to their powerful generalization capabilities. To improve the performance of deep hashing methods by generalizing projection functions, we propose the idea of implementing a purely nonlinear deep hashing network architecture. Building on this idea, this paper presents a Deep Supervised Hashing architecture with Nonlinear Projections (DSHNP). In particular, soft decision trees are adopted as the nonlinear projection functions, since they can generate differentiable nonlinear outputs and can be trained with deep neural networks in an end-to-end way. Moreover, to make the hash codes as independent as possible, we design two regularizers imposed on the parameter matrices of the leaves in the soft decision trees. Extensive evaluations on two benchmark image datasets show that the proposed DSHNP outperforms several state-of-the-art hashing methods.
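
To show why a soft decision tree yields a differentiable nonlinear projection, the sketch below evaluates a depth-2 soft tree on a feature vector and thresholds its output into one hash bit. The tree depth, scalar leaf values, and sign thresholding are assumptions for illustration; DSHNP's architecture, leaf parameter matrices, and regularizers are not reproduced.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_tree_output(x, inner, leaves):
    """Depth-2 soft tree: `inner` holds three (weights, bias) pairs, `leaves` four values."""
    # Each inner node routes the input to its right child with probability p[i].
    p = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) for w, b in inner]
    leaf_probs = [
        (1 - p[0]) * (1 - p[1]),  # left, then left
        (1 - p[0]) * p[1],        # left, then right
        p[0] * (1 - p[2]),        # right, then left
        p[0] * p[2],              # right, then right
    ]
    # The output is the probability-weighted mix of leaf values, smooth in all parameters.
    return sum(q * v for q, v in zip(leaf_probs, leaves)), leaf_probs


def hash_bit(x, inner, leaves):
    """Threshold the soft tree output into a single binary hash bit."""
    out, _ = soft_tree_output(x, inner, leaves)
    return 1 if out >= 0 else 0


inner = [([0.0, 0.0], 0.0)] * 3      # untrained tree: every split is 50/50
leaves = [1.0, -1.0, 1.0, 1.0]
out, probs = soft_tree_output([0.3, -0.7], inner, leaves)
```

Since every input reaches every leaf with some probability, gradients flow to all routing weights, which is what allows end-to-end training with a deep feature extractor.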


ECAI Conference 2016 Conference Paper

Topic-Level Influencers Identification in the Microblog Sphere

Yakun Wang
Zhongbao Zhang
Sen Su
Cheng Chang
Muhammad Azam Zia

This paper studies the problem of identifying influencers on specific topics in the microblog sphere. Prior works usually use the cumulative number of social links to measure users' topic-level influence, which ignores the dynamics of influence. As a result, they usually find faded influencers. To address the limitations of prior methods, we propose a novel probabilistic generative model to capture the variation of influence over time. Then an influence decay method is proposed to measure users' current topic-level influence.
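
The contrast between cumulative and decayed influence can be sketched as follows. The abstract does not specify the decay function, so exponential decay with a `half_life` parameter is an assumption, as are the function name and the `(timestamp, weight)` event format.

```python
def decayed_influence(events, now, half_life):
    """Sum event weights, halving each contribution every `half_life` time units."""
    return sum(w * 0.5 ** ((now - t) / half_life) for t, w in events if t <= now)


# Hypothetical users: one gained many links long ago, one gained fewer links recently.
faded = [(0, 10.0)]
rising = [(90, 4.0), (100, 4.0)]
print(decayed_influence(faded, now=100, half_life=10))
print(decayed_influence(rising, now=100, half_life=10))
```

A cumulative count would rank the first user higher (10 versus 8), whereas the decayed score favors the user whose influence is current.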
