Arrow Research search

Author name cluster

Fei Sun

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

10 papers
1 author row

Possible papers (10)

AAAI Conference 2026 System Paper

AuditAgent: LLM Agent for Risks Auditing in Recommender Systems

  • Du Su
  • Zhenxing Chen
  • Shilong Zhao
  • Yuanhao Liu
  • Fei Sun
  • Qi Cao
  • Huawei Shen

Auditing recommender systems has attracted growing attention due to increasing concerns over filter bubbles, unfairness, and data misuse. A common approach is sock-puppet auditing, where autonomous agents interact with platforms to reveal risks. However, existing approaches rely on hard-coded agents that cannot adapt to dynamic GUI layouts and that generate behaviors far from those of real users, limiting the comprehensiveness and representativeness of the assessment. To address these issues, we introduce AuditAgent, an LLM-powered GUI-agent framework for risk auditing. AuditAgent simulates realistic user preferences and performs adaptive, human-like interactions on recommendation platforms. This design enables more thorough and faithful auditing, providing comprehensive assessments across multiple risk dimensions, including filter bubbles, unfairness, and data misuse.

AAAI Conference 2026 Conference Paper

LiR3AG: A Lightweight Rerank Reasoning Strategy Framework for Retrieval-Augmented Generation

  • Guo Chen
  • Junjie Huang
  • Huaijin Xie
  • Fei Sun
  • Tao Jia

Retrieval-Augmented Generation (RAG) effectively enhances Large Language Models (LLMs) by incorporating retrieved external knowledge into the generation process. Reasoning models improve LLM performance in multi-hop QA tasks, which require integrating and reasoning over multiple pieces of evidence across different documents to answer a complex question. However, they often introduce substantial computational costs, including increased token consumption and inference latency. To better understand and mitigate this trade-off, we conduct a comprehensive study of reasoning strategies for reasoning models in RAG multi-hop QA tasks. Our findings reveal that reasoning models adopt structured strategies to integrate retrieved and internal knowledge, primarily following two modes: Context-Grounded Reasoning, which relies directly on retrieved content, and Knowledge-Reconciled Reasoning, which resolves conflicts or gaps using internal knowledge. Building on these findings, we propose a novel Lightweight Rerank Reasoning Strategy Framework for RAG (LiR³AG) that enables non-reasoning models to transfer reasoning strategies by restructuring retrieved evidence into coherent reasoning chains. LiR³AG reduces output-token overhead by 98% and inference time by 58.6% on average, while improving an 8B non-reasoning model's F1 score by 6.2% to 22.5%, surpassing the performance of a 32B reasoning model in RAG and offering a practical and efficient path forward for RAG systems.
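The abstract's core move is restructuring retrieved evidence into a coherent chain before handing it to a non-reasoning model. A minimal sketch of that idea, under the assumption of a simple greedy lexical-overlap ordering (the paper's actual reranking strategy is not specified here):

```python
# Illustrative sketch, not the authors' implementation: greedily order
# retrieved passages into a chain, starting from the passage that best
# matches the question and then growing the anchor with each chosen
# passage, so the model sees multi-hop evidence in a coherent order.

def _tokens(text):
    return set(text.lower().split())

def build_evidence_chain(question, passages):
    """Return passages reordered into a rough reasoning chain."""
    remaining = list(passages)
    anchor = _tokens(question)
    chain = []
    while remaining:
        # Pick the passage with the largest token overlap with the anchor.
        best = max(remaining, key=lambda p: len(_tokens(p) & anchor))
        chain.append(best)
        remaining.remove(best)
        anchor |= _tokens(best)   # grow the anchor with the chosen evidence
    return chain

question = "Where was the director of Film X made born?"
passages = [
    "Alice Smith was born in Oslo.",
    "Film X was directed by Alice Smith.",
]
chain = build_evidence_chain(question, passages)
```

With these toy passages, the chain starts from the passage that mentions the film and only then moves to the birthplace fact, mirroring the two-hop structure of the question.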

AAAI Conference 2024 Conference Paper

PDE+: Enhancing Generalization via PDE with Adaptive Distributional Diffusion

  • Yige Yuan
  • Bingbing Xu
  • Bo Lin
  • Liang Hou
  • Fei Sun
  • Huawei Shen
  • Xueqi Cheng

The generalization of neural networks is a central challenge in machine learning, especially concerning performance under distributions that differ from the training ones. Current methods, mainly based on data-driven paradigms such as data augmentation, adversarial training, and noise injection, may achieve only limited generalization due to model non-smoothness. In this paper, we propose to investigate generalization from a Partial Differential Equation (PDE) perspective, aiming to enhance it directly through the underlying function of neural networks rather than by adjusting input data. Specifically, we first establish the connection between neural network generalization and the smoothness of the solution to a specific PDE, namely the transport equation. Building upon this, we propose a general framework that introduces adaptive distributional diffusion into the transport equation to enhance the smoothness of its solution, thereby improving generalization. In the context of neural networks, we put this theoretical framework into practice as PDE+ (PDE with Adaptive Distributional Diffusion), which diffuses each sample into a distribution covering semantically similar inputs. This enables better coverage of potentially unobserved distributions in training, thus improving generalization beyond merely data-driven methods. The effectiveness of PDE+ is validated through extensive experimental settings, demonstrating its superior performance compared to state-of-the-art methods. Our code is available at https://github.com/yuanyige/pde-add.
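The "diffuses each sample into a distribution" step can be sketched as per-sample adaptive noise injection. This is a toy illustration of the idea only; the scale rule `sigma_by_norm` is a hypothetical stand-in, not the paper's learned diffusion:

```python
# Minimal sketch (an assumption, not the paper's exact method): each
# training sample x is diffused into a distribution of nearby inputs by
# drawing noisy copies whose noise scale sigma(x) is adapted per sample,
# instead of using one fixed augmentation strength for all samples.
import random

def adaptive_diffuse(x, sigma_fn, n_draws=4, rng=None):
    """Draw n_draws noisy copies of x with per-sample noise scale sigma_fn(x)."""
    rng = rng or random.Random(0)
    sigma = sigma_fn(x)
    return [[xi + rng.gauss(0.0, sigma) for xi in x] for _ in range(n_draws)]

# Hypothetical scale rule: noise proportional to the sample's magnitude.
def sigma_by_norm(x, base=0.1):
    return base * max(abs(xi) for xi in x)

draws = adaptive_diffuse([1.0, -2.0, 0.5], sigma_by_norm)
```

In training, a loss averaged over such draws would penalize non-smoothness around each sample, which is the intuition the abstract connects to the transport equation.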

NeurIPS Conference 2024 Conference Paper

Understanding and Improving Adversarial Collaborative Filtering for Robust Recommendation

  • Kaike Zhang
  • Qi Cao
  • Yunfan Wu
  • Fei Sun
  • Huawei Shen
  • Xueqi Cheng

Adversarial Collaborative Filtering (ACF), which typically applies adversarial perturbations to user and item embeddings through adversarial training, is widely recognized as an effective strategy for enhancing the robustness of Collaborative Filtering (CF) recommender systems against poisoning attacks. Moreover, numerous studies have empirically shown that ACF can also improve recommendation performance compared to traditional CF. Despite these empirical successes, the theoretical understanding of ACF's effectiveness in terms of both performance and robustness remains unclear. To bridge this gap, we first theoretically show that ACF can achieve a lower recommendation error than traditional CF with the same number of training epochs, in both clean and poisoned data contexts. Furthermore, by establishing bounds on the reductions in recommendation error during ACF's optimization process, we find that applying personalized magnitudes of perturbation for different users based on their embedding scales can further improve ACF's effectiveness. Building on these theoretical understandings, we propose Personalized Magnitude Adversarial Collaborative Filtering (PamaCF). Extensive experiments demonstrate that PamaCF effectively defends against various types of poisoning attacks while significantly enhancing recommendation performance.
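The "personalized magnitudes based on embedding scales" idea can be sketched with an FGSM-style perturbation whose budget scales with each user's embedding norm. The exact scaling rule below is an illustrative assumption, not PamaCF's derived formula:

```python
# Hedged sketch of personalized-magnitude adversarial perturbation:
# perturb a user embedding along its loss-gradient direction, but scale
# the step by that user's own embedding norm instead of one global eps.
import math

def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))

def personalized_perturbation(embedding, grad, eps=0.1):
    """eps * ||embedding|| * grad / ||grad|| (zero if the gradient vanishes)."""
    g = l2_norm(grad)
    if g == 0.0:
        return [0.0] * len(embedding)
    scale = eps * l2_norm(embedding) / g
    return [scale * gi for gi in grad]

delta = personalized_perturbation([3.0, 4.0], [0.0, 1.0], eps=0.1)
# ||embedding|| = 5, so the perturbation has magnitude 0.5 along grad.
```

Users with larger embedding scales thus receive proportionally larger adversarial perturbations during training, which is the behavior the theoretical analysis motivates.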

IJCAI Conference 2022 Conference Paper

Neural Re-ranking in Multi-stage Recommender Systems: A Review

  • Weiwen Liu
  • Yunjia Xi
  • Jiarui Qin
  • Fei Sun
  • Bo Chen
  • Weinan Zhang
  • Rui Zhang
  • Ruiming Tang

As the final stage of the multi-stage recommender system (MRS), re-ranking directly affects users' experience and satisfaction by rearranging the input ranking lists, and thereby plays a critical role in MRS. With the advances in deep learning, neural re-ranking has become a trending topic and has been widely adopted in industrial applications. This review aims to integrate re-ranking algorithms into a broader picture and pave the way for more comprehensive solutions in future research. For this purpose, we first present a taxonomy of current methods on neural re-ranking. We then describe these methods, along with their historical development, organized by their objectives. The network structure, personalization, and complexity are also discussed and compared. Next, we provide a benchmark for the major neural re-ranking models and quantitatively analyze their re-ranking performance. Finally, the review concludes with a discussion of future prospects for this field. A list of papers discussed in this review, the benchmark datasets, our re-ranking library LibRerank, and detailed parameter settings are publicly available at https://github.com/LibRerank-Community/LibRerank.

AAAI Conference 2020 Conference Paper

Be Relevant, Non-Redundant, and Timely: Deep Reinforcement Learning for Real-Time Event Summarization

  • Min Yang
  • Chengming Li
  • Fei Sun
  • Zhou Zhao
  • Ying Shen
  • Chenglin Wu

Real-time event summarization is an essential task in natural language processing and information retrieval. Despite the progress of previous work, generating relevant, non-redundant, and timely event summaries remains challenging in practice. In this paper, we propose a Deep Reinforcement learning framework for real-time Event Summarization (DRES), which shows promising performance on all three challenges (i.e., relevance, non-redundancy, timeliness) in a unified framework. Specifically, we (i) devise a hierarchical cross-attention network with intra- and inter-document attentions to integrate important semantic features within and between the query and the input document for better text matching; in addition, relevance prediction is leveraged as an auxiliary task to strengthen the document modeling and help extract relevant documents; (ii) propose a multi-topic dynamic memory network to capture the sequential patterns of different topics belonging to the event of interest and temporally memorize the input facts from the evolving document stream, avoiding extracting redundant information at each time step; (iii) consider both historical dependencies and future uncertainty of the document stream to generate relevant and timely summaries by exploiting reinforcement learning. Experimental results on two real-world datasets demonstrate the advantages of the DRES model, with significant improvements in generating relevant, non-redundant, and timely event summaries over the state-of-the-art.
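The three objectives the abstract unifies (relevance, non-redundancy, timeliness) naturally combine into a per-step reward for a select/skip policy over the document stream. The weights and similarity measure below are illustrative assumptions, not the reward DRES actually optimizes:

```python
# Illustrative reward sketch: a real-time summarizer that decides to
# select or skip each incoming document can be trained with a reward
# balancing relevance to the query, non-redundancy w.r.t. the summary
# so far, and timeliness (penalizing delayed coverage).

def jaccard(a, b):
    """Token-set Jaccard similarity between two strings."""
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

def step_reward(query, doc, summary_so_far, delay_steps,
                w_rel=1.0, w_red=1.0, w_time=0.1):
    relevance = jaccard(query, doc)
    redundancy = max((jaccard(doc, s) for s in summary_so_far), default=0.0)
    return w_rel * relevance - w_red * redundancy - w_time * delay_steps

# First relevant document, selected immediately: positive reward.
r = step_reward("storm hits coast", "storm hits the coast today", [], 0)
```

A duplicate document selected later would score a high redundancy term and a timeliness penalty, so the policy learns to prefer fresh, non-overlapping evidence.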

IJCAI Conference 2020 Conference Paper

Intent Preference Decoupling for User Representation on Online Recommender System

  • Zhaoyang Liu
  • Haokun Chen
  • Fei Sun
  • Xu Xie
  • Jinyang Gao
  • Bolin Ding
  • Yanyan Shen

Accurately characterizing the user's current interest is the core of recommender systems. However, users' interests are dynamic and affected by both intent factors and preference factors. Intent factors reflect users' current needs and change between visits; preference factors are relatively stable and learned continuously over time. Existing works either resort to sequential recommendation to model the current browsing intent and historical preference separately, or simply mix the two factors during online learning. In this paper, we propose a novel learning strategy named FLIP to decouple the learning of intent and preference in the online setting. Learning the intent is cast as a meta-learning task that adapts quickly to the current browsing session; learning the preference is based on the calibrated user intent and is constantly updated over time. We conducted experiments on two public datasets and a real-world recommender system. When equipped with modern recommendation methods, FLIP demonstrates significant improvements over strong baselines.
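The meta-learning framing can be sketched as a fast inner-loop adaptation: slow preference weights stay intact, while a per-visit copy takes a gradient step on the current session's feedback. The linear model and single-step rule here are assumptions for illustration, not FLIP's actual algorithm:

```python
# Illustrative sketch: intent as a fast, per-visit adaptation of slowly
# learned preference weights. One inner gradient step on the current
# session's feedback yields session-specific (intent) weights without
# overwriting the stable preference weights.

def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def adapt_to_session(pref_w, session, lr=0.1):
    """One inner-loop pass of squared-error descent on current-session data."""
    w = list(pref_w)                      # copy: preference stays untouched
    for x, y in session:
        err = predict(w, x) - y
        w = [wi - lr * 2 * err * xi for wi, xi in zip(w, x)]
    return w

pref_w = [0.0, 0.0]                       # slow, stable preference weights
session = [([1.0, 0.0], 1.0)]             # current-session feedback
intent_w = adapt_to_session(pref_w, session)
```

The preference weights are updated separately and over a much longer horizon; only the adapted copy serves the current visit.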

IJCAI Conference 2019 Conference Paper

Deep Session Interest Network for Click-Through Rate Prediction

  • Yufei Feng
  • Fuyu Lv
  • Weichen Shen
  • Menghan Wang
  • Fei Sun
  • Yu Zhu
  • Keping Yang

Click-Through Rate (CTR) prediction plays an important role in many industrial applications, such as online advertising and recommender systems. How to capture users' dynamic and evolving interests from their behavior sequences remains an active research topic in CTR prediction. However, most existing studies overlook the intrinsic structure of the sequences: the sequences are composed of sessions, where sessions are user behaviors separated by their occurring time. We observe that user behaviors are highly homogeneous within each session and heterogeneous across sessions. Based on this observation, we propose a novel CTR model named Deep Session Interest Network (DSIN) that leverages users' multiple historical sessions in their behavior sequences. We first use a self-attention mechanism with bias encoding to extract users' interests in each session. Then we apply a Bi-LSTM to model how users' interests evolve and interact across sessions. Finally, we employ a local activation unit to adaptively learn the influence of the various session interests on the target item. Experiments are conducted on both advertising and production recommender datasets, and DSIN outperforms other state-of-the-art models on both.
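The session structure the model builds on ("behaviors separated by their occurring time") can be made concrete with a gap-based segmentation rule. The 30-minute threshold is a common convention assumed here, not a value stated in the abstract:

```python
# Sketch of session segmentation: split a user's behavior sequence into
# sessions wherever the gap between consecutive events exceeds a
# threshold (30 minutes assumed). DSIN then models interests within
# each resulting session and their evolution across sessions.

def split_sessions(events, gap=30 * 60):
    """events: list of (timestamp_seconds, item_id), sorted by time."""
    sessions = []
    for ts, item in events:
        if sessions and ts - sessions[-1][-1][0] <= gap:
            sessions[-1].append((ts, item))   # same session: small gap
        else:
            sessions.append([(ts, item)])     # large gap: new session
    return sessions

events = [(0, "a"), (60, "b"), (4000, "c")]   # 3940 s gap > 30 min
sessions = split_sessions(events)
```

Within each session the behaviors are treated as homogeneous (one self-attention block per session); the cross-session dynamics are left to the sequence model over session representations.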

IJCAI Conference 2019 Conference Paper

Tag2Gauss: Learning Tag Representations via Gaussian Distribution in Tagged Networks

  • Yun Wang
  • Lun Du
  • Guojie Song
  • Xiaojun Ma
  • Lichen Jin
  • Wei Lin
  • Fei Sun

Keyword-based tags (referred to as tags) are used to represent additional attributes of nodes beyond what is explicitly stated in their contents, like the hashtags on YouTube. Aside from being auxiliary information for node representation, tags can also be used for retrieval, recommendation, content organization, and event analysis. Therefore, tag representation learning is of great importance. However, learning satisfactory tag representations is challenging because 1) traditional representation methods generally fail when it comes to representing tags, and 2) the bidirectional interactions between nodes and tags should be modeled, which existing research generally does not address. In this paper, we propose a tag representation learning model that takes tag-related node interaction into consideration, named Tag2Gauss. Specifically, since tags represent node communities with intricate overlapping relationships, we propose that Gaussian distributions are appropriate for modeling tags. Considering the bidirectional interactions between nodes and tags, we propose a tag representation learning model mapping tags to distributions, consisting of two embedding tasks, namely tag-view embedding and node-view embedding. Extensive evidence demonstrates the effectiveness of representing tags as distributions, and the advantages of the proposed architecture in many applications, such as node classification and network visualization.
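Representing a tag as a Gaussian rather than a point lets similarity be asymmetric, so a broad tag can "contain" a narrow one. A minimal sketch with diagonal Gaussians and KL divergence (the specific energy function Tag2Gauss optimizes is not given in the abstract; KL is one standard choice, and the numbers are illustrative):

```python
# Minimal sketch: a tag is a Gaussian (mean + diagonal variance), and
# KL divergence serves as an asymmetric similarity between tags, which
# can express overlapping or nested tag communities.
import math

def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for diagonal Gaussians, summed over dimensions."""
    kl = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        kl += 0.5 * (vp / vq + (mq - mp) ** 2 / vq - 1.0 + math.log(vq / vp))
    return kl

# A broad tag (large variance) vs. a narrow tag near the same mean.
broad = ([0.0, 0.0], [4.0, 4.0])
narrow = ([0.5, 0.0], [1.0, 1.0])

kl_nb = kl_diag_gauss(*narrow, *broad)   # narrow fits inside broad: small
kl_bn = kl_diag_gauss(*broad, *narrow)   # broad into narrow: large
```

The asymmetry (`kl_nb < kl_bn`) is exactly what point embeddings with symmetric distances cannot express, which is the abstract's motivation for modeling tags as distributions.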

AAAI Conference 2016 Conference Paper

Inside Out: Two Jointly Predictive Models for Word Representations and Phrase Representations

  • Fei Sun
  • Jiafeng Guo
  • Yanyan Lan
  • Jun Xu
  • Xueqi Cheng

The distributional hypothesis lies at the root of most existing word representation models, which infer word meaning from its external contexts. However, distributional models cannot handle rare and morphologically complex words very well, and they fail to identify some fine-grained linguistic regularities because they ignore word forms. In contrast, morphology holds that words are built from basic units, i.e., morphemes. Therefore, the meaning and function of rare words can be inferred from words sharing the same morphemes, and many syntactic relations can be identified directly from word forms. However, the limitation of morphology is that it cannot infer the relationship between two words that do not share any morphemes. Considering the advantages and limitations of both approaches, we propose two novel models, called BEING and SEING, that build better word representations by modeling both external contexts and internal morphemes in a jointly predictive way. These two models can also be extended to learn phrase representations according to distributed morphology theory. We evaluate the proposed models on similarity tasks and analogy tasks. The results demonstrate that the proposed models significantly outperform state-of-the-art models on both word and phrase representation learning.
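The joint-prediction idea can be sketched as scoring a word against a mixture of its external-context signal and its internal-morpheme signal. The averaging-and-mixing rule below is a toy assumption for illustration; the actual BEING/SEING objectives are predictive models, not this direct interpolation:

```python
# Toy sketch of the joint idea: a word is scored both from its external
# context words and from its internal morphemes, so a rare word sharing
# morphemes with known words still receives a sensible score even when
# its contexts are sparse.

def avg(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def joint_score(word_vec, context_vecs, morpheme_vecs, alpha=0.5):
    """Dot product of the word vector with a context/morpheme mixture."""
    mix = [alpha * c + (1 - alpha) * m
           for c, m in zip(avg(context_vecs), avg(morpheme_vecs))]
    return sum(w * x for w, x in zip(word_vec, mix))

score = joint_score([1.0, 0.0],
                    context_vecs=[[1.0, 0.0], [0.0, 1.0]],
                    morpheme_vecs=[[1.0, 1.0]])
```

Setting `alpha` to 1.0 recovers a purely distributional score and 0.0 a purely morphological one, making the trade-off between the two information sources explicit.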