Arrow Research

Author name cluster

Lin Gui

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

11 papers
1 author row

Possible papers


AAAI Conference 2026 Conference Paper

Beyond Perplexity: Let the Reader Select Retrieval Summaries via Spectrum Projection Score

  • Zhanghao Hu
  • Qinglin Zhu
  • Siya Qi
  • Yulan He
  • Hanqi Yan
  • Lin Gui

Large Language Models (LLMs) have shown improved generation performance through retrieval-augmented generation (RAG) following the retriever-reader paradigm, which supplements model inputs with externally retrieved knowledge. However, prior work often evaluates RAG holistically, assessing the retriever and reader jointly, which makes it difficult to isolate the true contribution of retrieval, particularly given the prompt sensitivity of LLMs used as readers. We move beyond perplexity and introduce the Spectrum Projection Score (SPS), a lightweight, supervision-free metric that lets the reader gauge the semantic alignment of a retrieved summary with its hidden representation: it compares the span formed by the tokens generated from the summary against the principal directions of the reader's representation subspace, and uses this alignment to measure relevance. Building on SPS, we present xCompress, an inference-time controller framework that dynamically samples, ranks, and compresses retrieval summary candidates. Extensive experiments on five QA benchmarks with four open-source LLMs show that SPS not only enhances performance across a range of tasks but also provides a principled perspective on the interaction between retrieval and generation.

AAAI Conference 2026 Conference Paper

MemGuide: Intent-Driven Memory Selection for Goal-Oriented Multi-Session LLM Agents

  • Yiming Du
  • Bingbing Wang
  • Yang He
  • Bin Liang
  • Baojun Wang
  • Zhongyang Li
  • Lin Gui
  • Jeff Z. Pan

Modern task-oriented dialogue (TOD) systems increasingly rely on large language model (LLM) agents, leveraging Retrieval-Augmented Generation (RAG) and long-context capabilities for long-term memory utilization. However, these methods prioritise semantic similarity over task intent, degrading multi-session coherence. We propose MemGuide, a two-stage intent-driven memory selection framework: (1) Intent-Aligned Retrieval retrieves goal-consistent, QA-formatted memory units; (2) Missing-Slot Guided Filtering reranks those units by slot-completion gain via a chain-of-thought reasoner and a fine-tuned LLaMA-8B filter. We also introduce MS-TOD, the first multi-session TOD benchmark, with 132 diverse personas, 956 task goals, and annotated intent-aligned memory targets. Evaluations on MS-TOD show that MemGuide boosts task success rate by 11 points (88%→99%), reduces dialogue length by 2.84 turns, and matches single-session performance.

AAAI Conference 2025 Conference Paper

Correcting Large Language Model Behavior via Influence Function

  • Han Zhang
  • Zhuo Zhang
  • Yi Zhang
  • Yuanzhao Zhai
  • Hanyang Peng
  • Yu Lei
  • Yue Yu
  • Hui Wang

Recent advancements in AI alignment techniques have significantly improved the alignment of large language models (LLMs) with static human preferences. However, the dynamic nature of human preferences can render some prior training data outdated or even erroneous, ultimately causing LLMs to deviate from contemporary human preferences and societal norms. Existing methodologies, whether curating new data for continual alignment or manually correcting outdated data for re-alignment, demand costly human resources. To address this, we propose a novel approach, LLM BehAvior Correction with INfluence FunCtion REcall and Post-Training (LANCET), which needs no human involvement. LANCET consists of two phases: (1) using a new method, LinFAC, to efficiently identify the training data that significantly impact undesirable model outputs, and (2) applying a novel Influence-driven Bregman Optimization (IBO) technique to adjust the model’s outputs based on these influence distributions. Our experiments show that LANCET effectively and efficiently corrects inappropriate behaviors of LLMs while preserving model utility. Furthermore, LANCET exhibits stronger generalization ability than all baselines under out-of-distribution harmful prompts, offering better interpretability and compatibility with real-world applications of LLMs.

NeurIPS Conference 2024 Conference Paper

BoNBoN Alignment for Large Language Models and the Sweetness of Best-of-n Sampling

  • Lin Gui
  • Cristina Gârbacea
  • Victor Veitch

This paper concerns the problem of aligning samples from large language models to human preferences using *best-of-$n$* sampling, where we draw $n$ samples, rank them, and return the best one. We consider two fundamental problems. First: what is the relationship between best-of-$n$ and other (RLHF-type) approaches to aligning LLMs? In particular, when should one be preferred to the other? We show that the best-of-$n$ sampling distribution is essentially equivalent to the policy learned by RLHF if we apply a particular monotone transformation to the reward function. Moreover, we show that this transformation yields the best possible trade-off between win rate against the base model and KL distance from the base model; that is, best-of-$n$ is Pareto-optimal in the win-rate-vs-KL sense. The second problem we consider is how to fine-tune a model to mimic the best-of-$n$ sampling distribution, so that we avoid drawing $n$ samples at each inference. We derive *BoNBoN Alignment* as a method for achieving this. Experiments show that BoNBoN alignment yields a model that achieves high win rates while minimally affecting off-target aspects of the generations.
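The best-of-$n$ procedure the abstract describes is simple to state in code. A minimal sketch follows; `generate` and `reward` are hypothetical stand-ins for a sampler and a reward model, not the paper's implementation.

```python
def best_of_n(generate, reward, prompt, n=4):
    """Draw n candidate responses for a prompt, score each with a
    reward model, and return the highest-scoring one."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=reward)

# Toy demo: a canned "model" cycling through fixed responses, and a
# "reward" that simply prefers longer answers.
canned = iter(["ok", "a detailed answer", "short"])
best = best_of_n(lambda p: next(canned), len, "What is RLHF?", n=3)
print(best)  # the longest of the three candidates
```

The paper's contribution is then to characterize the distribution this loop induces (relative to RLHF) and to train a model that mimics it, so the $n$-fold sampling cost is not paid at inference time.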

AAAI Conference 2024 System Paper

NarrativePlay: An Automated System for Crafting Visual Worlds in Novels for Role-Playing

  • Runcong Zhao
  • Wenjia Zhang
  • Jiazheng Li
  • Lixing Zhu
  • Yanran Li
  • Yulan He
  • Lin Gui

In this demo, we present NarrativePlay -- an innovative system enabling users to role-play a fictional character and interact with dynamically generated narrative environments. Unlike existing predefined sandbox approaches, NarrativePlay centres around the main storyline events extracted from the narrative, allowing users to experience the story from the perspective of a character they choose. To design versatile AI agents for diverse scenarios, we employ a framework built on Large Language Models (LLMs) to extract detailed character traits from text. We also incorporate automatically generated visual displays of narrative settings, character portraits, and character speech, greatly enhancing the overall user experience.

NeurIPS Conference 2023 Conference Paper

Concept Algebra for (Score-Based) Text-Controlled Generative Models

  • Zihao Wang
  • Lin Gui
  • Jeffrey Negrea
  • Victor Veitch

This paper concerns the structure of learned representations in text-guided generative models, focusing on score-based models. A key property of such models is that they can compose disparate concepts in a 'disentangled' manner. This suggests these models have internal representations that encode concepts in a 'disentangled' manner. Here, we focus on the idea that concepts are encoded as subspaces of some representation space. We formalize what this means, show there's a natural choice for the representation, and develop a simple method for identifying the part of the representation corresponding to a given concept. In particular, this allows us to manipulate the concepts expressed by the model through algebraic manipulation of the representation. We demonstrate the idea with examples using Stable Diffusion.

NeurIPS Conference 2023 Conference Paper

Counterfactual Generation with Identifiability Guarantees

  • Hanqi Yan
  • Lingjing Kong
  • Lin Gui
  • Yuejie Chi
  • Eric Xing
  • Yulan He
  • Kun Zhang

Counterfactual generation lies at the core of various machine learning tasks, including image translation and controllable text generation. This generation process usually requires the identification of the disentangled latent representations, such as content and style, that underlie the observed data. However, it becomes more challenging when faced with a scarcity of paired data and labelling information. Existing disentanglement methods crucially rely on oversimplified assumptions, such as assuming independent content and style variables, to identify the latent variables, even though such assumptions may not hold for complex data distributions. For instance, food reviews tend to involve words like “tasty”, whereas movie reviews commonly contain words such as “thrilling” for the same positive sentiment. This problem is exacerbated when data are sampled from multiple domains since the dependence between content and style may vary significantly over domains. In this work, we tackle the domain-varying dependence between the content and the style variables inherent in the counterfactual generation task. We provide identification guarantees for such latent-variable models by leveraging the relative sparsity of the influences from different latent variables. Our theoretical insights enable the development of a doMain AdapTive counTerfactual gEneration model, called MATTE. Our theoretically grounded framework achieves state-of-the-art performance in unsupervised style transfer tasks, where neither paired data nor style labels are utilized, across four large-scale datasets.

IS Journal 2020 Journal Article

Commonsense Knowledge Enhanced Memory Network for Stance Classification

  • Jiachen Du
  • Lin Gui
  • Ruifeng Xu
  • Yunqing Xia
  • Xuan Wang

Stance classification aims to identify, in text, the attitude toward given targets as favorable, negative, or unrelated. Existing models for stance classification leverage only the textual representation and ignore commonsense knowledge. To better incorporate commonsense knowledge into stance classification, we propose a novel model, the commonsense knowledge enhanced memory network, which jointly represents the textual and commonsense knowledge of the given target and text. The textual memory module in our model treats the textual representation as memory vectors and uses an attention mechanism to emphasize the important parts. The commonsense knowledge memory module jointly leverages the entity and relation embeddings learned by the TransE model to take full advantage of the constraints of the knowledge graph. Experimental results on the SemEval dataset show that combining the commonsense knowledge memory with the textual memory improves stance classification.
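The TransE embeddings the abstract leverages model a relation as a translation in embedding space: a triple (h, r, t) is plausible when h + r lands close to t. A minimal sketch of that scoring function, with illustrative 2-d embeddings rather than anything from the paper:

```python
import math

def transe_score(h, r, t):
    """TransE plausibility score: negative Euclidean distance between
    the translated head embedding (h + r) and the tail embedding t.
    Higher (closer to 0) means a more plausible triple."""
    return -math.dist([hi + ri for hi, ri in zip(h, r)], t)

# Toy 2-d embeddings: the score is maximal (near 0) when h + r equals t.
h, r, t = [0.1, 0.2], [0.3, -0.1], [0.4, 0.1]
print(transe_score(h, r, t))
```

Training TransE pushes true triples toward zero distance and corrupted ones away, which is what lets the memory module exploit knowledge-graph constraints through the learned vectors.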

IJCAI Conference 2017 Conference Paper

Stance Classification with Target-specific Neural Attention

  • Jiachen Du
  • Ruifeng Xu
  • Yulan He
  • Lin Gui

Stance classification, which aims to detect the stance expressed in text towards a specific target, is an emerging problem in sentiment analysis. A major difference between stance classification and traditional aspect-level sentiment classification is that the identification of stance depends on a target which might not be explicitly mentioned in the text. This indicates that, apart from the text content, target information is important for stance detection. To this end, we propose a neural network-based model that incorporates target-specific information into stance classification via a novel attention mechanism. Specifically, the attention mechanism is expected to locate the critical parts of the text which are related to the target. Our evaluations on both the English and Chinese Stance Detection datasets show that the proposed model achieves state-of-the-art performance.
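Target-conditioned attention of the kind described can be sketched in its generic form: score each token vector against the target representation, softmax the scores, and pool the text by those weights. This is an illustrative sketch of the general mechanism, not the paper's exact architecture.

```python
import math

def target_attention(text_vecs, target_vec):
    """Weight each token vector by its (softmaxed) dot-product similarity
    to the target vector, then return the attention-weighted sum."""
    scores = [sum(x * t for x, t in zip(v, target_vec)) for v in text_vecs]
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(target_vec)
    return [sum(w * v[i] for w, v in zip(weights, text_vecs))
            for i in range(dim)]

# A token aligned with the target dominates the pooled representation.
pooled = target_attention([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0])
```

The pooled vector is then what a classifier head would consume, so tokens related to the (possibly unmentioned) target contribute most to the stance decision.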

IJCAI Conference 2016 Conference Paper

Intersubjectivity and Sentiment: From Language to Knowledge

  • Lin Gui
  • Ruifeng Xu
  • Yulan He
  • Qin Lu
  • Zhongyu Wei

Intersubjectivity is an important concept in psychology and sociology. It refers to sharing conceptualizations through social interactions in a community and using such shared conceptualization as a resource to interpret things that happen in everyday life. In this work, we make use of intersubjectivity as the basis to model shared stance and subjectivity for sentiment analysis. We construct an intersubjectivity network which links review writers, the terms they used, and the polarities of those terms. Based on this network model, we propose a method to learn writer embeddings, which are subsequently incorporated into a convolutional neural network for sentiment analysis. Evaluations on the IMDB, Yelp 2013, and Yelp 2014 datasets show that the proposed approach achieves state-of-the-art performance.