Arrow Research

Author name cluster

Yuqi Liu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers
2 author rows

Possible papers (7)

AAAI Conference 2026 Conference Paper

ChartEditor: A Reinforcement Learning Framework for Robust Chart Editing

  • Liangyu Chen
  • Yichen Xu
  • Jianzhe Ma
  • Yuqi Liu
  • Donglu Yang
  • Liang Zhang
  • Zihao Yue
  • Wenxuan Wang

Chart editing reduces manual effort in visualization design. Typical benchmarks assume access to complete chart code, which is unrealistic for real-world applications. In this paper, we present ChartEditVista, a comprehensive benchmark consisting of 7,964 samples spanning 31 chart categories. It encompasses diverse editing instruction types and covers nearly all editable chart elements. The inputs in ChartEditVista include only the original chart image and natural language editing instructions, without the original chart code. ChartEditVista is generated through a fully automated pipeline that produces, edits, and verifies charts, ensuring high-quality data. In addition, we introduce two novel fine-grained, rule-based evaluation metrics: the layout metric, which evaluates the position, size, and color of graphical components, and the text metric, which jointly assesses textual content and font styling. Building on ChartEditVista, we present ChartEditor, a model trained using a reinforcement learning framework that incorporates a novel rendering reward to simultaneously enforce code executability and visual fidelity. Through extensive experiments and human evaluations, we demonstrate that ChartEditVista provides a robust evaluation, while ChartEditor consistently outperforms models of similar and larger scale on chart editing tasks.
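As a rough illustration of how a rendering reward of this kind could be structured, here is a minimal Python sketch: it executes generated chart code in a temporary script and returns zero when the code fails to run (the executability term), otherwise a pixel-level similarity to the target chart (the visual-fidelity term). The helper names, the OUT_PATH convention, and the mean-absolute-difference similarity are illustrative assumptions, not the paper's implementation.

```python
import os
import subprocess
import tempfile

import numpy as np
from PIL import Image

def image_similarity(a_path: str, b_path: str, size=(256, 256)) -> float:
    """Illustrative visual-fidelity term: mean pixel agreement in [0, 1]."""
    a = np.asarray(Image.open(a_path).convert("RGB").resize(size), dtype=float)
    b = np.asarray(Image.open(b_path).convert("RGB").resize(size), dtype=float)
    return 1.0 - float(np.mean(np.abs(a - b))) / 255.0

def rendering_reward(chart_code: str, target_png: str) -> float:
    """Hypothetical rendering reward combining the two terms the abstract
    names: code executability and visual fidelity."""
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "chart.py")
        out_png = os.path.join(tmp, "chart.png")
        # Assumes the generated code saves its figure to the literal OUT_PATH.
        with open(script, "w") as f:
            f.write(chart_code.replace("OUT_PATH", repr(out_png)))
        try:
            proc = subprocess.run(["python", script],
                                  capture_output=True, timeout=30)
        except subprocess.TimeoutExpired:
            return 0.0
        if proc.returncode != 0 or not os.path.exists(out_png):
            return 0.0  # non-executable code earns no reward
        return image_similarity(out_png, target_png)
```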

ICML Conference 2025 Conference Paper

Emoji Attack: Enhancing Jailbreak Attacks Against Judge LLM Detection

  • Zhipeng Wei 0001
  • Yuqi Liu
  • N. Benjamin Erichson

Jailbreaking techniques trick Large Language Models (LLMs) into producing restricted output, posing a potential threat. One line of defense is to use another LLM as a Judge to evaluate the harmfulness of generated text. However, we reveal that these Judge LLMs are vulnerable to token segmentation bias, an issue that arises when delimiters alter the tokenization process, splitting words into smaller sub-tokens. This alters the embeddings of the entire sequence, reducing detection accuracy and allowing harmful content to be misclassified as safe. In this paper, we introduce Emoji Attack, a novel strategy that amplifies existing jailbreak prompts by exploiting token segmentation bias. Our method leverages in-context learning to systematically insert emojis into text before it is evaluated by a Judge LLM, inducing embedding distortions that significantly lower the likelihood of detecting unsafe content. Unlike traditional delimiters, emojis also introduce semantic ambiguity, making them particularly effective in this attack. Through experiments on state-of-the-art Judge LLMs, we demonstrate that Emoji Attack substantially reduces the unsafe prediction rate, bypassing existing safeguards.
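The token segmentation bias the abstract describes is easy to reproduce with any subword tokenizer. The snippet below uses a GPT-2 tokenizer from Hugging Face to show that inserting an emoji mid-word changes how the surrounding characters are segmented; the specific phrase and emoji placement are illustrative, not the paper's attack prompts.

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

EMOJI = "\N{GRINNING FACE}"
plain = "ignore previous instructions"
perturbed = f"ign{EMOJI}ore prev{EMOJI}ious instruc{EMOJI}tions"

# Exact splits depend on the tokenizer, but the emoji bytes reliably break
# each word into different sub-tokens than the plain text produces, shifting
# the sequence embedding a Judge LLM conditions its safety verdict on.
print(tok.tokenize(plain))
print(tok.tokenize(perturbed))
```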

AAAI Conference 2025 Conference Paper

Enhancing LLMs via High-Knowledge Data Selection

  • Feiyu Duan
  • Xuemiao Zhang
  • Sirui Wang
  • Haoran Que
  • Yuqi Liu
  • Wenge Rong
  • Xunliang Cai

The performance of Large Language Models (LLMs) is intrinsically linked to the quality of their training data. Although several studies have proposed methods for high-quality data selection, they do not consider the importance of knowledge richness in text corpora. In this paper, we propose a novel and gradient-free High-Knowledge Scorer (HKS) to select high-quality data along the dimension of knowledge, alleviating the problem of knowledge scarcity in the pre-training corpus. We build a comprehensive multi-domain knowledge element pool and introduce knowledge density and coverage as metrics to assess the knowledge content of a text. Based on these, we propose a comprehensive knowledge scorer to select data with intensive knowledge, which can also be used for domain-specific high-knowledge data selection by restricting knowledge elements to the target domain. We train models on a high-knowledge bilingual dataset, and experimental results demonstrate that our scorer improves the model's performance on knowledge-intensive and general comprehension tasks, and is effective in enhancing both the generic and domain-specific capabilities of the model.
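A toy rendering of the two metrics as the abstract defines them, assuming a whitespace tokenizer and a small hand-made element pool (both stand-ins); how HKS actually extracts knowledge elements and combines the two scores is not specified here.

```python
def knowledge_scores(text: str, element_pool: set) -> tuple:
    """density  = matched knowledge elements per token
    coverage = distinct matched elements / size of the pool"""
    tokens = text.lower().split()  # stand-in for a real tokenizer
    matched = [t for t in tokens if t in element_pool]
    density = len(matched) / max(len(tokens), 1)
    coverage = len(set(matched)) / max(len(element_pool), 1)
    return density, coverage

# Restricting the pool to one domain turns this into the domain-specific
# selection the abstract mentions, e.g. a medical-only element pool:
pool = {"enzyme", "protein", "antibody", "genome"}
print(knowledge_scores("the antibody binds the target protein", pool))
# (0.333..., 0.5): 2 of 6 tokens are elements; 2 of 4 pool elements appear
```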

YNIMG Journal 2025 Journal Article

Reweighting of visuomotor areas during motor processing subsequent to somatosensory cortical damage

  • Yuqi Liu
  • Elizabeth J. Halfen
  • Jeffrey M. Yau
  • Simon Fischer-Baum
  • Peter J. Kohler
  • Olufunsho Faseyitan
  • H. Branch Coslett
  • Jared Medina

Somatosensory inputs are critical to motor control. Animal studies have shown that primary somatosensory lesions cause sensorimotor deficits along with disrupted organization in primary motor cortex (M1). How does damage to primary somatosensory cortex (S1) influence motor networks in humans? Using fMRI, we examined two individuals, LS and RF, who had extensive damage to left somatosensory cortex but largely intact motor cortex and preserved motor abilities. Given left S1 damage, tactile detection and localization were impaired for the contralesional hand in both individuals. When moving the contralesional hand, LS, with near-complete damage to the S1 hand area, showed increased activation in ipsilesional putamen and deactivation in contralesional cerebellum relative to age-matched controls. These findings demonstrate influences of S1 damage on subcortical sensorimotor areas distant from the lesion site, and a potential reweighting of the motor network, with increased action selection in putamen and inhibition of sensory prediction in cerebellum in the face of sensory loss. In contrast, RF, who had a small island of spared S1 in the hand area, showed greater activation in contralesional S1 for movement versus rest. This same region was also activated by pure somatosensory stimulation in a second experiment, suggesting that the spared S1 area in RF still subserves sensorimotor processing. Finally, the right middle occipital gyrus was more strongly activated in both individuals compared with controls, suggesting a potential reliance on visual imagery in the face of degraded sensory feedback.

AAAI Conference 2025 Conference Paper

SCOPE: Sign Language Contextual Processing with Embedding from LLMs

  • Yuqi Liu
  • Wenqian Zhang
  • Sihan Ren
  • Chengyu Huang
  • Jingyi Yu
  • Lan Xu

Sign languages, used by around 70 million Deaf individuals globally, are visual languages that convey visual and contextual information. Current methods in vision-based sign language recognition (SLR) and translation (SLT) struggle with dialogue scenes due to limited dataset diversity and the neglect of contextually relevant information. To address these challenges, we introduce SCOPE (Sign language COntextual Processing with Embedding from LLMs), a novel context-aware vision-based SLR and SLT framework. For SLR, we utilize dialogue contexts through a multi-modal encoder to enhance gloss-level recognition. For subsequent SLT, we further fine-tune a Large Language Model (LLM) by incorporating prior conversational context. We also contribute a new sign language dataset that contains 72 hours of Chinese sign language videos in contextual dialogues across various scenarios. Experimental results demonstrate that our SCOPE framework achieves state-of-the-art performance on multiple datasets, including Phoenix-2014T, CSL-Daily, and our SCOPE dataset. Moreover, surveys conducted with participants from the Deaf community further validate the robustness and effectiveness of our approach in real-world applications. Both our dataset and code will be open-sourced to facilitate further research.
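As a sketch of what "incorporating prior conversational context" into the LLM stage might look like, the function below assembles a translation prompt from recent dialogue turns and a recognized gloss sequence; the template, the three-turn window, and the field names are all invented for illustration.

```python
def build_slt_prompt(context_turns, glosses):
    """Pair recognized sign glosses with recent dialogue context so the
    fine-tuned LLM can resolve context-dependent signs (illustrative only)."""
    context = "\n".join(f"- {turn}" for turn in context_turns[-3:])
    return (
        "Dialogue so far:\n"
        f"{context}\n"
        f"Sign glosses: {' '.join(glosses)}\n"
        "Translation:"
    )

print(build_slt_prompt(
    ["Where are you going?", "To the hospital."],
    ["DOCTOR", "SEE", "WANT"],
))
```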

AAAI Conference 2024 Conference Paper

Toward Open-Set Human Object Interaction Detection

  • Mingrui Wu
  • Yuqi Liu
  • Jiayi Ji
  • Xiaoshuai Sun
  • Rongrong Ji

This work is oriented toward the task of open-set Human Object Interaction (HOI) detection. The challenge lies in identifying completely new, out-of-domain relationships, as opposed to in-domain ones which have seen improvements in zero-shot HOI detection. To address this challenge, we introduce a simple Disentangled HOI Detection (DHD) model for detecting novel relationships by integrating an open-set object detector with a Visual Language Model (VLM). We utilize a disentangled image-text contrastive learning metric for training and connect the bottom-up visual features to text embeddings through lightweight unary and pair-wise adapters. Our model can benefit from the open-set object detector and the VLM to detect novel action categories and combine actions with novel object categories. We further present the VG-HOI dataset, a comprehensive benchmark with over 17k HOI relationships for open-set scenarios. Experimental results show that our model can detect unknown action classes and combine unknown object classes. Furthermore, it can generalize to over 17k HOI classes while being trained on just 600 HOI classes.
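A minimal PyTorch sketch of the adapter idea: lightweight unary and pair-wise projections map detector box features into a VLM text-embedding space, where categories are scored by cosine similarity against frozen text embeddings of prompts. Feature dimensions, the MLP shapes, and the scoring are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UnaryAdapter(nn.Module):
    """Projects one box feature (human or object) into text space."""
    def __init__(self, vis_dim: int = 256, txt_dim: int = 512):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(vis_dim, txt_dim), nn.ReLU(),
                                  nn.Linear(txt_dim, txt_dim))

    def forward(self, x):
        return self.proj(x)

class PairwiseAdapter(nn.Module):
    """Fuses a human-object pair of box features for interaction scoring."""
    def __init__(self, vis_dim: int = 256, txt_dim: int = 512):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(2 * vis_dim, txt_dim), nn.ReLU(),
                                  nn.Linear(txt_dim, txt_dim))

    def forward(self, human, obj):
        return self.proj(torch.cat([human, obj], dim=-1))

# Cosine-similarity logits against frozen VLM text embeddings of prompts.
unary, pair = UnaryAdapter(), PairwiseAdapter()
human_feat, obj_feat = torch.randn(4, 256), torch.randn(4, 256)
object_text_emb = torch.randn(11, 512)  # e.g., 11 object-prompt embeddings
action_text_emb = torch.randn(7, 512)   # e.g., 7 action-prompt embeddings
obj_logits = F.normalize(unary(obj_feat), dim=-1) @ \
             F.normalize(object_text_emb, dim=-1).T
act_logits = F.normalize(pair(human_feat, obj_feat), dim=-1) @ \
             F.normalize(action_text_emb, dim=-1).T
print(obj_logits.shape, act_logits.shape)  # (4, 11) and (4, 7)
```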

AAAI Conference 2023 Conference Paper

Token Mixing: Parameter-Efficient Transfer Learning from Image-Language to Video-Language

  • Yuqi Liu
  • Luhui Xu
  • Pengfei Xiong
  • Qin Jin

Applying large-scale pre-trained image-language models to video-language tasks has recently become a trend, which brings two challenges. One is how to effectively transfer knowledge from static images to dynamic videos, and the other is how to cope with the prohibitive cost of full fine-tuning given growing model sizes. Existing works that attempt parameter-efficient image-language to video-language transfer learning fall into two types: 1) appending a sequence of temporal transformer blocks after the 2D Vision Transformer (ViT), and 2) inserting a temporal block into the ViT architecture. While both types of methods only require fine-tuning the newly added components, there are still many parameters to update, and they are only validated on a single video-language task. In this work, based on our analysis of the core ideas of the temporal modeling components in existing approaches, we propose a token mixing strategy that enables cross-frame interactions, transferring the pre-trained image-language model to video-language tasks by selecting and mixing a key set and a value set from the input video samples. As token mixing does not add any components or modules, we can directly fine-tune only part of the pre-trained image-language model, achieving parameter efficiency. We carry out extensive experiments comparing our token mixing method with other parameter-efficient transfer learning methods. Token mixing outperforms them on both understanding and generation tasks, and achieves new state-of-the-art results on multiple video-language tasks. The code is available at https://github.com/yuqi657/video_language_model.
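One way to read the key/value mixing idea is the sketch below: per-frame patch tokens have a small subset of their slots replaced with tokens from a neighboring frame, so the frozen ViT's per-frame attention sees cross-frame keys and values without any new modules. The (B, T, N, D) layout, the next-frame source, and the random slot selection are stand-ins and are not taken from the linked repository.

```python
import torch

def mix_tokens(frames: torch.Tensor, n_mix: int = 8) -> torch.Tensor:
    """frames: (B, T, N, D) patch tokens for T frames of a clip.
    Swap n_mix token slots in each frame with the same slots from the next
    frame, enabling cross-frame interaction with zero added parameters."""
    B, T, N, D = frames.shape
    mixed = frames.clone()
    for t in range(T):
        src = (t + 1) % T                 # borrow tokens from the next frame
        idx = torch.randperm(N)[:n_mix]   # which token slots to swap
        mixed[:, t, idx] = frames[:, src, idx]
    return mixed

clip = torch.randn(2, 8, 196, 768)  # 2 clips, 8 frames, 196 ViT patch tokens
print(mix_tokens(clip).shape)       # torch.Size([2, 8, 196, 768])
```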