Arrow Research search

Author name cluster

Qizhou Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers
2 author rows

Possible papers (4)

AAAI 2025 · Conference Paper

Attribution Analysis Meets Model Editing: Advancing Knowledge Correction in Vision Language Models with VisEdit

  • Qizhou Chen
  • Taolin Zhang
  • Chengyu Wang
  • Xiaofeng He
  • Dakan Wang
  • Tingting Liu

Model editing aims to correct outdated or erroneous knowledge in large models without costly retraining. Recent research discovered that the mid-layer representation of the subject's final token in a prompt has a strong influence on factual predictions, and developed Large Language Model (LLM) editing techniques based on this observation. However, for Vision-LLMs (VLLMs), how visual representations impact the predictions from a decoder-only language model remains largely unexplored. To the best of our knowledge, model editing for VLLMs has not been extensively studied in the literature. In this work, we employ the contribution allocation and noise perturbation methods to measure the contributions of visual representations for token predictions. Our attribution analysis shows that visual representations in mid-to-later layers that are highly relevant to the prompt contribute significantly to predictions. Based on these insights, we propose *VisEdit*, a novel model editor for VLLMs that effectively corrects knowledge by editing intermediate visual representations in regions important to the edit prompt. We evaluated *VisEdit* using multiple VLLM backbones and public VLLM editing benchmark datasets. The results show the superiority of *VisEdit* over the strong baselines adapted from existing state-of-the-art editors for LLMs.
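The abstract's attribution analysis, measuring how much each visual representation contributes to a token prediction via noise perturbation, can be illustrated with a generic sketch. This is not the authors' exact contribution-allocation procedure; `toy_predict` and all weights below are invented stand-ins for a real VLLM head.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def perturbation_attribution(predict, reps, target_idx, sigma=1.0, n_samples=20, seed=0):
    """Score each visual token by the average drop in the target-class
    probability when that token's representation is replaced by a noised copy.
    A generic noise-perturbation attribution sketch, not VisEdit's method."""
    rng = np.random.default_rng(seed)
    base = predict(reps)[target_idx]
    scores = np.zeros(len(reps))
    for i in range(len(reps)):
        drops = []
        for _ in range(n_samples):
            noised = reps.copy()
            noised[i] = noised[i] + rng.normal(0.0, sigma, size=noised[i].shape)
            drops.append(base - predict(noised)[target_idx])
        scores[i] = float(np.mean(drops))
    return scores

# Toy stand-in for a model head: token 0 is weighted 5x in the pooled logits,
# so perturbing it should hurt the target prediction the most.
def toy_predict(reps):
    pooled = 5.0 * reps[0] + reps[1] + reps[2]
    return softmax(pooled)

reps = np.array([[1.0, 0.0, 0.0],
                 [0.2, 0.0, 0.0],
                 [0.2, 0.0, 0.0]])
scores = perturbation_attribution(toy_predict, reps, target_idx=0)
```

Tokens whose perturbation most degrades the prediction are the ones an editor would target; here token 0 receives the highest score by construction.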

NeurIPS 2025 · Conference Paper

Surface-Aware Feed-Forward Quadratic Gaussian for Frame Interpolation with Large Motion

  • Zaoming Yan
  • Yaomin Huang
  • Pengcheng Lei
  • Qizhou Chen
  • Guixu Zhang
  • Faming Fang

Motion in the real world takes place in 3D space, yet existing frame interpolation methods typically estimate global receptive fields in 2D frame space. Constrained by 2D space, these receptive fields are limited, making it difficult to match object correspondences between frames and leading to sub-optimal performance in large-motion scenarios. In this paper, we introduce a novel pipeline for exploring object correspondences based on differential surface theory. The differential surface coordinate system provides a better representation of the real world, enabling effective exploration of object correspondences. Specifically, the pipeline first transforms an input pair of video frames from the image coordinate system to the differential surface coordinate system. Subsequently, within this coordinate system, object correspondences are explored based on surface geometric properties and the surface uniqueness theorem. Experimental results show that our method achieves state-of-the-art performance on VFI benchmarks with large motion.

NeurIPS 2025 · Conference Paper

UniEdit: A Unified Knowledge Editing Benchmark for Large Language Models

  • Qizhou Chen
  • Dakan Wang
  • Taolin Zhang
  • Zaoming Yan
  • Chengsong You
  • Chengyu Wang
  • Xiaofeng He

Model editing aims to efficiently revise incorrect or outdated knowledge within LLMs without incurring the high cost of full retraining or risking catastrophic forgetting. Currently, most LLM editing datasets are confined to narrow knowledge domains and cover a limited range of editing evaluations. They often overlook the broad scope of editing demands and the diversity of ripple effects resulting from edits. In this context, we introduce UniEdit, a unified benchmark for LLM editing grounded in open-domain knowledge. First, we construct editing samples by selecting entities from 25 common domains across five major categories, utilizing the extensive triple knowledge available in open-domain knowledge graphs to ensure comprehensive coverage of the knowledge domains. To address the issues of generality and locality in editing, we design a Neighborhood Multi-hop Chain Sampling (NMCS) algorithm that samples subgraphs based on a given knowledge piece, so that comprehensive ripple effects can be evaluated. Finally, we employ proprietary LLMs to convert the sampled knowledge subgraphs into natural language text, guaranteeing grammatical accuracy and syntactical diversity. Extensive statistical analysis confirms the scale, comprehensiveness, and diversity of our UniEdit benchmark. We conduct comprehensive experiments across multiple LLMs and editors, analyzing their performance to highlight strengths and weaknesses in editing across open knowledge domains and various evaluation criteria, thereby offering valuable insights for future research.
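The idea of sampling multi-hop chains around a seed fact from a triple-based knowledge graph can be sketched minimally. This is a generic neighborhood walk, not the paper's NMCS algorithm (whose selection criteria are not described here); the toy triples are invented for illustration.

```python
import random
from collections import defaultdict

def sample_multihop_chains(triples, seed_entity, hops=2, n_chains=3, rng=None):
    """Sample multi-hop relation chains rooted at a seed entity from a set of
    (head, relation, tail) triples. A generic sketch of neighborhood multi-hop
    sampling; the paper's NMCS algorithm may differ substantially."""
    rng = rng or random.Random(0)
    out_edges = defaultdict(list)
    for h, r, t in triples:
        out_edges[h].append((r, t))
    chains = []
    for _ in range(n_chains):
        chain, node = [], seed_entity
        for _ in range(hops):
            if not out_edges[node]:
                break  # dead end: no outgoing edges to extend the chain
            r, t = rng.choice(out_edges[node])
            chain.append((node, r, t))
            node = t
        if chain:
            chains.append(chain)
    return chains

# Toy knowledge graph (hypothetical triples, for illustration only).
kg = [("Paris", "capital_of", "France"),
      ("France", "member_of", "EU"),
      ("France", "borders", "Spain")]
chains = sample_multihop_chains(kg, "Paris", hops=2, n_chains=2)
```

Each sampled chain links the seed fact to its neighborhood, which is the raw material a benchmark could verbalize into ripple-effect test prompts.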

ECAI 2024 · Conference Paper

R4: Reinforced Retriever-Reorder-Responder for Retrieval-Augmented Large Language Models

  • Taolin Zhang 0001
  • Dongyang Li
  • Qizhou Chen
  • Chengyu Wang 0001
  • Longtao Huang
  • Hui Xue 0001
  • Xiaofeng He
  • Jun Huang 0007

Retrieval-augmented large language models (LLMs) leverage relevant content retrieved by information retrieval systems to generate correct responses, aiming to alleviate the hallucination problem. However, existing retriever-responder methods typically append relevant documents to the prompt of LLMs to perform text generation tasks without considering the interaction of fine-grained structural semantics between the retrieved documents and the LLMs. This issue is particularly important for accurate response generation, as LLMs tend to get "lost in the middle" when dealing with input prompts augmented with lengthy documents. In this work, we propose a new pipeline named "Reinforced Retriever-Reorder-Responder" (R4) to learn document orderings for retrieval-augmented LLMs, further enhancing their generation abilities while the large-scale parameters of the LLMs remain frozen. The reordering learning process is divided into two steps according to the quality of the generated responses: document order adjustment and document representation enhancement. Specifically, document order adjustment organizes the retrieved documents into beginning, middle, and end positions based on graph attention learning, maximizing the reinforced reward of response quality. Document representation enhancement further refines the representations of retrieved documents for responses of poor quality via document-level gradient adversarial learning. Extensive experiments demonstrate that our pipeline achieves better factual question-answering performance on knowledge-intensive tasks than strong baselines across various public datasets. The source code and trained models will be released upon paper acceptance.
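The motivation behind document order adjustment, placing strong documents where LLMs attend best, can be shown with a simple static heuristic. This is only a sketch of the "lost in the middle" intuition; the paper instead learns orderings with graph attention and reinforcement, and the document names and scores below are made up.

```python
def reorder_for_llm(docs_with_scores):
    """Order retrieved documents so the most relevant land at the beginning
    and end of the prompt and weaker ones fall in the middle. A fixed
    heuristic inspired by the 'lost in the middle' observation, not the
    paper's learned graph-attention ordering policy."""
    ranked = sorted(docs_with_scores, key=lambda d: d[1], reverse=True)
    front, back = [], []
    for i, (doc, _) in enumerate(ranked):
        # Alternate strong documents between the two ends of the prompt.
        (front if i % 2 == 0 else back).append(doc)
    return front + back[::-1]  # best documents sit at both ends

# Hypothetical retrieval scores for four documents.
docs = [("d_low", 0.1), ("d_top", 0.9), ("d_mid", 0.5), ("d_fourth", 0.3)]
order = reorder_for_llm(docs)
# → ['d_top', 'd_fourth', 'd_low', 'd_mid']
```

The top-ranked document opens the prompt, the second-best closes it, and the weakest documents are pushed toward the middle, where attention is reportedly poorest.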