Arrow Research search

Author name cluster

Zhijing Wu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers
1 author row

Possible papers (8)

AAAI Conference 2026 Conference Paper

WikiREVIEW: A Multi-Perspective Review Framework for Automatic Wiki-Style Article Generation

  • Guo-Biao Zhang
  • Zhijing Wu
  • Tian Lan
  • Ding-Yuan Liu
  • Yu-Shi Zhu
  • Xian-Ling Mao

As a knowledge-intensive and challenging task, automatic generation of long-form wiki-style articles has garnered increasing attention from researchers due to its ability to efficiently integrate, organize and present vast amounts of both structured and unstructured knowledge. To the best of our knowledge, most existing mainstream state-of-the-art methods for automatic wiki-style article generation follow a "one-shot generation" paradigm: given a topic, they (1) first generate a structured outline, and (2) then independently and in parallel generate the content of each outline chapter in one shot, using the chapter title and references. However, the core limitation of this paradigm is that it disregards inter-chapter correlations and lacks post-generation revision and refinement, resulting in content redundancy, weak relevance and logical inconsistency. To address these issues, we propose WikiREVIEW, a novel multi-perspective review framework for automatic wiki-style article generation. Specifically, our method introduces multi-perspective experts that review the content of each outline chapter at both the chapter and paragraph levels after the initial generation, offering evaluation feedback and continuously refining the numerous deficiencies in the initial long-form article, ultimately achieving high-quality wiki-style article generation. Extensive experimental results on the public English dataset FreshWiki and our own high-quality Chinese dataset ChineseWiki demonstrate that WikiREVIEW significantly outperforms existing state-of-the-art automatic wiki-style article generation methods across all automatic evaluation metrics and in human evaluation.
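The review-then-refine idea from the abstract can be sketched as a loop: score each chapter, and rewrite only the chapters flagged as deficient until all pass or a round budget is exhausted. The scorer and refiner below are toy stand-ins for the paper's LLM-based expert reviewers, not its actual implementation.

```python
# Illustrative sketch of a review-then-refine loop over generated chapters.
# review_chapter and refine_chapter are toy stand-ins for expert reviewers.

def review_chapter(chapter: str) -> float:
    """Toy chapter-level reviewer: penalize very short chapters."""
    return min(1.0, len(chapter.split()) / 50)

def refine_chapter(chapter: str, score: float) -> str:
    """Toy refiner: expand a chapter when feedback flags it as deficient."""
    return chapter + " [expanded with reviewer feedback]"

def review_and_refine(chapters, threshold=0.8, max_rounds=3):
    for _ in range(max_rounds):
        scores = [review_chapter(c) for c in chapters]
        if all(s >= threshold for s in scores):
            break
        chapters = [c if s >= threshold else refine_chapter(c, s)
                    for c, s in zip(chapters, scores)]
    return chapters

article = review_and_refine(["word " * 60, "Short stub."])
```

A real system would replace the length heuristic with chapter- and paragraph-level expert feedback; the control flow (score, filter, refine, repeat) is the part this sketch illustrates.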

AAAI Conference 2025 Conference Paper

BLADE: Enhancing Black-Box Large Language Models with Small Domain-Specific Models

  • Haitao Li
  • Qingyao Ai
  • Jia Chen
  • Qian Dong
  • Zhijing Wu
  • Yiqun Liu

Large Language Models (LLMs) like ChatGPT and GPT-4 are versatile and capable of addressing open-domain question-answering (QA) tasks effectively. However, general LLMs, which are developed on open-domain data, may lack the domain-specific knowledge essential for tasks in vertical domains such as law and medicine. To address this issue, previous approaches either conduct continuous pre-training with domain-specific data or employ retrieval augmentation to support general LLMs in handling QA tasks. Unfortunately, these strategies are either cost-intensive or unreliable in practical applications. To this end, we present a novel framework named BLADE, which enhances Black-box LArge language models with small Domain-spEcific models. BLADE consists of a black-box LLM and a small domain-specific LM. The small LM preserves domain-specific knowledge and offers specialized insights, while the general LLM contributes robust language comprehension and reasoning capabilities. Specifically, our method involves three steps: 1) pre-training the small LM with domain-specific data, 2) fine-tuning this model using knowledge instruction data, and 3) jointly optimizing the general LLM and the small LM via Bayesian optimization. In our experiments, we verify the effectiveness of BLADE on diverse LLMs and datasets across different domains. This shows the potential of BLADE as an effective and cost-efficient solution for adapting general LLMs to vertical domains.
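The division of labor described in the abstract can be sketched at inference time: the small domain LM supplies specialized knowledge, which is then prepended to the black-box LLM's prompt. Both "models" below are hypothetical stubs (a lookup table and an echo function), included only to show the interaction pattern, not BLADE's actual components.

```python
# Toy sketch of BLADE-style inference: a small domain LM supplies specialized
# knowledge that grounds the prompt sent to a black-box general LLM.

DOMAIN_KB = {"statute of limitations": "In many jurisdictions it is 2-6 years."}

def small_domain_lm(question: str) -> str:
    """Stand-in for the fine-tuned small LM: emit domain knowledge."""
    for key, fact in DOMAIN_KB.items():
        if key in question.lower():
            return fact
    return ""

def black_box_llm(prompt: str) -> str:
    """Stand-in for the general LLM: answers from the grounded prompt."""
    return f"Answer based on: {prompt}"

def blade_answer(question: str) -> str:
    knowledge = small_domain_lm(question)
    prompt = f"Knowledge: {knowledge}\nQuestion: {question}"
    return black_box_llm(prompt)

reply = blade_answer("What is the statute of limitations for contract claims?")
```

The key design point is that the black-box LLM is never fine-tuned; only the small, cheap-to-train LM carries the vertical-domain knowledge.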

IJCAI Conference 2025 Conference Paper

Detecting Hallucination in Large Language Models Through Deep Internal Representation Analysis

  • Luan Zhang
  • Dandan Song
  • Zhijing Wu
  • Yuhang Tian
  • Changzhi Zhou
  • Jing Xu
  • Ziyi Yang
  • Shuhao Zhang

Large language models (LLMs) have shown exceptional performance across various domains. However, LLMs are prone to hallucinating facts and generating non-factual responses, which can undermine their reliability in real-world applications. Current hallucination detection methods suffer from external resource demands, substantial time overhead, difficulty overcoming LLMs' intrinsic limitations, and insufficient modeling. In this paper, we propose MHAD, a novel internal-representation-based hallucination detection method. MHAD uses linear probing to select neurons and layers within LLMs; the selected neurons and layers are shown to be significantly aware of hallucinations at the initial and final generation steps. By concatenating the outputs of these selected neurons from the selected layers at the initial and final generation steps, a hallucination awareness vector is formed, enabling precise hallucination detection via an MLP. Additionally, we introduce SOQHD, a novel benchmark for evaluating hallucination detection in Open-Domain QA (ODQA). Extensive experiments show that MHAD outperforms existing hallucination detection methods across multiple LLMs, demonstrating superior effectiveness.
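The vector construction described above can be sketched concretely: take the activations of probe-selected neurons at the first and last decode steps, concatenate them, and score the result with a small classifier. The shapes, the selection rule, and the fixed weights below are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

# Toy sketch of forming a "hallucination awareness vector" from internal
# activations at the initial and final generation steps.

rng = np.random.default_rng(0)
hidden = 16                                 # hidden size per layer (toy)
acts_initial = rng.normal(size=hidden)      # activations at first decode step
acts_final = rng.normal(size=hidden)        # activations at last decode step

probe_scores = rng.random(hidden)           # stand-in for linear-probe accuracy
selected = np.argsort(probe_scores)[-4:]    # keep the 4 most "aware" neurons

# Concatenate selected activations from both steps into one feature vector.
awareness_vec = np.concatenate([acts_initial[selected], acts_final[selected]])

# Minimal one-layer logistic scorer standing in for the detection MLP.
w = rng.normal(size=awareness_vec.shape)
p_hallucination = 1.0 / (1.0 + np.exp(-(awareness_vec @ w)))
```

In a real detector the probe scores come from trained linear probes per neuron and layer, and the MLP is trained on labeled hallucination data; the sketch only shows how the selected features are assembled.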

NeurIPS Conference 2025 Conference Paper

Storyboard-guided Alignment for Fine-grained Video Action Recognition

  • Enqi Liu
  • Liyuan Pan
  • Yan Yang
  • Yiran Zhong
  • Zhijing Wu
  • Xinxiao Wu
  • Liu Liu

Fine-grained video action recognition can be formulated as a video–text matching problem. Previous approaches primarily rely on global video semantics to consolidate video embeddings, often leading to misaligned video–text pairs due to inaccurate atomic-level action understanding. This inaccuracy arises because (i) videos with distinct global semantics may share similar atomic actions or visual appearances, and (ii) atomic actions can be momentary, gradual, or not directly aligned with overarching video semantics. Inspired by storyboarding, where a script is segmented into individual shots, we propose SFAR, a multi-granularity framework. SFAR generates fine-grained descriptions of common atomic actions for each global semantic using a large language model. Unlike existing works that refine global semantics with auxiliary video frames, SFAR introduces a filtering metric to ensure correspondence between the descriptions and the global semantics, eliminating the need for direct video involvement and thereby enabling more nuanced recognition of subtle actions. By leveraging both global semantics and fine-grained descriptions, SFAR effectively identifies prominent frames within videos, improving the accuracy of embedding aggregation. Extensive experiments on various video action recognition datasets demonstrate the competitive performance of SFAR in supervised, few-shot, and zero-shot settings.
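One plausible reading of the filtering metric is a similarity gate: keep only generated atomic-action descriptions whose embedding is close enough to the global semantic's embedding. The embeddings, names, and threshold below are toy assumptions (a real system would use a text encoder); only the filtering pattern is illustrated.

```python
import numpy as np

# Toy sketch of filtering LLM-generated descriptions against a global
# semantic via cosine similarity, with no video frames involved.

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(1)
global_sem = rng.normal(size=32)            # embedding of the global semantic
descriptions = {f"action_{i}": rng.normal(size=32) for i in range(5)}
# Make one description deliberately aligned with the global semantic.
descriptions["action_0"] = global_sem + 0.05 * rng.normal(size=32)

# Keep only descriptions that correspond to the global semantic.
kept = [name for name, emb in descriptions.items()
        if cosine(emb, global_sem) > 0.5]
```

Because the gate compares text embeddings to text embeddings, the filter runs entirely offline, which matches the abstract's point that no direct video involvement is needed.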

NeurIPS Conference 2025 Conference Paper

Zero-Shot Detection of LLM-Generated Text via Implicit Reward Model

  • Runheng Liu
  • Heyan Huang
  • Xingchen Xiao
  • Zhijing Wu

Large language models (LLMs) have demonstrated remarkable capabilities across various tasks. However, their ability to generate human-like text has raised concerns about potential misuse. This underscores the need for reliable and effective methods to detect LLM-generated text. In this paper, we propose IRM, a novel zero-shot approach that leverages Implicit Reward Models for LLM-generated text detection. Such implicit reward models can be derived from publicly available instruction-tuned and base models. Previous reward-based methods rely on preference construction and task-specific fine-tuning; in comparison, IRM requires neither preference collection nor additional training. We evaluate IRM on the DetectRL benchmark and demonstrate that it achieves superior detection performance, outperforming existing zero-shot and supervised methods in LLM-generated text detection.
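A common way to derive an implicit reward from an instruction-tuned/base model pair (the DPO-style formulation) is the log-probability ratio between the two models on the same text. Whether this is exactly the paper's scoring rule is an assumption; the per-token log-probs below are made-up numbers standing in for two real forward passes.

```python
# Sketch of a DPO-style implicit reward: score a text by the log-probability
# ratio between an instruction-tuned model and its base model.

def implicit_reward(logp_tuned, logp_base, beta=1.0):
    """Implicit reward = beta * (log p_tuned(x) - log p_base(x))."""
    return beta * (sum(logp_tuned) - sum(logp_base))

# Hypothetical per-token log-probs for one candidate text.
logp_tuned = [-1.2, -0.8, -0.5, -1.0]   # instruction-tuned model
logp_base  = [-2.0, -1.5, -1.4, -1.9]   # base model

score = implicit_reward(logp_tuned, logp_base)
is_llm_generated = score > 0.0          # toy decision threshold
```

The zero-shot property follows directly: both models are publicly available checkpoints, so no preference data or additional training is needed to compute the score.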

TIST Journal 2023 Journal Article

A Semantically Driven Hybrid Network for Unsupervised Entity Alignment

  • Jia Li
  • Dandan Song
  • Zhijing Wu

The major challenge in the task of entity alignment (EA) lies in the heterogeneity of knowledge graphs. The traditional solution to EA is to first map entities to the same space via knowledge embedding and then calculate the similarity between entities from different knowledge graphs. However, these methods mainly rely on manually labeled seeds of EA, which limits their applicability. Some researchers have begun using pseudo-labels rather than seeds for unsupervised EA. However, directly using pseudo-labels causes new problems, such as noise in the pseudo-labels. In this article, we propose a model called the Semantically Driven Hybrid Network (SDHN) to reduce the impact of noise in the pseudo-labels on the performance of EA models. The SDHN consists of two modules: a Teacher–Student Network (TSN) and a Rotation and Penalty (RAP) module. The TSN reduces the impact of noise in two ways: (1) the TSN's teacher network guides its student network to construct pseudo-labels based on semantic information instead of creating pseudo-labels directly, and (2) it adaptively fuses semantic information into the student network to improve the final entity embedding representations. Finally, the TSN enhances the performance of entity alignment models via the RAP module. The results of experiments on multiple benchmark datasets show that the SDHN outperforms state-of-the-art models.
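The noise-filtering idea behind semantically guided pseudo-labels can be sketched with a standard trick from unsupervised EA: accept an entity pair as a pseudo-label only if the two entities are mutual nearest neighbors in the teacher's embedding space. The embeddings below are toy vectors, and mutual-NN filtering is an illustrative stand-in for the TSN's semantic guidance, not SDHN's actual mechanism.

```python
import numpy as np

# Toy sketch of pseudo-label construction with noise filtering: only
# mutual nearest neighbors across the two KGs become pseudo-labels.

rng = np.random.default_rng(2)
kg1 = np.eye(4)                               # toy teacher embeddings, KG 1
kg2 = np.eye(4) + 0.01 * rng.normal(size=(4, 4))  # KG 2: near-copies of KG 1

sim = kg1 @ kg2.T                             # pairwise similarity matrix
nn12 = sim.argmax(axis=1)                     # best KG2 match per KG1 entity
nn21 = sim.argmax(axis=0)                     # best KG1 match per KG2 entity

# Keep only mutual nearest neighbors as pseudo-labels.
pseudo_labels = [(i, j) for i, j in enumerate(nn12) if nn21[j] == i]
```

Pairs that match in only one direction are exactly the noisy candidates the filter drops, which is the effect the abstract attributes to semantically guided pseudo-labeling.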

AAAI Conference 2020 Short Paper

A Multi-Task Learning Machine Reading Comprehension Model for Noisy Document (Student Abstract)

  • Zhijing Wu
  • Hua Xu

Current neural models for Machine Reading Comprehension (MRC) have achieved successful performance in recent years. However, these models are fragile and lack robustness against imperceptible adversarial perturbations to the input. In this work, we propose a multi-task learning MRC model with hierarchical knowledge enrichment to improve robustness to noisy documents. Our model follows a typical encode-align-decode framework. Additionally, we apply a hierarchical, coarse-to-fine method of adding background knowledge into the model to enhance its language representations. Moreover, we optimize our model by jointly training answer span prediction and unanswerability prediction, aiming to improve robustness to noise. Experimental results on benchmark datasets confirm the superiority of our method, which achieves competitive performance compared with other strong baselines.
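The joint training objective mentioned above can be sketched as a weighted sum of two losses: cross-entropy over the answer span's start/end positions plus a binary loss on an unanswerability score. All logits and the 0.5 weighting are illustrative values, not the paper's actual configuration.

```python
import numpy as np

# Toy sketch of a joint MRC objective: answer-span loss + unanswerability loss.

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

start_logits = np.array([0.2, 2.0, 0.1, -0.5])   # toy per-token start scores
end_logits = np.array([0.0, 0.3, 2.5, -1.0])     # toy per-token end scores
gold_start, gold_end = 1, 2                      # gold answer span indices

# Cross-entropy on the gold start and end positions.
span_loss = (-np.log(softmax(start_logits)[gold_start])
             - np.log(softmax(end_logits)[gold_end]))

# Binary cross-entropy on the unanswerability prediction.
na_logit = 1.5                                   # unanswerability score
gold_na = 0.0                                    # this question is answerable
p_na = 1.0 / (1.0 + np.exp(-na_logit))
na_loss = -(gold_na * np.log(p_na) + (1.0 - gold_na) * np.log(1.0 - p_na))

total_loss = span_loss + 0.5 * na_loss           # jointly optimized objective
```

Sharing one encoder under both losses is what lets the unanswerability signal regularize span prediction on noisy inputs.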