AAAI 2026 Conference Paper
Do We Truly Need So Many Samples? Multi-LLM Repeated Sampling Efficiently Scales Test-Time Compute
- Jianhao Chen
- Zishuo Xun
- Bocheng Zhou
- Han Qi
- Hangfan Zhang
- Qiaosheng Zhang
- Yang Chen
- Wei Hu
This paper presents a simple, effective, and cost-efficient strategy, named ModelSwitch, to improve LLM performance by scaling test-time compute. ModelSwitch builds upon the repeated-sampling-then-voting framework, with a novel twist: incorporating multiple models, even weaker ones, to leverage their complementary strengths that potentially arise from diverse training data and paradigms. By using sample consistency as a signal, our strategy dynamically switches between models. Theoretical analysis highlights the efficiency and performance advantages of our strategy. Extensive experiments on seven datasets demonstrate that our strategy not only outperforms self-consistency and state-of-the-art multi-agent debate approaches, but also significantly reduces inference costs. Additionally, our strategy requires only a few comparable LLMs to achieve optimal performance and can be extended with verification methods, demonstrating the potential of leveraging multiple LLMs in the generation-verification paradigm.
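The switching strategy described above can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's implementation: it assumes each model is a callable that maps a question to an answer string (the `samples_per_model` and `consistency_threshold` parameters are hypothetical names), draws repeated samples from one model, and only switches to the next model when the sampled answers are insufficiently consistent.

```python
from collections import Counter

def model_switch(question, models, samples_per_model=5, consistency_threshold=0.6):
    """Sketch of the ModelSwitch idea: sample repeatedly from one model,
    and switch to the next model only when its answers disagree too much.
    `models` is an ordered list of callables: question -> answer string."""
    all_answers = []
    for generate in models:
        answers = [generate(question) for _ in range(samples_per_model)]
        all_answers.extend(answers)
        top_answer, count = Counter(answers).most_common(1)[0]
        # High internal consistency: accept this model's majority answer
        # without spending samples on the remaining models.
        if count / len(answers) >= consistency_threshold:
            return top_answer
    # No single model was consistent enough: fall back to a plain
    # majority vote over all collected samples.
    return Counter(all_answers).most_common(1)[0][0]
```

Because sampling stops as soon as one model's answers agree, easy questions consume only the first model's budget, which is where the cost savings over fixed-budget repeated sampling come from.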