Author name cluster

Wayne Xin Zhao

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers
1 author row

Possible papers

12

AAAI 2025 · Conference Paper

Unleashing the Potential of Large Language Models as Prompt Optimizers: Analogical Analysis with Gradient-based Model Optimizers

  • Xinyu Tang
  • Xiaolei Wang
  • Wayne Xin Zhao
  • Siyuan Lu
  • Yaliang Li
  • Ji-Rong Wen

Automatic prompt optimization is an important approach to improving the performance of large language models (LLMs). Recent research demonstrates the potential of using LLMs as prompt optimizers, which can generate improved task prompts via iterative refinement. In this paper, we propose a novel perspective on the design of LLM-based prompt optimizers, drawing an analogy with gradient-based model optimizers. To connect these two approaches, we identify two pivotal factors in model parameter learning: the update direction and the update method. By systematically analyzing a rich set of improvement strategies on these two aspects, we further develop a capable Gradient-inspired LLM-based Prompt Optimizer called GPO. At each step, it first retrieves relevant prompts from the optimization trajectory as the update direction. Then, it utilizes a generation-based refinement strategy to perform the update, while controlling the edit distance through a cosine-based decay strategy. Extensive experiments demonstrate the effectiveness and efficiency of GPO. In particular, GPO brings an additional improvement of up to 56.8% on Big-Bench Hard and 62.6% on MMLU compared with baseline methods.
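
The abstract does not specify the cosine-based decay; for orientation, a minimal sketch of one plausible cosine schedule that shrinks an edit-distance budget over optimization steps might look like this (the function and parameter names are illustrative, not from the paper):

```python
import math

def cosine_decay(step: int, total_steps: int, max_val: float, min_val: float = 0.0) -> float:
    """Cosine-annealed bound, e.g. a cap on the allowed prompt edit distance per step."""
    cos = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return min_val + (max_val - min_val) * cos

# The edit-distance budget shrinks smoothly from 20 to 0 over 10 steps.
for t in range(11):
    print(t, round(cosine_decay(t, 10, max_val=20.0), 2))
```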

AAAI 2023 · Conference Paper

Continuous Trajectory Generation Based on Two-Stage GAN

  • Wenjun Jiang
  • Wayne Xin Zhao
  • Jingyuan Wang
  • Jiawei Jiang

Simulating human mobility and generating large-scale trajectories are of great use in many real-world applications, such as urban planning, epidemic spreading analysis, and geographic privacy protection. Although many previous works have studied the problem of trajectory generation, the continuity of the generated trajectories has been neglected, which limits the usefulness of these methods in practical urban simulation scenarios. To solve this problem, we propose a novel two-stage generative adversarial framework, namely TS-TrajGen, to generate continuous trajectories on the road network, which efficiently integrates prior domain knowledge of human mobility with a model-free learning paradigm. Specifically, we build the generator under the human mobility hypothesis of the A* algorithm to learn human mobility behavior. For the discriminator, we combine the sequential reward with the mobility yaw reward to enhance the effectiveness of the generator. Finally, we propose a novel two-stage generation process to overcome the weaknesses of the existing stochastic generation process. Extensive experiments on two real-world datasets and two case studies demonstrate that our framework yields significant improvements over state-of-the-art methods.
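
The generator is built around the A* search hypothesis; the learned mobility components are not described in the abstract, but the generic A* routine such a generator would parameterize (with learned cost and heuristic functions) can be sketched as follows (all names illustrative):

```python
import heapq

def a_star(neighbors, cost, heuristic, start, goal):
    """Generic A* over a road network. `neighbors(n)` yields adjacent road
    segments, `cost(u, v)` is the (possibly learned) edge cost, and
    `heuristic(n)` estimates the remaining cost from n to `goal`."""
    frontier = [(heuristic(start), 0.0, start, [start])]
    best_g = {}
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in best_g and best_g[node] <= g:
            continue                       # already reached node more cheaply
        best_g[node] = g
        for nxt in neighbors(node):
            g2 = g + cost(node, nxt)
            heapq.heappush(frontier, (g2 + heuristic(nxt), g2, nxt, path + [nxt]))
    return None                            # goal unreachable
```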

IJCAI 2023 · Conference Paper

Diffusion Models for Non-autoregressive Text Generation: A Survey

  • Yifan Li
  • Kun Zhou
  • Wayne Xin Zhao
  • Ji-Rong Wen

Non-autoregressive (NAR) text generation has attracted much attention in the field of natural language processing: it greatly reduces inference latency, but at some cost in generation accuracy. Recently, diffusion models, a class of latent variable generative models, have been introduced into NAR text generation, showing improved text generation quality. In this survey, we review the recent progress in diffusion models for NAR text generation. As background, we first present the general definition of diffusion models and of text diffusion models, and then discuss their merits for NAR generation. As the core content, we further introduce the two mainstream diffusion models in existing work on text diffusion and review the key designs of the diffusion process. Moreover, we discuss the utilization of pre-trained language models (PLMs) for text diffusion models and introduce optimization techniques for text data. Finally, we discuss several promising directions and conclude this paper. Our survey aims to provide researchers with a systematic reference on text diffusion models for NAR generation. We also maintain a collection of text diffusion models at https://github.com/RUCAIBox/Awesome-Text-Diffusion-Models.
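
For background, embedding-space text diffusion models build on the standard Gaussian forward process q(x_t | x_0) = N(sqrt(abar_t) x_0, (1 - abar_t) I); a minimal sketch follows (the linear schedule and shapes are common defaults, not tied to any surveyed model):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)        # common linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)       # abar_t = prod_{s<=t} (1 - beta_s)

def forward_noise(x0, t):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) x_0, (1 - abar_t) I) in closed form."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps                        # eps is the denoiser's training target

x0 = rng.standard_normal((8, 16))         # e.g. 8 token embeddings of dimension 16
xt, eps = forward_noise(x0, t=500)
```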

AAAI 2023 · Conference Paper

PDFormer: Propagation Delay-Aware Dynamic Long-Range Transformer for Traffic Flow Prediction

  • Jiawei Jiang
  • Chengkai Han
  • Wayne Xin Zhao
  • Jingyuan Wang

As a core technology of Intelligent Transportation Systems, traffic flow prediction has a wide range of applications. The fundamental challenge in traffic flow prediction is to effectively model the complex spatial-temporal dependencies in traffic data. Spatial-temporal Graph Neural Network (GNN) models have emerged as one of the most promising methods to solve this problem. However, GNN-based models have three major limitations for traffic prediction: i) most methods model spatial dependencies in a static manner, which limits the ability to learn dynamic urban traffic patterns; ii) most methods only consider short-range spatial information and are unable to capture long-range spatial dependencies; iii) these methods ignore the fact that the propagation of traffic conditions between locations has a time delay in traffic systems. To this end, we propose a novel Propagation Delay-aware dynamic long-range transFormer, namely PDFormer, for accurate traffic flow prediction. Specifically, we design a spatial self-attention module to capture the dynamic spatial dependencies. Then, two graph masking matrices are introduced to highlight spatial dependencies from short- and long-range views. Moreover, a traffic delay-aware feature transformation module is proposed to empower PDFormer with the capability of explicitly modeling the time delay of spatial information propagation. Extensive experimental results on six real-world public traffic datasets show that our method can not only achieve state-of-the-art performance but also exhibit competitive computational efficiency. Moreover, we visualize the learned spatial-temporal attention map to make our model highly interpretable.
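
The short- and long-range graph masking described here amounts to restricting which node pairs may attend to each other; a minimal sketch of mask-restricted scaled dot-product attention (mask construction and names are illustrative, not the paper's exact module):

```python
import numpy as np

def masked_attention(Q, K, V, mask):
    """Scaled dot-product attention over N nodes, restricted by a binary
    mask: mask[i, j] = 1 keeps the attention link i -> j, 0 removes it."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    scores = np.where(mask > 0, scores, -1e9)          # drop disallowed pairs
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

# One head could use a short-range mask (k-hop road-network neighbors) and
# another a long-range mask (e.g. nodes with similar historical patterns).
```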

IJCAI 2022 · Conference Paper

A Survey of Vision-Language Pre-Trained Models

  • Yifan Du
  • Zikang Liu
  • Junyi Li
  • Wayne Xin Zhao

As the Transformer architecture has evolved, pre-trained models have advanced at a breakneck pace in recent years. They have come to dominate the mainstream techniques in natural language processing (NLP) and computer vision (CV). How to adapt pre-training to the field of Vision-and-Language (V-L) learning and improve downstream task performance has become a focus of multimodal learning. In this paper, we review the recent progress in Vision-Language Pre-Trained Models (VL-PTMs). As the core content, we first briefly introduce several ways to encode raw images and texts into single-modal embeddings before pre-training. Then, we dive into the mainstream architectures of VL-PTMs for modeling the interaction between text and image representations. We further present widely used pre-training tasks, and then introduce some common downstream tasks. We finally conclude this paper and present some promising research directions. Our survey aims to provide researchers with a synthesis of, and pointers to, related research.

IJCAI 2021 · Conference Paper

A Survey on Complex Knowledge Base Question Answering: Methods, Challenges and Solutions

  • Yunshi Lan
  • Gaole He
  • Jinhao Jiang
  • Jing Jiang
  • Wayne Xin Zhao
  • Ji-Rong Wen

Knowledge base question answering (KBQA) aims to answer a question over a knowledge base (KB). Recently, a large number of studies have focused on semantically or syntactically complicated questions. In this paper, we summarize in detail the typical challenges and solutions for complex KBQA. We begin by introducing the background of the KBQA task. Next, we present the two mainstream categories of methods for complex KBQA, namely semantic parsing-based (SP-based) methods and information retrieval-based (IR-based) methods. We then review the advanced methods comprehensively from the perspective of these two categories. Specifically, we explicate their solutions to the typical challenges. Finally, we conclude and discuss some promising directions for future research.

AAAI 2021 · Conference Paper

Dual Sparse Attention Network For Session-based Recommendation

  • Jiahao Yuan
  • Zihan Song
  • Mingyou Sun
  • Xiaoling Wang
  • Wayne Xin Zhao

Session-based recommendation predicts the next possible item for a user from an anonymous session; the challenge is that the user's behavioral preference can only be inferred from a limited interaction sequence. Recent advances have demonstrated the effectiveness of the attention mechanism in session-based recommendation. However, most of these attention-based models make two simplifying assumptions. One is to treat the last click as the query vector denoting the user's current preference, and the other is to assume that all items within the session contribute favorably to the final result, including unrelated items (i.e., spurious user behaviors). In this paper, we propose a novel Dual Sparse Attention Network for session-based recommendation, called DSAN, to address these shortcomings. In the proposed method, we learn a target item embedding to model the user's current preference and apply an adaptively sparse transformation function to eliminate the effect of unrelated items. Experimental results on two real public datasets show that the proposed method is superior to state-of-the-art session-based recommendation algorithms in all tests and also demonstrate that not all actions within a session are useful. To make our results reproducible, we have published our code at https://github.com/SamHaoYuan/DSANForAAAI2021.
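
The "adaptively sparse transformation function" plays the role filled by sparsemax- or entmax-style activations, which, unlike softmax, can assign exactly zero attention weight to items; sparsemax is shown below as one concrete instance (the paper's exact choice may differ):

```python
import numpy as np

def sparsemax(z):
    """Project scores z onto the probability simplex (Martins & Astudillo,
    2016); items below the data-dependent threshold tau get exactly zero."""
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = z_sorted + (1.0 - cumsum) / k > 0
    k_star = k[support][-1]                       # size of the support set
    tau = (cumsum[k_star - 1] - 1.0) / k_star
    return np.maximum(z - tau, 0.0)

print(sparsemax(np.array([1.5, 1.0, 0.2, -0.5])))  # -> [0.75 0.25 0.   0.  ]
```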

IJCAI 2021 · Conference Paper

Pretrained Language Model for Text Generation: A Survey

  • Junyi Li
  • Tianyi Tang
  • Wayne Xin Zhao
  • Ji-Rong Wen

Text generation has become one of the most important yet challenging tasks in natural language processing (NLP). The resurgence of deep learning has greatly advanced this field through neural generation models, especially the paradigm of pretrained language models (PLMs). In this paper, we present an overview of the major advances achieved in the topic of PLMs for text generation. As preliminaries, we present the general task definition and briefly describe the mainstream architectures of PLMs for text generation. As the core content, we discuss how to adapt existing PLMs to model different input data and satisfy special properties in the generated text. We further summarize several important fine-tuning strategies for text generation. Finally, we present several future directions and conclude this paper. Our survey aims to provide text generation researchers with a synthesis of, and pointers to, related research.

IJCAI 2017 · Conference Paper

A Correlated Topic Model Using Word Embeddings

  • Guangxu Xun
  • Yaliang Li
  • Wayne Xin Zhao
  • Jing Gao
  • Aidong Zhang

Conventional correlated topic models are able to capture the correlation structure among latent topics by replacing the Dirichlet prior with the logistic normal distribution. Word embeddings have been shown to capture semantic regularities in language, so the semantic relatedness and correlations between words can be calculated directly in the word embedding space, for example via cosine similarity. In this paper, we propose a novel correlated topic model using word embeddings. The proposed model enables us to exploit the additional word-level correlation information in word embeddings and to directly model topic correlation in the continuous word embedding space. In the model, words in documents are replaced with meaningful word embeddings, topics are modeled as multivariate Gaussian distributions over the word embeddings, and topic correlations are learned among the continuous Gaussian topics. A Gibbs sampling solution with data augmentation is given to perform inference. We evaluate our model on the 20 Newsgroups and Reuters-21578 datasets, both qualitatively and quantitatively. The experimental results show the effectiveness of our proposed model.
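
A toy illustration of the central modeling step, scoring a word embedding under Gaussian topics (dimensions and parameters are made up; the paper's data-augmented Gibbs sampler is not reproduced here):

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
dim, n_topics = 50, 3
topic_means = rng.normal(size=(n_topics, dim))   # toy Gaussian topic means
topic_cov = np.eye(dim)                          # shared spherical covariance

def topic_logliks(word_vec):
    """Log-density of one word embedding under each Gaussian topic; inside a
    Gibbs sampler this likelihood term scores candidate topic assignments."""
    return np.array([multivariate_normal.logpdf(word_vec, topic_means[k], topic_cov)
                     for k in range(n_topics)])

w = rng.normal(size=dim)
print(topic_logliks(w).argmax())                 # most likely topic for w
```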

IJCAI 2016 · Conference Paper

Bayesian Probabilistic Multi-Topic Matrix Factorization for Rating Prediction

  • Keqiang Wang
  • Wayne Xin Zhao
  • Hongwei Peng
  • Xiaoling Wang

Recently, Local Matrix Factorization (LMF) has been shown to be more effective than traditional matrix factorization for rating prediction. The core idea of LMF is to first partition the original matrix into several smaller submatrices and then exploit the local structures of the submatrices for better low-rank approximation. Various clustering-based methods with heuristic extensions have been proposed for LMF in the literature. To develop a more principled solution for LMF, this paper presents a Bayesian Probabilistic Multi-Topic Matrix Factorization model. We treat the set of items rated by a user as a document and employ latent topic models to cluster items into topics. Subsequently, each user has a distribution over the set of topics. We further set topic-specific latent vectors for both users and items. The final prediction is obtained by an ensemble of the results from the corresponding topic-specific latent vectors in each topic. Using a multi-topic latent representation, our model better reflects the complex characteristics of users and items in rating prediction and enhances model interpretability. Extensive experiments on large real-world datasets demonstrate the effectiveness of the proposed model.
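
The ensemble prediction the abstract describes combines topic-specific dot products weighted by the user's topic distribution; a minimal sketch with made-up shapes and names:

```python
import numpy as np

def predict_rating(theta_u, P_u, Q_i):
    """theta_u[k]: user's weight on topic k; P_u[k], Q_i[k]: topic-specific
    user and item latent vectors. Prediction is the topic-weighted ensemble."""
    per_topic = np.einsum('kd,kd->k', P_u, Q_i)   # one dot product per topic
    return float(theta_u @ per_topic)

K, d = 4, 8
rng = np.random.default_rng(1)
theta_u = rng.dirichlet(np.ones(K))               # user's topic distribution
P_u = rng.normal(size=(K, d))                     # topic-specific user vectors
Q_i = rng.normal(size=(K, d))                     # topic-specific item vectors
print(predict_rating(theta_u, P_u, Q_i))
```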

AAAI 2016 · Conference Paper

Tweet Timeline Generation with Determinantal Point Processes

  • Jin-ge Yao
  • Feifan Fan
  • Wayne Xin Zhao
  • Xiaojun Wan
  • Edward Chang
  • Jianguo Xiao

The task of tweet timeline generation (TTG) aims to select a small set of representative tweets that form a meaningful timeline and provide sufficient coverage for a given topical query. This paper presents an approach based on determinantal point processes (DPPs) that jointly models the topical relevance of each selected tweet and the overall selectional diversity. To better balance relevance and diversity, we introduce two novel strategies, namely spectral rescaling and a topical prior. Extensive experiments on the public TREC 2014 dataset demonstrate that our proposed DPP model, along with the two strategies, achieves fairly competitive results against state-of-the-art TTG systems.
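
The spectral rescaling and topical prior are specific to the paper, but the underlying selection machinery is standard: build a quality-diversity DPP kernel L = diag(q) S diag(q) from relevance scores q and a similarity matrix S, then pick a subset, e.g. by greedy MAP inference. A minimal sketch (names and toy data are illustrative):

```python
import numpy as np

def greedy_dpp_map(L, k):
    """Greedy MAP for a DPP with PSD kernel L: in each round, add the item
    that maximizes log det of the selected principal submatrix L_S."""
    selected = []
    for _ in range(k):
        best, best_logdet = None, -np.inf
        for i in range(L.shape[0]):
            if i in selected:
                continue
            S = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(S, S)])
            if sign > 0 and logdet > best_logdet:
                best, best_logdet = i, logdet
        if best is None:
            break
        selected.append(best)
    return selected

rng = np.random.default_rng(2)
q = rng.uniform(0.5, 1.5, size=20)                 # per-tweet relevance scores
X = rng.normal(size=(20, 5))                       # toy tweet feature vectors
norms = np.linalg.norm(X, axis=1)
S = (X @ X.T) / np.outer(norms, norms)             # cosine similarity (PSD)
L = np.diag(q) @ S @ np.diag(q)                    # quality-diversity kernel
print(greedy_dpp_map(L, k=5))
```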

TIST 2014 · Journal Article

Infer User Interests via Link Structure Regularization

  • Jinpeng Wang
  • Wayne Xin Zhao
  • Yulan He
  • Xiaoming Li

Learning user interests from online social networks helps to better understand user behaviors and provides useful guidance for designing user-centric applications. Apart from analyzing users' online content, it is also important to consider users' social connections in the social Web. Graph regularization methods have been widely used in various text mining tasks and can leverage the graph structure information extracted from data. Previous graph regularization methods operate under the cluster assumption: nearby nodes are more similar, and nodes on the same structure (typically referred to as a cluster or a manifold) are likely to be similar. We argue that learning user interests from complex, sparse, and dynamic social networks should instead be based on the link structure assumption, under which node similarities are evaluated from local link structures rather than from explicit links between two nodes. We propose a regularization framework based on the relation bipartite graph, which can be constructed from any type of relation. Using Twitter as our case study, we evaluate the proposed framework on social networks built from retweet relations. Both quantitative and qualitative experiments show that our proposed method outperforms several competitive baselines in learning user interests over a set of predefined topics. It also gives superior results compared to the baselines on retweet prediction and topical authority identification.
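
The framework's distinguishing choice is how the similarity weights are built (from local link structures on a relation bipartite graph rather than from explicit edges), but the quantity being minimized is the usual graph-regularization penalty; a minimal sketch, where W would come from the link-structure similarities (toy data, names illustrative):

```python
import numpy as np

def laplacian_penalty(F, W):
    """Graph regularizer sum_ij W[i, j] * ||F[i] - F[j]||^2, computed as
    2 * trace(F^T L F) with the graph Laplacian L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    return 2.0 * float(np.trace(F.T @ L @ F))

# F: per-user interest vectors over topics; W: symmetric node similarities.
rng = np.random.default_rng(3)
W = rng.uniform(size=(6, 6))
W = (W + W.T) / 2.0                     # symmetrize the similarity matrix
np.fill_diagonal(W, 0.0)                # no self-similarity terms
F = rng.normal(size=(6, 4))
print(laplacian_penalty(F, W))
```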