Author name cluster

Leilei Sun

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

16 papers

1 author row

AAAI Conference 2026 Conference Paper

Behavior Tokens Speak Louder: Disentangled Explainable Recommendation with Behavior Vocabulary

Xinshun Feng
Mingzhe Liu
Yi Qiao
Tongyu Zhu
Leilei Sun
Shuai Wang

Recent advances in explainable recommendation have explored the integration of language models to analyze natural language rationales for user–item interactions. Despite their potential, existing methods often rely on ID-based representations that obscure semantic meaning and impose structural constraints on language models, thereby limiting their applicability in open-ended scenarios. These challenges are intensified by the complex nature of real-world interactions, where diverse user intents are entangled and collaborative signals rarely align with linguistic semantics. To overcome these limitations, we propose BEAT, a unified and transferable framework that tokenizes user and item behaviors into discrete, interpretable sequences. We construct a behavior vocabulary via a vector-quantized autoencoding process that disentangles macro-level interests and micro-level intentions from graph-based representations. We then introduce multi-level semantic supervision to bridge the gap between behavioral signals and language space. A semantic alignment regularization mechanism is designed to embed behavior tokens directly into the input space of frozen language models. Experiments on three public datasets show that BEAT improves zero-shot recommendation performance while generating coherent and informative explanations. Further analysis demonstrates that our behavior tokens capture fine-grained semantics and offer a plug-and-play interface for integrating complex behavior patterns into large language models.
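
The tokenization step mentioned in the abstract above, mapping continuous behavior embeddings to discrete vocabulary entries, can be illustrated with a generic vector-quantization lookup. This is a minimal sketch, not BEAT's implementation: the function names, the toy codebook, and the squared Euclidean distance are assumptions chosen for illustration, and the learned autoencoder and the macro/micro disentanglement described in the abstract are omitted.

```python
def nearest_code(vector, codebook):
    """Return the index of the codebook entry closest to `vector`.

    Closeness is squared Euclidean distance (an assumption for this sketch).
    """
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sqdist(vector, codebook[i]))


def tokenize(embeddings, codebook):
    """Map each continuous embedding to a discrete token id."""
    return [nearest_code(e, codebook) for e in embeddings]


# Hypothetical 2-D codebook with three entries and three behavior embeddings.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
embeddings = [[0.1, -0.2], [0.9, 1.2], [4.0, 6.0]]
tokens = tokenize(embeddings, codebook)
```

In a trained vector-quantized autoencoder the codebook entries are learned jointly with the encoder; here they are fixed by hand so that the lookup itself is easy to follow.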

TIST Journal 2025 Journal Article

JiuTian·Chuanliu: A Large Spatiotemporal Model for General-purpose Dynamic Urban Sensing

Liangzhe Han
Leilei Sun
Tongyu Zhu
Tao Tao
Jibin Wang
Weifeng Lv

As a window for urban sensing, human mobility contains rich spatiotemporal information that reflects both residents’ behavior preferences and the functions of urban areas. The analysis of human mobility has attracted the attention of many researchers. However, existing methods often address specific tasks from a particular perspective, leading to insufficient modeling of human mobility and limited applicability of the learned knowledge in various downstream applications. To address these challenges, this paper proposes to push massive amounts of human mobility data into a spatiotemporal model, discover latent semantics behind mobility behavior, and support various urban sensing tasks. Specifically, a large-scale, wide-coverage human mobility dataset is collected through the ubiquitous base station system, and a framework named General-purpose and Dynamic Human Mobility Embedding (GDHME) for urban sensing is introduced. The framework follows the self-supervised learning idea and contains two major stages. In stage 1, GDHME treats people and regions as nodes within a dynamic graph, unifying human mobility data as people-region-time interactions. A continuous-time encoder dynamically computes evolving node representations, capturing dynamic states for both people and regions. Moreover, an autoregressive self-supervised task is specially designed to guide the learning of the general-purpose node embeddings. In stage 2, these representations are utilized to support various tasks. To evaluate the effectiveness of our GDHME framework, we further construct a multi-task urban sensing benchmark. Offline experiments demonstrate GDHME's ability to automatically learn valuable node features from vast amounts of data. Furthermore, our framework is used to deploy the JiuTian ChuanLiu Big Model, a system that was presented at the 2023 China Mobile Worldwide Partner Conference and is reported to be used for smart tourism, urban planning, and other applications.

IJCAI Conference 2025 Conference Paper

RePST: Language Model Empowered Spatio-Temporal Forecasting via Semantic-Oriented Reprogramming

Hao Wang
Jindong Han
Wei Fan
Leilei Sun
Hao Liu

Spatio-temporal forecasting is pivotal in numerous real-world applications, including transportation planning, energy management, and climate monitoring. In this work, we aim to harness the reasoning and generalization abilities of Pre-trained Language Models (PLMs) for more effective spatio-temporal forecasting, particularly in data-scarce scenarios. However, recent studies reveal that PLMs, which are primarily trained on textual data, often falter when tasked with modeling the intricate correlations in numerical time series, thereby limiting their effectiveness in comprehending spatio-temporal data. To bridge the gap, we propose RePST, a semantic-oriented PLM reprogramming framework tailored for spatio-temporal forecasting. Specifically, we first propose a semantic-oriented decomposer that adaptively disentangles spatially correlated time series into interpretable sub-components, which helps the PLM understand sophisticated spatio-temporal dynamics via a divide-and-conquer strategy. Moreover, we propose a selective discrete reprogramming scheme, which introduces an expanded spatio-temporal vocabulary space to project spatio-temporal series into discrete representations. This scheme minimizes the information loss during reprogramming and enriches the representations derived by PLMs. Extensive experiments on real-world datasets show that the proposed RePST outperforms twelve state-of-the-art baseline methods, particularly in data-scarce scenarios, highlighting the effectiveness and superior generalization capabilities of PLMs for spatio-temporal forecasting. Code and the appendix can be found at https://github.com/usail-hkust/REPST.

IJCAI Conference 2024 Conference Paper

An NCDE-based Framework for Universal Representation Learning of Time Series

Zihan Liu
Bowen Du
Junchen Ye
Xianqing Wen
Leilei Sun

Exploiting self-supervised learning (SSL) to extract the universal representations of time series could not only capture the natural properties of time series but also offer huge help to the downstream tasks. Nevertheless, existing time series representation learning (TSRL) methods face challenges in attaining universality. Indeed, existing methods relying solely on one SSL strategy (either contrastive learning (CL) or generative) often fall short in capturing rich semantic information for various downstream tasks. Moreover, time series exhibit diverse distributions and inherent characteristics, particularly with the common occurrence of missing values, posing a notable challenge for existing backbones in effectively handling such diverse time series data. To bridge these gaps, we propose CTRL, a framework for universal TSRL. For the first time, we employ Neural Controlled Differential Equation (NCDE) as the backbone for TSRL, which captures the continuous processes and exhibits robustness to missing data. Additionally, a dual-task SSL strategy, integrating both reconstruction and contrasting tasks, is proposed to enrich the semantic information of the learned representations. Furthermore, novel hard negative construction and false negative elimination mechanisms are proposed to improve sampling efficiency and reduce sampling bias in CL. Finally, extensive experiments demonstrate the superiority of CTRL in forecasting, classification, and imputation tasks, particularly its outstanding robustness to missing data.

NeurIPS Conference 2024 Conference Paper

Improving Temporal Link Prediction via Temporal Walk Matrix Projection

Xiaodong Lu
Leilei Sun
Tongyu Zhu
Weifeng Lv

Temporal link prediction, aiming at predicting future interactions among entities based on historical interactions, is crucial for a series of real-world applications. Although previous methods have demonstrated the importance of relative encodings for effective temporal link prediction, computational efficiency remains a major concern in constructing these encodings. Moreover, existing relative encodings are usually constructed based on structural connectivity, where temporal information is seldom considered. To address the aforementioned issues, we first analyze existing relative encodings and unify them as a function of temporal walk matrices. This unification establishes a connection between relative encodings and temporal walk matrices, providing a more principled way for analyzing and designing relative encodings. Based on this analysis, we propose a new temporal graph neural network called TPNet, which introduces a temporal walk matrix that incorporates the time decay effect to simultaneously consider both temporal and structural information. Moreover, TPNet designs a random feature propagation mechanism with theoretical guarantees to implicitly maintain the temporal walk matrices, which improves the computation and storage efficiency. Experimental results on 13 benchmark datasets verify the effectiveness and efficiency of TPNet, where TPNet outperforms other baselines on most datasets and achieves a maximum speedup of $33.3\times$ compared to the SOTA baseline.

AAAI Conference 2023 Conference Paper

Conditional Diffusion Based on Discrete Graph Structures for Molecular Graph Generation

Han Huang
Leilei Sun
Bowen Du
Weifeng Lv

Learning the underlying distribution of molecular graphs and generating high-fidelity samples is a fundamental research problem in drug discovery and material science. However, accurately modeling distribution and rapidly generating novel molecular graphs remain crucial and challenging goals. To accomplish these goals, we propose a novel Conditional Diffusion model based on discrete Graph Structures (CDGS) for molecular graph generation. Specifically, we construct a forward graph diffusion process on both graph structures and inherent features through stochastic differential equations (SDE) and derive discrete graph structures as the condition for reverse generative processes. We present a specialized hybrid graph noise prediction model that extracts the global context and the local node-edge dependency from intermediate graph states. We further utilize ordinary differential equation (ODE) solvers for efficient graph sampling, based on the semi-linear structure of the probability flow ODE. We also combine the solvers with gradient guidance from the molecule property predictor for similarity-constrained molecule optimization. Experiments on diverse datasets validate the effectiveness of our framework. Particularly, the proposed method still generates high-quality molecular graphs in a limited number of steps.

IJCAI Conference 2023 Conference Paper

Continuous-Time Graph Learning for Cascade Popularity Prediction

Xiaodong Lu
Shuo Ji
Le Yu
Leilei Sun
Bowen Du
Tongyu Zhu

Information propagation on social networks could be modeled as cascades, and many efforts have been made to predict the future popularity of cascades. However, most of the existing research treats a cascade as an individual sequence. Actually, the cascades might be correlated with each other due to the shared users or similar topics. Moreover, the preferences of users and semantics of a cascade are usually continuously evolving over time. In this paper, we propose a continuous-time graph learning method for cascade popularity prediction, which first connects different cascades via a universal sequence of user-cascade and user-user interactions and then chronologically learns on the sequence by maintaining the dynamic states of users and cascades. Specifically, for each interaction, we present an evolution learning module to continuously update the dynamic states of the related users and cascade based on their currently encoded messages and previous dynamic states. We also devise a cascade representation learning component to embed the temporal information and structural information carried by the cascade. Experiments on real-world datasets demonstrate the superiority and rationality of our approach.

AAAI Conference 2023 Conference Paper

Generic and Dynamic Graph Representation Learning for Crowd Flow Modeling

Liangzhe Han
Ruixing Zhang
Leilei Sun
Bowen Du
Yanjie Fu
Tongyu Zhu

Many deep spatio-temporal learning methods have been proposed for crowd flow modeling in recent years. However, most of them focus on designing a spatial and temporal convolution mechanism to aggregate information from nearby nodes and historical observations for a pre-defined prediction task. Different from the existing research, this paper aims to provide a generic and dynamic representation learning method for crowd flow modeling. The main idea of our method is to maintain a continuous-time representation for each node, and update the representations of all nodes continuously according to the streaming observed data. Along this line, a particular encoder-decoder architecture is proposed, where the encoder converts the newly happened transactions into a timestamped message, and then the representations of related nodes are updated according to the generated message. The role of the decoder is to guide the representation learning process by reconstructing the observed transactions based on the most recent node representations. Moreover, a number of virtual nodes are added to discover macro-level spatial patterns and also share the representations among spatially-interacted stations. Experiments have been conducted on two real-world datasets for four popular prediction tasks in crowd flow modeling. The result demonstrates that our method could achieve better prediction performance for all the tasks than baseline methods.

AAAI Conference 2023 Conference Paper

Human-Instructed Deep Hierarchical Generative Learning for Automated Urban Planning

Dongjie Wang
Lingfei Wu
Denghui Zhang
Jingbo Zhou
Leilei Sun
Yanjie Fu

The essential task of urban planning is to generate the optimal land-use configuration of a target area. However, traditional urban planning is time-consuming and labor-intensive. Deep generative learning gives us hope that we can automate this planning process and come up with the ideal urban plans. While remarkable achievements have been obtained, they have exhibited limitations in lacking awareness of: 1) the hierarchical dependencies between functional zones and spatial grids; 2) the peer dependencies among functional zones; and 3) human regulations to ensure the usability of generated configurations. To address these limitations, we develop a novel human-instructed deep hierarchical generative model. We rethink the urban planning generative task from a unique functionality perspective, where we summarize planning requirements into different functionality projections for better urban plan generation. To this end, we develop a three-stage generation process from a target area to zones to grids. The first stage is to label the grids of a target area with latent functionalities to discover functional zones. The second stage is to perceive the planning requirements to form urban functionality projections. We propose a novel module: functionalizer to project the embedding of human instructions and geospatial contexts to the zone-level plan to obtain such projections. Each projection includes the information of land-use portfolios and the structural dependencies across spatial grids in terms of a specific urban function. The third stage is to leverage multi-attentions to model the zone-zone peer dependencies of the functionality projections to generate grid-level land-use configurations. Finally, we present extensive experiments to demonstrate the effectiveness of our framework.

AAAI Conference 2023 Conference Paper

Predicting Temporal Sets with Simplified Fully Connected Networks

Le Yu
Zihang Liu
Tongyu Zhu
Leilei Sun
Bowen Du
Weifeng Lv

Given a sequence of sets, where each set contains an arbitrary number of elements, temporal sets prediction aims to predict which elements will appear in the subsequent set. Existing methods for temporal sets prediction are developed on sophisticated components (e.g., recurrent neural networks, attention or gating mechanisms, and graph neural networks), which inevitably increase the model complexity due to more trainable parameters and higher computational costs. Moreover, the involved nonlinear activation may contribute little or even degrade the performance. In this paper, we present a succinct architecture that is solely built on the Simplified Fully Connected Networks (SFCNs) for temporal sets prediction to bring both effectiveness and efficiency together. In particular, given a user's sequence of sets, we employ SFCNs to derive representations of the user by learning inter-set temporal dependencies, intra-set element relationships, and intra-embedding channel correlations. Two families of general functions are introduced to preserve the permutation-invariant property of each set and the permutation-equivariant property of elements in each set. Moreover, we design a user representations adaptive fusing module to aggregate user representations according to each element for improving the prediction performance. Experiments on four benchmarks show the superiority of our approach over the state-of-the-art under both transductive and inductive settings. We also theoretically and empirically demonstrate that our model has lower space and time complexity than baselines. Codes and datasets are available at https://github.com/yule-BUAA/SFCNTSP.

NeurIPS Conference 2023 Conference Paper

Towards Better Dynamic Graph Learning: New Architecture and Unified Library

Le Yu
Leilei Sun
Bowen Du
Weifeng Lv

We propose DyGFormer, a new Transformer-based architecture for dynamic graph learning. DyGFormer is conceptually simple and only needs to learn from nodes' historical first-hop interactions by: (1) a neighbor co-occurrence encoding scheme that explores the correlations of the source node and destination node based on their historical sequences; (2) a patching technique that divides each sequence into multiple patches and feeds them to Transformer, allowing the model to effectively and efficiently benefit from longer histories. We also introduce DyGLib, a unified library with standard training pipelines, extensible coding interfaces, and comprehensive evaluating protocols to promote reproducible, scalable, and credible dynamic graph learning research. By performing exhaustive experiments on thirteen datasets for dynamic link prediction and dynamic node classification tasks, we find that DyGFormer achieves state-of-the-art performance on most of the datasets, demonstrating its effectiveness in capturing nodes' correlations and long-term temporal dependencies. Moreover, some results of baselines are inconsistent with previous reports, which may be caused by their diverse but less rigorous implementations, showing the importance of DyGLib. All the used resources are publicly available at https://github.com/yule-BUAA/DyGLib.
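
The two ideas named in the abstract above, neighbor co-occurrence counting and patching, can be sketched in plain Python. This is a hedged illustration rather than the DyGLib code: the function names are hypothetical, the counts stand in for the raw features that a learned encoder would then transform, and letting the final patch be shorter than the rest is an assumption made to keep the example small.

```python
from collections import Counter


def cooccurrence_counts(src_neighbors, dst_neighbors):
    """For every neighbor in each historical sequence, count how often it
    appears in the source's sequence and in the destination's sequence."""
    src_counts = Counter(src_neighbors)
    dst_counts = Counter(dst_neighbors)
    src_feats = [(src_counts[n], dst_counts[n]) for n in src_neighbors]
    dst_feats = [(src_counts[n], dst_counts[n]) for n in dst_neighbors]
    return src_feats, dst_feats


def to_patches(sequence, patch_size):
    """Split a sequence into consecutive, non-overlapping patches.

    The last patch may be shorter (no padding is applied in this sketch).
    """
    return [sequence[i:i + patch_size] for i in range(0, len(sequence), patch_size)]


# Hypothetical first-hop interaction histories of a source and a destination node.
src_feats, dst_feats = cooccurrence_counts(["a", "b", "a"], ["b", "c"])
patches = to_patches([1, 2, 3, 4, 5], patch_size=2)
```

Grouping a history of length L into patches of size P reduces the number of positions the Transformer attends over from L to roughly L / P, which is why patching makes longer histories cheaper to process.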

IJCAI Conference 2022 Conference Paper

Dynamic Graph Learning Based on Hierarchical Memory for Origin-Destination Demand Prediction

Ruixing Zhang
Liangzhe Han
Boyi Liu
Jiayuan Zeng
Leilei Sun

Recent years have witnessed a rapid growth of applying deep spatiotemporal methods in traffic forecasting. However, the prediction of origin-destination (OD) demands is still a challenging problem since the number of OD pairs is usually quadratic to the number of stations. In this case, most of the existing spatiotemporal methods fail to handle spatial relations on such a large scale. To address this problem, this paper provides a dynamic graph representation learning framework for OD demands prediction. In particular, a hierarchical memory updater is first proposed to maintain a time-aware representation for each node, and the representations are updated according to the most recently observed OD trips in continuous-time and multiple discrete-time ways. Second, a spatiotemporal propagation mechanism is provided to aggregate representations of neighbor nodes along a random spatiotemporal route which treats origin and destination as two different semantic entities. Last, an objective function is designed to derive the future OD demands according to the most recent node representations, and also to tackle the data sparsity problem in OD prediction. Extensive experiments have been conducted on two real-world datasets, and the experimental results demonstrate the superiority of the proposed method. The code and data are available at https://github.com/Rising0321/HMOD.

TIST Journal 2022 Journal Article

Graph Sequence Neural Network with an Attention Mechanism for Traffic Speed Prediction

Zhilong Lu
Weifeng Lv
Zhipu Xie
Bowen Du
Guixi Xiong
Leilei Sun
Haiquan Wang

Recent years have witnessed the emerging success of Graph Neural Networks (GNNs) for modeling graphical data. A GNN can model the spatial dependencies of nodes in a graph based on message passing through node aggregation. However, in many application scenarios, these spatial dependencies can change over time, and a basic GNN model cannot capture these changes. In this article, we propose a Graph Sequence neural network with an Attention mechanism (GSeqAtt) for processing graph sequences. More specifically, two attention mechanisms are combined: a horizontal mechanism and a vertical mechanism. GTransformer, which is a horizontal attention mechanism for handling time series, is used to capture the correlations between graphs in the input time sequence. The vertical attention mechanism, a Graph Network (GN) block structure with an attention mechanism (GNAtt), acts within the graph structure in each frame of the time series. Experiments show that our proposed model is able to handle information propagation for graph sequences accurately and efficiently. Moreover, results on real-world data from three road intersections show that our GSeqAtt outperforms state-of-the-art baselines on the traffic speed prediction task.

AAAI Conference 2021 Conference Paper

Coupled Layer-wise Graph Convolution for Transportation Demand Prediction

Junchen Ye
Leilei Sun
Bowen Du
Yanjie Fu
Hui Xiong

Graph Convolutional Network (GCN) has been widely applied in transportation demand prediction due to its excellent ability to capture non-Euclidean spatial dependence among station-level or regional transportation demands. However, in most of the existing research, the graph convolution was implemented on a heuristically generated adjacency matrix, which could neither reflect the real spatial relationships of stations accurately, nor capture the multi-level spatial dependence of demands adaptively. To cope with the above problems, this paper provides a novel graph convolutional network for transportation demand prediction. Firstly, a novel graph convolution architecture is proposed, which has different adjacency matrices in different layers and all the adjacency matrices are self-learned during the training process. Secondly, a layer-wise coupling mechanism is provided, which associates the upper-level adjacency matrix with the lower-level one. It also reduces the scale of parameters in our model. Lastly, a unitary network is constructed to give the final prediction result by integrating the hidden spatial states with gated recurrent unit, which could capture the multi-level spatial dependence and temporal dynamics simultaneously. Experiments have been conducted on two real-world datasets, NYC Citi Bike and NYC Taxi, and the results demonstrate the superiority of our model over the state-of-the-art ones.

AAAI Conference 2021 Conference Paper

LightXML: Transformer with Dynamic Negative Sampling for High-Performance Extreme Multi-label Text Classification

Ting Jiang
Deqing Wang
Leilei Sun
Huayi Yang
Zhengyang Zhao
Fuzhen Zhuang

Extreme Multi-label text Classification (XMC) is a task of finding the most relevant labels from a large label set. Nowadays deep learning-based methods have shown significant success in XMC. However, the existing methods (e.g., AttentionXML and X-Transformer) still suffer from 1) combining several models to train and predict for one dataset, and 2) sampling negative labels statically during the training of the label ranking model, which reduces both the efficiency and accuracy of the model. To address the above problems, we propose LightXML, which adopts end-to-end training and dynamic negative label sampling. In LightXML, we use generative cooperative networks to recall and rank labels, in which the label recalling part generates negative and positive labels, and the label ranking part distinguishes positive labels from these labels. Through these networks, negative labels are sampled dynamically during the training of the label ranking part by feeding it the same text representation. Extensive experiments show that LightXML outperforms state-of-the-art methods on five extreme multi-label datasets with much smaller model size and lower computational complexity. In particular, on the Amazon dataset with 670K labels, LightXML can reduce the model size by up to 72% compared to AttentionXML. Our code is available at http://github.com/kongds/LightXML.

AIIM Journal 2020 Journal Article

A fusion framework to extract typical treatment patterns from electronic medical records

Jingfeng Chen
Leilei Sun
Chonghui Guo
Yanming Xie
