Author name cluster

Huaiyu Wan

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

25 papers

2 author rows

TIST Journal 2026 Journal Article

DRL4AOI: A DRL Framework for Semantic-aware AOI Segmentation in Location-based Services

Youfang Lin
Jinji Fu
Haomin Wen
Jiyuan Wang
Zhenjie Wei
Yuting Qiang
Xiaowei Mao
Lixia Wu

In Location-Based Services (LBS), such as food delivery, a fundamental task is segmenting Areas of Interest (AOIs), aiming at partitioning the urban geographical spaces into non-overlapping regions. Traditional AOI segmentation algorithms primarily rely on road networks to partition urban areas. While promising in modeling the geo-semantics, road network-based models overlooked the service-semantic goals (e.g., workload equality) in LBS service. In this article, we point out that the AOI segmentation problem can be naturally formulated as a Markov Decision Process (MDP), which gradually chooses a nearby AOI for each grid in the current AOI’s border. Based on the MDP, we present the first attempt to generalize Deep Reinforcement Learning (DRL) for AOI segmentation, leading to a novel DRL-based framework called DRL4AOI. The DRL4AOI framework introduces different service-semantic goals in a flexible way by treating them as rewards that guide the AOI generation. To evaluate the effectiveness of DRL4AOI, we develop and release an AOI segmentation system. We also present a representative implementation of DRL4AOI—TrajRL4AOI—for AOI segmentation in the logistics service. It introduces a Double Deep Q-learning Network (DDQN) to gradually optimize the AOI generation for two specific semantic goals: (i) trajectory modularity, i.e., maximize tightness of the trajectory connections within an AOI and the sparsity of connections between AOIs, (ii) matchness with the road network, i.e., maximizing the matchness between AOIs and the road network. Quantitative and qualitative experiments conducted on synthetic and real-world data demonstrate the effectiveness and superiority of our method. The code and system is publicly available at https://github.com/Kogler7/AoiOpt.

Details DOI

AAAI Conference 2026 Conference Paper

Spatial-Temporal Feedback Diffusion Guidance for Controlled Traffic Imputation

Xiaowei Mao
Huihu Ding
Yan Lin
Tingrui Wu
Shengnan Guo
Dazhuo Qiu
Feiling Fang
Jilin Hu

Imputing missing values in spatial-temporal traffic data is essential for intelligent transportation systems. Among advanced imputation methods, score-based diffusion models have demonstrated competitive performance. These models generate data by reversing a noising process, using observed values as conditional guidance. However, existing diffusion models typically apply a uniform guidance scale across both spatial and temporal dimensions, which is inadequate for nodes with high missing data rates. Sparse observations provide insufficient conditional guidance, causing the generative process to drift toward the learned prior distribution rather than closely following the conditional observations, resulting in suboptimal imputation performance. To address this, we propose FENCE, a spatial-temporal feedback diffusion guidance method designed to adaptively control guidance scales during imputation. First, FENCE introduces a dynamic feedback mechanism that adjusts the guidance scale based on the posterior likelihood approximations. The guidance scale is increased when generated values diverge from observations and reduced when alignment improves, preventing overcorrection. Second, because alignment to observations varies across nodes and denoising steps, a global guidance scale for all nodes is suboptimal. FENCE computes guidance scales at the cluster level by grouping nodes based on their attention scores, leveraging spatial-temporal correlations to provide more accurate guidance. Experimental results on real-world traffic datasets show that FENCE significantly enhances imputation accuracy.

PDF Details DOI

AAAI Conference 2026 Conference Paper

Think How Your Teammates Think: Active Inference Can Benefit Decentralized Execution

Hao Wu
Shoucheng Song
Chang Yao
Sheng Han
Huaiyu Wan
Youfang Lin
Kai Lv

In multi-agent systems, explicit cognition of teammates' decision logic serves as a critical factor in facilitating coordination. Communication (i.e., "Tell") can assist in the cognitive development process by information dissemination, yet it is inevitably subject to real-world constraints such as noise, latency, and attacks. Therefore, building the understanding of teammates' decisions without communication remains challenging. To address this, we propose a novel non-communication MARL framework that realizes the construction of cognition through local observation-based modeling (i.e., "Think"). Our framework enables agents to model teammates' active inference process. At first, the proposed method produces three teammate portraits: perception-belief-action. Specifically, we model the teammate's decision process as follows: 1) Perception: observing environments; 2) Belief: forming beliefs; 3) Action: making decisions. Then, we selectively integrate the belief portrait into the decision process based on the accuracy and relevance of the perception portrait. This enables the selection of cooperative teammates and facilitates effective collaboration. Extensive experiments on the SMAC, SMACv2, MPE, and GRF benchmarks demonstrate the superior performance of our method.

PDF Details DOI

IJCAI Conference 2025 Conference Paper

Balancing Imbalance: Data-Scarce Urban Flow Prediction via Spatio-Temporal Balanced Transfer Learning

Xinyan Hao
Huaiyu Wan
Shengnan Guo
Youfang Lin

Advanced deep spatio-temporal networks have become the mainstream for traffic prediction, but the widespread adoption of these models is impeded by the prevalent scarcity of available data. Despite cross-city transfer learning emerging as a common strategy to address this issue, it overlooks the inherent distribution imbalances within each city, which could potentially hinder the generalization capabilities of pre-trained models. To overcome this limitation, we propose a Spatio-Temporal Balanced Transfer Learning (STBaT) framework to enhance existing spatio-temporal prediction networks, ensuring both universality and precision in predictions for new urban environments. A Regional Imbalance Acquisition Module is designed to model the regional imbalances of source cities. Besides, to promote generalizable knowledge acquisition, a Spatio-Temporal Balanced Learning Module is devised to balance the predictive learning process. Extensive experiments on real-world datasets validate the efficacy of our proposed approach compared with several state-of-the-art methods.

PDF Details DOI

AAAI Conference 2025 Conference Paper

CognTKE: A Cognitive Temporal Knowledge Extrapolation Framework

Wei Chen
Yuting Wu
Shuhan Wu
Zhiyu Zhang
Mengqi Liao
Youfang Lin
Huaiyu Wan

Reasoning future unknowable facts on temporal knowledge graphs (TKGs) is a challenging task, holding significant academic and practical values for various fields. Existing studies exploring explainable reasoning concentrate on modeling comprehensible temporal paths relevant to the query. Yet, these path-based methods primarily focus on local temporal paths appearing in recent times, failing to capture the complex temporal paths in TKG and resulting in the loss of longer historical relations related to the query. Motivated by the Dual Process Theory in cognitive science, we propose a Cognitive Temporal Knowledge Extrapolation framework (CognTKE), which introduces a novel temporal cognitive relation directed graph (TCR-Digraph) and performs interpretable global shallow reasoning and local deep reasoning over the TCR-Digraph. Specifically, the proposed TCR-Digraph is constituted by retrieving significant local and global historical temporal relation paths associated with the query. In addition, CognTKE presents the global shallow reasoner and the local deep reasoner to perform global one-hop temporal relation reasoning (System 1) and local complex multi-hop path reasoning (System 2) over the TCR-Digraph, respectively. The experimental results on four benchmark datasets demonstrate that CognTKE achieves significant improvement in accuracy compared to the state-of-the-art baselines and delivers excellent zero-shot reasoning ability.

PDF Details DOI

AAAI Conference 2025 Conference Paper

DutyTTE: Deciphering Uncertainty in Origin-Destination Travel Time Estimation

Xiaowei Mao
Yan Lin
Shengnan Guo
Yubin Chen
Xingyu Xian
Haomin Wen
Qisen Xu
Youfang Lin

Uncertainty quantification in travel time estimation (TTE) aims to estimate the confidence interval for travel time, given the origin (O), destination (D), and departure time (T). Accurately quantifying this uncertainty requires generating the most likely path and assessing travel time uncertainty along the path. This involves two main challenges: 1) Predicting a path that aligns with the ground truth, and 2) modeling the impact of travel time in each segment on overall uncertainty under varying conditions. We propose DutyTTE to address these challenges. For the first challenge, we introduce a deep reinforcement learning method to improve alignment between the predicted path and the ground truth, providing more accurate travel time information from road segments to improve TTE. For the second challenge, we propose a mixture of experts guided uncertainty quantification mechanism to better capture travel time uncertainty for each segment under varying contexts. Extensive experiments on two real-world datasets demonstrate the superiority of our proposed method.

PDF Details DOI

ICLR Conference 2025 Conference Paper

HMoRA: Making LLMs More Effective with Hierarchical Mixture of LoRA Experts

Mengqi Liao
Wei Chen 0015
Junfeng Shen
Shengnan Guo 0001
Huaiyu Wan

Recent studies have combined Mixture of Experts (MoE) and Parameter-Efficient Fine-tuning (PEFT) to fine-tune large language models (LLMs), holding excellent performance in multi-task scenarios while remaining resource-efficient. However, existing MoE approaches still exhibit the following limitations: (1) Current methods fail to consider that different LLM layers capture features at varying levels of granularity, leading to suboptimal performance. (2) Task-level routing methods lack generalizability to unseen tasks. (3) The uncertainty introduced by load imbalance loss undermines the effective specialization of the experts. To address these challenges, we propose HMoRA, a Hierarchical fine-tuning method that combines MoE and LoRA, employing hybrid routing that integrates token-level and task-level routing in a hierarchical manner. This hierarchical hybrid routing allows the model to more efficiently capture both fine-grained token information and broader task contexts. To improve the certainty of expert selection, a novel routing auxiliary loss is introduced. This auxiliary function also enhances the task router's ability to differentiate tasks and its generalization to unseen tasks. Additionally, several optional lightweight designs have been proposed to significantly reduce both the number of trainable parameters and computational costs. Experimental results demonstrate that HMoRA outperforms full fine-tuning across multiple NLP benchmarks, while fine-tuning only 3.9\% of the parameters. The code is available on: https://github.com/LiaoMengqi/HMoRA.

Details

NeurIPS Conference 2025 Conference Paper

PLMTrajRec: A Scalable and Generalizable Trajectory Recovery Method with Pre-trained Language Models

Tonglong Wei
Yan Lin
Youfang Lin
Shengnan Guo
Jilin Hu
Haitao Yuan
Gao Cong
Huaiyu Wan

Spatiotemporal trajectory data is crucial for various traffic-related applications. However, issues such as device malfunctions and network instability often result in sparse trajectories that lose detailed movement information compared to their dense counterparts. Recovering missing points in sparse trajectories is thus essential. Despite recent progress, three challenges remain. First, the lack of large-scale dense trajectory datasets hinders the training of a trajectory recovery model. Second, the varying spatiotemporal correlations in sparse trajectories make it hard to generalize across different sampling intervals. Third, extracting road conditions for missing points is non-trivial. To address these challenges, we propose $\textit{PLMTrajRec}$, a novel trajectory recovery model. It leverages the scalability of a pre-trained language model (PLM) and can effectively recover trajectories by fine-tuning with small-scale dense trajectory datasets. To handle different sampling intervals in sparse trajectories, we first convert sampling intervals and movement features into prompts for the PLM to understand. We then introduce a trajectory encoder to unify trajectories of varying intervals into a single interval. To extract road conditions for missing points, we propose an area flow-guided implicit trajectory prompt that represents traffic conditions in each region, and a road condition passing mechanism that infers the road conditions of missing points using the observed ones. Experiments on four public trajectory datasets with three sampling intervals demonstrate the effectiveness, scalability, and generalization ability of PLMTrajRec. Code is available at https: //github. com/wtl52656/PLMTrajRec.

PDF Details

AAAI Conference 2025 Conference Paper

Spatial-Temporal Knowledge Distillation for Takeaway Recommendation

Shuyuan Zhao
Wei Chen
Boyan Shi
Liyong Zhou
Shuohao Lin
Huaiyu Wan

The takeaway recommendation system aims to recommend users' future takeaway purchases based on their historical purchase behaviors, thereby improving user satisfaction and boosting merchant sales. Existing methods focus on incorporating auxiliary information or leveraging knowledge graphs to alleviate the sparsity issue of user purchase sequences. However, two main challenges limit the performance of these approaches: (1) capturing dynamic user preferences on complex geospatial information and (2) efficiently integrating spatial-temporal knowledge from both graphs and sequence data with low computational costs. In this paper, we propose a novel spatial-temporal knowledge distillation model for takeaway recommendation (STKDRec) based on the two-stage training process. Specifically, during the first pre-training stage, a spatial-temporal knowledge graph (STKG) encoder is trained to extract high-order spatial-temporal dependencies and collaborative associations from the STKG. During the second spatial-temporal knowledge distillation (STKD) stage, a spatial-temporal Transformer (ST-Transformer) is employed to comprehensively model dynamic user preferences on various types of fine-grained geospatial information from a sequential perspective. Furthermore, the STKD strategy is introduced to transfer graph-based spatial-temporal knowledge to the ST-Transformer, facilitating the adaptive fusion of rich knowledge derived from both the STKG and sequence data while reducing computational overhead. Extensive experiments on three real-world datasets show that STKDRec significantly outperforms the state-of-the-art baselines.

PDF Details DOI

AAAI Conference 2025 Conference Paper

STD-PLM: Understanding Both Spatial and Temporal Properties of Spatial-Temporal Data with PLM

Yiheng Huang
Xiaowei Mao
Shengnan Guo
Yubin Chen
Junfeng Shen
Tiankuo Li
Youfang Lin
Huaiyu Wan

Spatial-temporal forecasting and imputation are important for real-world intelligent systems. Most existing methods are tailored for individual forecasting or imputation tasks but are not designed for both. Additionally, they are less effective for zero-shot and few-shot learning. While pre-trained language model (PLM) have exhibited strong pattern recognition and reasoning abilities across various tasks, including few-shot and zero-shot learning, their applications in spatial-temporal data understanding has been constrained by insufficient modeling of complex correlations such as the temporal correlations, spatial connectivity, non-pairwise and high-order spatial-temporal correlations within data. In this paper, we propose STD-PLM for understanding both spatial and temporal properties of Spatial-Temporal Data with PLM, which is capable of implementing both spatial-temporal forecasting and imputation tasks. STD-PLM understands spatial-temporal correlations via explicitly designed spatial and temporal tokenizers. Topology-aware node embeddings are designed for PLM to comprehend and exploit the topology structure of data in inductive manner. Furthermore, to mitigate the efficiency issues introduced by the PLM, we design a sandglass attention module(SGA) combined with a specific constrained loss function, which significantly improves the model's efficiency while ensuring performance. Extensive experiments demonstrate that STD-PLM exhibits competitive performance and generalization capabilities across the forecasting and imputation tasks on various datasets. Moreover, STD-PLM achieves promising results on both few-shot and zero-shot tasks.

PDF Details DOI

IJCAI Conference 2025 Conference Paper

TrajCogn: Leveraging LLMs for Cognizing Movement Patterns and Travel Purposes from Trajectories

Zeyu Zhou
Yan Lin
Haomin Wen
Shengnan Guo
Jilin Hu
Youfang Lin
Huaiyu Wan

Spatio-temporal trajectories are crucial for data mining tasks, requiring versatile learning methods that can accurately extract movement patterns and travel purposes. While large language models (LLMs) have shown remarkable versatility through training on extensive datasets, and trajectories share similarities with natural language, standard LLMs cannot directly handle spatio-temporal features or extract trajectory-specific information. We propose TrajCogn, a model that effectively adapts LLMs for trajectory learning. TrajCogn incorporates a novel trajectory semantic embedder to process spatio-temporal features and extract movement patterns and travel purposes, along with a trajectory prompt that integrates this information into LLMs for various downstream tasks. Experiments on three real-world datasets and four representative tasks demonstrate TrajCogn's effectiveness.

PDF Details DOI

NeurIPS Conference 2025 Conference Paper

TrajMamba: An Efficient and Semantic-rich Vehicle Trajectory Pre-training Model

Yichen Liu
Yan Lin
Shengnan Guo
Zeyu Zhou
Youfang Lin
Huaiyu Wan

Vehicle GPS trajectories record how vehicles move over time, storing valuable travel semantics, including movement patterns and travel purposes. Learning travel semantics effectively and efficiently is crucial for real-world applications of trajectory data, which is hindered by two major challenges. First, travel purposes are tied to the functions of the roads and points-of-interest (POIs) involved in a trip. Such information is encoded in textual addresses and descriptions and introduces heavy computational burden to modeling. Second, real-world trajectories often contain redundant points, which harm both computational efficiency and trajectory embedding quality. To address these challenges, we propose TrajMamba, a novel approach for efficient and semantically rich vehicle trajectory learning. TrajMamba introduces a Traj-Mamba Encoder that captures movement patterns by jointly modeling both GPS and road perspectives of trajectories, enabling robust representations of continuous travel behaviors. It also incorporates a Travel Purpose-aware Pre-training procedure to integrate travel purposes into the learned embeddings without introducing extra overhead to embedding calculation. To reduce redundancy in trajectories, TrajMamba features a Knowledge Distillation Pre-training scheme to identify key trajectory points through a learnable mask generator and obtain effective compressed trajectory embeddings. Extensive experiments on two real-world datasets and three downstream tasks show that TrajMamba outperforms state-of-the-art baselines in both efficiency and accuracy.

PDF Details

NeurIPS Conference 2025 Conference Paper

TransferTraj: A Vehicle Trajectory Learning Model for Region and Task Transferability

Tonglong Wei
Yan Lin
Zeyu Zhou
Haomin Wen
Jilin Hu
Shengnan Guo
Youfang Lin
Gao Cong

Vehicle GPS trajectories provide valuable movement information that supports various downstream tasks and applications. A desirable trajectory learning model should be able to transfer across regions and tasks without retraining, avoiding the need to maintain multiple specialized models and subpar performance with limited training data. However, each region has its unique spatial features and contexts, which are reflected in vehicle movement patterns and are difficult to generalize. Additionally, transferring across different tasks faces technical challenges due to the varying input-output structures required for each task. Existing efforts towards transferability primarily involve learning embedding vectors for trajectories, which perform poorly in region transfer and require retraining of prediction modules for task transfer. To address these challenges, we propose $\textit{TransferTraj}$, a vehicle GPS trajectory learning model that excels in both region and task transferability. For region transferability, we introduce RTTE as the main learnable module within TransferTraj. It integrates spatial, temporal, POI, and road network modalities of trajectories to effectively manage variations in spatial context distribution across regions. It also introduces a TRIE module for incorporating relative information of spatial features and a spatial context MoE module for handling movement patterns in diverse contexts. For task transferability, we propose a task-transferable input-output scheme that unifies the input-output structure of different tasks into the masking and recovery of modalities and trajectory points. This approach allows TransferTraj to be pre-trained once and transferred to different tasks without retraining. We conduct extensive experiments on three real-world vehicle trajectory datasets under various transfer settings, including task transfer, zero-shot region transfer, and few-shot region transfer. Experimental results demonstrate that TransferTraj significantly outperforms state-of-the-art baselines in different scenarios, validating its effectiveness in region and task transfer. Code is available at https: //github. com/wtl52656/TransferTraj.

PDF Details

NeurIPS Conference 2024 Conference Paper

DiffLight: A Partial Rewards Conditioned Diffusion Model for Traffic Signal Control with Missing Data

Hanyang Chen
Yang Jiang
Shengnan Guo
Xiaowei Mao
Youfang Lin
Huaiyu Wan

The application of reinforcement learning in traffic signal control (TSC) has been extensively researched and yielded notable achievements. However, most existing works for TSC assume that traffic data from all surrounding intersections is fully and continuously available through sensors. In real-world applications, this assumption often fails due to sensor malfunctions or data loss, making TSC with missing data a critical challenge. To meet the needs of practical applications, we introduce DiffLight, a novel conditional diffusion model for TSC under data-missing scenarios in the offline setting. Specifically, we integrate two essential sub-tasks, i. e. , traffic data imputation and decision-making, by leveraging a Partial Rewards Conditioned Diffusion (PRCD) model to prevent missing rewards from interfering with the learning process. Meanwhile, to effectively capture the spatial-temporal dependencies among intersections, we design a Spatial-Temporal transFormer (STFormer) architecture. In addition, we propose a Diffusion Communication Mechanism (DCM) to promote better communication and control performance under data-missing scenarios. Extensive experiments on five datasets with various data-missing scenarios demonstrate that DiffLight is an effective controller to address TSC with missing data. The code of DiffLight is released at https: //github. com/lokol5579/DiffLight-release.

PDF Details DOI

NeurIPS Conference 2024 Conference Paper

Mobility-LLM: Learning Visiting Intentions and Travel Preference from Human Mobility Data with Large Language Models

Letian Gong
Yan Lin
Xinyue Zhang
Yiwen Lu
Xuedi Han
Yichen Liu
Shengnan Guo
Youfang Lin

Location-based services (LBS) have accumulated extensive human mobility data on diverse behaviors through check-in sequences. These sequences offer valuable insights into users’ intentions and preferences. Yet, existing models analyzing check-in sequences fail to consider the semantics contained in these sequences, which closely reflect human visiting intentions and travel preferences, leading to an incomplete comprehension. Drawing inspiration from the exceptional semantic understanding and contextual information processing capabilities of large language models (LLMs) across various domains, we present Mobility-LLM, a novel framework that leverages LLMs to analyze check-in sequences for multiple tasks. Since LLMs cannot directly interpret check-ins, we reprogram these sequences to help LLMs comprehensively understand the semantics of human visiting intentions and travel preferences. Specifically, we introduce a visiting intention memory network (VIMN) to capture the visiting intentions at each record, along with a shared pool of human travel preference prompts (HTPP) to guide the LLM in understanding users’ travel preferences. These components enhance the model’s ability to extract and leverage semantic information from human mobility data effectively. Extensive experiments on four benchmark datasets and three downstream tasks demonstrate that our approach significantly outperforms existing models, underscoring the effectiveness of Mobility-LLM in advancing our understanding of human mobility data within LBS contexts.

PDF Details DOI

NeurIPS Conference 2024 Conference Paper

PGN: The RNN's New Successor is Effective for Long-Range Time Series Forecasting

Yuxin Jia
Youfang Lin
Jing Yu
Shuo Wang
Tianhao Liu
Huaiyu Wan

Due to the recurrent structure of RNN, the long information propagation path poses limitations in capturing long-term dependencies, gradient explosion/vanishing issues, and inefficient sequential execution. Based on this, we propose a novel paradigm called Parallel Gated Network (PGN) as the new successor to RNN. PGN directly captures information from previous time steps through the designed Historical Information Extraction (HIE) layer and leverages gated mechanisms to select and fuse it with the current time step information. This reduces the information propagation path to $\mathcal{O}(1)$, effectively addressing the limitations of RNN. To enhance PGN's performance in long-range time series forecasting tasks, we propose a novel temporal modeling framework called Temporal PGN (TPGN). TPGN incorporates two branches to comprehensively capture the semantic information of time series. One branch utilizes PGN to capture long-term periodic patterns while preserving their local characteristics. The other branch employs patches to capture short-term information and aggregate the global representation of the series. TPGN achieves a theoretical complexity of $\mathcal{O}(\sqrt{L})$, ensuring efficiency in its operations. Experimental results on five benchmark datasets demonstrate the state-of-the-art (SOTA) performance and high efficiency of TPGN, further confirming the effectiveness of PGN as the new successor to RNN in long-range time series forecasting. The code is available in this repository: https: //github. com/Water2sea/TPGN.

PDF Details DOI

AAAI Conference 2023 Conference Paper

Contrastive Pre-training with Adversarial Perturbations for Check-In Sequence Representation Learning

Letian Gong
Youfang Lin
Shengnan Guo
Yan Lin
Tianyi Wang
Erwen Zheng
Zeyu Zhou
Huaiyu Wan

A core step of mining human mobility data is to learn accurate representations for user-generated check-in sequences. The learned representations should be able to fully describe the spatial-temporal mobility patterns of users and the high-level semantics of traveling. However, existing check-in sequence representation learning is usually implicitly achieved by end-to-end models designed for specific downstream tasks, resulting in unsatisfactory generalizable abilities and poor performance. Besides, although the sequence representation learning models that follow the contrastive learning pre-training paradigm have achieved breakthroughs in many fields like NLP, they fail to simultaneously consider the unique spatial-temporal characteristics of check-in sequences and need manual adjustments on the data augmentation strategies. So, directly applying them to check-in sequences cannot yield a meaningful pretext task. To this end, in this paper we propose a contrastive pre-training model with adversarial perturbations for check-in sequence representation learning (CACSR). Firstly, we design a novel spatial-temporal augmentation block for disturbing the spatial-temporal features of check-in sequences in the latent space to relieve the stress of designing manual data augmentation strategies. Secondly, to construct an effective contrastive pretext task, we generate “hard” positive and negative pairs for the check-in sequence by adversarial training. These two designs encourage the model to capture the high-level spatial-temporal patterns and semantics of check-in sequences while ignoring the noisy and unimportant details. We demonstrate the effectiveness and versatility of CACSR on two kinds of downstream tasks using three real-world datasets. The results show that our model outperforms both the state-of-the-art pre-training methods and the end-to-end models.

PDF Details DOI

TIST Journal 2023 Journal Article

Enough Waiting for the Couriers: Learning to Estimate Package Pick-up Arrival Time from Couriers’ Spatial-Temporal Behaviors

Haomin Wen
Youfang Lin
Fan Wu
Huaiyu Wan
Zhongxiang Sun
Tianyue Cai
Hongyu Liu
Shengnan Guo

In intelligent logistics systems, predicting the Estimated Time of Pick-up Arrival (ETPA) of packages is a crucial task, which aims to predict the courier’s arrival time to all the unpicked-up packages at any time. Accurate prediction of ETPA can help systems alleviate customers’ waiting anxiety and improve their experience. We identify three main challenges of this problem. First, unlike the travel time estimation problem in other fields like ride-hailing, the ETPA task is distinctively a multi-destination and path-free prediction problem. Second, an intuitive idea for solving ETPA is to predict the pick-up route and then the time in two stages. However, it is difficult to accurately and efficiently predict couriers’ future routes in the route prediction step since their behaviors are affected by multiple complex factors. Third, furthermore, in the time prediction step, the requirement for providing a courier’s all unpicked-up packages’ ETPA at once in real time makes the problem even more challenging. To tackle the preceding challenges, we propose RankETPA, which integrates the route inference into the ETPA prediction. First, a learning-based pick-up route predictor is designed to learn the route-ranking strategies of couriers from their massive spatial-temporal behaviors. Then, a spatial-temporal attention-based arrival time predictor is designed for real-time ETPA inference via capturing the spatial-temporal correlations between the unpicked-up packages. Extensive experiments on two real-world datasets and a synthetic dataset demonstrate that RankETPA achieves significant performance improvement against the baseline models.

Details DOI

AAAI Conference 2023 Conference Paper

GMDNet: A Graph-Based Mixture Density Network for Estimating Packages’ Multimodal Travel Time Distribution

Xiaowei Mao
Huaiyu Wan
Haomin Wen
Fan Wu
Jianbin Zheng
Yuting Qiang
Shengnan Guo
Lixia Wu

In the logistics network, accurately estimating packages' Travel Time Distribution (TTD) given the routes greatly benefits both consumers and platforms. Although recent works perform well in predicting an expected time or a time distribution in a road network, they could not be well applied to estimate TTD in logistics networks. Because TTD prediction in the logistics network requires modeling packages' multimodal TTD (MTTD, i.e., there can be more than one likely output with a given input) while leveraging the complex correlations in the logistics network. To this end, this work opens appealing research opportunities in studying MTTD learning conditioned on graph-structure data by investigating packages' travel time distribution in the logistics network. We propose a Graph-based Mixture Density Network, named GMDNet, which takes the benefits of both graph neural network and mixture density network for estimating MTTD conditioned on graph-structure data (i.e., the logistics network). Furthermore, we adopt the Expectation-Maximization (EM) framework in the training process to guarantee local convergence and thus obtain more stable results than gradient descent. Extensive experiments on two real-world datasets demonstrate the superiority of our proposed model. Corrigendum Notice In the initial publication of this article, the authors (Mao et al. 2023) acknowledged that although it referred to an earlier paper already presented and published in ICML-21 (Errica et al. 2021), it insufficiently acknowledged the extent to which it incorporated and made extensive use of techniques therein. We are providing a Corrigendum Note, "PDF (2024-09-25)," alongside the original published version. The Corrigendum Note summarizes the main novel contributions of this paper. Errica, F.; Bacciu, D.; and Micheli, A. 2021. Graph Mixture Density Networks. In Proceedings of the 38th International Conference on Machine Learning (PMLR-28), 3025–3035. PMLR. Mao, X.; Wan, H.; Wen, H.; Wu, F.; Zheng, J.; Qiang, Y.; Guo, S.; Wu, L.; Hu, H.; and Lin, Y. 2023. GMDNet: A Graph-Based Mixture Density Network for Estimating Packages’ Multimodal Travel Time Distribution. In Proceedings of the 37th AAAI Conference on Artificial Intelligence.

PDF Details DOI

NeurIPS Conference 2023 Conference Paper

WITRAN: Water-wave Information Transmission and Recurrent Acceleration Network for Long-range Time Series Forecasting

Yuxin Jia
Youfang Lin
Xinyan Hao
Yan Lin
Shengnan Guo
Huaiyu Wan

Capturing semantic information is crucial for accurate long-range time series forecasting, which involves modeling global and local correlations, as well as discovering long- and short-term repetitive patterns. Previous works have partially addressed these issues separately, but have not been able to address all of them simultaneously. Meanwhile, their time and memory complexities are still not sufficiently low for long-range forecasting. To address the challenge of capturing different types of semantic information, we propose a novel Water-wave Information Transmission (WIT) framework. This framework captures both long- and short-term repetitive patterns through bi-granular information transmission. It also models global and local correlations by recursively fusing and selecting information using Horizontal Vertical Gated Selective Unit (HVGSU). In addition, to improve the computing efficiency, we propose a generic Recurrent Acceleration Network (RAN) which reduces the time complexity to $\mathcal{O}(\sqrt{L})$ while maintaining the memory complexity at $\mathcal{O}(L)$. Our proposed method, called Water-wave Information Transmission and Recurrent Acceleration Network (WITRAN), outperforms the state-of-the-art methods by 5. 80% and 14. 28% on long-range and ultra-long-range time series forecasting tasks respectively, as demonstrated by experiments on four benchmark datasets. The code is available at: https: //github. com/Water2sea/WITRAN.

PDF Details

TIST Journal 2022 Journal Article

DeepRoute+: Modeling Couriers’ Spatial-temporal Behaviors and Decision Preferences for Package Pick-up Route Prediction

Haomin Wen
Youfang Lin
Huaiyu Wan
Shengnan Guo
Fan Wu
Lixia Wu
Chao Song
Yinghui Xu

Over 10 billion packages are picked up every day in China. A fundamental task raised in the emerging intelligent logistics systems is the couriers’ package pick-up route prediction, which is beneficial for package dispatching, arrival-time estimation and overdue-risk evaluation, by leveraging the predicted routes to improve those downstream tasks. In the package pick-up scene, the decision-making of a courier is affected by strict spatial-temporal constraints (e.g., package location, promised pick-up time, current time, and courier’s current location). Furthermore, couriers have different decision preferences on various factors (e.g., time factor, distance factor, and balance of both), based on their own perception of the environments and work experience. In this article, we propose a novel model, named DeepRoute+, to predict couriers’ future package pick-up routes according to the couriers’ decision experience and preference learned from the historical behaviors. Specifically, DeepRoute+ consists of three layers: (1) The representation layer produces experience- and preference-aware representations for the unpicked-up packages, in which a decision preference module can dynamically adjust the importance of factors that affects the courier’s decision under the current situation. (2) The transformer encoder layer encodes the representations of packages while considering the spatial-temporal correlations among them. (3) The attention-based decoder layer uses the attention mechanism to generate the whole pick-up route recurrently. Experiments on a real-world logistics dataset demonstrate the state-of-the-art performance of our model.

Details DOI

AAAI Conference 2021 Conference Paper

GSNet: Learning Spatial-Temporal Correlations from Geographical and Semantic Aspects for Traffic Accident Risk Forecasting

Beibei Wang
Youfang Lin
Shengnan Guo
Huaiyu Wan

Traffic accident forecasting is of great importance to urban public safety, emergency treatment, and construction planning. However, it is very challenging since traffic accidents are affected by multiple factors, and have multi-scale dependencies on both spatial and temporal dimensional features. Meanwhile, traffic accidents are rare events, which leads to the zero-inflated issue. Existing traffic accident forecasting methods cannot deal with all above problems simultaneously. In this paper, we propose a novel model, named GSNet, to learn the spatial-temporal correlations from geographical and semantic aspects for traffic accident risk forecasting. In the model, a Spatial-Temporal Geographical Module is designed to capture the geographical spatial-temporal correlations among regions, while a Spatial-Temporal Semantic Module is proposed to model the semantic spatial-temporal correlations among regions. In addition, a weighted loss function is designed to solve the zero-inflated issue. Extensive experiments on two real-world datasets demonstrate the superiority of GSNet against the state-of-the-art baseline methods.

PDF Details

AAAI Conference 2021 Conference Paper

Pre-training Context and Time Aware Location Embeddings from Spatial-Temporal Trajectories for User Next Location Prediction

Yan Lin
Huaiyu Wan
Shengnan Guo
Youfang Lin

Pre-training location embeddings from spatial-temporal trajectories is a fundamental procedure and very beneficial for user next location prediction. In the real world, a location usually has variable functionalities under different contextual environments. If the exact functions of a location in the trajectory can be reflected in its embedding, the accuracy of user next location prediction should be improved. Yet, existing location embeddings pre-trained on trajectories are mostly based on distributed word representations, which mix a location’s various functionalities into one latent representation vector. To address this problem, we propose a Context and Time aware Location Embedding (CTLE) model, which calculates a location’s representation vector with consideration of its specific contextual neighbors in trajectories. In this way, the multi-functional properties of locations can be properly tackled. Furthermore, in order to incorporate temporal information in trajectories into location embeddings, we propose a subtle temporal encoding module and a novel pretraining objective, which further improve the quality of location embeddings. We evaluate our proposed model on two real-world mobile user trajectory datasets. The experimental results demonstrate that, compared with the existing embedding methods, our CTLE model can pre-train higher quality location embeddings and significantly improve the performance of downstream user location prediction models.

PDF Details

AAAI Conference 2020 Conference Paper

Spatial-Temporal Synchronous Graph Convolutional Networks: A New Framework for Spatial-Temporal Network Data Forecasting

Chao Song
Youfang Lin
Shengnan Guo
Huaiyu Wan

Spatial-temporal network data forecasting is of great importance in a huge amount of applications for trafﬁc management and urban planning. However, the underlying complex spatial-temporal correlations and heterogeneities make this problem challenging. Existing methods usually use separate components to capture spatial and temporal correlations and ignore the heterogeneities in spatial-temporal data. In this paper, we propose a novel model, named Spatial-Temporal Synchronous Graph Convolutional Networks (STSGCN), for spatial-temporal network data forecasting. The model is able to effectively capture the complex localized spatial-temporal correlations through an elaborately designed spatial-temporal synchronous modeling mechanism. Meanwhile, multiple modules for different time periods are designed in the model to effectively capture the heterogeneities in localized spatialtemporal graphs. Extensive experiments are conducted on four real-world datasets, which demonstrates that our method achieves the state-of-the-art performance and consistently outperforms other baselines.

PDF Details

AAAI Conference 2019 Conference Paper

Attention Based Spatial-Temporal Graph Convolutional Networks for Traffic Flow Forecasting

Shengnan Guo
Youfang Lin
Ning Feng
Chao Song
Huaiyu Wan

Forecasting the traffic flows is a critical issue for researchers and practitioners in the field of transportation. However, it is very challenging since the traffic flows usually show high nonlinearities and complex patterns. Most existing traffic flow prediction methods, lacking abilities of modeling the dynamic spatial-temporal correlations of traffic data, thus cannot yield satisfactory prediction results. In this paper, we propose a novel attention based spatial-temporal graph convolutional network (ASTGCN) model to solve traffic flow forecasting problem. ASTGCN mainly consists of three independent components to respectively model three temporal properties of traffic flows, i. e. , recent, daily-periodic and weekly-periodic dependencies. More specifically, each component contains two major parts: 1) the spatial-temporal attention mechanism to effectively capture the dynamic spatialtemporal correlations in traffic data; 2) the spatial-temporal convolution which simultaneously employs graph convolutions to capture the spatial patterns and common standard convolutions to describe the temporal features. The output of the three components are weighted fused to generate the final prediction results. Experiments on two real-world datasets from the Caltrans Performance Measurement System (PeMS) demonstrate that the proposed ASTGCN model outperforms the state-of-the-art baselines.

PDF Details