Arrow Research search

Author name cluster

Hao Miao

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers
1 author row

Possible papers

12

AAAI Conference 2026 Conference Paper

Adaptive Frequency Pathways for Spatiotemporal Forecasting

  • Yanjun Qin
  • Yuchen Fang
  • Xinke Jiang
  • Hao Miao
  • Xiaoming Tao

Spatiotemporal forecasting is a fundamental task in areas such as traffic flow prediction, environmental sensing, and urban planning. Recent advances have shown that decomposing temporal signals into multiple frequencies and modeling them jointly with spatial structures can significantly enhance forecasting performance. However, existing multi-frequency forecasting models still face two critical limitations. First, the importance of different temporal frequencies evolves over time, yet most models assume fixed or static frequency contributions. Second, spatial dependencies are inherently frequency-sensitive. For instance, low-frequency components often align with global spatial patterns, while high-frequency components tend to correspond to localized interactions. However, current approaches typically share spatial information across all frequencies, introducing spatiotemporal inconsistency. To address these challenges, we propose a novel Adaptive Frequency Pathways (AdaFre) framework for spatiotemporal forecasting, which adaptively captures both dynamic frequency relevance and frequency-aligned spatial structures. AdaFre employs a multi-frequency routing mechanism to dynamically select and aggregate the most informative temporal frequency components, while associating each with its corresponding spatial representation derived from frequency-aware embeddings. Spatiotemporal backbones are then used to model each path independently before final aggregation. Extensive experiments on several real-world datasets demonstrate that AdaFre significantly outperforms state-of-the-art baselines.
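The multi-frequency decomposition that AdaFre builds on can be illustrated with a minimal NumPy sketch. This is a toy under our own assumptions (a single-cutoff two-band split; the function name is hypothetical), not the authors' implementation:

```python
import numpy as np

def split_frequency_bands(x, cutoff):
    """Split a 1-D signal into low- and high-frequency parts via the rFFT.

    Bins below `cutoff` form the low band; the remainder is the high band.
    By linearity, the two parts sum back to the original signal.
    """
    spec = np.fft.rfft(x)
    low_spec = np.zeros_like(spec)
    low_spec[:cutoff] = spec[:cutoff]
    low = np.fft.irfft(low_spec, n=len(x))
    high = np.fft.irfft(spec - low_spec, n=len(x))
    return low, high

# Toy traffic-like series: a slow trend plus a faster fluctuation.
t = np.arange(256)
x = np.sin(2 * np.pi * t / 128) + 0.3 * np.sin(2 * np.pi * t / 8)
low, high = split_frequency_bands(x, cutoff=8)
```

A routing mechanism as described in the abstract would then weight or select among such bands per time step, rather than treating their contributions as fixed.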

AAAI Conference 2026 Conference Paper

Task-Aware Retrieval Augmentation for Dynamic Recommendation

  • Zhen Tao
  • Xinke Jiang
  • Qingshuai Feng
  • Haoyu Zhang
  • Lun Du
  • Yuchen Fang
  • Hao Miao
  • Bangquan Xie

Dynamic recommendation systems aim to provide personalized suggestions by modeling temporal user-item interactions across time-series behavioral data. Recent studies have leveraged pre-trained dynamic graph neural networks (GNNs) to learn user-item representations over temporal snapshot graphs. However, fine-tuning GNNs on these graphs often results in generalization issues due to temporal discrepancies between the pre-training and fine-tuning stages, limiting the model's ability to capture evolving user preferences. To address this, we propose TarDGR, a task-aware retrieval-augmented framework designed to enhance generalization by incorporating a task-aware model and retrieval augmentation. Specifically, TarDGR introduces a Task-Aware Evaluation Mechanism to identify semantically relevant historical subgraphs, enabling the construction of task-specific datasets without manual labeling. It also presents a Graph Transformer-based Task-Aware Model that integrates semantic and structural encodings to assess subgraph relevance. During inference, TarDGR retrieves and fuses task-aware subgraphs with the query subgraph, enriching its representation and mitigating temporal generalization issues. Experiments on multiple large-scale dynamic graph datasets demonstrate that TarDGR consistently outperforms state-of-the-art methods, with extensive empirical evidence underscoring its superior accuracy and generalization capabilities.

AAAI Conference 2025 Conference Paper

C2F-TP: A Coarse-to-Fine Denoising Framework for Uncertainty-Aware Trajectory Prediction

  • Zichen Wang
  • Hao Miao
  • Senzhang Wang
  • Renzhi Wang
  • Jianxin Wang
  • Jian Zhang

Accurately predicting the trajectories of vehicles is critically important for ensuring safety and reliability in autonomous driving. Although considerable research efforts have been made recently, the inherent trajectory uncertainty caused by various factors, including dynamic driving intents and diverse driving scenarios, still poses significant challenges to accurate trajectory prediction. To address this issue, we propose C2F-TP, a coarse-to-fine denoising framework for uncertainty-aware vehicle trajectory prediction. C2F-TP features an innovative two-stage coarse-to-fine prediction process. Specifically, in the spatial-temporal interaction stage, we propose a spatial-temporal interaction module to capture the inter-vehicle interactions and learn a multimodal trajectory distribution, from which a certain number of noisy trajectories are sampled. Next, in the trajectory refinement stage, we design a conditional denoising model to reduce the uncertainty of the sampled trajectories through a step-wise denoising operation. Extensive experiments are conducted on two real datasets, NGSIM and highD, that are widely adopted in trajectory prediction. The results demonstrate the effectiveness of our proposal.
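The sample-then-refine structure of the two stages can be caricatured in a few lines. This is a deliberately crude illustration under our own assumptions: the "coarse stage" is plain Gaussian sampling and the "denoising model" is a hard-coded pull toward a target, standing in for the paper's learned conditional diffusion model:

```python
import numpy as np

rng = np.random.default_rng(42)

# Coarse stage: sample noisy trajectory endpoints around a predicted mean.
mean = np.array([10.0, 5.0])
samples = mean + rng.normal(scale=2.0, size=(20, 2))

def refine(traj, target, steps=5, rate=0.5):
    """Fine stage: each denoising step shrinks the samples toward the
    target. A real conditional denoising model predicts this correction;
    here the pull toward `target` is a fixed stand-in."""
    for _ in range(steps):
        traj = traj - rate * (traj - target)
    return traj

refined = refine(samples, mean)
```

The point is only the control flow: uncertainty is represented by a population of samples, and refinement reduces their spread step by step rather than committing to a single trajectory up front.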

NeurIPS Conference 2025 Conference Paper

SPOT-Trip: Dual-Preference Driven Out-of-Town Trip Recommendation

  • Yinghui Liu
  • Hao Miao
  • Guojiang Shen
  • Yan Zhao
  • Xiangjie Kong
  • Ivan Lee

Out-of-town trip recommendation aims to generate a sequence of Points of Interest (POIs) for users traveling from their hometowns to previously unvisited regions based on personalized itineraries, e.g., origin, destination, and trip duration. Modeling the complex user preferences--which often exhibit a two-fold nature of static and dynamic interests--is critical for effective recommendations. However, the sparsity of out-of-town check-in data presents significant challenges in capturing such user preferences. Meanwhile, existing methods often conflate the static and dynamic preferences, resulting in suboptimal performance. In this paper, we for the first time systematically study the problem of out-of-town trip recommendation. A novel framework, SPOT-Trip, is proposed to explicitly learn the dual static-dynamic user preferences. Specifically, to handle scarce data, we construct a POI attribute knowledge graph to enrich the semantic modeling of users' hometown and out-of-town check-ins, enabling static preference modeling through attribute relation-aware aggregation. Then, we employ neural ordinary differential equations (ODEs) to capture the continuous evolution of latent dynamic user preferences and combine them with a temporal point process to describe the instantaneous probability of each preference behavior. Further, a static-dynamic fusion module is proposed to merge the learned static and dynamic user preferences. Extensive experiments on real data offer insight into the effectiveness of the proposed solutions, showing that SPOT-Trip achieves performance improvements of up to 17.01%.
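The neural-ODE component evolves a latent preference state continuously in time. The mechanics of such integration can be sketched with a plain forward-Euler solver (our standalone sketch; real neural ODEs use a learned dynamics function and adaptive, adjoint-based solvers):

```python
import numpy as np

def odeint_euler(f, h0, t0, t1, steps=1000):
    """Integrate dh/dt = f(h, t) with the forward Euler method."""
    h = np.asarray(h0, dtype=float)
    t, dt = t0, (t1 - t0) / steps
    for _ in range(steps):
        h = h + dt * f(h, t)
        t += dt
    return h

# Sanity check against the closed form of dh/dt = -h: h(t) = h0 * exp(-t).
h1 = odeint_euler(lambda h, t: -h, [1.0], 0.0, 1.0)
```

In the dynamic-preference setting, `f` would be a neural network and `h` the latent preference vector, integrated between observed check-in times.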

NeurIPS Conference 2025 Conference Paper

STRAP: Spatio-Temporal Pattern Retrieval for Out-of-Distribution Generalization

  • Haoyu Zhang
  • Hao Miao
  • Xinke Jiang
  • Yuchen Fang
  • Yifan Zhang

Spatio-Temporal Graph Neural Networks (STGNNs) have emerged as a powerful tool for modeling dynamic graph-structured data across diverse domains. However, they often fail to generalize in Spatio-Temporal Out-of-Distribution (STOOD) scenarios, where both temporal dynamics and spatial structures evolve beyond the training distribution. To address this problem, we propose an innovative Spatio-Temporal Retrieval-Augmented Pattern Learning framework, STRAP, which enhances model generalization by integrating retrieval-augmented learning into the STGNN continual learning pipeline. The core of STRAP is a compact and expressive pattern library that stores representative spatio-temporal patterns enriched with historical, structural, and semantic information; the library is built and optimized during the training phase. During inference, STRAP retrieves relevant patterns from this library based on similarity to the current input and injects them into the model via a plug-and-play prompting mechanism. This not only strengthens spatio-temporal representations but also mitigates catastrophic forgetting. Moreover, STRAP introduces a knowledge-balancing objective to harmonize new information with retrieved knowledge. Extensive experiments across multiple real-world streaming graph datasets show that STRAP consistently outperforms state-of-the-art STGNN baselines on STOOD tasks, demonstrating its robustness, adaptability, and strong generalization capability without task-specific fine-tuning.
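The similarity-based retrieval step at the heart of such a pattern library can be sketched as a cosine top-k lookup (names, shapes, and the cosine choice are our assumptions, not the paper's specification):

```python
import numpy as np

def retrieve_patterns(library, query, k=2):
    """Return indices and rows of the k library patterns most similar
    to the query, by cosine similarity."""
    lib = library / np.linalg.norm(library, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    top = np.argsort(lib @ q)[::-1][:k]
    return top, library[top]

# Tiny library of three stored pattern embeddings.
library = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
idx, patterns = retrieve_patterns(library, np.array([0.9, 0.1, 0.0]), k=1)
```

The retrieved rows would then be injected into the model as prompts; keeping the library compact makes this lookup cheap at inference time.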

AAAI Conference 2025 Conference Paper

TimeCMA: Towards LLM-Empowered Multivariate Time Series Forecasting via Cross-Modality Alignment

  • Chenxi Liu
  • Qianxiong Xu
  • Hao Miao
  • Sun Yang
  • Lingzheng Zhang
  • Cheng Long
  • Ziyue Li
  • Rui Zhao

Multivariate time series forecasting (MTSF) aims to learn temporal dynamics among variables to forecast future time series. Existing statistical and deep learning-based methods suffer from limited learnable parameters and small-scale training data. Recently, large language models (LLMs) combining time series with textual prompts have achieved promising performance in MTSF. However, we discovered that current LLM-based solutions fall short in learning disentangled embeddings. We introduce TimeCMA, an intuitive yet effective framework for MTSF via cross-modality alignment. Specifically, we present a dual-modality encoding with two branches: the time series encoding branch extracts disentangled yet weak time series embeddings, and the LLM-empowered encoding branch wraps the same time series with text as prompts to obtain entangled yet robust prompt embeddings. As a result, the cross-modality alignment retrieves both disentangled and robust time series embeddings, ``the best of two worlds'', from the prompt embeddings based on time series and prompt modality similarities. As another key design, to reduce the computational cost of time series with their lengthy textual prompts, we design an effective prompt that encourages the most essential temporal information to be encapsulated in the last token, so that only the last token is passed to downstream prediction. We further store the last token embeddings to accelerate inference. Extensive experiments on eight real datasets demonstrate that TimeCMA outperforms state-of-the-art methods.
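The similarity-based retrieval across modalities can be sketched as a softmax-weighted lookup: time-series-branch embeddings fetch prompt-branch embeddings in proportion to their similarity. All names, shapes, and the temperature are our assumptions, not the paper's exact formulation:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_modal_retrieve(ts_emb, prompt_emb, temperature=0.1):
    """Soft lookup: for each time series embedding, fetch a similarity-
    weighted combination of the robust prompt-branch embeddings."""
    weights = softmax(ts_emb @ prompt_emb.T / temperature)
    return weights @ prompt_emb

# With (near-)orthogonal embeddings each series retrieves its own prompt.
ts_emb = np.eye(3)
prompt_emb = 2.0 * np.eye(3)
retrieved = cross_modal_retrieve(ts_emb, prompt_emb)
```

A low temperature makes the lookup nearly hard; a higher one blends information across prompts.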

NeurIPS Conference 2025 Conference Paper

TimeEmb: A Lightweight Static-Dynamic Disentanglement Framework for Time Series Forecasting

  • Mingyuan Xia
  • Chunxu Zhang
  • Zijian Zhang
  • Hao Miao
  • Qidong Liu
  • Yuanshao Zhu
  • Bo Yang

Temporal non-stationarity, the phenomenon that time series distributions change over time, poses fundamental challenges to reliable time series forecasting. Intuitively, a complex time series can be decomposed into two factors, i.e., time-invariant and time-varying components, which capture static and dynamic patterns, respectively. Nonetheless, existing methods often conflate the time-varying and time-invariant components, jointly learning the combined long-term patterns and short-term fluctuations, which leads to suboptimal performance under distribution shifts. To address this issue, we propose a lightweight static-dynamic decomposition framework, TimeEmb, for time series forecasting. TimeEmb separates time series into two complementary components: (1) a time-invariant component, captured by a novel global embedding module that learns persistent representations across time series, and (2) a time-varying component, processed by an efficient frequency-domain filtering mechanism inspired by full-spectrum analysis in signal processing. Experiments on real-world datasets demonstrate that TimeEmb outperforms state-of-the-art baselines and requires fewer computational resources. We conduct comprehensive quantitative and qualitative analyses to verify the efficacy of the static-dynamic disentanglement. The lightweight framework can also improve existing time series forecasting methods via simple integration. To ease reproducibility, our code is available at https://github.com/showmeon/TimeEmb.

IJCAI Conference 2025 Conference Paper

Towards Cross-Modality Modeling for Time Series Analytics: A Survey in the LLM Era

  • Chenxi Liu
  • Shaowen Zhou
  • Qianxiong Xu
  • Hao Miao
  • Cheng Long
  • Ziyue Li
  • Rui Zhao

The proliferation of edge devices has generated an unprecedented volume of time series data across different domains, motivating a variety of well-customized methods. Recently, Large Language Models (LLMs) have emerged as a new paradigm for time series analytics by leveraging the shared sequential nature of textual data and time series. However, a fundamental cross-modality gap exists between time series and LLMs, as LLMs are pre-trained on textual corpora and are not inherently optimized for time series. Many recent proposals are designed to address this issue. In this survey, we provide an up-to-date overview of LLM-based cross-modality modeling for time series analytics. We first introduce a taxonomy that classifies existing approaches into four groups based on the type of textual data employed for time series modeling. We then summarize key cross-modality strategies, e.g., alignment and fusion, and discuss their applications across a range of downstream tasks. Furthermore, we conduct experiments on multimodal datasets from different application domains to investigate effective combinations of textual data and cross-modality strategies for enhancing time series analytics. Finally, we suggest several promising directions for future research. This survey is designed for a range of professionals, researchers, and practitioners interested in LLM-based time series modeling.

IJCAI Conference 2024 Conference Paper

Score-CDM: Score-Weighted Convolutional Diffusion Model for Multivariate Time Series Imputation

  • Shunyang Zhang
  • Senzhang Wang
  • Hao Miao
  • Hao Chen
  • Changjun Fan
  • Jian Zhang

Multivariate time series (MTS) data are usually incomplete in real scenarios, and imputing the incomplete MTS is practically important to facilitate various time series mining tasks. Recently, diffusion model-based MTS imputation methods have achieved promising results by utilizing CNNs or attention mechanisms for temporal feature learning. However, it is hard to adaptively trade off the diverse effects of local and global temporal features by simply combining CNNs and attention. To address this issue, we propose a Score-weighted Convolutional Diffusion Model (Score-CDM for short), whose backbone consists of a Score-weighted Convolution Module (SCM) and an Adaptive Reception Module (ARM). SCM adopts a score map to capture the global temporal features in the time domain, while ARM uses a Spectral2Time Window Block (S2TWB) to convolve the local time series data in the spectral domain. Benefiting from the time convolution property of the Fast Fourier Transform, ARM can adaptively change the receptive field of the score map and thus effectively balance the local and global temporal features. We conduct extensive evaluations on three real MTS datasets from different domains, and the results verify the effectiveness of the proposed Score-CDM.
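The "time convolution property" of the FFT that the abstract invokes is the circular convolution theorem: pointwise multiplication in the spectrum equals circular convolution in time. A minimal check of this identity (our sketch, not the paper's code):

```python
import numpy as np

def circular_conv_fft(x, kernel):
    """Circular convolution computed as a pointwise product of spectra."""
    n = len(x)
    k = np.zeros(n)
    k[:len(kernel)] = kernel  # zero-pad the kernel to the signal length
    return np.fft.irfft(np.fft.rfft(x) * np.fft.rfft(k), n=n)

x = np.random.default_rng(0).normal(size=64)
kernel = np.array([0.25, 0.5, 0.25])
y = circular_conv_fft(x, kernel)
```

Because the kernel lives in the spectral domain here, reshaping or re-weighting the spectrum effectively changes the receptive field of the time-domain convolution, which is the lever ARM exploits.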

AAAI Conference 2023 Conference Paper

AutoSTL: Automated Spatio-Temporal Multi-Task Learning

  • Zijian Zhang
  • Xiangyu Zhao
  • Hao Miao
  • Chunxu Zhang
  • Hongwei Zhao
  • Junbo Zhang

Spatio-temporal prediction plays a critical role in smart city construction. Jointly modeling multiple spatio-temporal tasks can further promote an intelligent city life by exploiting their inseparable relationship. However, existing studies fail to address this joint learning problem well, as they generally solve tasks individually or with only a fixed combination of tasks. The challenges lie in the tangled relations between different properties, the demand for supporting flexible combinations of tasks, and the complex spatio-temporal dependencies. To cope with these problems, we propose an Automated Spatio-Temporal multi-task Learning (AutoSTL) method to handle multiple spatio-temporal tasks jointly. First, we propose a scalable architecture consisting of advanced spatio-temporal operations to exploit the complicated dependencies. Shared modules and a feature fusion mechanism are incorporated to further capture the intrinsic relationships between tasks. Furthermore, our model automatically allocates the operations and fusion weights. Extensive experiments on benchmark datasets verify that our model achieves state-of-the-art performance. To the best of our knowledge, AutoSTL is the first automated spatio-temporal multi-task learning method.

TIST Journal 2022 Journal Article

Multivariate Correlation-aware Spatio-temporal Graph Convolutional Networks for Multi-scale Traffic Prediction

  • Senzhang Wang
  • Meiyue Zhang
  • Hao Miao
  • Zhaohui Peng
  • Philip S. Yu

Traffic flow prediction based on vehicle trajectories collected from installed GPS devices is critically important to Intelligent Transportation Systems (ITS). One limitation of existing traffic prediction models is that they mostly focus on predicting road-segment level traffic conditions, which can be considered a fine-grained prediction. In many scenarios, however, a coarse-grained prediction, such as predicting the traffic flows among different urban areas covering multiple road links, is also required to help governments better understand traffic conditions from a macroscopic point of view. This is especially useful in urban planning and public transportation planning. Another limitation is that the correlations among different types of traffic-related features are largely ignored. For example, traffic flow and traffic speed are usually negatively correlated. Existing works regard these traffic-related features as independent, without considering their correlations. In this article, we for the first time study the novel problem of multivariate correlation-aware multi-scale traffic flow prediction, and we propose a feature correlation-aware spatio-temporal graph convolutional network, named MC-STGCN, to effectively address it. Specifically, given a road graph, we first construct a coarse-grained road graph based on both the topological closeness and the traffic flow similarity among the nodes (road links). Then a cross-scale spatial-temporal feature learning and fusion technique is proposed to handle both the fine- and coarse-grained traffic data. In the spatial domain, a cross-scale GCN is proposed to learn the multi-scale spatial features jointly and fuse them together. In the temporal domain, a cross-scale temporal network built on hierarchical attention is designed to effectively capture intra- and inter-scale temporal correlations. To capture the feature correlations, a feature correlation learning component is also designed. Finally, a structural constraint is introduced to make the predictions at the two scales consistent. We conduct extensive evaluations on two real traffic datasets, and the results demonstrate the superior performance of the proposal on both fine- and coarse-grained traffic prediction.
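The fine-to-coarse graph construction can be sketched with a hard cluster-assignment matrix S, for which the coarse adjacency is S^T A S. This is the textbook coarsening step; the paper's actual construction additionally uses traffic flow similarity to form the clusters:

```python
import numpy as np

# Fine-grained adjacency over 4 road links; links {0,1} and {2,3} are
# grouped into two coarse regions by the assignment matrix S.
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)
S = np.array([[1, 0],
              [1, 0],
              [0, 1],
              [0, 1]], dtype=float)

# Entry (p, q) of the coarse adjacency sums the edge mass between
# regions p and q (intra-region edges land on the diagonal).
A_coarse = S.T @ A @ S
```

A model can then run GCN layers on both A and A_coarse and fuse the two scales, which is the cross-scale idea the abstract describes.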

IJCAI Conference 2021 Conference Paper

Multi-Channel Pooling Graph Neural Networks

  • Jinlong Du
  • Senzhang Wang
  • Hao Miao
  • Jiaqiang Zhang

Graph pooling is a critical operation for downsampling a graph in graph neural networks. Existing coarsening pooling methods (e.g., DiffPool) mostly focus on capturing the global topology by assigning the nodes to several coarse clusters, while dropping pooling methods (e.g., SAGPool) try to preserve the local topology by selecting the top-k representative nodes. However, an effective method that integrates the two, so that both the local and the global topology of a graph are well captured, has been lacking. To address this issue, we propose a Multi-channel Graph Pooling method named MuchPool, which captures the local structure, the global structure, and node features simultaneously in graph pooling. Specifically, we use two channels to conduct dropping pooling based on the local topology and node features, respectively, and one channel to conduct coarsening pooling. A cross-channel convolution operation is then designed to refine the graph representations of the different channels. Finally, the pooling results are aggregated into the final pooled graph. Extensive experiments on six benchmark datasets demonstrate the superior performance of MuchPool. The code of this work is publicly available on GitHub.
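The dropping-pool channels follow the SAGPool-style top-k selection: score every node, keep the k best, and slice features and adjacency accordingly. A generic sketch under our own assumptions (fixed scores instead of a learned scorer; not MuchPool's released code):

```python
import numpy as np

def topk_pool(X, A, scores, k):
    """Keep the k highest-scoring nodes; gate kept features by score."""
    idx = np.sort(np.argsort(scores)[::-1][:k])  # kept nodes, in order
    return X[idx] * scores[idx, None], A[np.ix_(idx, idx)], idx

# 4-node toy graph with 2-d node features and precomputed node scores.
X = np.arange(8.0).reshape(4, 2)
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)
scores = np.array([0.9, 0.1, 0.8, 0.2])
Xp, Ap, idx = topk_pool(X, A, scores, k=2)
```

Multiplying the kept features by their scores keeps the selection differentiable when the scores come from a learned network, which is why dropping-pool methods gate rather than merely index.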