Arrow Research search

Author name cluster

Ziyu Zhou

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
1 author row

Possible papers (5)

AAAI 2026 Conference Paper

A Retrieval Augmented Spatio-Temporal Framework for Traffic Prediction

  • Weilin Ruan
  • Xilin Dang
  • Ziyu Zhou
  • Sisuo Lyu
  • Yuxuan Liang

Traffic prediction serves as a cornerstone of modern intelligent transportation systems and a critical task in spatio-temporal forecasting. Although advanced Spatio-temporal Graph Neural Networks (STGNNs) and pre-trained models have made significant progress in traffic prediction, two critical challenges persist: (i) limited contextual capacity when handling complex spatio-temporal dependencies, and (ii) low predictability at fine-grained spatio-temporal points caused by heterogeneous patterns. Inspired by Retrieval-Augmented Generation (RAG), we propose RAST, a universal framework that integrates retrieval-augmented mechanisms with spatio-temporal modeling to address these challenges. Our framework consists of three key designs: 1) Decoupled Encoder and Query Generator to capture decoupled spatial and temporal features and construct a fusion query via residual fusion; 2) Spatio-temporal Retrieval Store and Retrievers to maintain and retrieve vectorized fine-grained patterns; and 3) Universal Backbone Predictor that flexibly accommodates pre-trained STGNNs or simple MLP predictors. Extensive experiments on 6 real-world traffic networks, including large-scale datasets, demonstrate that RAST achieves superior performance while maintaining computational efficiency.
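
The retrieval step the abstract describes can be sketched as a store of vectorized patterns queried by cosine similarity. This is a minimal illustration of the general retrieval-augmented idea, not RAST's actual store or retriever; the function names and two-step layout are assumptions.

```python
import numpy as np

def build_store(patterns):
    """Stack pattern vectors and L2-normalize rows so cosine similarity
    reduces to a dot product (illustrative store layout)."""
    store = np.asarray(patterns, dtype=float)
    norms = np.linalg.norm(store, axis=1, keepdims=True)
    return store / np.clip(norms, 1e-12, None)

def retrieve(store, query, k=2):
    """Return indices of the k stored patterns most similar to `query`."""
    q = np.asarray(query, dtype=float)
    q = q / max(np.linalg.norm(q), 1e-12)
    sims = store @ q          # cosine similarity per stored pattern
    return np.argsort(-sims)[:k]
```

In a spatio-temporal setting the query would be the fused spatial-temporal embedding and the store entries the fine-grained historical patterns; here both are plain vectors.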

AAAI 2026 Conference Paper

Revitalizing Canonical Pre-Alignment for Irregular Multivariate Time Series Forecasting

  • Ziyu Zhou
  • Yiming Huang
  • Yanyun Wang
  • Yuankai Wu
  • James Kwok
  • Yuxuan Liang

Irregular multivariate time series (IMTS), characterized by uneven sampling and inter-variate asynchrony, fuel many forecasting applications yet remain challenging to model efficiently. Canonical Pre-Alignment (CPA) has been widely adopted in IMTS modeling by padding zeros at every global timestamp, thereby alleviating inter-variate asynchrony and unifying the series length, but its dense zero-padding inflates the pre-aligned series length, especially when numerous variates are present, causing prohibitive compute overhead. Recent graph-based models with patching strategies sidestep CPA, but their local message passing struggles to capture global inter-variate correlations. Therefore, we posit that CPA should be retained, with the pre-aligned series properly handled by the model, enabling it to outperform state-of-the-art graph-based baselines that sidestep CPA. Technically, we propose KAFNet, a compact architecture grounded in CPA for IMTS forecasting that couples (1) a Pre-Convolution module for sequence smoothing and sparsity mitigation, (2) a Temporal Kernel Aggregation module for learnable compression and modeling of intra-series irregularity, and (3) Frequency Linear Attention blocks for low-cost inter-series correlation modeling in the frequency domain. Experiments on multiple IMTS datasets show that KAFNet achieves state-of-the-art forecasting performance, with a 7.2× parameter reduction and an 8.4× training–inference acceleration.
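
The Canonical Pre-Alignment step the abstract builds on can be sketched in a few lines: collect the union of all observation timestamps, then zero-pad every variate onto that shared axis. The input schema (variate name mapped to {timestamp: value}) is an illustrative assumption, but it makes visible why the pre-aligned length inflates with many asynchronous variates.

```python
import numpy as np

def canonical_pre_align(series):
    """CPA sketch: pad zeros at every global timestamp so all variates
    share one dense axis. `series` maps variate -> {timestamp: value}."""
    timestamps = sorted({t for obs in series.values() for t in obs})
    aligned = np.zeros((len(series), len(timestamps)))
    for i, obs in enumerate(series.values()):
        for j, t in enumerate(timestamps):
            aligned[i, j] = obs.get(t, 0.0)  # zero where unobserved
    return timestamps, aligned
```

Two variates observed at disjoint times already double the aligned length; KAFNet's premise is to keep this representation but compress and smooth it inside the model rather than avoid it.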

AAAI 2025 Conference Paper

MSV-PCT: Multi-Sparse-View Enhanced Transformer Framework for Salient Object Detection in Point Clouds

  • Zihao Wang
  • Yiming Huang
  • Gengyu Lyu
  • Yucheng Zhao
  • Ziyu Zhou
  • Bochen Xie
  • Zhen Yang
  • Yongjian Deng

Salient object detection (SOD) methods for 2D images have great significance in the field of human-computer interaction (HCI). However, although 3D point clouds are a common data format in HCI, SOD research on them remains limited. Previous works commonly treat this task as point cloud segmentation, which perceives all points in the scene for prediction. These methods neglect that SOD is designed to simulate human visual perception, in which humans see only visible surfaces rather than occluded points; as a result, they may fail in such situations. This paper aims to solve this problem by approximately simulating the human perception paradigm for 3D scenes. To this end, we propose MSV-PCT, a framework built on a 3D point cloud backbone and its multi-view projections. Specifically, instead of relying solely on general point cloud learning frameworks, we additionally introduce multi-sparse-view learning branches to supplement the SOD perception. Furthermore, we propose a novel point cloud edge detection loss function to effectively address artifacts, enabling the accurate segmentation of the edges of salient objects from the background. Finally, to evaluate the generalization of point cloud SOD methods, we introduce a new approach to generate simulated PC-SOD datasets from RGBD-SOD data. Experiments on the simulated datasets show that MSV-PCT achieves better accuracy and robustness.
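
The visibility argument above (humans see surfaces, not occluded points) can be sketched with a simple orthographic z-buffer: project points onto a grid and keep, per cell, only the point nearest the viewer. The grid resolution and single fixed view are illustrative assumptions, not the paper's projection scheme.

```python
import numpy as np

def visible_points(points, grid=8):
    """Keep only the surface-visible subset of a point cloud under an
    orthographic top-down view: per (x, y) grid cell, retain the point
    with the smallest depth z and discard occluded points."""
    pts = np.asarray(points, dtype=float)
    best = {}  # grid cell -> index of nearest point so far
    for idx, (x, y, z) in enumerate(pts):
        cell = (int(x * grid), int(y * grid))
        if cell not in best or z < pts[best[cell], 2]:
            best[cell] = idx
    return sorted(best.values())
```

A segmentation model trained on full clouds would still receive the occluded points that this filter removes, which is the mismatch the abstract points to.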

IJCAI 2024 Conference Paper

SDformer: Transformer with Spectral Filter and Dynamic Attention for Multivariate Time Series Long-term Forecasting

  • Ziyu Zhou
  • Gengyu Lyu
  • Yiming Huang
  • Zihao Wang
  • Ziyu Jia
  • Zhen Yang

Transformer has gained widespread adoption in modeling time series due to the exceptional ability of its self-attention mechanism in capturing long-range dependencies. However, when processing time series data with numerous variates, the vanilla self-attention mechanism tends to distribute attention weights evenly and smoothly, causing row-homogenization in attention maps and further hampering time series forecasting. To tackle this issue, we propose an advanced Transformer architecture entitled SDformer, which designs two novel modules, Spectral-Filter-Transform (SFT) and Dynamic-Directional-Attention (DDA), and integrates them into the encoder of Transformer to achieve more intensive attention allocation. Specifically, the SFT module utilizes the Fast Fourier Transform to select the most prominent frequencies, along with a Hamming Window to smooth and denoise the filtered series data; the DDA module applies a specialized kernel function to the query and key vectors projected from the denoised data, concentrating this innovative attention mechanism more effectively on the most informative variates to obtain a sharper attention distribution. These two modules jointly enable attention weights to be more salient among numerous variates, which in turn enhances the attention's ability to capture multivariate correlations, improving the performance in forecasting. Extensive experiments on public datasets demonstrate its superior performance over other state-of-the-art models. Code is available at https://github.com/zhouziyu02/SDformer.
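
The spectral-filtering idea behind SFT (keep the most prominent frequencies, use a Hamming window against leakage) can be sketched with NumPy's real FFT. This is a loose reading of the module under assumed ordering and parameters, not the paper's exact implementation.

```python
import numpy as np

def spectral_filter(x, k=2):
    """Taper a real series with a Hamming window to reduce spectral
    leakage, keep only the k largest-magnitude frequency bins, and
    reconstruct the denoised series via the inverse real FFT."""
    window = np.hamming(len(x))
    spec = np.fft.rfft(x * window)
    keep = np.argsort(-np.abs(spec))[:k]   # k most prominent frequencies
    mask = np.zeros_like(spec)
    mask[keep] = spec[keep]
    return np.fft.irfft(mask, n=len(x))
```

Filtering a pure sinusoid this way preserves its dominant frequency bin while zeroing the rest of the spectrum, which is the "select prominent frequencies" behavior the abstract describes.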

NeurIPS 2021 Conference Paper

The Tufts fNIRS Mental Workload Dataset & Benchmark for Brain-Computer Interfaces that Generalize

  • Zhe Huang
  • Liang Wang
  • Giles Blaney
  • Christopher Slaughter
  • Devon McKeon
  • Ziyu Zhou
  • Robert Jacob
  • Michael Hughes

Functional near-infrared spectroscopy (fNIRS) promises a non-intrusive way to measure real-time brain activity and build responsive brain-computer interfaces. A primary barrier to realizing this technology's potential has been that observed fNIRS signals vary significantly across human users. Building models that generalize well to never-before-seen users has been difficult; a large amount of subject-specific data has been needed to train effective models. To help overcome this barrier, we introduce the largest open-access dataset of its kind, containing multivariate fNIRS recordings from 68 participants, each with labeled segments indicating four possible mental workload intensity levels. Labels were collected via a controlled setting in which subjects performed standard n-back tasks to induce desired working memory levels. We propose a benchmark analysis of this dataset with a standardized training and evaluation protocol, which allows future researchers to report comparable numbers and fairly assess generalization potential while avoiding any overlap or leakage between train and test data. Using this dataset and benchmark, we show how models trained using abundant fNIRS data from many other participants can effectively classify a new target subject's data, thus reducing calibration and setup time for new subjects. We further show how performance improves as the size of the available dataset grows, while also analyzing error rates across key subpopulations to audit equity concerns. We share our open-access Tufts fNIRS to Mental Workload (fNIRS2MW) dataset and open-source code as a step toward advancing brain-computer interfaces.
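
The leakage-avoidance rule the protocol emphasizes, that no subject's segments appear in both train and test, amounts to a subject-disjoint split. The record layout below (subject id paired with a segment) is an illustrative assumption, not the dataset's actual schema.

```python
def subject_disjoint_split(records, test_subjects):
    """Split labeled segments so no subject contributes to both sides:
    all segments from `test_subjects` go to test, the rest to train.
    `records` is a list of (subject_id, segment) pairs."""
    test_set = set(test_subjects)
    train = [r for r in records if r[0] not in test_set]
    test = [r for r in records if r[0] in test_set]
    return train, test
```

A random shuffle over segments would instead scatter each subject across both sides, which is exactly the train-test leakage the benchmark is designed to rule out.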