Arrow Research search

Author name cluster

Xue Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

31 papers
2 author rows

Possible papers

31

JBHI Journal 2026 Journal Article

Mining Global and Local Semantics From Unlabeled Spectra for Spectral Classification

  • Wei Luo
  • Haiming Yao
  • Ang Gao
  • Tao Zhou
  • Xue Wang

Non-destructive detection methods based on molecular vibrational spectroscopy are pivotal in fields such as analytical chemistry and medical diagnostics. Recent advances have integrated deep learning with vibrational spectroscopy, significantly enhancing spectral recognition accuracy. However, these methods often rely on large annotated spectral datasets, limiting their general applicability. To address this limitation, we propose a novel approach, Global and Local Semantics Mining (GLSM), which leverages self-supervised learning to capture the global and local semantic information of unlabeled spectra, obviating the need for extensive annotated data. We devise two proxy tasks: global semantic mining and local semantic mining. The global semantic mining task is based on the premise that different views of the same spectrum can be mutually transformed, enabling the model to capture domain-invariant features across various perspectives and thereby develop a global understanding of the spectral data. This, in turn, enhances the model's robustness to variations in peak positions. Meanwhile, the local semantic mining task posits that noisy spectra can be reconstructed into noise-free spectra, thereby facilitating the extraction of local patterns and fine-grained details, such as subtle variations in peak intensities. By combining both self-supervised tasks, our model effectively captures the global and local semantic information of the spectrum. The pre-trained model can be fine-tuned with a limited amount of labeled homologous or heterologous spectral data for semi-supervised or transfer learning-based spectral classification. Extensive experiments on three datasets in semi-supervised and transfer learning-based spectral recognition tasks comprehensively validate the effectiveness of our GLSM method, demonstrating its significant potential for real-world spectral analysis applications.

AAAI Conference 2026 Conference Paper

SimDiff: Simpler Yet Better Diffusion Model for Time Series Point Forecasting

  • Hang Ding
  • Xue Wang
  • Tian Zhou
  • Tao Yao

Diffusion models have recently shown promise in time series forecasting, particularly for probabilistic predictions. However, they often fail to achieve state-of-the-art point estimation performance compared to regression-based methods. This limitation stems from difficulties in providing sufficient contextual bias to track distribution shifts and in balancing output diversity with the stability and precision required for point forecasts. Existing diffusion-based approaches mainly focus on full-distribution modeling under probabilistic frameworks, often with likelihood maximization objectives, while paying little attention to dedicated strategies for high-accuracy point estimation. Moreover, other existing point prediction diffusion methods frequently rely on pre-trained or jointly trained mature models for contextual bias, sacrificing the generative flexibility of diffusion models. To address these challenges, we propose SimDiff, a single-stage, end-to-end framework. SimDiff employs a single unified Transformer network carefully tailored to serve as both denoiser and predictor, eliminating the need for external pre-trained or jointly trained regressors. It achieves state-of-the-art point estimation performance by leveraging intrinsic output diversity and improving mean squared error accuracy through multiple inference ensembling. Key innovations, including normalization independence and the median-of-means estimator, further enhance adaptability and stability. Extensive experiments demonstrate that SimDiff significantly outperforms existing methods in time series point forecasting.
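The median-of-means estimator named in the abstract is a standard robust aggregation technique; below is a minimal sketch of how it could combine multiple diffusion draws into one point forecast. The shapes, group count, and demo data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def median_of_means(samples: np.ndarray, n_groups: int = 5) -> np.ndarray:
    """Median-of-means over an ensemble of forecasts.

    samples: array of shape (n_samples, horizon), independent draws for
    one series.  The draws are split into n_groups disjoint groups; each
    group is averaged, and the point forecast is the element-wise median
    of the group means, which is far more robust to outlier draws than a
    plain mean over all samples.
    """
    n = (len(samples) // n_groups) * n_groups      # drop the remainder
    groups = samples[:n].reshape(n_groups, -1, samples.shape[1])
    group_means = groups.mean(axis=1)              # (n_groups, horizon)
    return np.median(group_means, axis=0)

# Demo: 100 noisy draws around a known trend, with one wild outlier draw.
rng = np.random.default_rng(0)
truth = np.linspace(0.0, 1.0, 24)
draws = truth + 0.1 * rng.standard_normal((100, 24))
draws[0] += 50.0                                   # a single corrupted sample
forecast = median_of_means(draws)
```

The outlier shifts the plain sample mean by 0.5 everywhere, while the median over group means shrugs it off.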

NeurIPS Conference 2025 Conference Paper

DecompNet: Enhancing Time Series Forecasting Models with Implicit Decomposition

  • Donghao Luo
  • Xue Wang

In this paper, we pioneer the idea of implicit decomposition and, based on it, propose a powerful decomposition-based enhancement framework, DecompNet. Our method turns time series decomposition into an implicit process: it gives a time series model decomposition-related knowledge during inference even though the model never actually decomposes the input series. DecompNet thus lets a model inherit the performance gains brought by time series decomposition without introducing any additional inference cost, enhancing performance while enjoying better efficiency. Experimentally, DecompNet exhibits strong enhancement capability and broad framework generality. In particular, it also improves the latest state-of-the-art models, greatly pushing the performance limit of time series forecasting. Through comprehensive comparisons, DecompNet also shows excellent performance and efficiency, making a decomposition-based enhancement framework surpass the well-recognized normalization-based frameworks for the first time. Code is available at this repository: https://github.com/luodhhh/DecompNet.

IROS Conference 2025 Conference Paper

Dual-Arm Hierarchical Planning for Laboratory Automation: Vibratory Sieve Shaker Operations

  • Haoran Xiao
  • Xue Wang
  • Huimin Lu 0002
  • Zhiwen Zeng
  • Zirui Guo
  • Ziqi Ni
  • Yicong Ye
  • Wei Dai 0014

This paper addresses the challenges of automating vibratory sieve shaker operations in a materials laboratory, focusing on three critical tasks: 1) dual-arm lid manipulation in 3 cm clearance spaces, 2) bimanual handover in overlapping workspaces, and 3) obstructed powder sample container delivery with orientation constraints. These tasks present significant challenges, including inefficient sampling in narrow passages, the need for smooth trajectories to prevent spillage, and suboptimal paths generated by conventional methods. To overcome these challenges, we propose a hierarchical planning framework combining Prior-Guided Path Planning and Multi-Step Trajectory Optimization. The former uses a finite Gaussian mixture model to improve sampling efficiency in narrow passages, while the latter refines paths by shortening, simplifying, imposing joint constraints, and B-spline smoothing. Experimental results demonstrate the framework's effectiveness: planning time is reduced by up to 80.4%, and waypoints are decreased by 89.4%. Furthermore, the system completes the full vibratory sieve shaker operation workflow in a physical experiment, validating its practical applicability for complex laboratory automation.

NeurIPS Conference 2025 Conference Paper

Less is More: Unlocking Specialization of Time Series Foundation Models via Structured Pruning

  • Lifan Zhao
  • Yanyan Shen
  • Zhaoyang Liu
  • Xue Wang
  • Jiaji Deng

Scaling laws motivate the development of Time Series Foundation Models (TSFMs) that pre-train vast parameters and achieve remarkable zero-shot forecasting performance. Surprisingly, even after fine-tuning, TSFMs cannot consistently outperform smaller, specialized models trained on full-shot downstream data. A key question is how to realize effective adaptation of TSFMs for a target forecasting task. Our empirical studies on various TSFMs show that the pre-trained models often exhibit inherent sparsity and redundancy in computation, suggesting that TSFMs have learned to activate task-relevant network substructures to accommodate diverse forecasting tasks. To preserve this valuable prior knowledge, we propose a structured pruning method that regularizes the subsequent fine-tuning process by focusing it on a more relevant and compact parameter space. Extensive experiments on seven TSFMs and six benchmarks demonstrate that fine-tuning a smaller, pruned TSFM significantly improves forecasting performance compared to fine-tuning the original models. This "prune-then-finetune" paradigm often enables TSFMs to achieve state-of-the-art performance and surpass strong specialized baselines. Source code is made publicly available at https://github.com/SJTU-DMTai/Prune-then-Finetune.

NeurIPS Conference 2025 Conference Paper

MISA: Memory-Efficient LLMs Optimization with Module-wise Importance Sampling

  • Yuxi Liu
  • Renjia Deng
  • Yutong He
  • Xue Wang
  • Tao Yao
  • Kun Yuan

The substantial memory demands of pre-training and fine-tuning large language models (LLMs) require memory-efficient optimization algorithms. One promising approach is layer-wise optimization, which treats each transformer block as a single layer and optimizes it sequentially, while freezing the other layers to save optimizer states and activations. Although effective, these methods ignore the varying importance of the modules within each layer, leading to suboptimal performance. Moreover, layer-wise sampling provides only limited memory savings, as at least one full layer must remain active during optimization. To overcome these limitations, we propose Module-wise Importance SAmpling (MISA), a novel method that divides each layer into smaller modules and assigns importance scores to each module. MISA uses a weighted random sampling mechanism to activate modules, provably reducing gradient variance compared to layer-wise sampling. Additionally, we establish an O(1/sqrt(K)) convergence rate under non-convex and stochastic conditions, where K is the total number of training steps, and provide a detailed memory analysis showcasing MISA's superiority over existing baseline methods. Experiments on diverse learning tasks validate the effectiveness of MISA.
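The weighted random sampling at the heart of MISA can be illustrated with a toy importance sampler. The inverse-probability weight and the use of gradient magnitudes as importance scores below are generic importance-sampling machinery, assumed for illustration rather than taken from the paper:

```python
import numpy as np

def sample_module(importance, rng):
    """Draw one module index with probability proportional to its
    importance score, and return the inverse-probability weight that
    keeps the resulting stochastic gradient estimate unbiased."""
    p = importance / importance.sum()
    idx = rng.choice(len(p), p=p)
    return idx, 1.0 / (len(p) * p[idx])

# Toy check of unbiasedness and variance reduction: treat each entry as
# a per-module gradient magnitude and estimate their average by sampling.
grads = np.array([1.0, 3.0, 0.5, 2.5])
rng = np.random.default_rng(0)
estimates = []
for _ in range(1000):
    idx, w = sample_module(grads, rng)   # importance = gradient magnitude
    estimates.append(w * grads[idx])
mis_est = np.mean(estimates)             # matches grads.mean() exactly here
```

With importance proportional to the gradient magnitudes themselves, every weighted draw equals the true mean, so the estimator's variance collapses to zero; uniform sampling would leave substantial variance.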

IJCAI Conference 2025 Conference Paper

Pre-defined Keypoints Promote Category-level Articulation Pose Estimation via Multi-Modal Alignment

  • Wenbo Xu
  • Li Zhang
  • Liu Liu
  • Yan Zhong
  • Haonan Jiang
  • Xue Wang
  • Rujing Wang

Articulations are essential in everyday interactions, yet traditional RGB-based pose estimation methods often struggle with issues such as lighting variations and shadows. To overcome these challenges, we propose a novel Pre-defined keypoint based framework for category-level articulation pose estimation via multi-modal Alignment, coined PAGE. Specifically, we first propose a customized keypoint estimation method, aiming to avoid the divergent distance pattern between heuristically generated keypoints and visible points. In addition, to reduce the mutual information redundancy between point clouds and RGB images, we design the geometry-color alignment, which fuses the features after aligning two modalities. This is followed by decoding the radius for each visible point, and applying our proposal integration scoring strategy to predict keypoints. Ultimately, the framework outputs the per-part 6D pose of the articulation. We conduct extensive experiments to evaluate PAGE across a variety of datasets, from synthetic to real-world scenarios, demonstrating its robustness and superior performance.

AAAI Conference 2025 Conference Paper

R^2-Art: Category-Level Articulation Pose Estimation from Single RGB Image via Cascade Render Strategy

  • Li Zhang
  • Haonan Jiang
  • Yukang Huo
  • Yan Zhong
  • Jianan Wang
  • Xue Wang
  • Rujing Wang
  • Liu Liu

Human life is filled with articulated objects. Previous works for estimating the pose of category-level articulated objects rely on costly 3D point clouds or RGB-D images. In this paper, our goal is to estimate category-level articulation poses from a single RGB image, where we propose R2-Art, a novel category-level Articulation pose estimation framework from a single RGB image and a cascade Render strategy. Given an RGB image as input, R2-Art estimates per-part 6D pose for the articulation. Specifically, we design parallel regression branches tailored to generate camera-to-root translation and rotation. Using the predicted joint states, we perform PC prior transformation and deformation with a joint-centric modeling approach. For further refinement, a cascade render strategy is proposed for projecting the 3D deformed prior onto the 2D mask. Extensive experiments are provided to validate our R2-Art on various datasets ranging from synthetic datasets to real-world scenarios, demonstrating the superior performance and robustness of the R2-Art. We believe that this work has the potential to be applied in many fields including robotics, embodied intelligence, and augmented reality.

NeurIPS Conference 2025 Conference Paper

RePO: Understanding Preference Learning Through ReLU-Based Optimization

  • Junkang Wu
  • Kexin Huang
  • Xue Wang
  • Jinyang Gao
  • Bolin Ding
  • Jiancan Wu
  • Xiangnan He
  • Xiang Wang

Preference learning has become a common approach in various recent methods for aligning large language models with human values. These methods optimize the preference margin between chosen and rejected responses, subject to certain constraints for avoiding over-optimization. In this paper, we report the surprising empirical finding that a simple ReLU activation can learn meaningful alignments while using none of the following: (i) sigmoid-based gradient constraints, (ii) explicit regularization terms. Our experiments show that over-optimization does exist, but that a threshold parameter γ plays an essential role in preventing it by dynamically filtering training examples. We further provide theoretical analysis demonstrating that ReLU-based Preference Optimization (RePO) corresponds to the convex envelope of the 0-1 loss, establishing its fundamental soundness. Our RePO method achieves competitive or superior results compared to established preference optimization approaches. We hope this simple baseline will motivate researchers to rethink the fundamental mechanisms behind preference optimization for language model alignment.
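One plausible reading of the method, presented as a sketch rather than the authors' exact objective, is a hinge loss on the preference margin, with γ acting as the filtering threshold the abstract describes:

```python
import numpy as np

def repo_loss(margin, gamma=1.0):
    """ReLU-based preference loss: a hinge on the preference margin.

    `margin` is the (implicit) reward gap between the chosen and the
    rejected response.  Pairs whose margin already exceeds gamma incur
    exactly zero loss and zero gradient, so gamma dynamically filters
    training examples and caps over-optimization.
    """
    return np.maximum(0.0, gamma - np.asarray(margin))

margins = np.array([-0.5, 0.2, 1.4])     # hypothetical per-pair margins
losses = repo_loss(margins, gamma=1.0)   # only the last pair is filtered out
```

How the margin itself is computed (e.g., from log-probability ratios) is an assumption left open here; the point of the sketch is the ReLU/hinge shape and the role of γ.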

YNICL Journal 2025 Journal Article

Structural and functional changes of Post-Stroke Depression: A multimodal magnetic resonance imaging study

  • Qiuhong Lu
  • Shunzu Lu
  • Xue Wang
  • Yanlan Huang
  • Jie Liu
  • Zhijian Liang

This study investigated changes in gray matter volume (GMV), white matter microstructure, and spontaneous brain activity in post-stroke depression (PSD) using multiple MRI techniques, including neurite orientation dispersion and density imaging (NODDI). Changes in GMV, neurite density index (NDI), orientation dispersion index (ODI), fraction of isotropic water (ISO), diffusion tensor imaging (DTI) parameters, and the amplitude of low-frequency fluctuation (ALFF) were assessed between PSD (n = 20), post-stroke without depression (n = 20), and normal control (n = 20) groups. Receiver operating characteristic (ROC) curve analysis was performed to test the classification performance of the individual parameters within each MRI modality, of each single modality, and of the multimodal combination. Compared to patients with post-stroke without depression (non-PSD), those with PSD showed increased ODI and ISO in widespread white matter, as well as increased ALFF in the left pallidum. No significant differences in GMV or DTI parameters were observed between the two groups. Furthermore, the ODI of the right superior longitudinal fasciculus and NODDI showed the best classification performance for PSD at their respective comparison levels (areas under the ROC curve (AUC) = 0.917 (0.000) and 0.933 (0.000)). The model combining NODDI-derived parameters with non-diffusion MRI parameters (i.e., GMV and ALFF) showed better diagnostic performance than that based on DTI-derived parameters. These findings suggest that PSD is associated with structural and functional abnormalities that may contribute to depressive symptoms. Additionally, NODDI showed advantages in describing structural alterations in emotion-related white matter pathways and in classification performance for PSD.

NeurIPS Conference 2024 Conference Paper

DeformableTST: Transformer for Time Series Forecasting without Over-reliance on Patching

  • Donghao Luo
  • Xue Wang

With the proposal of the patching technique in time series forecasting, Transformer-based models have achieved compelling performance and gained great interest from the time series community. At the same time, we observe a new problem: recent Transformer-based models are overly reliant on patching to achieve ideal performance, which limits their applicability to forecasting tasks unsuitable for patching. In this paper, we intend to handle this emerging issue. By diving into the relationship between patching and full attention (the core mechanism in Transformer-based models), we find that the reason behind this issue is that full attention relies overly on the guidance of patching to focus on the important time points and learn non-trivial temporal representations. Based on this finding, we propose DeformableTST as an effective solution. Specifically, we propose deformable attention, a sparse attention mechanism that can better focus on the important time points by itself, to get rid of the need for patching. We also adopt a hierarchical structure to alleviate the efficiency issue caused by the removal of patching. Experimentally, our DeformableTST achieves consistent state-of-the-art performance across a broader range of time series tasks, and in particular achieves promising performance in forecasting tasks unsuitable for patching, thereby successfully reducing the reliance on patching and broadening the applicability of Transformer-based models. Code is available at this repository: https://github.com/luodhhh/DeformableTST.

AAAI Conference 2024 Conference Paper

MASTER: Market-Guided Stock Transformer for Stock Price Forecasting

  • Tong Li
  • Zhaoyang Liu
  • Yanyan Shen
  • Xue Wang
  • Haokun Chen
  • Sen Huang

Stock price forecasting has remained an extremely challenging problem for many decades due to the high volatility of the stock market. Recent efforts have been devoted to modeling complex stock correlations toward joint stock price forecasting. Existing works share a common neural architecture that learns temporal patterns from individual stock series and then mixes up temporal representations to establish stock correlations. However, they only consider time-aligned stock correlations stemming from all the input stock features, which suffer from two limitations. First, stock correlations often occur momentarily and in a cross-time manner. Second, the feature effectiveness is dynamic with market variation, which affects both the stock sequential patterns and their correlations. To address the limitations, this paper introduces MASTER, a MArket-guided Stock TransformER, which models the momentary and cross-time stock correlation and leverages market information for automatic feature selection. MASTER elegantly tackles the complex stock correlation by alternately engaging in intra-stock and inter-stock information aggregation. Experiments show the superiority of MASTER compared with previous works and visualize the captured realistic stock correlation to provide valuable insights.

ICLR Conference 2024 Conference Paper

ModernTCN: A Modern Pure Convolution Structure for General Time Series Analysis

  • Donghao Luo 0002
  • Xue Wang

Recently, Transformer-based and MLP-based models have emerged rapidly and won dominance in time series analysis. In contrast, convolution is losing steam in time series tasks due to inferior performance. This paper studies the open question of how to better use convolution in time series analysis and makes an effort to bring convolution back to the arena. To this end, we modernize the traditional TCN and make time-series-related modifications so it better suits time series tasks. As the outcome, we propose ModernTCN, which successfully addresses this open question via a seldom-explored route in the time series community. As a pure convolution structure, ModernTCN achieves consistent state-of-the-art performance on five mainstream time series analysis tasks while maintaining the efficiency advantage of convolution-based models, thereby providing a better balance of efficiency and performance than state-of-the-art Transformer-based and MLP-based models. Our study further reveals that, compared with previous convolution-based models, ModernTCN has much larger effective receptive fields (ERFs) and can therefore better unleash the potential of convolution in time series analysis. Code is available at this repository: https://github.com/luodhhh/ModernTCN.

NeurIPS Conference 2023 Conference Paper

One Fits All: Power General Time Series Analysis by Pretrained LM

  • Tian Zhou
  • Peisong Niu
  • Xue Wang
  • Liang Sun
  • Rong Jin

Although we have witnessed great success of pre-trained models in natural language processing (NLP) and computer vision (CV), limited progress has been made for general time series analysis. Unlike NLP and CV, where a unified model can perform different tasks, specially designed approaches still dominate each time series analysis task, such as classification, anomaly detection, forecasting, and few-shot learning. The main challenge blocking the development of pre-trained models for time series analysis is the lack of a large amount of training data. In this work, we address this challenge by leveraging language or CV models, pre-trained from billions of tokens, for time series analysis. Specifically, we refrain from altering the self-attention and feedforward layers of the residual blocks in the pre-trained language or image model. This model, known as the Frozen Pretrained Transformer (FPT), is evaluated by fine-tuning on all major types of tasks involving time series. Our results demonstrate that models pre-trained on natural language or images can deliver comparable or state-of-the-art performance on all main time series analysis tasks, as illustrated in Figure 1. We also find, both theoretically and empirically, that the self-attention module behaves similarly to principal component analysis (PCA), an observation that helps explain how the transformer bridges the domain gap and is a crucial step towards understanding the universality of pre-trained transformers. The code is publicly available at https://anonymous.4open.science/r/Pretrained-LM-for-TSForcasting-C561.
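The freezing rule described in the abstract (leave the self-attention and feedforward blocks untouched, tune the lightweight rest) can be sketched as a parameter-name filter. The name substrings below are assumptions about a GPT-2-style checkpoint, not the authors' code:

```python
def is_trainable(param_name: str) -> bool:
    """Frozen Pretrained Transformer fine-tuning rule (sketch): keep the
    pre-trained self-attention and feed-forward blocks frozen, and train
    only the lightweight parts -- layer norms, positional embeddings,
    input embedding/projection, and the task head."""
    frozen = ("attn", "mlp")                  # self-attention / feed-forward
    trainable = ("ln", "wpe", "head", "embed")
    if any(key in param_name for key in frozen):
        return False
    return any(key in param_name for key in trainable)

# Typical split over GPT-2-style parameter names (illustrative):
names = ["h.0.attn.c_attn.weight", "h.0.mlp.c_fc.weight",
         "h.0.ln_1.weight", "wpe.weight", "lm_head.weight"]
trainable_names = [n for n in names if is_trainable(n)]
```

In a real setup the same predicate would be applied to `model.named_parameters()` to set `requires_grad` per tensor.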

NeurIPS Conference 2023 Conference Paper

OneNet: Enhancing Time Series Forecasting Models under Concept Drift by Online Ensembling

  • Yifan Zhang
  • Qingsong Wen
  • Xue Wang
  • Weiqi Chen
  • Liang Sun
  • Zhang Zhang
  • Liang Wang
  • Rong Jin

Online updating of time series forecasting models aims to address the concept drift problem by efficiently updating forecasting models based on streaming data. Many algorithms are designed for online time series forecasting, with some exploiting cross-variable dependency while others assume independence among variables. Given that every data assumption has its own pros and cons in online time series modeling, we propose Online ensembling Network (OneNet). It dynamically updates and combines two models, one focusing on modeling dependency across the time dimension and the other on cross-variate dependency. Our method incorporates a reinforcement learning-based approach into the traditional online convex programming framework, allowing the two models to be linearly combined with dynamically adjusted weights. OneNet addresses the main shortcoming of classical online learning methods, which tend to be slow in adapting to concept drift. Empirical results show that OneNet reduces online forecasting error by more than 50% compared to the State-Of-The-Art (SOTA) method.
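The classical online convex programming ingredient that OneNet builds on can be sketched with an exponentiated-gradient weight update over the two experts. The paper's reinforcement-learning weighting is more involved; this shows only the textbook baseline, with an illustrative learning rate and loss stream:

```python
import numpy as np

def eg_update(weights, losses, lr=1.0):
    """One exponentiated-gradient step: each expert's weight is scaled
    down exponentially in its most recent loss and then renormalized,
    so the combination drifts toward whichever model currently tracks
    the (possibly drifting) data distribution better."""
    w = weights * np.exp(-lr * np.asarray(losses))
    return w / w.sum()

w = np.array([0.5, 0.5])          # start with an even blend of the two models
for _ in range(10):               # model 0 keeps incurring the larger loss
    w = eg_update(w, losses=[1.0, 0.2])
# w now places almost all mass on model 1
```

The final forecast would then be the w-weighted combination of the two models' predictions at each step.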

NeurIPS Conference 2022 Conference Paper

FiLM: Frequency improved Legendre Memory Model for Long-term Time Series Forecasting

  • Tian Zhou
  • Ziqing Ma
  • Xue Wang
  • Qingsong Wen
  • Liang Sun
  • Tao Yao
  • Wotao Yin
  • Rong Jin

Recent studies have shown that deep learning models such as RNNs and Transformers have brought significant performance gains for long-term forecasting of time series because they effectively utilize historical information. We found, however, that there is still great room for improvement in how to preserve historical information in neural networks while avoiding overfitting to noise present in the history. Addressing this allows better utilization of the capabilities of deep learning models. To this end, we design a Frequency improved Legendre Memory model, or FiLM: it applies Legendre polynomial projections to approximate historical information, uses Fourier projection to remove noise, and adds a low-rank approximation to speed up computation. Our empirical studies show that the proposed FiLM significantly improves the accuracy of state-of-the-art models in multivariate and univariate long-term forecasting by (19.2%, 22.6%), respectively. We also demonstrate that the representation module developed in this work can be used as a general plugin to improve the long-term prediction performance of other deep learning modules. Code is available at https://github.com/tianzhou2011/FiLM/.
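The Legendre-projection ingredient can be illustrated in a few lines: fit a low-degree Legendre expansion to a noisy history window and keep the smooth reconstruction. The degree, window length, and demo signal are illustrative assumptions, not FiLM's configuration:

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_compress(series, degree=8):
    """Approximate a history window by its least-squares projection onto
    the first `degree`+1 Legendre polynomials: a low-dimensional summary
    that keeps the smooth trend and discards high-frequency noise (the
    role the Legendre projection plays inside FiLM, per the abstract)."""
    t = np.linspace(-1.0, 1.0, len(series))       # Legendre domain
    coeffs = legendre.legfit(t, series, degree)   # least-squares fit
    return legendre.legval(t, coeffs)             # smooth reconstruction

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 200)
clean = np.sin(2 * np.pi * t)
noisy = clean + 0.3 * rng.standard_normal(200)
recovered = legendre_compress(noisy)              # much closer to `clean`
```

Nine coefficients summarize a 200-point window, which is also why such projections are cheap to carry forward in a recurrent memory.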

AAAI Conference 2022 Conference Paper

Scaled ReLU Matters for Training Vision Transformers

  • Pichao Wang
  • Xue Wang
  • Hao Luo
  • Jingkai Zhou
  • Zhipeng Zhou
  • Fan Wang
  • Hao Li
  • Rong Jin

Vision transformers (ViTs) have emerged as an alternative design paradigm to convolutional neural networks (CNNs). However, training ViTs is much harder than training CNNs, as it is sensitive to training parameters such as learning rate, optimizer, and warmup epochs. The reasons for this training difficulty are empirically analysed in the paper Early Convolutions Help Transformers See Better, whose authors conjecture that the issue lies with the patchify stem of ViT models. In this paper, we further investigate this problem and extend the above conclusion: early convolutions alone do not ensure stable training; rather, the scaled ReLU operation in the convolutional stem (conv-stem) is what matters. We verify, both theoretically and empirically, that scaled ReLU in the conv-stem not only improves training stabilization but also increases the diversity of patch tokens, boosting peak performance by a large margin while adding few parameters and FLOPs. In addition, extensive experiments demonstrate that previous ViTs are far from well trained, further showing that ViTs have great potential to be a better substitute for CNNs.

IJCAI Conference 2021 Conference Paper

Time Series Data Augmentation for Deep Learning: A Survey

  • Qingsong Wen
  • Liang Sun
  • Fan Yang
  • Xiaomin Song
  • Jingkun Gao
  • Xue Wang
  • Huan Xu

Deep learning has recently performed remarkably well on many time series analysis tasks. The superior performance of deep neural networks relies heavily on a large amount of training data to avoid overfitting. However, labeled data for many real-world time series applications may be limited, such as classification in medical time series and anomaly detection in AIOps. As an effective way to enhance the size and quality of the training data, data augmentation is crucial to the successful application of deep learning models to time series data. In this paper, we systematically review different data augmentation methods for time series. We propose a taxonomy for the reviewed methods and then provide a structured review of these methods, highlighting their strengths and limitations. We also empirically compare different data augmentation methods on different tasks, including time series classification, anomaly detection, and forecasting. Finally, we discuss and highlight five future directions to provide useful research guidance.
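A few of the simplest augmentations covered by such surveys, jittering, scaling, and window slicing, can be sketched directly (parameter values are illustrative):

```python
import numpy as np

def jitter(x, sigma=0.03, rng=None):
    """Additive Gaussian noise: the simplest magnitude-domain augmentation."""
    rng = rng or np.random.default_rng()
    return x + rng.normal(0.0, sigma, size=x.shape)

def scaling(x, sigma=0.1, rng=None):
    """Multiply the whole series by a random factor drawn near 1."""
    rng = rng or np.random.default_rng()
    return x * rng.normal(1.0, sigma)

def window_slice(x, ratio=0.9, rng=None):
    """Time-domain cropping: keep a random contiguous window of the series."""
    rng = rng or np.random.default_rng()
    n = int(len(x) * ratio)
    start = rng.integers(0, len(x) - n + 1)
    return x[start:start + n]

x = np.linspace(0.0, 1.0, 100)                    # a toy series
aug = jitter(x, rng=np.random.default_rng(0))     # same shape, perturbed values
```

Each transform preserves the label for most classification tasks, which is what makes them usable as cheap training-set multipliers.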

ICML Conference 2018 Conference Paper

Minimax Concave Penalized Multi-Armed Bandit Model with High-Dimensional Covariates

  • Xue Wang
  • Mike Mingcheng Wei
  • Tao Yao

In this paper, we propose a Minimax Concave Penalized Multi-Armed Bandit (MCP-Bandit) algorithm for a decision-maker facing high-dimensional data with latent sparse structure in an online learning and decision-making process. We demonstrate that the MCP-Bandit algorithm asymptotically achieves the optimal cumulative regret in the sample size T, O(log T), and further attains a tighter bound in both the covariate dimension d and the number of significant covariates s, O(s^2(s + log d)). In addition, we develop a linear approximation method, the 2-step Weighted Lasso procedure, to identify the MCP estimator for the MCP-Bandit algorithm under non-i.i.d. samples. Using this procedure, the MCP estimator matches the oracle estimator with high probability. Finally, we present two experiments to benchmark our proposed MCP-Bandit algorithm against other bandit algorithms. Both experiments demonstrate that the MCP-Bandit algorithm performs favorably over other benchmark algorithms, especially when there is a high level of data sparsity or when the sample size is not too small.
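The minimax concave penalty itself (Zhang, 2010) has a standard closed form; a small sketch for reference:

```python
import numpy as np

def mcp_penalty(beta, lam, gamma):
    """Minimax concave penalty of Zhang (2010):
        lam*|b| - b^2/(2*gamma)   for |b| <= gamma*lam,
        gamma*lam^2/2             otherwise.
    It penalizes like the Lasso near zero but levels off for large
    coefficients, so strong true signals are not shrunk -- the property
    that lets the MCP estimator match the oracle estimator."""
    b = np.abs(beta)
    return np.where(b <= gamma * lam,
                    lam * b - b ** 2 / (2 * gamma),
                    gamma * lam ** 2 / 2)

# With lam=1, gamma=2 the penalty rises from 0, reaches its plateau
# value gamma*lam^2/2 = 1 at |b| = gamma*lam = 2, and stays flat after.
vals = mcp_penalty(np.array([0.0, 1.0, 2.0, 10.0]), lam=1.0, gamma=2.0)
```

The penalty is continuous at the knee |b| = γλ, and its derivative vanishes there, which is what makes MCP "nearly unbiased" for large coefficients.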