Arrow Research search

Author name cluster

Bowei Xing

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers
2 author rows

Possible papers (8)

IJCAI Conference 2025 Conference Paper

FreEformer: Frequency Enhanced Transformer for Multivariate Time Series Forecasting

  • Wenzhen Yue
  • Yong Liu
  • Xianghua Ying
  • Bowei Xing
  • Ruohao Guo
  • Ji Shi

This paper presents FreEformer, a simple yet effective model that leverages a Frequency Enhanced Transformer for multivariate time series forecasting. Our work is based on the assumption that the frequency spectrum provides a global perspective on the composition of series across various frequencies and is highly suitable for robust representation learning. Specifically, we first convert time series into the complex frequency domain using the Discrete Fourier Transform (DFT). The Transformer architecture is then applied to the frequency spectra to capture cross-variate dependencies, with the real and imaginary parts processed independently. However, we observe that the vanilla attention matrix exhibits a low-rank characteristic, thus limiting representation diversity. To address this, we enhance the vanilla attention mechanism by introducing an additional learnable matrix to the original attention matrix, followed by row-wise L1 normalization. Theoretical analysis demonstrates that this enhanced attention mechanism improves both feature diversity and gradient flow. Extensive experiments demonstrate that FreEformer consistently outperforms state-of-the-art models on eighteen real-world benchmarks covering electricity, traffic, weather, healthcare and finance. Notably, the enhanced attention mechanism also consistently improves the performance of state-of-the-art Transformer-based forecasters. Code is available at https://anonymous.4open.science/r/FreEformer.
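The enhancement described in the abstract (a learnable matrix added to the vanilla attention matrix, followed by row-wise L1 normalization) can be sketched in a few lines of NumPy. This is an illustrative reading of the abstract, not the paper's implementation; in particular, clamping the added matrix `M` to be non-negative is an assumption made here so that row sums stay positive.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def enhanced_attention(Q, K, V, M):
    """Vanilla scaled dot-product attention plus a learnable matrix M,
    followed by row-wise L1 normalization (sketch based on the abstract)."""
    d = Q.shape[-1]
    A = softmax(Q @ K.T / np.sqrt(d))        # vanilla attention matrix
    # Add the learnable matrix; non-negativity is an illustrative choice,
    # not necessarily the paper's parameterization.
    A_hat = A + np.maximum(M, 0.0)
    A_hat = A_hat / A_hat.sum(axis=-1, keepdims=True)  # row-wise L1 norm
    return A_hat @ V, A_hat
```

After normalization each row of the enhanced matrix is again a valid attention distribution, but the additive term can raise its rank beyond that of the vanilla softmax matrix.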

AAAI Conference 2025 Conference Paper

Normal-NeRF: Ambiguity-Robust Normal Estimation for Highly Reflective Scenes

  • Ji Shi
  • Xianghua Ying
  • Ruohao Guo
  • Bowei Xing
  • Wenzhen Yue

Neural Radiance Fields (NeRF) often struggle with reconstructing and rendering highly reflective scenes. Recent advancements have developed various reflection-aware appearance models to enhance NeRF's capability to render specular reflections. However, the robust reconstruction of highly reflective scenes is still hindered by the inherent shape ambiguity on specular surfaces. Existing methods typically rely on additional geometry priors to regularize the shape prediction, but this can lead to oversmoothed geometry in complex scenes. Observing the critical role of surface normals in parameterizing reflections, we introduce a transmittance-gradient-based normal estimation technique that remains robust even under ambiguous shape conditions. Furthermore, we propose a dual activated densities module that effectively bridges the gap between smooth surface normals and sharp object boundaries. Combined with a reflection-aware appearance model, our proposed method achieves robust reconstruction and high-fidelity rendering of scenes featuring both highly specular reflections and intricate geometric structures. Extensive experiments demonstrate that our method outperforms existing state-of-the-art methods on various datasets.

NeurIPS Conference 2025 Conference Paper

OLinear: A Linear Model for Time Series Forecasting in Orthogonally Transformed Domain

  • Wenzhen Yue
  • Yong Liu
  • Hao Wang
  • Haoxuan Li
  • Xianghua Ying
  • Ruohao Guo
  • Bowei Xing
  • Ji Shi

This paper presents $\mathbf{OLinear}$, a $\mathbf{linear}$-based multivariate time series forecasting model that operates in an $\mathbf{o}$rthogonally transformed domain. Recent forecasting models typically adopt the temporal forecast (TF) paradigm, which directly encodes and decodes time series in the time domain. However, the entangled step-wise dependencies in series data can hinder the performance of TF. To address this, some forecasters conduct encoding and decoding in the transformed domain using fixed, dataset-independent bases (e.g., sine and cosine signals in the Fourier transform). In contrast, we propose $\mathbf{OrthoTrans}$, a data-adaptive transformation based on an orthogonal matrix that diagonalizes the series' temporal Pearson correlation matrix. This approach enables more effective encoding and decoding in the decorrelated feature domain and can serve as a plug-in module to enhance existing forecasters. To enhance the representation learning for multivariate time series, we introduce a customized linear layer, $\mathbf{NormLin}$, which employs a normalized weight matrix to capture multivariate dependencies. Empirically, the NormLin module shows a surprising performance advantage over multi-head self-attention, while requiring nearly half the FLOPs. Extensive experiments on 24 benchmarks and 140 forecasting tasks demonstrate that OLinear consistently achieves state-of-the-art performance with high efficiency. Notably, as a plug-in replacement for self-attention, the NormLin module consistently enhances Transformer-based forecasters. The code and datasets are available at https://github.com/jackyue1994/OLinear.
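The OrthoTrans idea (an orthogonal matrix that diagonalizes the series' temporal Pearson correlation matrix) can be sketched with an eigendecomposition. `ortho_trans` below is a hypothetical helper inferred from the abstract, not the released code; it operates on a single window `X` of shape (time, variates).

```python
import numpy as np

def ortho_trans(X):
    """Data-adaptive orthogonal transform: diagonalize the temporal
    Pearson correlation matrix of X (time x variates).

    Sketch of the OrthoTrans idea from the abstract; the paper's exact
    construction may differ."""
    # Pearson correlation between time steps, computed across variates
    C = np.corrcoef(X)                 # (T, T), rows treated as variables
    _, Qm = np.linalg.eigh(C)          # symmetric -> orthonormal eigenvectors
    Z = Qm.T @ X                       # series in the decorrelated domain
    return Z, Qm
```

Because `Qm` is orthogonal, the transform is lossless: `Qm @ Z` recovers `X`, and `Qm.T @ C @ Qm` is diagonal, so the transformed temporal positions are decorrelated.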

ICRA Conference 2024 Conference Paper

Masked Local-Global Representation Learning for 3D Point Cloud Domain Adaptation

  • Bowei Xing
  • Xianghua Ying
  • Ruibin Wang

The point cloud is a popular and widely used geometric representation, which has attracted significant attention in 3D vision. However, the geometric variability of point cloud representations across different datasets can cause domain discrepancies, which hinder knowledge transfer and model generalization, resulting in degraded performance in the target domain. In this paper, we present a novel approach to improve point cloud domain adaptation by employing masked representation learning in a self-supervised manner. Specifically, our method combines masked feature prediction and masked sample consistency to encode both local structure and global semantic information for learning invariant point cloud representations across domains. Moreover, to learn domain-specific representations and transfer knowledge from source to target, we propose prototype-calibrated self-training. By exploiting class-wise prototypes in the shared feature space, the soft pseudo labels can be adaptively denoised, which benefits decision boundary learning in the target domain. We conduct experiments on PointDA-10 and PointSegDA for 3D point cloud shape classification and semantic segmentation, respectively. The results demonstrate the effectiveness of our method and show that we achieve new state-of-the-art performance on point cloud domain adaptation.
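Prototype-calibrated self-training, as described, computes class-wise prototypes in a shared feature space and uses them to denoise soft pseudo labels. A minimal sketch under stated assumptions: prototypes are taken as class means of source features, and the blend weight `alpha` and temperature `tau` are illustrative hyper-parameters, not values from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def prototype_calibrated_labels(src_feat, src_lab, tgt_feat, tgt_prob,
                                alpha=0.5, tau=0.1):
    """Denoise target soft pseudo labels with class-wise prototypes.

    Sketch only: prototypes are class means of source features; target
    pseudo labels are blended with prototype-similarity scores."""
    C = tgt_prob.shape[1]
    protos = np.stack([src_feat[src_lab == c].mean(axis=0) for c in range(C)])
    protos /= np.linalg.norm(protos, axis=1, keepdims=True)
    f = tgt_feat / np.linalg.norm(tgt_feat, axis=1, keepdims=True)
    sim = softmax(f @ protos.T / tau)          # prototype-similarity scores
    calibrated = alpha * tgt_prob + (1 - alpha) * sim
    return calibrated / calibrated.sum(axis=1, keepdims=True)
```

Target samples whose network prediction disagrees with their nearest prototype are pulled toward the prototype assignment, which is one plausible way the pseudo labels get "adaptively denoised".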

IJCAI Conference 2024 Conference Paper

Sub-Adjacent Transformer: Improving Time Series Anomaly Detection with Reconstruction Error from Sub-Adjacent Neighborhoods

  • Wenzhen Yue
  • Xianghua Ying
  • Ruohao Guo
  • DongDong Chen
  • Ji Shi
  • Bowei Xing
  • Yuqing Zhu
  • Taiyan Chen

In this paper, we present the Sub-Adjacent Transformer with a novel attention mechanism for unsupervised time series anomaly detection. Unlike previous approaches that rely on all the points within some neighborhood for time point reconstruction, our method restricts the attention to regions not immediately adjacent to the target points, termed sub-adjacent neighborhoods. Our key observation is that owing to the rarity of anomalies, they typically exhibit more pronounced differences from their sub-adjacent neighborhoods than from their immediate vicinities. By focusing the attention on the sub-adjacent areas, we make the reconstruction of anomalies more challenging, thereby enhancing their detectability. Technically, our approach concentrates attention on the non-diagonal areas of the attention matrix by enlarging the corresponding elements in the training stage. To facilitate the implementation of the desired attention matrix pattern, we adopt linear attention because of its flexibility and adaptability. Moreover, a learnable mapping function is proposed to improve the performance of linear attention. Empirically, the Sub-Adjacent Transformer achieves state-of-the-art performance across six real-world anomaly detection benchmarks, covering diverse fields such as server monitoring, space exploration, and water treatment.
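The sub-adjacent restriction can be pictured as an attention mask that keeps only positions whose distance from the target lies in a band, excluding the immediate vicinity. The sketch below is a hypothetical hard-mask illustration of that neighborhood; the paper itself instead softly enlarges the non-diagonal attention elements during training and uses linear attention.

```python
import numpy as np

def sub_adjacent_mask(T, inner, outer):
    """Boolean mask over a T x T attention matrix keeping only
    sub-adjacent neighbors: positions whose distance from the target
    lies in (inner, outer]. Illustrative; `inner`/`outer` are assumed
    hyper-parameters, not the paper's."""
    idx = np.arange(T)
    dist = np.abs(idx[:, None] - idx[None, :])
    return (dist > inner) & (dist <= outer)
```

Masking the diagonal band forces each point to be reconstructed from points that are near but not immediately adjacent, which is what makes rare anomalies harder to reconstruct and hence easier to detect.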

AAAI Conference 2024 Conference Paper

VPDETR: End-to-End Vanishing Point DEtection TRansformers

  • Taiyan Chen
  • Xianghua Ying
  • Jinfa Yang
  • Ruibin Wang
  • Ruohao Guo
  • Bowei Xing
  • Ji Shi

In the field of vanishing point detection, previous works commonly relied on extracting and clustering straight lines or classifying candidate points as vanishing points. This paper proposes a novel end-to-end framework, called VPDETR (Vanishing Point DEtection TRansformer), that views vanishing point detection as a set prediction problem, applicable to both Manhattan and non-Manhattan world datasets. By using the positional embeddings of anchor points as queries in Transformer decoders and dynamically updating them layer by layer, our method can map input images directly to their vanishing points without the need for explicit straight line extraction or candidate point sampling. Additionally, we introduce an orthogonal loss and a cross-prediction loss to improve accuracy on the Manhattan world datasets. Experimental results demonstrate that VPDETR achieves competitive performance compared to state-of-the-art methods, without requiring post-processing.

AAAI Conference 2023 Conference Paper

Cross-Modal Contrastive Learning for Domain Adaptation in 3D Semantic Segmentation

  • Bowei Xing
  • Xianghua Ying
  • Ruibin Wang
  • Jinfa Yang
  • Taiyan Chen

Domain adaptation for 3D point clouds has attracted a lot of interest since it can, to some extent, avoid the time-consuming labeling process for 3D data. A recent work named xMUDA leveraged multi-modal data for the domain adaptation task of 3D semantic segmentation by mimicking the predictions between 2D and 3D modalities, and outperformed previous single-modality methods that use only point clouds. Building on it, in this paper, we propose a novel cross-modal contrastive learning scheme to further improve the adaptation effects. By employing constraints from the correspondences between 2D pixel features and 3D point features, our method not only facilitates interaction between the two different modalities, but also boosts feature representations in both the labeled source domain and the unlabeled target domain. Meanwhile, to sufficiently utilize 2D context information for domain adaptation through cross-modal learning, we introduce a neighborhood feature aggregation module to enhance pixel features. The module employs neighborhood attention to aggregate nearby pixels in the 2D image, which relieves the mismatch between the two modalities that arises from projecting the relatively sparse point cloud onto dense image pixels. We evaluate our method on three unsupervised domain adaptation scenarios, including country-to-country, day-to-night, and dataset-to-dataset. Experimental results show that our approach outperforms existing methods, which demonstrates the effectiveness of the proposed method.
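The constraint between corresponding 2D pixel features and 3D point features is in the spirit of a symmetric contrastive (InfoNCE-style) loss over matched pairs. The sketch below is a generic formulation of that idea, not necessarily the paper's exact loss; `tau` is an assumed temperature.

```python
import numpy as np

def cross_modal_infonce(f2d, f3d, tau=0.1):
    """Symmetric InfoNCE over corresponding 2D-pixel / 3D-point features.
    Row i of f2d is assumed to correspond to row i of f3d (matched pairs
    on the diagonal of the similarity matrix). Generic sketch."""
    a = f2d / np.linalg.norm(f2d, axis=1, keepdims=True)
    b = f3d / np.linalg.norm(f3d, axis=1, keepdims=True)
    logits = a @ b.T / tau

    def nce(l):
        # numerically stable cross-entropy toward the diagonal
        m = l.max(axis=1, keepdims=True)
        logz = m + np.log(np.exp(l - m).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(l - logz))

    return 0.5 * (nce(logits) + nce(logits.T))
```

Pulling each pixel feature toward its projected point feature (and pushing it away from all others) is one concrete way to realize the "constraints from the correspondences" the abstract describes.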

AAAI Conference 2023 Conference Paper

ECO-3D: Equivariant Contrastive Learning for Pre-training on Perturbed 3D Point Cloud

  • Ruibin Wang
  • Xianghua Ying
  • Bowei Xing
  • Jinfa Yang

In this work, we investigate contrastive learning on perturbed point clouds and find that the contrasting process may widen the domain gap caused by random perturbations, making the pre-trained network fail to generalize on testing data. To this end, we propose the Equivariant COntrastive framework, which closes the domain gap before contrasting, further introduces the equivariance property, and enables pre-training networks under more perturbation types to obtain meaningful features. Specifically, to close the domain gap, a pre-trained VAE is adopted to convert perturbed point clouds into less perturbed point embeddings of similar domains and separated perturbation embeddings. The contrastive pairs can then be generated by mixing the point embeddings with different perturbation embeddings. Moreover, to pursue the equivariance property, a Vector Quantizer is adopted during VAE training, discretizing the perturbation embeddings into one-hot tokens which indicate the perturbation labels. By correctly predicting the perturbation labels from the perturbed point cloud, the property of equivariance can be encouraged in the learned features. Experiments on synthesized and real-world perturbed datasets show that ECO-3D outperforms most existing pre-training strategies on various downstream tasks, achieving state-of-the-art performance for many perturbation types.
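The Vector Quantizer step (discretizing perturbation embeddings into one-hot tokens) is standard nearest-codebook assignment, which can be sketched as follows; the codebook size and embedding shapes are illustrative, not taken from the paper.

```python
import numpy as np

def vector_quantize(z, codebook):
    """Assign each embedding in z (N x D) to its nearest codebook entry
    (K x D); returns the quantized embeddings and one-hot tokens.
    Generic VQ sketch of the step described in the abstract."""
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # (N, K)
    idx = d2.argmin(axis=1)                     # nearest code per embedding
    one_hot = np.eye(codebook.shape[0])[idx]    # discrete perturbation tokens
    return codebook[idx], one_hot
```

The resulting one-hot tokens play the role of perturbation labels: predicting them from the perturbed input is what encourages the equivariance property in the learned features.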