EAAI Journal 2026 Journal Article
End-to-end railway obstacle detection enhanced by point cloud segmentation
- Yuxing Yang
- Bowen Zhang
- Boyu Yang
- Kaizhong Xiao
- Xiaolong Tuo
- Yang Li
- Liewei Wang
- Siyue Yu
Author name cluster
Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.
EAAI Journal 2026 Journal Article
AAAI Conference 2025 Conference Paper
Training semantic segmenters with synthetic data has attracted great attention because such data is easy to obtain and available in large quantities. Most previous methods focus on producing large-scale synthetic image-annotation samples and then training the segmenter on all of them. However, this approach faces a major challenge: poor-quality samples are unavoidable, and training on them damages the training process. In this paper, we propose a training-free Synthetic Data Selection (SDS) strategy with CLIP to select high-quality samples for building a reliable synthetic dataset. Specifically, given massive synthetic image-annotation pairs, we first design a Perturbation-based CLIP Similarity (PCS) to measure the reliability of each synthetic image, thus removing samples with low-quality images. Then we propose a class-balance Annotation Similarity Filter (ASF) that compares the synthetic annotation with the response of CLIP to remove samples with low-quality annotations. The experimental results show that our method reduces the data size by half, while the trained segmenter achieves higher performance.
IJCAI Conference 2025 Conference Paper
This paper tackles the challenge of anomaly image synthesis and segmentation, generating varied anomaly images and their segmentation labels to mitigate data scarcity. Existing approaches employ precise masks to guide the generation and rely on additional mask generators, leading to increased computational costs and limited anomaly diversity. Although a few works use coarse masks as guidance to expand diversity, they lack effective generation of labels for synthetic images, thereby reducing their practicality. Therefore, our proposed method simultaneously generates anomaly images and their corresponding masks by utilizing coarse masks and anomaly categories. The framework utilizes attention maps from the synthesis process as mask labels and employs two optimization modules to tackle drift challenges, which are mismatches between synthetic results and real situations. Our evaluation demonstrates that our method improves pixel-level AP by 1.3% and F1-MAX by 1.8% in anomaly detection tasks on the MVTec dataset. Additionally, its successful application in practical scenarios highlights its effectiveness, improving IoU by 37.2% and F-measure by 25.1% on the Floor Dirt dataset. The code is available at https://github.com/JJessicaYao/DriftRemover.
ECAI Conference 2024 Conference Paper
Weakly supervised semantic segmentation has attracted a lot of attention recently. Previous methods can be divided into two types: single-stage training and multi-stage training. In this paper, we focus on multi-stage training for image-level weakly supervised semantic segmentation. Many recent methods have tried to use a transformer architecture as the backbone for CAM generation since it can capture global relationships to refine CAM accurately. However, we observe that such a backbone still fails to generate complete and smooth CAM. We argue that this is because the attention mechanism in the transformer can only pay attention to the most discriminative relationships. It is difficult to capture semantic-level long-range pair-wise relationships under image-level supervision. Thus, we propose an adversarial erasing transformer network called AETN, where an erasing attention mechanism is designed to establish more extensive pair-wise relationships. To cope with erasing, more target features are forced to activate. Thus, better feature representation can be obtained for more accurate CAM generation. Besides, to further help our network learn better feature representation, we propose a self-consistent learning mechanism based on different augmentations. In this way, our AETN outperforms recent methods. Our AETN achieves 73.0 mIoU on the PASCAL VOC 2012 val set and 73.9 mIoU on the PASCAL VOC 2012 test set. Code is available at https://github.com/siyueyu/AETN.
AAAI Conference 2021 Conference Paper
Sparse labels have been attracting much attention in recent years. However, the performance gap between weakly supervised and fully supervised salient object detection methods is huge, and most previous weakly supervised works adopt complex training methods with many bells and whistles. In this work, we propose a one-round end-to-end training approach for weakly supervised salient object detection via scribble annotations without pre/post-processing operations or extra supervision data. Since scribble labels fail to offer detailed salient regions, we propose a local coherence loss to propagate the labels to unlabeled regions based on image features and pixel distance, so as to predict integral salient regions with complete object structures. We design a saliency structure consistency loss as a self-consistent mechanism to ensure that consistent saliency maps are predicted with different scales of the same image as input, which could be viewed as a regularization technique to enhance the model generalization ability. Additionally, we design an aggregation module (AGGM) to better integrate high-level features, low-level features and global context information for the decoder. Extensive experiments show that our method achieves a new state-of-the-art performance on six benchmarks (e.g. for the ECSSD dataset: Fβ = 0.8995, Eξ = 0.9079 and MAE = 0.0489), with an average gain of 4.60% for F-measure, 2.05% for E-measure and 1.88% for MAE over the previous best method on this task. Source code is available at http://github.com/siyueyu/SCWSSOD.