Arrow Research

Author name cluster

Ruibin Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
2 author rows

Possible papers (5)

ICRA 2024 · Conference Paper

Masked Local-Global Representation Learning for 3D Point Cloud Domain Adaptation

  • Bowei Xing
  • Xianghua Ying
  • Ruibin Wang

The point cloud is a popular and widely used geometric representation that has attracted significant attention in 3D vision. However, the geometric variability of point cloud representations across different datasets can cause domain discrepancies, which hinder knowledge transfer and model generalization, resulting in degraded performance in the target domain. In this paper, we present a novel approach to improve point cloud domain adaptation by employing masked representation learning in a self-supervised manner. Specifically, our method combines masked feature prediction and masked sample consistency to encode both local structure and global semantic information, learning point cloud representations that are invariant across domains. Moreover, to learn domain-specific representations and transfer knowledge from source to target, we propose prototype-calibrated self-training. By exploiting class-wise prototypes in the shared feature space, the soft pseudo labels can be adaptively denoised, which benefits decision boundary learning in the target domain. We conduct experiments on PointDA-10 and PointSegDA for 3D point cloud shape classification and semantic segmentation, respectively. The results demonstrate the effectiveness of our method and show that it achieves new state-of-the-art performance on point cloud domain adaptation.
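
As a concrete illustration of the prototype-calibration idea in this abstract, the sketch below denoises soft pseudo labels by reweighting classifier predictions with prototype similarities. It is a minimal, generic reconstruction: the function name, the multiplicative calibration rule, and the temperature are assumptions, not the authors' reference implementation.

```python
# A minimal sketch of prototype-calibrated pseudo-labeling, assuming a
# multiplicative calibration rule; names and hyperparameters here are
# illustrative, not the authors' reference implementation.
import torch
import torch.nn.functional as F

def prototype_calibrated_pseudo_labels(feats, logits, prototypes, temperature=0.1):
    """Denoise soft pseudo labels with class-wise prototypes.

    feats:      (N, D) target-domain features
    logits:     (N, C) classifier outputs for the same samples
    prototypes: (C, D) class-wise prototypes in the shared feature space
    """
    probs = F.softmax(logits, dim=-1)                          # (N, C)
    # Similarity of each feature to every class prototype.
    z = F.normalize(feats, dim=-1)
    p = F.normalize(prototypes, dim=-1)
    proto_probs = F.softmax(z @ p.t() / temperature, dim=-1)   # (N, C)
    # Calibrate: classes where classifier and prototype assignment agree
    # are sharpened, disagreements are suppressed (one plausible rule).
    calibrated = probs * proto_probs
    return calibrated / calibrated.sum(dim=-1, keepdim=True)

# Toy usage: prototypes would normally be running class means of source
# (or confident target) features.
soft_labels = prototype_calibrated_pseudo_labels(
    torch.randn(32, 256), torch.randn(32, 10), torch.randn(10, 256))
```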

AAAI 2024 · Conference Paper

VPDETR: End-to-End Vanishing Point DEtection TRansformers

  • Taiyan Chen
  • Xianghua Ying
  • Jinfa Yang
  • Ruibin Wang
  • Ruohao Guo
  • Bowei Xing
  • Ji Shi

In the field of vanishing point detection, previous works commonly relied on extracting and clustering straight lines, or on classifying candidate points as vanishing points. This paper proposes a novel end-to-end framework, called VPDETR (Vanishing Point DEtection TRansformer), that views vanishing point detection as a set prediction problem and is applicable to both Manhattan and non-Manhattan world datasets. By using the positional embeddings of anchor points as queries in the Transformer decoder and dynamically updating them layer by layer, our method can take images as input and directly output their vanishing points, without explicit straight line extraction or candidate point sampling. Additionally, we introduce an orthogonal loss and a cross-prediction loss to improve accuracy on Manhattan world datasets. Experimental results demonstrate that VPDETR achieves competitive performance compared to state-of-the-art methods, without requiring post-processing.
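
The decoding loop this abstract describes, with anchor-point positional embeddings as queries that are refined layer by layer, could be sketched as follows. Everything here (module sizes, the sinusoidal embedding, the DAB-DETR-style sigmoid-space anchor update) is an assumption for illustration, not the published VPDETR architecture.

```python
# Illustrative DETR-style decoder with dynamically updated anchor-point
# queries, in the spirit of the description above; all details are assumed.
import torch
import torch.nn as nn

def sine_embed(points, dim=256):
    """Sinusoidal positional embedding of 2D anchor points in [0, 1]^2."""
    half = dim // 4
    freqs = 1e4 ** (-torch.arange(half, dtype=torch.float32) / half)
    x = points[..., 0:1] * freqs
    y = points[..., 1:2] * freqs
    return torch.cat([x.sin(), x.cos(), y.sin(), y.cos()], dim=-1)

def inv_sigmoid(x, eps=1e-5):
    x = x.clamp(eps, 1 - eps)
    return (x / (1 - x)).log()

class AnchorQueryDecoder(nn.Module):
    def __init__(self, dim=256, num_layers=6, num_queries=8):
        super().__init__()
        self.anchors = nn.Parameter(torch.rand(num_queries, 2))  # learnable anchors
        self.layers = nn.ModuleList(
            nn.TransformerDecoderLayer(dim, nhead=8, batch_first=True)
            for _ in range(num_layers))
        self.refine = nn.Linear(dim, 2)  # per-layer anchor offset head
        self.head = nn.Linear(dim, 2)    # final vanishing point regression

    def forward(self, memory):           # memory: (B, HW, dim) image features
        B, dim = memory.size(0), memory.size(-1)
        anchors = self.anchors.unsqueeze(0).expand(B, -1, -1)    # (B, Q, 2)
        tgt = torch.zeros(B, anchors.size(1), dim)
        for layer in self.layers:
            # Queries are the positional embeddings of the current anchors.
            tgt = layer(tgt + sine_embed(anchors, dim), memory)
            # Dynamically update the anchor points layer by layer.
            anchors = (inv_sigmoid(anchors) + self.refine(tgt)).sigmoid()
        return self.head(tgt)            # (B, Q, 2) predicted vanishing points

points = AnchorQueryDecoder()(torch.randn(2, 196, 256))  # (2, 8, 2)
```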

AAAI 2023 · Conference Paper

Cross-Modal Contrastive Learning for Domain Adaptation in 3D Semantic Segmentation

  • Bowei Xing
  • Xianghua Ying
  • Ruibin Wang
  • Jinfa Yang
  • Taiyan Chen

Domain adaptation for 3D point clouds has attracted a lot of interest since it can, to some extent, avoid the time-consuming labeling process for 3D data. A recent work named xMUDA applied multi-modal data to the domain adaptation task of 3D semantic segmentation by mimicking predictions between the 2D and 3D modalities, and it outperformed previous single-modality methods that use only point clouds. Building on it, in this paper we propose a novel cross-modal contrastive learning scheme to further improve the adaptation effect. By employing constraints derived from the correspondences between 2D pixel features and 3D point features, our method not only facilitates interaction between the two modalities, but also boosts feature representations in both the labeled source domain and the unlabeled target domain. Meanwhile, to sufficiently utilize 2D context information for domain adaptation through cross-modal learning, we introduce a neighborhood feature aggregation module to enhance pixel features. The module employs neighborhood attention to aggregate nearby pixels in the 2D image, which relieves the mismatch between the two modalities that arises from projecting a relatively sparse point cloud onto dense image pixels. We evaluate our method on three unsupervised domain adaptation scenarios: country-to-country, day-to-night, and dataset-to-dataset. Experimental results show that our approach outperforms existing methods, which demonstrates its effectiveness.
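
As an illustration of the 2D-3D correspondence constraint, a symmetric InfoNCE loss over matched pixel/point feature pairs might look like the sketch below, assuming the pairs were obtained beforehand by projecting each point into the image. The loss form and temperature are generic assumptions, not the paper's exact objective.

```python
# A generic cross-modal InfoNCE sketch over matched 2D pixel / 3D point
# feature pairs; not the paper's exact loss.
import torch
import torch.nn.functional as F

def cross_modal_nce(pixel_feats, point_feats, temperature=0.07):
    """pixel_feats, point_feats: (N, D); row i of each is a matched 2D-3D
    pair obtained by projecting point i into the image."""
    z2d = F.normalize(pixel_feats, dim=-1)
    z3d = F.normalize(point_feats, dim=-1)
    logits = z2d @ z3d.t() / temperature   # (N, N), matches on the diagonal
    targets = torch.arange(z2d.size(0))
    # Symmetric loss: pixels attract their points and vice versa, while
    # non-corresponding pairs are pushed apart.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

loss = cross_modal_nce(torch.randn(128, 64), torch.randn(128, 64))
```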

AAAI 2023 · Conference Paper

ECO-3D: Equivariant Contrastive Learning for Pre-training on Perturbed 3D Point Cloud

  • Ruibin Wang
  • Xianghua Ying
  • Bowei Xing
  • Jinfa Yang

In this work, we investigate contrastive learning on perturbed point clouds and find that the contrasting process may widen the domain gap caused by random perturbations, making the pre-trained network fail to generalize on test data. To this end, we propose the Equivariant COntrastive framework (ECO-3D), which closes the domain gap before contrasting, further introduces the equivariance property, and enables pre-training networks under more perturbation types to obtain meaningful features. Specifically, to close the domain gap, a pre-trained VAE is adopted to convert perturbed point clouds into less-perturbed point embeddings from similar domains, together with separate perturbation embeddings. Contrastive pairs can then be generated by mixing the point embeddings with different perturbation embeddings. Moreover, to pursue the equivariance property, a Vector Quantizer is adopted during VAE training, discretizing the perturbation embeddings into one-hot tokens that indicate the perturbation labels. By correctly predicting the perturbation labels from the perturbed point cloud, equivariance can be encouraged in the learned features. Experiments on synthesized and real-world perturbed datasets show that ECO-3D outperforms most existing pre-training strategies on various downstream tasks, achieving state-of-the-art performance under many perturbation types.
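
The discretization step this abstract mentions, a vector quantizer that turns the continuous perturbation embedding into tokens usable as perturbation labels, might look like the standard VQ-VAE-style sketch below; the codebook size, straight-through trick, and loss weights are assumptions, not ECO-3D's actual code.

```python
# A standard VQ-VAE-style quantization step for the perturbation embedding;
# an assumed sketch, not the ECO-3D implementation.
import torch
import torch.nn.functional as F

def vector_quantize(perturb_embed, codebook, beta=0.25):
    """perturb_embed: (N, D) continuous perturbation embeddings
    codebook:      (K, D) code vectors acting as perturbation tokens"""
    dists = torch.cdist(perturb_embed, codebook)   # (N, K) pairwise distances
    tokens = dists.argmin(dim=-1)                  # discrete perturbation labels
    quantized = codebook[tokens]                   # (N, D) snapped embeddings
    # Straight-through estimator: gradients bypass the argmin.
    quantized_st = perturb_embed + (quantized - perturb_embed).detach()
    # Codebook loss pulls codes to embeddings; commitment loss does the reverse.
    vq_loss = (F.mse_loss(quantized, perturb_embed.detach())
               + beta * F.mse_loss(perturb_embed, quantized.detach()))
    return quantized_st, tokens, vq_loss

q, tokens, vq_loss = vector_quantize(torch.randn(32, 128), torch.randn(16, 128))
```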

IJCAI 2021 · Conference Paper

Towards Cross-View Consistency in Semantic Segmentation While Varying View Direction

  • Xin Tong
  • Xianghua Ying
  • Yongjie Shi
  • He Zhao
  • Ruibin Wang

Several images of the same scene are often taken from many different view directions. Given a pixel in any one of these images, its correspondences may appear in the other images. However, using existing semantic segmentation methods, we find that the pixel and its correspondences do not always receive the same inferred label, as one would expect. Fortunately, from multiple-view geometry we know that if the position of a camera is kept unchanged and only its orientation varies, a homography transformation describes the relationship between corresponding pixels in such images. Based on this fact, we propose to generate images that match real images of the scene taken from certain novel view directions, for both training and evaluation. We also introduce gradient-guided deformable convolution to alleviate the inconsistency by learning a dynamically adapted receptive field from feature gradients. Furthermore, a novel consistency loss is presented to enforce feature consistency. Compared with previous approaches, the proposed method achieves significant improvements in both cross-view consistency and semantic segmentation performance on images with abundant view directions, while keeping comparable or better performance on existing datasets.
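
The geometric fact this abstract builds on is classical: when a camera rotates about its center without translating, corresponding pixels in the two views are related by the homography H = K R K^(-1), where K is the intrinsic matrix and R the relative rotation between the camera frames. A small worked example with made-up camera parameters:

```python
# Pure-rotation homography x' ~ K R K^(-1) x; the intrinsics and rotation
# below are illustrative values, not taken from the paper.
import numpy as np

K = np.array([[800.0,   0.0, 320.0],   # focal lengths and principal point
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

theta = np.deg2rad(5.0)                # 5-degree relative rotation about y
R = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
              [           0.0, 1.0,           0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])

H = K @ R @ np.linalg.inv(K)           # homography between the two views

x = np.array([400.0, 300.0, 1.0])      # a pixel in the first image (homogeneous)
x_prime = H @ x
x_prime /= x_prime[2]                  # normalize back to pixel coordinates
print(x_prime[:2])                     # its correspondence in the rotated view
```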