Arrow Research search

Author name cluster

Yi Niu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers
2 author rows

Possible papers

9

AAAI Conference 2026 Conference Paper

Refine3D: Scene-Adaptive Reference Point Refinement for Sparse 3D Object Detection

  • Fan Li
  • Jing Lu
  • Yunlu Xu
  • Changhong Wu
  • Tao Xu
  • Zhaoyi Xiang
  • Yi Niu

Sparse query-based detectors have emerged as the dominant paradigm in camera-only 3D object detection, owing to their exceptional performance and computational efficiency. A central component of these approaches is the use of reference points, which serve as learnable spatial anchors to guide queries in localizing target objects. However, existing methods typically employ a unified set of reference points across all scenes, a design we find suboptimal for handling complex scenarios with highly imbalanced object distributions, such as road intersections or occluded environments. In this paper, we investigate the adaptability of reference points and propose Refine3D, an adaptive refinement mechanism that achieves scene-level alignment between the distribution of reference points and ground-truth objects. In particular, we introduce a novel Reference Point Distribution Loss (RPD-Loss) to ensure reference points converge globally toward object positions, and a Scene-Adaptive Refinement head (SAR-Head) that predicts dynamic offsets for each reference point. Both components can be seamlessly integrated into mainstream sparse detectors. Extensive experiments on two challenging autonomous driving datasets demonstrate that Refine3D outperforms the state-of-the-art with improved detection accuracy and robustness.

IROS Conference 2025 Conference Paper

Parameter Selections and Applications for Soft Bellows Actuators (SBAs) with Various Performance Metrics

  • Wenjing Zou
  • Zhekai Li
  • Ziting Xiao
  • Kailan Zheng
  • Chao Lin
  • Haolin Chen
  • Peifeng Yu
  • Yi Niu

Soft bellows actuators (SBAs), a particular type of soft pneumatic actuators (SPAs), are widely used in various applications, such as climbing robots, industrial grippers, and wearable devices. Despite their advantages of uniform motion and high efficiency, the design of SBAs often relies on experiential methods rather than standardized guidelines. This results in unclear optimization pathways and a misalignment between SBA performance and specific application requirements. This study identifies six critical parameters of linear pneumatic SBAs: Shore hardness (SH), number of units (N), thickness (t), mid-diameter (R_m), unit width (x), and unit depth (h). We explore how these parameters influence load capacity, displacement efficiency, and bending resistance. Experimental findings indicate that increasing SH, t, x, and h and decreasing N enhance load capacity. Moreover, increases in N, R_m, x, and h, along with decreases in SH and t, improve displacement efficiency. Furthermore, enhancing SH, t, and R_m and reducing N, x, and h strengthen bending resistance. Based on these insights, we design three types of SBAs tailored to specific tasks, which are implemented in a high-load pneumatic gripper, a high-efficiency displacement table, and a pneumatic worm-inspired climbing robot. This research contributes to the targeted design of SBAs, offering a novel approach for the effective optimization and performance prediction of particular SPAs, thereby facilitating the broader application of soft robots.

EAAI Journal 2024 Journal Article

Improving the performance of semi-supervised person Re-identification by selecting reliable unlabeled samples

  • Xinyuan Chen
  • Yi Niu
  • Fawen Du
  • Guilin Lv

In this paper, we focus on the semi-supervised person re-identification (Re-ID) task. Under the semi-supervised setting, only a subset of individuals in large-scale datasets needs to be labeled for training, reducing the manual annotation cost. To enhance recognition performance under this setting, we propose a novel three-stage, two-branch semi-supervised Re-ID framework. Within this framework, we implement a progressive training process that integrates three methods for training on unlabeled data: identification learning based on pseudo-label estimation, metric learning based on mining positive or negative pairwise relationships between samples, and feature consistency learning between views generated through various data augmentations. To make full use of the unlabeled data, these three methods are gradually added in each training stage, and their defects are avoided. Moreover, to further improve the reliability of training samples, we design two modules for selecting reliable samples that determine which method to adopt for each unlabeled sample, reducing model performance deterioration due to incorrect pseudo-labels or incorrect sample pairs. Our framework can be combined with existing supervised methods and incurs little performance cost. Moreover, different components of our framework can be used as plug-ins to integrate into existing semi-supervised Re-ID methods and improve their performance. Extensive experiments on two public Re-ID benchmarks demonstrate the effectiveness of the proposed framework.

AAAI Conference 2022 Conference Paper

PMAL: Open Set Recognition via Robust Prototype Mining

  • Jing Lu
  • Yunlu Xu
  • Hao Li
  • Zhanzhan Cheng
  • Yi Niu

Open Set Recognition (OSR) has been an emerging topic. Besides recognizing predefined classes, the system needs to reject the unknowns. Prototype learning is a promising way to handle the problem, as its ability to improve intra-class compactness of representations is much needed for discriminating between the known and the unknown. In this work, we propose a novel Prototype Mining And Learning (PMAL) framework. It has a prototype mining mechanism before the phase of optimizing the embedding space, explicitly considering two crucial properties, namely the high quality and diversity of the prototype set. Concretely, a set of high-quality candidates is first extracted from training samples based on data uncertainty learning, avoiding interference from unexpected noise. Considering the multifarious appearance of objects even within a single category, a diversity-based strategy for prototype set filtering is proposed. Accordingly, the embedding space can be better optimized to discriminate among the predefined classes and between knowns and unknowns. Extensive experiments verify the two good characteristics (i.e., high quality and diversity) embraced in prototype mining, and show the remarkable performance of the proposed framework compared to the state of the art.

AAAI Conference 2021 Conference Paper

MANGO: A Mask Attention Guided One-Stage Scene Text Spotter

  • Liang Qiao
  • Ying Chen
  • Zhanzhan Cheng
  • Yunlu Xu
  • Yi Niu
  • Shiliang Pu
  • Fei Wu

Recently, end-to-end scene text spotting has become a popular research topic due to its advantages of global optimization and high maintainability in real applications. Most methods attempt to develop various region-of-interest (RoI) operations to concatenate the detection part and the sequence recognition part into a two-stage text spotting framework. However, in such a framework, the recognition part is highly sensitive to the detected results (e.g., the compactness of text contours). To address this problem, in this paper we propose a novel Mask AttentioN Guided One-stage text spotting framework named MANGO, in which character sequences can be directly recognized without RoI operations. Concretely, a position-aware mask attention module is developed to generate attention weights on each text instance and its characters. It allows different text instances in an image to be allocated to different feature map channels, which are further grouped into a batch of instance features. Finally, a lightweight sequence decoder is applied to generate the character sequences. It is worth noting that MANGO inherently adapts to arbitrary-shaped text spotting and can be trained end-to-end with only coarse position information (e.g., a rectangular bounding box) and text annotations. Experimental results show that the proposed method achieves competitive and even new state-of-the-art performance on both regular and irregular text spotting benchmarks, i.e., ICDAR 2013, ICDAR 2015, Total-Text, and SCUT-CTW1500.

AAAI Conference 2021 Conference Paper

SPIN: Structure-Preserving Inner Offset Network for Scene Text Recognition

  • Chengwei Zhang
  • Yunlu Xu
  • Zhanzhan Cheng
  • Shiliang Pu
  • Yi Niu
  • Fei Wu
  • Futai Zou

Arbitrary text appearance poses a great challenge in scene text recognition tasks. Existing works mostly handle the problem by considering shape distortion, including perspective distortions, line curvature, and other style variations. Rectification (i.e., spatial transformers) as a preprocessing stage is one popular and extensively studied approach. However, chromatic difficulties in complex scenes have received much less attention. In this work, we introduce a new learnable, geometric-unrelated rectification, the Structure-Preserving Inner Offset Network (SPIN), which allows the color manipulation of source data within the network. This differentiable module can be inserted before any recognition architecture to ease the downstream task, giving neural networks the ability to actively transform input intensity rather than perform only spatial rectification. It can also serve as a complementary module to known spatial transformations, working with them in both independent and collaborative ways. Extensive experiments show that the proposed transformation outperforms existing rectification networks and achieves performance comparable to the state of the art.

AAAI Conference 2020 Conference Paper

Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting

  • Liang Qiao
  • Sanli Tang
  • Zhanzhan Cheng
  • Yunlu Xu
  • Yi Niu
  • Shiliang Pu
  • Fei Wu

Many approaches have recently been proposed to detect irregular scene text and have achieved promising results. However, their localization results may not well serve the subsequent text recognition part, mainly for two reasons: 1) recognizing arbitrary-shaped text is still a challenging task, and 2) prevalent non-trainable pipeline strategies between text detection and text recognition lead to suboptimal performance. To handle this incompatibility problem, in this paper we propose an end-to-end trainable text spotting approach named Text Perceptron. Concretely, Text Perceptron first employs an efficient segmentation-based text detector that learns the latent text reading order and boundary information. Then a novel Shape Transform Module (abbr. STM) is designed to transform the detected feature regions into regular morphologies without extra parameters. It unites text detection and the subsequent recognition part into a whole framework and helps the whole network achieve global optimization. Experiments show that our method achieves competitive performance on two standard text benchmarks, i.e., ICDAR 2013 and ICDAR 2015, and also clearly outperforms existing methods on the irregular text benchmarks SCUT-CTW1500 and Total-Text.

AAAI Conference 2019 Conference Paper

Segregated Temporal Assembly Recurrent Networks for Weakly Supervised Multiple Action Detection

  • Yunlu Xu
  • Chengwei Zhang
  • Zhanzhan Cheng
  • Jianwen Xie
  • Yi Niu
  • Shiliang Pu
  • Fei Wu

This paper proposes a segregated temporal assembly recurrent (STAR) network for weakly-supervised multiple action detection. The model learns from untrimmed videos with only video-level label supervision and predicts the intervals of multiple actions. Specifically, we first assemble video clips according to class labels via an attention mechanism that learns class-variable attention weights, which helps relieve noise from the background or other actions. Secondly, we build temporal relationships between actions by feeding the assembled features into an enhanced recurrent neural network. Finally, we transform the output of the recurrent neural network into the corresponding action distribution. To generate more precise temporal proposals, we design a score term called segregated temporal gradient-weighted class activation mapping (ST-GradCAM), fused with the attention weights. Experiments on the THUMOS’14 and ActivityNet1.3 datasets show that our approach outperforms the state-of-the-art weakly-supervised methods and performs on par with fully-supervised counterparts.

IJCAI Conference 2018 Conference Paper

Deep Propagation Based Image Matting

  • Yu Wang
  • Yi Niu
  • Peiyong Duan
  • Jianwei Lin
  • Yuanjie Zheng

In this paper, we propose a deep propagation based image matting framework by introducing deep learning into learning an alpha matte propagation principle. Our deep learning architecture is a concatenation of a deep feature extraction module, an affinity learning module, and a matte propagation module. These three modules are all differentiable and can be optimized jointly via an end-to-end training process. Our framework results in a semantic-level pairwise similarity of pixels for propagation by learning deep image representations adapted to matte propagation. It combines the power of deep learning and matte propagation and can therefore surpass prior state-of-the-art matting techniques in terms of both accuracy and training complexity, as validated by our experimental results on 243K images created from two benchmark matting databases.