Author name cluster

Xiushan Nie

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers

2 author rows

AAAI Conference 2026 Conference Paper

PEOCH: Online Cross-Modal Hashing with Semi-Supervised Streaming Data Driving Prototype Evolution

Xiao Kang
Xingbo Liu
Shuo Pan
Xuening Zhang
Xiushan Nie
Yilong Yin

The exponential growth of streaming multi-modal data presents critical challenges for cross-modal retrieval: distribution shifts, modality gap, and scarce labels. Semi-supervised online cross-modal hashing has gained increasing interest due to its ability to encode complex streaming data and update hash functions simultaneously. Nevertheless, existing methods can hardly generate high-quality unsupervised hash codes, which fundamentally limits diversity and flexibility during the retrieval process. To this end, we propose a novel method named Prototype Evolution Online Cross-modal Hashing (PEOCH). By driving prototype evolution with semi-supervised streaming data, precise and stable hash codes are generated for both labeled and unlabeled data. Specifically, two prototype updates with stability guarantee are conducted: labeled samples push semantic knowledge into the supervised prototypes, while unlabeled samples perform clustering to generate unsupervised prototypes. Simultaneously, a co-optimization mechanism is designed to ensure the prototypes continuously evolve and preserve the consistency of the entire streaming data. Besides, an elasticity regularizer integrates discriminability and smoothness constraints, improving the reliability of prototypes. Extensive experiments on three benchmark datasets demonstrate that PEOCH outperforms state-of-the-art methods, achieving an average improvement of 6.7% in mAP@all across various retrieval tasks.
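
The abstract reports its gains in mAP@all. As a reminder of what that metric measures, here is a minimal sketch of mean average precision computed over full ranked lists. The function names and the 0/1 relevance-list input format are illustrative assumptions, not taken from the paper, and the paper's own evaluation protocol may differ in detail.

```python
def average_precision(relevance):
    """Average precision for one query.

    `relevance` is the full ranked result list as 0/1 flags
    (1 = the item at that rank is relevant to the query).
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            # Precision measured at each rank where a relevant item appears.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(ranked_relevance_lists):
    """Mean of per-query average precision over all queries."""
    aps = [average_precision(r) for r in ranked_relevance_lists]
    return sum(aps) / len(aps) if aps else 0.0
```

A query whose relevant items appear at ranks 1 and 3 scores (1/1 + 2/3) / 2, so ranking relevant items earlier raises the score.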


ECAI Conference 2025 Conference Paper

Binary Continual Stream-View Clustering

Wen Xue
Xingbo Liu
Kang Xiao
Xuening Zhang
Xiushan Nie

Multi-view clustering is valued for uncovering latent common semantics lying in multi-view data, which has been a hot topic in unsupervised learning. However, when dealing with incremental streaming views, existing approaches typically require reconstructing the view data and aggregating streaming representations, leading to misalignment between representation and clusters. More importantly, conducting the clustering process frequently results in significant time consumption. To address these issues, we propose a novel method called Binary Continual Stream-View Clustering (BCSVC). Specifically, we design a continual clustering method that seamlessly unifies streaming representation learning and cluster assignment within a single framework. We also introduce a variance-weighted center updating mechanism to smooth the frequent clustering operation and absorb the semantics of previous views. In addition, to reduce the time and space expenditure on computation and storage, binary code for clustering representations is introduced, which can also significantly improve the computational efficiency of continuous updates in streaming scenarios. Last but not least, comprehensive theoretical analysis and extensive experimental results demonstrate its superior performance under various scenarios.
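
The abstract attributes the time and storage savings to binary codes. A minimal sketch of why such codes are cheap to store and compare follows: a real-valued representation is thresholded by sign into a bit-packed integer, and cluster assignment reduces to an XOR and a bit count. The sign thresholding and all function names are assumptions made for illustration, not the BCSVC procedure itself.

```python
def binarize(vector):
    """Pack a real-valued vector into an integer code, one bit per dimension.

    Assumption: a dimension maps to bit 1 when its value is >= 0.
    """
    code = 0
    for value in vector:
        code = (code << 1) | (1 if value >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def assign_cluster(code, centers):
    """Index of the binary center closest to `code` in Hamming distance."""
    return min(range(len(centers)), key=lambda i: hamming(code, centers[i]))
```

For example, `assign_cluster(binarize([0.5, -1.0, 2.0, -0.1]), [0b0101, 0b1000])` picks the second center, which differs from the code in a single bit.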


AAAI Conference 2025 Conference Paper

Generalized Debiased Semi-Supervised Hashing for Large-Scale Image Retrieval

Xingbo Liu
Xuening Zhang
Xiushan Nie
Yang Shi
Yilong Yin

Semi-supervised hashing has shown promising efficacy in large-scale image retrieval, which learns similarity-preserving codes from both labeled and unlabeled data. To enable the use of advanced supervised hashing techniques, pseudo labels are widely applied. However, existing methods typically suffer from a biased learning issue due to pseudo label noise, which can be further aggravated during optimization. Although such a bias can adversely affect hashing accuracy, it has not been investigated sufficiently. In view of this, we present a comprehensive discussion on potential causes of biases, involving processes of pseudo-labeling, hash learning and optimization. Accordingly, a novel Generalized Debiased Semi-supervised Hashing (GDSH) method is proposed as a unified solution to mitigate the biases. Specifically, reliable pseudo labels are first predicted via a robust label completion strategy. Secondly, a debiased hash learning module is designed by combining label denoising and similarity updating. This can not only refine the supervision, but also obtain hash codes that are semantically debiased in both category and sample levels. Finally, a discrete semi-supervised hashing algorithm is proposed to alleviate the bias arising from optimization. Experimental results on three single-label and three multi-label image benchmarks demonstrate that GDSH remarkably outperforms the state-of-the-arts in different semi-supervised settings.


AAAI Conference 2025 Conference Paper

Semi-Supervised Online Cross-Modal Hashing

Xiao Kang
Xingbo Liu
Xuening Zhang
Wen Xue
Xiushan Nie
Yilong Yin

Online cross-modal hashing has gained increasing interest due to its ability to encode streaming data and update hash functions simultaneously. Existing online methods often assume either fully supervised or completely unsupervised settings. However, they overlook the prevalent and challenging scenario of semi-supervised cross-modal streaming data, where diverse data types, including labeled/unlabeled, paired/unpaired, and multi-modal, are intertwined. To address this issue, we propose Semi-Supervised Online Cross-modal Hashing (SSOCH). It presents an alignment-free pseudo-labeling strategy that extracts semantic information from unlabeled streaming data without relying on pairing relations. Furthermore, we design an online tri-consistent preserving scheme, integrating pseudo-labeled data regularization, discriminative label embedding, and fine-grained similarity preservation. This scheme fully explores consistency across data annotation, modalities, and streaming chunks, improving the model's adaptiveness in these challenging scenarios. Extensive experiments on benchmark datasets demonstrate the superiority of SSOCH under various scenarios, highlighting the importance of semi-supervised learning for online cross-modal hashing.


IJCAI Conference 2025 Conference Paper

Towards Region-Adaptive Feature Disentanglement and Enhancement for Small Object Detection

Yanchao Bi
Yang Ning
Xiushan Nie
Xiankai Lu
Yongshun Gong
Leida Li

Current feature fusion strategies often fail to adequately account for the influence of activation intensity across different scales on small object features, which impedes the effective detection of small objects. To address this limitation, we propose the Region-Adaptive Feature Disentanglement and Enhancement (RAFDE) strategy, which improves both downsampling and feature fusion by leveraging activation intensity variations at multiple scales. First, we introduce the Boundary Transitional Region-enhanced Downsampling (BTRD) module, which enhances boundary transitional regions containing both strongly and weakly activated features, thereby mitigating the loss of crucial boundary information for small objects. Second, we present the Regional-Adaptive Feature Fusion (RAFF) module, which adaptively disentangles and fuses co-activated and uni-activated regions from adjacent levels into the current level, effectively reducing the risk of small objects being overwhelmed. Extensive experiments on several public datasets demonstrate that the RAFDE strategy is highly effective and outperforms state-of-the-art methods. The code is available at https://github.com/b-yanchao/RAFDE.git.


JBHI Journal 2024 Journal Article

Biomarkers-Aware Asymmetric Bibranch GAN With Adaptive Memory Batch Normalization for Prediction of Anti-VEGF Treatment Response in Neovascular Age-Related Macular Degeneration

Peng Zhao
Xian Song
Xiaoming Xi
Xiushan Nie
Xianjing Meng
Yi Qu
Yilong Yin

The emergence of anti-vascular endothelial growth factor (anti-VEGF) therapy has revolutionized the treatment of neovascular age-related macular degeneration (nAMD). Post-therapeutic optical coherence tomography (OCT) imaging facilitates the prediction of therapeutic response to anti-VEGF therapy for nAMD. Although the generative adversarial network (GAN) is a popular generative model for post-therapeutic OCT image generation, it is realistically challenging to gather sufficient pre- and post-therapeutic OCT image pairs, resulting in overfitting. Moreover, the available GAN-based methods ignore local details, such as the biomarkers that are essential for nAMD treatment. To address these issues, a Biomarkers-aware Asymmetric Bibranch GAN (BAABGAN) is proposed to efficiently generate post-therapeutic OCT images. Specifically, one branch, termed the source branch, is developed to learn prior knowledge with a high degree of transferability from large-scale data. The source branch then transfers knowledge to another branch, termed the target branch, which is trained on small-scale paired data. To boost the transferability, a novel Adaptive Memory Batch Normalization (AMBN) is introduced in the source branch, which learns more effective global knowledge that is impervious to noise via a memory mechanism. In addition, a novel Adaptive Biomarkers-aware Attention (ABA) module is proposed to encode biomarker information into the latent features of the target branch to learn finer local details of biomarkers. Experimental results show that the proposed method outperforms traditional GAN models and can produce high-quality post-treatment OCT images with limited data.


AAAI Conference 2023 Conference Paper

Exposing the Self-Supervised Space-Time Correspondence Learning via Graph Kernels

Zheyun Qin
Xiankai Lu
Xiushan Nie
Yilong Yin
Jianbing Shen

Self-supervised space-time correspondence learning is emerging as a promising way of leveraging unlabeled video. Currently, most methods adapt contrastive learning with mining negative samples or reconstruction adapted from the image domain, which requires dense affinity across multiple frames or optical flow constraints. Moreover, video correspondence predictive models require mining more inherent properties in videos, such as structural information. In this work, we propose the VideoHiGraph, a space-time correspondence framework based on a learnable graph kernel. Concerning the video as the spatial-temporal graph, the learning objectives of VideoHiGraph are emanated in a self-supervised manner for predicting unobserved hidden graphs via graph kernel manner. We learn a representation of the temporal coherence across frames in which pairwise similarity defines the structured hidden graph, such that a biased random walk graph kernel along the sub-graph can predict long-range correspondence. Then, we learn a refined representation across frames on the node-level via a dense graph kernel. The self-supervision of the model training is formed by the structural and temporal consistency of the graph. VideoHiGraph achieves superior performance and demonstrates its robustness across the benchmark of label propagation tasks involving objects, semantic parts, keypoints, and instances. Our algorithm implementations have been made publicly available at https://github.com/zyqin19/VideoHiGraph.


NeurIPS Conference 2023 Conference Paper

Unified 3D Segmenter As Prototypical Classifiers

Zheyun Qin
Cheng Han
Qifan Wang
Xiushan Nie
Yilong Yin
Lu Xiankai

The task of point cloud segmentation, comprising semantic, instance, and panoptic segmentation, has been mainly tackled by designing task-specific network architectures, which often lack the flexibility to generalize across tasks, thus resulting in a fragmented research landscape. In this paper, we introduce ProtoSEG, a prototype-based model that unifies semantic, instance, and panoptic segmentation tasks. Our approach treats these three homogeneous tasks as a classification problem with different levels of granularity. By leveraging a Transformer architecture, we extract point embeddings to optimize prototype-class distances and dynamically learn class prototypes to accommodate the end tasks. Our prototypical design enjoys simplicity and transparency, powerful representational learning, and ad-hoc explainability. Empirical results demonstrate that ProtoSEG outperforms concurrent well-known specialized architectures on 3D point cloud benchmarks, achieving 72.3%, 76.4% and 74.2% mIoU for semantic segmentation on S3DIS, ScanNet V2 and SemanticKITTI, 66.8% mCov and 51.2% mAP for instance segmentation on S3DIS and ScanNet V2, 62.4% PQ for panoptic segmentation on SemanticKITTI, validating the strength of our concept and the effectiveness of our algorithm. The code and models are available at https://github.com/zyqin19/PROTOSEG.
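
The idea of treating segmentation as classification against class prototypes can be illustrated with a nearest-prototype classifier. The sketch below is only an illustration under simplifying assumptions: it swaps in per-class mean vectors for the prototypes that ProtoSEG learns with a Transformer, uses plain Euclidean distance, and all names are hypothetical.

```python
import math


def class_means(points, labels):
    """Build one prototype per class as the mean of its embeddings.

    Assumption: class means stand in for learned prototypes.
    """
    sums, counts = {}, {}
    for point, label in zip(points, labels):
        if label not in sums:
            sums[label] = [0.0] * len(point)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], point)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def nearest_prototype(embedding, prototypes):
    """Label of the prototype closest to `embedding` in Euclidean distance."""
    return min(prototypes,
               key=lambda label: math.dist(embedding, prototypes[label]))
```

Changing what a label denotes (a semantic class or an individual object instance) changes the granularity of the task without changing the classification rule, which is the sense in which one prototype classifier can serve several segmentation tasks.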


JBHI Journal 2022 Journal Article

Learning Binary Semantic Embedding for Large-Scale Breast Histology Image Analysis

Xingbo Liu
Xiao Kang
Xiushan Nie
Jie Guo
Shaohua Wang
Yilong Yin

With the progress of clinical imaging innovation and machine learning, the computer-assisted diagnosis of breast histology images has attracted broad attention. Nonetheless, the use of computer-assisted diagnoses has been blocked due to the incomprehensibility of customary classification models. In view of this question, we propose a novel method for Learning Binary Semantic Embedding (LBSE). In this study, bit balance and uncorrelation constraints, double supervision, discrete optimization and asymmetric pairwise similarity are seamlessly integrated for learning binary semantic-preserving embedding. Moreover, a fusion-based strategy is carefully designed to handle the intractable problem of parameter setting, saving huge amounts of time for boundary tuning. Based on the above-mentioned proficient and effective embedding, classification and retrieval are simultaneously performed to give interpretable image-based deduction and model helped conclusions for breast histology images. Extensive experiments are conducted on three benchmark datasets to approve the predominance of LBSE in different situations.


AAAI Conference 2020 Short Paper

Focusing on Detail: Deep Hashing Based on Multiple Region Details (Student Abstract)

Quan Zhou
Xiushan Nie
Yang Shi
Xingbo Liu
Yilong Yin

Hashing, which aims to convert multimedia data into a set of short binary codes while preserving the similarity of the original data, has been widely studied in recent years for its fast retrieval and high performance. The majority of existing deep supervised hashing methods utilize only the semantics of a whole image in learning hash codes and ignore the local image details, which are important in hash learning. To fully utilize the detailed information, we propose a novel deep multi-region hashing (DMRH) method, which learns hash codes from local regions and obtains the final hash codes of the image by fusing the local hash codes corresponding to those regions. In addition, we propose a self-similarity loss term to address the imbalance problem of methods based on pairwise similarity (i.e., the number of dissimilar pairs is significantly larger than the number of similar ones).
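
The pairwise imbalance mentioned in the abstract is easy to quantify. The sketch below counts similar and dissimilar pairs for a set of single-label samples; the function name and the balanced 10-class toy setup are assumptions chosen for illustration and do not come from the paper.

```python
from itertools import combinations


def pair_counts(labels):
    """Count (similar, dissimilar) pairs, where similar means equal labels."""
    similar = dissimilar = 0
    for a, b in combinations(labels, 2):
        if a == b:
            similar += 1
        else:
            dissimilar += 1
    return similar, dissimilar


# Toy setup: 100 samples spread evenly over 10 classes.
toy_labels = [i % 10 for i in range(100)]
toy_similar, toy_dissimilar = pair_counts(toy_labels)
```

In this toy setup only 450 of the 4950 pairs are similar, so a loss summed over all pairs is dominated by dissimilar ones unless it is reweighted or supplemented.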


AAAI Conference 2019 Short Paper

Jointly Multiple Hash Learning

Xingbo Liu
Xiushan Nie
Yingxin Wang
Yilong Yin

Hashing can compress heterogeneous high-dimensional data into compact binary codes while preserving the similarity to facilitate efficient retrieval and storage, and thus hashing has recently received much attention from information retrieval researchers. Most of the existing hashing methods first predefine a fixed length (e.g., 32, 64, or 128 bit) for the hash codes before learning them with this fixed length. However, one sample can be represented by various hash codes with different lengths, and thus there must be some associations and relationships among these different hash codes because they represent the same sample. Therefore, harnessing these relationships will boost the performance of hashing methods. Inspired by this possibility, in this study, we propose a new model jointly multiple hash learning (JMH), which can learn hash codes with multiple lengths simultaneously. In the proposed JMH method, three types of information are used for hash learning, which come from hash codes with different lengths, the original features of the samples and label. In contrast to the existing hashing methods, JMH can learn hash codes with different lengths in one step. Users can select appropriate hash codes for their retrieval tasks according to the requirements in terms of accuracy and complexity. To the best of our knowledge, JMH is one of the first attempts to learn multi-length hash codes simultaneously. In addition, in the proposed model, discrete and closed-form solutions for variables can be obtained by cyclic coordinate descent, thereby making the proposed model much faster during training. Extensive experiments were performed based on three benchmark datasets and the results demonstrated the superior performance of the proposed method.
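
Retrieval with binary hash codes, whatever their length, typically comes down to ranking database items by Hamming distance to the query code. The sketch below shows that ranking step only, with codes stored as Python integers; it does not implement JMH's learning procedure, and the function names are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two integer hash codes."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code, database_codes):
    """Database indices sorted by Hamming distance to the query.

    Python's sort is stable, so ties keep their original database order.
    """
    return sorted(range(len(database_codes)),
                  key=lambda i: hamming(query_code, database_codes[i]))
```

Shorter codes make each comparison and each stored item cheaper but leave fewer distinct distance values to separate items, which is the accuracy versus complexity trade-off the abstract leaves to the user.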


IJCAI Conference 2019 Conference Paper

Supervised Short-Length Hashing

Xingbo Liu
Xiushan Nie
Quan Zhou
Xiaoming Xi
Lei Zhu
Yilong Yin

Hashing can compress high-dimensional data into compact binary codes, while preserving the similarity, to facilitate efficient retrieval and storage. However, when retrieving using an extremely short-length hash code learned by the existing methods, the performance cannot be guaranteed because of severe information loss. To address this issue, in this study, we propose a novel supervised short-length hashing (SSLH) method. In the proposed SSLH, mutual reconstruction between the short-length hash codes and original features is performed to reduce semantic loss. Furthermore, to enhance the robustness and accuracy of the hash representation, a robust estimator term is added to fully utilize the label information. Extensive experiments conducted on four image benchmarks demonstrate the superior performance of the proposed SSLH with short-length hash codes. In addition, the proposed SSLH outperforms the existing methods with long-length hash codes. To the best of our knowledge, this is the first linear-based hashing method that focuses on both short- and long-length hash codes for maintaining high precision.
