Author name cluster

Mingqiang Wei

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

10 papers

2 author rows

AAAI Conference 2026 Conference Paper

BridgeShape: Latent Diffusion Schrödinger Bridge for 3D Shape Completion

Dequan Kong
Honghua Chen
Zhe Zhu
Mingqiang Wei

Existing diffusion-based 3D shape completion methods typically use a conditional paradigm, injecting incomplete shape information into the denoising network via deep feature interactions (e.g., concatenation, cross-attention) to guide sampling toward complete shapes, often represented by voxel-based distance functions. However, these approaches fail to explicitly model the optimal global transport path, leading to suboptimal completions. Moreover, performing diffusion directly in voxel space imposes resolution constraints, limiting the generation of fine-grained geometric details. To address these challenges, we propose BridgeShape, a novel framework for 3D shape completion via latent diffusion Schrödinger bridge. The key innovations lie in two aspects: (i) BridgeShape formulates shape completion as an optimal transport problem, explicitly modeling the transition between incomplete and complete shapes to ensure a globally coherent transformation. (ii) We introduce a Depth-Enhanced Vector Quantized Variational Autoencoder (VQ-VAE) to encode 3D shapes into a compact latent space, leveraging self-projected multi-view depth information enriched with strong DINOv2 features to enhance geometric structural perception. By operating in a compact yet structurally informative latent space, BridgeShape effectively mitigates resolution constraints and enables more efficient and high-fidelity 3D shape completion. BridgeShape achieves state-of-the-art performance on 3D shape completion benchmarks, demonstrating superior fidelity at higher resolutions and for unseen object classes.


AAAI Conference 2026 Conference Paper

Perceive, Act and Correct: Confidence Is Not Enough for Hyperspectral Classification

Muzhou Yang
Wuzhou Quan
Mingqiang Wei

Confidence alone is often misleading in hyperspectral image classification, as models tend to mistake high predictive scores for correctness while lacking awareness of uncertainty. This leads to confirmation bias, especially under sparse annotations or class imbalance, where models overfit confident errors and fail to generalize. We propose CABIN (Cognitive-Aware Behavior-Informed learNing), a semi-supervised framework that addresses this limitation through a closed-loop learning process of perception, action, and correction. CABIN first develops perceptual awareness by estimating epistemic uncertainty, identifying ambiguous regions where errors are likely to occur. It then acts by adopting an Uncertainty-Guided Dual Sampling Strategy, selecting uncertain samples for exploration while anchoring confident ones as stable pseudo-labels to reduce bias. To correct noisy supervision, CABIN introduces a Fine-Grained Dynamic Assignment Strategy that categorizes pseudo-labeled data into reliable, ambiguous, and noisy subsets, applying tailored losses to enhance generalization. Experimental results show that a wide range of state-of-the-art methods benefit from the integration of CABIN, with improved labeling efficiency and performance.


JBHI Journal 2025 Journal Article

AI-Driven Quantitative Analysis of Pathological Images for Membranous Nephropathy Across Macro and Micro Modalities

Guangze Shi
Ying Wang
Yongfei Wu
Xueyu Liu
Jia Shen
Hao Meng
Yexin Lai
Weixia Han

The diagnosis of membranous nephropathy (MN) has been reliant on the identification of glomerular basement membrane (GBM) variations and lesions at both macro and micro levels. At the macro level, light microscopy (LM) has been used to reveal spike-like projections that indicate pathological changes, whereas at the micro level, transmission electron microscopy (TEM) has been employed to identify GBM thickening. However, qualitative diagnosis has been limited by inter-pathologist variability, creating the need for deep learning approaches capable of quantifying pathological changes and predicting MN progression. In this study, an AI-driven framework based on the Mamba model has been proposed, in which the area and proportion of spike-like projections are quantified at the macro level, and GBM thickness is segmented and measured at the micro level. Classical machine learning models are then applied to predict MN progression based on pathological indicators extracted through factor analysis. Unlike prior approaches, the framework has been designed to emulate the diagnostic workflow of pathologists by integrating LM and TEM images for joint analysis. Experiments on an external dataset of 109 cases demonstrated strong performance in glomeruli classification, GBM segmentation, and MN progression prediction. These findings highlight the potential of multi-scale integrated quantification to provide objective, reproducible, and clinically interpretable assessment of MN progression.


EAAI Journal 2025 Journal Article

Multi-task cyclical consistency learning based medical image segmentation

Le Han
Jianan Zhang
Yan Hu
Xueyu Liu
Guanghui Yue
Mingqiang Wei
Yongfei Wu

Image segmentation and image reconstruction are two of the most prominent tasks in current computer vision research, with numerous advanced models contributing to increasing task accuracy. However, most of the existing models for these tasks are trained independently, overlooking the complementary potential of these tasks during the training process. In this work, we propose a progressive segmentation refinement strategy by designing a dual-stage joint multi-task consistency learning model based on the Transformer, effectively combining the image segmentation and image reconstruction tasks to achieve fine-grained segmentation of medical images. Specifically, we present a multi-stage joint multi-task consistency learning network, which includes a shared transformer encoder and two independent transformer decoders. These decoders are responsible for image segmentation and lesion region reconstruction tasks, respectively. The image reconstruction task aids the model in learning the feature representations of lesion regions, helping to refine the segmentation boundaries and improve segmentation precision. In addition, the model leverages semi-supervised learning by computing loss on the reconstructed masked lesion regions, further enhancing the generalizability of the model. Experimental results on the Kvasir-SEG, Kvasir-Capsule, ISIC 2016, and ISIC 2018 datasets demonstrate that our method outperforms other state-of-the-art methods.


NeurIPS Conference 2025 Conference Paper

SceneSplat++: A Large Dataset and Comprehensive Benchmark for Language Gaussian Splatting

Mengjiao Ma
Qi Ma
Yue Li
Jiahuan Cheng
Runyi Yang
Bin Ren
Nikola Popovic
Mingqiang Wei

3D Gaussian Splatting (3DGS) serves as a highly performant and efficient encoding of scene geometry, appearance, and semantics. Moreover, grounding language in 3D scenes has proven to be an effective strategy for 3D scene understanding. Current work on Language Gaussian Splatting falls into three main groups: (i) per-scene optimization-based, (ii) per-scene optimization-free, and (iii) generalizable approaches. However, most of these methods are evaluated only on rendered 2D views of a handful of scenes and viewpoints close to the training views, limiting insight into their holistic 3D understanding. To address this gap, we propose the first large-scale benchmark that systematically assesses these three groups of methods directly in 3D space, evaluating on 1060 scenes across three indoor datasets and one outdoor dataset. Benchmark results demonstrate a clear advantage of the generalizable paradigm, particularly in relaxing the scene-specific limitation, enabling fast feed-forward inference on novel scenes, and achieving superior segmentation performance. We further introduce SceneSplat-49K, a carefully curated 3DGS dataset comprising around 49K diverse indoor and outdoor scenes trained from multiple sources, with which we demonstrate that the generalizable approach can harness strong data priors. Our code, benchmark, and datasets are available.


EAAI Journal 2024 Journal Article

A deep reinforcement learning model for dynamic job-shop scheduling problem with uncertain processing time

Xinquan Wu
Xuefeng Yan
Donghai Guan
Mingqiang Wei

The dynamic job-shop scheduling problem (DJSP) is a type of scheduling task in which rescheduling is performed when uncertainties arise, such as uncertain operation processing times. However, current deep reinforcement learning (DRL) scheduling approaches struggle to train convergent scheduling policies as the problem scale increases, even though convergence is very important for rescheduling under uncertainty. In this paper, we propose a DRL scheduling method for DJSP based on proximal policy optimization (PPO) with hybrid prioritized experience replay. The job-shop scheduling problem is formulated as a sequential decision-making problem based on a Markov Decision Process (MDP), where a novel state representation is designed based on the feasible solution matrix that depicts the scheduling order of a scheduling task, a set of paired priority dispatching rules (PDR) is used as the actions, and a new intuitive reward function is established based on the machine idle time. Moreover, a new hybrid prioritized experience replay method for PPO is proposed to reduce the training time, in which samples with positive temporal-difference (TD) error are replayed. Static experiments on classic benchmark instances show that the makespan obtained by our scheduling agent is 1.59% lower on average than that of the best-known DRL methods. In addition, dynamic experiments demonstrate that the training time of the reused scheduling policy is reduced by 27% compared with the retrained policy when encountering uncertainties such as uncertain operation processing time.


AAAI Conference 2022 Conference Paper

I Can Find You! Boundary-Guided Separated Attention Network for Camouflaged Object Detection

Hongwei Zhu
Peng Li
Haoran Xie
Xuefeng Yan
Dong Liang
Dapeng Chen
Mingqiang Wei
Jing Qin

Can you find me? By simulating how humans discover the so-called ‘perfectly’-camouflaged object, we present a novel boundary-guided separated attention network (called BSA-Net). Beyond the existing camouflaged object detection (COD) wisdom, BSA-Net utilizes two-stream separated attention modules to highlight the separator (that is, the camouflaged object’s boundary) between an image’s background and foreground: the reverse attention stream helps erase the camouflaged object’s interior to focus on the background, while the normal attention stream recovers the interior and thus pays more attention to the foreground; both streams are followed by a boundary guider module and combined to strengthen the understanding of the boundary. The core design of such separated attention is motivated by the COD procedure of humans: find the subtle difference between the foreground and background to delineate the boundary of a camouflaged object, and then use the boundary to further enhance the COD accuracy. We validate on three benchmark datasets that BSA-Net is very beneficial for detecting camouflaged objects with blurred boundaries and colors/patterns similar to their backgrounds. Extensive results show very clear COD improvements of BSA-Net over sixteen state-of-the-art methods.


AAAI Conference 2022 Conference Paper

Semantically Contrastive Learning for Low-Light Image Enhancement

Dong Liang
Ling Li
Mingqiang Wei
Shuo Yang
Liyan Zhang
Wenhan Yang
Yun Du
Huiyu Zhou

Low-light image enhancement (LLE) remains challenging due to the prevailing low-contrast and weak-visibility problems of single RGB images. In this paper, we respond to an intriguing learning-related question: can leveraging both accessible unpaired over/underexposed images and high-level semantic guidance improve the performance of cutting-edge LLE models? Here, we propose an effective semantically contrastive learning paradigm for LLE (namely SCL-LLE). Beyond the existing LLE wisdom, it casts the image enhancement task as multi-task joint learning, where LLE is converted into three constraints of contrastive learning, semantic brightness consistency, and feature preservation for simultaneously ensuring the exposure, texture, and color consistency. SCL-LLE allows the LLE model to learn from unpaired positives (normal-light)/negatives (over/underexposed), and enables it to interact with the scene semantics to regularize the image enhancement network, whereas the interaction of high-level semantic knowledge and the low-level signal prior is seldom investigated in previous methods. Trained on readily available open data, our method surpasses state-of-the-art LLE models on six independent cross-scene datasets in extensive experiments. Moreover, SCL-LLE’s potential to benefit the downstream semantic segmentation under extremely dark conditions is discussed. Source Code: https://github.com/LingLIx/SCL-LLE.


ICRA Conference 2022 Conference Paper

Towards Robust Part-aware Instance Segmentation for Industrial Bin Picking

Yidan Feng
Biqi Yang
Xianzhi Li 0001
Chi-Wing Fu
Rui Cao
Kai Chen 0028
Qi Dou 0001
Mingqiang Wei

Industrial bin picking is a challenging task that requires accurate and robust segmentation of individual object instances. In particular, industrial objects can have irregular shapes (for example, thin and concave), and in bin-picking scenarios objects are often closely packed with strong occlusion. To address these challenges, we formulate a novel part-aware instance segmentation pipeline. The key idea is to decompose industrial objects into correlated, approximately convex parts and enhance the object-level segmentation with part-level segmentation. We design a part-aware network to predict part masks and part-to-part offsets, followed by a part aggregation module to assemble the recognized parts into instances. To guide the network learning, we also propose an automatic label decoupling scheme to generate ground-truth part-level labels from instance-level labels. Finally, we contribute the first instance segmentation dataset that contains a variety of industrial objects that are thin and have non-trivial shapes. Extensive experimental results on various industrial objects demonstrate that our method achieves the best segmentation results compared with the state-of-the-art approaches.


IJCAI Conference 2021 Conference Paper

Direction-aware Feature-level Frequency Decomposition for Single Image Deraining

Sen Deng
Yidan Feng
Mingqiang Wei
Haoran Xie
Yiping Chen
Jonathan Li
Xiao-Ping Zhang
Jing Qin

We present a novel direction-aware feature-level frequency decomposition network for single image deraining. Compared with existing solutions, the proposed network has three compelling characteristics. First, unlike previous algorithms, we propose to perform frequency decomposition at the feature level instead of the image level, allowing both low-frequency maps containing structures and high-frequency maps containing details to be continuously refined during the training procedure. Second, we further establish communication channels between low-frequency maps and high-frequency maps to interactively capture structures from high-frequency maps and add them back to low-frequency maps and, simultaneously, extract details from low-frequency maps and send them back to high-frequency maps, thereby removing rain streaks while preserving more delicate features in the input image. Third, unlike existing algorithms that use convolutional filters consistent in all directions, we propose a direction-aware filter to capture the direction of rain streaks in order to more effectively and thoroughly purge the input images of rain streaks. We extensively evaluate the proposed approach on three representative datasets, and experimental results corroborate that our approach consistently outperforms state-of-the-art deraining algorithms.
