Author name cluster

Min Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

36 papers

2 author rows

JBHI Journal 2026 Journal Article

Asymmetric Co-Training With Decoder–Head Decoupling for Semi-Supervised Medical Image Segmentation

Yuxin Tian
Muhan Shi
Jianxun Li
Bin Zhang
Min Qu
Yinxue Shi
Xian Yang
Min Wang

Semi-supervised learning reduces annotation costs in medical image segmentation by leveraging abundant unlabeled data alongside scarce labels. Most models adopt an encoder–decoder architecture with a task-specific segmentation head. While co-training is effective, existing frameworks suffer from intra-network coupling (decoder–head binding) and inter-network coupling (over-aligned predictions), which reduce prediction diversity and amplify confirmation bias–particularly for small structures, ambiguous boundaries, and anatomically variable regions. We propose AsyCo, an asymmetric co-training framework with two components. (1) Asymmetric Decoder Coupling implements decoder–head decoupling by dynamically remapping encoder–decoder features to non-default heads across branches, breaking intra-network coupling and creating diverse prediction paths without additional parameters. (2) Hierarchical Consistency Regularization converts this diversity into stable supervision by aligning (i) the two branches' final outputs along their default paths (branch-output consistency), (ii) predictions from different segmentation heads evaluated on identical decoder features (inter-head consistency), and (iii) intermediate encoder–decoder representations (representation consistency). Through these mechanisms, AsyCo explicitly mitigates both intra- and inter-network coupling, improving training stability and reducing confirmation bias. Extensive experiments on three clinical benchmarks under limited-label regimes demonstrate that AsyCo consistently outperforms nine state-of-the-art semi-supervised learning methods. These results indicate that AsyCo delivers accurate and reliable segmentation with minimal annotation, thereby enhancing the reliability of medical image analysis in real-world clinical practice.

YNICL Journal 2026 Journal Article

Functional gradient reorganization and transcriptomic signatures underlying 1 Hz rTMS treatment in first-episode schizophrenia following right orbitofrontal cortex stimulation

Wenwen Miao
Xiong Jiao
Ningning Zeng
Min Wang
Kexu Zhang
Cheng Yang
Yuanjun Xie
Ziliang Wang

BACKGROUND: The orbitofrontal cortex (OFC) represents a promising yet underexplored neuromodulation target for negative symptoms in schizophrenia, and the macroscale and microscale mechanisms underlying its therapeutic effects remain unclear. METHODS: In a randomized, double-blind, sham-controlled trial, 84 patients with drug-naïve first-episode schizophrenia received consecutive 20 days of active 1-Hz repetitive transcranial magnetic stimulation (rTMS) over the right OFC (n = 45) or sham (n = 39). Resting-state fMRI gradient mapping was integrated with imaging transcriptomics to characterize multiscale neural reorganization and predict treatment response. RESULTS: Active rTMS was associated with greater improvement in negative symptoms and general psychopathology compared with sham stimulation. Gradient analyses indicated a renormalization of macroscale functional hierarchy, with increased principal gradient scores in transmodal nodes (e.g., right middle occipital gyrus and angular gyri) and decreased scores in salience-related regions (e.g., anterior cingulate cortex). These spatial patterns were coupled to transcriptomic signatures enriched for synaptic plasticity and ion transport, with cell-type enrichment implicating cortical interneurons and oligodendrocytes. In addition, baseline gradient topographies in the right middle occipital gyrus and ventral attention network predicted improvements in negative and general symptoms, respectively. CONCLUSIONS: Low-frequency OFC-rTMS was associated with alleviation of negative symptoms in first-episode schizophrenia, alongside a reconfiguration of cortical functional hierarchy. The observed gradient effects were spatially linked to molecular pathways underlying synaptic plasticity and excitation-inhibition regulation, supporting functional gradients as biomarkers for precision neuromodulation in early-stage schizophrenia.

JBHI Journal 2026 Journal Article

Liquid-Sequencer: A Lightweight Liquid Neural Network for Real-Time Fetal Congenital Heart Disease Diagnosis

Sibo Qiao
Ju Gao
Min Wang
Qiang Guo
Cong Liu
Shuqiang Wang
Zhiyuan Zhao
Mohammad Shabaz

Congenital heart disease (CHD) is the leading cause of neonatal mortality worldwide, making early and accurate diagnosis crucial. In resource-constrained regions, standardized ultrasound screening remains difficult due to the shortage of specialized clinicians. Digital Twin (DT) technology, which constructs virtual AI diagnostic models, enables personalized assessments of heart structure and function, offering intelligent diagnostic support in primary or remote healthcare settings. However, the real-time updates required by DT systems place increased demands on the inference speed and computational efficiency of AI models. Existing methods often suffer from parameter redundancy and inference delays, making them inadequate for meeting the low-latency, large-scale needs of clinical applications. To overcome these challenges, we propose the Liquid-Sequencer, a lightweight model for the diagnosis of fetal CHD. The model first employs a convolutional network with DPSE (Depthwise Separable and Squeeze-and-Excitation) modules to efficiently extract spatial features by utilizing depthwise separable convolution and channel attention. These feature maps are then processed by a bidirectional liquid sequence module, where orthogonally scanned Liquid Neural Networks (LNNs) capture global context with linear complexity, offering a more efficient alternative to self-attention mechanisms. This integration of spatial and sequential learning is well-suited to the dynamic nature of fetal cardiac ultrasound and the demands of DT systems. Experimental results on 12 datasets demonstrate outstanding performance with only 0.30M parameters. Additionally, t-SNE visualizations reveal highly discriminative feature representations and clear inter-class separations, underscoring the model's potential as an advanced diagnostic tool.
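
The parameter savings that depthwise separable convolution offers over a standard convolution can be illustrated with a short weight count. This is a minimal sketch, not the Liquid-Sequencer implementation: the layer shape (3x3 kernel, 32 input channels, 64 output channels) is a hypothetical choice, and bias terms and the squeeze-and-excitation branch are ignored.

```python
def standard_conv_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a standard 2D convolution (bias terms ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a depthwise conv followed by a 1x1 pointwise conv (bias ignored)."""
    depthwise = kernel_size * kernel_size * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels               # 1x1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer shape, chosen only for illustration.
standard = standard_conv_params(3, 32, 64)
separable = depthwise_separable_params(3, 32, 64)
print(standard, separable)
```

For this assumed shape the separable layer needs roughly one eighth of the weights of the standard layer, which is the kind of reduction lightweight models rely on.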

AAAI Conference 2026 Conference Paper

Offline Meta-Reinforcement Learning with Flow-Based Task Inference and Adaptive Correction of Feature Overgeneralization

Min Wang
Xin Li
Mingzhong Wang
Hasnaa Bennis

Offline meta-reinforcement learning (OMRL) combines the strengths of learning from diverse datasets in offline RL with the adaptability to new tasks of meta-RL, promising safe and efficient knowledge acquisition by RL agents. However, OMRL still suffers from extrapolation errors due to out-of-distribution (OOD) actions, compromised by broad task distributions and Markov Decision Process (MDP) ambiguity in meta-RL setups. Existing research indicates that the generalization of the Q network affects the extrapolation error in offline RL. This paper investigates this relationship by decomposing the Q value into feature and weight components, observing that while decomposition enhances adaptability and convergence in the case of high-quality data, it often leads to policy degeneration or collapse in complex tasks. We observe that decomposed Q values introduce a large estimation bias when the feature encounters OOD samples, a phenomenon we term "feature overgeneralization". To address this issue, we propose FLORA, which identifies OOD samples by modeling feature distributions and estimating their uncertainties. FLORA integrates a return feedback mechanism to adaptively adjust feature components. Furthermore, to learn precise task representations, FLORA explicitly models the complex task distribution using a chain of invertible transformations. We theoretically and empirically demonstrate that FLORA achieves rapid adaptation and meta-policy improvement compared to baselines across various environments.

JBHI Journal 2026 Journal Article

SIBW: A Swarm Intelligence-Based Network Flow Watermarking Approach for Privacy Leakage Detection in Digital Healthcare Systems

Sibo Qiao
Qiang Guo
Fengdong Shi
Min Wang
Haohao Zhu
Fazlullah Khan
Joel J. P. C. Rodrigues
Zhihan Lyu

The exponential growth of sensitive patient information and diagnostic records in digital healthcare systems has increased the complexity of data protection, while frequent medical data breaches severely compromise system security and reliability. Existing privacy protection techniques often lack robustness and real-time capabilities in high-noise, high-packet-loss, and dynamic network environments, limiting their effectiveness in detecting healthcare data leaks. To address these challenges, we propose a Swarm Intelligence-Based Network Watermarking (SIBW) method for real-time privacy data leakage detection in digital healthcare systems. SIBW integrates fountain codes with outer error correction codes and employs a Multi-Phase Synergistic Swarm Optimization Algorithm (MPSSOA) to dynamically optimize encoding parameters, significantly enhancing the robustness and interference resistance of watermark detection. Additionally, a reliable synchronization sequence and lightweight embedding mechanism are designed to ensure adaptability to complex, dynamic networks. Experimental results demonstrate that SIBW achieves over 90% detection accuracy under high latency jitter and packet loss conditions, surpassing existing methods in both robustness and efficiency. With a compact design of only 3.7 MB, SIBW is particularly suited for rapid deployment in resource-constrained digital healthcare systems.

AAAI Conference 2025 Conference Paper

Bright-NeRF: Brightening Neural Radiance Field with Color Restoration from Low-Light RAW Images

Min Wang
Xin Huang
Guoqing Zhou
Qifeng Guo
Qing Wang

Neural Radiance Fields (NeRF) have demonstrated prominent performance in novel view synthesis tasks. However, their input heavily relies on image acquisition under normal light conditions, making it challenging to learn accurate scene contents in low-light environments where images typically exhibit significant noise and severe color distortion. To address these challenges, we propose a novel approach, Bright-NeRF, which learns enhanced and high-quality radiance fields from multi-view low-light RAW images in an unsupervised manner. Our method simultaneously achieves color restoration, denoising, and enhanced novel view synthesis. Specifically, we leverage a physically-inspired model of the sensor's response to illumination and introduce a chromatic adaptation loss to constrain the learning of response, enabling consistent color perception of objects regardless of lighting conditions. We further utilize the RAW data's properties to expose the scene's intensity automatically. Additionally, we have collected a multi-view low-light RAW image dataset of real-world scenes to advance research in this field. Experimental results demonstrate that our proposed method significantly outperforms existing 2D and 3D approaches. Our code and dataset will be made publicly available.

ICLR Conference 2025 Conference Paper

Chain-of-Focus Prompting: Leveraging Sequential Visual Cues to Prompt Large Autoregressive Vision Models

Jiyang Zheng
Jialiang Shen
Yu Yao 0005
Min Wang
Yang Yang 0002
Dadong Wang
Tongliang Liu

In-context learning (ICL) has revolutionized natural language processing by enabling models to adapt to diverse tasks with only a few illustrative examples. However, the exploration of ICL within the field of computer vision remains limited. Inspired by Chain-of-Thought (CoT) prompting in the language domain, we propose Chain-of-Focus (CoF) Prompting, which enhances vision models by enabling step-by-step visual comprehension. CoF Prompting addresses the challenges of absent logical structure in visual data by generating intermediate reasoning steps through visual saliency. Moreover, it provides a solution for creating tailored prompts from visual inputs by selecting contextually informative prompts based on query similarity and target richness. The significance of CoF prompting is demonstrated by the recent introduction of Large Autoregressive Vision Models (LAVMs), which predict downstream targets via in-context learning with pure visual inputs. By integrating intermediate reasoning steps into visual prompts and effectively selecting the informative ones, the LAVMs are capable of generating significantly better inferences. Extensive experiments on downstream visual understanding tasks validate the effectiveness of our proposed method for visual in-context learning.

NeurIPS Conference 2025 Conference Paper

Geometric Logit Decoupling for Energy-Based Graph Out-of-distribution Detection

Min Wang
Hao Yang
Qing Cheng
Jincai Huang

GNNs have achieved remarkable performance across a range of tasks, but their reliability under distribution shifts remains a significant challenge. In particular, energy-based OOD detection methods—which compute energy scores from GNN logits—suffer from unstable performance due to a fundamental coupling between the norm and direction of node embeddings. Our analysis reveals that this coupling leads to systematic misclassification of high-norm OOD samples and hinders reliable ID–OOD separation. Interestingly, GNNs also exhibit a desirable inductive bias known as angular clustering, where embeddings of the same class align in direction. Motivated by these observations, we propose GeoEnergy (Geometric Logit Decoupling for Energy-Based OOD Detection), a plug-and-play framework that enforces hyperspherical logit geometry by normalizing class weights while preserving embedding norms. This decoupling yields more structured energy distributions, sharper intra-class alignment, and improved calibration. GeoEnergy can be integrated into existing energy-based GNNs without retraining or architectural modification. Extensive experiments demonstrate that GeoEnergy consistently improves OOD detection performance and confidence reliability across various benchmarks and distribution shifts.
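
The idea of computing an energy score from logits whose class weights have been normalized can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the GeoEnergy implementation: it assumes the commonly used energy definition E = -T * logsumexp(logits / T), a bias-free linear classifier, and hypothetical toy values for the embedding and class weights.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def energy_score(logits, temperature=1.0):
    """Energy of a sample; lower energy usually indicates in-distribution."""
    return -temperature * logsumexp([z / temperature for z in logits])


def normalize(vector):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def decoupled_logits(embedding, class_weights):
    """Logits from unit-norm class weights; the embedding norm is left untouched."""
    return [
        sum(w * h for w, h in zip(normalize(row), embedding))
        for row in class_weights
    ]


# Hypothetical toy values: a 2-D embedding and two class weight vectors
# whose raw norms differ by a factor of 20.
embedding = [3.0, 4.0]
class_weights = [[10.0, 0.0], [0.0, 0.5]]
logits = decoupled_logits(embedding, class_weights)
energy = energy_score(logits)
print(logits, energy)
```

With normalized class weights, each logit is bounded in magnitude by the embedding norm, so a class with a very large raw weight vector can no longer dominate the energy score on its own.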

EAAI Journal 2025 Journal Article

Probabilistic intervals prediction based on adaptive regression with attention residual connections and covariance constraints

Fan Zhang
Min Wang
Lin Li
Yepeng Liu
Hua Wang

This paper introduces a novel prediction interval method called Adaptive Regression with Attention Residual Connection and Covariance Constraint (AR-ARCC). By integrating Monte Carlo and Bayesian methods, we leverage the strengths of both to achieve a more flexible and accurate method for generating prediction intervals. Additionally, through the optimization of the loss function, introduction of penalty terms, and improvement of mean squared error calculations, the model’s performance in interval prediction tasks is enhanced. Finally, the integration of an interactive channel heterogeneous self-attention module, combined with residual blocks, enhances the modeling capability of the neural network. The comprehensive application of these methods results in superior performance of the model in handling uncertainty and local variations.

JBHI Journal 2024 Journal Article

Deep Learning-Based Microscopic Cell Detection Using Inverse Distance Transform and Auxiliary Counting

Rui Liu
Wei Dai
Cong Wu
Tianyi Wu
Min Wang
Junxian Zhou
Xiaozhen Zhang
Wen Jung Li

Microscopic cell detection is a challenging task due to significant inter-cell occlusions in dense clusters and diverse cell morphologies. This paper introduces a novel framework designed to enhance automated cell detection. The proposed approach integrates a deep learning model that produces an inverse distance transform-based detection map from the given image, accompanied by a secondary network designed to regress a cell density map from the same input. The inverse distance transform-based map effectively highlights each cell instance in the densely populated areas, while the density map accurately estimates the total cell count in the image. Then, a custom counting-aided cell center extraction strategy leverages the cell count obtained by integrating over the density map to refine the detection process, significantly reducing false responses and thereby boosting overall accuracy. The proposed framework demonstrated superior performance with F-scores of 96.93%, 91.21%, and 92.00% on the VGG, MBM, and ADI datasets, respectively, surpassing existing state-of-the-art methods. It also achieved the lowest distance error, further validating the effectiveness of the proposed approach. These results demonstrate significant potential for automated cell analysis in biomedical applications.

JBHI Journal 2024 Journal Article

Deeply Supervised Skin Lesions Diagnosis With Stage and Branch Attention

Wei Dai
Rui Liu
Tianyi Wu
Min Wang
Jianqin Yin
Jun Liu

Accurate and unbiased examinations of skin lesions are critical for the early diagnosis and treatment of skin diseases. Visual features of skin lesions vary significantly because the images are collected from patients with different lesion colours and morphologies by using dissimilar imaging equipment. Recent studies have reported that ensembled convolutional neural networks (CNNs) are practical to classify the images for early diagnosis of skin disorders. However, the practical use of these ensembled CNNs is limited as these networks are heavyweight and inadequate for processing contextual information. Although lightweight networks (e.g., MobileNetV3 and EfficientNet) were developed to achieve parameter reduction for implementing deep neural networks on mobile devices, insufficient depth of feature representation restricts the performance. To address the existing limitations, we develop a new lite and effective neural network, namely HierAttn. The HierAttn applies a novel deep supervision strategy to learn the local and global features by using multi-stage and multi-branch attention mechanisms with only one training loss. The efficacy of HierAttn was evaluated by using the dermoscopy images dataset ISIC2019 and smartphone photos dataset PAD-UFES-20 (PAD2020). The experimental results show that HierAttn achieves the best accuracy and area under the curve (AUC) among the state-of-the-art lightweight networks.

YNIMG Journal 2024 Journal Article

Detection of individual brain tau deposition in Alzheimer's disease based on latent feature-enhanced generative adversarial network

Jiehui Jiang
Rong Shi
Jiaying Lu
Min Wang
Qi Zhang
Shuoyan Zhang
Luyao Wang
Ian Alberts

OBJECTIVE: The conventional methods for interpreting tau PET imaging in Alzheimer's disease (AD), including visual assessment and semi-quantitative analysis of fixed hallmark regions, are insensitive to small individual lesions because of the spatiotemporal heterogeneity of the neuropathology. In this study, we proposed a latent feature-enhanced generative adversarial network model for the automatic extraction of individual brain tau deposition regions. METHODS: The latent feature-enhanced generative adversarial network we propose can learn the distribution characteristics of tau PET images of cognitively normal individuals and output the abnormal distribution regions of patients. This model was trained and validated using 1131 tau PET images from multiple centres (with distinct races, i.e., Caucasian and Mongoloid) with different tau PET ligands. The overall quality of synthetic imaging was evaluated using structural similarity (SSIM), peak signal-to-noise ratio (PSNR), and mean square error (MSE). The model was compared to the fixed templates method for diagnosing and predicting AD. RESULTS: The reconstructed images achieved good quality, with SSIM = 0.967 ± 0.008, PSNR = 31.377 ± 3.633, and MSE = 0.0011 ± 0.0007 in the independent test set. The model showed higher classification accuracy (AUC = 0.843, 95% CI = 0.796-0.890) and stronger correlation with clinical scales (r = 0.508, P < 0.0001). The model also achieved superior predictive performance in the survival analysis of cognitive decline, with a higher hazard ratio: 3.662, P < 0.001. INTERPRETATION: The LFGAN4Tau model presents a promising new approach for more accurate detection of individualized tau deposition. Its robustness across tracers and races makes it a potentially reliable diagnostic tool for AD in practice.
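
Two of the image-quality metrics named in this abstract, MSE and PSNR, follow directly from their textbook definitions. The sketch below is not the authors' evaluation code: it assumes intensities scaled to [0, 1] (so the peak value is 1.0), treats images as flat lists of pixel values, and uses hypothetical toy data. SSIM is omitted because it requires windowed local statistics.

```python
import math


def mean_squared_error(reference, reconstruction):
    """Average squared difference between two equally sized pixel lists."""
    return sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)


def psnr(reference, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    mse = mean_squared_error(reference, reconstruction)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical four-pixel "images" with intensities in [0, 1].
reference = [0.0, 0.5, 1.0, 0.25]
reconstruction = [0.1, 0.5, 0.9, 0.25]
mse_value = mean_squared_error(reference, reconstruction)
psnr_value = psnr(reference, reconstruction)
print(mse_value, psnr_value)
```

Because PSNR is a logarithm of the inverse MSE, the mean PSNR over a test set generally differs from the PSNR computed from the mean MSE, so the two reported averages need not map onto each other exactly.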

EAAI Journal 2024 Journal Article

High-resolution cross-scale transformer: A deep learning model for bolt loosening detection based on monocular vision measurement

Tianyi Wu
Ke Shang
Wei Dai
Min Wang
Rui Liu
Junxian Zhou
Jun Liu

The reliability of bolt connections significantly impacts the operational state and lifespan of industrial equipment. Vision-based noncontact methods exhibit high efficiency in bolt loosening detection. However, limited image features hinder measurement accuracy. To improve bolt loosening detection performance, this paper proposes a novel deep learning backbone, the high-resolution cross-scale transformer, to extract high-precision keypoints for bolt three-dimensional model construction. Simultaneously, a monocular vision measurement model is established to get the bolt exposed length and evaluate the connection loosening state. The proposed backbone hybridizes the advantages of high-resolution architecture and transformer, realizing global information aggregation and fine-grained image details. A simplified module, dual-scale multi-head self-attention, is designed to reduce the computational redundancy caused by the implementation of high-resolution multi-branch architecture. In the experiment section, the high-resolution cross-scale transformer outperforms other keypoint detection baselines, achieving the best performance with 91.6 average precision and 84.9 average recall. The monocular vision measurement model realizes a 0.053 mm error with a 0.028 mm standard deviation, satisfying the industrial implementation requirement. Additionally, the model is tested on different industrial situations and an additional outside dataset, indicating the model’s robustness and actual environment adaptability.

AAAI Conference 2024 Conference Paper

MetaCARD: Meta-Reinforcement Learning with Task Uncertainty Feedback via Decoupled Context-Aware Reward and Dynamics Components

Min Wang
Xin Li
Leiji Zhang
Mingzhong Wang

Meta-Reinforcement Learning (Meta-RL) aims to reveal shared characteristics in dynamics and reward functions across diverse training tasks. This objective is achieved by meta-learning a policy that is conditioned on task representations with encoded trajectory data or context, thus allowing rapid adaptation to new tasks from a known task distribution. However, since the trajectory data generated by the policy may be biased, the task inference module tends to form spurious correlations between trajectory data and specific tasks, thereby leading to poor adaptation to new tasks. To address this issue, we propose the Meta-RL with task unCertAinty feedback through decoupled context-aware Reward and Dynamics components (MetaCARD). MetaCARD distinctly decouples the dynamics and rewards when inferring tasks and integrates task uncertainty feedback from policy evaluation into the task inference module. This design effectively reduces uncertainty in tasks with changes in dynamics or/and reward functions, thereby enabling accurate task identification and adaptation. The experiment results on both Meta-World and classical MuJoCo benchmarks show that MetaCARD significantly outperforms prevailing Meta-RL baselines, demonstrating its remarkable adaptation ability in sophisticated environments that involve changes in both reward functions and dynamics.

AAAI Conference 2024 Conference Paper

Moderate Message Passing Improves Calibration: A Universal Way to Mitigate Confidence Bias in Graph Neural Networks

Min Wang
Hao Yang
Jincai Huang
Qing Cheng

Confidence calibration in Graph Neural Networks (GNNs) aims to align a model's predicted confidence with its actual accuracy. Recent studies have indicated that GNNs exhibit an under-confidence bias, which contrasts the over-confidence bias commonly observed in deep neural networks. However, our deeper investigation into this topic reveals that not all GNNs exhibit this behavior. Upon closer examination of message passing in GNNs, we found a clear link between message aggregation and confidence levels. Specifically, GNNs with extensive message aggregation, often seen in deep architectures or when leveraging large amounts of labeled data, tend to exhibit overconfidence. This overconfidence can be attributed to factors like over-learning and over-smoothing. Conversely, GNNs with fewer layers, known for their balanced message passing and superior node representation, may exhibit under-confidence. To counter these confidence biases, we introduce the Adaptive Unified Label Smoothing (AU-LS) technique. Our experiments show that AU-LS outperforms existing methods, addressing both over and under-confidence in various GNN scenarios.
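
The gap between predicted confidence and actual accuracy that calibration aims to close is often summarized by the expected calibration error (ECE). The sketch below is an assumption-labeled illustration only: the abstract does not state which metric the authors use, the equal-width binning scheme is one common choice, the toy predictions are hypothetical, and AU-LS itself is not reproduced here.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |accuracy - mean confidence| over equal-width bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # Confidence 1.0 falls into the last bin rather than overflowing.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, hit))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += len(members) / total * abs(accuracy - avg_conf)
    return ece


# Hypothetical predictions: an over-confident and an under-confident model.
over_confident = expected_calibration_error([0.95] * 4, [True, True, False, False])
under_confident = expected_calibration_error([0.55] * 4, [True, True, True, True])
print(over_confident, under_confident)
```

Because ECE takes an absolute difference, it reports the size of the miscalibration but not its direction; telling over-confidence from under-confidence requires comparing accuracy and mean confidence per bin.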

AAAI Conference 2024 Conference Paper

Self-Distillation Regularized Connectionist Temporal Classification Loss for Text Recognition: A Simple Yet Effective Approach

Ziyin Zhang
Ning Lu
Minghui Liao
Yongshuai Huang
Cheng Li
Min Wang
Wei Peng

Text recognition methods are developing rapidly. Some advanced techniques, e.g., powerful modules, language models, and un- and semi-supervised learning schemes, consecutively push the performance on public benchmarks forward. However, the problem of how to better optimize a text recognition model from the perspective of loss functions is largely overlooked. CTC-based methods, widely used in practice due to their good balance between performance and inference speed, still grapple with accuracy degradation. This is because CTC loss emphasizes the optimization of the entire sequence target while neglecting to learn individual characters. We propose a self-distillation scheme for CTC-based models to address this issue. It incorporates a framewise regularization term in CTC loss to emphasize individual supervision, and leverages the maximizing-a-posteriori of latent alignment to solve the inconsistency problem that arises in distillation between CTC-based models. We refer to the regularized CTC loss as Distillation Connectionist Temporal Classification (DCTC) loss. DCTC loss is module-free, requiring no extra parameters, longer inference lag, or additional training data or phases. Extensive experiments on public benchmarks demonstrate that DCTC can boost text recognition model accuracy by up to 2.6%, without any of these drawbacks.

EAAI Journal 2023 Journal Article

MPDNet: An underwater image deblurring framework with stepwise feature refinement module

Guangjie Han
Min Wang
Hongbo Zhu
Chuan Lin

In this study, a general network model called multi-progressive image deblurring network is proposed to correct blurring artifacts and local imaging details in underwater images. As a solution to nonuniform image distortion, a deformable convolution module was designed to enrich the encoded information of the image representation. Using a stepwise feature refinement module, multi-progressive image deblurring network can reduce the loss of contextual information to produce a more realistic underwater image for subsequent applications. Constructing a loss function based on multi-scale content can help the model improve image perception quality. We conducted experimental evaluations on large-scale image deblurring benchmark datasets, such as GoPro and HIDE, achieving excellent results with 32.84 dB and 31.03 dB peak signal-to-noise ratio, respectively, using the proposed method. Subsequently, a detailed optimization comparison was conducted on the in-house underwater image deblurring dataset. Multi-progressive image deblurring network obtained higher-quality, clearer images. Compared with the current state-of-the-art image deblurring algorithms, the proposed model achieved significant results with a 6.6% increase in deblur performance in peak signal-to-noise ratio. Finally, we conducted ablation experiments to evaluate the effectiveness of all the modules in the proposed framework.

YNICL Journal 2023 Journal Article

The heterogeneity of asymmetric tau distribution is associated with an early age at onset and poor prognosis in Alzheimer’s disease

Jiaying Lu
Zhengwei Zhang
Ping Wu
Xiaoniu Liang
Huiwei Zhang
Jimin Hong
Christoph Clement
Tzu-Chen Yen

PURPOSE: Left-right asymmetry, an important feature of brain development, has been implicated in neurodegenerative diseases, although it's less discussed in typical Alzheimer's disease (AD). We sought to investigate whether asymmetric tau deposition plays a potential role in AD heterogeneity. METHODS: F-Florzolotau]. Based on the absolute global tau interhemispheric differences, each cohort was divided into two groups (asymmetric versus symmetric tau distribution). The two groups were cross-sectionally compared in terms of demographic, cognitive characteristics, and pathological burden. The cognitive decline trajectories were analyzed longitudinally. RESULTS: Fourteen (23.3%) and 42 (48.3%) patients in the ADNI and SMS cohorts showed an asymmetric tau distribution, respectively. An asymmetric tau distribution was associated with an earlier age at disease onset (proportion of early-onset AD: ADNI/SMS/combined cohorts, p = 0.093/0.026/0.001) and more severe pathological burden (i.e., global tau burden: ADNI/SMS cohorts, p < 0.001/= 0.007). And patients with an asymmetric tau distribution were characterized by a steeper cognitive decline longitudinally (i.e., the annual decline of Mini-Mental Status Examination score: ADNI/SMS/combined cohorts, p = 0.053 / 0.035 / < 0.001). CONCLUSIONS: Asymmetry in tau deposition, which may be associated with an earlier age at onset, more severe pathological burden, and a steeper cognitive decline, is potentially an important characteristic of AD heterogeneity.

AAAI Conference 2023 Conference Paper

WaveForM: Graph Enhanced Wavelet Learning for Long Sequence Forecasting of Multivariate Time Series

Fuhao Yang
Xin Li
Min Wang
Hongyu Zang
Wei Pang
Mingzhong Wang

Multivariate time series (MTS) analysis and forecasting are crucial in many real-world applications, such as smart traffic management and weather forecasting. However, most existing work either focuses on short sequence forecasting or makes predictions predominantly with time domain features, which is not effective at removing noises with irregular frequencies in MTS. Therefore, we propose WaveForM, an end-to-end graph enhanced Wavelet learning framework for long sequence FORecasting of MTS. WaveForM first utilizes Discrete Wavelet Transform (DWT) to represent MTS in the wavelet domain, which captures both frequency and time domain features with a sound theoretical basis. To enable the effective learning in the wavelet domain, we further propose a graph constructor, which learns a global graph to represent the relationships between MTS variables, and graph-enhanced prediction modules, which utilize dilated convolution and graph convolution to capture the correlations between time series and predict the wavelet coefficients at different levels. Extensive experiments on five real-world forecasting datasets show that our model can achieve considerable performance improvement over different prediction lengths against the most competitive baseline of each dataset.
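
A one-level discrete wavelet transform shows how a series is split into coarse and detail coefficients and recovered without loss. This is a minimal sketch and not the WaveForM code: the Haar wavelet is an assumption (the abstract does not name the wavelet family), only a single decomposition level is shown, and the input series is a hypothetical toy example.

```python
import math


def haar_dwt(signal):
    """One-level Haar DWT: returns (approximation, detail) coefficient lists."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleaves the reconstructed sample pairs."""
    scale = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / scale)
        signal.append((a - d) / scale)
    return signal


# Hypothetical univariate series; the first pair is constant, the second is not.
series = [4.0, 4.0, 2.0, 6.0]
approx, detail = haar_dwt(series)
restored = haar_idwt(approx, detail)
print(approx, detail, restored)
```

The detail coefficient is zero where neighbouring samples agree and non-zero where they differ, which is how the wavelet domain keeps both frequency and time information.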

IJCAI Conference 2022 Conference Paper

A Universal PINNs Method for Solving Partial Differential Equations with a Point Source

Xiang Huang
Hongsheng Liu
Beiji Shi
Zidong Wang
Kang Yang
Yang Li
Min Wang
Haotian Chu

In recent years, deep learning technology has been used to solve partial differential equations (PDEs), among which the physics-informed neural networks (PINNs) method has emerged as a promising method for solving both forward and inverse PDE problems. PDEs with a point source that is expressed as a Dirac delta function in the governing equations are mathematical models of many physical processes. However, they cannot be solved directly by the conventional PINNs method due to the singularity introduced by the Dirac delta function. In this paper, we propose a universal solution to this problem based on three novel techniques. First, the Dirac delta function is modeled as a continuous probability density function to eliminate the singularity at the point source; second, a lower bound constrained uncertainty weighting algorithm is proposed to balance the physics-informed loss terms of the point source area and the remaining areas; and third, a multi-scale deep neural network with periodic activation function is used to improve the accuracy and convergence speed. We evaluate the proposed method with three representative PDEs, and the experimental results show that our method outperforms existing deep learning based methods with respect to accuracy, efficiency, and versatility.
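
The first technique in the abstract, replacing the Dirac delta with a continuous probability density, can be illustrated with a short sketch. It assumes a Gaussian density with a small width `sigma`; the abstract does not say which density the authors use, and `gaussian_delta` and `integrate` are hypothetical helper names introduced only for illustration.

```python
import math


def gaussian_delta(x, x0=0.0, sigma=0.01):
    """Gaussian density centred at x0; it approaches delta(x - x0) as sigma -> 0."""
    z = (x - x0) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# The smoothed source is finite everywhere, integrates to about 1,
# and approximately reproduces the sifting property of the delta function.
mass = integrate(lambda x: gaussian_delta(x, 0.3, 0.01), 0.0, 1.0)
sift = integrate(lambda x: math.cos(x) * gaussian_delta(x, 0.3, 0.01), 0.0, 1.0)
```

Because the smoothed source takes finite values at every collocation point, a PDE residual that contains it can be evaluated pointwise, which a true delta function does not allow.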

AAAI Conference 2022 Conference Paper

Learning Token-Based Representation for Image Retrieval

Hui Wu
Min Wang
Wengang Zhou
Yang Hu
Houqiang Li

In image retrieval, deep local features learned in a data-driven manner have been demonstrated effective to improve retrieval performance. To realize efficient retrieval on large image database, some approaches quantize deep local features with a large codebook and match images with aggregated match kernel. However, the complexity of these approaches is nontrivial with large memory footprint, which limits their capability to jointly perform feature learning and aggregation. To generate compact global representations while maintaining regional matching capability, we propose a unified framework to jointly learn local feature representation and aggregation. In our framework, we first extract deep local features using CNNs. Then, we design a tokenizer module to aggregate them into a few visual tokens, each corresponding to a specific visual pattern. This helps to remove background noise, and capture more discriminative regions in the image. Next, a refinement block is introduced to enhance the visual tokens with self-attention and cross-attention. Finally, different visual tokens are concatenated to generate a compact global representation. The whole framework is trained end-to-end with image-level labels. Extensive experiments are conducted to evaluate our approach, which outperforms the state-of-the-art methods on the Revisited Oxford and Paris datasets.

NeurIPS Conference 2022 Conference Paper

Meta-Auto-Decoder for Solving Parametric Partial Differential Equations

Xiang Huang
Zhanhong Ye
Hongsheng Liu
Shi Ji
Zidong Wang
Kang Yang
Yang Li
Min Wang

Many important problems in science and engineering require solving the so-called parametric partial differential equations (PDEs), i.e., PDEs with different physical parameters, boundary conditions, shapes of computation domains, etc. Recently, building learning-based numerical solvers for parametric PDEs has become an emerging new field. One category of methods, such as the Deep Galerkin Method (DGM) and Physics-Informed Neural Networks (PINNs), aims to approximate the solution of the PDEs. They are typically unsupervised and mesh-free, but require going through the time-consuming network training process from scratch for each set of parameters of the PDE. Another category of methods, such as Fourier Neural Operator (FNO) and Deep Operator Network (DeepONet), tries to approximate the solution mapping directly. While they are fast, needing only one forward inference for each PDE parameter without retraining, they often require a large corpus of paired input-output observations drawn from numerical simulations, and most of them need a predefined mesh as well. In this paper, we propose Meta-Auto-Decoder (MAD), a mesh-free and unsupervised deep learning method that enables the pre-trained model to be quickly adapted to equation instances by implicitly encoding (possibly heterogeneous) PDE parameters as latent vectors. The proposed method MAD can be interpreted by manifold learning in infinite-dimensional spaces, granting it a geometric insight. Extensive numerical experiments show that the MAD method converges faster than other deep learning-based methods without losing accuracy.

JBHI Journal 2022 Journal Article

Non-Invasive Glucose Metabolism Quantification Method Based on Unilateral ICA Image Derived Input Function by Hybrid PET/MR in Ischemic Cerebrovascular Disease

Min Wang
Bixiao Cui
Yi Shan
Hongwei Yang
Zhuangzhi Yan
Lalith Kumar Shiyam Sundar
Ian Alberts
Axel Rominger

The non-invasive quantification of the cerebral metabolic rate for glucose (CMRGlc) and the characterization of cerebral metabolism in the cerebrovascular territories are helpful in understanding ischemic cerebrovascular disease (ICVD). First, we investigated a non-invasive quantification approach based on an image-derived input function (IDIF) in ICVD. Second, we studied the metabolic changes in CMRGlc after surgical intervention. We evaluated the hypothesis that the IDIF method based on the unilateral internal carotid artery could address challenges in ICVD quantification. The CMRGlc and standardized uptake value ratio (SUVR) were used to measure glucose metabolism activity. Healthy controls showed no significant differences in CMRGlc values between bilateral and unilateral IDIF measurements (intraclass correlation coefficient [ICC]: 0.91–0.98). Patients with ICVD showed significantly increased CMRGlc values after surgical intervention for all territories (percentage changes: 7.4%–22.5%). In contrast, SUVR showed minor differences between postoperative and preoperative patients, indicating that it was a poor biomarker for the diagnosis of ICVD. A significant association between CMRGlc and the National Institutes of Health Stroke Scale (NIHSS) scores was observed (r = -0.54). Our findings suggested that IDIF could be a valuable tool for CMRGlc quantification in patients with ICVD and may advance personalized precision interventions.

AAAI Conference 2021 Conference Paper

A Trace-restricted Kronecker-Factored Approximation to Natural Gradient

Kaixin Gao
Xiaolei Liu
Zhenghai Huang
Min Wang
Zidong Wang
Dachuan Xu
Fan Yu

Second-order optimization methods have the ability to accelerate convergence by modifying the gradient through the curvature matrix. There have been many attempts to use second-order optimization methods for training deep neural networks. In this work, inspired by diagonal approximations and factored approximations such as Kronecker-factored Approximate Curvature (KFAC), we propose a new approximation to the Fisher information matrix (FIM) called Trace-restricted Kronecker-factored Approximate Curvature (TKFAC), which can hold a certain trace relationship between the exact and the approximate FIM. In TKFAC, we decompose each block of the approximate FIM as a Kronecker product of two smaller matrices, scaled by a coefficient related to the trace. We theoretically analyze TKFAC’s approximation error and give an upper bound for it. We also propose a new damping technique for TKFAC on convolutional neural networks to maintain the superiority of second-order optimization methods during training. Experiments show that our method has better performance compared with several state-of-the-art algorithms on some deep network architectures.

NeurIPS Conference 2021 Conference Paper

Contextual Similarity Aggregation with Self-attention for Visual Re-ranking

Jianbo Ouyang
Hui Wu
Min Wang
Wengang Zhou
Houqiang Li

In content-based image retrieval, the first-round retrieval result by simple visual feature comparison may be unsatisfactory, which can be refined by visual re-ranking techniques. In image retrieval, it is observed that the contextual similarity among the top-ranked images is an important clue to distinguish the semantic relevance. Inspired by this observation, in this paper, we propose a visual re-ranking method by contextual similarity aggregation with self-attention. In our approach, for each image in the top-K ranking list, we represent it into an affinity feature vector by comparing it with a set of anchor images. Then, the affinity features of the top-K images are refined by aggregating the contextual information with a transformer encoder. Finally, the affinity features are used to recalculate the similarity scores between the query and the top-K images for re-ranking of the latter. To further improve the robustness of our re-ranking model and enhance the performance of our method, a new data augmentation scheme is designed. Since our re-ranking model is not directly involved with the visual feature used in the initial retrieval, it is ready to be applied to retrieval result lists obtained from various retrieval algorithms. We conduct comprehensive experiments on four benchmark datasets to demonstrate the generality and effectiveness of our proposed visual re-ranking method.

TIST Journal 2021 Journal Article

Fine-Grained Semantic Image Synthesis with Object-Attention Generative Adversarial Network

Min Wang
Congyan Lang
Liqian Liang
Songhe Feng
Tao Wang
Yutong Gao

Semantic image synthesis is a new rising and challenging vision problem accompanied by the recent promising advances in generative adversarial networks. The existing semantic image synthesis methods only consider the global information provided by the semantic segmentation mask, such as class label, global layout, and location, so the generative models cannot capture the rich local fine-grained information of the images (e.g., object structure, contour, and texture). To address this issue, we adopt a multi-scale feature fusion algorithm to refine the generated images by learning the fine-grained information of the local objects. We propose OA-GAN, a novel object-attention generative adversarial network that allows attention-driven, multi-fusion refinement for fine-grained semantic image synthesis. Specifically, the proposed model first generates multi-scale global image features and local object features, respectively, then the local object features are fused into the global image features to improve the correlation between the local and the global. In the process of feature fusion, the global image features and the local object features are fused through the channel-spatial-wise fusion block to learn ‘what’ and ‘where’ to attend in the channel and spatial axes, respectively. The fused features are used to construct correlation filters to obtain feature response maps to determine the locations, contours, and textures of the objects. Extensive quantitative and qualitative experiments on COCO-Stuff, ADE20K and Cityscapes datasets demonstrate that our OA-GAN significantly outperforms the state-of-the-art methods.

AAAI Conference 2021 Conference Paper

THOR, Trace-based Hardware-driven Layer-Oriented Natural Gradient Descent Computation

Mengyun Chen
Kaixin Gao
Xiaolei Liu
Zidong Wang
Ningxi Ni
Qian Zhang
Lei Chen
Chao Ding

It is well known that second-order optimizers can accelerate the training of deep neural networks; however, the huge computation cost of second-order optimization makes it impractical to apply in real practice. In order to reduce the cost, many methods have been proposed to approximate a second-order matrix. Inspired by KFAC, we propose a novel Trace-based Hardware-driven layer-ORiented Natural Gradient Descent Computation method, called THOR, to make second-order optimization applicable to real application models. Specifically, we gradually increase the update interval and use the matrix trace to determine which blocks of the Fisher Information Matrix (FIM) need to be updated. Moreover, by resorting to the power of hardware, we have designed a hardware-driven approximation method for computing the FIM to achieve better performance. To demonstrate the effectiveness of THOR, we have conducted extensive experiments. The results show that training ResNet-50 on ImageNet with THOR takes only 66.7 minutes to achieve a top-1 accuracy of 75.9% on 8 Ascend 910 devices with MindSpore, a new deep learning computing framework. Moreover, with more computational resources, THOR takes only 2.7 minutes to reach 75.9% with 256 Ascend 910 devices.

JBHI Journal 2021 Journal Article

Two-Way MR-Forest Based Growing Path Classification for Malignancy Estimation of Pulmonary Nodules

Hongbo Zhu
Guangjie Han
Chuan Lin
Min Wang
Mohsen Guizani
Jianxia Hou
Wei Xing

This paper proposes a two-way multi-ringed forest (TMR-Forest) to estimate the malignancy of pulmonary nodules for false positive reduction (FPR). Based on our previous deep decision framework, named MR-Forest, we generate a growing path mode on a predefined pseudo-timeline of $L$ time slots to build pseudo-spatiotemporal features. It works synchronously with FPR based on MR-Forest to help predict the labels from a dynamic perspective. Concretely, Mask R-CNN is first used to recommend the bounding boxes of ROIs and classify their pathological features. Afterward, hierarchical attribute matching is introduced to obtain the input ROIs’ attribute layouts and select the candidates for their growing path generation. The selected ROIs can replace the fixed-sized ROIs’ fitting results at different time slots for data augmentation. A two-stage counterfactual path elimination is used to screen out the input paths of the cascade forest. Finally, a simple label selection strategy is executed to output the predicted label indicating the input nodule's malignancy. On 1034 scans of the merged dataset, the framework reports more accurate malignancy labels and achieves a better CPM score of 0.912, which exceeds those of MR-Forest and 3DDCNNs by about 2.8% and 4.7%, respectively.

TIST Journal 2020 Journal Article

End-to-End Text-to-Image Synthesis with Spatial Constrains

Min Wang
Congyan Lang
Liqian Liang
Songhe Feng
Tao Wang
Yutong Gao

Although the performance of automatically generating high-resolution realistic images from text descriptions has been significantly boosted, many challenging issues in image synthesis have not been fully investigated, due to shape variations, viewpoint changes, pose changes, and the relations of multiple objects. In this article, we propose a novel end-to-end approach for text-to-image synthesis with spatial constraints by mining object spatial location and shape information. Instead of learning a hierarchical mapping from text to image, our algorithm directly generates multi-object fine-grained images through the guidance of the generated semantic layouts. By fusing text semantic and spatial information into a synthesis module and jointly fine-tuning them with the generated multi-scale semantic layouts, the proposed networks show impressive performance in text-to-image synthesis for complex scenes. We evaluate our method on both the single-object CUB dataset and the multi-object MS-COCO dataset. Comprehensive experimental results demonstrate that our method consistently and significantly outperforms the state-of-the-art approaches across different evaluation metrics.

AAAI Conference 2019 System Paper

K3S: Knowledge-Driven Solution Support System

Yu Zhang
Morteza Saberi
Min Wang
Elizabeth Chang

As the volume of scientific papers grows rapidly in size, knowledge management for scientific publications is greatly needed. Information extraction and knowledge fusion techniques have been proposed to obtain information from scholarly publications and build knowledge repositories. However, retrieving the knowledge of problem/solution from academic papers to support users on solving specific research problems is rarely seen in the state of the art. Therefore, to remedy this gap, a knowledge-driven solution support system (K3S) is proposed in this paper to extract the information of research problems and proposed solutions from academic papers, and integrate them into knowledge maps. With the bibliometric information of the papers, K3S is capable of providing recommended solutions for any extracted problems. The subject of intrusion detection is chosen for demonstration in which required information is extracted with high accuracy, a knowledge map is constructed properly, and solutions to address intrusion problems are recommended.

IJCAI Conference 2018 Conference Paper

DRPose3D: Depth Ranking in 3D Human Pose Estimation

Min Wang
Xipeng Chen
Wentao Liu
Chen Qian
Liang Lin
Lizhuang Ma

In this paper, we propose a two-stage depth ranking based method (DRPose3D) to tackle the problem of 3D human pose estimation. Instead of accurate 3D positions, the depth ranking can be identified intuitively by humans and learned more easily by a deep neural network by solving classification problems. Moreover, depth ranking contains rich 3D information. It prevents the 2D-to-3D pose regression in two-stage methods from being ill-posed. In our method, first, we design a Pairwise Ranking Convolutional Neural Network (PRCNN) to extract depth rankings of human joints from images. Second, a coarse-to-fine 3D Pose Network (DPNet) is proposed to estimate 3D poses from both depth rankings and 2D human joint locations. Additionally, to improve the generality of our model, we introduce a statistical method to augment depth rankings. Our approach outperforms the state-of-the-art methods on the Human3.6M benchmark for all three testing protocols, indicating that depth ranking is an essential geometric feature which can be learned to improve 3D pose estimation.
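
As an illustration of what a pairwise depth ranking looks like as a classification target, here is a minimal sketch that derives a ranking matrix from known joint depths. The three-way labelling, the tolerance `tol`, the convention that a smaller depth means closer to the camera, and the function name `pairwise_depth_ranking` are all assumptions made for this example; the abstract does not specify how DRPose3D encodes its rankings.

```python
def pairwise_depth_ranking(depths, tol=0.0):
    """Build an n x n matrix r of pairwise depth relations between joints.

    r[i][j] is 1 if joint i is closer than joint j, -1 if it is farther,
    and 0 if the two depths differ by at most tol (including i == j).
    """
    n = len(depths)
    r = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            diff = depths[i] - depths[j]
            if diff < -tol:
                r[i][j] = 1      # joint i has the smaller depth, so it is closer
            elif diff > tol:
                r[i][j] = -1     # joint i has the larger depth, so it is farther
    return r


# Three hypothetical joints with depths in metres.
ranking = pairwise_depth_ranking([2.0, 3.5, 2.0])
```

Each off-diagonal entry is a discrete label rather than a metric distance, which matches the abstract's point that rankings can be learned by solving classification problems.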

YNIMG Journal 2017 Journal Article

The effects of changes in object location on object identity detection: A simultaneous EEG-fMRI study

Ping Yang
Chenggui Fan
Min Wang
Noa Fogelson
Ling Li

Object identity and location are bound together to form a unique integration that is maintained and processed in visual working memory (VWM). Changes in task-irrelevant object location have been shown to impair the retrieval of memorial representations and the detection of object identity changes. However, the neural correlates of this cognitive process remain largely unknown. In the present study, we aim to investigate the underlying brain activation during object color change detection and the modulatory effects of changes in object location and VWM load. To this end we used simultaneous electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) recordings, which can reveal the neural activity with both high temporal and high spatial resolution. Subjects responded faster and with greater accuracy in the repeated compared to the changed object location condition, when a higher VWM load was utilized. These results support the spatial congruency advantage theory and suggest that it is more pronounced with higher VWM load. Furthermore, the spatial congruency effect was associated with larger posterior N1 activity, greater activation of the right inferior frontal gyrus (IFG) and less suppression of the right supramarginal gyrus (SMG), when object location was repeated compared to when it was changed. The ERP-fMRI integrative analysis demonstrated that the object location discrimination-related N1 component is generated in the right SMG.

IJCAI Conference 2015 Conference Paper

Learning Term Embeddings for Hypernymy Identification

Zheng Yu
Haixun Wang
Xuemin Lin
Min Wang

Hypernymy identification aims at detecting whether an isA relationship holds between two words or phrases. Most previous methods are based on lexical patterns or the Distributional Inclusion Hypothesis, and the accuracy of such methods is not ideal. In this paper, we propose a simple yet effective supervision framework to identify hypernymy relations using distributed term representations (a.k.a. term embeddings). First, we design a distance-margin neural network to learn term embeddings based on some pre-extracted hypernymy data. Then, we apply such embeddings as term features to identify positive hypernymy pairs through a supervision method. Experimental results demonstrate that our approach outperforms other supervised methods on two popular datasets and that the learned term embeddings have better quality than existing term distributed representations with respect to hypernymy identification.

EAAI Journal 2014 Journal Article

Improved Bandelet with heuristic evolutionary optimization for image compression

Shuyuan Yang
Shigang Wang
Zhengkang Liu
Min Wang
Licheng Jiao

As an efficient geometric analysis tool, Bandelet has exhibited enormous potential in image compression for its capability in capturing the geometrical structure in images. However, in available implementations of Bandelet, the optimal geometric flows are determined by a coarse and exhaustive search, which degrades the performance of Bandelet in accurately representing images. In this paper, we advance a new image compression approach based on Laplacian Pyramid (LP) and improved Bandelet, where a Heuristic Memetic Algorithm (HMA) is proposed to locate accurate geometric flows. The chromosome is defined to represent geometric flows, and local and heuristic evolution operators are employed to make fast search possible. The improved Bandelet is used to compress the high-frequency band of the image decomposed by LP, and the low-frequency band is coded by the Set Partitioning In Hierarchical Trees (SPIHT) algorithm. Experiments are conducted on natural images and remote sensing images, and the results demonstrate that our proposed scheme outperforms SPIHT and the second generation Bandelet (2G-Bandelet) in both PSNR and time consumption at low bit rate compression.

EAAI Journal 2012 Journal Article

Evolution-enhanced multiscale overcomplete dictionaries learning for image denoising

Shuyuan Yang
Min Wang
Meirong Wei
Licheng Jiao

In this paper, a multiscale overcomplete dictionary learning approach is proposed for image denoising by exploiting the multiscale property and sparse representation of images. The images are first sparsely represented by a translation invariant dictionary, and then the coefficients are denoised using some learned multiscale dictionaries. Dictionary learning can be reduced to a non-convex $\ell_0$-norm minimization problem with multiple variables, so an evolution-enhanced algorithm is proposed to alternately optimize the variables. Experiments compare the performance of our proposed method with its counterparts on some benchmark natural images, and the superiority of our proposed method over its counterparts can be observed in both the visual results and some numerical measures.

YNIMG Journal 2010 Journal Article

Age-related neuroinflammation in non-demented elderly adults: Preliminary findings with the TSPO ligand [11C]PBR28

Karmen K. Yoder
Shannon L. Risacher
Tamiko R. MaGee
Brenna C. McDonald
Qi-Huang Zheng
Min Wang
Bruce H. Mock
John D. West