Author name cluster

Feng Yang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

17 papers
2 author rows

Possible papers

17

AAAI Conference 2026 Conference Paper

Domain-Aware Multi-View Contrastive Representation Learning for Protein Subcellular Localization Prediction

  • Qiang Zhang
  • Feng Yang
  • Weihong Huang
  • Jing Feng
  • Juan Liu

Protein subcellular localization prediction is essential for understanding protein function and cellular organization. However, existing methods exhibit two major limitations: (1) they overlook the critical role of evolutionarily conserved protein domains, which are fundamental functional and structural units that significantly influence functions and subcellular localization, and (2) they rarely learn residue order and backbone coordinates simultaneously, neglecting the complementary information inherent in multi-modal representations. In this paper, we propose a novel Domain-Aware Multi-View Contrastive Representation Learning for Protein Subcellular Localization prediction, named DMVCL. Firstly, it devises domain-sequence/structure attention modules, which identify functionally significant regions in protein structures/sequences that critically determine subcellular localization. Secondly, it introduces a multi-view contrastive learning framework that unites inter-view and intra-view objectives. Inter-view contrastive learning aligns protein sequences with their corresponding structures by maximizing mutual information, thereby capturing the consistency of protein residue order and backbone coordinates. Intra-view contrastive learning enhances the representation discriminability of each modality by explicitly separating proteins with no common location and attracting those with any shared localization. Extensive experiments demonstrate that DMVCL significantly outperforms existing baselines. Ablation studies and visualizations further highlight the contributions of domain-sequence/structure attention and multi-view contrastive learning in achieving superior predictive performance.
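The intra-view objective described above treats two proteins as a positive pair when they share any localization label and as a negative pair when they share none. A minimal Python sketch of that pairing rule follows. The function name `pair_masks` and the toy labels are illustrative assumptions; the abstract does not give the loss function or the data format DMVCL actually uses.

```python
from typing import List, Set, Tuple


def pair_masks(labels: List[Set[str]]) -> Tuple[List[List[bool]], List[List[bool]]]:
    """Return (positive, negative) pair masks for multi-label samples.

    positive[i][j] is True when samples i and j (i != j) share at least one
    label; negative[i][j] is True when they share none.
    """
    n = len(labels)
    positive = [[False] * n for _ in range(n)]
    negative = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a sample is never paired with itself
            if labels[i] & labels[j]:
                positive[i][j] = True
            else:
                negative[i][j] = True
    return positive, negative


# Hypothetical localization labels for three proteins.
toy_labels = [{"nucleus", "cytoplasm"}, {"cytoplasm"}, {"membrane"}]
pos, neg = pair_masks(toy_labels)
```

A contrastive loss would then pull together embeddings where `pos` is set and push apart those where `neg` is set.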

AAAI Conference 2026 Conference Paper

Dynamic Geometric Equivariant Network for Full-Atom Antibody Design

  • Weihong Huang
  • Feng Yang
  • Qiang Zhang
  • Juan Liu

Antibody design is critically important in biomedical and therapeutic contexts but remains extremely challenging due to the complexity of antibody sequence–structure relationships and stringent antigen specificity requirements. Traditional computational approaches rely on multi-stage pipelines and often overlook full-atom details (e.g., side-chain conformations) as well as fine-grained geometric features, resulting in limited effectiveness. To overcome these limitations, we propose Dynamic Geometric Equivariant Network (DGENet), an end-to-end full-atom antibody design model that integrates a geometric-kinematic equivariant dynamic optimization module (GK-EDO) with a full-atom E(3)-equivariant message-passing architecture. This framework enables iterative optimization of antibody structures under explicit geometric and kinematic constraints, generating complete antibody structures (including backbone and side chains) while jointly optimizing the sequences and 3D structures of the complementarity-determining regions (CDRs). DGENet also introduces a novel virtual anchor docking mechanism that employs an adaptive PNet-Kabsch module to explicitly guide antibody–antigen binding and achieve precise bound conformations. Evaluations on multiple benchmark datasets demonstrate that DGENet exhibits outstanding performance in antibody structure and sequence generation as well as in designing high-affinity antibodies, underscoring its reliability as an advanced antibody design model.

AAAI Conference 2026 Conference Paper

Injection Without Distortion: Geometrically Constrained Knowledge Enhancement for Vision-Language Models

  • Zhongze Wu
  • Xiu Su
  • Feng Yang
  • Shan You
  • Jun Long
  • Yueyi Luo

Vision-Language Models (VLMs) are widely used in tasks like open-vocabulary object detection and zero-shot classification, owing to their powerful generalization. However, recent research reveals that VLMs exhibit significant performance instability when tasked with recognizing concepts at varying granularities (e.g., "animal" vs. "dog"). Prevailing methods inject external knowledge from Large Language Models, but this unconstrained approach distorts the VLM's inherent hierarchical orthogonal geometry, leading to performance collapse on general concepts. To address this, we introduce GeCoin, an innovative Geometrically Constrained framework that safely enhances existing VLMs with external knowledge for improved hierarchical understanding, without additional training. By projecting knowledge into the null-space of a query concept's feature space, GeCoin mathematically guarantees the preservation of general knowledge while integrating specialized information. Extensive experiments across large-scale benchmarks, diverse VLMs, and knowledge from various LLMs (e.g., GPT-3.5, Claude-3, Gemini-Pro) show that GeCoin boosts performance by an average of 3.9% over the strongest baseline, crucially eradicating performance collapse on general concepts.
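The null-space idea mentioned above can be illustrated in its simplest form: with a single query feature vector q, the null space of q^T is the set of vectors orthogonal to q, so projecting a knowledge vector there removes exactly the component that would change its inner product with q. The sketch below assumes that one-vector case; the names `project_out`, `query`, and `knowledge` are hypothetical, and the abstract does not state how GeCoin builds its feature space or projection.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project_out(knowledge: Sequence[float], query: Sequence[float]) -> List[float]:
    """Project `knowledge` onto the orthogonal complement of `query`.

    The result has zero inner product with `query`, so adding it to another
    vector leaves that vector's inner product with `query` unchanged.
    """
    qq = dot(query, query)
    if qq == 0.0:
        raise ValueError("query vector must be non-zero")
    scale = dot(knowledge, query) / qq
    return [k - scale * q for k, q in zip(knowledge, query)]


# Hypothetical 3-dimensional features, chosen only for illustration.
query = [1.0, 0.0, 0.0]
knowledge = [0.5, 2.0, -1.0]
safe = project_out(knowledge, query)
```

With several query directions, the same step would use a projector built from an orthonormal basis of their span rather than a single vector.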

AAAI Conference 2026 Conference Paper

OpenDriveVLA: Towards End-to-end Autonomous Driving with Large Vision Language Action Model

  • Xingcheng Zhou
  • Xuyuan Han
  • Feng Yang
  • Yunpu Ma
  • Volker Tresp
  • Alois Knoll

We present OpenDriveVLA, a Vision-Language Action (VLA) model designed for end-to-end autonomous driving, built upon open-source large language models. OpenDriveVLA generates spatially-grounded driving actions by leveraging multimodal inputs, including both 2D and 3D instance-aware visual representations, ego vehicle states, and language commands. To bridge the modality gap between driving visual representations and language embeddings, we introduce a hierarchical vision-language alignment process, projecting both 2D and 3D structured visual tokens into a unified semantic space. Furthermore, we incorporate structured agent–environment–ego interaction modeling into the autoregressive decoding process, enabling the model to capture fine-grained spatial dependencies and behavior-aware dynamics critical for reliable trajectory planning. Extensive experiments on the nuScenes dataset demonstrate that OpenDriveVLA achieves state-of-the-art results across open-loop trajectory planning and driving-related question-answering tasks. Qualitative analyses further illustrate its superior capability to follow high-level driving commands and robustly generate trajectories under challenging scenarios, highlighting its potential for next-generation end-to-end autonomous driving.

IROS Conference 2025 Conference Paper

Design and Development of a GPR-Equipped Robot for Full-space External Diseases Detection in Drainage Pipelines

  • Yuanjin Fang
  • Feng Yang
  • Xu Qiao
  • Maoxuan Xu

Soil diseases around drainage pipelines are a major factor in road collapse. Robots designed to detect these diseases face multiple challenges, including harsh internal environments, size limitations, difficulties in achieving full external space coverage, and the impact of pose misalignment on disease localization. To address these challenges, this work presents the design and development of a pipeline robot equipped with Ground-Penetrating Radar (GPR), capable of adapting to a pipe diameter range of 500-1000 millimeters and providing comprehensive detection of external space diseases. A radial offset estimation model is introduced, and by integrating multi-sensor data, the robot achieves full-pose perception, overcoming challenges related to angular and positional misalignment during disease localization. Experimental results demonstrate that the robot can achieve a maximum detection speed of 0.5 meters per second and is capable of adapting to various field drainage pipeline scenarios, including full water, rough terrain, pose misalignment, and 90-degree bends. Azimuth errors for external disease localization are controlled within 1 degree, and axial displacement errors are controlled within 2 centimeters.

AAAI Conference 2025 Conference Paper

Motion Artifact Removal in Pixel-Frequency Domain via Alternate Masks and Diffusion Model

  • Jiahua Xu
  • Dawei Zhou
  • Lei Hu
  • Jianfeng Guo
  • Feng Yang
  • Zaiyi Liu
  • Nannan Wang
  • Xinbo Gao

Motion artifacts present in magnetic resonance imaging (MRI) can seriously interfere with clinical diagnosis. Removing motion artifacts is a straightforward solution and has been extensively studied. However, recent works still rely heavily on paired data and do not adequately consider perturbations in k-space (the frequency domain), which limits their clinical application. To address these issues, we propose a novel unsupervised purification method which leverages pixel-frequency information of noisy MRI images to guide a pre-trained diffusion model to recover clean MRI images. Specifically, considering that motion artifacts are mainly concentrated in high-frequency components in k-space, we utilize the low-frequency components as the guide to ensure correct tissue textures. Additionally, given that high-frequency and pixel information are helpful for recovering shape and detail textures, we design alternate complementary masks to simultaneously destroy the artifact structure and exploit useful information. Quantitative experiments are performed on datasets from different tissues and show that our method achieves superior performance on several metrics. Qualitative evaluations with radiologists also show that our method provides better clinical feedback.

JBHI Journal 2025 Journal Article

Uncertainty-Inspired Multi-Task Learning in Arbitrary Scenarios of ECG Monitoring

  • Xingyao Wang
  • Hongxiang Gao
  • Caiyun Ma
  • Tingting Zhu
  • Feng Yang
  • Chengyu Liu
  • Huazhu Fu

As the scenarios for electrocardiogram (ECG) monitoring become increasingly diverse, particularly with the development of wearable ECG, the influence of ambiguous factors in diagnosis has been amplified. Reliable ECG information must be extracted from abundant noise and confusing artifacts. To address this issue, we propose an uncertainty-inspired model for beat-level diagnosis (UI-Beat). The base architecture of UI-Beat separates heartbeat localization and event diagnosis into two branches to address the problem of heterogeneous data sources. To disentangle the epistemic and aleatoric uncertainty within one stage in a deterministic neural network, we propose a new method derived from uncertainty formulation and realize it by introducing the class-biased transformation. The disentangled uncertainty can then be used to screen out noise and identify ambiguous heartbeats simultaneously. The results indicate that UI-Beat significantly improves noise detection performance (from 91.60% to 97.50% for real-world noise detection and from 61.40% to 82.41% for real-world artifact detection). For multi-lead ECG analysis, UI-Beat approaches the performance upper bound in heartbeat localization (only 15 false positives and 9 false negatives out of the 175,907 heartbeats in the INCART database) and achieves a significant improvement in heartbeat classification through uncertainty-based cross-lead fusion compared to single-lead prediction and other state-of-the-art methods (an average improvement of 14.28% for detecting heartbeats of S and 3.37% for detecting heartbeats of V). Because it performs one-stage ECG analysis within a single model, UI-Beat has the potential to serve as a general model for arbitrary scenarios of ECG monitoring, with the capacity to remove unusable episodes and realize heartbeat-level diagnosis with confidence provided.

JBHI Journal 2024 Journal Article

Isolated Random Forest Assisted Spatio-Temporal Ant Colony Evolutionary Algorithm for Cell Tracking in Time-Lapse Sequences

  • Benlian Xu
  • Di Wu
  • Jian Shi
  • Jinliang Cong
  • Mingli Lu
  • Feng Yang
  • Brett Nener

Multi-object tracking in real-world environments is a difficult problem, especially for cell morphogenesis with division. Most cell tracking methods struggle to achieve reliable mitosis detection, efficient inter-frame matching, and accurate state estimation simultaneously within a unified tracking framework. In this paper, we propose a novel unified framework that leverages a spatio-temporal ant colony evolutionary algorithm to track cells amidst mitosis under measurement uncertainty. Each Bernoulli ant colony representing a migrating cell is able to capture the occurrence of mitosis through the proposed Isolation Random Forest (IRF)-assisted temporal mitosis detection algorithm, under the assumption that mitotic cells exhibit unique spatio-temporal features different from non-mitotic ones. Guided by the prediction of a division event, multiple ant colonies evolve between consecutive frames according to an augmented assignment matrix solved by the extended Hungarian method. To handle dense cell populations, an efficient group partition between cells and measurements is exploited, which enables multiple assignment tasks to be executed in parallel with a reduction in matrix dimension. After inter-frame traversing, the ant colony transitions to a foraging stage in which it begins approximating the Bernoulli parameter to estimate cell state by iteratively updating its pheromone field. Experiments on multi-cell tracking in the presence of cell mitosis and morphological changes are conducted, and the results demonstrate that the proposed method outperforms state-of-the-art approaches, striking a balance between accuracy and computational efficiency.

NeurIPS Conference 2024 Conference Paper

Optical Diffusion Models for Image Generation

  • Ilker Oguz
  • Niyazi U. Dinc
  • Mustafa Yildirim
  • Junjie Ke
  • Innfarn Yoo
  • Qifei Wang
  • Feng Yang
  • Christophe Moser

Diffusion models generate new samples by progressively decreasing the noise from the initially provided random distribution. This inference procedure generally utilizes a trained neural network numerous times to obtain the final output, creating significant latency and energy consumption on digital electronic hardware such as GPUs. In this study, we demonstrate that the propagation of a light beam through a transparent medium can be programmed to implement a denoising diffusion model on image samples. This framework projects noisy image patterns through passive diffractive optical layers, which collectively only transmit the predicted noise term in the image. The optical transparent layers, which are trained with an online training approach, backpropagating the error to the analytical model of the system, are passive and kept the same across different steps of denoising. Hence this method enables high-speed image generation with minimal power consumption, benefiting from the bandwidth and energy efficiency of optical information processing.

JBHI Journal 2024 Journal Article

TaiChiNet: Negative-Positive Cross-Attention Network for Breast Lesion Segmentation in Ultrasound Images

  • Jinting Wang
  • Jiafei Liang
  • Yang Xiao
  • Joey Tianyi Zhou
  • Zhiwen Fang
  • Feng Yang

Breast lesion segmentation in ultrasound images is essential for computer-aided breast-cancer diagnosis. To improve the segmentation performance, most approaches design sophisticated deep-learning models by mining the patterns of foreground lesions and normal backgrounds simultaneously or by unilaterally enhancing foreground lesions via various focal losses. However, the potential of normal backgrounds is underutilized, which could reduce false positives by compacting the feature representation of all normal backgrounds. From a novel viewpoint of bilateral enhancement, we propose a negative-positive cross-attention network to concentrate on normal backgrounds and foreground lesions, respectively. Derived from the complementing opposites of bipolarity in TaiChi, the network is denoted as TaiChiNet, which consists of the negative normal-background and positive foreground-lesion paths. To transmit the information across the two paths, a cross-attention module, a complementary MLP-head, and a complementary loss are built for deep-layer features, shallow-layer features, and mutual-learning supervision, separately. To the best of our knowledge, this is the first work to formulate breast lesion segmentation as a mutual supervision task from the foreground-lesion and normal-background views. Experimental results have demonstrated the effectiveness of TaiChiNet on two breast lesion segmentation datasets with a lightweight architecture. Furthermore, extensive experiments on the thyroid nodule segmentation and retinal optic cup/disc segmentation datasets indicate the application potential of TaiChiNet.

JBHI Journal 2021 Journal Article

Clustering-Based Dual Deep Learning Architecture for Detecting Red Blood Cells in Malaria Diagnostic Smears

  • Yasmin M. Kassim
  • Kannappan Palaniappan
  • Feng Yang
  • Mahdieh Poostchi
  • Nila Palaniappan
  • Richard J Maude
  • Sameer Antani
  • Stefan Jaeger

Computer-assisted algorithms have become a mainstay of biomedical applications to improve accuracy and reproducibility of repetitive tasks like manual segmentation and annotation. We propose a novel pipeline for red blood cell detection and counting in thin blood smear microscopy images, named RBCNet, using a dual deep learning architecture. RBCNet consists of a U-Net first stage for cell-cluster or superpixel segmentation, followed by a second refinement stage Faster R-CNN for detecting small cell objects within the connected component clusters. RBCNet uses cell clustering instead of region proposals, which is robust to cell fragmentation, is highly scalable for detecting small objects or fine scale morphological structures in very large images, can be trained using non-overlapping tiles, and during inference is adaptive to the scale of cell-clusters with a low memory footprint. We tested our method on an archived collection of human malaria smears with nearly 200,000 labeled cells across 965 images from 193 patients, acquired in Bangladesh, with each patient contributing five images. Cell detection accuracy using RBCNet was higher than 97%. The novel dual cascade RBCNet architecture provides more accurate cell detections because the foreground cell-cluster masks from U-Net adaptively guide the detection stage, resulting in notably higher true positive and lower false alarm rates compared to traditional and other deep learning methods. The RBCNet pipeline implements a crucial step towards automated malaria diagnosis.

JBHI Journal 2020 Journal Article

Deep Learning for Smartphone-Based Malaria Parasite Detection in Thick Blood Smears

  • Feng Yang
  • Mahdieh Poostchi
  • Hang Yu
  • Zhou Zhou
  • Kamolrat Silamut
  • Jian Yu
  • Richard J. Maude
  • Stefan Jaeger

Objective: This work investigates the possibility of automated malaria parasite detection in thick blood smears with smartphones. Methods: We have developed the first deep learning method that can detect malaria parasites in thick blood smear images and can run on smartphones. Our method consists of two processing steps. First, we apply an intensity-based Iterative Global Minimum Screening (IGMS), which performs a fast screening of a thick smear image to find parasite candidates. Then, a customized Convolutional Neural Network (CNN) classifies each candidate as either parasite or background. Together with this paper, we make a dataset of 1819 thick smear images from 150 patients publicly available to the research community. We used this dataset to train and test our deep learning method, as described in this paper. Results: A patient-level five-fold cross-evaluation demonstrates the effectiveness of the customized CNN model in discriminating between positive (parasitic) and negative image patches in terms of the following performance indicators: accuracy (93.46% ± 0.32%), AUC (98.39% ± 0.18%), sensitivity (92.59% ± 1.27%), specificity (94.33% ± 1.25%), precision (94.25% ± 1.13%), and negative predictive value (92.74% ± 1.09%). High correlation coefficients (>0.98) between automatically detected parasites and ground truth, on both image level and patient level, demonstrate the practicality of our method. Conclusion: Promising results are obtained for parasite detection in thick blood smears for a smartphone application using deep learning methods. Significance: Automated parasite detection running on smartphones is a promising alternative to manual parasite counting for malaria diagnosis, especially in areas lacking experienced parasitologists.
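For readers less familiar with the patch-level indicators reported above, the sketch below shows how accuracy, sensitivity, specificity, precision, and negative predictive value follow from a binary confusion matrix. The function name `patch_metrics` and the counts are hypothetical and are not taken from the paper. AUC is omitted because it needs per-patch scores rather than counts.

```python
from typing import Dict


def patch_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute standard binary-classification indicators from raw counts.

    tp/fn count parasite patches classified correctly/incorrectly;
    tn/fp count background patches classified correctly/incorrectly.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # recall on parasite patches
        "specificity": tn / (tn + fp),  # recall on background patches
        "precision": tp / (tp + fp),    # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts for illustration only.
metrics = patch_metrics(tp=90, fp=5, tn=95, fn=10)
```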

JBHI Journal 2014 Journal Article

A Comparative Study of Different Level Interpolations for Improving Spatial Resolution in Diffusion Tensor Imaging

  • Feng Yang
  • Yue-Min Zhu
  • Jian-Hua Luo
  • Marc Robini
  • Jie Liu
  • Pierre Croisille

This paper studies and evaluates the feasibility and the performance of different level interpolations for improving spatial resolution of diffusion tensor magnetic resonance imaging (DT-MRI or DTI). In particular, the following techniques are investigated: anisotropic interpolation operating on scalar gray-level images, and the log-Euclidean and quaternion interpolation methods, which operate on diffusion tensor fields. The performance is evaluated both qualitatively and quantitatively using criteria such as tensor determinant, fractional anisotropy (FA), mean diffusivity (MD), fiber length, etc. We conclude that tensor field interpolations avoid the undesirable swelling effect in DTI, which is not the case with scalar gray-level interpolation, and that scalar gray-level image interpolation and log-Euclidean tensor field interpolation suffer from a decrease in FA and MD, which may mislead the interpretation of the clinical parameters FA and MD. In contrast, the quaternion tensor field interpolation avoids such FA and MD decrease, which suggests its use for clinical applications.
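The evaluation criteria named above have standard closed forms in terms of a tensor's eigenvalues, and a toy case shows why the interpolation scheme matters. The sketch below is a simplified illustration, not a reproduction of the paper's experiments. It assumes two hypothetical axis-aligned (diagonal) tensors with identical eigenvalues but different principal directions, for which the log-Euclidean midpoint reduces to an element-wise geometric mean. It uses plain Euclidean tensor averaging as a stand-in for a scheme that swells the tensor, and it does not implement the scalar gray-level or quaternion methods.

```python
import math
from typing import Sequence, Tuple

Eigs = Tuple[float, float, float]


def mean_diffusivity(eigs: Sequence[float]) -> float:
    """MD: average of the three eigenvalues."""
    return sum(eigs) / 3.0


def fractional_anisotropy(eigs: Sequence[float]) -> float:
    """FA: sqrt(3/2) * ||eigs - MD|| / ||eigs||."""
    md = mean_diffusivity(eigs)
    num = sum((lam - md) ** 2 for lam in eigs)
    den = sum(lam * lam for lam in eigs)
    return math.sqrt(1.5 * num / den)


def determinant(eigs: Sequence[float]) -> float:
    """Determinant of a diagonal tensor: product of its diagonal entries."""
    return eigs[0] * eigs[1] * eigs[2]


# Hypothetical diagonal tensors: same shape, principal axis along x and along y.
a: Eigs = (1.7, 0.3, 0.3)
b: Eigs = (0.3, 1.7, 0.3)

# Euclidean midpoint: arithmetic mean of the diagonal entries.
euclid_mid = tuple((x + y) / 2.0 for x, y in zip(a, b))

# Log-Euclidean midpoint for commuting diagonal tensors: geometric mean.
log_mid = tuple(math.exp((math.log(x) + math.log(y)) / 2.0) for x, y in zip(a, b))
```

In this toy case the Euclidean midpoint has a larger determinant than either endpoint (swelling), while the log-Euclidean midpoint keeps the determinant but has lower FA and MD than the endpoints, which is consistent with the behaviour the abstract describes for log-Euclidean interpolation.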