Arrow Research search

Author name cluster

Fei He

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers
1 author row

Possible papers (6)

JBHI Journal 2024 Journal Article

pathCLIP: Detection of Genes and Gene Relations From Biological Pathway Figures Through Image-Text Contrastive Learning

  • Fei He
  • Kai Liu
  • Zhiyuan Yang
  • Yibo Chen
  • Richard D. Hammer
  • Dong Xu
  • Mihail Popescu

In biomedical literature, biological pathways are commonly described through a combination of images and text. These pathways contain valuable information, including genes and their relationships, which provide insight into biological mechanisms and precision medicine. Curating pathway information across the literature enables the integration of this information to build a comprehensive knowledge base. While some studies have extracted pathway information from images and text independently, they often overlook the correspondence between the two modalities. In this paper, we present a pathway figure curation system named pathCLIP for identifying genes and gene relations from pathway figures. Our key innovation is the use of an image-text contrastive learning model to learn coordinated embeddings of image snippets and text descriptions of genes and gene relations, thereby improving curation. Our validation results, using pathway figures from PubMed, showed that our multimodal model outperforms models using only a single modality. Additionally, our system effectively curates genes and gene relations from multiple literature sources. Two case studies on extracting pathway information from the literature on non-small cell lung cancer and Alzheimer's disease further demonstrate the usefulness of our curated pathway information in enhancing related pathways in the KEGG database.
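The image-text contrastive learning the abstract mentions is commonly realised as a symmetric InfoNCE objective over paired embeddings. The sketch below is illustrative only: pathCLIP's actual architecture and loss details are in the paper, and `clip_style_loss` is a hypothetical name for the generic CLIP-style objective.

```python
import numpy as np

def clip_style_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings.

    Illustrative sketch of the generic CLIP-style objective; matching
    image/text pairs sit on the diagonal of the similarity matrix.
    """
    # L2-normalise so the dot product is cosine similarity
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature       # (N, N) similarity matrix
    labels = np.arange(len(logits))          # matching pairs on the diagonal

    def cross_entropy(l, y):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[np.arange(len(y)), y].mean()

    # average the image-to-text and text-to-image directions
    return 0.5 * (cross_entropy(logits, labels) + cross_entropy(logits.T, labels))
```

Minimising this loss pulls each image snippet toward its paired text description and pushes it away from the other descriptions in the batch, which is what yields the coordinated embeddings described above.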

JBHI Journal 2022 Journal Article

Characterising Alzheimer’s Disease With EEG-Based Energy Landscape Analysis

  • Dominik Klepl
  • Fei He
  • Min Wu
  • Matteo De Marco
  • Daniel J. Blackburn
  • Ptolemaios G. Sarrigiannis

Alzheimer’s disease (AD) is one of the most common neurodegenerative diseases, with around 50 million patients worldwide. Accessible and non-invasive methods of diagnosing and characterising AD are therefore urgently required. Electroencephalography (EEG) fulfils these criteria and is often used when studying AD. Several features derived from EEG were shown to predict AD with high accuracy, e.g., signal complexity and synchronisation. However, the dynamics of how the brain transitions between stable states have not been properly studied in the case of AD and EEG. Energy landscape analysis is a method that can be used to quantify these dynamics. This work presents the first application of this method to both AD and EEG. Energy landscape analysis assigns an energy value to each possible state, i.e., pattern of activations across brain regions. The energy is inversely proportional to the probability of occurrence. By studying the features of energy landscapes of 20 AD patients and 20 age-matched healthy counterparts (HC), significant differences are found. The dynamics of AD patients’ EEG are shown to be more constrained, with more local minima, less variation in basin size, and smaller basins. We show that energy landscapes can predict AD with high accuracy, performing significantly better than baseline models. Moreover, these findings are replicated in a separate dataset including 9 AD and 10 HC above 70 years old.
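The core idea of an energy landscape, as described above, is that each binarised activation pattern gets an energy inversely related to how likely it is. A minimal sketch, assuming empirical frequencies stand in for the fitted model (the paper fits a pairwise maximum-entropy/Ising model rather than raw counts, and `energy_landscape` is a hypothetical helper name):

```python
import numpy as np
from collections import Counter

def energy_landscape(binary_states):
    """Map each observed activation pattern to an energy value.

    Energy is the negative log of the empirical probability of the
    pattern, so frequently visited states sit in low-energy basins.
    Simplified illustration; the paper fits a pairwise maximum-entropy
    model instead of using raw frequencies.
    """
    patterns = [tuple(row) for row in binary_states]
    counts = Counter(patterns)
    total = len(patterns)
    return {p: -np.log(c / total) for p, c in counts.items()}
```

Features of the resulting landscape, such as the number of local minima and the sizes of their basins, are what the study compares between AD patients and healthy controls.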

NeurIPS Conference 2022 Conference Paper

InsPro: Propagating Instance Query and Proposal for Online Video Instance Segmentation

  • Fei He
  • Haoyang Zhang
  • Naiyu Gao
  • Jian Jia
  • Yanhu Shan
  • Xin Zhao
  • Kaiqi Huang

Video instance segmentation (VIS) aims at segmenting and tracking objects in videos. Prior methods typically generate frame-level or clip-level object instances first and then associate them by either additional tracking heads or complex instance matching algorithms. This explicit instance association approach increases system complexity and fails to fully exploit temporal cues in videos. In this paper, we design a simple, fast and yet effective query-based framework for online VIS. Relying on an instance query and proposal propagation mechanism with several specially developed components, this framework can perform accurate instance association implicitly. Specifically, we generate frame-level object instances based on a set of instance query-proposal pairs propagated from previous frames. This instance query-proposal pair is learned to bind with one specific object across frames through conscientiously developed strategies. When using such a pair to predict an object instance on the current frame, not only is the generated instance automatically associated with its precursors on previous frames, but the model also gets a good prior for predicting the same object. In this way, we naturally achieve implicit instance association in parallel with segmentation and elegantly take advantage of temporal clues in videos. To show the effectiveness of our method InsPro, we evaluate it on two popular VIS benchmarks, i.e., YouTube-VIS 2019 and YouTube-VIS 2021. Without bells and whistles, our InsPro with ResNet-50 backbone achieves 43.2 AP and 37.6 AP on these two benchmarks respectively, outperforming all other online VIS methods.

AAAI Conference 2022 Conference Paper

Learning Disentangled Attribute Representations for Robust Pedestrian Attribute Recognition

  • Jian Jia
  • Naiyu Gao
  • Fei He
  • Xiaotang Chen
  • Kaiqi Huang

Although various methods have been proposed for pedestrian attribute recognition, most studies follow the same feature learning mechanism, i.e., learning a shared pedestrian image feature to classify multiple attributes. However, this mechanism leads to low-confidence predictions and non-robustness of the model in the inference stage. In this paper, we investigate why this is the case. We mathematically discover that the central cause is that the optimal shared feature cannot maintain high similarities with multiple classifiers simultaneously in the context of minimizing classification loss. In addition, this feature learning mechanism ignores the spatial and semantic distinctions between different attributes. To address these limitations, we propose a novel disentangled attribute feature learning (DAFL) framework to learn a disentangled feature for each attribute, which exploits the semantic and spatial characteristics of attributes. The framework mainly consists of learnable semantic queries, a cascaded semantic-spatial cross-attention (SSCA) module, and a group attention merging (GAM) module. Specifically, based on learnable semantic queries, the cascaded SSCA module iteratively enhances the spatial localization of attribute-related regions and aggregates region features into multiple disentangled attribute features, used for classification and updating learnable semantic queries. The GAM module splits attributes into groups based on spatial distribution and utilizes reliable group attention to supervise query attention maps. Experiments on PETA, RAPv1, PA100k, and RAPv2 show that the proposed method performs favorably against state-of-the-art methods.

AAAI Conference 2022 Conference Paper

QueryProp: Object Query Propagation for High-Performance Video Object Detection

  • Fei He
  • Naiyu Gao
  • Jian Jia
  • Xin Zhao
  • Kaiqi Huang

Video object detection has been an important yet challenging topic in computer vision. Traditional methods mainly focus on designing the image-level or box-level feature propagation strategies to exploit temporal information. This paper argues that with a more effective and efficient feature propagation framework, video object detectors can gain improvement in terms of both accuracy and speed. For this purpose, this paper studies object-level feature propagation, and proposes an object query propagation (QueryProp) framework for high-performance video object detection. The proposed QueryProp contains two propagation strategies: 1) query propagation is performed from sparse key frames to dense non-key frames to reduce the redundant computation on non-key frames; 2) query propagation is performed from previous key frames to the current key frame to improve feature representation by temporal context modeling. To further facilitate query propagation, an adaptive propagation gate is designed to achieve flexible key frame selection. We conduct extensive experiments on the ImageNet VID dataset. QueryProp achieves comparable accuracy with state-of-the-art methods and strikes a decent accuracy/speed trade-off.
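The adaptive key-frame selection idea in the abstract can be caricatured as a gate that promotes a frame to key frame once its features drift far enough from the last key frame. This is a toy illustration only: the function name, cosine criterion, and threshold below are assumptions, not the paper's learned gate design.

```python
import numpy as np

def propagation_gate(prev_key_feat, cur_feat, threshold=0.9):
    """Toy adaptive key-frame gate (hypothetical, not the paper's design).

    While the current frame's features stay similar to the last key
    frame's, queries keep being propagated; once cosine similarity
    drops below the threshold, the frame is selected as a new key frame.
    """
    a = prev_key_feat / np.linalg.norm(prev_key_feat)
    b = cur_feat / np.linalg.norm(cur_feat)
    similarity = float(a @ b)
    # Low similarity => scene changed => promote to key frame
    return similarity < threshold
```

In QueryProp the gate is learned end-to-end rather than thresholded by hand, but the role is the same: spend full computation only on frames whose content has changed.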

AAAI Conference 2020 Conference Paper

Temporal Context Enhanced Feature Aggregation for Video Object Detection

  • Fei He
  • Naiyu Gao
  • Qiaozhe Li
  • Senyao Du
  • Xin Zhao
  • Kaiqi Huang

Video object detection is a challenging task because of the presence of appearance deterioration in certain video frames. One typical solution is to aggregate neighboring features to enhance per-frame appearance features. However, such a method ignores the temporal relations between the aggregated frames, which are critical for improving video recognition accuracy. To handle the appearance deterioration problem, this paper proposes a temporal context enhanced network (TCENet) to exploit temporal context information by temporal aggregation for video object detection. To handle the displacement of the objects in videos, a novel DeformAlign module is proposed to align the spatial features from frame to frame. Instead of adopting a fixed-length window fusion strategy, a temporal stride predictor is proposed to adaptively select video frames for aggregation, which facilitates exploiting variable temporal information and requiring fewer video frames for aggregation to achieve better results. Our TCENet achieves state-of-the-art performance on the ImageNet VID dataset and has a faster runtime. Without bells and whistles, our TCENet achieves 80.3% mAP by only aggregating 3 frames.