JBHI Journal 2026 Journal Article
ECG-AuxNet: A Dual-Branch Spatial-Temporal Feature Fusion Framework with Auxiliary Learning for Enhanced Cardiac Disease Diagnosis
- Ruiqi Shen
- Yanan Wang
- Chunge Cao
- Shuaicong Hu
- Jia Liu
- Hongyu Wang
- Gaoyan Zhong
- Cuiwei Yang
Objective: Multiple limitations exist in current automated ECG analysis, including insufficient feature integration across leads, limited interpretability, poor generalization, and inadequate handling of class imbalance. To address these challenges, we develop a novel dual-branch framework that comprehensively captures spatial-temporal features for cardiac disease diagnosis. Methods: ECG-AuxNet combines a Multi-scale Transformer Attention CNN for spatial feature extraction and a GRU network for temporal dependency modeling. A Dual-stage Cross-Attention Fusion module integrates features from both branches, while a Feature Space Reconstruction (FSR) auxiliary task is introduced as a manifold regularizer to enhance feature discrimination. The framework was evaluated on PTB-XL (15,709 ECGs) and validated in real-world clinical scenarios (SXMU-2k, 1,673 ECGs). Results: For class-imbalanced disease recognition (NORM, CD, MI, STTC), ECG-AuxNet attained 78.34% F1-score on PTB-XL and 82.63% F1-score on SXMU-2k, outperforming 9 baseline models. FSR significantly improved feature discrimination by 11.7%, enhancing class boundary clarity and classification accuracy. Grad-CAM analysis revealed attention patterns that precisely match cardiologists' diagnostic focus areas. Conclusion: ECG-AuxNet effectively integrates spatial-temporal features through auxiliary learning, achieving robust generalizability in cardiac disease diagnosis with interpretability aligned with clinical expertise.
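Because the results are reported as F1-scores on a class-imbalanced task, the following minimal sketch illustrates how an F1-score averaged over the four classes can diverge from plain accuracy. It rests on assumptions the abstract does not confirm: the averaging is taken to be macro (unweighted mean of per-class F1), each ECG is treated as having a single label, and the function names `per_class_f1` and `macro_f1` as well as the toy labels are hypothetical and are not data from the paper.

```python
from collections import Counter

# Class names as listed in the abstract.
CLASSES = ["NORM", "CD", "MI", "STTC"]


def per_class_f1(y_true, y_pred, classes=CLASSES):
    """Return {class: F1} for single-label predictions (assumed setting)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = {}
    for c in classes:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the class never appears.
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores[c] = 2 * tp[c] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of per-class F1 (macro averaging is an assumption here)."""
    scores = per_class_f1(y_true, y_pred, classes)
    return sum(scores.values()) / len(classes)


# Toy, made-up labels: NORM dominates, the minority classes have 1-2 samples each.
y_true = ["NORM"] * 6 + ["CD", "CD", "MI", "STTC"]
y_pred = ["NORM"] * 6 + ["CD", "NORM", "MI", "MI"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

In this toy case eight of ten predictions are correct, yet the missed STTC sample contributes a per-class F1 of zero and pulls the macro average well below the accuracy, which is why a class-averaged F1 is a stricter summary than accuracy when classes are imbalanced.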