AAAI 2025
Accurate Nucleic Acid-Binding Residue Identification Based Domain-Adaptive Protein Language Model and Explainable Geometric Deep Learning
Abstract
Protein-nucleic acid interactions play a fundamental role in a wide range of life activities. Accurate identification of nucleic acid-binding residues helps to elucidate the intrinsic mechanisms of these interactions. However, the accuracy and interpretability of existing computational methods for recognizing nucleic acid-binding residues need further improvement. Here, we propose a novel method called GeSite, based on a domain-adaptive protein language model and an E(3)-equivariant graph neural network. Prediction results across multiple benchmark test sets demonstrate that GeSite is superior or comparable to state-of-the-art prediction methods. The Matthews correlation coefficient (MCC) values of GeSite are 0.522 and 0.326 on one DNA-binding residue test set and one RNA-binding residue test set, which are 0.57 and 38.14% higher than those of the second-best method, respectively. Detailed experimental results suggest that the advanced performance of GeSite stems from the well-designed nucleic acid-binding protein adaptive language model. Additionally, interpretability analysis reveals the prediction model's perception of various remote and close functional domains, which is the source of its discernment ability.
Authors
Keywords
No keywords are indexed for this paper.
Context
- Venue: AAAI Conference on Artificial Intelligence
- Archive span: 1980-2026
- Indexed papers: 28718
- Paper id: 690427696338251134