TIST Journal 2026 Journal Article
REMEND: Neural Decompilation for Reverse Engineering Math Equations from Binary Executables
- Meet Udeshi
- Prashanth Krishnamurthy
- Ramesh Karri
- Farshad Khorrami
Analysis of binary executables implementing mathematical equations can benefit from the reverse engineering of semantic information about the implementation. Traditional algorithmic reverse engineering tools either do not recover semantic information or rely on dynamic analysis and symbolic execution with high reverse engineering time. Algorithmic tools also require significant re-engineering effort to target new platforms and languages. Recently, neural methods for decompilation have been developed to recover human-like source code, but they do not extract semantic information explicitly. We develop REMEND, a neural decompilation framework to reverse engineer math equations from binaries to explicitly recover program semantics like dataflow and order of operations. REMEND combines a transformer encoder–decoder model for neural decompilation with algorithmic processing for enhanced symbolic reasoning necessary for math equations. REMEND is the first work to demonstrate that transformers for neural decompilation go beyond source code and reason about program semantics in the form of math equations. We train on a synthetically generated dataset containing multiple implementations and compilations of math equations to produce a robust neural decompilation model and demonstrate retargettability. REMEND obtains an accuracy of 89.8% to 92.4% across three Instruction Set Architectures (ISAs), three optimization levels, and two programming languages with a single trained model, extending the capability of state-of-the-art neural decompilers. We achieve high accuracy with a small model of up to 12 million parameters and an average execution time of 0.132 seconds per function. On a real-world dataset collected from open source programs, REMEND generalizes better than state-of-the-art neural decompilers despite being trained with synthetic data, achieving 8% higher accuracy. The synthetic and real-world datasets are provided at https://hf.co/udiboy1209/REMEND.