Retdec is an LLVM-based machine code decompiler and static binary analysis tool designed for binary reverse engineering. It translates binary executable code into high-level representations to facilitate the reconstruction of program logic from compiled machine code. The system utilizes a retargetable frontend architecture and a multi-stage lifting pipeline to convert raw bytes into a common intermediate language. It differentiates custom program logic from known library code through signature-based identification and provides utilities for binary symbol demangling to restore human-readable n
Angr is a binary analysis framework and static analysis tool used for reverse engineering compiled binaries. It serves as a binary decompiler and a lifting platform that translates machine code into a common intermediate representation to enable cross-architecture analysis. The framework integrates a symbolic execution engine and constraint solvers to determine the inputs required to reach specific program states. It also employs untrusted code sandboxing to isolate guest code from the host environment during analysis. Its capabilities cover control flow and data flow analysis, including the
Detect-It-Easy is a binary file identifier and analysis toolkit designed to determine file formats, compilers, and packers. It functions as a binary file identifier that utilizes signature matching and heuristic analysis to identify executable and archive formats. The project includes a custom file signature engine and a scriptable rule system for defining and applying detection logic to identify specific binary patterns. It features specialized detectors for Android packages, such as APK and DEX files, and a malware packer detector to identify protections, obfuscators, and virus families. T
LLM4Decompile is a toolset and framework for binary-to-source code translation. It uses large language models to transform machine code into readable source code and recover the original logic of compiled executables. The project includes a specialized pipeline for generating synthetic training datasets by converting source code into assembly pairs. It provides a fine-tuning framework to optimize deep learning models on these binary-to-source datasets, increasing the accuracy of code recovery. The system also features capabilities for refining decompiled pseudo-code. This process focuses on