decomp / doc

License: Unlicense
Design documents related to the decompilation pipeline.

Design Documents

This repository contains design documents related to the decompilation pipeline of decomp/decomp.

Architecture of the Decompilation Pipeline

  1. paper Compositional Decompilation using LLVM IR (compositional_decompilation.pdf)
    • An overview of the compositional architecture used in the decomp/decomp decompilation pipeline.
  2. slides Using LLVM for Decompilation and Binary Analysis (intro.slide)
    • Introductory talk providing an overview of the decomp/decomp decompilation pipeline (presented at LLVM Sweden Socials in Aug 2017).

Control Flow Recovery

  1. paper Evaluation of Methods for Effective Control Flow Recovery (control_flow_recovery.pdf)
    • An evaluation of control flow recovery methods, outlining key ideas, showcasing strengths, and providing insight into failure modes.
  2. slides Evaluation of Methods for Effective Control Flow Recovery (cfa_presentation.pdf)
    • A high-level presentation of the key ideas, strengths, and failure modes of different control flow recovery methods.

Type Recovery

  1. paper Type Analysis of Low-level Code (type_analysis.pdf)
    • A meta-study of type recovery methods used during binary lifting.

Auxiliary material

Poster

Poster summarizing the current capabilities of the decompilation pipeline.

Poster: Compositional Decompilation

Abstract

Abstract of the compositional_decompilation.pdf report; presented here to make it indexable by search engines.

Decompilation or reverse compilation is the process of translating low-level machine-readable code into high-level human-readable code. The problem is non-trivial due to the amount of information lost during compilation, but it can be divided into several smaller problems which may be solved independently. This report explores the feasibility of composing a decompilation pipeline from independent components, and the potential of exposing those components to the end-user. The components of the decompilation pipeline are conceptually grouped into three modules. Firstly, the front-end translates a source language (e.g. x86 assembly) into LLVM IR, a platform-independent low-level intermediate representation. Secondly, the middle-end structures the LLVM IR by identifying high-level control flow primitives (e.g. pre-test loops, 2-way conditionals). Lastly, the back-end translates the structured LLVM IR into a high-level target programming language (e.g. Go). The control flow analysis stage of the middle-end uses subgraph isomorphism search algorithms to locate control flow primitives in CFGs, both of which are described using Graphviz DOT files.

The decompilation pipeline has been proven capable of recovering nested pre-test and post-test loops (e.g. while, do-while), and 1-way and 2-way conditionals (e.g. if, if-else) from LLVM IR. Furthermore, the data-driven design of the control flow analysis stage facilitates extensions to identify new control flow primitives. There is huge potential for future development. The Go output could be made more idiomatic by extending the post-processing stage, using components such as Grind by Russ Cox, which moves variable declarations closer to their usage. The language-agnostic aspects of the design will be validated by implementing components in other languages, e.g. data flow analysis in Haskell. Additional back-ends (e.g. Python output) will be implemented to verify that the general decompilation tasks (e.g. control flow analysis, data flow analysis) are handled by the middle-end.
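
To make the control flow analysis idea from the abstract more concrete, the following is a minimal sketch of a subgraph isomorphism search that locates a 2-way conditional (if-else) in a small CFG. It is an illustration only and not the decomp/decomp implementation: the names Graph, FindPrimitive and consistent are hypothetical, the graphs are written as Go maps instead of being parsed from Graphviz DOT files, and the search is a naive backtracking match on induced subgraphs. A real primitive matcher would presumably also need to constrain edges that enter or leave the interior nodes of the match, which this sketch does not do.

```go
package main

import (
	"fmt"
	"sort"
)

// Graph is a directed graph stored as adjacency lists keyed by node name.
// (Hypothetical type for this sketch; the pipeline itself describes graphs in DOT.)
type Graph map[string][]string

// hasEdge reports whether g contains the edge from -> to.
func (g Graph) hasEdge(from, to string) bool {
	for _, succ := range g[from] {
		if succ == to {
			return true
		}
	}
	return false
}

// nodes returns every node name of g (edge sources and targets) in sorted order.
func (g Graph) nodes() []string {
	seen := map[string]bool{}
	for from, succs := range g {
		seen[from] = true
		for _, to := range succs {
			seen[to] = true
		}
	}
	names := make([]string, 0, len(seen))
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// consistent reports whether mapping pattern node p to CFG node c preserves
// both edges and non-edges with respect to every pattern node mapped so far.
func consistent(pattern, cfg Graph, mapping map[string]string, p, c string) bool {
	if pattern.hasEdge(p, p) != cfg.hasEdge(c, c) {
		return false
	}
	for q, d := range mapping {
		if pattern.hasEdge(p, q) != cfg.hasEdge(c, d) ||
			pattern.hasEdge(q, p) != cfg.hasEdge(d, c) {
			return false
		}
	}
	return true
}

// FindPrimitive searches cfg for an induced subgraph isomorphic to pattern.
// It returns a map from pattern node to CFG node, or nil if there is no match.
func FindPrimitive(pattern, cfg Graph) map[string]string {
	pnodes, cnodes := pattern.nodes(), cfg.nodes()
	mapping := map[string]string{}
	used := map[string]bool{}
	var search func(i int) bool
	search = func(i int) bool {
		if i == len(pnodes) {
			return true // every pattern node has been assigned
		}
		p := pnodes[i]
		for _, c := range cnodes {
			if used[c] || !consistent(pattern, cfg, mapping, p, c) {
				continue
			}
			mapping[p], used[c] = c, true
			if search(i + 1) {
				return true
			}
			delete(mapping, p) // backtrack and try the next candidate
			used[c] = false
		}
		return false
	}
	if search(0) {
		return mapping
	}
	return nil
}

func main() {
	// Pattern for a 2-way conditional: cond branches to two bodies that meet at exit.
	ifElse := Graph{
		"cond":       {"body_true", "body_false"},
		"body_true":  {"exit"},
		"body_false": {"exit"},
	}
	// Example CFG containing one if-else between an entry block and a return block.
	cfg := Graph{
		"entry": {"check"},
		"check": {"then", "else"},
		"then":  {"join"},
		"else":  {"join"},
		"join":  {"ret"},
	}
	match := FindPrimitive(ifElse, cfg)
	for _, p := range ifElse.nodes() {
		fmt.Printf("%s -> %s\n", p, match[p])
	}
}
```

Because the pattern is plain data, a new primitive (for instance a pre-test loop) could be tried by passing a different Graph value to the same search, which mirrors the data-driven design mentioned in the abstract. The exhaustive backtracking is only reasonable here because control flow primitives have a handful of nodes.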
