shramos / Pyc Cfg

Licence: gpl-2.0
Pyc-cfg is a pure Python control flow graph builder for almost all of the ANSI C programming language.

pyc-cfg

Pyc-cfg is a pure Python control flow graph (CFG) builder for almost all of the ANSI C programming language. It builds the CFG from the abstract syntax tree (AST) generated by Clang, which it accesses through the Python bindings to libclang. I started this project back in 2015 as a way to learn more about compilers and static code analysis while studying computer engineering. The code is currently being improved to be more Pythonic and to comply better with style rules. Some complex C constructs have not yet been implemented, but the tool otherwise works adequately. I will be more than grateful to respond to any error report or problem you may encounter.

What is a CFG?

In a few words, quoting Wikipedia:

A control flow graph (CFG) in computer science is a representation, using graph notation, of all paths that might be traversed through a program during its execution.

A more accurate and extensive definition can be found in the Dragon Book, which says the following:

A Flow Graph is a graph representation of intermediate code. The representation is constructed as follows:

  1. Partition the intermediate code into basic blocks, which are maximal sequences of consecutive three-address instructions with the properties that:
    a) The flow of control can only enter the basic block through the first instruction in the block. That is, there are no jumps into the middle of the block.
    b) Control will leave the block without halting or branching, except possibly at the last instruction in the block.
  2. The basic blocks become the nodes of a flow graph, whose edges indicate which blocks can follow which other blocks.

A control flow graph can be used for many things, from generating source code to building a static source code analyzer.
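
The Dragon Book partitioning quoted above can be sketched on a toy instruction list. This is only an illustration of the textbook "leaders" technique, not how pyc-cfg works internally (pyc-cfg walks the Clang AST). The instruction format, the opcode names (`assign`, `if`, `goto`, `ret`) and the function name `build_flow_graph` are all assumptions made up for this example.

```python
def build_flow_graph(code):
    """Split toy three-address code into basic blocks and connect them.

    `code` is a list of (op, arg) tuples. For "goto" and "if", `arg` is the
    index of the jump target; for other ops it is ignored (hypothetical format).
    Returns (blocks, edges): blocks are lists of instruction indices, edges
    maps a block number to the block numbers that can follow it.
    """
    if not code:
        return [], {}
    n = len(code)

    # A leader is the first instruction, any jump target, and any
    # instruction that immediately follows a jump.
    leaders = {0}
    for i, (op, arg) in enumerate(code):
        if op in ("goto", "if"):
            leaders.add(arg)
            if i + 1 < n:
                leaders.add(i + 1)

    # Each block runs from one leader up to (not including) the next.
    starts = sorted(leaders)
    blocks = [list(range(s, e)) for s, e in zip(starts, starts[1:] + [n])]
    block_of = {s: b for b, s in enumerate(starts)}

    edges = {b: [] for b in range(len(blocks))}
    for b, block in enumerate(blocks):
        last = block[-1]
        op, arg = code[last]
        if op in ("goto", "if"):
            edges[b].append(block_of[arg])  # edge to the jump target
        falls_through = op not in ("goto", "ret") and last + 1 < n
        if falls_through and b + 1 not in edges[b]:
            edges[b].append(b + 1)  # edge to the next block in sequence
    return blocks, edges


# A small counting loop written in the toy format.
code = [
    ("assign", "i = 0"),      # 0
    ("if", 4),                # 1: if i >= 10 goto 4
    ("assign", "i = i + 1"),  # 2
    ("goto", 1),              # 3
    ("ret", None),            # 4
]
blocks, edges = build_flow_graph(code)
print(blocks)
print(edges)
```

Instructions 2 and 3 end up in the same block because control can only enter at instruction 2 and can only leave at the jump in instruction 3, which matches properties (a) and (b) of the definition.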

What are libclang and the python bindings?

As its documentation says:

Libclang: The C Interface to Clang provides a relatively small API that exposes facilities for parsing source code into an abstract syntax tree (AST), loading already-parsed ASTs, traversing the AST, associating physical source locations with elements within the AST, and other facilities that support Clang-based development tools.

And the Python bindings, as described in their source code:

This module provides an interface to the Clang indexing library. It is a low-level interface to the indexing library which attempts to match the Clang API directly while also being "pythonic".
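
To give an idea of what working with the bindings looks like, here is a minimal sketch that parses a C file and lists the AST nodes libclang reports. It uses only `Index.create()`, `Index.parse()`, and the cursor attributes `kind`, `spelling` and `get_children()` from `clang.cindex`. The helper names `dump_ast` and `parse_c_file` are hypothetical and are not part of pyc-cfg.

```python
def dump_ast(cursor, depth=0, lines=None):
    """Return one indented text line per AST node, depth first.

    `cursor` is expected to behave like a clang.cindex.Cursor: it needs
    `kind.name`, `spelling` and `get_children()`.
    """
    if lines is None:
        lines = []
    lines.append("%s%s %s" % ("  " * depth, cursor.kind.name, cursor.spelling))
    for child in cursor.get_children():
        dump_ast(child, depth + 1, lines)
    return lines


def parse_c_file(path):
    """Parse `path` with libclang and return the root cursor of its AST."""
    # Imported here so dump_ast can be used without libclang being present.
    from clang.cindex import Index

    index = Index.create()
    translation_unit = index.parse(path)
    return translation_unit.cursor
```

With libclang configured, `print("\n".join(dump_ast(parse_c_file("example.c"))))` prints the tree for a source file; the exact nodes listed depend on the Clang version and on the headers the file includes.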

Installation

Installation is relatively simple; the only step that can cause complications is locating libclang.

apt-get install clang
pip install -r requirements.txt

At this point you must locate libclang and create a symbolic link to it, as shown below. On Debian systems I have found libclang in two locations, depending on whether the system is x64 or x86:

  • x64
ln -s /usr/lib/x86_64-linux-gnu/libclang* libclang.so
  • x86
ln -s /usr/lib/llvm-4.0/lib/libclang* libclang.so

After this, you only need to tell the Python bindings where the new symbolic link to libclang is located. To do so, open the utils.py file and modify the following line, entering the path of the directory where you created the symbolic link:

Config.set_library_path('/usr/lib/llvm-4.0/lib') 
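
As a possible alternative to the symbolic link, the bindings also provide `Config.set_library_file()`, which takes the full path of the shared library. The sketch below searches a list of directories for a libclang file and passes the first match to it. The helper names, the default directory list (taken from the two locations mentioned above) and the file name patterns are assumptions; other systems will use different paths.

```python
import glob
import os

# Directories mentioned in the installation notes; adjust for your system.
DEFAULT_DIRS = ("/usr/lib/x86_64-linux-gnu", "/usr/lib/llvm-4.0/lib")

# File name patterns assumed for libclang, e.g. libclang.so or
# libclang-4.0.so.1. They deliberately skip libclang-cpp, a different library.
PATTERNS = ("libclang.so*", "libclang-[0-9]*.so*")


def find_libclang(search_dirs=DEFAULT_DIRS):
    """Return the path of the first libclang file found, or None."""
    for directory in search_dirs:
        for pattern in PATTERNS:
            matches = sorted(glob.glob(os.path.join(directory, pattern)))
            if matches:
                return matches[0]
    return None


def configure_libclang(search_dirs=DEFAULT_DIRS):
    """Point the Python bindings at the libclang file that was found."""
    from clang.cindex import Config

    path = find_libclang(search_dirs)
    if path is None:
        raise RuntimeError("libclang not found in: %s" % ", ".join(search_dirs))
    # Must run before the bindings load the library for the first time.
    Config.set_library_file(path)
    return path
```

Calling `configure_libclang()` once at program start, before any other use of `clang.cindex`, would take the place of editing the path by hand.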

Usage

  • example.c
#include <stdio.h>

int addNumbers(int a, int b);

int main()
{
  int n1 = 0;
  int n2 = 1;
  int sum;  
  sum = addNumbers(n1, n2);
  printf("sum = %d",sum);
  return 0;
}

int addNumbers(int a,int b)
{
    int result;
    result = a+b;
    return result;
}

  • cfg_builder.py
from utils import buildCFG

cfg = buildCFG('example.c', 'addNumbers')

# Parenthesized single-argument form prints the same text on Python 2 and 3
print("[+] Size of the CFG: " + str(cfg.size()))
print(cfg.printer())

Output:

[+] Size of the CFG: 2
[B0]
Preds (1): B1

[B1]
0: int result;
1: RefExpr: a
2: Implicit Cast
3: RefExpr: b
4: Implicit Cast
5:  a + b
6: RefExpr: r
7:  result = a + b
8: RefExpr: r
9: Implicit Cast
10:  return result
Preds (1): B2
Succs (1): B0

[B2]
Succs (1): B1

=>Entry: 2
<=Exit: 0

  • cfg_builder2.py
from utils import buildCFG

cfgs = buildCFG('example.c')

for cfg in cfgs:
    # Parenthesized single-argument form prints the same text on Python 2 and 3
    print("\n[+] Function:  " + cfg[0])
    print(cfg[1].printer())

Output:

[+] Function:  main
[B0]
Preds (1): B1

[B1]
0: value: 0
1:  return 0
Preds (1): B2
Succs (1): B0

[B2]
0: RefExpr: s
1:  sum = addNumbers ( n1 , n2 )
2: RefExpr: p
3: Implicit Cast
4: printf
Preds (1): B3
Succs (1): B1

[B3]
0: value: 0
1: int n1;
2: value: 1
3: int n2;
4: int sum;
5: RefExpr: a
6: Implicit Cast
7: addNumbers
Preds (1): B4
Succs (1): B2

[B4]
Succs (1): B3

=>Entry: 4
<=Exit: 0


[+] Function:  addNumbers
[B0]
Preds (1): B1

[B1]
0: int result;
1: RefExpr: a
2: Implicit Cast
3: RefExpr: b
4: Implicit Cast
5:  a + b
6: RefExpr: r
7:  result = a + b
8: RefExpr: r
9: Implicit Cast
10:  return result
Preds (1): B2
Succs (1): B0

[B2]
Succs (1): B1

=>Entry: 2
<=Exit: 0
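
If you want to process the graph rather than just read it, the text shown above can be turned back into adjacency lists. The following is a rough sketch that assumes `printer()` returns a string in exactly the format of these outputs (`[Bn]` headers, `Preds`/`Succs` lines, and the final `=>Entry`/`<=Exit` lines). The function name `parse_cfg_dump` is hypothetical and is not part of pyc-cfg.

```python
import re


def parse_cfg_dump(text):
    """Recover predecessor/successor lists from the printed CFG text."""
    preds, succs = {}, {}
    entry = exit_block = None
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        header = re.match(r"^\[(B\d+)\]$", line)
        if header:
            # Start of a new block such as "[B1]"
            current = header.group(1)
            preds[current], succs[current] = [], []
        elif line.startswith("Preds") and current:
            preds[current] = re.findall(r"B\d+", line.split(":", 1)[1])
        elif line.startswith("Succs") and current:
            succs[current] = re.findall(r"B\d+", line.split(":", 1)[1])
        elif line.startswith("=>Entry:"):
            entry = "B" + line.split(":", 1)[1].strip()
        elif line.startswith("<=Exit:"):
            exit_block = "B" + line.split(":", 1)[1].strip()
    return {"preds": preds, "succs": succs, "entry": entry, "exit": exit_block}
```

For example, `parse_cfg_dump(cfg.printer())["succs"]` would give a dictionary that maps each block name to the names of its successor blocks, which is enough to run a simple graph traversal over the CFG.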

Disclaimer

As I said before, I started this project back in 2015 as a way to learn about static source code analysis and compilers. When I wrote this code I was a beginner in Python, so both the architecture and the syntax can be improved a lot, and I will invest effort in improving them. You may also notice that I have included the main file of the Clang Python bindings in the project (clang/cindex.py). This is because Clang is in constant development, and a different version of the Python bindings may not be compatible. Regardless, you can always delete that folder and use the Python bindings without making any changes to the code by installing the package with pip install clang.
