bdcht / ccrawl

Licence: GPL-3.0
clang-based search engine for C/C++ data structures, classes, prototypes & macros

Ccrawl

Status: Under Development
Location: https://github.com/bdcht/ccrawl
Version: 1.x
Doc: http://ccrawl.readthedocs.io/en/latest/index.html

Description

Ccrawl uses clang to build a database of C/C++ data structures (struct, union, class, enum, typedef, prototypes and macros). Querying this database for specific properties, including properties of the struct/class memory layout, makes it possible to identify data types and constants/macros.

For example, it can answer queries such as:

  • "find all structures that have a pointer to char at offset 8 and an unsigned integer at offset 56", or
  • "find types with a total size of 96 bytes", or
  • "find every macro that defines the value 0x1234", or
  • "find the mask of values from enum X that corresponds to 0xabcd", or
  • "find all functions that return 'size_t' and have 'struct X' as first argument".
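To make these queries concrete, here is a minimal toy model in Python. It is only a sketch: the `TYPES` records, the `FLAGS` enum and the three helper functions are hypothetical and do not reflect Ccrawl's actual database schema or API.

```python
# Hypothetical in-memory records; Ccrawl's real database schema may differ.
TYPES = [
    {"id": "struct A", "size": 96,
     "fields": {0: "int", 8: "char *", 56: "unsigned int"}},
    {"id": "struct B", "size": 64,
     "fields": {0: "char *", 8: "char *"}},
    {"id": "struct C", "size": 96,
     "fields": {0: "long", 8: "void *"}},
]

# Hypothetical enum members (name -> value).
FLAGS = {"READ": 0x1, "WRITE": 0x2, "EXEC": 0x4}


def find_by_size(types, size):
    """Identifiers of all types whose total size equals `size`."""
    return [t["id"] for t in types if t["size"] == size]


def find_by_layout(types, constraints):
    """Identifiers of all types that have the given type at each given offset.

    `constraints` maps a byte offset to the expected field type.
    """
    return [
        t["id"]
        for t in types
        if all(t["fields"].get(off) == ty for off, ty in constraints.items())
    ]


def enum_mask(members, value):
    """Names of the members whose OR-combination equals `value`, else None."""
    names = [n for n, v in members.items() if v and (v & value) == v]
    covered = 0
    for n in names:
        covered |= members[n]
    return names if covered == value else None


print(find_by_size(TYPES, 96))
print(find_by_layout(TYPES, {8: "char *", 56: "unsigned int"}))
print(enum_mask(FLAGS, 0x5))
```

A real database would index such properties rather than scan every record, but the predicates have the same shape: a constraint on total size, on (offset, type) pairs, or on the bits of a constant.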

Ccrawl can then output the structures it finds in several formats: C/C++ of course, but also ctypes or amoco. The ctypes output of a C++ class corresponds to the layout of an instance (object) in memory, including all virtual table pointers (or VTT) that result from possibly multiple, possibly virtual, parent classes.

Finally, Ccrawl can compute various statistics about a library API, as well as the dependency graph of all collected types.

User documentation and API can be found at http://ccrawl.readthedocs.io/en/latest/index.html

Examples

Consider the following C struct from the file samples/simple.h:

struct S {
  char c;
  int  n;
  union {
    unsigned char x[2];
    unsigned short s;
  } u;
  char (*PtrCharArrayOf3[2])[3];
  void (*pfunc)(int, int);
};

First, collect the structure definition in a local database:

$ ccrawl -b None -l test.db -g 'test0' collect samples/simple.h
[100%] simple.h                                                [  2]
--------------------------------------------------------------------
saving database...                                            [   2]

Then, it is possible to translate the full structure into ctypes:

$ ccrawl -b None -l test.db show -r -f ctypes 'struct S'
struct_S = type('struct_S',(Structure,),{})
union_b0eccf67 = type('union_b0eccf67',(Union,),{})
union_b0eccf67._fields_ = [("x", c_ubyte*2),
                           ("s", c_ushort)]

struct_S._anonymous_ = ("u",)
struct_S._fields_ = [("c", c_byte),
                     ("n", c_int),
                     ("u", union_b0eccf67),
                     ("PtrCharArrayOf3", POINTER(c_byte*3)*2),
                     ("pfunc", POINTER(CFUNCTYPE(None, c_int, c_int)))]

Or simply to compute the field offsets:

$ ccrawl -b None -l test.db info 'struct S'
identifier: struct S
class     : cStruct
source    : simple.h
tag       : test0
size      : 40
offsets   : [(0, 1), (4, 4), (8, 2), (16, 16), (32, 8)]
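The reported size and offsets can be cross-checked in plain Python with the ctypes definitions of 'struct S' shown earlier. The sketch below assumes that each entry of the `offsets` list is an (offset, size) pair; the `layout` helper is hypothetical and not part of Ccrawl.

```python
from ctypes import (CFUNCTYPE, POINTER, Structure, Union, c_byte, c_int,
                    c_ubyte, c_ushort, sizeof)

# Definitions copied from the ctypes output of 'struct S'.
struct_S = type('struct_S', (Structure,), {})
union_b0eccf67 = type('union_b0eccf67', (Union,), {})
union_b0eccf67._fields_ = [("x", c_ubyte*2),
                           ("s", c_ushort)]

struct_S._anonymous_ = ("u",)
struct_S._fields_ = [("c", c_byte),
                     ("n", c_int),
                     ("u", union_b0eccf67),
                     ("PtrCharArrayOf3", POINTER(c_byte*3)*2),
                     ("pfunc", POINTER(CFUNCTYPE(None, c_int, c_int)))]


def layout(ctype):
    """Return [(offset, size), ...] for each field of a ctypes Structure."""
    return [(getattr(ctype, name).offset, getattr(ctype, name).size)
            for name, *_ in ctype._fields_]


print("size   :", sizeof(struct_S))
print("offsets:", layout(struct_S))
```

On a typical 64-bit platform with 8-byte pointers this should reproduce the size of 40 and the offsets list printed by the `info` command. On a platform with 4-byte pointers the last two entries would differ, because ctypes applies the native alignment rules of the machine it runs on.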

Now let's deal with a trickier C++ example:

$ ccrawl -b None -l test.db -g 'c++' collect -a samples/shahar.cpp
[100%] shahar.cpp                                              [ 18]
--------------------------------------------------------------------
saving database...                                            [  18]

We can show a full (recursive) definition of a class:

$ ccrawl -b None -l test.db show -r 'class Child'
class Grandparent {
  public:
    virtual void grandparent_foo();
    int grandparent_data;
};

class Parent1 : virtual public Grandparent {
  public:
    virtual void parent1_foo();
    int parent1_data;
};
class Parent2 : virtual public Grandparent {
  public:
    virtual void parent2_foo();
    int parent2_data;
};

class Child : public Parent1, public Parent2 {
  public:
    virtual void child_foo();
    int child_data;
};
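The recursive output lists every class after the classes it depends on. One way to obtain such an ordering from a dependency graph is a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the `DEPENDS_ON` mapping is read off the class definitions above, and `emission_order` is a hypothetical helper, not necessarily how Ccrawl orders its output.

```python
from graphlib import TopologicalSorter

# Each class maps to the classes it directly inherits from.
DEPENDS_ON = {
    "Child": ["Parent1", "Parent2"],
    "Parent1": ["Grandparent"],
    "Parent2": ["Grandparent"],
    "Grandparent": [],
}


def emission_order(deps):
    """Order the types so that each one comes after all its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = emission_order(DEPENDS_ON)
print(order)
```

Any valid ordering starts with Grandparent and ends with Child; the relative position of Parent1 and Parent2 is not constrained by the graph.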

And the ctypes memory layout of class Child:

$ ccrawl -b None -l test.db show -f ctypes 'class Child'
struct___layout$Child = type('struct___layout$Child',(Structure,),{})

struct___layout$Child._fields_ = [("__vptr$Parent1", c_void_p),
                                  ("parent1_data", c_int),
                                  ("__vptr$Parent2", c_void_p),
                                  ("parent2_data", c_int),
                                  ("child_data", c_int),
                                  ("__vptr$Grandparent", c_void_p),
                                  ("grandparent_data", c_int)]
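To inspect this layout from Python, the definition can be loaded with ctypes and its field offsets listed. Because `$` is not valid in a Python identifier, the sketch binds the class to the hypothetical name `layout_Child`; the field names are kept as emitted and accessed through `getattr`. The `field_offsets` helper is also hypothetical.

```python
from ctypes import Structure, c_int, c_void_p, sizeof

# `layout_Child` is a Python-friendly alias chosen for this sketch;
# the ctypes type name and field names are kept as emitted.
layout_Child = type('struct___layout$Child', (Structure,), {})

layout_Child._fields_ = [("__vptr$Parent1", c_void_p),
                         ("parent1_data", c_int),
                         ("__vptr$Parent2", c_void_p),
                         ("parent2_data", c_int),
                         ("child_data", c_int),
                         ("__vptr$Grandparent", c_void_p),
                         ("grandparent_data", c_int)]


def field_offsets(ctype):
    """Map each field name of a ctypes Structure to its byte offset."""
    return {name: getattr(ctype, name).offset for name, *_ in ctype._fields_}


offsets = field_offsets(layout_Child)
for name, off in offsets.items():
    print(f"{off:3d}  {name}")
print("total size:", sizeof(layout_Child))
```

The listing makes the placement of the three virtual table pointers visible: one at the start of each of the Parent1 and Parent2 sub-objects, and one for the shared virtual Grandparent base at the end of the object. The exact offsets depend on the pointer size of the machine running the script.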

See the documentation for more examples.

Todo

  • improve C++ support (namespaces, template formatters, external build in ctypes/amoco/Ghidra)
  • add web frontend
  • plugin for IDA Pro

Changelog

  • v1.7
    • optionally parse functions' bodies and update 'cFunc' descriptions with the parsed information
    • add sync command to update mongodb remote database from a rebuilt local database
    • improve Ghidra's interface to detect structures
    • add pointer size option to compute structure field offsets
    • fix: adjust enum size to its minimal needed size
    • fix: apply global tag filter to all queries to the ProxyDB
    • update to libclang-14
  • v1.6
    • add external interface to export types into Ghidra's data type manager
    • add find_matching_types to replicate Ghidra's "auto_struct" command
    • add database(s) cleanup methods
  • v1.5
    • update code for libclang-12 (using python3-clang)
    • update to tinydb v4.x
  • v1.4
    • update code for libclang-10 (using python3-clang)
    • improve bitfield support
  • v1.3
    • add Flask-based REST API and server command
    • support for mongodb database backend
    • support for local tinydb databases
    • c_type and cxx_type parsers for C/C++ types
    • support anonymous types in C structs/unions
    • support C++ multiple inheritance, including virtual parents
    • basic support for C++ class & function templates
    • support bitfield structures
    • support user-defined alignment policies