
jserv / Shecc

Licence: bsd-2-clause
A self-hosting and educational C compiler

Programming Languages

c
50402 projects - #5 most used programming language

Projects that are alternatives of or similar to Shecc

Lbforth
Self-hosting metacompiled Forth, bootstrapping from a few lines of C; targets Linux, Windows, ARM, RISC-V, 68000, PDP-11, asm.js.
Stars: ✭ 293 (+2.45%)
Mutual labels:  compiler, arm, risc-v, riscv
mdepx
MDEPX — A BSD-style RTOS
Stars: ✭ 17 (-94.06%)
Mutual labels:  arm, riscv, qemu, risc-v
Ataraxia
Simple and lightweight source-based multi-platform Linux distribution with musl libc.
Stars: ✭ 226 (-20.98%)
Mutual labels:  cross-compiler, arm, risc-v
how-to-qemu-arm-gdb-gtest
How to run, debug, and unit test ARM code on X86 ubuntu
Stars: ✭ 19 (-93.36%)
Mutual labels:  arm, qemu, cross-compiler
hero-sdk
⛔ DEPRECATED ⛔ HERO Software Development Kit
Stars: ✭ 21 (-92.66%)
Mutual labels:  riscv, armv7, risc-v
elfloader
ARMv7M ELF loader
Stars: ✭ 71 (-75.17%)
Mutual labels:  arm, armv7, elf
Maxine Vm
Maxine VM: A meta-circular research VM
Stars: ✭ 274 (-4.2%)
Mutual labels:  risc-v, riscv, armv7
Ppci
A compiler for ARM, X86, MSP430, xtensa and more implemented in pure Python
Stars: ✭ 210 (-26.57%)
Mutual labels:  compiler, arm, riscv
Amacc
Small C Compiler generating ELF executable Arm architecture, supporting JIT execution
Stars: ✭ 661 (+131.12%)
Mutual labels:  self-hosting, compiler, arm
rust-crosscompiler-arm
Docker images for Rust dedicated to cross compilation for ARM v6 and more
Stars: ✭ 48 (-83.22%)
Mutual labels:  arm, armv7, cross-compiler
gitlab-runner
GitLab Runner (Docker image) for ARM devices, this is a mirror repository of
Stars: ✭ 17 (-94.06%)
Mutual labels:  arm, armv7
android-openssl
OpenSSL build for Android (arm, armv7, x86)
Stars: ✭ 69 (-75.87%)
Mutual labels:  arm, armv7
ria-jit
Lightweight and performant dynamic binary translation for RISC–V code on x86–64
Stars: ✭ 38 (-86.71%)
Mutual labels:  qemu, risc-v
docker-nagios
Docker image for Nagios Core in Alpine Linux with basic plugins, available for x86, x64 , ARM v6, ARM v7 and ARM64.
Stars: ✭ 33 (-88.46%)
Mutual labels:  arm, armv7
Raspberry Pi Cross Compilers
Latest GCC Cross Compiler & Native (ARM & ARM64) CI generated precompiled standalone toolchains for all Raspberry Pis. 🍇
Stars: ✭ 261 (-8.74%)
Mutual labels:  cross-compiler, arm
arv
ARV: Asynchronous RISC-V Go High-level Functional Model
Stars: ✭ 18 (-93.71%)
Mutual labels:  riscv, risc-v
m3forth
m3forth is a forth cross-compiler for cortex-m3 ARM microcontrollers
Stars: ✭ 16 (-94.41%)
Mutual labels:  arm, qemu
nordvpn
NordVpn Docker Client
Stars: ✭ 475 (+66.08%)
Mutual labels:  arm, armv7
interp
Interpreter experiment. Testing dispatch methods: Switching, Direct/Indirect Threaded Code, Tail-Calls and Inlining
Stars: ✭ 32 (-88.81%)
Mutual labels:  arm, riscv
NMSIS
Nuclei Microcontroller Software Interface Standard Development Repo
Stars: ✭ 24 (-91.61%)
Mutual labels:  riscv, risc-v

shecc : self-hosting and educational C compiler


Introduction

shecc is built from scratch, targeting the 32-bit Arm and RISC-V architectures, as a self-compiling compiler for a subset of the C language.

Features

  • Generate executable Linux ELF binaries for ARMv7-A and RV32IM;
  • Provide a minimal C standard library for basic I/O on GNU/Linux;
  • The cross-compiler is written in ANSI C and should therefore run on most platforms;
  • Self-contained C language front-end and machine code generator;
  • Two-pass compilation: on the first pass it checks the syntax of statements and constructs a table of symbols, while on the second pass it actually translates program statements into Arm/RISC-V machine code.
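
The two-pass idea can be illustrated with a toy example. The sketch below is not taken from shecc: the statement encoding ("L:name" defines a label, "J:name" jumps to one, anything else is an ordinary one-slot instruction) and the names symtab, pass1, and pass2 are hypothetical. It only shows why a symbol table built on the first pass lets the second pass resolve forward references.

```c
#include <string.h>

#define MAX_SYMS 16

/* One symbol-table entry: a label name and the address it refers to. */
struct sym {
    const char *name;
    int addr;
};

static struct sym symtab[MAX_SYMS];
static int nsyms;

/* Pass 1: record the address of every label. Returns the number of
 * instruction slots, or -1 if the symbol table overflows. */
int pass1(const char **stmts, int n)
{
    int addr = 0;
    nsyms = 0;
    for (int i = 0; i < n; i++) {
        if (strncmp(stmts[i], "L:", 2) == 0) {
            if (nsyms == MAX_SYMS)
                return -1;
            symtab[nsyms].name = stmts[i] + 2;
            symtab[nsyms].addr = addr;
            nsyms++;
        } else {
            addr++; /* every non-label statement occupies one slot */
        }
    }
    return addr;
}

/* Look up a label recorded by pass1; returns -1 if it is undefined. */
static int lookup(const char *name)
{
    for (int i = 0; i < nsyms; i++)
        if (strcmp(symtab[i].name, name) == 0)
            return symtab[i].addr;
    return -1;
}

/* Pass 2: emit one int per instruction slot. A jump emits its resolved
 * target address; any other instruction emits -1 as a placeholder.
 * Returns the number of slots written, or -1 on an undefined label. */
int pass2(const char **stmts, int n, int *out)
{
    int addr = 0;
    for (int i = 0; i < n; i++) {
        if (strncmp(stmts[i], "L:", 2) == 0)
            continue; /* labels produce no output */
        if (strncmp(stmts[i], "J:", 2) == 0) {
            int target = lookup(stmts[i] + 2);
            if (target < 0)
                return -1;
            out[addr] = target;
        } else {
            out[addr] = -1;
        }
        addr++;
    }
    return addr;
}
```

Calling pass1 and then pass2 on the statements "J:end", "nop", "L:end", "ret" resolves the forward jump in slot 0 to address 2, which a single pass could not know when it first meets the jump.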

Compatibility

shecc is capable of compiling C source files written in the following syntax:

  • data types: char, int, struct, and pointer
  • control statements: if, while, for, switch, case, break, return, and general expressions
  • compound assignments: +=, -=, *=
  • global/local variable initializations for supported data types
    • e.g. int i = [expr]
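
As an illustration, the following sketch restricts itself to the constructs listed above. The names point, weight, and score are made up for this example, and the code was not checked against shecc itself, so treat it as standard C that stays within the documented subset rather than as a verified shecc test case.

```c
/* A struct with int members, accessed through a pointer. */
struct point {
    int x;
    int y;
};

/* switch/case/break with a default branch. */
int weight(int kind)
{
    int w = 0;
    switch (kind) {
    case 0:
        w = 1;
        break;
    case 1:
        w = 10;
        break;
    default:
        w = 100;
        break;
    }
    return w;
}

/* for, while, if, and the compound assignments +=, -=, *=. */
int score(struct point *p, int n)
{
    int total = 0; /* local variable initialization */
    int i;
    for (i = 0; i < n; i++) {
        total += p->x;
        total *= 2;
    }
    while (total > 100)
        total -= p->y;
    if (total < 0)
        return 0;
    return total;
}
```

With a point whose x is 3 and y is 7, score(&pt, 3) evaluates to 42: the loop produces 6, 18, and 42, and the while loop never runs because 42 is not above 100.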

The backend targets armv7hf with Linux ABI, verified on Raspberry Pi 3.

Bootstrapping

The steps to validate shecc bootstrapping:

  1. stage0: shecc source code is initially compiled using an ordinary compiler which generates a native executable. The generated compiler can be used as a cross-compiler.
  2. stage1: The built binary reads its own source code as input and generates an ARMv7-A/RV32IM binary.
  3. stage2: The generated ARMv7-A/RV32IM binary is invoked (via QEMU or on an Arm or RISC-V device) with its own source code as input and generates another ARMv7-A/RV32IM binary.
  4. bootstrap: Build the stage1 and stage2 compilers, and verify that they are byte-wise identical. If so, shecc can compile its own source code and produce new versions of that same program.
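
The byte-wise comparison in the final step can be sketched in a few lines of C. This is only an illustration of the check, not the mechanism the shecc build actually uses; the function name files_identical is hypothetical.

```c
#include <stdio.h>

/* Return 1 if both files have identical contents, 0 if they differ,
 * and -1 if either file cannot be opened. */
int files_identical(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int result = 1;

    if (!a || !b) {
        if (a)
            fclose(a);
        if (b)
            fclose(b);
        return -1;
    }

    for (;;) {
        int ca = fgetc(a);
        int cb = fgetc(b);
        if (ca != cb) { /* differing byte, or one file ended early */
            result = 0;
            break;
        }
        if (ca == EOF) /* both files ended at the same position */
            break;
    }

    fclose(a);
    fclose(b);
    return result;
}
```

For the bootstrap check, the two inputs would be the stage1 and stage2 binaries, for example out/shecc-stage1.elf and out/shecc-stage2.elf; a return value of 1 means the compiler reproduced itself exactly.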

Prerequisites

The code generator in shecc does not rely on external utilities. You only need an ordinary C compiler such as gcc or clang. However, shecc bootstraps itself, which requires Arm/RISC-V ISA emulation. Install QEMU for Arm/RISC-V user emulation on GNU/Linux:

$ sudo apt-get install qemu-user

It is still possible to build shecc on macOS or Microsoft Windows. However, the second-stage bootstrapping would fail because qemu-arm is absent.

Build and Verify

Configure which backend you want. shecc supports ARMv7-A and RV32IM backends:

$ make config ARCH=arm
# Target machine code switch to Arm

$ make config ARCH=riscv
# Target machine code switch to RISC-V

Run make and you should see this:

  CC+LD	out/inliner
  GEN	out/libc.inc
  CC	out/src/main.o
  LD	out/shecc
  SHECC	out/shecc-stage1.elf
  SHECC	out/shecc-stage2.elf

File out/shecc is the first stage compiler. Its usage:

shecc [-o output] [--no-libc] [--dump-ir] <infile.c>

Compiler options:

  • -o : output file name (default: out.elf)
  • --no-libc : Exclude embedded C library (default: embedded)
  • --dump-ir : Dump intermediate representation (IR)

Example:

$ out/shecc -o fib tests/fib.c
$ chmod +x fib
$ qemu-arm fib

shecc comes with unit tests. To run the tests, give "check" as an argument:

$ make check

Reference output:

...
int main(int argc, int argv) { exit(sizeof(char)); } => 1
int main(int argc, int argv) { int a; a = 0; switch (3) { case 0: return 2; case 3: a = 10; break; case 1: return 0; } exit(a); } => 10
int main(int argc, int argv) { int a; a = 0; switch (3) { case 0: return 2; default: a = 10; break; } exit(a); } => 10
OK

Intermediate Representation

When the option --dump-ir is passed, shecc dumps its intermediate representation (IR). Take the file tests/fib.c as an example. It contains a recursive Fibonacci function.

int fib(int n)
{
    if (n == 0)
        return 0;
    else if (n == 1)
        return 1;
    return fib(n - 1) + fib(n - 2);
}

Execute the following to generate IR:

$ out/shecc --dump-ir -o fib tests/fib.c

Line-by-line explanation between C source and IR:

 C Source            IR                         Explanation
-------------------+--------------------------+----------------------------------------------------

int fib(int n)      fib:                        Reserve stack frame for function fib
{                     {
    if (n == 0)         x0 = &n                 Get address of variable n
                        x0 = *x0 (4)            Read value from address into x0, length = 4 (int)
                        x1 := 0                 Set x1 to zero
                        x0 == x1 ?              Compare x0 with x1
                        if false then goto 1641 If x0 != x1, then jump to label 1641
        return 0;       x0 := 0                 Set x0 to zero. x0 is the return value.
                        return (from fib)       Jump to function exit
                    1641:
    else if (n == 1)    x0 = &n                 Get address of variable n
                        x0 = *x0 (4)            Read value from address into x0, length = 4 (int)
                        x1 := 1                 Set x1 to 1
                        x0 == x1 ?              Compare x0 with x1
                        if false then goto 1649 If x0 != x1, then jump to label 1649
        return 1;       x0 := 1                 Set x0 to 1. x0 is the return value.
                        return (from fib)       Jump to function exit
                    1649:
    return              x0 = &n                 Get address of variable n
       fib(n - 1)       x0 = *x0 (4)            Read value from address into x0, length = 4 (int)
                        x1 := 1                 Set x1 to 1
                        x0 -= x1                Subtract x1 from x0 i.e. (n - 1)
       +                x0 := fib() @ 1631      Call function fib() into x0
                        push x0                 Store the result on stack
       fib(n - 2);      x0 = &n                 Get address of variable n
                        x0 = *x0 (4)            Read value from address into x0, length = 4 (int)
                        x1 := 2                 Set x1 to 2
                        x0 -= x1                Subtract x1 from x0 i.e. (n - 2)
                        x1 := fib() @ 1631      Call function fib() into x1
                        pop x0                  Retrieve the result off stack into x0
                        x0 += x1                Add x1 to x0 i.e. the result of fib(n-1) + fib(n-2)
                        return (from fib)       Jump to function exit
                      }                         Restore the previous stack frame
                      exit fib
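
To make the register and stack traffic in the table easier to follow, here is a hand-written C function that performs the same steps in the same order. It is a reading aid, not output from shecc: the name fib_ir is hypothetical, x0 and x1 are ordinary local variables standing in for the IR registers, and the single variable saved stands in for the stack slot used by push and pop.

```c
int fib_ir(int n)
{
    int x0;
    int x1;
    int saved; /* plays the role of the stack slot for push/pop */

    /* if (n == 0) return 0; */
    x0 = n;  /* x0 = &n; x0 = *x0 (4) */
    x1 = 0;  /* x1 := 0 */
    if (x0 == x1) {
        x0 = 0;
        return x0;
    }

    /* else if (n == 1) return 1; */
    x0 = n;
    x1 = 1;
    if (x0 == x1) {
        x0 = 1;
        return x0;
    }

    /* fib(n - 1) */
    x0 = n;
    x1 = 1;
    x0 -= x1;
    x0 = fib_ir(x0);
    saved = x0; /* push x0 */

    /* fib(n - 2) */
    x0 = n;
    x1 = 2;
    x0 -= x1;
    x1 = fib_ir(x0);

    x0 = saved; /* pop x0 */
    x0 += x1;   /* fib(n - 1) + fib(n - 2) */
    return x0;
}
```

The push before the second call matters because that call overwrites x0 when preparing its argument; saving the first result is what keeps it available for the final addition.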

Known Issues

  1. The generated ELF lacks .bss and .rodata sections.
  2. The unary * operator is not supported, so the [0] syntax must be used instead. Given int x = 5; int *ptr = &x;, the expression *ptr is rejected, whereas ptr[0] is valid and behaves the same as *ptr.
  3. Support for a varying number of function arguments is incomplete, and <stdarg.h> cannot be used. As an alternative, see the implementation of printf in lib/c.c for var_arg.
  4. The C front-end is a bit dirty because there is no effective AST.
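
The workaround for the missing unary * operator from item 2 looks like this in practice. The function names are made up for the example, and the functions return int only to stay within the data types listed under Compatibility; the code is standard C and was not run through shecc.

```c
/* Read the pointed-to value with ptr[0] instead of *ptr. */
int read_through(int *ptr)
{
    return ptr[0];
}

/* Exchange two ints through their pointers, again using [0] only. */
int swap_ints(int *a, int *b)
{
    int tmp = a[0];
    a[0] = b[0];
    b[0] = tmp;
    return 0;
}
```

Because ptr[0] is defined in C as *(ptr + 0), the two spellings are equivalent for any valid pointer, so code written this way also compiles unchanged with gcc or clang.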

License

shecc is freely redistributable under the BSD 2-clause license. Use of this source code is governed by a BSD-style license that can be found in the LICENSE file.
