All Projects → minio → C2goasm

minio / C2goasm

Licence: apache-2.0
C to Go Assembly

Programming Languages

go
31211 projects - #10 most used programming language
golang
3204 projects

Projects that are alternatives of or similar to C2goasm

lldbg
A lightweight native GUI for LLDB.
Stars: ✭ 83 (-92.26%)
Mutual labels:  llvm, gcc, clang
Cmake Scripts
A selection of useful scripts for use in CMake projects, include code coverage, sanitizers, and dependency graph generation.
Stars: ✭ 202 (-81.16%)
Mutual labels:  llvm, clang, gcc
Avalonstudio
Cross platform IDE and Shell
Stars: ✭ 1,132 (+5.6%)
Mutual labels:  llvm, clang, gcc
hmg
💝 My personal Gentoo/Linux configuration backup files
Stars: ✭ 16 (-98.51%)
Mutual labels:  llvm, gcc, clang
clang-format-editor
Clang-Format Editor is a tool that helps you find the best Clang-Format Style for your C++, C#, Java, JavaScript, and Objective-C code.
Stars: ✭ 15 (-98.6%)
Mutual labels:  llvm, clang
llvm-svn
Arch Linux PKGBUILD for LLVM, Clang et al. (latest SVN code)
Stars: ✭ 18 (-98.32%)
Mutual labels:  llvm, clang
Boomerang
Boomerang Decompiler - Fighting the code-rot :)
Stars: ✭ 265 (-75.28%)
Mutual labels:  clang, gcc
Swiftweekly.github.io
A community-driven weekly newsletter about Swift.org
Stars: ✭ 305 (-71.55%)
Mutual labels:  llvm, clang
arch-packages
Arch Linux performance important packages
Stars: ✭ 27 (-97.48%)
Mutual labels:  llvm, clang
Usrefl
Header-only, tiny (99 lines) and powerful C++20 static reflection library.
Stars: ✭ 287 (-73.23%)
Mutual labels:  clang, gcc
Clangkit
ClangKit provides an Objective-C frontend to LibClang. Source tokenization, diagnostics and fix-its are actually implemented.
Stars: ✭ 330 (-69.22%)
Mutual labels:  llvm, clang
llvm-project-prepo
Fork of LLVM with modifications to support a program repository
Stars: ✭ 27 (-97.48%)
Mutual labels:  llvm, clang
SameTypeClangPlugin
自定义检查规范的 Clang 插件
Stars: ✭ 47 (-95.62%)
Mutual labels:  llvm, clang
Olifant
A simple programming language targeting LLVM
Stars: ✭ 58 (-94.59%)
Mutual labels:  llvm, clang
rooki
A stupid simple script runner supporting c, c++, rust, haskell and virtually anything
Stars: ✭ 26 (-97.57%)
Mutual labels:  gcc, clang
Clang Power Tools
Bringing clang-tidy magic to Visual Studio C++ developers.
Stars: ✭ 285 (-73.41%)
Mutual labels:  llvm, clang
Gcovr
generate code coverage reports with gcc/gcov
Stars: ✭ 482 (-55.04%)
Mutual labels:  clang, gcc
Enzyme
High-performance automatic differentiation of LLVM.
Stars: ✭ 418 (-61.01%)
Mutual labels:  llvm, clang
Fcd
An optimizing decompiler
Stars: ✭ 622 (-41.98%)
Mutual labels:  llvm, clang
wasm-toolchain
WebAssembly toolchain
Stars: ✭ 34 (-96.83%)
Mutual labels:  llvm, clang

c2goasm: C to Go Assembly

Introduction

This is a tool to convert assembly as generated by a C/C++ compiler into Golang assembly. It is meant to be used in combination with asm2plan9s in order to automatically generate pure Go wrappers for C/C++ code (that may for instance take advantage of compiler SIMD intrinsics or template<> code).

Mode of operation:

$ c2goasm -a /path/to/some/great/c-code.s /path/to/now/great/golang-code_amd64.s

You can optionally nicely format the code using asmfmt by passing in an -f flag.

This project has been developed as part of developing a Go wrapper around Simd. However it should also work with other projects and libraries. Keep in mind though that it is not intented to 'port' a complete C/C++ project in a single action but rather do it on a case-by-case basis per function/source file (and create accompanying high level Go code to call into the assembly code).

Command line options

$ c2goasm --help
Usage of c2goasm:
  -a	Immediately invoke asm2plan9s
  -c	Compact byte codes
  -f	Format using asmfmt
  -s	Strip comments

A simple example

Here is a simple C function doing an AVX2 intrinsics computation:

void MultiplyAndAdd(float* arg1, float* arg2, float* arg3, float* result) {
    __m256 vec1 = _mm256_load_ps(arg1);
    __m256 vec2 = _mm256_load_ps(arg2);
    __m256 vec3 = _mm256_load_ps(arg3);
    __m256 res  = _mm256_fmadd_ps(vec1, vec2, vec3);
    _mm256_storeu_ps(result, res);
}

Compiling into assembly gives the following

__ZN14MultiplyAndAddEPfS1_S1_S1_: ## @_ZN14MultiplyAndAddEPfS1_S1_S1_
## BB#0:
        push          rbp
        mov           rbp, rsp
        vmovups       ymm0, ymmword ptr [rdi]
        vmovups       ymm1, ymmword ptr [rsi]
        vfmadd213ps   ymm1, ymm0, ymmword ptr [rdx]
        vmovups       ymmword ptr [rcx], ymm1
        pop           rbp
        vzeroupper
        ret

Running c2goasm will generate the following Go assembly (eg. saved in MultiplyAndAdd_amd64.s)

//+build !noasm !appengine
// AUTO-GENERATED BY C2GOASM -- DO NOT EDIT

TEXT ·_MultiplyAndAdd(SB), $0-32

	MOVQ vec1+0(FP), DI
	MOVQ vec2+8(FP), SI
	MOVQ vec3+16(FP), DX
	MOVQ result+24(FP), CX

	LONG $0x0710fcc5             // vmovups    ymm0, yword [rdi]
	LONG $0x0e10fcc5             // vmovups    ymm1, yword [rsi]
	LONG $0xa87de2c4; BYTE $0x0a // vfmadd213ps    ymm1, ymm0, yword [rdx]
	LONG $0x0911fcc5             // vmovups    yword [rcx], ymm1

	VZEROUPPER
	RET

This needs to be accompanied by the following Go code (in MultiplyAndAdd_amd64.go)

//go:noescape
func _MultiplyAndAdd(vec1, vec2, vec3, result unsafe.Pointer)

func MultiplyAndAdd(someObj Object) {

	_MultiplyAndAdd(someObj.GetVec1(), someObj.GetVec2(), someObj.GetVec3(), someObj.GetResult()))
}

And as you may have gathered the amd64.go file needs to be in place in order for the arguments names to be derived (and allow go vet to succeed).

Benchmark against cgo

We have run benchmarks of c2goasm versus cgo for both Go version 1.7.5 and 1.8.1. You can find the c2goasm benchmark test in test/ and the cgo test in cgocmp/ respectively. Here are the results for both versions:

$ benchcmp ../cgocmp/cgo-1.7.5.out c2goasm.out 
benchmark                      old ns/op     new ns/op     delta
BenchmarkMultiplyAndAdd-12     382           10.9          -97.15%
$ benchcmp ../cgocmp/cgo-1.8.1.out c2goasm.out 
benchmark                      old ns/op     new ns/op     delta
BenchmarkMultiplyAndAdd-12     236           10.9          -95.38%

As you can see Golang 1.8 has made a significant improvement (38.2%) over 1.7.5, but it is still about 20x slower than directly calling into assembly code as wrapped by c2goasm.

Converted projects

Internals

The basic process is to (in the prologue) setup the stack and registers as how the C code expects this to be the case, and upon exiting the subroutine (in the epilogue) to revert back to the golang world and pass a return value back if required. In more details:

  • Define assembly subroutine with proper golang decoration in terms of needed stack space and overall size of arguments plus return value.
  • Function arguments are loaded from the golang stack into registers and prior to starting the C code any arguments beyond 6 are stored in C stack space.
  • Stack space is reserved and setup for the C code. Depending on the C code, the stack pointer maybe aligned on a certain boundary (especially needed for code that takes advantages of SIMD instructions such as AVX etc.).
  • A constants table is generated (if needed) and any rip-based references are replaced with proper offsets to where Go will put the table.

Limitations

  • Arguments need (for now) to be 64-bit size, meaning either a value or a pointer (this requirement will be lifted)
  • Maximum number of 14 arguments (hard limit -- if you hit this maybe you should rethink your api anyway...)
  • Generally no call statements (thus inline your C code) with a couple of exceptions for functions such as memset and memcpy (see clib_amd64.s)

Generate assembly from C/C++

For eg. projects using cmake, here is how to see a list of assembly targets

$ make help | grep "\.s"

To see the actual command to generate the assembly

$ make -n SimdAvx2BgraToGray.s

Supported golang architectures

For now just the AMD64 architecture is supported. Also ARM64 should work just fine in a similar fashion but support is lacking at the moment.

Compatible compilers

The following compilers have been tested:

  • clang (Apple LLVM version) on OSX/darwin
  • clang on linux

Compiler flags:

-masm=intel -mno-red-zone -mstackrealign -mllvm -inline-threshold=1000 -fno-asynchronous-unwind-tables -fno-exceptions -fno-rtti
Flag Explanation
-masm=intel Output Intel syntax for assembly
-mno-red-zone Do not write below stack pointer (avoid red zone)
-mstackrealign Use explicit stack initialization
-mllvm -inline-threshold=1000 Higher limit for inlining heuristic (default=255)
-fno-asynchronous-unwind-tables Do not generate unwind tables (for debug purposes)
-fno-exceptions Disable exception handling
-fno-rtti Disable run-time type information

The following flags are only available in clang -cc1 frontend mode (see below):

Flag Explanation
-fno-jump-tables Do not use jump tables as may be generated for select statements

clang vs clang -cc1

As per the clang FAQ, clang -cc1 is the frontend, and clang is a (mostly GCC compatible) driver for the frontend. To see all options that the driver passes on to the frontend, use -### like this:

$ clang -### -c hello.c
"/usr/lib/llvm/bin/clang" "-cc1" "-triple" "x86_64-pc-linux-gnu" etc. etc. etc.

Command line flags for clang

To see all command line flags use either clang --help or clang --help-hidden for the clang driver or clang -cc1 -help for the frontend.

Further optimization and fine tuning

Using the LLVM optimizer (opt) you can further optimize the code generation. Use opt -help or opt -help-hidden for all available options.

An option can be passed in via clang using the -mllvm <value> option, such as -mllvm -inline-threshold=1000 as discussed above.

Also LLVM allows you to tune specific functions via function attributes like define void @f() alwaysinline norecurse { ... }.

What about GCC support?

For now GCC code will not work out of the box. However there is no reason why GCC should not work fundamentally (PRs are welcome).

Resources

License

c2goasm is released under the Apache License v2.0. You can find the complete text in the file LICENSE.

Contributing

Contributions are welcome, please send PRs for any enhancements.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].