
jeffamstutz / Tsimd

Licence: MIT
Fundamental C++ SIMD types for Intel CPUs (sse, avx, avx2, avx512)

tsimd - Fundamental C++ SIMD types for Intel CPUs (sse to avx512)

This library is header-only. Its implementation is selected by the Intel ISA flags enabled for each translation unit that includes it (e.g. -mavx with gcc or clang).
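
As a rough illustration of how ISA flags can drive the choice of SIMD width, the sketch below inspects the macros that GCC, Clang, and MSVC predefine when the corresponding instruction sets are enabled. The names kNativeFloatWidth and native_float_width are hypothetical and are not part of tsimd; the library's own detection logic may well differ.

```cpp
// Hypothetical constant (not part of tsimd): number of 32-bit floats that fit
// in the widest SIMD register enabled for this translation unit.
#if defined(__AVX512F__)
constexpr int kNativeFloatWidth = 16;  // 512-bit registers
#elif defined(__AVX__)
constexpr int kNativeFloatWidth = 8;   // 256-bit registers
#elif defined(__SSE__) || defined(_M_X64)
constexpr int kNativeFloatWidth = 4;   // 128-bit registers
#else
constexpr int kNativeFloatWidth = 1;   // no SIMD ISA detected: scalar
#endif

// Hypothetical accessor so the chosen width can be queried at run time.
int native_float_width()
{
  return kNativeFloatWidth;
}
```

Compiling the same file with -msse, -mavx, or -mavx512f would be expected to yield 4, 8, or 16 respectively, which is why two translation units built with different flags can end up with different widths.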

TODOs (contributions welcome!)

  • unsigned integer pack<> types
  • support for other CPU ISAs

Build Requirements

Using tsimd

  • C++11 compiler

(unofficial list of compilers, not all are tested)

  • GCC >= 4.8.1
  • clang >= 3.4
  • ICC >= 16
  • Visual Studio 2015 (64-bit target)

Building tsimd's examples/benchmarks/tests and installing from source

  • cmake >= 3.2

Library layout and usage

The library is logically composed of 3 different components:

  1. The pack<T, W> class, which is a logical SIMD register
  2. Functions that load packs from, and store packs to, larger arrays.
  3. Operators and functions to manipulate packs.
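
To make the three components concrete, here is a minimal sketch that models a pack as a plain array of lanes. Everything in it (toy_pack, toy_load, toy_store, and the two operators) is a hypothetical stand-in written for illustration, not tsimd's actual implementation; a real pack would wrap a hardware register rather than loop over lanes.

```cpp
// Component 1 (hypothetical stand-in): a "pack" of W lanes of type T.
template <typename T, int W>
struct toy_pack
{
  enum { static_size = W };  // lane count, usable as a compile-time constant
  T lanes[W];
};

// Component 2: copy W consecutive elements from a larger array into a pack.
template <typename T, int W>
toy_pack<T, W> toy_load(const T *src)
{
  toy_pack<T, W> p;
  for (int i = 0; i < W; ++i)
    p.lanes[i] = src[i];
  return p;
}

// Component 2: copy the lanes of a pack back out to a larger array.
template <typename T, int W>
void toy_store(const toy_pack<T, W> &p, T *dst)
{
  for (int i = 0; i < W; ++i)
    dst[i] = p.lanes[i];
}

// Component 3: lane-wise addition of two packs.
template <typename T, int W>
toy_pack<T, W> operator+(const toy_pack<T, W> &a, const toy_pack<T, W> &b)
{
  toy_pack<T, W> r;
  for (int i = 0; i < W; ++i)
    r.lanes[i] = a.lanes[i] + b.lanes[i];
  return r;
}

// Component 3: multiply every lane of a pack by one scalar.
template <typename T, int W>
toy_pack<T, W> operator*(T s, const toy_pack<T, W> &b)
{
  toy_pack<T, W> r;
  for (int i = 0; i < W; ++i)
    r.lanes[i] = s * b.lanes[i];
  return r;
}
```

With these pieces, an expression such as 2.0f * p + q reads like scalar code while operating on W values at once, which is the style the SAXPY example in this README relies on.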

Although there is not yet any true documentation, users are encouraged to see which type aliases are defined in tsimd/detail/pack.h, as well as which operators and functions are available in tsimd/detail/operators/ and tsimd/detail/functions/ respectively. Generally speaking, each header found in detail/ encapsulates exactly one type, operator, or function to aid discovery.

Example

SAXPY

Consider the following function (kernel), which takes values from two input arrays and stores the result in an output array.

// NOTE: n is the length of all 3 arrays
void saxpy(float a, int n, float x[], float y[], float out[])
{
  for (int i = 0; i < n; ++i) {
    const float xi = x[i];
    const float yi = y[i];
    const float result = a * xi + yi;
    out[i] = result;
  }
}

This kernel applies the exact same formula to every element in the data. SIMD instructions let us reduce the total number of loop iterations by a factor equal to the number of elements that fit in one of the CPU's SIMD registers. We do this by using tsimd types instead of builtin types.

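// NOTE: n is the length of all 3 arrays and is assumed to be a multiple of
//       vfloat::static_size (there is no scalar loop for leftover elements)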
void saxpy_tsimd(float a, int n, float x[], float y[], float out[])
{
  using namespace tsimd;
  for (int i = 0; i < n; i += vfloat::static_size) {
    const vfloat xi = load<vfloat>(&x[i]);
    const vfloat yi = load<vfloat>(&y[i]);
    const vfloat result = a * xi + yi; // same formula!
    store(result, &out[i]);
  }
}

The advantage of this version (over hard-coding a specific SIMD width, say vfloat4 or vfloat8) is that the kernel is "widened" to the best available width for the ISA flags it is compiled with. In other words: 4-wide for SSE, 8-wide for AVX/AVX2, and 16-wide for AVX512.
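
Because the width varies with the compile flags, a loop that advances one full pack at a time only covers the arrays exactly when n is a multiple of that width. One common way to handle arbitrary n is a scalar remainder loop, sketched below in plain C++ with no tsimd calls. The function saxpy_chunked and its template parameter W are hypothetical names chosen for illustration, and the fixed-length inner loop merely stands in for a single W-wide SIMD operation.

```cpp
// Hypothetical helper (not part of tsimd): SAXPY processed in chunks of W
// elements, followed by a scalar loop for the leftover n % W elements.
template <int W>
void saxpy_chunked(float a, int n, const float *x, const float *y, float *out)
{
  int i = 0;

  // Main body: only full chunks, so no access goes past index n - 1.
  for (; i + W <= n; i += W) {
    // Stand-in for one W-wide load / multiply-add / store sequence.
    for (int lane = 0; lane < W; ++lane)
      out[i + lane] = a * x[i + lane] + y[i + lane];
  }

  // Remainder: handle the last n % W elements one at a time.
  for (; i < n; ++i)
    out[i] = a * x[i] + y[i];
}
```

Calling saxpy_chunked<4>(a, 10, x, y, out), for example, would process two full chunks and then two elements in the scalar tail, without touching out[10].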
