xtensor-stack / Xsimd

Licence: bsd-3-clause

C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, NEON, AVX512)

C++ wrappers for SIMD intrinsics

Introduction

SIMD (Single Instruction, Multiple Data) is a feature of microprocessors that has been available for many years. SIMD instructions perform a single operation on a batch of values at once, and thus provide a way to significantly accelerate code execution. However, these instructions differ between microprocessor vendors and compilers.

xsimd provides a unified means for using these features for library authors. Namely, it enables manipulation of batches of numbers with the same arithmetic operators as for single values. It also provides accelerated implementation of common mathematical functions operating on batches.
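To make the phrase "the same arithmetic operators as for single values" concrete, here is a minimal sketch. The toy_batch type and the toy_mean function are hypothetical and are not part of xsimd. They use plain scalar loops from the standard library only, whereas a real batch type would be expected to map each operator onto SIMD instructions.

```cpp
#include <array>
#include <cstddef>

// Hypothetical illustration type, NOT part of xsimd: a fixed-size group of
// values that supports element-wise arithmetic through ordinary operators.
template <class T, std::size_t N>
struct toy_batch
{
    std::array<T, N> data;
};

// Element-wise addition of two batches.
template <class T, std::size_t N>
toy_batch<T, N> operator+(const toy_batch<T, N>& lhs, const toy_batch<T, N>& rhs)
{
    toy_batch<T, N> res{};
    for (std::size_t i = 0; i < N; ++i)
    {
        res.data[i] = lhs.data[i] + rhs.data[i];
    }
    return res;
}

// Division of every element by the same scalar.
template <class T, std::size_t N>
toy_batch<T, N> operator/(const toy_batch<T, N>& lhs, T rhs)
{
    toy_batch<T, N> res{};
    for (std::size_t i = 0; i < N; ++i)
    {
        res.data[i] = lhs.data[i] / rhs;
    }
    return res;
}

// Reads like scalar code, but applies to all N values of each batch.
template <class T, std::size_t N>
toy_batch<T, N> toy_mean(const toy_batch<T, N>& a, const toy_batch<T, N>& b)
{
    return (a + b) / T(2);
}
```

The body of toy_mean is the same expression one would write for two plain doubles, which is the property the library aims to provide for real SIMD batches.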

You can find out more about this implementation of C++ wrappers for SIMD intrinsics at The C++ Scientist. The mathematical functions are a lightweight implementation of the algorithms used in boost.SIMD.

xsimd requires a C++11 compliant compiler. The following C++ compilers are supported:

Compiler	Version
Microsoft Visual Studio	MSVC 2015 update 2 and above
g++	4.9 and above
clang	4.0 and above

The following SIMD instruction set extensions are supported:

Architecture	Instruction set extensions
x86	SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, FMA3, AVX2
x86	AVX512 (gcc7 and higher)
x86 AMD	same as above + SSE4A, FMA4, XOP
ARM	ARMv7, ARMv8

Installation

Although xsimd is a header-only library, we provide standardized means to install it, with package managers or with cmake.

Besides the xsimd headers, all these methods place the CMake project configuration file in the right location so that third-party projects can use cmake's find_package to locate xsimd headers.

Install with conda

A package for xsimd is available on the conda package manager.

conda install -c conda-forge xsimd

Install with Conan

If you are using Conan to manage your dependencies, merely add xsimd/[email protected]/public-conan to your requires, where x.y.z is the release version you want to use. Please file issues in conan-xsimd if you experience problems with the packages. Sample conanfile.txt:

[requires]
xsimd/[email protected]/public-conan

[generators]
cmake

Install with Spack

A package for xsimd is available on the Spack package manager.

spack install xsimd
spack load xsimd

Install from sources

You can directly install it from the sources with cmake:

cmake -D CMAKE_INSTALL_PREFIX=your_install_prefix .
make install

Documentation

To get started with using xsimd, check out the full documentation

http://xsimd.readthedocs.io/

Usage

Explicit use of an instruction set extension

Here is an example that computes the mean of two sets of four double-precision floating-point values, assuming the AVX extension is supported:

#include <iostream>
#include "xsimd/xsimd.hpp"

namespace xs = xsimd;

int main(int argc, char* argv[])
{
    xs::batch<double, 4> a(1.5, 2.5, 3.5, 4.5);
    xs::batch<double, 4> b(2.5, 3.5, 4.5, 5.5);
    auto mean = (a + b) / 2;
    std::cout << mean << std::endl;
    return 0;
}

Do not forget to enable the AVX extension when building the example. With gcc or clang, the -march=native flag does this when the build machine supports AVX; on MSVC, pass the /arch:AVX option.

This example outputs:

(2.0, 3.0, 4.0, 5.0)
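If the example fails to build or behaves unexpectedly, it can help to confirm that the AVX build flag actually took effect. The sketch below relies on the __AVX__ macro, which gcc, clang and MSVC predefine when AVX code generation is enabled; it is a compiler feature, not something xsimd provides. The function names built_with_avx and avx_build_summary are hypothetical.

```cpp
#include <string>

// Returns true when this translation unit was compiled with AVX enabled.
// __AVX__ is predefined by the compiler, not by xsimd.
constexpr bool built_with_avx()
{
#if defined(__AVX__)
    return true;
#else
    return false;
#endif
}

// Human-readable summary, suitable for a startup log line.
inline std::string avx_build_summary()
{
    return built_with_avx() ? "AVX enabled at compile time"
                            : "AVX not enabled: check -march=native or /arch:AVX";
}
```

This only reports how the code was compiled. It does not check whether the CPU running the program supports AVX.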

Auto detection of the instruction set extension to be used

The same computation operating on vectors and using the most performant instruction set available:

#include <cstddef>
#include <vector>
#include "xsimd/xsimd.hpp"

namespace xs = xsimd;
using vector_type = std::vector<double, xsimd::aligned_allocator<double, XSIMD_DEFAULT_ALIGNMENT>>;

void mean(const vector_type& a, const vector_type& b, vector_type& res)
{
    std::size_t size = a.size();
    constexpr std::size_t simd_size = xsimd::simd_type<double>::size;
    std::size_t vec_size = size - size % simd_size;

    for(std::size_t i = 0; i < vec_size; i += simd_size)
    {
        auto ba = xs::load_aligned(&a[i]);
        auto bb = xs::load_aligned(&b[i]);
        auto bres = (ba + bb) / 2.;
        bres.store_aligned(&res[i]);
    }
    for(std::size_t i = vec_size; i < size; ++i)
    {
        res[i] = (a[i] + b[i]) / 2.;
    }
}
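The index arithmetic in the function above can be checked in isolation. The sketch below uses only the standard library: the chunk parameter stands in for simd_size, and the inner loop stands in for one batch operation. The names mean_chunked and chunk are hypothetical and are not part of xsimd.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar stand-in for the vectorized mean: processes `chunk` elements per
// outer iteration, then finishes the leftover elements one at a time.
// Returns the number of elements handled by the chunked main loop.
std::size_t mean_chunked(const std::vector<double>& a,
                         const std::vector<double>& b,
                         std::vector<double>& res,
                         std::size_t chunk)
{
    assert(chunk > 0);
    assert(a.size() == b.size() && a.size() == res.size());

    const std::size_t size = a.size();
    // Largest multiple of `chunk` that does not exceed `size`.
    const std::size_t vec_size = size - size % chunk;

    // Main loop: each iteration covers one full chunk (one batch in SIMD code).
    for (std::size_t i = 0; i < vec_size; i += chunk)
    {
        for (std::size_t j = 0; j < chunk; ++j)
        {
            res[i + j] = (a[i + j] + b[i + j]) / 2.;
        }
    }
    // Tail loop: the remaining size % chunk elements.
    for (std::size_t i = vec_size; i < size; ++i)
    {
        res[i] = (a[i] + b[i]) / 2.;
    }
    return vec_size;
}
```

With 10 elements and a chunk of 4, the main loop covers indices 0 to 7 and the tail loop handles indices 8 and 9, so no element is read or written past the end of the vectors.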

We also implement STL algorithms to work optimally on batches. With xsimd::transform, the loop from the previous example becomes:

#include <cstddef>
#include <vector>
#include "xsimd/xsimd.hpp"
#include "xsimd/stl/algorithms.hpp"

namespace xs = xsimd;
using vector_type = std::vector<double, xsimd::aligned_allocator<double, XSIMD_DEFAULT_ALIGNMENT>>;

void mean(const vector_type& a, const vector_type& b, vector_type& res)
{
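    // Generic lambda (auto parameters) requires C++14; the lambda must
    // return the result so that transform can store it.
    xsimd::transform(a.begin(), a.end(), b.begin(), res.begin(),
                     [](const auto& x, const auto& y) { return (x + y) / 2.; });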
}
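When testing a vectorized routine such as the one above, a scalar reference is useful to compare against. The sketch below uses std::transform from the standard library, not xsimd::transform, and the name mean_reference is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Scalar reference: element-wise mean of two equally sized vectors.
std::vector<double> mean_reference(const std::vector<double>& a,
                                   const std::vector<double>& b)
{
    assert(a.size() == b.size());
    std::vector<double> res(a.size());
    // The binary overload of std::transform walks both inputs in lockstep.
    std::transform(a.begin(), a.end(), b.begin(), res.begin(),
                   [](double x, double y) { return (x + y) / 2.; });
    return res;
}
```

Comparing the output of a vectorized mean against mean_reference on inputs whose size is not a multiple of the batch size exercises both the batch path and the scalar remainder.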

Building and Running the Tests

Building the tests requires the GTest testing framework and cmake.

gtest and cmake are available as packages for most Linux distributions. They can also be installed with the conda package manager (even on Windows):

conda install -c conda-forge gtest cmake

Once gtest and cmake are installed, you can build and run the tests:

mkdir build
cd build
cmake ../ -DBUILD_TESTS=ON
make xtest

In the context of continuous integration with Travis CI, tests are run in a conda environment, which can be activated with

cd test
conda env create -f ./test-environment.yml
source activate test-xsimd
cd ..
cmake . -DBUILD_TESTS=ON
make xtest

Building the HTML Documentation

xsimd's documentation is built with three tools: doxygen, sphinx and breathe.

While doxygen must be installed separately, you can install breathe by typing

pip install breathe

Breathe can also be installed with conda

conda install -c conda-forge breathe

Finally, build the documentation with

make html

from the docs subdirectory.

License

We use a shared copyright model that enables all contributors to maintain the copyright on their contributions.

This software is licensed under the BSD-3-Clause license. See the LICENSE file for details.
