MIPP is a portable wrapper for SIMD instructions written in C++11. It supports NEON, SSE, AVX and AVX-512.
turbo.js - perform massive parallel computations in your browser with GPGPU.
Remote protein homology detection suite.
Reed-Solomon Erasure Code engine in Go, can reach more than 15GB/s per core
Modular node based noise generation library using SIMD, C++17 and templates
Fast integer compression in C using the StreamVByte codec
The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers
Parsing gigabytes of JSON per second
Ubpa Graphics Mathematics
The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.
Code for paper "Base64 encoding and decoding at almost the speed of a memory copy"
ncnn is a high-performance neural network inference framework optimized for the mobile platform
Intel SPMD Program Compiler
Highly optimized inference engine for Binarized Neural Networks
Agenium Scale vectorization library for CPUs and GPUs
Approximate and vectorized versions of common mathematical functions
An imperative and functional programming language
Acceleration package for neural networks on multi-core CPUs
The Higher-Order Intermediate Representation
⚡️⚡️⚡️ Blazing fast correlation functions on the CPU.
Base64 coding and decoding with SIMD instructions (SSE/AVX2/AVX512F/AVX512BW/AVX512VBMI/ARM Neon)
SIMD-accelerated ray tracing in C#, powered by the Intel hardware intrinsics of .NET Core.
A small study in hardware accelerated AoS reversal
C++ Implementations of sketch data structures with SIMD Parallelism, including Python bindings
Amplifier allows .NET developers to easily run complex applications with intensive mathematical computation on Intel CPU/GPU, NVIDIA, AMD without writing any additional C kernel code. Write your function in .NET and Amplifier will take care of running it on your favorite hardware.
C library to remove white space from strings as fast as possible
C++ image processing and machine learning library using SIMD: SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX-512, VMX (AltiVec) and VSX (Power7), NEON for ARM.
Open source C++ skeletal animation library and toolset
A crate to help you go wide. By which I mean use SIMD stuff.
Accelerate aggregated MD5 hashing performance up to 8x for AVX512 and 4x for AVX2. Useful for server applications that need to compute many MD5 sums in parallel.
UME::SIMD: a library for explicit SIMD vectorization.
Computer Vision package in pure Go taking advantage of SIMD acceleration
Run Keras models from a C++ application on embedded devices
A SIMD optimized fixed-length string class along with an adaptive hash table for fast searching
Implementations of SIMD instruction sets for systems which don't natively support them.
Minimalistic Vulkan engine for fast prototyping.
SIMD Vector Classes for C++
Fast, modern C++ DSP framework, FFT, Sample Rate Conversion, FIR/IIR/Biquad Filters (SSE, AVX, AVX-512, ARM NEON)
C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, NEON, AVX512)
Portable header-only C++ low level SIMD library
DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps
SIMD Floating point and integer compressed vector library
Testbed for different SIMD implementations of set intersection and set union
Extreme-scale Discontinuous Galerkin Environment (EDGE)
📽 Highly Optimized Graphics Math (glm) for C
High-efficiency floating-point neural network inference operators for mobile, server, and Web
A linear algebra and mathematics library for computer graphics.