Vc: SIMD Vector Classes for C++
Stars: ✭ 985 (+764.04%)
Umesimd: UME::SIMD, a library for explicit SIMD vectorization.
Stars: ✭ 66 (-42.11%)
Quadray Engine: Realtime raytracer using SIMD on ARM, MIPS, PPC and x86
Stars: ✭ 13 (-88.6%)
Guided Missile Simulation: Guided Missile, Radar and Infrared EOS Simulation Framework written in Fortran.
Stars: ✭ 33 (-71.05%)
Unisimd Assembler: SIMD macro assembler unified for ARM, MIPS, PPC and x86
Stars: ✭ 63 (-44.74%)
Libxsmm: Library for specialized dense and sparse matrix operations, and deep learning primitives.
Stars: ✭ 518 (+354.39%)
Nsimd: Agenium Scale vectorization library for CPUs and GPUs
Stars: ✭ 138 (+21.05%)
ternary-logic: Support for ternary logic in SSE, XOP, AVX2 and x86 programs
Stars: ✭ 21 (-81.58%)
Simd: C++ image processing and machine learning library using SIMD: SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX-512, VMX (Altivec) and VSX (Power7), NEON for ARM.
Stars: ✭ 1,263 (+1007.89%)
Simde: Implementations of SIMD instruction sets for systems which don't natively support them.
Stars: ✭ 1,012 (+787.72%)
Ctranslate2: Fast inference engine for OpenNMT models
Stars: ✭ 140 (+22.81%)
Kfr: Fast, modern C++ DSP framework, FFT, Sample Rate Conversion, FIR/IIR/Biquad Filters (SSE, AVX, AVX-512, ARM NEON)
Stars: ✭ 985 (+764.04%)
Xsimd: C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, NEON, AVX512)
Stars: ✭ 964 (+745.61%)
Libsimdpp: Portable header-only C++ low level SIMD library
Stars: ✭ 914 (+701.75%)
Directxmath: DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps
Stars: ✭ 859 (+653.51%)
Onednn: oneAPI Deep Neural Network Library (oneDNN)
Stars: ✭ 2,600 (+2180.7%)
Md5 Simd: Accelerate aggregated MD5 hashing performance up to 8x for AVX512 and 4x for AVX2. Useful for server applications that need to compute many MD5 sums in parallel.
Stars: ✭ 71 (-37.72%)
Base64simd: Base64 coding and decoding with SIMD instructions (SSE/AVX2/AVX512F/AVX512BW/AVX512VBMI/ARM Neon)
Stars: ✭ 115 (+0.88%)
cpuwhat: Nim utilities for advanced CPU operations: CPU identification, ISA extension detection, bindings to assorted intrinsics
Stars: ✭ 25 (-78.07%)
positional-popcount: Fast C functions for computing the positional popcount (pospopcnt).
Stars: ✭ 47 (-58.77%)
Std Simd: std::experimental::simd for GCC [ISO/IEC TS 19570:2018]
Stars: ✭ 275 (+141.23%)
ultra-sort: DSL for SIMD Sorting on AVX2 & AVX512
Stars: ✭ 29 (-74.56%)
Highway: Performance-portable, length-agnostic SIMD with runtime dispatch
Stars: ✭ 301 (+164.04%)
Osaca: Open Source Architecture Code Analyzer
Stars: ✭ 162 (+42.11%)
Sleef: SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
Stars: ✭ 353 (+209.65%)
Mipp: MIPP is a portable wrapper for SIMD instructions written in C++11. It supports NEON, SSE, AVX and AVX-512.
Stars: ✭ 253 (+121.93%)
Laser: The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers
Stars: ✭ 191 (+67.54%)
Chromium Clang: Chromium browser compiled with the Clang/LLVM compiler.
Stars: ✭ 77 (-32.46%)
Edge: Extreme-scale Discontinuous Galerkin Environment (EDGE)
Stars: ✭ 18 (-84.21%)
Wheels: Performance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)
Stars: ✭ 891 (+681.58%)
penguinV: Simple and fast C++ image processing library with focus on heterogeneous systems
Stars: ✭ 110 (-3.51%)
hpc: Learning and practice of high performance computing (CUDA, Vulkan, OpenCL, OpenMP, TBB, SSE/AVX, NEON, MPI, coroutines, etc.)
Stars: ✭ 39 (-65.79%)
sliceslice-rs: A fast implementation of single-pattern substring search using SIMD acceleration.
Stars: ✭ 66 (-42.11%)
Cglm: 📽 Highly Optimized Graphics Math (glm) for C
Stars: ✭ 887 (+678.07%)
Turbo-Transpose: SIMD Integer+Floating Point Compression Filter
Stars: ✭ 50 (-56.14%)
Despacer: C library to remove white space from strings as fast as possible
Stars: ✭ 90 (-21.05%)
Sha256 Simd: Accelerate SHA256 computations in pure Go using AVX512, SHA Extensions for x86 and ARM64 for ARM. On AVX512 it provides an up to 8x improvement (over 3 GB/s per core). SHA Extensions give a performance boost of close to 4x over native.
Stars: ✭ 657 (+476.32%)
utf8: Fast UTF-8 validation with range algorithm (NEON+SSE4+AVX2)
Stars: ✭ 60 (-47.37%)
yask: YASK (Yet Another Stencil Kit), a domain-specific language and framework to create high-performance stencil code for implementing finite-difference methods and similar applications.
Stars: ✭ 81 (-28.95%)
simdjson-rs: Rust version of lemire's SimdJson
Stars: ✭ 18 (-84.21%)
tbslas: A parallel, fast solver for the scalar advection-diffusion and the incompressible Navier-Stokes equations based on a semi-Lagrangian/Volume-Integral method.
Stars: ✭ 21 (-81.58%)
awesome-simd: A curated list of awesome SIMD frameworks, libraries and software
Stars: ✭ 39 (-65.79%)
simdutf: Unicode routines (UTF8, UTF16): billions of characters per second.
Stars: ✭ 108 (-5.26%)
simdutf8: SIMD-accelerated UTF-8 validation for Rust.
Stars: ✭ 426 (+273.68%)
block-aligner: SIMD-accelerated library for computing global and X-drop affine gap penalty sequence-to-sequence or sequence-to-profile alignments using an adaptive block-based algorithm.
Stars: ✭ 58 (-49.12%)
Simdjsonsharp: C# bindings for lemire/simdjson (and full C# port)
Stars: ✭ 506 (+343.86%)
Fastbase64: SIMD-accelerated base64 codecs
Stars: ✭ 309 (+171.05%)
John: John the Ripper jumbo - advanced offline password cracker, which supports hundreds of hash and cipher types, and runs on many operating systems, CPUs, GPUs, and even some FPGAs
Stars: ✭ 5,656 (+4861.4%)
Simdjson: Parsing gigabytes of JSON per second
Stars: ✭ 15,115 (+13158.77%)
oversimple: A library for audio oversampling that aims to offer a simple API while wrapping HIIR, by Laurent De Soras, for minimum phase antialiasing, and r8brain-free-src, by Aleksey Vaneev, for linear phase antialiasing.
Stars: ✭ 25 (-78.07%)