Simde: Implementations of SIMD instruction sets for systems which don't natively support them.
Stars: ✭ 1,012 (+2.74%)
Xsimd: C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, NEON, AVX512)
Stars: ✭ 964 (-2.13%)
Unisimd Assembler: SIMD macro assembler unified for ARM, MIPS, PPC and x86
Stars: ✭ 63 (-93.6%)
Umesimd: UME::SIMD, a library for explicit SIMD vectorization.
Stars: ✭ 66 (-93.3%)
Quadray Engine: Realtime raytracer using SIMD on ARM, MIPS, PPC and x86
Stars: ✭ 13 (-98.68%)
Simd: C++ image processing and machine learning library using SIMD: SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX-512, VMX (Altivec) and VSX (Power7), NEON for ARM.
Stars: ✭ 1,263 (+28.22%)
Nsimd: Agenium Scale vectorization library for CPUs and GPUs
Stars: ✭ 138 (-85.99%)
Libxsmm: Library for specialized dense and sparse matrix operations, and deep learning primitives.
Stars: ✭ 518 (-47.41%)
ultra-sort: DSL for SIMD Sorting on AVX2 & AVX512
Stars: ✭ 29 (-97.06%)
Std Simd: std::experimental::simd for GCC [ISO/IEC TS 19570:2018]
Stars: ✭ 275 (-72.08%)
ternary-logic: Support for ternary logic in SSE, XOP, AVX2 and x86 programs
Stars: ✭ 21 (-97.87%)
Libsimdpp: Portable header-only C++ low level SIMD library
Stars: ✭ 914 (-7.21%)
Directxmath: DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps
Stars: ✭ 859 (-12.79%)
Sleef: SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
Stars: ✭ 353 (-64.16%)
Mipp: MIPP is a portable wrapper for SIMD instructions written in C++11. It supports NEON, SSE, AVX and AVX-512.
Stars: ✭ 253 (-74.31%)
Base64simd: Base64 coding and decoding with SIMD instructions (SSE/AVX2/AVX512F/AVX512BW/AVX512VBMI/ARM Neon)
Stars: ✭ 115 (-88.32%)
Guided Missile Simulation: Guided Missile, Radar and Infrared EOS Simulation Framework written in Fortran.
Stars: ✭ 33 (-96.65%)
oversimple: A library for audio oversampling that tries to offer a simple API while wrapping HIIR, by Laurent De Soras, for minimum phase antialiasing, and r8brain-free-src, by Aleksey Vaneev, for linear phase antialiasing.
Stars: ✭ 25 (-97.46%)
Sse4 Strstr: SIMD (SWAR/SSE/SSE4/AVX2/AVX512F/ARM Neon) implementations of a modification of the Karp-Rabin algorithm
Stars: ✭ 115 (-88.32%)
Cglm: 📽 Highly Optimized Graphics Math (glm) for C
Stars: ✭ 887 (-9.95%)
cpuwhat: Nim utilities for advanced CPU operations: CPU identification, ISA extension detection, bindings to assorted intrinsics
Stars: ✭ 25 (-97.46%)
Corrfunc: ⚡️⚡️⚡️ Blazing fast correlation functions on the CPU.
Stars: ✭ 114 (-88.43%)
Highway: Performance-portable, length-agnostic SIMD with runtime dispatch
Stars: ✭ 301 (-69.44%)
hpc: Learning and practice of high performance computing (CUDA, Vulkan, OpenCL, OpenMP, TBB, SSE/AVX, NEON, MPI, coroutines, etc.)
Stars: ✭ 39 (-96.04%)
Osaca: Open Source Architecture Code Analyzer
Stars: ✭ 162 (-83.55%)
Libpopcnt: 🚀 Fast C/C++ bit population count library
Stars: ✭ 219 (-77.77%)
penguinV: Simple and fast C++ image processing library with a focus on heterogeneous systems
Stars: ✭ 110 (-88.83%)
simdutf8: SIMD-accelerated UTF-8 validation for Rust.
Stars: ✭ 426 (-56.75%)
Toys: Storage for my snippets, toy programs, etc.
Stars: ✭ 187 (-81.02%)
utf8: Fast UTF-8 validation with a range algorithm (NEON+SSE4+AVX2)
Stars: ✭ 60 (-93.91%)
Sse Popcount: SIMD (SSE) population count --- http://0x80.pl/articles/sse-popcount.html
Stars: ✭ 226 (-77.06%)
Md5 Simd: Accelerate aggregated MD5 hashing performance up to 8x for AVX512 and 4x for AVX2. Useful for server applications that need to compute many MD5 sums in parallel.
Stars: ✭ 71 (-92.79%)
Hlslpp: Math library using HLSL syntax with SSE/NEON support
Stars: ✭ 153 (-84.47%)
positional-popcount: Fast C functions for computing the positional popcount (pospopcnt).
Stars: ✭ 47 (-95.23%)
SoftLight: A shader-based software renderer using the LightSky Framework.
Stars: ✭ 2 (-99.8%)
Corium: Corium is a modern scripting language which combines simple, safe and efficient programming.
Stars: ✭ 18 (-98.17%)
simdutf: Unicode routines (UTF8, UTF16): billions of characters per second.
Stars: ✭ 108 (-89.04%)
Simdjson: Parsing gigabytes of JSON per second
Stars: ✭ 15,115 (+1434.52%)
Sse2neon: A translator from Intel SSE intrinsics to Arm/AArch64 NEON implementations
Stars: ✭ 316 (-67.92%)
artic: The AlteRnaTive Impala Compiler
Stars: ✭ 16 (-98.38%)
Turbo-Transpose: Transpose: SIMD Integer+Floating Point Compression Filter
Stars: ✭ 50 (-94.92%)
Despacer: C library to remove white space from strings as fast as possible
Stars: ✭ 90 (-90.86%)
Ctranslate2: Fast inference engine for OpenNMT models
Stars: ✭ 140 (-85.79%)
HLML: Auto-generated maths library for C and C++ based on HLSL/Cg
Stars: ✭ 23 (-97.66%)
Kfr: Fast, modern C++ DSP framework, FFT, Sample Rate Conversion, FIR/IIR/Biquad Filters (SSE, AVX, AVX-512, ARM NEON)
Stars: ✭ 985 (+0%)
runtime: AnyDSL Runtime Library
Stars: ✭ 17 (-98.27%)
Chromium Clang: Chromium browser compiled with the Clang/LLVM compiler.
Stars: ✭ 77 (-92.18%)
qHilbert: qHilbert is a vectorized speedup of Hilbert curve generation using SIMD intrinsics
Stars: ✭ 22 (-97.77%)
Wheels: Performance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)
Stars: ✭ 891 (-9.54%)
java-multithread: Code written for the Multithreading with Java course on the RinaldoDev YouTube channel.
Stars: ✭ 24 (-97.56%)