C++ image processing and machine learning library using SIMD: SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX-512, VMX (AltiVec) and VSX (Power7), and NEON for ARM.

Stars: ✭ 1,263 (+3727.27%)

Mutual labels: avx, simd, avx2

runtime

AnyDSL Runtime Library

Stars: ✭ 17 (-48.48%)

Mutual labels: simd, gpu-acceleration, vectorization

Xsimd

C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, NEON, AVX512)

Stars: ✭ 964 (+2821.21%)

Mutual labels: avx, simd, vectorization

Libxsmm

Library for specialized dense and sparse matrix operations, and deep learning primitives.

Stars: ✭ 518 (+1469.7%)

Mutual labels: avx, simd, avx2

Nsimd

Agenium Scale vectorization library for CPUs and GPUs

Stars: ✭ 138 (+318.18%)

Mutual labels: avx, simd, avx2

Laser

The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers

Stars: ✭ 191 (+478.79%)

Mutual labels: openmp, simd, high-performance-computing

Unisimd Assembler

SIMD macro assembler unified for ARM, MIPS, PPC and x86

Stars: ✭ 63 (+90.91%)

Mutual labels: avx, simd, avx2

ultra-sort

DSL for SIMD Sorting on AVX2 & AVX512

Stars: ✭ 29 (-12.12%)

Mutual labels: simd, avx2, vectorization

std find simd

std::find simd version

Stars: ✭ 19 (-42.42%)

Mutual labels: simd, avx2, vectorization

Quadray Engine

Realtime raytracer using SIMD on ARM, MIPS, PPC and x86

Stars: ✭ 13 (-60.61%)

Mutual labels: avx, simd, avx2

Turbopfor Integer Compression

Fastest Integer Compression

Stars: ✭ 520 (+1475.76%)

Mutual labels: simd, avx2

Fastnoisesimd

C++ SIMD Noise Library

Stars: ✭ 542 (+1542.42%)

Mutual labels: simd, avx2

Libsimdpp

Portable header-only C++ low level SIMD library

Stars: ✭ 914 (+2669.7%)

Mutual labels: simd, avx2

Simdjsonsharp

C# bindings for lemire/simdjson (and full C# port)

Stars: ✭ 506 (+1433.33%)

Mutual labels: simd, avx2

Fastbase64

SIMD-accelerated base64 codecs

Stars: ✭ 309 (+836.36%)

Mutual labels: simd, avx2

Highway

Performance-portable, length-agnostic SIMD with runtime dispatch

Stars: ✭ 301 (+812.12%)

Mutual labels: simd, avx2

Nnpack

Acceleration package for neural networks on multi-core CPUs

Stars: ✭ 1,538 (+4560.61%)

Mutual labels: simd, high-performance-computing

Thorin

The Higher-Order Intermediate Representation

Stars: ✭ 116 (+251.52%)

Mutual labels: simd, vectorization

Base64simd

Base64 coding and decoding with SIMD instructions (SSE/AVX2/AVX512F/AVX512BW/AVX512VBMI/ARM Neon)

Stars: ✭ 115 (+248.48%)

Mutual labels: simd, avx2

simdutf

Unicode routines (UTF8, UTF16): billions of characters per second.

Stars: ✭ 108 (+227.27%)

Mutual labels: simd, avx2

Fastapprox

Approximate and vectorized versions of common mathematical functions

Stars: ✭ 128 (+287.88%)

Mutual labels: simd, vectorization

Md5 Simd

Accelerate aggregated MD5 hashing performance up to 8x for AVX512 and 4x for AVX2. Useful for server applications that need to compute many MD5 sums in parallel.

Stars: ✭ 71 (+115.15%)

Mutual labels: simd, avx2

Impala

An imperative and functional programming language

Stars: ✭ 118 (+257.58%)

Mutual labels: simd, vectorization

Simdjson

Parsing gigabytes of JSON per second

Stars: ✭ 15,115 (+45703.03%)

Mutual labels: simd, avx2

Chromium Clang

Chromium browser compiled with the Clang/LLVM compiler.

Stars: ✭ 77 (+133.33%)

Mutual labels: avx, avx2

penguinV

Simple and fast C++ image processing library with focus on heterogeneous systems

Stars: ✭ 110 (+233.33%)

Mutual labels: avx, simd

sse-avx-rasterization

Triangle rasterization routines accelerated by SSE and AVX

Stars: ✭ 53 (+60.61%)

Mutual labels: avx, simd

hpc

Learning and practice of high performance computing (CUDA, Vulkan, OpenCL, OpenMP, TBB, SSE/AVX, NEON, MPI, coroutines, etc.)

Stars: ✭ 39 (+18.18%)

Mutual labels: avx, simd

oversimple

A library for audio oversampling that tries to offer a simple API while wrapping HIIR, by Laurent De Soras, for minimum-phase antialiasing, and r8brain-free-src, by Aleksey Vaneev, for linear-phase antialiasing.

Stars: ✭ 25 (-24.24%)

Mutual labels: avx, simd

Bitmagic

BitMagic Library

Stars: ✭ 263 (+696.97%)

Mutual labels: avx, simd

Componentarrays.jl

Arrays with arbitrarily nested named components.

Stars: ✭ 72 (+118.18%)

Mutual labels: modeling, control-systems

Turbo Run Length Encoding

TurboRLE-Fastest Run Length Encoding

Stars: ✭ 212 (+542.42%)

Mutual labels: simd, avx2

Std Simd

std::experimental::simd for GCC [ISO/IEC TS 19570:2018]

Stars: ✭ 275 (+733.33%)

Mutual labels: avx, simd

Cglm

📽 Highly Optimized Graphics Math (glm) for C

Stars: ✭ 887 (+2587.88%)

Mutual labels: avx, simd

Wheels

Performance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)

Stars: ✭ 891 (+2600%)

Mutual labels: avx, avx2

Kfr

Fast, modern C++ DSP framework, FFT, Sample Rate Conversion, FIR/IIR/Biquad Filters (SSE, AVX, AVX-512, ARM NEON)

Stars: ✭ 985 (+2884.85%)

Mutual labels: avx, simd

Despacer

C library to remove white space from strings as fast as possible

Stars: ✭ 90 (+172.73%)

Mutual labels: avx, simd

Osaca

Open Source Architecture Code Analyzer

Stars: ✭ 162 (+390.91%)

Mutual labels: avx, avx2

Packettracer

SIMD-accelerated ray tracing in C#, powered by Intel hardware intrinsics in .NET Core.

Stars: ✭ 109 (+230.3%)

Mutual labels: avx, simd

Mipp

MIPP is a portable wrapper for SIMD instructions written in C++11. It supports NEON, SSE, AVX and AVX-512.

Stars: ✭ 253 (+666.67%)

Mutual labels: avx, simd

tbslas

A parallel, fast solver for the scalar advection-diffusion and the incompressible Navier-Stokes equations, based on a semi-Lagrangian/volume-integral method.

Stars: ✭ 21 (-36.36%)

Mutual labels: openmp, simd

John

John the Ripper jumbo - advanced offline password cracker, which supports hundreds of hash and cipher types, and runs on many operating systems, CPUs, GPUs, and even some FPGAs

Stars: ✭ 5,656 (+17039.39%)

Mutual labels: openmp, simd

Stdgpu

stdgpu: Efficient STL-like Data Structures on the GPU

Stars: ✭ 531 (+1509.09%)

Mutual labels: openmp, gpu-acceleration

Arraymancer

A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, Cuda and OpenCL backends

Stars: ✭ 793 (+2303.03%)

Mutual labels: openmp, high-performance-computing

block-aligner

SIMD-accelerated library for computing global and X-drop affine gap penalty sequence-to-sequence or sequence-to-profile alignments using an adaptive block-based algorithm.

Stars: ✭ 58 (+75.76%)

Mutual labels: simd, avx2

artic

The AlteRnaTive Impala Compiler

Stars: ✭ 16 (-51.52%)

Mutual labels: simd, vectorization

t8code

Parallel algorithms and data structures for tree-based AMR with arbitrary element shapes.

Stars: ✭ 37 (+12.12%)

Mutual labels: modeling, high-performance-computing

1-60 of 682 similar projects
