Libxsmm
Library for specialized dense and sparse matrix operations, and deep learning primitives.
Stars: ✭ 518 (+171.2%)
tbslas
A parallel, fast solver for the scalar advection-diffusion and the incompressible Navier-Stokes equations based on semi-Lagrangian/Volume-Integral method.
Stars: ✭ 21 (-89.01%)
Edge
Extreme-scale Discontinuous Galerkin Environment (EDGE)
Stars: ✭ 18 (-90.58%)
Arraymancer
A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, CUDA and OpenCL backends
Stars: ✭ 793 (+315.18%)
Guided Missile Simulation
Guided Missile, Radar and Infrared EOS Simulation Framework written in Fortran.
Stars: ✭ 33 (-82.72%)
t8code
Parallel algorithms and data structures for tree-based AMR with arbitrary element shapes.
Stars: ✭ 37 (-80.63%)
NPB-CPP
NAS Parallel Benchmark Kernels in C/C++. The parallel versions are in FastFlow, TBB, and OpenMP.
Stars: ✭ 18 (-90.58%)
Hptt
High-Performance Tensor Transpose library
Stars: ✭ 141 (-26.18%)
Js
turbo.js - perform massive parallel computations in your browser with GPGPU.
Stars: ✭ 2,591 (+1256.54%)
Vectorious
Linear algebra in TypeScript.
Stars: ✭ 616 (+222.51%)
mir-glas
[Experimental] LLVM-accelerated Generic Linear Algebra Subprograms
Stars: ✭ 99 (-48.17%)
Armadillo Code
Armadillo: fast C++ library for linear algebra & scientific computing - http://arma.sourceforge.net
Stars: ✭ 388 (+103.14%)
Occa
JIT Compilation for Multiple Architectures: C++, OpenMP, CUDA, HIP, OpenCL, Metal
Stars: ✭ 230 (+20.42%)
SIMDArray
SIMD enhanced Array operations
Stars: ✭ 123 (-35.6%)
ultra-sort
DSL for SIMD Sorting on AVX2 & AVX512
Stars: ✭ 29 (-84.82%)
linnea
Linnea is an experimental tool for the automatic generation of optimized code for linear algebra problems.
Stars: ✭ 60 (-68.59%)
optimath
A #[no_std] LinAlg library
Stars: ✭ 47 (-75.39%)
blas-benchmarks
Timing results for BLAS (Basic Linear Algebra Subprograms) libraries in R
Stars: ✭ 24 (-87.43%)
monolish
monolish: MONOlithic LInear equation Solvers for Highly-parallel architecture
Stars: ✭ 166 (-13.09%)
Awesome Tensor Compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
Stars: ✭ 490 (+156.54%)
Ilgpu
ILGPU JIT Compiler for high-performance .NET GPU programs
Stars: ✭ 374 (+95.81%)
John
John the Ripper jumbo - advanced offline password cracker, which supports hundreds of hash and cipher types, and runs on many operating systems, CPUs, GPUs, and even some FPGAs
Stars: ✭ 5,656 (+2861.26%)
Ems
Extended Memory Semantics - Persistent shared object memory and parallelism for Node.js and Python
Stars: ✭ 552 (+189.01%)
Blis
BLAS-like Library Instantiation Software Framework
Stars: ✭ 859 (+349.74%)
Vc
SIMD Vector Classes for C++
Stars: ✭ 985 (+415.71%)
Nx
Multi-dimensional arrays (tensors) and numerical definitions for Elixir
Stars: ✭ 1,133 (+493.19%)
Corrfunc
⚡️⚡️⚡️ Blazing fast correlation functions on the CPU.
Stars: ✭ 114 (-40.31%)
URT
Fast Unit Root Tests and OLS regression in C++ with wrappers for R and Python
Stars: ✭ 70 (-63.35%)
Taskflow
A General-purpose Parallel and Heterogeneous Task Programming System
Stars: ✭ 6,128 (+3108.38%)
Mpm
CB-Geo High-Performance Material Point Method
Stars: ✭ 115 (-39.79%)
Nnpack
Acceleration package for neural networks on multi-core CPUs
Stars: ✭ 1,538 (+705.24%)
Jitfromscratch
Example project from my talks in the LLVM Social Berlin and C++ User Group
Stars: ✭ 158 (-17.28%)
Neural Fortran
A parallel neural net microframework
Stars: ✭ 173 (-9.42%)
Aphros
Finite volume solver for incompressible multiphase flows with surface tension
Stars: ✭ 154 (-19.37%)
Base64 Avx512
Code for paper "Base64 encoding and decoding at almost the speed of a memory copy"
Stars: ✭ 158 (-17.28%)
Bkcrack
Crack legacy zip encryption with Biham and Kocher's known plaintext attack.
Stars: ✭ 178 (-6.81%)
Geni
A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (-20.42%)
Sparse Winograd Cnn
Efficient Sparse-Winograd Convolutional Neural Networks (ICLR 2018)
Stars: ✭ 156 (-18.32%)
Hpcinfo
Information about many aspects of high-performance computing. Wiki content moved to ~/docs.
Stars: ✭ 171 (-10.47%)
Raytracer
Ray tracer with Phong lighting, reflections, refractions, normal mapping, procedural textures, super sampling, and depth of field.
Stars: ✭ 155 (-18.85%)
Simdjson
Parsing gigabytes of JSON per second
Stars: ✭ 15,115 (+7813.61%)
Rawspeed
Fast raw decoding library
Stars: ✭ 179 (-6.28%)
Coreclr
CoreCLR is the runtime for .NET Core. It includes the garbage collector, JIT compiler, primitive data types and low-level classes.
Stars: ✭ 12,610 (+6502.09%)
Tinytpu
Implementation of a Tensor Processing Unit for embedded systems and the IoT.
Stars: ✭ 153 (-19.9%)
Compactcnncascade
A binary library for very fast face detection using compact CNNs.
Stars: ✭ 152 (-20.42%)
Lightgbm
A fast, distributed, high-performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
Stars: ✭ 13,293 (+6859.69%)
Opencoarrays
A parallel application binary interface for Fortran 2018 compilers.
Stars: ✭ 151 (-20.94%)
Pyeco
Python implementation of efficient convolution operators for tracking
Stars: ✭ 150 (-21.47%)
Ugm
Ubpa Graphics Mathematics
Stars: ✭ 178 (-6.81%)
Libgrape Lite
🍇 A C++ library for parallel graph processing 🍇
Stars: ✭ 169 (-11.52%)
Oneflow
Large-Scale Multiphysics Scientific Simulation Environment - OneFLOW CFD
Stars: ✭ 150 (-21.47%)
Rangeless
C++ LINQ-like library of higher-order functions for data manipulation
Stars: ✭ 148 (-22.51%)
Rocblas
Next generation BLAS implementation for ROCm platform
Stars: ✭ 147 (-23.04%)