runtime: AnyDSL Runtime Library
Stars: ✭ 17 (+6.25%)
Vc: SIMD Vector Classes for C++
Stars: ✭ 985 (+6056.25%)
Accelerate: Embedded language for high-performance array computations
Stars: ✭ 751 (+4593.75%)
PyMFEM: Python wrapper for MFEM
Stars: ✭ 91 (+468.75%)
OpenPH: Parallel reduction of boundary matrices for Persistent Homology with CUDA
Stars: ✭ 14 (-12.5%)
Fastapprox: Approximate and vectorized versions of common mathematical functions
Stars: ✭ 128 (+700%)
Neanderthal: Fast Clojure Matrix Library
Stars: ✭ 927 (+5693.75%)
Umesimd: UME::SIMD, a library for explicit SIMD vectorization.
Stars: ✭ 66 (+312.5%)
HLML: Auto-generated maths library for C and C++ based on HLSL/Cg
Stars: ✭ 23 (+43.75%)
Thorin: The Higher-Order Intermediate Representation
Stars: ✭ 116 (+625%)
Impala: An imperative and functional programming language
Stars: ✭ 118 (+637.5%)
Guided Missile Simulation: Guided Missile, Radar and Infrared EOS Simulation Framework written in Fortran.
Stars: ✭ 33 (+106.25%)
gardenia: GARDENIA: Graph Analytics Repository for Designing Efficient Next-generation Accelerators
Stars: ✭ 22 (+37.5%)
opensbli: A framework for the automated derivation and parallel execution of finite difference solvers on a range of computer architectures.
Stars: ✭ 56 (+250%)
Sleef: SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
Stars: ✭ 353 (+2106.25%)
hpc: Learning and practice of high performance computing (CUDA, Vulkan, OpenCL, OpenMP, TBB, SSE/AVX, NEON, MPI, coroutines, etc.)
Stars: ✭ 39 (+143.75%)
qHilbert: qHilbert is a vectorized speedup of Hilbert curve generation using SIMD intrinsics
Stars: ✭ 22 (+37.5%)
Xsimd: C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, NEON, AVX512)
Stars: ✭ 964 (+5925%)
Simde: Implementations of SIMD instruction sets for systems which don't natively support them.
Stars: ✭ 1,012 (+6225%)
Arraymancer: A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, Cuda and OpenCL backends
Stars: ✭ 793 (+4856.25%)
learn-gpgpu: Algorithms implemented in CUDA + resources about GPGPU
Stars: ✭ 37 (+131.25%)
Fast: A framework for GPU based high-performance medical image processing and visualization
Stars: ✭ 179 (+1018.75%)
ultra-sort: DSL for SIMD Sorting on AVX2 & AVX512
Stars: ✭ 29 (+81.25%)
T13x: An Extended Version of the T0x multithreaded cores, with a custom general purpose parametrized SIMD/MIMD vector coprocessor and support for 3-5 way superscalar execution. The core is pin-to-pin compatible with the RISCY cores from PULP
Stars: ✭ 28 (+75%)
b-rabbit: A thread-safe library that aims to provide a simple API for interfacing with RabbitMQ. Built on top of rabbitpy, the library makes it very easy to use the RabbitMQ message broker with just a few lines of code. It implements all messaging patterns used by message brokers.
Stars: ✭ 15 (-6.25%)
HeCBench: software.intel.com/content/www/us/en/develop/articles/repo-evaluating-performance-productivity-oneapi.html
Stars: ✭ 85 (+431.25%)
generic-simd: Generic SIMD abstractions for Rust.
Stars: ✭ 45 (+181.25%)
GOSH: An ultra-fast, GPU-based large graph embedding algorithm utilizing a novel coarsening algorithm requiring not more than a single GPU.
Stars: ✭ 12 (-25%)
optimath: A #[no_std] LinAlg library
Stars: ✭ 47 (+193.75%)
SIMDxorshift: Fast random number generators: Vectorized (SIMD) version of xorshift128+
Stars: ✭ 84 (+425%)
cruise: User space POSIX-like file system in main memory
Stars: ✭ 27 (+68.75%)
ndzip: A High-Throughput Parallel Lossless Compressor for Scientific Data
Stars: ✭ 19 (+18.75%)
delayed: 🕟 💻 Dependent Delayed Computation
Stars: ✭ 14 (-12.5%)
hero-sdk: ⛔ DEPRECATED ⛔ HERO Software Development Kit
Stars: ✭ 21 (+31.25%)
hypothesis-gufunc: Extension to hypothesis for testing numpy general universal functions
Stars: ✭ 32 (+100%)
PySDM: Pythonic particle-based (super-droplet) warm-rain/aqueous-chemistry cloud microphysics package with box, parcel & 1D/2D prescribed-flow examples in Python, Julia and Matlab
Stars: ✭ 26 (+62.5%)
kompute: General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for advanced GPU data processing use cases. Backed by the Linux Foundation.
Stars: ✭ 872 (+5350%)
tbslas: A parallel, fast solver for the scalar advection-diffusion and the incompressible Navier-Stokes equations based on semi-Lagrangian/Volume-Integral method.
Stars: ✭ 21 (+31.25%)
RXMD: RXMD: Linear-Scaling Parallel Reactive Molecular Dynamics Simulation Engine
Stars: ✭ 13 (-18.75%)
tinker9: Tinker9: Next Generation of Tinker with GPU Support
Stars: ✭ 31 (+93.75%)
rbcuda: CUDA bindings for Ruby
Stars: ✭ 57 (+256.25%)
hlml: Vectorized high-level math library
Stars: ✭ 42 (+162.5%)
course: High-Performance Parallel Programming and Optimization - Course Slides
Stars: ✭ 1,610 (+9962.5%)
madpy-dask: MadPy Dask talk materials
Stars: ✭ 33 (+106.25%)
mrsc: A toolkit for building multi-result supercompilers
Stars: ✭ 30 (+87.5%)
taichi pt: Progressive path tracer written in Taichi
Stars: ✭ 20 (+25%)
future.callr: 🚀 R package future.callr: A Future API for Parallel Processing using 'callr'
Stars: ✭ 52 (+225%)
block-aligner: SIMD-accelerated library for computing global and X-drop affine gap penalty sequence-to-sequence or sequence-to-profile alignments using an adaptive block-based algorithm.
Stars: ✭ 58 (+262.5%)
ADbHash: Really fast C++ hash table
Stars: ✭ 12 (-25%)
nnpso: Training of Neural Network using Particle Swarm Optimization
Stars: ✭ 24 (+50%)
ludwig: A lattice Boltzmann code for complex fluids
Stars: ✭ 35 (+118.75%)
pybase64: Fast Base64 encoding/decoding in Python
Stars: ✭ 84 (+425%)
Intriman: Intriman is a documentation generator that retargets the Intel Intrinsics Guide to other documentation formats
Stars: ✭ 25 (+56.25%)
frp: FRP: Fast Random Projections
Stars: ✭ 40 (+150%)