MatX
An efficient C++17 GPU numerical computing library with Python-like syntax
Stars: ✭ 418 (+121.16%)
Dive Into Ml System
Dive into machine learning system, start from reinventing the wheel.
Stars: ✭ 220 (+16.4%)
Bvh
A modern C++ BVH construction and traversal library
Stars: ✭ 216 (+14.29%)
Onednn
oneAPI Deep Neural Network Library (oneDNN)
Stars: ✭ 2,600 (+1275.66%)
Laser
The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers
Stars: ✭ 191 (+1.06%)
Bkcrack
Crack legacy zip encryption with Biham and Kocher's known plaintext attack.
Stars: ✭ 178 (-5.82%)
Rawspeed
Fast raw decoding library
Stars: ✭ 179 (-5.29%)
Gapbs
GAP Benchmark Suite
Stars: ✭ 165 (-12.7%)
Ctranslate2
Fast inference engine for OpenNMT models
Stars: ✭ 140 (-25.93%)
Babelstream
STREAM, for lots of devices written in many programming models
Stars: ✭ 121 (-35.98%)
Corrfunc
⚡️⚡️⚡️ Blazing fast correlation functions on the CPU.
Stars: ✭ 114 (-39.68%)
Arm Vo
Efficient monocular visual odometry for ground vehicles on ARM processors
Stars: ✭ 115 (-39.15%)
Compactnsearch
A C++ library to compute neighborhood information for point clouds within a fixed radius. Suitable for many applications, e.g. neighborhood search for SPH fluid simulations.
Stars: ✭ 93 (-50.79%)
Nbody
N-body gravity attraction problem solver
Stars: ✭ 40 (-78.84%)
Arraymancer
A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, Cuda and OpenCL backends
Stars: ✭ 793 (+319.58%)
Ems
Extended Memory Semantics - Persistent shared object memory and parallelism for Node.js and Python
Stars: ✭ 552 (+192.06%)
Stdgpu
stdgpu: Efficient STL-like Data Structures on the GPU
Stars: ✭ 531 (+180.95%)
Optim
OptimLib: a lightweight C++ library of numerical optimization methods for nonlinear functions
Stars: ✭ 411 (+117.46%)
Weave
A state-of-the-art multithreading runtime: message-passing based, fast, scalable, ultra-low overhead
Stars: ✭ 305 (+61.38%)
Stats
A C++ header-only library of statistical distribution functions.
Stars: ✭ 292 (+54.5%)
crowdsource-video-experiments-on-android
Crowdsourcing video experiments (such as collaborative benchmarking and optimization of DNN algorithms) using Collective Knowledge Framework across diverse Android devices provided by volunteers. Results are continuously aggregated in the open repository:
Stars: ✭ 29 (-84.66%)
vercors
The VerCors verification toolset for verifying parallel and concurrent software
Stars: ✭ 30 (-84.13%)
capture3
C++ research project to learn more about cameras, image processing, color spaces, OpenCV and multi‑threading.
Stars: ✭ 17 (-91.01%)
Aff3ct
A fast simulator and a library dedicated to channel coding.
Stars: ✭ 240 (+26.98%)
Dmtcp
DMTCP: Distributed MultiThreaded CheckPointing
Stars: ✭ 229 (+21.16%)
Mpi.jl
MPI wrappers for Julia
Stars: ✭ 197 (+4.23%)
Raxml Ng
RAxML Next Generation: faster, easier-to-use and more flexible
Stars: ✭ 191 (+1.06%)
Timemory
Modular C++ Toolkit for Performance Analysis and Logging. Profiling API and Tools for C, C++, CUDA, Fortran, and Python. The C++ template API is essentially a framework for creating tools: it is designed to provide a unifying interface for recording various performance measurements alongside data logging and interfaces to other tools.
Stars: ✭ 192 (+1.59%)
Mpi Operator
Kubernetes Operator for Allreduce-style Distributed Training
Stars: ✭ 190 (+0.53%)
Libgrape Lite
🍇 A C++ library for parallel graph processing 🍇
Stars: ✭ 169 (-10.58%)
Tomsfastmath
TomsFastMath is a fast public domain, open source, large integer arithmetic library written in portable ISO C.
Stars: ✭ 169 (-10.58%)
Quda
QUDA is a library for performing calculations in lattice QCD on GPUs.
Stars: ✭ 166 (-12.17%)
Horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
Stars: ✭ 11,943 (+6219.05%)
Matex
Machine Learning Toolkit for Extreme Scale (MaTEx)
Stars: ✭ 104 (-44.97%)
Pfunit
Parallel Fortran Unit Testing Framework
Stars: ✭ 104 (-44.97%)
Paramonte
ParaMonte: Plain Powerful Parallel Monte Carlo and MCMC Library for Python, MATLAB, Fortran, C++, C.
Stars: ✭ 88 (-53.44%)
Schwimmbad
A common interface to processing pools.
Stars: ✭ 82 (-56.61%)
Pism
Repository for the Parallel Ice Sheet Model (PISM)
Stars: ✭ 61 (-67.72%)
Incompact3d
New version of our solver for the incompressible Navier-Stokes equations
Stars: ✭ 61 (-67.72%)
Ocgis
OpenClimateGIS is a set of geoprocessing and calculation tools for CF-compliant climate datasets.
Stars: ✭ 60 (-68.25%)
T Flows
Program for Simulation of Turbulent Flows
Stars: ✭ 47 (-75.13%)
Adda
ADDA - light scattering simulator based on the discrete dipole approximation
Stars: ✭ 43 (-77.25%)
Qball
Qball (also known as qb@ll) is a first-principles molecular dynamics code that is used to compute the electronic structure of atoms, molecules, solids, and liquids within the Density Functional Theory (DFT) formalism. It is a fork of the Qbox code by Francois Gygi.
Stars: ✭ 33 (-82.54%)
Dflo
Discontinuous Galerkin solver for compressible flows
Stars: ✭ 31 (-83.6%)
Prpl
parallel Raster Processing Library (pRPL) is an MPI-enabled C++ programming library that provides easy-to-use interfaces to parallelize raster/image processing algorithms
Stars: ✭ 15 (-92.06%)
Pp Mm A03
Parallel Processing - Matrix Multiplication (Cannon, DNS, LUdecomp)
Stars: ✭ 12 (-93.65%)
Esmpy Tutorial
Basic tutorial for ESMPy Python package
Stars: ✭ 22 (-88.36%)
Mpimemu
MPI Memory Consumption Utilities
Stars: ✭ 17 (-91.01%)
Elmerfem
Official git repository of Elmer FEM software
Stars: ✭ 523 (+176.72%)
Libtommath
LibTomMath is a free open source portable number theoretic multiple-precision integer library written entirely in C.
Stars: ✭ 438 (+131.75%)
Mpi4py
Python bindings for MPI
Stars: ✭ 388 (+105.29%)
mpi-parallelization
Examples for MPI Spawning and Splitting, and the differences between two implementations
Stars: ✭ 16 (-91.53%)
sympy-paper
Repo for the paper "SymPy: symbolic computing in python"
Stars: ✭ 42 (-77.78%)