Neanderthal
Fast Clojure Matrix Library
Stars: ✭ 927 (+16.9%)
Hipsycl
Implementation of SYCL for CPUs, AMD GPUs, NVIDIA GPUs
Stars: ✭ 377 (-52.46%)
Occa
JIT Compilation for Multiple Architectures: C++, OpenMP, CUDA, HIP, OpenCL, Metal
Stars: ✭ 230 (-71%)
Taskflow
A General-purpose Parallel and Heterogeneous Task Programming System
Stars: ✭ 6,128 (+672.76%)
learn-gpgpu
Algorithms implemented in CUDA + resources about GPGPU
Stars: ✭ 37 (-95.33%)
Ctranslate2
Fast inference engine for OpenNMT models
Stars: ✭ 140 (-82.35%)
Bayadera
High-performance Bayesian Data Analysis on the GPU in Clojure
Stars: ✭ 342 (-56.87%)
Amgcl
C++ library for solving large sparse linear systems with algebraic multigrid method
Stars: ✭ 390 (-50.82%)
Babelstream
STREAM, for lots of devices written in many programming models
Stars: ✭ 121 (-84.74%)
Parenchyma
An extensible HPC framework for CUDA, OpenCL and native CPU.
Stars: ✭ 71 (-91.05%)
gardenia
GARDENIA: Graph Analytics Repository for Designing Efficient Next-generation Accelerators
Stars: ✭ 22 (-97.23%)
Stdgpu
stdgpu: Efficient STL-like Data Structures on the GPU
Stars: ✭ 531 (-33.04%)
Pyopencl
OpenCL integration for Python, plus shiny features
Stars: ✭ 790 (-0.38%)
crowdsource-video-experiments-on-android
Crowdsourcing video experiments (such as collaborative benchmarking and optimization of DNN algorithms) using Collective Knowledge Framework across diverse Android devices provided by volunteers. Results are continuously aggregated in the open repository:
Stars: ✭ 29 (-96.34%)
Pine
🌲 Aimbot powered by real-time object detection with neural networks, GPU accelerated with Nvidia. Optimized for use with CS:GO.
Stars: ✭ 202 (-74.53%)
MOT
Multi-threaded Optimization Toolbox
Stars: ✭ 28 (-96.47%)
Rust Autograd
Tensors and differentiable operations (like TensorFlow) in Rust
Stars: ✭ 278 (-64.94%)
Cekirdekler
Multi-device OpenCL kernel load balancer and pipeliner API for C#. Uses a shared-distributed memory model to keep GPUs updated fast while using the same kernel on all devices (for simplicity).
Stars: ✭ 76 (-90.42%)
Vuh
Vulkan compute for people
Stars: ✭ 264 (-66.71%)
Accelerate
Embedded language for high-performance array computations
Stars: ✭ 751 (-5.3%)
Laser
The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers
Stars: ✭ 191 (-75.91%)
Arrayfire
ArrayFire: a general purpose GPU library.
Stars: ✭ 3,693 (+365.7%)
Grassmann.jl
⟨Leibniz-Grassmann-Clifford⟩ differential geometric algebra / multivector simplicial complex
Stars: ✭ 289 (-63.56%)
Futhark
💥💻💥 A data-parallel functional programming language
Stars: ✭ 1,641 (+106.94%)
Loopy
A code generator for array-based code on CPUs and GPUs
Stars: ✭ 367 (-53.72%)
Cuda Api Wrappers
Thin C++-flavored wrappers for the CUDA Runtime API
Stars: ✭ 362 (-54.35%)
Ilgpu
ILGPU JIT Compiler for high-performance .Net GPU programs
Stars: ✭ 374 (-52.84%)
mbsolve
An open-source solver tool for the Maxwell-Bloch equations.
Stars: ✭ 14 (-98.23%)
Qualia2.0
Qualia is a deep learning framework deeply integrated with automatic differentiation and dynamic graphing with CUDA acceleration. Qualia was built from scratch.
Stars: ✭ 41 (-94.83%)
Autodock Gpu
AutoDock for GPUs and other accelerators
Stars: ✭ 65 (-91.8%)
Clojurecl
ClojureCL is a Clojure library for parallel computations with OpenCL.
Stars: ✭ 266 (-66.46%)
Hashcat
World's fastest and most advanced password recovery utility
Stars: ✭ 11,014 (+1288.9%)
Kernels
This is a set of simple programs that can be used to explore the features of a parallel platform.
Stars: ✭ 287 (-63.81%)
Deepnet
Deep.Net machine learning framework for F#
Stars: ✭ 99 (-87.52%)
Spoc
Stream Processing with OCaml
Stars: ✭ 115 (-85.5%)
opensbli
A framework for the automated derivation and parallel execution of finite difference solvers on a range of computer architectures.
Stars: ✭ 56 (-92.94%)
Bohrium
Automatic parallelization of Python/NumPy, C, and C++ codes on Linux and MacOSX
Stars: ✭ 209 (-73.64%)
gpuowl
GPU Mersenne primality test.
Stars: ✭ 77 (-90.29%)
Arrayfire Python
Python bindings for ArrayFire: A general purpose GPU library.
Stars: ✭ 358 (-54.85%)
MatX
An efficient C++17 GPU numerical computing library with Python-like syntax
Stars: ✭ 418 (-47.29%)
Bitcracker
BitCracker is the first open source password cracking tool for memory units encrypted with BitLocker
Stars: ✭ 463 (-41.61%)
Hptt
High-Performance Tensor Transpose library
Stars: ✭ 141 (-82.22%)
Pycuda
CUDA integration for Python, plus shiny features
Stars: ✭ 1,112 (+40.23%)
CUDAfy.NET
CUDAfy .NET allows easy development of high-performance GPGPU applications entirely from .NET. It is developed in C#.
Stars: ✭ 56 (-92.94%)
John
John the Ripper jumbo - advanced offline password cracker, which supports hundreds of hash and cipher types, and runs on many operating systems, CPUs, GPUs, and even some FPGAs
Stars: ✭ 5,656 (+613.24%)
Cupy
NumPy & SciPy for GPU
Stars: ✭ 5,625 (+609.33%)
Fast
A framework for GPU based high-performance medical image processing and visualization
Stars: ✭ 179 (-77.43%)
Vexcl
VexCL is a C++ vector expression template library for OpenCL/CUDA/OpenMP
Stars: ✭ 626 (-21.06%)
Luxcore
LuxCore source repository
Stars: ✭ 601 (-24.21%)
Chainer
A flexible framework of neural networks for deep learning
Stars: ✭ 5,656 (+613.24%)
GOSH
An ultra-fast, GPU-based large graph embedding algorithm utilizing a novel coarsening algorithm requiring not more than a single GPU.
Stars: ✭ 12 (-98.49%)
HeCBench
software.intel.com/content/www/us/en/develop/articles/repo-evaluating-performance-productivity-oneapi.html
Stars: ✭ 85 (-89.28%)
Ginkgo
Numerical linear algebra software package
Stars: ✭ 149 (-81.21%)
Nvidia libs test
Tests and benchmarks for cudnn (and in the future, other nvidia libraries)
Stars: ✭ 36 (-95.46%)
Etaler
A flexible HTM (Hierarchical Temporal Memory) framework with full GPU support.
Stars: ✭ 79 (-90.04%)
rbcuda
CUDA bindings for Ruby
Stars: ✭ 57 (-92.81%)
monolish
monolish: MONOlithic LInear equation Solvers for Highly-parallel architecture
Stars: ✭ 166 (-79.07%)