Primitiv
A Neural Network Toolkit.
Stars: ✭ 164 (+26.15%)
Nvidia libs test
Tests and benchmarks for cuDNN (and, in the future, other NVIDIA libraries)
Stars: ✭ 36 (-72.31%)
Pyopencl
OpenCL integration for Python, plus shiny features
Stars: ✭ 790 (+507.69%)
Arrayfire
ArrayFire: a general purpose GPU library.
Stars: ✭ 3,693 (+2740.77%)
Bohrium
Automatic parallelization of Python/NumPy, C, and C++ codes on Linux and MacOSX
Stars: ✭ 209 (+60.77%)
Parenchyma
An extensible HPC framework for CUDA, OpenCL and native CPU.
Stars: ✭ 71 (-45.38%)
Hipsycl
Implementation of SYCL for CPUs, AMD GPUs, NVIDIA GPUs
Stars: ✭ 377 (+190%)
Bayadera
High-performance Bayesian Data Analysis on the GPU in Clojure
Stars: ✭ 342 (+163.08%)
Khiva
An open-source library of algorithms to analyse time series on GPU and CPU.
Stars: ✭ 161 (+23.85%)
Arrayfire Python
Python bindings for ArrayFire: A general purpose GPU library.
Stars: ✭ 358 (+175.38%)
Futhark
💥💻💥 A data-parallel functional programming language
Stars: ✭ 1,641 (+1162.31%)
Neanderthal
Fast Clojure Matrix Library
Stars: ✭ 927 (+613.08%)
hipacc
A domain-specific language and compiler for image processing
Stars: ✭ 72 (-44.62%)
Ilgpu
ILGPU JIT Compiler for high-performance .Net GPU programs
Stars: ✭ 374 (+187.69%)
Occa
JIT Compilation for Multiple Architectures: C++, OpenMP, CUDA, HIP, OpenCL, Metal
Stars: ✭ 230 (+76.92%)
Bitcracker
BitCracker is the first open source password cracking tool for memory units encrypted with BitLocker
Stars: ✭ 463 (+256.15%)
Babelstream
STREAM, for lots of devices written in many programming models
Stars: ✭ 121 (-6.92%)
Luxcore
LuxCore source repository
Stars: ✭ 601 (+362.31%)
Speedtorch
Library for faster pinned CPU <-> GPU transfer in Pytorch
Stars: ✭ 615 (+373.08%)
Chainer
A flexible framework of neural networks for deep learning
Stars: ✭ 5,656 (+4250.77%)
Juice
The Hacker's Machine Learning Engine
Stars: ✭ 743 (+471.54%)
Coriander
Build NVIDIA® CUDA™ code for OpenCL™ 1.2 devices
Stars: ✭ 665 (+411.54%)
Tf Coriander
OpenCL 1.2 implementation for Tensorflow
Stars: ✭ 775 (+496.15%)
Marian
Fast Neural Machine Translation in C++
Stars: ✭ 777 (+497.69%)
Wheels
Performance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)
Stars: ✭ 891 (+585.38%)
Compute Runtime
Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver
Stars: ✭ 593 (+356.15%)
Thundergbm
ThunderGBM: Fast GBDTs and Random Forests on GPUs
Stars: ✭ 586 (+350.77%)
Vexcl
VexCL is a C++ vector expression template library for OpenCL/CUDA/OpenMP
Stars: ✭ 626 (+381.54%)
Clblast
Tuned OpenCL BLAS
Stars: ✭ 559 (+330%)
Gunrock
High-Performance Graph Primitives on GPUs
Stars: ✭ 718 (+452.31%)
Cpu X
CPU-X is free software that gathers information on CPU, motherboard and more
Stars: ✭ 676 (+420%)
Cudasift
A CUDA implementation of SIFT for NVidia GPUs (1.2 ms on a GTX 1060)
Stars: ✭ 555 (+326.92%)
Gpusorting
Implementation of a few sorting algorithms in OpenCL
Stars: ✭ 9 (-93.08%)
Cub
Cooperative primitives for CUDA C++.
Stars: ✭ 883 (+579.23%)
Scikit Cuda
Python interface to GPU-powered libraries
Stars: ✭ 803 (+517.69%)
Graphvite
GraphVite: A General and High-performance Graph Embedding System
Stars: ✭ 865 (+565.38%)
Tvm
Open deep learning compiler stack for CPU, GPU and specialized accelerators
Stars: ✭ 7,494 (+5664.62%)
Soul Engine
Physically based renderer and simulation engine for real-time applications.
Stars: ✭ 37 (-71.54%)
Ktt
Kernel Tuning Toolkit
Stars: ✭ 33 (-74.62%)
Cupy
NumPy & SciPy for GPU
Stars: ✭ 5,625 (+4226.92%)
Carlsim3
CARLsim is an efficient, easy-to-use, GPU-accelerated software framework for simulating large-scale spiking neural network (SNN) models with a high degree of biological detail.
Stars: ✭ 52 (-60%)
Arraymancer
A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, CUDA and OpenCL backends
Stars: ✭ 793 (+510%)
Cuda
Experiments with CUDA and Rust
Stars: ✭ 31 (-76.15%)
Qualia2.0
Qualia is a deep learning framework deeply integrated with automatic differentiation and dynamic graphing with CUDA acceleration. Qualia was built from scratch.
Stars: ✭ 41 (-68.46%)
Heteroflow
Concurrent CPU-GPU Programming using Task Models
Stars: ✭ 57 (-56.15%)
Ggnn
GGNN: State of the Art Graph-based GPU Nearest Neighbor Search
Stars: ✭ 63 (-51.54%)
Tsne Cuda
GPU Accelerated t-SNE for CUDA with Python bindings
Stars: ✭ 1,120 (+761.54%)
Arboretum
Gradient Boosting powered by GPU (NVIDIA CUDA)
Stars: ✭ 64 (-50.77%)
Pycuda
CUDA integration for Python, plus shiny features
Stars: ✭ 1,112 (+755.38%)
Autodock Gpu
AutoDock for GPUs and other accelerators
Stars: ✭ 65 (-50%)
Compute
A C++ GPU Computing Library for OpenCL
Stars: ✭ 1,192 (+816.92%)
Cuda Design Patterns
Some CUDA design patterns and a bit of template magic for CUDA
Stars: ✭ 78 (-40%)
Cekirdekler
Multi-device OpenCL kernel load balancer and pipeliner API for C#. Uses a shared-distributed memory model to keep GPUs updated fast while using the same kernel on all devices (for simplicity).
Stars: ✭ 76 (-41.54%)
Spoc
Stream Processing with OCaml
Stars: ✭ 115 (-11.54%)
Cudart.jl
Julia wrapper for CUDA runtime API
Stars: ✭ 75 (-42.31%)
Mpr
Reference implementation for "Massively Parallel Rendering of Complex Closed-Form Implicit Surfaces" (SIGGRAPH 2020)
Stars: ✭ 84 (-35.38%)
Thundersvm
ThunderSVM: A Fast SVM Library on GPUs and CPUs
Stars: ✭ 1,282 (+886.15%)