monolish: MONOlithic LInear equation Solvers for Highly-parallel architecture
Stars: ✭ 166 (+36.07%)
Parenchyma: An extensible HPC framework for CUDA, OpenCL and native CPU.
Stars: ✭ 71 (-41.8%)
Ilgpu: ILGPU JIT Compiler for high-performance .Net GPU programs
Stars: ✭ 374 (+206.56%)
Pyopencl: OpenCL integration for Python, plus shiny features
Stars: ✭ 790 (+547.54%)
ArrayFire: a general purpose GPU library.
Stars: ✭ 3,693 (+2927.05%)
Dll: Fast Deep Learning Library (DLL) for C++ (ANNs, CNNs, RBMs, DBNs...)
Stars: ✭ 605 (+395.9%)
Scikit Cuda: Python interface to GPU-powered libraries
Stars: ✭ 803 (+558.2%)
Neanderthal: Fast Clojure Matrix Library
Stars: ✭ 927 (+659.84%)
mbsolve: An open-source solver tool for the Maxwell-Bloch equations.
Stars: ✭ 14 (-88.52%)
Creepminer: Burstcoin C++ CPU and GPU Miner
Stars: ✭ 169 (+38.52%)
DBCSR: Distributed Block Compressed Sparse Row matrix library
Stars: ✭ 65 (-46.72%)
Bohrium: Automatic parallelization of Python/NumPy, C, and C++ codes on Linux and MacOSX
Stars: ✭ 209 (+71.31%)
H2o4gpu: H2Oai GPU Edition
Stars: ✭ 416 (+240.98%)
Arrayfire Python: Python bindings for ArrayFire: A general purpose GPU library.
Stars: ✭ 358 (+193.44%)
Etl: Blazing-fast Expression Templates Library (ETL) with GPU support, in C++
Stars: ✭ 190 (+55.74%)
peakperf: Achieve peak performance on x86 CPUs and NVIDIA GPUs
Stars: ✭ 33 (-72.95%)
gpubootcamp: This repository consists of GPU bootcamp material for HPC and AI
Stars: ✭ 227 (+86.07%)
Occa: JIT Compilation for Multiple Architectures: C++, OpenMP, CUDA, HIP, OpenCL, Metal
Stars: ✭ 230 (+88.52%)
Futhark: 💥💻💥 A data-parallel functional programming language
Stars: ✭ 1,641 (+1245.08%)
MatX: An efficient C++17 GPU numerical computing library with Python-like syntax
Stars: ✭ 418 (+242.62%)
Remotery: Single C file, Realtime CPU/GPU Profiler with Remote Web Viewer
Stars: ✭ 1,908 (+1463.93%)
allgebra: Base container for developing C++ and Fortran HPC applications
Stars: ✭ 14 (-88.52%)
cuda memtest: Fork of CUDA GPU memtest 👓
Stars: ✭ 68 (-44.26%)
Compute: A C++ GPU Computing Library for OpenCL
Stars: ✭ 1,192 (+877.05%)
Marian: Fast Neural Machine Translation in C++
Stars: ✭ 777 (+536.89%)
Fancontrol.releases: This is the release repository for Fan Control, a highly customizable fan controlling software for Windows.
Stars: ✭ 768 (+529.51%)
Arraymancer: A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, Cuda and OpenCL backends
Stars: ✭ 793 (+550%)
Tvm: Open deep learning compiler stack for cpu, gpu and specialized accelerators
Stars: ✭ 7,494 (+6042.62%)
Tf Coriander: OpenCL 1.2 implementation for Tensorflow
Stars: ✭ 775 (+535.25%)
Impala: An imperative and functional programming language
Stars: ✭ 118 (-3.28%)
Walle: iOS Application performance monitoring
Stars: ✭ 19 (-84.43%)
Accelerate: Embedded language for high-performance array computations
Stars: ✭ 751 (+515.57%)
Cub: Cooperative primitives for CUDA C++.
Stars: ✭ 883 (+623.77%)
Wheels: Performance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)
Stars: ✭ 891 (+630.33%)
GraphVite: A General and High-performance Graph Embedding System
Stars: ✭ 865 (+609.02%)
Keras object detection: Convert any classification model or architecture trained in keras to an object detection model
Stars: ✭ 28 (-77.05%)
Nvidia libs test: Tests and benchmarks for cudnn (and in the future, other nvidia libraries)
Stars: ✭ 36 (-70.49%)
Sos: Sandia OpenSHMEM is an implementation of the OpenSHMEM specification over multiple Networking APIs, including Portals 4, the Open Fabric Interface (OFI), and UCX. Please click on the Wiki tab for help with building and using SOS.
Stars: ✭ 34 (-72.13%)
Computesharp: A .NET 5 library to run C# code in parallel on the GPU through DX12 and dynamically generated HLSL compute shaders, with the goal of making GPU computing easy to use for all .NET developers! 🚀
Stars: ✭ 982 (+704.92%)
Carlsim3: CARLsim is an efficient, easy-to-use, GPU-accelerated software framework for simulating large-scale spiking neural network (SNN) models with a high degree of biological detail.
Stars: ✭ 52 (-57.38%)
Ktt: Kernel Tuning Toolkit
Stars: ✭ 33 (-72.95%)
Qualia2.0: Qualia is a deep learning framework deeply integrated with automatic differentiation and dynamic graphing with CUDA acceleration. Qualia was built from scratch.
Stars: ✭ 41 (-66.39%)
Pycuda: CUDA integration for Python, plus shiny features
Stars: ✭ 1,112 (+811.48%)
Tsne Cuda: GPU Accelerated t-SNE for CUDA with Python bindings
Stars: ✭ 1,120 (+818.03%)
Future: 🚀 R package: future: Unified Parallel and Distributed Processing in R for Everyone
Stars: ✭ 735 (+502.46%)
Cuda: Experiments with CUDA and Rust
Stars: ✭ 31 (-74.59%)
Heteroflow: Concurrent CPU-GPU Programming using Task Models
Stars: ✭ 57 (-53.28%)
GGNN: State of the Art Graph-based GPU Nearest Neighbor Search
Stars: ✭ 63 (-48.36%)
Arboretum: Gradient Boosting powered by GPU (NVIDIA CUDA)
Stars: ✭ 64 (-47.54%)
Thor Os: Simple operating system in C++, written from scratch
Stars: ✭ 1,204 (+886.89%)
Hiop: HPC solver for nonlinear optimization problems
Stars: ✭ 75 (-38.52%)
Cuda Design Patterns: Some CUDA design patterns and a bit of template magic for CUDA
Stars: ✭ 78 (-36.07%)
Mpr: Reference implementation for "Massively Parallel Rendering of Complex Closed-Form Implicit Surfaces" (SIGGRAPH 2020)
Stars: ✭ 84 (-31.15%)
Nplusminer: NPlusMiner + GUI | NVIDIA/AMD/CPU miner | AI | Autoupdate | MultiRig remote management
Stars: ✭ 75 (-38.52%)
Pcm: Processor Counter Monitor
Stars: ✭ 1,240 (+916.39%)
D2dlib: A .NET library for hardware-accelerated, high performance, immediate mode rendering via Direct2D.
Stars: ✭ 84 (-31.15%)