Pycuda
CUDA integration for Python, plus shiny features
Stars: ✭ 1,112 (+1208.24%)
Nvidia libs test
Tests and benchmarks for cuDNN (and, in the future, other NVIDIA libraries)
Stars: ✭ 36 (-57.65%)
monolish
monolish: MONOlithic LInear equation Solvers for Highly-parallel architecture
Stars: ✭ 166 (+95.29%)
Amgcl
C++ library for solving large sparse linear systems with algebraic multigrid method
Stars: ✭ 390 (+358.82%)
Babelstream
STREAM, for lots of devices written in many programming models
Stars: ✭ 121 (+42.35%)
Stdgpu
stdgpu: Efficient STL-like Data Structures on the GPU
Stars: ✭ 531 (+524.71%)
Arraymancer
A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, Cuda and OpenCL backends
Stars: ✭ 793 (+832.94%)
Tutorials
Some basic programming tutorials
Stars: ✭ 353 (+315.29%)
Loopy
A code generator for array-based code on CPUs and GPUs
Stars: ✭ 367 (+331.76%)
Heteroflow
Concurrent CPU-GPU Programming using Task Models
Stars: ✭ 57 (-32.94%)
Hetero-Mark
A Benchmark Suite for Heterogeneous System Computation
Stars: ✭ 41 (-51.76%)
GOSH
An ultra-fast, GPU-based large graph embedding algorithm utilizing a novel coarsening algorithm requiring not more than a single GPU.
Stars: ✭ 12 (-85.88%)
Bayadera
High-performance Bayesian Data Analysis on the GPU in Clojure
Stars: ✭ 342 (+302.35%)
MatX
An efficient C++17 GPU numerical computing library with Python-like syntax
Stars: ✭ 418 (+391.76%)
Pyopencl
OpenCL integration for Python, plus shiny features
Stars: ✭ 790 (+829.41%)
Neanderthal
Fast Clojure Matrix Library
Stars: ✭ 927 (+990.59%)
Hipsycl
Implementation of SYCL for CPUs, AMD GPUs, NVIDIA GPUs
Stars: ✭ 377 (+343.53%)
Ginkgo
Numerical linear algebra software package
Stars: ✭ 149 (+75.29%)
Clojurecuda
Clojure library for CUDA development
Stars: ✭ 158 (+85.88%)
gpubootcamp
This repository consists of GPU bootcamp material for HPC and AI
Stars: ✭ 227 (+167.06%)
mbsolve
An open-source solver tool for the Maxwell-Bloch equations.
Stars: ✭ 14 (-83.53%)
crowdsource-video-experiments-on-android
Crowdsourcing video experiments (such as collaborative benchmarking and optimization of DNN algorithms) using Collective Knowledge Framework across diverse Android devices provided by volunteers. Results are continuously aggregated in the open repository:
Stars: ✭ 29 (-65.88%)
NPB-CPP
NAS Parallel Benchmark Kernels in C/C++. The parallel versions are in FastFlow, TBB, and OpenMP.
Stars: ✭ 18 (-78.82%)
PyMFEM
Python wrapper for MFEM
Stars: ✭ 91 (+7.06%)
Occa
JIT Compilation for Multiple Architectures: C++, OpenMP, CUDA, HIP, OpenCL, Metal
Stars: ✭ 230 (+170.59%)
rbcuda
CUDA bindings for Ruby
Stars: ✭ 57 (-32.94%)
Gapbs
GAP Benchmark Suite
Stars: ✭ 165 (+94.12%)
Arrayfire
ArrayFire: a general purpose GPU library.
Stars: ✭ 3,693 (+4244.71%)
3
GPU-accelerated micromagnetic simulator
Stars: ✭ 324 (+281.18%)
Cuda Api Wrappers
Thin C++-flavored wrappers for the CUDA Runtime API
Stars: ✭ 362 (+325.88%)
Armadillo Code
Armadillo: fast C++ library for linear algebra & scientific computing - http://arma.sourceforge.net
Stars: ✭ 388 (+356.47%)
Edge
Extreme-scale Discontinuous Galerkin Environment (EDGE)
Stars: ✭ 18 (-78.82%)
Nbody
Solver for the N-body gravitational attraction problem
Stars: ✭ 40 (-52.94%)
Accelerate
Embedded language for high-performance array computations
Stars: ✭ 751 (+783.53%)
Vexcl
VexCL is a C++ vector expression template library for OpenCL/CUDA/OpenMP
Stars: ✭ 626 (+636.47%)
Sixtyfour
How fast can we brute force a 64-bit comparison?
Stars: ✭ 41 (-51.76%)
Luxcore
LuxCore source repository
Stars: ✭ 601 (+607.06%)
Deepnet
Deep.Net machine learning framework for F#
Stars: ✭ 99 (+16.47%)
Autodock Gpu
AutoDock for GPUs and other accelerators
Stars: ✭ 65 (-23.53%)
FGPU
No description or website provided.
Stars: ✭ 30 (-64.71%)
allgebra
Base container for developing C++ and Fortran HPC applications
Stars: ✭ 14 (-83.53%)
EFDCPlus
www.eemodelingsystem.com
Stars: ✭ 9 (-89.41%)
mini-nbody
A simple gravitational N-body simulation in less than 100 lines of C code, with CUDA optimizations.
Stars: ✭ 73 (-14.12%)
Mixbench
A GPU benchmark tool for evaluating GPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL)
Stars: ✭ 130 (+52.94%)
Foundations of HPC 2021
This repository collects the materials from the course "Foundations of HPC", 2021, at the Data Science and Scientific Computing Department, University of Trieste
Stars: ✭ 22 (-74.12%)
euler2d kokkos
Simple 2D finite volume solver for the Euler equations using the C++ Kokkos library
Stars: ✭ 27 (-68.24%)
CARE
CHAI and RAJA provide an excellent base on which to build portable codes. CARE expands that functionality, adding new features such as loop fusion capability and a portable interface for many numerical algorithms. It provides all the basics for anyone wanting to write portable code.
Stars: ✭ 22 (-74.12%)
Hip
HIP: C++ Heterogeneous-Compute Interface for Portability
Stars: ✭ 2,609 (+2969.41%)
Ctranslate2
Fast inference engine for OpenNMT models
Stars: ✭ 140 (+64.71%)
hiperc
High Performance Computing Strategies for Boundary Value Problems
Stars: ✭ 36 (-57.65%)
gardenia
GARDENIA: Graph Analytics Repository for Designing Efficient Next-generation Accelerators
Stars: ✭ 22 (-74.12%)
yjit-bench
Set of benchmarks for the YJIT CRuby JIT compiler
Stars: ✭ 38 (-55.29%)
flake8-aaa
A Flake8 plugin that checks Python tests follow the Arrange-Act-Assert pattern
Stars: ✭ 51 (-40%)
go-perftuner
Helper tool for manual Go code optimization.
Stars: ✭ 111 (+30.59%)
kompute
General purpose GPU compute framework built on Vulkan to support 1000s of cross-vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for advanced GPU data processing use cases. Backed by the Linux Foundation.
Stars: ✭ 872 (+925.88%)
nelson
Nelson numerical interpreter
Stars: ✭ 42 (-50.59%)