Top 527 CUDA open source projects

DIY-Deep-Learning-Workstation
Build a deep learning workstation from scratch (HW & SW).
FoldsCUDA.jl
Data-parallelism on CUDA using Transducers.jl and for loops (FLoops.jl)
NCCV
Short course on computer vision and image processing using Numba+CUDA+OpenCV
mlspace
MLSpace: Hassle-free machine learning & deep learning development
ppq
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
OmniSci.jl
Julia client for OmniSci GPU-accelerated SQL engine and analytics platform
CuVec
Unifying Python/C++/CUDA memory: Python buffered array ↔️ `std::vector` ↔️ CUDA managed memory
kitti deeplab
Inference script and frozen inference graph with fine-tuned weights for semantic segmentation on images from the KITTI dataset.
gblastn
G-BLASTN is a GPU-accelerated nucleotide alignment tool based on the widely used NCBI-BLAST.
SOLIDWORKS-for-Linux
This is a project where I give you a way to use SOLIDWORKS on Linux!
VoxelTerrain
This project's main goal is to generate and visualize terrain built using voxels. It was achieved using different approaches and computing technologies just for the sake of performance and implementation comparison.
ffmpegtoolkit
CentOS 8.x 64-bit ffmpeg auto installer scripts
flexBox
FlexBox is a flexible MATLAB toolbox for finite-dimensional convex variational problems in image processing and beyond.
GooFit
Code repository for the massively-parallel framework for maximum-likelihood fits, implemented in CUDA/OpenMP
xmrig-cuda
NVIDIA CUDA plugin for XMRig miner
tensorflow-builds
TensorFlow binaries and Docker images compiled with GPU support and CPU optimizations.
2D 3D PolarFourierTransform
C++, CUDA, and MATLAB codes for the paper "An Exact and Fast Computation of Discrete Fourier Transform for Polar and Spherical Grid"
cuda-neural-network
Convolutional Neural Network with CUDA (MNIST 99.23%)
sentence2vec
Deep sentence embedding using Sequence to Sequence learning
EPPM
CUDA implementation of the paper "Fast Edge-Preserving PatchMatch for Large Displacement Optical Flow" in CVPR 2014.
MEGADOCK
Ultra-high-performance protein-protein docking for heterogeneous supercomputers
gpumembench
A GPU benchmark suite for assessing on-chip GPU memory bandwidth
CARLsim4
CARLsim is an efficient, easy-to-use, GPU-accelerated software framework for simulating large-scale spiking neural network (SNN) models with a high degree of biological detail.
Optical-Flow-GPU-Docker
Compute dense optical flow using TV-L1 algorithm with NVIDIA GPU acceleration.
Matrix-Inversion-with-CUDA
I implemented a parallel algorithm for matrix inversion based on Gauss-Jordan elimination.
quick-start
FloydHub quick start project - train TensorFlow model with MNIST dataset
fdtd3d
fdtd3d is an open source 1D, 2D, 3D FDTD electromagnetics solver with MPI, OpenMP and CUDA support for x86, arm, arm64 architectures
cuda-revised-simplex
An implementation of the revised simplex algorithm in CUDA for solving linear optimization problems in the form max{c*x | A*x=b, l<=x<=u}
Optimizing-SGEMM-on-NVIDIA-Turing-GPUs
Optimizing SGEMM kernel functions on NVIDIA GPUs to close-to-cuBLAS performance.
nvidia-video-codec-rs
Bindings for the NVIDIA Video Codec SDK
VS-Code-Cuda
Support for CUDA grammars in Visual Studio Code
gpu mandelbrot
Interactive Mandelbrot set on GPU with Python
Jamais-Vu
Audio fingerprinting and recognition in Python using NVIDIA's CUDA
fresnel
Publication quality path tracing in real time.
ArrayLSTM
GPU/CPU (CUDA) implementation of "Recurrent Memory Array Structures": Simple RNN, LSTM, Array LSTM.
SRmeetsPS-CUDA
CUDA implementation of the paper "Depth Super-Resolution Meets Uncalibrated Photometric Stereo"
pnn
pnn is a Darknet-compatible neural network inference engine implemented in Rust.
CobraML
AutoML Software designed to give users access to a whole plethora of ML models, some trainable on the GPU.
buildTensorflow
A lightweight deep learning framework made with ❤️
NNPOps
High-performance operations for neural network potentials
rocRAND
RAND library for HIP programming language
bsuir-csn-cmsn-helper
Repository containing ready-made laboratory works in the specialty of computing machines, systems and networks
sboxgates
Program for finding low gate count implementations of S-boxes.
la-core
Linear algebra accelerators for RISC-V (published in ICCD 17)
cuhnsw
CUDA implementation of Hierarchical Navigable Small World Graph algorithm
gpu-pathtracer
Physically based path tracer on GPU
astro-accelerate
AstroAccelerate is a many-core accelerated software package for processing time-domain radio-astronomy data.
local-search-quantization
State-of-the-art method for large-scale ANN search as of Oct 2016. Presented at ECCV 16.