Speedtorch - Library for faster pinned CPU <-> GPU transfer in PyTorch
Thundergbm - ThunderGBM: Fast GBDTs and Random Forests on GPUs
Trtorch - PyTorch/TorchScript compiler for NVIDIA GPUs using TensorRT
Cudasift - A CUDA implementation of SIFT for NVIDIA GPUs (1.2 ms on a GTX 1060)
Cudamat - Python module for performing basic dense linear algebra computations on the GPU using CUDA.
Cupy - NumPy & SciPy for GPU
Lighthouse2 - Lighthouse 2 framework for real-time ray tracing
Stdgpu - stdgpu: Efficient STL-like Data Structures on the GPU
Depthwiseconvolution - A personal depthwise convolution layer implementation on Caffe by liuhao (GPU only).
Rustacuda - Rusty wrapper for the CUDA Driver API
Convnet - A GPU implementation of Convolutional Neural Nets in C++
Lightseq - LightSeq: A High Performance Inference Library for Sequence Processing and Generation
Xray Oxygen - 🌀 Oxygen Engine 2.0. [Preview] Discord: https://discord.gg/P3aMf66
Bitcracker - BitCracker is the first open source password cracking tool for memory units encrypted with BitLocker
Caer - High-performance Vision library in Python. Scale your research, not boilerplate.
Open3d - Open3D: A Modern Library for 3D Data Processing
Tsdf Fusion - Fuse multiple depth frames into a TSDF voxel volume.
Tensorflow Cmake - TensorFlow examples in C, C++, Go and Python without bazel but with cmake and FindTensorFlow.cmake
Accel - (Mirror of GitLab) GPGPU Framework for Rust
Icpcuda - Super fast implementation of ICP in CUDA for devices of compute capability 3.5 or higher
Ai Lab - All-in-one AI container for rapid prototyping
Gocv - Go package for computer vision using OpenCV 4 and beyond.
Pytorch Pwc - A reimplementation of PWC-Net in PyTorch that matches the official Caffe version
Cubert - Fast implementation of BERT inference directly on NVIDIA (CUDA, cuBLAS) and Intel MKL
Ganet - GA-Net: Guided Aggregation Net for End-to-end Stereo Matching
Neuralnetwork.net - A TensorFlow-inspired neural network library built from scratch in C# 7.3 for .NET Standard 2.0, with GPU support through cuDNN
Amgcl - C++ library for solving large sparse linear systems with algebraic multigrid method
Cudf - cuDF: GPU DataFrame Library
Music Translation - A Universal Music Translation Network: a method for translating music across musical instruments and styles.
Hipsycl - Implementation of SYCL for CPUs, AMD GPUs, NVIDIA GPUs
Ilgpu - ILGPU JIT Compiler for high-performance .NET GPU programs
Nvpipe - NVIDIA-accelerated zero latency video compression library for interactive remoting applications
Vuda - VUDA is a header-only library based on Vulkan that provides a CUDA Runtime API interface for writing GPU-accelerated applications.
Mini Caffe - Minimal runtime core of Caffe, Forward only, GPU support and Memory efficiency.
Libsgm - Stereo Semi-Global Matching with CUDA
Darkpose - Distribution-Aware Coordinate Representation for Human Pose Estimation
Loopy - A code generator for array-based code on CPUs and GPUs
K2 - FSA/FST algorithms, differentiable, with PyTorch compatibility.
Sleef - SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
Visionaray - A C++-based, cross-platform ray tracing library
Bayadera - High-performance Bayesian Data Analysis on the GPU in Clojure
Cudpp - CUDA Data Parallel Primitives Library