ConvnetA GPU implementation of Convolutional Neural Nets in C++
Stars: ✭ 506 (+3792.31%)
LightseqLightSeq: A High Performance Inference Library for Sequence Processing and Generation
Stars: ✭ 501 (+3753.85%)
JuiceThe Hacker's Machine Learning Engine
Stars: ✭ 743 (+5615.38%)
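AutokernelAutoKernel is a simple, easy-to-use, low-barrier automatic operator optimization tool that improves the deployment efficiency of deep learning algorithms.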
Stars: ✭ 485 (+3630.77%)
PresentationsSlides and demo code for past presentations
Stars: ✭ 7 (-46.15%)
Tsdf Fusion PythonPython code to fuse multiple RGB-D images into a TSDF voxel volume.
Stars: ✭ 464 (+3469.23%)
Deep Painterly HarmonizationCode and data for paper "Deep Painterly Harmonization": https://arxiv.org/abs/1804.03189
Stars: ✭ 6,027 (+46261.54%)
CaerHigh-performance Vision library in Python. Scale your research, not boilerplate.
Stars: ✭ 452 (+3376.92%)
WheelsPerformance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)
Stars: ✭ 891 (+6753.85%)
Cuda Convnet2Automatically exported from code.google.com/p/cuda-convnet2
Stars: ✭ 690 (+5207.69%)
Tensorflow CmakeTensorFlow examples in C, C++, Go and Python without bazel but with cmake and FindTensorFlow.cmake
Stars: ✭ 418 (+3115.38%)
Gpu badmm mtBregman ADMM for mass transportation on GPU
Stars: ✭ 10 (-23.08%)
IcpcudaSuper fast implementation of ICP in CUDA for devices of compute capability 3.5 or higher
Stars: ✭ 416 (+3100%)
Nv WavenetReference implementation of real-time autoregressive wavenet inference
Stars: ✭ 681 (+5138.46%)
Ddsh Tip2018Source code for the paper "Deep Discrete Supervised Hashing"
Stars: ✭ 16 (+23.08%)
Ai LabAll-in-one AI container for rapid prototyping
Stars: ✭ 406 (+3023.08%)
Mc CnnStereo Matching by Training a Convolutional Neural Network to Compare Image Patches
Stars: ✭ 638 (+4807.69%)
GocvGo package for computer vision using OpenCV 4 and beyond.
Stars: ✭ 4,511 (+34600%)
ZludaCUDA on Intel GPUs
Stars: ✭ 937 (+7107.69%)
CubertFast implementation of BERT inference directly on NVIDIA (CUDA, CUBLAS) and Intel MKL
Stars: ✭ 395 (+2938.46%)
KmcudaLarge scale K-means and K-nn implementation on NVIDIA GPU / CUDA
Stars: ✭ 627 (+4723.08%)
DlpackRFC for common in-memory tensor structure and operator interface for deep learning system
Stars: ✭ 398 (+2961.54%)
LibcudarangeAn interval arithmetic and affine arithmetic library for NVIDIA CUDA
Stars: ✭ 5 (-61.54%)
Cudanative.jlJulia support for native CUDA programming
Stars: ✭ 393 (+2923.08%)
LrslibraryLow-Rank and Sparse Tools for Background Modeling and Subtraction in Videos
Stars: ✭ 625 (+4707.69%)
Neuralnetwork.netA TensorFlow-inspired neural network library built from scratch in C# 7.3 for .NET Standard 2.0, with GPU support through cuDNN
Stars: ✭ 392 (+2915.38%)
UammdA CUDA project for Molecular Dynamics, Brownian Dynamics, Hydrodynamics... intended to simulate a very generic system by constructing a simulation from modules.
Stars: ✭ 11 (-15.38%)
MegengineMegEngine is a fast, scalable, easy-to-use deep learning framework with support for automatic differentiation.
Stars: ✭ 4,081 (+31292.31%)
SpeedtorchLibrary for faster pinned CPU <-> GPU transfer in PyTorch
Stars: ✭ 615 (+4630.77%)
Music TranslationA Universal Music Translation Network: a method for translating music across musical instruments and styles.
Stars: ✭ 385 (+2861.54%)
Scikit CudaPython interface to GPU-powered libraries
Stars: ✭ 803 (+6076.92%)
IlgpuILGPU JIT Compiler for high-performance .NET GPU programs
Stars: ✭ 374 (+2776.92%)
ThundergbmThunderGBM: Fast GBDTs and Random Forests on GPUs
Stars: ✭ 586 (+4407.69%)
NvpipeNVIDIA-accelerated zero latency video compression library for interactive remoting applications
Stars: ✭ 376 (+2792.31%)
ThorAtmospheric fluid dynamics solver optimized for GPUs.
Stars: ✭ 23 (+76.92%)
Cuda Api WrappersThin C++-flavored wrappers for the CUDA Runtime API
Stars: ✭ 362 (+2684.62%)
TrtorchPyTorch/TorchScript compiler for NVIDIA GPUs using TensorRT
Stars: ✭ 583 (+4384.62%)
DarkposeDistribution-Aware Coordinate Representation for Human Pose Estimation
Stars: ✭ 369 (+2738.46%)
TvmOpen deep learning compiler stack for CPUs, GPUs and specialized accelerators
Stars: ✭ 7,494 (+57546.15%)
DiffsharpDiffSharp: Differentiable Functional Programming
Stars: ✭ 365 (+2707.69%)
Xmrig NvidiaMonero (XMR) NVIDIA miner
Stars: ✭ 560 (+4207.69%)
MindseyeNeural Networks in Java 8 with CuDNN and Aparapi
Stars: ✭ 8 (-38.46%)
Arrayfire PythonPython bindings for ArrayFire: A general purpose GPU library.
Stars: ✭ 358 (+2653.85%)
Lattice netFast Point Cloud Segmentation Using Permutohedral Lattices
Stars: ✭ 23 (+76.92%)
BlocksparseEfficient GPU kernels for block-sparse matrix multiplication and convolution
Stars: ✭ 797 (+6030.77%)
CudasiftA CUDA implementation of SIFT for NVIDIA GPUs (1.2 ms on a GTX 1060)
Stars: ✭ 555 (+4169.23%)
K2FSA/FST algorithms, differentiable, with PyTorch compatibility.
Stars: ✭ 354 (+2623.08%)
TutorialsSome basic programming tutorials
Stars: ✭ 353 (+2615.38%)
CudamatPython module for performing basic dense linear algebra computations on the GPU using CUDA.
Stars: ✭ 554 (+4161.54%)
SleefSIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
Stars: ✭ 353 (+2615.38%)
VisionarayA C++-based, cross platform ray tracing library
Stars: ✭ 342 (+2530.77%)
NimtorchPyTorch - Python + Nim
Stars: ✭ 346 (+2561.54%)
CudahandbookSource code that accompanies The CUDA Handbook.
Stars: ✭ 345 (+2553.85%)
Lighthouse2Lighthouse 2 framework for real-time ray tracing
Stars: ✭ 542 (+4069.23%)
BayaderaHigh-performance Bayesian Data Analysis on the GPU in Clojure
Stars: ✭ 342 (+2530.77%)
NiutensorNiuTensor is an open-source toolkit developed by a joint team from NLP Lab. at Northeastern University and the NiuTrans Team. It provides tensor utilities to create and train neural networks.
Stars: ✭ 337 (+2492.31%)
Sepconv SlomoAn implementation of Video Frame Interpolation via Adaptive Separable Convolution using PyTorch
Stars: ✭ 918 (+6961.54%)
PyopenclOpenCL integration for Python, plus shiny features
Stars: ✭ 790 (+5976.92%)