Speedtorch: Library for faster pinned CPU <-> GPU transfer in Pytorch
Stars: ✭ 615 (+459.09%)
Pytorch Pwc: a reimplementation of PWC-Net in PyTorch that matches the official Caffe version
Stars: ✭ 402 (+265.45%)
3d Ken Burns: an implementation of 3D Ken Burns Effect from a Single Image using PyTorch
Stars: ✭ 1,073 (+875.45%)
revisiting-sepconv: an implementation of Revisiting Adaptive Convolutions for Video Frame Interpolation using PyTorch
Stars: ✭ 43 (-60.91%)
Pytorch Unflow: a reimplementation of UnFlow in PyTorch that matches the official TensorFlow version
Stars: ✭ 113 (+2.73%)
Cupy: NumPy & SciPy for GPU
Stars: ✭ 5,625 (+5013.64%)
Spanet: Spatial Attentive Single-Image Deraining with a High Quality Real Rain Dataset (CVPR'19)
Stars: ✭ 136 (+23.64%)
Chainer: A flexible framework of neural networks for deep learning
Stars: ✭ 5,656 (+5041.82%)
Pytorch Liteflownet: a reimplementation of LiteFlowNet in PyTorch that matches the official Caffe version
Stars: ✭ 281 (+155.45%)
Softmax Splatting: an implementation of softmax splatting for differentiable forward warping using PyTorch
Stars: ✭ 218 (+98.18%)
Sepconv Slomo: an implementation of Video Frame Interpolation via Adaptive Separable Convolution using PyTorch
Stars: ✭ 918 (+734.55%)
Pynvvl: A Python wrapper of NVIDIA Video Loader (NVVL) with CuPy for fast video loading with Python
Stars: ✭ 95 (-13.64%)
Pytorch Emdloss: PyTorch 1.0 implementation of the approximate Earth Mover's Distance
Stars: ✭ 82 (-25.45%)
Fbtt Embedding: This is a Tensor Train based compression library for the sparse embedding tables used in large-scale machine learning models such as recommendation and natural language processing. We showed that this library can reduce the total model size by up to 100x in Facebook's open-sourced DLRM model while achieving the same model quality. Our implementation is faster than the state-of-the-art implementations. Existing state-of-the-art libraries also decompress whole embedding tables on the fly and therefore do not reduce memory use during training. Our library decompresses only the requested rows and can therefore provide a 10,000x memory footprint reduction per embedding table. The library also includes a software cache that stores a portion of the table entries in decompressed form for faster lookup and processing.
Stars: ✭ 92 (-16.36%)
Nnabla Ext Cuda: A CUDA Extension of Neural Network Libraries
Stars: ✭ 79 (-28.18%)
Cuda Design Patterns: Some CUDA design patterns and a bit of template magic for CUDA
Stars: ✭ 78 (-29.09%)
Cuda Winograd: Fast CUDA Kernels for ResNet Inference.
Stars: ✭ 104 (-5.45%)
Cudart.jl: Julia wrapper for CUDA runtime API
Stars: ✭ 75 (-31.82%)
Parenchyma: An extensible HPC framework for CUDA, OpenCL and native CPU.
Stars: ✭ 71 (-35.45%)
Matconvnet: MatConvNet: CNNs for MATLAB
Stars: ✭ 1,299 (+1080.91%)
Deepjointfilter: The source code of ECCV16 'Deep Joint Image Filtering'.
Stars: ✭ 68 (-38.18%)
Alenka: GPU database engine
Stars: ✭ 1,150 (+945.45%)
Deepnet: Deep.Net machine learning framework for F#
Stars: ✭ 99 (-10%)
Deeppipe2: Deep Learning library using GPU (CUDA/cuBLAS)
Stars: ✭ 90 (-18.18%)
Autodock Gpu: AutoDock for GPUs and other accelerators
Stars: ✭ 65 (-40.91%)
Mpr: Reference implementation for "Massively Parallel Rendering of Complex Closed-Form Implicit Surfaces" (SIGGRAPH 2020)
Stars: ✭ 84 (-23.64%)
Region Conv: Not All Pixels Are Equal: Difficulty-Aware Semantic Segmentation via Deep Layer Cascade
Stars: ✭ 95 (-13.64%)
Modulated Deform Conv: deformable convolution 2D 3D DeformableConvolution DeformConv Modulated Pytorch CUDA
Stars: ✭ 81 (-26.36%)
2016 super resolution: ICCV2015 Image Super-Resolution Using Deep Convolutional Networks
Stars: ✭ 78 (-29.09%)
Numer: Numeric Erlang - vector and matrix operations with CUDA. Heavily inspired by Pteracuda - https://github.com/kevsmith/pteracuda
Stars: ✭ 91 (-17.27%)
Hiop: HPC solver for nonlinear optimization problems
Stars: ✭ 75 (-31.82%)
Chainercv: ChainerCV: a Library for Deep Learning in Computer Vision
Stars: ✭ 1,463 (+1230%)
Titan: A high-performance CUDA-based physics simulation sandbox for soft robotics and reinforcement learning.
Stars: ✭ 73 (-33.64%)
Elasticfusion: Real-time dense visual SLAM system
Stars: ✭ 1,298 (+1080%)
Pygraphistry: PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer
Stars: ✭ 1,365 (+1140.91%)
Torch sampling: Efficient reservoir sampling implementation for PyTorch
Stars: ✭ 68 (-38.18%)
Aurora: Minimal Deep Learning library written in Python/Cython/C++ and Numpy/CUDA/cuDNN.
Stars: ✭ 90 (-18.18%)
Cuhe: CUDA Homomorphic Encryption Library
Stars: ✭ 109 (-0.91%)
Arboretum: Gradient Boosting powered by GPU (NVIDIA CUDA)
Stars: ✭ 64 (-41.82%)
Halloc: A fast and highly scalable GPU dynamic memory allocator
Stars: ✭ 89 (-19.09%)
Cudadrv.jl: A Julia wrapper for the CUDA driver API.
Stars: ✭ 64 (-41.82%)
Cudadtw: GPU-Suite
Stars: ✭ 63 (-42.73%)
Dpp: Detail-Preserving Pooling in Deep Networks (CVPR 2018)
Stars: ✭ 99 (-10%)
Mpn Cov: @ICCV2017: For exploiting second-order statistics, we propose Matrix Power Normalized Covariance pooling (MPN-COV) ConvNets, different from and outperforming those using global average pooling.
Stars: ✭ 63 (-42.73%)
Cutlass: CUDA Templates for Linear Algebra Subroutines
Stars: ✭ 1,123 (+920.91%)
Ggnn: GGNN: State of the Art Graph-based GPU Nearest Neighbor Search
Stars: ✭ 63 (-42.73%)
Tsne Cuda: GPU Accelerated t-SNE for CUDA with Python bindings
Stars: ✭ 1,120 (+918.18%)
Hashcat: World's fastest and most advanced password recovery utility
Stars: ✭ 11,014 (+9912.73%)
Extending Jax: Extending JAX with custom C++ and CUDA code
Stars: ✭ 98 (-10.91%)
Thundersvm: ThunderSVM: A Fast SVM Library on GPUs and CPUs
Stars: ✭ 1,282 (+1065.45%)
Gdax Orderbook Ml: Application of machine learning to the Coinbase (GDAX) orderbook
Stars: ✭ 60 (-45.45%)
Minkowskiengine: Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors
Stars: ✭ 1,110 (+909.09%)
Minhashcuda: Weighted MinHash implementation on CUDA (multi-gpu).
Stars: ✭ 88 (-20%)