Jitify
A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC).
Stars: ✭ 314 (+105.23%)
Nbody
N-body gravitational attraction problem solver
Stars: ✭ 40 (-73.86%)
Thrust
The C++ parallel algorithms library.
Stars: ✭ 3,595 (+2249.67%)
Deeppipe2
Deep Learning library using GPU (CUDA/cuBLAS)
Stars: ✭ 90 (-41.18%)
Knn Cuda
Fast k nearest neighbor search using GPU
Stars: ✭ 310 (+102.61%)
Soul Engine
Physically based renderer and simulation engine for real-time applications.
Stars: ✭ 37 (-75.82%)
Deep High Resolution Net.pytorch
The project is an official implementation of our CVPR2019 paper "Deep High-Resolution Representation Learning for Human Pose Estimation"
Stars: ✭ 3,521 (+2201.31%)
Person Reid gan
ICCV2017 Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in vitro
Stars: ✭ 301 (+96.73%)
Nvidia libs test
Tests and benchmarks for cudnn (and in the future, other nvidia libraries)
Stars: ✭ 36 (-76.47%)
Ffmpeg Build Script
The FFmpeg build script provides an easy way to build a static FFmpeg on OSX and Linux with non-free codecs included.
Stars: ✭ 290 (+89.54%)
Object Detection And Location Realsensed435
Uses the Intel RealSense D435 camera for target detection with the YOLOv3 framework under the OpenCV DNN framework, and computes the 3D position of the object from the depth information. Displays the coordinates in the camera coordinate system in real time. Added: YOLOv5 with a TensorRT model, AGX Xavier, real-time object detection.
Stars: ✭ 36 (-76.47%)
Cuarrays.jl
A Curious Cumulation of CUDA Cuisine
Stars: ✭ 283 (+84.97%)
Babelstream
STREAM, for lots of devices written in many programming models
Stars: ✭ 121 (-20.92%)
Tensor Stream
A library for real-time video stream decoding to CUDA memory
Stars: ✭ 277 (+81.05%)
Fbcuda
Facebook's CUDA extensions.
Stars: ✭ 275 (+79.74%)
Thundersvm
ThunderSVM: A Fast SVM Library on GPUs and CPUs
Stars: ✭ 1,282 (+737.91%)
Cuda
Experiments with CUDA and Rust
Stars: ✭ 31 (-79.74%)
Gprmax
gprMax is open source software that simulates electromagnetic wave propagation using the Finite-Difference Time-Domain (FDTD) method for numerical modelling of Ground Penetrating Radar (GPR)
Stars: ✭ 268 (+75.16%)
Marian Dev
Fast Neural Machine Translation in C++ - development repository
Stars: ✭ 136 (-11.11%)
Kinectfusionlib
Implementation of the KinectFusion approach in modern C++14 and CUDA
Stars: ✭ 261 (+70.59%)
Popsift
PopSift is an implementation of the SIFT algorithm in CUDA.
Stars: ✭ 259 (+69.28%)
gpu-monitor
Script to remotely check GPU servers for free GPUs
Stars: ✭ 85 (-44.44%)
Des Cuda
DES cracking using a brute-force algorithm and CUDA
Stars: ✭ 21 (-86.27%)
Tensorflow Optimized Wheels
TensorFlow wheels built for the latest CUDA/CuDNN with performance flags enabled: SSE, AVX, FMA; XLA
Stars: ✭ 118 (-22.88%)
Torch-TensorRT
PyTorch/TorchScript compiler for NVIDIA GPUs using TensorRT
Stars: ✭ 1,216 (+694.77%)
crowdsource-video-experiments-on-android
Crowdsourcing video experiments (such as collaborative benchmarking and optimization of DNN algorithms) using Collective Knowledge Framework across diverse Android devices provided by volunteers. Results are continuously aggregated in the open repository:
Stars: ✭ 29 (-81.05%)
Knn cuda
pytorch knn [cuda version]
Stars: ✭ 86 (-43.79%)
desert
A fast (?) random sampling drawing library
Stars: ✭ 61 (-60.13%)
Uammd
A CUDA project for Molecular Dynamics, Brownian Dynamics, Hydrodynamics... intended to simulate a very generic system by constructing a simulation with modules.
Stars: ✭ 11 (-92.81%)
CPP-Programming
Various C/C++ examples. DirectX, OpenGL, CUDA, Vulkan, OpenCL.
Stars: ✭ 30 (-80.39%)
hipacc
A domain-specific language and compiler for image processing
Stars: ✭ 72 (-52.94%)
Gpu badmm mt
Bregman ADMM for mass transportation on GPU
Stars: ✭ 10 (-93.46%)
tensorflow-windows
TensorFlow builds compiled on Windows with AVX and AVX2 extensions
Stars: ✭ 20 (-86.93%)
Pytorch Emdloss
PyTorch 1.0 implementation of the approximate Earth Mover's Distance
Stars: ✭ 82 (-46.41%)
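QPT
[In closed beta] A forward-style quick packaging tool for Python environments that quickly packages Python into an EXE and adds support for CUDA, NoAVX, and more.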
Stars: ✭ 308 (+101.31%)
Presentations
Slides and demo code for past presentations
Stars: ✭ 7 (-95.42%)
octotiger
Astrophysics program simulating the evolution of star systems based on the fast multipole method on adaptive Octrees
Stars: ✭ 30 (-80.39%)
Mtensor
A C++ CUDA Tensor Lazy Computing Library
Stars: ✭ 115 (-24.84%)
ThrustRTC
CUDA tool set for non-C++ languages that provides functionality similar to Thrust, with NVRTC at its core.
Stars: ✭ 41 (-73.2%)
Zluda
CUDA on Intel GPUs
Stars: ✭ 937 (+512.42%)
mini-nbody
A simple gravitational N-body simulation in less than 100 lines of C code, with CUDA optimizations.
Stars: ✭ 73 (-52.29%)
Nnabla Ext Cuda
A CUDA Extension of Neural Network Libraries
Stars: ✭ 79 (-48.37%)
Thor
Atmospheric fluid dynamics solver optimized for GPUs.
Stars: ✭ 23 (-84.97%)
Compactcnncascade
A binary library for very fast face detection using compact CNNs.
Stars: ✭ 152 (-0.65%)
Nsimd
Agenium Scale vectorization library for CPUs and GPUs
Stars: ✭ 138 (-9.8%)
Onemkl
oneAPI Math Kernel Library (oneMKL) Interfaces
Stars: ✭ 122 (-20.26%)
Halloc
A fast and highly scalable GPU dynamic memory allocator
Stars: ✭ 89 (-41.83%)
Smallpt Parallel Bvh Gpu
A GPU implementation of smallpt (http://www.kevinbeason.com/smallpt/) with a Bounding Volume Hierarchy (BVH) tree.
Stars: ✭ 36 (-76.47%)
Cuda voxelizer
CUDA Voxelizer to convert polygon meshes into annotated voxel grids
Stars: ✭ 299 (+95.42%)