Marian Dev
Fast Neural Machine Translation in C++ - development repository
Stars: ✭ 136 (-39.01%)
Dragon
Dragon: A Computation Graph Virtual Machine Based Deep Learning Framework.
Stars: ✭ 168 (-24.66%)
Partial Order Pruning
Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search
Stars: ✭ 135 (-39.46%)
Remotery
Single C file, Realtime CPU/GPU Profiler with Remote Web Viewer
Stars: ✭ 1,908 (+755.61%)
Floor
A C++ Compute/Graphics Library and Toolchain enabling same-source CUDA/Host/Metal/OpenCL/Vulkan C++ programming and execution.
Stars: ✭ 166 (-25.56%)
Libcudacxx
The C++ Standard Library for your entire system.
Stars: ✭ 1,861 (+734.53%)
Simplegpuhashtable
A simple GPU hash table implemented in CUDA using lock free techniques
Stars: ✭ 198 (-11.21%)
Agency
Execution primitives for C++
Stars: ✭ 127 (-43.05%)
Sporco
Sparse Optimisation Research Code
Stars: ✭ 164 (-26.46%)
Rmsd
Calculate Root-mean-square deviation (RMSD) of two molecules, using rotation, in xyz or pdb format
Stars: ✭ 215 (-3.59%)
Kaldi
kaldi-asr/kaldi is the official location of the Kaldi project.
Stars: ✭ 11,151 (+4900.45%)
Jcuda
JCuda - Java bindings for CUDA
Stars: ✭ 165 (-26.01%)
Fcis
Fully Convolutional Instance-aware Semantic Segmentation
Stars: ✭ 1,563 (+600.9%)
Viseron
Self-hosted NVR with object detection
Stars: ✭ 192 (-13.9%)
Onemkl
oneAPI Math Kernel Library (oneMKL) Interfaces
Stars: ✭ 122 (-45.29%)
Multi Gpu Programming Models
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
Stars: ✭ 165 (-26.01%)
Knn cuda
Fast K-Nearest Neighbor search with GPU
Stars: ✭ 119 (-46.64%)
Smartsystemmenu
SmartSystemMenu extends system menu of all windows in the system
Stars: ✭ 209 (-6.28%)
Spoc
Stream Processing with OCaml
Stars: ✭ 115 (-48.43%)
Cx db8
a contextual, biasable, word-or-sentence-or-paragraph extractive summarizer powered by the latest in text embeddings (Bert, Universal Sentence Encoder, Flair)
Stars: ✭ 164 (-26.46%)
Cltune
CLTune: An automatic OpenCL & CUDA kernel tuner
Stars: ✭ 114 (-48.88%)
Ck Caffe
Collective Knowledge workflow for Caffe to automate installation across diverse platforms and to collaboratively evaluate and optimize Caffe-based workloads across diverse hardware, software and data sets (compilers, libraries, tools, models, inputs)
Stars: ✭ 192 (-13.9%)
3ddfa v2
The official PyTorch implementation of Towards Fast, Accurate and Stable 3D Dense Face Alignment, ECCV 2020.
Stars: ✭ 1,961 (+779.37%)
Clojurecuda
Clojure library for CUDA development
Stars: ✭ 158 (-29.15%)
Pytorch Unflow
a reimplementation of UnFlow in PyTorch that matches the official TensorFlow version
Stars: ✭ 113 (-49.33%)
Relion
Image-processing software for cryo-electron microscopy
Stars: ✭ 219 (-1.79%)
Adacof Pytorch
Official source code for our paper "AdaCoF: Adaptive Collaboration of Flows for Video Frame Interpolation" (CVPR 2020)
Stars: ✭ 110 (-50.67%)
Cuhe
CUDA Homomorphic Encryption Library
Stars: ✭ 109 (-51.12%)
Macos Egpu Cuda Guide
Set up CUDA for machine learning (and gaming) on macOS using an NVIDIA eGPU
Stars: ✭ 187 (-16.14%)
Sortmerna
SortMeRNA: next-generation sequence filtering and alignment tool
Stars: ✭ 108 (-51.57%)
Cumf als
CUDA Matrix Factorization Library with Alternating Least Squares (ALS)
Stars: ✭ 154 (-30.94%)
Hashcat
World's fastest and most advanced password recovery utility
Stars: ✭ 11,014 (+4839.01%)
Hip
HIP: C++ Heterogeneous-Compute Interface for Portability
Stars: ✭ 2,609 (+1069.96%)
Dsmnet
Domain-invariant Stereo Matching Networks
Stars: ✭ 153 (-31.39%)
Deepnet
Deep.Net machine learning framework for F#
Stars: ✭ 99 (-55.61%)
Jetson
Helmut Hoffer von Ankershoffen experimenting with arm64 based NVIDIA Jetson (Nano and AGX Xavier) edge devices running Kubernetes (K8s) for machine learning (ML) including Jupyter Notebooks, TensorFlow Training and TensorFlow Serving using CUDA for smart IoT.
Stars: ✭ 151 (-32.29%)
Extending Jax
Extending JAX with custom C++ and CUDA code
Stars: ✭ 98 (-56.05%)
Tigre
TIGRE: Tomographic Iterative GPU-based Reconstruction Toolbox
Stars: ✭ 215 (-3.59%)
Pynvvl
A Python wrapper of NVIDIA Video Loader (NVVL) with CuPy for fast video loading with Python
Stars: ✭ 95 (-57.4%)
Ginkgo
Numerical linear algebra software package
Stars: ✭ 149 (-33.18%)
Fbtt Embedding
This is a Tensor Train based compression library to compress sparse embedding tables used in large-scale machine learning models such as recommendation and natural language processing. We showed this library can reduce the total model size by up to 100x in Facebook’s open sourced DLRM model while achieving the same model quality. Our implementation is faster than the state-of-the-art implementations. Existing state-of-the-art libraries also decompress the whole embedding tables on the fly, so they do not provide memory reduction during training. Our library decompresses only the requested rows, and can therefore provide 10,000 times memory footprint reduction per embedding table. The library also includes a software cache to store a portion of the entries in the table in decompressed format for faster lookup and processing.
Stars: ✭ 92 (-58.74%)
Cuml
cuML - RAPIDS Machine Learning Library
Stars: ✭ 2,504 (+1022.87%)
Sketchgraphs
A dataset of 15 million CAD sketches with geometric constraint graphs.
Stars: ✭ 148 (-33.63%)
Matconvnet
MatConvNet: CNNs for MATLAB
Stars: ✭ 1,299 (+482.51%)
Cunn
Stars: ✭ 205 (-8.07%)
Softmax Splatting
an implementation of softmax splatting for differentiable forward warping using PyTorch
Stars: ✭ 218 (-2.24%)
Nicehashquickminer
Super simple & easy Windows 10 cryptocurrency miner made by NiceHash.
Stars: ✭ 211 (-5.38%)
Core Layout
Flexbox & CSS-style Layout in Swift.
Stars: ✭ 215 (-3.59%)
Oneflow
OneFlow is a performance-centered and open-source deep learning framework.
Stars: ✭ 2,868 (+1186.1%)
Creepminer
Burstcoin C++ CPU and GPU Miner
Stars: ✭ 169 (-24.22%)
Hoomd Blue
Molecular dynamics and Monte Carlo soft matter simulation on GPUs.
Stars: ✭ 143 (-35.87%)