Kaldi: kaldi-asr/kaldi is the official location of the Kaldi project.
Stars: ✭ 11,151 (+6617.47%)
Spoc: Stream Processing with OCaml
Stars: ✭ 115 (-30.72%)
Hoomd Blue: Molecular dynamics and Monte Carlo soft matter simulation on GPUs.
Stars: ✭ 143 (-13.86%)
Mixbench: A GPU benchmark tool for evaluating GPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL)
Stars: ✭ 130 (-21.69%)
Cuhe: CUDA Homomorphic Encryption Library
Stars: ✭ 109 (-34.34%)
Optical Flow Filter: A real time optical flow algorithm implemented on GPU
Stars: ✭ 146 (-12.05%)
Onemkl: oneAPI Math Kernel Library (oneMKL) Interfaces
Stars: ✭ 122 (-26.51%)
Cumf als: CUDA Matrix Factorization Library with Alternating Least Squares (ALS)
Stars: ✭ 154 (-7.23%)
Tensorflow Object Detection Tutorial: The purpose of this tutorial is to learn how to install and prepare TensorFlow framework to train your own convolutional neural network object detection classifier for multiple objects, starting from scratch
Stars: ✭ 113 (-31.93%)
Nsimd: Agenium Scale vectorization library for CPUs and GPUs
Stars: ✭ 138 (-16.87%)
Nnvm: No description or website provided.
Stars: ✭ 1,639 (+887.35%)
Cuda Cnn: CNN accelerated by CUDA. Tested on MNIST, finally reaching 99.76%
Stars: ✭ 148 (-10.84%)
Fcis: Fully Convolutional Instance-aware Semantic Segmentation
Stars: ✭ 1,563 (+841.57%)
Gpurir: Python library for Room Impulse Response (RIR) simulation with GPU acceleration
Stars: ✭ 145 (-12.65%)
Knn cuda: Fast K-Nearest Neighbor search with GPU
Stars: ✭ 119 (-28.31%)
Cx db8: a contextual, biasable, word-or-sentence-or-paragraph extractive summarizer powered by the latest in text embeddings (Bert, Universal Sentence Encoder, Flair)
Stars: ✭ 164 (-1.2%)
Cltune: CLTune, an automatic OpenCL & CUDA kernel tuner
Stars: ✭ 114 (-31.33%)
Forward: A library for high performance deep learning inference on NVIDIA GPUs.
Stars: ✭ 136 (-18.07%)
Adacof Pytorch: Official source code for our paper "AdaCoF: Adaptive Collaboration of Flows for Video Frame Interpolation" (CVPR 2020)
Stars: ✭ 110 (-33.73%)
Compactcnncascade: A binary library for very fast face detection using compact CNNs.
Stars: ✭ 152 (-8.43%)
Hashcat: World's fastest and most advanced password recovery utility
Stars: ✭ 11,014 (+6534.94%)
Spanet: Spatial Attentive Single-Image Deraining with a High Quality Real Rain Dataset (CVPR'19)
Stars: ✭ 136 (-18.07%)
Cuda Winograd: Fast CUDA Kernels for ResNet Inference.
Stars: ✭ 104 (-37.35%)
Ginkgo: Numerical linear algebra software package
Stars: ✭ 149 (-10.24%)
Libcudacxx: The C++ Standard Library for your entire system.
Stars: ✭ 1,861 (+1021.08%)
Xmrminer: 🐜 A CUDA based miner for Monero
Stars: ✭ 158 (-4.82%)
Agency: Execution primitives for C++
Stars: ✭ 127 (-23.49%)
Sketchgraphs: A dataset of 15 million CAD sketches with geometric constraint graphs.
Stars: ✭ 148 (-10.84%)
Primitiv: A Neural Network Toolkit.
Stars: ✭ 164 (-1.2%)
Warp Rnnt: CUDA-Warp RNN-Transducer
Stars: ✭ 122 (-26.51%)
Rmm: RAPIDS Memory Manager
Stars: ✭ 154 (-7.23%)
Babelstream: STREAM, for lots of devices written in many programming models
Stars: ✭ 121 (-27.11%)
Remotery: Single C file, Realtime CPU/GPU Profiler with Remote Web Viewer
Stars: ✭ 1,908 (+1049.4%)
Tensorflow Optimized Wheels: TensorFlow wheels built for latest CUDA/CuDNN and enabled performance flags: SSE, AVX, FMA; XLA
Stars: ✭ 118 (-28.92%)
Mtensor: A C++ Cuda Tensor Lazy Computing Library
Stars: ✭ 115 (-30.72%)
Libgdf: [ARCHIVED] C GPU DataFrame Library
Stars: ✭ 142 (-14.46%)
Pytorch spn: Extension package for spatial propagation network in pytorch.
Stars: ✭ 114 (-31.33%)
Dsmnet: Domain-invariant Stereo Matching Networks
Stars: ✭ 153 (-7.83%)
Pytorch Unflow: a reimplementation of UnFlow in PyTorch that matches the official TensorFlow version
Stars: ✭ 113 (-31.93%)
Ctranslate2: Fast inference engine for OpenNMT models
Stars: ✭ 140 (-15.66%)
Futhark: 💥💻💥 A data-parallel functional programming language
Stars: ✭ 1,641 (+888.55%)
Khiva: An open-source library of algorithms to analyse time series in GPU and CPU.
Stars: ✭ 161 (-3.01%)
Marian Dev: Fast Neural Machine Translation in C++ - development repository
Stars: ✭ 136 (-18.07%)
Dace: DaCe - Data Centric Parallel Programming
Stars: ✭ 106 (-36.14%)
Jetson: Helmut Hoffer von Ankershoffen experimenting with arm64 based NVIDIA Jetson (Nano and AGX Xavier) edge devices running Kubernetes (K8s) for machine learning (ML) including Jupyter Notebooks, TensorFlow Training and TensorFlow Serving using CUDA for smart IoT.
Stars: ✭ 151 (-9.04%)
Partial Order Pruning: Partial Order Pruning for Best Speed/Accuracy Trade-off in Neural Architecture Search
Stars: ✭ 135 (-18.67%)
Jcuda: JCuda - Java bindings for CUDA
Stars: ✭ 165 (-0.6%)
Multi Gpu Programming Models: Examples demonstrating available options to program multiple GPUs in a single node or a cluster
Stars: ✭ 165 (-0.6%)
Clojurecuda: Clojure library for CUDA development
Stars: ✭ 158 (-4.82%)