Ctranslate2
Fast inference engine for OpenNMT models
Stars: ✭ 140 (+18.64%)
Directxmath
DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps
Stars: ✭ 859 (+627.97%)
Sse4 Strstr
SIMD (SWAR/SSE/SSE4/AVX2/AVX512F/ARM Neon) implementation of a modified Karp-Rabin algorithm
Stars: ✭ 115 (-2.54%)
Tensorflow Object Detection Tutorial
The purpose of this tutorial is to learn how to install and prepare the TensorFlow framework to train your own convolutional neural network object detection classifier for multiple objects, starting from scratch
Stars: ✭ 113 (-4.24%)
Chainer
A flexible framework of neural networks for deep learning
Stars: ✭ 5,656 (+4693.22%)
Toys
Storage for my snippets, toy programs, etc.
Stars: ✭ 187 (+58.47%)
Wheels
Performance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)
Stars: ✭ 891 (+655.08%)
Libxsmm
Library for specialized dense and sparse matrix operations, and deep learning primitives.
Stars: ✭ 518 (+338.98%)
Base64simd
Base64 coding and decoding with SIMD instructions (SSE/AVX2/AVX512F/AVX512BW/AVX512VBMI/ARM Neon)
Stars: ✭ 115 (-2.54%)
Aurora
Minimal deep learning library written in Python/Cython/C++ and Numpy/CUDA/cuDNN.
Stars: ✭ 90 (-23.73%)
ternary-logic
Support for ternary logic in SSE, XOP, AVX2 and x86 programs
Stars: ✭ 21 (-82.2%)
cpuwhat
Nim utilities for advanced CPU operations: CPU identification, ISA extension detection, bindings to assorted intrinsics
Stars: ✭ 25 (-78.81%)
Quadray Engine
Realtime raytracer using SIMD on ARM, MIPS, PPC and x86
Stars: ✭ 13 (-88.98%)
Nsimd
Agenium Scale vectorization library for CPUs and GPUs
Stars: ✭ 138 (+16.95%)
Sse Popcount
SIMD (SSE) population count: http://0x80.pl/articles/sse-popcount.html
Stars: ✭ 226 (+91.53%)
Libsimdpp
Portable header-only C++ low level SIMD library
Stars: ✭ 914 (+674.58%)
Arch-Data-Science
Archlinux PKGBUILDs for Data Science, Machine Learning, Deep Learning, NLP and Computer Vision
Stars: ✭ 92 (-22.03%)
gpu-monitor
Script to remotely check GPU servers for free GPUs
Stars: ✭ 85 (-27.97%)
Arraymancer
A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, Cuda and OpenCL backends
Stars: ✭ 793 (+572.03%)
Vc
SIMD Vector Classes for C++
Stars: ✭ 985 (+734.75%)
Nvidia libs test
Tests and benchmarks for cuDNN (and in the future, other NVIDIA libraries)
Stars: ✭ 36 (-69.49%)
Simd
C++ image processing and machine learning library using SIMD: SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX-512, VMX (Altivec) and VSX (Power7), NEON for ARM.
Stars: ✭ 1,263 (+970.34%)
Turbo-Transpose
Transpose: SIMD Integer+Floating Point Compression Filter
Stars: ✭ 50 (-57.63%)
Mini Caffe
Minimal runtime core of Caffe, Forward only, GPU support and Memory efficiency.
Stars: ✭ 373 (+216.1%)
Social-Distancing-and-Face-Mask-Detection
Social Distancing and Face Mask Detection using TensorFlow. Install all required libraries and GPU drivers as well. Refer to README.md or REPORT for the installation requirements
Stars: ✭ 39 (-66.95%)
Cupy
NumPy & SciPy for GPU
Stars: ✭ 5,625 (+4666.95%)
Simple Sh Datascience
A collection of Bash scripts and Dockerfiles to install data science tools, libraries and applications
Stars: ✭ 32 (-72.88%)
Sixtyfour
How fast can we brute force a 64-bit comparison?
Stars: ✭ 41 (-65.25%)
Simde
Implementations of SIMD instruction sets for systems which don't natively support them.
Stars: ✭ 1,012 (+757.63%)
Unisimd Assembler
SIMD macro assembler unified for ARM, MIPS, PPC and x86
Stars: ✭ 63 (-46.61%)
Despacer
C library to remove white space from strings as fast as possible
Stars: ✭ 90 (-23.73%)
Deeppipe2
Deep Learning library using GPU (CUDA/cuBLAS)
Stars: ✭ 90 (-23.73%)
Cuda Winograd
Fast CUDA Kernels for ResNet Inference.
Stars: ✭ 104 (-11.86%)
Halloc
A fast and highly scalable GPU dynamic memory allocator
Stars: ✭ 89 (-24.58%)
Demo Spring Sse
'Server-Sent Events (SSE) in Spring 5 with Web MVC and Web Flux' article and source code.
Stars: ✭ 102 (-13.56%)
Thundersvm
ThunderSVM: A Fast SVM Library on GPUs and CPUs
Stars: ✭ 1,282 (+986.44%)
Mtensor
A C++ Cuda Tensor Lazy Computing Library
Stars: ✭ 115 (-2.54%)
Pytorch Unflow
A reimplementation of UnFlow in PyTorch that matches the official TensorFlow version
Stars: ✭ 113 (-4.24%)
Pygraphistry
PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer
Stars: ✭ 1,365 (+1056.78%)
Minhashcuda
Weighted MinHash implementation on CUDA (multi-gpu).
Stars: ✭ 88 (-25.42%)
Spring 5 Examples
This repository contains Spring Boot 2 / Spring Framework 5 project examples that use the reactive programming model and Kotlin
Stars: ✭ 87 (-26.27%)
Deepnet
Deep.Net machine learning framework for F#
Stars: ✭ 99 (-16.1%)
Python Opencv Cuda
Custom opencv_contrib module which exposes OpenCV CUDA optical flow methods with Python bindings
Stars: ✭ 86 (-27.12%)
Adacof Pytorch
Official source code for our paper "AdaCoF: Adaptive Collaboration of Flows for Video Frame Interpolation" (CVPR 2020)
Stars: ✭ 110 (-6.78%)
Dpp
Detail-Preserving Pooling in Deep Networks (CVPR 2018)
Stars: ✭ 99 (-16.1%)
Knn cuda
PyTorch kNN [CUDA version]
Stars: ✭ 86 (-27.12%)
Extending Jax
Extending JAX with custom C++ and CUDA code
Stars: ✭ 98 (-16.95%)