Arraymancer
A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, CUDA and OpenCL backends
Stars: ✭ 793 (+466.43%)
Onednn
oneAPI Deep Neural Network Library (oneDNN)
Stars: ✭ 2,600 (+1757.14%)
Guided Missile Simulation
Guided Missile, Radar and Infrared EOS Simulation Framework written in Fortran.
Stars: ✭ 33 (-76.43%)
Corrfunc
⚡️⚡️⚡️ Blazing fast correlation functions on the CPU.
Stars: ✭ 114 (-18.57%)
Tensorflow Optimized Wheels
TensorFlow wheels built for latest CUDA/CuDNN and enabled performance flags: SSE, AVX, FMA; XLA
Stars: ✭ 118 (-15.71%)
Wheels
Performance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)
Stars: ✭ 891 (+536.43%)
Vc
SIMD Vector Classes for C++
Stars: ✭ 985 (+603.57%)
mbsolve
An open-source solver tool for the Maxwell-Bloch equations.
Stars: ✭ 14 (-90%)
Nsimd
Agenium Scale vectorization library for CPUs and GPUs
Stars: ✭ 138 (-1.43%)
crowdsource-video-experiments-on-android
Crowdsourcing video experiments (such as collaborative benchmarking and optimization of DNN algorithms) using Collective Knowledge Framework across diverse Android devices provided by volunteers. Results are continuously aggregated in the open repository:
Stars: ✭ 29 (-79.29%)
Corium
Corium is a modern scripting language which combines simple, safe and efficient programming.
Stars: ✭ 18 (-87.14%)
Distiller
Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
Stars: ✭ 3,760 (+2585.71%)
gpu-monitor
Script to remotely check GPU servers for free GPUs
Stars: ✭ 85 (-39.29%)
Sleef
SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
Stars: ✭ 353 (+152.14%)
Amgcl
C++ library for solving large sparse linear systems with algebraic multigrid method
Stars: ✭ 390 (+178.57%)
Graffitist
Graph Transforms to Quantize and Retrain Deep Neural Nets in TensorFlow
Stars: ✭ 135 (-3.57%)
Awesome Emdl
Embedded and mobile deep learning research resources
Stars: ✭ 554 (+295.71%)
Taskflow
A General-purpose Parallel and Heterogeneous Task Programming System
Stars: ✭ 6,128 (+4277.14%)
Pyopencl
OpenCL integration for Python, plus shiny features
Stars: ✭ 790 (+464.29%)
peakperf
Achieve peak performance on x86 CPUs and NVIDIA GPUs
Stars: ✭ 33 (-76.43%)
Aimet
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
Stars: ✭ 453 (+223.57%)
Chainer
A flexible framework of neural networks for deep learning
Stars: ✭ 5,656 (+3940%)
Directxmath
DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps
Stars: ✭ 859 (+513.57%)
Arch-Data-Science
Archlinux PKGBUILDs for Data Science, Machine Learning, Deep Learning, NLP and Computer Vision
Stars: ✭ 92 (-34.29%)
monolish
monolish: MONOlithic LInear equation Solvers for Highly-parallel architecture
Stars: ✭ 166 (+18.57%)
hero-sdk
⛔ DEPRECATED ⛔ HERO Software Development Kit
Stars: ✭ 21 (-85%)
Sockeye
Sequence-to-sequence framework with a focus on Neural Machine Translation based on Apache MXNet
Stars: ✭ 990 (+607.14%)
Kernels
This is a set of simple programs that can be used to explore the features of a parallel platform.
Stars: ✭ 287 (+105%)
Deep Diamond
A fast Clojure Tensor & Deep Learning library
Stars: ✭ 288 (+105.71%)
Mini Caffe
Minimal runtime core of Caffe, Forward only, GPU support and Memory efficiency.
Stars: ✭ 373 (+166.43%)
gpubootcamp
This repository consists of GPU bootcamp material for HPC and AI
Stars: ✭ 227 (+62.14%)
Cupy
NumPy & SciPy for GPU
Stars: ✭ 5,625 (+3917.86%)
Stdgpu
stdgpu: Efficient STL-like Data Structures on the GPU
Stars: ✭ 531 (+279.29%)
Kratos
Kratos Multiphysics (a.k.a. Kratos) is a framework for building parallel multi-disciplinary simulation software. Modularity, extensibility and HPC are the main objectives. Kratos has a BSD license and is written in C++ with an extensive Python interface.
Stars: ✭ 558 (+298.57%)
Libxsmm
Library for specialized dense and sparse matrix operations, and deep learning primitives.
Stars: ✭ 518 (+270%)
Marian
Fast Neural Machine Translation in C++
Stars: ✭ 777 (+455%)
Accelerate
Embedded language for high-performance array computations
Stars: ✭ 751 (+436.43%)
Nbody
N body gravity attraction problem solver
Stars: ✭ 40 (-71.43%)
Simde
Implementations of SIMD instruction sets for systems which don't natively support them.
Stars: ✭ 1,012 (+622.86%)
Nvidia libs test
Tests and benchmarks for cudnn (and in the future, other nvidia libraries)
Stars: ✭ 36 (-74.29%)
Sixtyfour
How fast can we brute force a 64-bit comparison?
Stars: ✭ 41 (-70.71%)
Simple Sh Datascience
A collection of Bash scripts and Dockerfiles to install data science tools, libraries and applications
Stars: ✭ 32 (-77.14%)
Unisimd Assembler
SIMD macro assembler unified for ARM, MIPS, PPC and x86
Stars: ✭ 63 (-55%)
Gdax Orderbook Ml
Application of machine learning to the Coinbase (GDAX) orderbook
Stars: ✭ 60 (-57.14%)
Quadray Engine
Realtime raytracer using SIMD on ARM, MIPS, PPC and x86
Stars: ✭ 13 (-90.71%)
Umesimd
UME::SIMD: A library for explicit SIMD vectorization.
Stars: ✭ 66 (-52.86%)
Marian Dev
Fast Neural Machine Translation in C++ - development repository
Stars: ✭ 136 (-2.86%)
Aurora
Minimal Deep Learning library written in Python/Cython/C++ and NumPy/CUDA/cuDNN.
Stars: ✭ 90 (-35.71%)
Pytorchnlpbook
Code and data accompanying Natural Language Processing with PyTorch, published by O'Reilly Media. https://nlproc.info
Stars: ✭ 1,390 (+892.86%)
Simd
C++ image processing and machine learning library using SIMD: SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX-512, VMX (Altivec) and VSX (Power7), NEON for ARM.
Stars: ✭ 1,263 (+802.14%)
Tensorflow Object Detection Tutorial
The purpose of this tutorial is to learn how to install and prepare the TensorFlow framework to train your own convolutional neural network object detection classifier for multiple objects, starting from scratch
Stars: ✭ 113 (-19.29%)
allgebra
Base container for developing C++ and Fortran HPC applications
Stars: ✭ 14 (-90%)
FGPU
No description or website provided.
Stars: ✭ 30 (-78.57%)