Lightseq
LightSeq: A High Performance Inference Library for Sequence Processing and Generation
Stars: ✭ 501 (+26.84%)
Forward
A library for high performance deep learning inference on NVIDIA GPUs.
Stars: ✭ 136 (-65.57%)
Tensorflow Cmake
TensorFlow examples in C, C++, Go and Python without bazel but with cmake and FindTensorFlow.cmake
Stars: ✭ 418 (+5.82%)
Onnxt5
Summarization, translation, sentiment-analysis, text-generation and more at blazing speed using a T5 version implemented in ONNX.
Stars: ✭ 143 (-63.8%)
Turbotransformers
A fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc) on CPU and GPU.
Stars: ✭ 826 (+109.11%)
fastT5
⚡ Boost inference speed of T5 models by 5x & reduce the model size by 3x.
Stars: ✭ 421 (+6.58%)
Cudahandbook
Source code that accompanies The CUDA Handbook.
Stars: ✭ 345 (-12.66%)
Nvpipe
NVIDIA-accelerated zero latency video compression library for interactive remoting applications
Stars: ✭ 376 (-4.81%)
Visionaray
A C++-based, cross platform ray tracing library
Stars: ✭ 342 (-13.42%)
Gfocal
Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection, NeurIPS2020
Stars: ✭ 376 (-4.81%)
Transformer
A TensorFlow Implementation of the Transformer: Attention Is All You Need
Stars: ✭ 3,646 (+823.04%)
Cudf
cuDF - GPU DataFrame Library
Stars: ✭ 4,370 (+1006.33%)
Mini Caffe
Minimal runtime core of Caffe, Forward only, GPU support and Memory efficiency.
Stars: ✭ 373 (-5.57%)
Arrayfire
ArrayFire: a general purpose GPU library.
Stars: ✭ 3,693 (+834.94%)
Contextualized Topic Models
A Python package to run contextualized topic modeling. CTMs combine BERT with topic models to get coherent topics. Also supports multilingual tasks. Cross-lingual Zero-shot model published at EACL 2021.
Stars: ✭ 318 (-19.49%)
Gpt2client
✍🏻 gpt2-client: Easy-to-use TensorFlow Wrapper for GPT-2 117M, 345M, 774M, and 1.5B Transformer Models 🤖 📝
Stars: ✭ 322 (-18.48%)
Ganet
GA-Net: Guided Aggregation Net for End-to-end Stereo Matching
Stars: ✭ 393 (-0.51%)
Music Translation
A Universal Music Translation Network - a method for translating music across musical instruments and styles.
Stars: ✭ 385 (-2.53%)
Darkpose
Distribution-Aware Coordinate Representation for Human Pose Estimation
Stars: ✭ 369 (-6.58%)
Jitify
A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC).
Stars: ✭ 314 (-20.51%)
Cuda Programming
Sample code for my CUDA programming book
Stars: ✭ 313 (-20.76%)
Cuda Api Wrappers
Thin C++-flavored wrappers for the CUDA Runtime API
Stars: ✭ 362 (-8.35%)
Snips Nlu Rs
Snips NLU Rust implementation
Stars: ✭ 315 (-20.25%)
Sleef
SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
Stars: ✭ 353 (-10.63%)
Cuda.jl
CUDA programming in Julia.
Stars: ✭ 370 (-6.33%)
Nimtorch
PyTorch - Python + Nim
Stars: ✭ 346 (-12.41%)
Amgcl
C++ library for solving large sparse linear systems with algebraic multigrid method
Stars: ✭ 390 (-1.27%)
Bayadera
High-performance Bayesian Data Analysis on the GPU in Clojure
Stars: ✭ 342 (-13.42%)
Flow Forecast
Deep learning PyTorch library for time series forecasting, classification, and anomaly detection (originally for flood forecasting).
Stars: ✭ 368 (-6.84%)
Cudanative.jl
Julia support for native CUDA programming
Stars: ✭ 393 (-0.51%)
Cudpp
CUDA Data Parallel Primitives Library
Stars: ✭ 333 (-15.7%)
Vuda
VUDA is a header-only library based on Vulkan that provides a CUDA Runtime API interface for writing GPU-accelerated applications.
Stars: ✭ 373 (-5.57%)
Gpt2 Chinese
Chinese version of GPT2 training code, using BERT tokenizer.
Stars: ✭ 4,592 (+1062.53%)
3
GPU-accelerated micromagnetic simulator
Stars: ✭ 324 (-17.97%)
Libsgm
Stereo Semi Global Matching by CUDA
Stars: ✭ 368 (-6.84%)
Cutorch
A CUDA backend for Torch7
Stars: ✭ 322 (-18.48%)
Nlp Tutorials
Simple implementations of NLP models. Tutorials are written in Chinese on my website https://mofanpy.com
Stars: ✭ 394 (-0.25%)
Transformer Tensorflow
TensorFlow implementation of 'Attention Is All You Need (2017. 6)'
Stars: ✭ 319 (-19.24%)
Loopy
A code generator for array-based code on CPUs and GPUs
Stars: ✭ 367 (-7.09%)
Thrust
The C++ parallel algorithms library.
Stars: ✭ 3,595 (+810.13%)
Rezero
Official PyTorch Repo for "ReZero is All You Need: Fast Convergence at Large Depth"
Stars: ✭ 317 (-19.75%)
Pyhgt
Code for "Heterogeneous Graph Transformer" (WWW'20), which is based on pytorch_geometric
Stars: ✭ 313 (-20.76%)
Neuralnetwork.net
A TensorFlow-inspired neural network library built from scratch in C# 7.3 for .NET Standard 2.0, with GPU support through cuDNN
Stars: ✭ 392 (-0.76%)
Hipsycl
Implementation of SYCL for CPUs, AMD GPUs, NVIDIA GPUs
Stars: ✭ 377 (-4.56%)
Fast gicp
A collection of GICP-based fast point cloud registration algorithms
Stars: ✭ 307 (-22.28%)
Knn Cuda
Fast k nearest neighbor search using GPU
Stars: ✭ 310 (-21.52%)
Arrayfire Python
Python bindings for ArrayFire: A general purpose GPU library.
Stars: ✭ 358 (-9.37%)
Cognitive Speech Tts
Microsoft Text-to-Speech API sample code in several languages, part of Cognitive Services.
Stars: ✭ 312 (-21.01%)
Cppflow
Run TensorFlow models in C++ without installation and without Bazel
Stars: ✭ 357 (-9.62%)