bifrost - A stream processing framework for high-throughput applications.
Stars: ✭ 48 (-54.72%)
Relion - Image-processing software for cryo-electron microscopy
Stars: ✭ 219 (+106.6%)
Arraymancer - A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, Cuda and OpenCL backends
Stars: ✭ 793 (+648.11%)
Hipsycl - Implementation of SYCL for CPUs, AMD GPUs, NVIDIA GPUs
Stars: ✭ 377 (+255.66%)
Bayadera - High-performance Bayesian Data Analysis on the GPU in Clojure
Stars: ✭ 342 (+222.64%)
GOSH - An ultra-fast, GPU-based large graph embedding algorithm utilizing a novel coarsening algorithm requiring not more than a single GPU.
Stars: ✭ 12 (-88.68%)
rbcuda - CUDA bindings for Ruby
Stars: ✭ 57 (-46.23%)
Taskflow - A General-purpose Parallel and Heterogeneous Task Programming System
Stars: ✭ 6,128 (+5681.13%)
Neanderthal - Fast Clojure Matrix Library
Stars: ✭ 927 (+774.53%)
Thundersvm - ThunderSVM: A Fast SVM Library on GPUs and CPUs
Stars: ✭ 1,282 (+1109.43%)
Region Conv - Not All Pixels Are Equal: Difficulty-Aware Semantic Segmentation via Deep Layer Cascade
Stars: ✭ 95 (-10.38%)
Deepnet - Deep.Net machine learning framework for F#
Stars: ✭ 99 (-6.6%)
Python Opencv Cuda - custom opencv_contrib module which exposes opencv cuda optical flow methods with python bindings
Stars: ✭ 86 (-18.87%)
Vgasim - A Video display simulator
Stars: ✭ 94 (-11.32%)
Mpr - Reference implementation for "Massively Parallel Rendering of Complex Closed-Form Implicit Surfaces" (SIGGRAPH 2020)
Stars: ✭ 84 (-20.75%)
Pytorch Emdloss - PyTorch 1.0 implementation of the approximate Earth Mover's Distance
Stars: ✭ 82 (-22.64%)
2016 super resolution - ICCV2015 Image Super-Resolution Using Deep Convolutional Networks
Stars: ✭ 78 (-26.42%)
Cuda Winograd - Fast CUDA Kernels for ResNet Inference.
Stars: ✭ 104 (-1.89%)
Nyuziprocessor - GPGPU microprocessor architecture
Stars: ✭ 1,351 (+1174.53%)
Ustc Rvsoc - FPGA-based RISC-V CPU+SoC.
Stars: ✭ 77 (-27.36%)
Cudart.jl - Julia wrapper for CUDA runtime API
Stars: ✭ 75 (-29.25%)
Pynvvl - A Python wrapper of NVIDIA Video Loader (NVVL) with CuPy for fast video loading with Python
Stars: ✭ 95 (-10.38%)
Minhashcuda - Weighted MinHash implementation on CUDA (multi-gpu).
Stars: ✭ 88 (-16.98%)
Pygraphistry - PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer
Stars: ✭ 1,365 (+1187.74%)
Infinity - A lightweight C++ RDMA library for InfiniBand networks.
Stars: ✭ 86 (-18.87%)
Fbtt Embedding - This is a Tensor Train based compression library to compress sparse embedding tables used in large-scale machine learning models such as recommendation and natural language processing. We showed this library can reduce the total model size by up to 100x in Facebook’s open sourced DLRM model while achieving the same model quality. Our implementation is faster than the state-of-the-art implementations. Existing state-of-the-art libraries also decompress the whole embedding tables on the fly, so they do not provide memory reduction at runtime during training. Our library decompresses only the requested rows and can therefore provide a 10,000 times memory footprint reduction per embedding table. The library also includes a software cache that stores a portion of the table entries in decompressed format for faster lookup and processing.
Stars: ✭ 92 (-13.21%)
Knn cuda - pytorch knn [cuda version]
Stars: ✭ 86 (-18.87%)
Neorv32 - A small and customizable full-scale 32-bit RISC-V soft-core CPU and SoC written in platform-independent VHDL.
Stars: ✭ 106 (+0%)
Kactus2dev - Kactus2 is a graphical EDA tool based on the IP-XACT standard.
Stars: ✭ 82 (-22.64%)
Boinc - Open-source software for volunteer computing and grid computing.
Stars: ✭ 1,320 (+1145.28%)
Modulated Deform Conv - deformable convolution 2D 3D DeformableConvolution DeformConv Modulated Pytorch CUDA
Stars: ✭ 81 (-23.58%)
Dpp - Detail-Preserving Pooling in Deep Networks (CVPR 2018)
Stars: ✭ 99 (-6.6%)
Nnabla Ext Cuda - A CUDA Extension of Neural Network Libraries
Stars: ✭ 79 (-25.47%)
Numer - Numeric Erlang - vector and matrix operations with CUDA. Heavily inspired by Pteracuda - https://github.com/kevsmith/pteracuda
Stars: ✭ 91 (-14.15%)
Cuda Design Patterns - Some CUDA design patterns and a bit of template magic for CUDA
Stars: ✭ 78 (-26.42%)
Fpga Soc Linux - FPGA+SoC+Linux+Device Tree Overlay+FPGA Manager U-Boot&Linux Kernel&Debian10 Images (for Xilinx:Zynq-Zybo:PYNQ-Z1 Altera:de0-nano-soc)
Stars: ✭ 106 (+0%)
Hiop - HPC solver for nonlinear optimization problems
Stars: ✭ 75 (-29.25%)
Elasticfusion - Real-time dense visual SLAM system
Stars: ✭ 1,298 (+1124.53%)
Extending Jax - Extending JAX with custom C++ and CUDA code
Stars: ✭ 98 (-7.55%)
Antikernel - The Antikernel operating system project
Stars: ✭ 75 (-29.25%)
Drake - An R-focused pipeline toolkit for reproducibility and high-performance computing
Stars: ✭ 1,301 (+1127.36%)
Titan - A high-performance CUDA-based physics simulation sandbox for soft robotics and reinforcement learning.
Stars: ✭ 73 (-31.13%)
Matconvnet - MatConvNet: CNNs for MATLAB
Stars: ✭ 1,299 (+1125.47%)
Symbiflow Examples - Example designs showing different ways to use SymbiFlow toolchains.
Stars: ✭ 71 (-33.02%)
Mads.jl - MADS: Model Analysis & Decision Support
Stars: ✭ 71 (-33.02%)
Riscboy - Portable games console, designed from scratch: CPU, graphics, PCB, and the kitchen sink
Stars: ✭ 103 (-2.83%)
Supra - SUPRA: Software Defined Ultrasound Processing for Real-Time Applications - An Open Source 2D and 3D Pipeline from Beamforming to B-Mode
Stars: ✭ 96 (-9.43%)
Aurora - Minimal Deep Learning library written in Python/Cython/C++ and Numpy/CUDA/cuDNN.
Stars: ✭ 90 (-15.09%)
Parenchyma - An extensible HPC framework for CUDA, OpenCL and native CPU.
Stars: ✭ 71 (-33.02%)
Libceed - CEED Library: Code for Efficient Extensible Discretizations
Stars: ✭ 90 (-15.09%)
Deepjointfilter - The source code of ECCV16 'Deep Joint Image Filtering'.
Stars: ✭ 68 (-35.85%)
Torch sampling - Efficient reservoir sampling implementation for PyTorch
Stars: ✭ 68 (-35.85%)