
SciML / AutoOffload.jl

License: MIT
Automatic GPU, TPU, FPGA, Xeon Phi, Multithreaded, Distributed, etc. offloading for scientific machine learning (SciML) and differential equations

Programming Languages

julia

Projects that are alternatives of or similar to AutoOffload.jl

Occa
JIT Compilation for Multiple Architectures: C++, OpenMP, CUDA, HIP, OpenCL, Metal
Stars: ✭ 230 (+995.24%)
Mutual labels:  multithreading, gpu
funboost
pip install funboost — a full-featured distributed function scheduling framework for Python. It supports every Python concurrency mode and all of the well-known message-queue middleware. A Python function accelerator, the framework is all-encompassing, unifies programming approaches, and covers roughly 50% of Python business scenarios, giving it broad applicability. A single line of code is enough to run any Python function in a distributed fashion. Its old name was function_scheduling_distributed_framework.
Stars: ✭ 351 (+1571.43%)
Mutual labels:  multiprocessing, distributed
Mongols
C++ high performance networking with TCP/UDP/RESP/HTTP/WebSocket protocols
Stars: ✭ 250 (+1090.48%)
Mutual labels:  multithreading, multiprocessing
Tsne Cuda
GPU Accelerated t-SNE for CUDA with Python bindings
Stars: ✭ 1,120 (+5233.33%)
Mutual labels:  multithreading, gpu
Crypto Rl
Deep Reinforcement Learning toolkit: record and replay cryptocurrency limit order book data & train a DDQN agent
Stars: ✭ 328 (+1461.9%)
Mutual labels:  multithreading, multiprocessing
Archive Password Cracker
A well-designed archive password cracking tool with features such as custom dictionaries, dictionary export, and dictionary selection. Implemented in Python, it supports multithreading and multiprocessing, and is under continuous improvement…
Stars: ✭ 65 (+209.52%)
Mutual labels:  multithreading, multiprocessing
pooljs
Browser computing unleashed!
Stars: ✭ 17 (-19.05%)
Mutual labels:  multithreading, distributed
Vqengine
DirectX 11 Renderer written in C++11
Stars: ✭ 250 (+1090.48%)
Mutual labels:  multithreading, gpu
Fast gicp
A collection of GICP-based fast point cloud registration algorithms
Stars: ✭ 307 (+1361.9%)
Mutual labels:  multithreading, gpu
Pypette
Ridiculously simple flow controller for building complex pipelines
Stars: ✭ 258 (+1128.57%)
Mutual labels:  multithreading, multiprocessing
Heteroflow
Concurrent CPU-GPU Programming using Task Models
Stars: ✭ 57 (+171.43%)
Mutual labels:  multithreading, gpu
Scanner
Efficient video analysis at scale
Stars: ✭ 569 (+2609.52%)
Mutual labels:  gpu, distributed
Fsynth
Web-based and pixels-based collaborative synthesizer
Stars: ✭ 146 (+595.24%)
Mutual labels:  gpu, distributed
React Native Multithreading
🧵 Fast and easy multithreading for React Native using JSI
Stars: ✭ 164 (+680.95%)
Mutual labels:  multithreading, multiprocessing
Nyuziprocessor
GPGPU microprocessor architecture
Stars: ✭ 1,351 (+6333.33%)
Mutual labels:  gpu, fpga
bsuir-csn-cmsn-helper
Repository containing ready-made laboratory works in the specialty of computing machines, systems and networks
Stars: ✭ 43 (+104.76%)
Mutual labels:  multiprocessing, multithreading
John
John the Ripper jumbo - advanced offline password cracker, which supports hundreds of hash and cipher types, and runs on many operating systems, CPUs, GPUs, and even some FPGAs
Stars: ✭ 5,656 (+26833.33%)
Mutual labels:  gpu, fpga
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+26833.33%)
Mutual labels:  gpu, distributed
Puma
A Ruby/Rack web server built for parallelism
Stars: ✭ 6,924 (+32871.43%)
Mutual labels:  multithreading
Lib
single header libraries for C/C++
Stars: ✭ 866 (+4023.81%)
Mutual labels:  multithreading

AutoOffload.jl

Build Status

AutoOffload.jl is an experimental library investigating the automatic offloading of costly computations to accelerators like GPUs for user-friendly speedups. Because of the cost of syncing data, this is not as efficient as an algorithm designed to stay entirely on an accelerator, but for expensive operations, like matrix multiplications and FFTs, it can still give a sizable speedup. The purpose of this library is to automatically determine the cutoff points at which offloading to an accelerator makes sense, and then to use this so that all other libraries auto-GPU/TPU/distribute/etc. when appropriate.

Installation

AutoOffload.jl does not depend on the accelerator libraries themselves. To enable a given accelerator, you must therefore install its package separately. For example, GPU offloading requires that you have already run `]add CuArrays`.
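For example, setting up GPU support from the Julia REPL might look like the following (assuming the packages are available in your registry; CuArrays was the CUDA array package at the time this README was written):

```julia
# In the Julia REPL:
using Pkg
Pkg.add("CuArrays")    # equivalent to `]add CuArrays` in Pkg mode
Pkg.add("AutoOffload")
```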

Design Goal

The goal is to have an autotune() function which runs some benchmarks to determine the optimal cutoff values for your hardware configuration and, from this, sets up internal calls so that accelerated versions will auto-offload. The calls are all prefixed with accelerated, like:

  • accelerated_mul!
  • accelerated_fft
  • accelerated_ldiv!
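Under this design, a session might look like the following sketch. The function names come from the list above; the dispatch behavior in the comments is the stated design goal, not a confirmed implementation:

```julia
using AutoOffload
using LinearAlgebra

# Benchmark this machine to find the problem sizes at which
# offloading to an accelerator starts to pay off.
autotune()

A = rand(4000, 4000); B = rand(4000, 4000); C = similar(A)

# Intended behavior: dispatch to the accelerator (e.g. via CuArrays)
# when the tuned cutoff is exceeded, otherwise fall back to the
# standard CPU mul!.
accelerated_mul!(C, A, B)
```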

This library is designed to be automatic, using compile-time checking to enable offloads based on installed compatible packages, without requiring any special dependencies. This means a library can safely depend on AutoOffload.jl and use the accelerated functions without taking on a dependency on the GPU/TPU/etc. libraries.
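This kind of dependency-free detection is commonly done in the Julia ecosystem with the Requires.jl optional-dependency pattern. A hypothetical illustration of the idea (not AutoOffload.jl's confirmed implementation; `GPU_ENABLED` is an invented name, and the UUID shown is CuArrays' registry UUID):

```julia
module OffloadSketch

using Requires  # hooks that fire when an optional package is loaded

const GPU_ENABLED = Ref(false)

function __init__()
    # Runs only if the user has loaded CuArrays; otherwise the GPU
    # path stays disabled and no GPU dependency is ever pulled in.
    @require CuArrays="3a865a2d-5b23-5a0f-bc46-62713ec82fae" begin
        GPU_ENABLED[] = true
    end
end

end # module
```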

Pirate Mode

We plan to implement a pirated version, so that `using AutoOffload.Pirate` will replace the common `*`, `mul!`, etc. calls with the accelerated versions, allowing auto-acceleration in libraries which have not been set up with the accelerated interface functions.
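"Piracy" here means redefining methods the package does not own on types it does not own. A minimal hypothetical illustration of the idea (the cutoff value and dispatch logic are invented for this sketch, not the package's actual code):

```julia
import LinearAlgebra: mul!

const CUTOFF = 500  # hypothetical tuned size cutoff from autotune()

# Type piracy: overriding a stdlib method on stdlib types, so that
# existing library code calling mul! gets the accelerated path.
function mul!(C::Matrix{Float64}, A::Matrix{Float64}, B::Matrix{Float64})
    if size(A, 1) > CUTOFF
        return accelerated_mul!(C, A, B)  # hypothetical accelerated call
    else
        # Fall back to the generic CPU method to avoid recursion.
        return invoke(mul!,
            Tuple{AbstractMatrix, AbstractMatrix, AbstractMatrix},
            C, A, B)
    end
end
```

Because piracy silently changes behavior for all code in the session, gating it behind an explicit opt-in module (`AutoOffload.Pirate`) keeps the default package safe to depend on.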
