
SciML / AutoOffload.jl

License: MIT
Automatic GPU, TPU, FPGA, Xeon Phi, Multithreaded, Distributed, etc. offloading for scientific machine learning (SciML) and differential equations

Programming Languages

julia

Projects that are alternatives of or similar to AutoOffload.jl

Occa
JIT Compilation for Multiple Architectures: C++, OpenMP, CUDA, HIP, OpenCL, Metal
Stars: ✭ 230 (+995.24%)
Mutual labels:  multithreading, gpu
funboost
pip install funboost — a full-featured distributed function scheduling framework for Python. It supports every Python concurrency mode and all of the well-known message-queue middleware. A Python function accelerator, the framework is all-encompassing, unifies programming approaches, and covers roughly 50% of Python business scenarios, giving it broad applicability. A single line of code is enough to run any Python function in a distributed fashion. Its old name was function_scheduling_distributed_framework.
Stars: ✭ 351 (+1571.43%)
Mutual labels:  multiprocessing, distributed
Mongols
C++ high performance networking with TCP/UDP/RESP/HTTP/WebSocket protocols
Stars: ✭ 250 (+1090.48%)
Mutual labels:  multithreading, multiprocessing
Tsne Cuda
GPU Accelerated t-SNE for CUDA with Python bindings
Stars: ✭ 1,120 (+5233.33%)
Mutual labels:  multithreading, gpu
Crypto Rl
Deep Reinforcement Learning toolkit: record and replay cryptocurrency limit order book data & train a DDQN agent
Stars: ✭ 328 (+1461.9%)
Mutual labels:  multithreading, multiprocessing
Archive Password Cracker
A well-designed archive password cracking tool with features such as custom dictionaries, dictionary export, and dictionary selection. Implemented in Python, it supports multithreading and multiprocessing, and is under continuous improvement…
Stars: ✭ 65 (+209.52%)
Mutual labels:  multithreading, multiprocessing
pooljs
Browser computing unleashed!
Stars: ✭ 17 (-19.05%)
Mutual labels:  multithreading, distributed
Vqengine
DirectX 11 Renderer written in C++11
Stars: ✭ 250 (+1090.48%)
Mutual labels:  multithreading, gpu
Fast gicp
A collection of GICP-based fast point cloud registration algorithms
Stars: ✭ 307 (+1361.9%)
Mutual labels:  multithreading, gpu
Pypette
Ridiculously simple flow controller for building complex pipelines
Stars: ✭ 258 (+1128.57%)
Mutual labels:  multithreading, multiprocessing
Heteroflow
Concurrent CPU-GPU Programming using Task Models
Stars: ✭ 57 (+171.43%)
Mutual labels:  multithreading, gpu
Scanner
Efficient video analysis at scale
Stars: ✭ 569 (+2609.52%)
Mutual labels:  gpu, distributed
Fsynth
Web-based and pixels-based collaborative synthesizer
Stars: ✭ 146 (+595.24%)
Mutual labels:  gpu, distributed
React Native Multithreading
🧵 Fast and easy multithreading for React Native using JSI
Stars: ✭ 164 (+680.95%)
Mutual labels:  multithreading, multiprocessing
Nyuziprocessor
GPGPU microprocessor architecture
Stars: ✭ 1,351 (+6333.33%)
Mutual labels:  gpu, fpga
bsuir-csn-cmsn-helper
Repository containing ready-made laboratory works in the specialty of computing machines, systems and networks
Stars: ✭ 43 (+104.76%)
Mutual labels:  multiprocessing, multithreading
John
John the Ripper jumbo - advanced offline password cracker, which supports hundreds of hash and cipher types, and runs on many operating systems, CPUs, GPUs, and even some FPGAs
Stars: ✭ 5,656 (+26833.33%)
Mutual labels:  gpu, fpga
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+26833.33%)
Mutual labels:  gpu, distributed
Puma
A Ruby/Rack web server built for parallelism
Stars: ✭ 6,924 (+32871.43%)
Mutual labels:  multithreading
Lib
single header libraries for C/C++
Stars: ✭ 866 (+4023.81%)
Mutual labels:  multithreading

AutoOffload.jl

Build Status

AutoOffload.jl is an experimental library investigating the automatic offloading of costly computations to accelerators like GPUs for user-friendly speedups. Because of the cost of syncing data, this is not as efficient as an algorithm designed to stay entirely on an accelerator, but for expensive operations, like matrix multiplications and FFTs, it can still give a sizable speedup. The purpose of this library is to automatically determine the cutoff points at which offloading to an accelerator makes sense, and then to use this so that all other libraries auto-GPU/TPU/distribute/etc. when appropriate.

Installation

AutoOffload.jl does not depend on the accelerator libraries themselves. To enable a given accelerator, you must therefore install its package separately. For example, GPU offloading requires that you have already run `]add CuArrays`.
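For example, setting up GPU support from the Julia REPL might look like the following (assuming the packages are available in your registry; CuArrays was the CUDA array package at the time this README was written):

```julia
# In the Julia REPL:
using Pkg
Pkg.add("CuArrays")    # equivalent to `]add CuArrays` in Pkg mode
Pkg.add("AutoOffload")
```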

Design Goal

The goal is to have an autotune() function which runs some benchmarks to determine the optimal cutoff values for your hardware configuration and, from this, sets up internal calls so that accelerated versions will auto-offload. The calls are all prefixed with accelerated, like:

  • accelerated_mul!
  • accelerated_fft
  • accelerated_ldiv!
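Under this design, a session might look like the following sketch. The function names come from the list above; the dispatch behavior in the comments is the stated design goal, not a confirmed implementation:

```julia
using AutoOffload
using LinearAlgebra

# Benchmark this machine to find the problem sizes at which
# offloading to an accelerator starts to pay off.
autotune()

A = rand(4000, 4000); B = rand(4000, 4000); C = similar(A)

# Intended behavior: dispatch to the accelerator (e.g. via CuArrays)
# when the tuned cutoff is exceeded, otherwise fall back to the
# standard CPU mul!.
accelerated_mul!(C, A, B)
```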

This library is designed to be automatic, using compile-time checking to enable offloads based on installed compatible packages, without requiring any special dependencies. This means a library can safely depend on AutoOffload.jl and use the accelerated functions without taking on a dependency on the GPU/TPU/etc. libraries.
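This kind of dependency-free detection is commonly done in the Julia ecosystem with the Requires.jl optional-dependency pattern. A hypothetical illustration of the idea (not AutoOffload.jl's confirmed implementation; `GPU_ENABLED` is an invented name, and the UUID shown is CuArrays' registry UUID):

```julia
module OffloadSketch

using Requires  # hooks that fire when an optional package is loaded

const GPU_ENABLED = Ref(false)

function __init__()
    # Runs only if the user has loaded CuArrays; otherwise the GPU
    # path stays disabled and no GPU dependency is ever pulled in.
    @require CuArrays="3a865a2d-5b23-5a0f-bc46-62713ec82fae" begin
        GPU_ENABLED[] = true
    end
end

end # module
```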

Pirate Mode

We plan to implement a pirated version, so that `using AutoOffload.Pirate` will replace the common `*`, `mul!`, etc. calls with the accelerated versions, allowing auto-acceleration in libraries which have not been set up with the accelerated interface functions.
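"Piracy" here means redefining methods the package does not own on types it does not own. A minimal hypothetical illustration of the idea (the cutoff value and dispatch logic are invented for this sketch, not the package's actual code):

```julia
import LinearAlgebra: mul!

const CUTOFF = 500  # hypothetical tuned size cutoff from autotune()

# Type piracy: overriding a stdlib method on stdlib types, so that
# existing library code calling mul! gets the accelerated path.
function mul!(C::Matrix{Float64}, A::Matrix{Float64}, B::Matrix{Float64})
    if size(A, 1) > CUTOFF
        return accelerated_mul!(C, A, B)  # hypothetical accelerated call
    else
        # Fall back to the generic CPU method to avoid recursion.
        return invoke(mul!,
            Tuple{AbstractMatrix, AbstractMatrix, AbstractMatrix},
            C, A, B)
    end
end
```

Because piracy silently changes behavior for all code in the session, gating it behind an explicit opt-in module (`AutoOffload.Pirate`) keeps the default package safe to depend on.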
