
snuspl / parallax

License: Apache-2.0
A Tool for Automatic Parallelization of Deep Learning Training in Distributed Multi-GPU Environments.

Programming Languages

python
139335 projects - #7 most used programming language
shell
77523 projects
Starlark
911 projects

Projects that are alternatives to or similar to parallax

Niftynet
[unmaintained] An open-source convolutional neural networks platform for research in medical image analysis and image-guided therapy
Stars: ✭ 1,276 (+896.88%)
Mutual labels:  ml, distributed
Handson Ml
A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in python using Scikit-Learn and TensorFlow.
Stars: ✭ 23,798 (+18492.19%)
Mutual labels:  ml, distributed
Jubatus
Framework and Library for Distributed Online Machine Learning
Stars: ✭ 702 (+448.44%)
Mutual labels:  ml, distributed
Tfmesos
Tensorflow in Docker on Mesos #tfmesos #tensorflow #mesos
Stars: ✭ 194 (+51.56%)
Mutual labels:  ml, distributed
Oneflow
OneFlow is a performance-centered and open-source deep learning framework.
Stars: ✭ 2,868 (+2140.63%)
Mutual labels:  ml, distributed
Tensorflow
An Open Source Machine Learning Framework for Everyone
Stars: ✭ 161,335 (+125942.97%)
Mutual labels:  ml, distributed
Flambe
An ML framework to accelerate research and its path to production.
Stars: ✭ 236 (+84.38%)
Mutual labels:  ml, distributed
dask-sql
Distributed SQL Engine in Python using Dask
Stars: ✭ 271 (+111.72%)
Mutual labels:  ml, distributed
ipns-link
Expose local http-server (web-app) through IPNS
Stars: ✭ 18 (-85.94%)
Mutual labels:  distributed
server
Hashtopolis - A Hashcat wrapper for distributed hashcracking
Stars: ✭ 954 (+645.31%)
Mutual labels:  distributed
agent-python
Official python agent for using the distributed hashcracker Hashtopolis
Stars: ✭ 39 (-69.53%)
Mutual labels:  distributed
bbhtm
bare-bones Hierarchical Temporal Memory
Stars: ✭ 14 (-89.06%)
Mutual labels:  distributed
DMIA ProductionML 2021 Spring
Repository for the Production ML track, Spring 2021
Stars: ✭ 42 (-67.19%)
Mutual labels:  ml
leafserver
🍃A high performance distributed unique ID generation system
Stars: ✭ 31 (-75.78%)
Mutual labels:  distributed
distributed
Library to provide Erlang style distributed computations. This library is inspired by Cloud Haskell.
Stars: ✭ 49 (-61.72%)
Mutual labels:  distributed
probabilistic-circuits
A curated collection of papers on probabilistic circuits, computational graphs encoding tractable probability distributions.
Stars: ✭ 33 (-74.22%)
Mutual labels:  ml
wolfpacs
WolfPACS is a DICOM load balancer written in Erlang.
Stars: ✭ 1 (-99.22%)
Mutual labels:  ml
alphasql
AlphaSQL provides integrated type and schema checking and parallelization for SQL file sets, mainly for BigQuery
Stars: ✭ 35 (-72.66%)
Mutual labels:  parallelization
Final-Year-Project
8th semester final year project of VTU
Stars: ✭ 53 (-58.59%)
Mutual labels:  ml
dtail
DTail is a distributed DevOps tool for tailing, grepping, catting logs and other text files on many remote machines at once.
Stars: ✭ 112 (-12.5%)
Mutual labels:  distributed

Parallax

Parallax is a tool that optimizes data parallel training by considering whether each variable in a deep learning model is sparse or dense. This sparsity-aware data parallel training improves the performance of models with sparse variables, which show relatively low scalability on existing frameworks, while maintaining equal performance for models with only dense variables, such as ResNet-50 and Inception-V3. In addition, Parallax automatically parallelizes training of a single-GPU deep learning model to minimize user effort. If you are interested, you can find the technical details of Parallax in our paper.
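
Roughly speaking, a "sparse" variable here is one whose gradient arrives as tf.IndexedSlices (typically an embedding table updated through gather operations), while a "dense" variable has an ordinary tensor gradient. The TensorFlow 1.x snippet below only illustrates that distinction; it is not part of Parallax's API:

    # Illustration: telling sparse variables apart from dense ones in TF 1.x.
    import tensorflow as tf

    embedding = tf.get_variable("embedding", shape=[100000, 128])  # updated sparsely
    dense_w = tf.get_variable("dense_w", shape=[128, 10])          # updated densely

    ids = tf.placeholder(tf.int32, shape=[None])
    hidden = tf.nn.embedding_lookup(embedding, ids)  # gather -> IndexedSlices gradient
    loss = tf.reduce_sum(tf.matmul(hidden, dense_w))

    for var, grad in zip([embedding, dense_w], tf.gradients(loss, [embedding, dense_w])):
        kind = "sparse" if isinstance(grad, tf.IndexedSlices) else "dense"
        print(var.op.name, "->", kind)  # embedding -> sparse, dense_w -> dense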

Parallax is currently implemented on TensorFlow; we support TensorFlow v1.6 and v1.11. When Parallax uses the Message Passing Interface (MPI), it requires the AllReduce and AllGather operations implemented in Horovod v0.11.2. We plan to support more TensorFlow versions.

Why Parallax?

Parallax makes it easier for users to run distributed training of a deep learning model developed for a single device (e.g., a GPU or CPU) while employing the various optimization techniques that Parallax provides. A Parallax user simply specifies a single-device model graph and a resource specification for distributed training, and Parallax does the rest! For distributed training, Parallax supports a hybrid architecture that combines two different distributed training architectures: Parameter Server (PS) and AllReduce (AR). The hybrid architecture exploits the advantages of both. Moreover, Parallax will soon provide partitioning of large sparse variables to maximize parallelism while keeping computation and communication overhead low. Parallax further optimizes training with local aggregation and smart operation placement to mitigate communication overhead.

The PS and AR architectures are still available in Parallax; users can choose either training architecture if they want (the default is the hybrid architecture for synchronous training).
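
As a rough sketch of that workflow (the entry point parallax.parallel_run, its arguments, and its return values are assumptions based on this description; consult the repository's examples for the actual API), a user script might look like the following:

    # Hypothetical sketch of the Parallax workflow; names are assumptions.
    import tensorflow as tf
    import parallax  # assumed import name

    # 1. Build the model exactly as for a single GPU.
    single_gpu_graph = tf.Graph()
    with single_gpu_graph.as_default():
        images = tf.placeholder(tf.float32, shape=[None, 224, 224, 3])
        labels = tf.placeholder(tf.int64, shape=[None])
        logits = tf.layers.dense(tf.layers.flatten(images), 1000)
        loss = tf.losses.sparse_softmax_cross_entropy(labels, logits)
        train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

    # 2. Hand the unmodified graph and a resource specification to Parallax,
    #    which transforms it for distributed execution (hybrid, PS, or AR).
    sess, num_workers, worker_id, num_replicas = parallax.parallel_run(
        single_gpu_graph, "resource_info")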

Hybrid Architecture

The amount of data transferred by the PS and AR architectures changes according to whether a variable is sparse or dense. Based on this observation, Parallax pursues a hybrid architecture in which the AR architecture handles dense variables and the PS architecture handles sparse variables to minimize communication overhead. Each worker holds a replica of the dense variables, while separate server processes manage only the sparse variables.
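
This split can be pictured as a simple partition of the trainable variables by gradient type. The helper below is illustrative only, not Parallax internals, and its names are hypothetical:

    # Illustrative partition of variables for a hybrid PS/AllReduce setup.
    import tensorflow as tf

    def partition_variables(loss, variables):
        """Return (ar_vars, ps_vars): dense-gradient variables are aggregated
        with AllReduce; sparse-gradient variables are placed on parameter servers."""
        ar_vars, ps_vars = [], []
        for var, grad in zip(variables, tf.gradients(loss, variables)):
            if isinstance(grad, tf.IndexedSlices):
                ps_vars.append(var)   # sparse: managed by separate PS processes
            else:
                ar_vars.append(var)   # dense: replicated on workers, reduced with AR
        return ar_vars, ps_vars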

Parallax Execution Model

When a client initiates a deep learning job with a single-device computation graph, resource information, and optionally a flag that indicates either synchronous or asynchronous training, Parallax transforms the computation graph by analyzing its characteristics. Then, Parallax executes the transformed graph with its optimized communication layer in the distributed environment.
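
The resource information is essentially a list of hosts and the GPUs to use on each, plus the optional synchronization flag. The file format and the sync keyword below are illustrative guesses rather than the documented interface, and they continue the hypothetical sketch above:

    # Hypothetical resource specification and launch; the file format and the
    # sync keyword are illustrative guesses.
    import parallax  # assumed import name, as in the earlier sketch

    with open("resource_info", "w") as f:
        f.write("192.168.0.11: 0,1,2,3,4,5\n"   # one line per host:
                "192.168.0.12: 0,1,2,3,4,5\n")  # "<ip>: <comma-separated GPU ids>"

    # sync=True requests synchronous training (hybrid architecture by default);
    # sync=False would request asynchronous training.
    sess, num_workers, worker_id, num_replicas = parallax.parallel_run(
        single_gpu_graph, "resource_info", sync=True)  # graph from the earlier sketch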

Parallax Benchmark

To give you an idea of how well Parallax performs, we present the following chart, which shows the results of experiments conducted on a cluster of eight machines connected via Mellanox ConnectX-4 cards with 100Gbps InfiniBand. Each machine has six NVIDIA GeForce TITAN Xp GPU cards.

Parallax converges correctly, as do the other frameworks (TensorFlow and Horovod). Parallax is faster than TensorFlow and similar to Horovod for ResNet-50 (a dense model). For LM1B (a sparse model), Parallax outperforms both TensorFlow and Horovod.

Parallax outperforms TensorFlow for both ResNet-50 and LM1B. In addition, Parallax outperforms Horovod for LM1B.

Troubleshooting

See the Troubleshooting page and submit a new issue or contact us if you cannot find an answer.

Contact us

To contact us, send an email to [email protected].

License

Apache License 2.0

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].