
XiaohangZhan / dist-framework

Licence: MIT license
A prototype for distributed training/validation/evaluation/extraction with PyTorch.

Programming Languages

python
shell

Projects that are alternatives of or similar to dist-framework

Dweb.page
Your Gateway to the Distributed Web
Stars: ✭ 239 (+1607.14%)
Mutual labels:  distributed
itc.lua
A Lua implementation of Interval Tree Clocks
Stars: ✭ 21 (+50%)
Mutual labels:  distributed
majordodo
Distributed Operations and Data Organizer built on Apache BookKeeper
Stars: ✭ 25 (+78.57%)
Mutual labels:  distributed
Spring Boot Start Current
Spring Boot scaffold: Mybatis, Spring Security, JWT permissions, Spring Cache + Redis
Stars: ✭ 246 (+1657.14%)
Mutual labels:  distributed
Ray
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
Stars: ✭ 18,547 (+132378.57%)
Mutual labels:  distributed
Multi-Node-TimescaleDB
The multi-node setup of TimescaleDB 🐯🐯🐯 🐘 🐯🐯🐯
Stars: ✭ 42 (+200%)
Mutual labels:  distributed
Brainiak
Brain Imaging Analysis Kit
Stars: ✭ 232 (+1557.14%)
Mutual labels:  distributed
simplx
C++ development framework for building reliable cache-friendly distributed and concurrent multicore software
Stars: ✭ 61 (+335.71%)
Mutual labels:  distributed
Tensorflow
An Open Source Machine Learning Framework for Everyone
Stars: ✭ 161,335 (+1152292.86%)
Mutual labels:  distributed
osilo
Personal data silos with secure sharing
Stars: ✭ 15 (+7.14%)
Mutual labels:  distributed
Shardingsphere Elasticjob Cloud
Stars: ✭ 248 (+1671.43%)
Mutual labels:  distributed
Cat
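CAT is a foundational component for server-side projects. It provides clients in multiple languages, including Java, C/C++, Node.js, Python, and Go, and is deeply integrated into Meituan-Dianping's infrastructure middleware (MVC framework, RPC framework, database framework, cache framework, message queue, configuration system, etc.), giving each Meituan-Dianping business line rich performance metrics, health status, real-time alerts, and more.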
Stars: ✭ 16,236 (+115871.43%)
Mutual labels:  distributed
webhunger
WebHunger is an extensible, full-scale crawler framework that supports distributed crawling, so that users can focus on web page parsing without worrying about the crawling process.
Stars: ✭ 17 (+21.43%)
Mutual labels:  distributed
Powerjob
Enterprise job scheduling middleware with distributed computing ability.
Stars: ✭ 3,231 (+22978.57%)
Mutual labels:  distributed
tool-db
A peer-to-peer decentralized database
Stars: ✭ 15 (+7.14%)
Mutual labels:  distributed
Flambe
An ML framework to accelerate research and its path to production.
Stars: ✭ 236 (+1585.71%)
Mutual labels:  distributed
celery-monitor
A Celery monitor app written with Django.
Stars: ✭ 92 (+557.14%)
Mutual labels:  distributed
zlimiter
A toolkit for rate limiting, with support for memory and Redis
Stars: ✭ 17 (+21.43%)
Mutual labels:  distributed
Distributed-ResNet-Tensorflow
A Distributed ResNet on multi-machines each with one GPU card.
Stars: ✭ 20 (+42.86%)
Mutual labels:  distributed
spicedb
Open Source, Google Zanzibar-inspired fine-grained permissions database
Stars: ✭ 3,358 (+23885.71%)
Mutual labels:  distributed

A prototype for distributed training with PyTorch.

Note

This repo will not be maintained.

Features

With this framework, you get:

  • High extensibility: customize your algorithm for any purpose.
  • High-efficiency distributed training, validation, evaluation, feature extraction.

Requirements

  • PyTorch >= 0.4.1

  • Others:

    pip install -r requirements.txt
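
To confirm that an installed PyTorch meets the minimum version listed above, a small check like the following may help. It is only a sketch: the helper names `parse_version` and `meets_minimum` are made up for illustration and are not part of this repo.

```python
import re

MIN_TORCH = (0, 4, 1)  # minimum version stated in the requirements above


def parse_version(version):
    """Turn a version string such as '1.13.1+cu117' into a tuple of ints."""
    # Keep only the leading dotted-number part; suffixes like '+cu117' are ignored.
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version, minimum=MIN_TORCH):
    """Return True if `version` is at least `minimum`."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        status = "ok" if meets_minimum(torch.__version__) else "too old"
        print(torch.__version__, status)
```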

Usage

  • For example, train ResNet-20 on CIFAR-10 in 14 minutes and reach 92.59% accuracy.

    cd dist-framework
    mkdir data
    cd data
    wget https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
    tar -xf cifar-10-python.tar.gz
    cd ..
    sh experiments/classification/Cifar/resnet20/train.sh # train, don't forget to open tensorboard for visualization
    sh experiments/classification/Cifar/resnet20/resume.sh $ITER # resume from iteration $ITER
    sh experiments/classification/Cifar/resnet20/validate.sh $ITER # offline validation
    sh experiments/classification/Cifar/resnet20/evaluate.sh $ITER # offline evaluation
    sh experiments/classification/Cifar/resnet20/extract.sh $ITER # feature extraction
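
Before launching training, it can be worth checking that the archive extracted correctly. The sketch below reads one batch file from the CIFAR-10 "python version" archive, which stores each batch as a pickled dictionary. The function name `load_cifar_batch` is hypothetical and is not this repo's loader, which may read the data differently.

```python
import pickle


def load_cifar_batch(path):
    """Read one CIFAR-10 'python version' batch file and return (data, labels)."""
    with open(path, "rb") as f:
        # The batches were pickled under Python 2, so keys are loaded as bytes.
        batch = pickle.load(f, encoding="bytes")
    return batch[b"data"], batch[b"labels"]
```

Assuming the archive was unpacked under `data/` as in the commands above, `load_cifar_batch("data/cifar-10-batches-py/data_batch_1")` should return the image array and label list of the first training batch. Unpickling the real files requires NumPy to be installed.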

Extensibility

  • Write your own Dataset in dataset.py, put your algorithm under models (see models/classification.py for reference), and design your config file. That's it!
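
As a rough illustration of the first step, a custom Dataset in PyTorch usually follows the map-style protocol of `__len__` plus `__getitem__`. The class below is a minimal sketch of that general protocol only: the name `ToyPairDataset` and its constructor arguments are hypothetical, and the exact interface that dataset.py and the config file expect in this repo is not documented here.

```python
try:
    from torch.utils.data import Dataset  # real base class when PyTorch is installed
except ImportError:
    Dataset = object  # fallback so the sketch still runs without PyTorch


class ToyPairDataset(Dataset):
    """Hypothetical map-style dataset holding (sample, label) pairs in memory."""

    def __init__(self, samples, labels, transform=None):
        if len(samples) != len(labels):
            raise ValueError("samples and labels must have the same length")
        self.samples = samples
        self.labels = labels
        self.transform = transform  # optional callable applied to each sample

    def __len__(self):
        # Number of items; samplers and data loaders rely on this.
        return len(self.samples)

    def __getitem__(self, index):
        # Return one (sample, label) pair, applying the transform if given.
        sample = self.samples[index]
        if self.transform is not None:
            sample = self.transform(sample)
        return sample, self.labels[index]
```

An instance such as `ToyPairDataset([[1, 2], [3, 4]], [0, 1])` can then be indexed directly or wrapped in a `torch.utils.data.DataLoader`.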

Note

  • To stop a running job, use sh scripts/kill.sh.