
anshuman73 / DeML-Golem

Licence: GPL-3.0 license
Proof Of Concept of DEcentralised Machine Learning on top of the Golem (https://golem.network/) architecture

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to DeML-Golem

backdoors101
Backdoors Framework for Deep Learning and Federated Learning. A light-weight tool to conduct your research on backdoors.
Stars: ✭ 181 (+417.14%)
Mutual labels:  federated-learning
federated-xgboost
Federated gradient boosted decision tree learning
Stars: ✭ 39 (+11.43%)
Mutual labels:  federated-learning
golem
GOLEM is a numerical simulator for modelling coupled Thermo-Hydro-Mechanical processes in faulted geothermal reservoirs.
Stars: ✭ 20 (-42.86%)
Mutual labels:  golem
PFL-Non-IID
The origin of the Non-IID phenomenon is the personalization of users, who generate the Non-IID data. With Non-IID (Not Independent and Identically Distributed) issues existing in the federated learning setting, a myriad of approaches has been proposed to crack this hard nut. In contrast, the personalized federated learning may take the advantage…
Stars: ✭ 58 (+65.71%)
Mutual labels:  federated-learning
Awesome-Federated-Machine-Learning
Everything about federated learning, including research papers, books, codes, tutorials, videos and beyond
Stars: ✭ 190 (+442.86%)
Mutual labels:  federated-learning
baai-federated-learning-crane-baseline
Electric Power Artificial Intelligence Data Competition: hydraulic crane object detection track
Stars: ✭ 17 (-51.43%)
Mutual labels:  federated-learning
MOON
Model-Contrastive Federated Learning (CVPR 2021)
Stars: ✭ 93 (+165.71%)
Mutual labels:  federated-learning
Clay
Golem is creating a global market for computing power.
Stars: ✭ 2,963 (+8365.71%)
Mutual labels:  golem
Front-End
Federated Learning based Deep Learning. Docs: https://fets-ai.github.io/Front-End/
Stars: ✭ 35 (+0%)
Mutual labels:  federated-learning
Pysyft
A library for answering questions using data you cannot see
Stars: ✭ 7,811 (+22217.14%)
Mutual labels:  federated-learning
Federated-Learning-Mini-Framework
Federated Learning mini-framework with Keras
Stars: ✭ 38 (+8.57%)
Mutual labels:  federated-learning
FATE-Serving
A scalable, high-performance serving system for federated learning models
Stars: ✭ 107 (+205.71%)
Mutual labels:  federated-learning
Fate
An Industrial Grade Federated Learning Framework
Stars: ✭ 3,775 (+10685.71%)
Mutual labels:  federated-learning
FedLab-benchmarks
Standard federated learning implementations in FedLab and FL benchmarks.
Stars: ✭ 49 (+40%)
Mutual labels:  federated-learning
yapapi
Python high-level API for Golem.
Stars: ✭ 33 (-5.71%)
Mutual labels:  golem
KD3A
Here is the official implementation of the model KD3A in paper "KD3A: Unsupervised Multi-Source Decentralized Domain Adaptation via Knowledge Distillation".
Stars: ✭ 63 (+80%)
Mutual labels:  federated-learning
Challenge
The repo for the FeTS Challenge
Stars: ✭ 21 (-40%)
Mutual labels:  federated-learning
FedDA
Source code for 'Dual Attention Based FL for Wireless Traffic Prediction'
Stars: ✭ 41 (+17.14%)
Mutual labels:  federated-learning
golem-electron
Graphical user interface for Golem Project
Stars: ✭ 99 (+182.86%)
Mutual labels:  golem
Awesome Mlops
A curated list of references for MLOps
Stars: ✭ 7,119 (+20240%)
Mutual labels:  federated-learning

DeML-Golem

Proof Of Concept of DEcentralised Machine Learning on top of the Golem (https://golem.network/) architecture

Links

Presentation Link

Presentation Download Link

YouTube Link (Part 1) [Explains the project]

YouTube Link (Part 2) [The Demo]

The Idea

The basic idea of DeML (Decentralised ML) is to provide a framework for training Machine Learning models across a network of computers with ease and at low computing cost. DeML uses the concepts laid down by Federated Learning (FL) to combine the sub-models it trains on different provider nodes into a full-fledged model comparable to one trained entirely on a single machine. FL models do suffer a small accuracy penalty, but they do not require a single expensive machine to do the computation.
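
As a rough sketch of that combining step (not the repository's actual code), Federated-Averaging-style merging of Keras models can look like the following; the sample-count weighting is an assumption made for illustration:

def federated_average(provider_models, sample_counts):
    # Average layer weights across provider models, weighting each model
    # by the number of samples it was trained on.
    total = sum(sample_counts)
    weight_sets = [model.get_weights() for model in provider_models]
    averaged = []
    for layer_weights in zip(*weight_sets):
        averaged.append(sum(w * (n / total) for w, n in zip(layer_weights, sample_counts)))
    return averaged

# combined_model must share the architecture of the provider models:
# combined_model.set_weights(federated_average(provider_models, sample_counts))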

You can learn about FL here and here, or enjoy a quirky comic here. FL is primarily about privacy-preserving systems; we don't explore that aspect here as much as the foundation it lays for distributed training.

Currently, for the hackathon, MNIST (the "Hello World" of the ML realm) is used as the proof of concept to showcase this MVP, but as Golem relaxes the restrictions explained below on the way to mainnet, you can expect to train increasingly difficult and useful models on Golem. [If you're looking at this repository beyond the submission date, chances are there's another branch on the repo with a more complicated model.]

Ideally, I wanted to build a twin-component product as part of DeML: one component where I could upload a model and a data file and receive inferences from the providers, and one where I could train. I decided to build only the latter due to time constraints and my inexperience with Golem. An orchestrator for producing results from an already trained model should be much easier to build (and useful for models like GANs, which use a lot of computing power).

Motivation

The motivation for this project comes from a real-life incident: in order to train an ML model for my research, I accidentally racked up an AWS bill large enough to buy groceries for a month. As a student, this approach obviously isn't scalable, and I realised there needs to be a better way. Most of our personal devices are hardly ever enough to run long and complex models.

Free services like Google Colab or FloydHub are great, but they come with their own set of restrictions that make it extremely hard to run the variations of a model that a researcher might need to.

Hence was born the idea of DeML: a way to train your models on nodes provided by the Golem Network by, theoretically, writing just two components - your data loader (both locally and on the provider) and your ML model!
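
Here is a hedged sketch of what those two components might look like; the names get_dataset and build_model are illustrative placeholders, not the repository's actual interface:

import tensorflow as tf

def get_dataset():
    # Illustrative data loader: MNIST scaled to [0, 1]. On a provider node the
    # data would have to ship with the Docker image (see the limitations below).
    (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
    return x_train / 255.0, y_train

def build_model():
    # Illustrative model definition: a small dense network for MNIST.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model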

Instructions to run locally

The current implementation is extremely easy to start with: if you'd like a more complex model on the same dataset, simply edit model_base.py with your custom model and try it out.

Here are the proper steps to get started -

Step 1

Install the required dependencies mentioned in the Pipfile. You need to have pipenv installed; you can do that with pip install pipenv. This tool lets you create an environment (with pipenv --three) and then install the dependencies directly using pipenv install.

Step 2

Once you have the environment set up, all you need to do is confirm you have a working TensorFlow installation (some machines cannot compile TF). You can do something like this inside the env (use pipenv shell to open a shell in the environment):

>>> import tensorflow as tf
>>> hello = tf.constant("hello TensorFlow!")
>>> sess = tf.Session()
>>> print(sess.run(hello))
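
The snippet above uses the TensorFlow 1.x Session API. If your environment resolves to TensorFlow 2.x, where tf.Session no longer exists, an equivalent sanity check is:

>>> import tensorflow as tf
>>> print(tf.constant("hello TensorFlow!"))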

If this works, you're set to go!

Step 3

Run the orchestrator! Simply do python provider_orchestrator.py to start up your training!

Current Limitations

The current set of limitations mostly comes from the present design of Golem. The ones that impact us the most are -

  1. The executor time limit. Every executor instance currently gets a maximum of 30 minutes to run, so training complex models and leaving them running for hours or days is not yet possible.

  2. Lack of internet connectivity on the nodes. This means that the data to be used for training has to be uploaded (or embedded in the Docker image). Both of these have their own set of issues: uploading the dataset to your provider is next to impossible with current inter-node communication speeds, and data sent along with the Docker image (which has a 1 GiB limit) might hit a bug that force-loads it into RAM, restricting the dataset size to the available RAM (see this chat thread for reference).

Most of these issues will be ironed out in the upcoming weeks, and hopefully that will allow for a more robust usage of such an application.

The idea is that instead of a high-compute server costing you a bazillion dollars, you get to train your model on a network of smaller computers at negligible cost (which is free for now in the Rinkeby stage!), and still approach the accuracy of a sequentially trained model.

You can even do something suggested by a community member!

Innovations + Future Work

Cool things this Demo does:

1. Runs a completely customisable model in a distributed way. You can control the number of providers, the epochs on each provider and even how they preprocess the data! (See the local sketch after this list.)
2. Gets logs indicating the performance of each node at the intermediate steps.
3. Combines the sub-models and lets you get the prepared models at different stages!
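
As a hedged illustration of what that control amounts to, here is a purely local simulation of the round structure, reusing the placeholder helpers from the sketches above (get_dataset, build_model, federated_average); in the real run, provider_orchestrator.py dispatches each provider's training to a Golem node instead:

import numpy as np

NUM_PROVIDERS = 4        # how many provider nodes to split the work across
EPOCHS_PER_PROVIDER = 3  # epochs each node trains before returning its weights

x, y = get_dataset()
x_shards = np.array_split(x, NUM_PROVIDERS)
y_shards = np.array_split(y, NUM_PROVIDERS)

provider_models, sample_counts = [], []
for i, (xi, yi) in enumerate(zip(x_shards, y_shards)):
    model = build_model()
    history = model.fit(xi, yi, epochs=EPOCHS_PER_PROVIDER, verbose=0)
    print(f"provider {i}: final loss {history.history['loss'][-1]:.4f}")
    provider_models.append(model)
    sample_counts.append(len(xi))

combined_model = build_model()
combined_model.set_weights(federated_average(provider_models, sample_counts))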

What more I would like to see it doing:

1. Build a UI. Something like slate, another submission in the hackathon, where I can simply provide my model definition and dataset, and never interact much with the code unless I want to.
2. Another module to run trained ML models on live data - with a UI as well, where I just upload my .h5 files and specify the dataset.
3. Run more complex models!! (Keep an eye out for another branch!)
4. Run ML models on GPU! The Golem team is working on bringing in GPU support, and it will be amazing to see the performance bump that could bring to ML training, especially for models like CNNs.