ADALabUCSD / cerebro-system

License: Apache-2.0

Data System for Optimized Deep Learning Model Selection

Projects that are alternatives of or similar to cerebro-system

skrobot

skrobot is a Python module for designing, running and tracking Machine Learning experiments / tasks. It is built on top of scikit-learn framework.

Stars: ✭ 22 (+46.67%)

Mutual labels: model-selection, hyperparameter-tuning

mltb

Machine Learning Tool Box

Stars: ✭ 25 (+66.67%)

Mutual labels: hyperparameter-tuning

Hypernets

A General Automated Machine Learning framework to simplify the development of End-to-end AutoML toolkits in specific domains.

Stars: ✭ 221 (+1373.33%)

Mutual labels: hyperparameter-tuning

naturalselection

A general-purpose pythonic genetic algorithm.

Stars: ✭ 17 (+13.33%)

Mutual labels: hyperparameter-tuning

open-box

Generalized and Efficient Blackbox Optimization System.

Stars: ✭ 64 (+326.67%)

Mutual labels: hyperparameter-tuning

Machine-learning

This repository will contain all the stuffs required for beginners in ML and DL do follow and star this repo for regular updates

Stars: ✭ 27 (+80%)

Mutual labels: model-selection

Yellowbrick

Visual analysis and diagnostic tools to facilitate machine learning model selection.

Stars: ✭ 3,439 (+22826.67%)

Mutual labels: model-selection

pathpy

pathpy is an OpenSource python package for the modeling and analysis of pathways and temporal networks using higher-order and multi-order graphical models

Stars: ✭ 124 (+726.67%)

Mutual labels: model-selection

irace

Iterated Racing for Automatic Algorithm Configuration

Stars: ✭ 26 (+73.33%)

Mutual labels: hyperparameter-tuning

sklearndf

DataFrame support for scikit-learn.

Stars: ✭ 54 (+260%)

Mutual labels: model-selection

pyAudioProcessing

Audio feature extraction and classification

Stars: ✭ 165 (+1000%)

Mutual labels: hyperparameter-tuning

mlr3tuning

Hyperparameter optimization package of the mlr3 ecosystem

Stars: ✭ 44 (+193.33%)

Mutual labels: hyperparameter-tuning

differential-privacy-bayesian-optimization

This repo contains the underlying code for all the experiments from the paper: "Automatic Discovery of Privacy-Utility Pareto Fronts"

Stars: ✭ 22 (+46.67%)

Mutual labels: hyperparameter-tuning

scikit-hyperband

A scikit-learn compatible implementation of hyperband

Stars: ✭ 68 (+353.33%)

Mutual labels: hyperparameter-tuning

BAS

BAS R package https://merliseclyde.github.io/BAS/

Stars: ✭ 36 (+140%)

Mutual labels: model-selection

Tpot

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

Stars: ✭ 8,378 (+55753.33%)

Mutual labels: model-selection

map-floodwater-satellite-imagery

This repository focuses on training semantic segmentation models to predict the presence of floodwater for disaster prevention. Models were trained using SageMaker and Colab.

Stars: ✭ 21 (+40%)

Mutual labels: hyperparameter-tuning

sl3

💪 🤔 Modern Super Learning with Machine Learning Pipelines

Stars: ✭ 93 (+520%)

Mutual labels: model-selection

diviner

Diviner is a serverless machine learning and hyper parameter tuning platform

Stars: ✭ 19 (+26.67%)

Mutual labels: hyperparameter-tuning

maggy

Distribution transparent Machine Learning experiments on Apache Spark

Stars: ✭ 83 (+453.33%)

Mutual labels: hyperparameter-tuning


Cerebro

Cerebro is a data system for optimized deep learning model selection. It uses a novel parallel execution strategy called Model Hopper Parallelism (MOP) to execute end-to-end deep learning model selection workloads in a more resource-efficient manner. Detailed technical information about Cerebro can be found in our Technical Report.
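This README does not describe how MOP works beyond naming it. As a rough illustration only, the sketch below assumes the general idea described in the Cerebro publications: the training data is partitioned across workers, each model configuration trains on one partition at a time, and it then "hops" to another worker until it has seen every partition in the epoch. The function name `mop_schedule` and the simple round-robin rotation are hypothetical and are not Cerebro's scheduler or API.

```python
def mop_schedule(num_models, num_workers):
    """Toy per-epoch hopping schedule (illustrative, not Cerebro's scheduler).

    Returns a list of steps; each step is a list indexed by worker that holds
    the id of the model training on that worker's data partition, or None if
    the worker is idle during that step.
    """
    n = max(num_models, num_workers)
    schedule = []
    for step in range(n):
        assignment = [None] * num_workers
        for model in range(num_models):
            # Rotate each model by one slot per step. Slots beyond the number
            # of workers mean the model waits during this step.
            slot = (model + step) % n
            if slot < num_workers:
                assignment[slot] = model
        schedule.append(assignment)
    return schedule


if __name__ == "__main__":
    # Three model configurations sharing two workers (two data partitions).
    for step, assignment in enumerate(mop_schedule(3, 2)):
        print(f"step {step}: worker -> model {assignment}")
```

In this toy rotation every model visits every worker's partition exactly once per epoch, and no worker trains more than one model at a time. When there are at least as many models as workers, no worker sits idle.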

Install

Prerequisites: Cerebro requires Python >= 3.6, TensorFlow >= 2.3, and Apache Spark >= 2.4. You need to install these separately, along with a pyspark version that matches your Spark version. Spark itself must be installed by following the Apache Spark instructions. For most users, the Python packages can be installed with

pip install tensorflow==2.3

and

pip install pyspark==<your spark version>

Note that pyspark can run in local (single-node) mode without a separate Spark installation. If you are only trying Cerebro out and are not using a cluster, you can run

sudo apt-get update
sudo apt-get install -y openjdk-8-jdk
pip install pyspark==3.2.0

This alone should be sufficient for running the examples. To use a cluster with multiple machines, however, you will need a full Spark installation.

Cerebro: The best way to install Cerebro is via pip, although the pip release may not contain the latest changes. WARNING: if you are using Spark/PySpark 3.x, you must install from source instead (see the git clone command that follows the pip command).

pip install -U cerebro-dl

Alternatively, you can clone the repository and run the provided Makefile

git clone https://github.com/ADALabUCSD/cerebro-system.git && cd cerebro-system && make
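The version requirements and the Spark 3.x caveat above can be checked before installing. The following is a minimal sketch, not part of Cerebro: the helper names (`parse_version`, `meets_minimum`, `install_route`, `check_environment`) are hypothetical, and it assumes the packages are installed under the distribution names `tensorflow` and `pyspark` (variants such as `tensorflow-cpu` would be reported as missing). It uses `importlib.metadata`, which requires Python 3.8 or later.

```python
import sys
from importlib import metadata  # standard library since Python 3.8

# Minimum versions stated in the prerequisites above.
MINIMUMS = {"tensorflow": (2, 3), "pyspark": (2, 4)}


def parse_version(text):
    """Turn '3.2.0' or '2.3.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if the installed version string is at least the minimum tuple."""
    return parse_version(installed) >= tuple(minimum)


def install_route(pyspark_version):
    """Pick the install method: source build for Spark/PySpark 3.x, pip otherwise."""
    return "source" if parse_version(pyspark_version) >= (3,) else "pip"


def check_environment():
    """Map each prerequisite to (installed version or None, requirement met)."""
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    report = {"python": (python_version, sys.version_info[:2] >= (3, 6))}
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
        else:
            report[name] = (installed, meets_minimum(installed, minimum))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_environment().items():
        print(f"{name}: installed={installed} ok={ok}")
```

If `check_environment()` reports a pyspark version, passing it to `install_route` indicates whether `pip install -U cerebro-dl` applies or whether the git clone and make route is needed.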

Documentation

Detailed documentation about the system can be found here.

Acknowledgement

This project was/is supported in part by a Hellman Fellowship, the NIDDK of the NIH under award number R01DK114945, and an NSF CAREER Award.

We used the following projects when building Cerebro.

Horovod: Cerebro's Apache Spark implementation uses code from Horovod's implementation for Apache Spark.
Petastorm: We use Petastorm to read Apache Parquet data from remote storage (e.g., HDFS).

Publications

If you use this software for research, please cite the following papers:

@inproceedings{nakandala2019cerebro,
  title={Cerebro: Efficient and Reproducible Model Selection on Deep Learning Systems},
  author={Nakandala, Supun and Zhang, Yuhao and Kumar, Arun},
  booktitle={Proceedings of the 3rd International Workshop on Data Management for End-to-End Machine Learning},
  pages={1--4},
  year={2019}
}
