microsoft / Solution Accelerator Many Models

Licence: mit

Many Models Solution Accelerator

In the real world, many problems are too complex to be solved by a single machine learning model. Whether you are predicting sales for each individual store, building a predictive maintenance model for hundreds of oil wells, or tailoring an experience to individual users, building a model for each instance can lead to improved results on many machine learning problems.

This pattern is common across a wide variety of industries and applies to many real-world use cases. Below are some examples where we have seen this pattern used.

  • Energy and utility companies building predictive maintenance models for thousands of oil wells, hundreds of wind turbines or hundreds of smart meters

  • Retail organizations building workforce optimization models for thousands of stores, campaign promotion propensity models, and price optimization models for the hundreds of thousands of products they sell

  • Restaurant chains building demand forecasting models across thousands of restaurants 

  • Banks and financial institutions building cash replenishment models for individual ATMs or groups of ATMs, or building personalized models for individuals

  • Enterprises building revenue forecasting models at each division level

  • Document management companies building text analytics and legal document search models for each state

Azure Machine Learning (AML) makes it easy to train, operate, and manage hundreds or even thousands of models. This repo walks you through the end-to-end process of creating a many models solution, from training to scoring to monitoring.

Prerequisites

To use this solution accelerator, all you need is access to an Azure subscription and an Azure Machine Learning Workspace that you'll create below.

While it's not required, a basic understanding of Azure Machine Learning will be helpful for understanding the solution. The following resources can help introduce you to AML:

  1. Azure Machine Learning Overview
  2. Azure Machine Learning Tutorials
  3. Azure Machine Learning Sample Notebooks on GitHub

Getting started

1. Deploy Resources

Start by deploying the resources to Azure. The button below will deploy Azure Machine Learning and its related resources:

2. Configure Development Environment

Next you'll need to configure your development environment for Azure Machine Learning. We recommend using a Notebook VM as it's the fastest way to get up and running. Follow the steps in EnvironmentSetup.md to create a Notebook VM and clone the repo onto it.

3. Run Notebooks

Once your development environment is set up, run through the Jupyter Notebooks sequentially following the steps outlined. By the end, you'll know how to train, score, and make predictions using the many models pattern on Azure Machine Learning.

There are two ways to train many models:

  1. Using a custom training script
  2. Using Automated ML

However, the steps needed to set the workspace up and prepare the datasets are the same no matter which option you choose.

Sequence of Notebooks

Contents

In this repo, you'll train and score a forecasting model for each orange juice brand and for each store at a (simulated) grocery chain. By the end, you'll have forecasted sales by using up to 11,973 models to predict sales for the next few weeks.

The data used in this sample is simulated based on the Dominick's Orange Juice Dataset, sales data from a Chicago area grocery store.
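
The idea of one model per store and brand can be sketched in a few lines of plain Python. This is only an illustration of the pattern, not the code used in the notebooks: the sample records, the `fit_trend` helper, and the choice of a least-squares trend line as the "model" are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical weekly sales records: (store, brand, week, quantity).
records = [
    (1000, "dominicks", 1, 100), (1000, "dominicks", 2, 110), (1000, "dominicks", 3, 120),
    (1000, "tropicana", 1, 50), (1000, "tropicana", 2, 50), (1000, "tropicana", 3, 50),
    (1001, "dominicks", 1, 30), (1001, "dominicks", 2, 20), (1001, "dominicks", 3, 10),
]


def fit_trend(points):
    """Fit quantity = intercept + slope * week by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx if sxx else 0.0
    return mean_y - slope * mean_x, slope


def train_many_models(rows):
    """Group rows by (store, brand) and fit one independent model per group."""
    series = defaultdict(list)
    for store, brand, week, quantity in rows:
        series[(store, brand)].append((week, quantity))
    return {key: fit_trend(points) for key, points in series.items()}


def forecast(model, week):
    """Predict the quantity for a given week from an (intercept, slope) model."""
    intercept, slope = model
    return intercept + slope * week


models = train_many_models(records)
week4 = {key: forecast(model, 4) for key, model in models.items()}
for key, value in sorted(week4.items()):
    print(key, round(value, 1))
```

Because each model depends only on its own series, the per-group fits are independent of one another, which is what makes it possible to spread thousands of them across a compute cluster.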

The functionality is split across notebooks, grouped into folders and designed to be run sequentially.

Before training the models

| Notebook | Description |
| --- | --- |
| 00_Setup_AML_Workspace.ipynb | Creates and configures the AML Workspace, including deploying a compute cluster for training. |
| 01_Data_Preparation.ipynb | Prepares the datasets that will be used during training and forecasting. |
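
One common way to prepare data for a many models workload is to split it into one file per series, so that each file can be handed to a separate worker. The sketch below assumes that layout; the column names, the `partition_by_series` helper, and the `Store<id>_<brand>.csv` naming scheme are hypothetical and may differ from what the data preparation notebook does.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path

FIELDS = ["store", "brand", "week", "quantity"]  # assumed column names


def partition_by_series(rows, out_dir):
    """Write one CSV per (store, brand) pair and return the file paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    groups = defaultdict(list)
    for row in rows:
        groups[(row["store"], row["brand"])].append(row)

    paths = []
    for store, brand in sorted(groups):
        path = out_dir / f"Store{store}_{brand}.csv"  # hypothetical naming scheme
        with path.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=FIELDS)
            writer.writeheader()
            # Keep each series in chronological order.
            writer.writerows(sorted(groups[(store, brand)], key=lambda r: r["week"]))
        paths.append(path)
    return paths


rows = [
    {"store": 1000, "brand": "dominicks", "week": 2, "quantity": 110},
    {"store": 1000, "brand": "dominicks", "week": 1, "quantity": 100},
    {"store": 1000, "brand": "tropicana", "week": 1, "quantity": 50},
    {"store": 1001, "brand": "dominicks", "week": 1, "quantity": 30},
]

output_dir = tempfile.mkdtemp()
files = partition_by_series(rows, output_dir)
file_names = [path.name for path in files]
print(file_names)
```

With the data partitioned this way, a group of files can be treated as one unit of work, and each file maps to exactly one model.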

Using a custom training script to train the models:

The following notebooks are located under the Custom_Script/ folder.

| Notebook | Description |
| --- | --- |
| 02_CustomScript_Training_Pipeline.ipynb | Creates a pipeline to train a model for each store and orange juice brand in the dataset using a custom script. |
| 03_CustomScript_Forecasting_Pipeline.ipynb | Creates a pipeline to forecast future orange juice sales using the models trained in the previous step. |

Using Automated ML to train the models:

The following notebooks are located under the Automated_ML/ folder.

| Notebook | Description |
| --- | --- |
| 02_AutoML_Training_Pipeline.ipynb | Creates a pipeline to train a model for each store and orange juice brand in the dataset using Automated ML. |
| 03_AutoML_Forecasting_Pipeline.ipynb | Creates a pipeline to forecast future orange juice sales using the models trained in the previous step. |

How-to-videos

Watch these how-to videos for a step-by-step walkthrough of the many models solution accelerator and learn how to set up your models using both the custom training script and Automated ML.

Custom Script

Watch the video

Automated ML

Watch the video

Key concepts

ParallelRunStep

ParallelRunStep enables the parallel training of models and is commonly used for batch inferencing. This document walks through some of the key concepts around ParallelRunStep.
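
A ParallelRunStep entry script exposes an `init()` function, called once per worker process, and a `run(mini_batch)` function, called once per mini-batch, which returns one result per item it processed. For file-based inputs, the mini-batch is a list of file paths. The sketch below follows that shape but runs locally without Azure; the `quantity` column, the `MODELS` dictionary, and the use of a simple mean as the "model" are assumptions for illustration, not the accelerator's training script.

```python
import csv
import os
import tempfile

MODELS = {}  # hypothetical in-memory stand-in for a model registry


def init():
    """Called once per worker process before any mini-batch is handled."""
    MODELS.clear()


def run(mini_batch):
    """Train one model per file in the mini-batch and return one result per file."""
    results = []
    for file_path in mini_batch:
        with open(file_path, newline="") as handle:
            quantities = [float(row["quantity"]) for row in csv.DictReader(handle)]
        series_name = os.path.splitext(os.path.basename(file_path))[0]
        MODELS[series_name] = sum(quantities) / len(quantities)  # mean as a toy model
        results.append(f"{series_name},{len(quantities)},{MODELS[series_name]:.2f}")
    return results


def _write_series(folder, name, quantities):
    """Helper for the local check below: write a small per-series CSV file."""
    path = os.path.join(folder, name + ".csv")
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["week", "quantity"])
        writer.writerows(enumerate(quantities, start=1))
    return path


# Local check: simulate a single worker receiving one mini-batch of two files.
folder = tempfile.mkdtemp()
batch = [
    _write_series(folder, "Store1000_dominicks", [100, 110, 120]),
    _write_series(folder, "Store1000_tropicana", [50, 50]),
]
init()
results = run(batch)
print(results)
```

Keeping `run` free of cross-file state is what lets the service distribute mini-batches across nodes and processes in any order.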

Pipelines

Pipelines allow you to create workflows in your machine learning projects. These workflows have a number of benefits including speed, simplicity, repeatability, and modularity.

Automated Machine Learning

Automated Machine Learning, also referred to as automated ML or AutoML, is the process of automating the time-consuming, iterative tasks of machine learning model development. It allows data scientists, analysts, and developers to build ML models with high scale, efficiency, and productivity, all while sustaining model quality.
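
To make the "iterative tasks" concrete, here is a toy model-selection loop of the kind that automated ML takes off your hands: try several candidate forecasters, score each on held-out data, and keep the best. It does not use the Azure AutoML API; the three candidate models, the `select_best` helper, and the sample series are assumptions chosen for this illustration.

```python
def naive_last(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon


def mean_model(history, horizon):
    """Repeat the historical mean."""
    return [sum(history) / len(history)] * horizon


def linear_trend(history, horizon):
    """Extrapolate a least-squares line fitted to the history."""
    n = len(history)
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return [intercept + slope * (n + step) for step in range(horizon)]


CANDIDATES = {
    "naive_last": naive_last,
    "mean_model": mean_model,
    "linear_trend": linear_trend,
}


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def select_best(series, holdout=2):
    """Score every candidate on the last `holdout` points and pick the lowest error."""
    train, valid = series[:-holdout], series[-holdout:]
    scores = {
        name: mean_absolute_error(valid, model(train, holdout))
        for name, model in CANDIDATES.items()
    }
    return min(scores, key=scores.get), scores


best_name, scores = select_best([10, 12, 14, 16, 18, 20])
print(best_name, scores)
```

In a many models setting, a selection loop like this would run once per series, so different stores and brands can end up with different winning models.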

Other Concepts

In addition to ParallelRunStep, Pipelines, and Automated Machine Learning, you'll also be working with the following concepts: workspace, datasets, compute targets, Python script steps, and Azure Open Datasets.

Contributing

This project welcomes contributions and suggestions. To learn more visit the contributing section.

Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
