microsoft / Mlhyperparametertuning

Licence: MIT
Example of using HyperDrive to tune a regular ML learner.

Author: Mario Bourgoin

Training of Python scikit-learn models on Azure

Overview

This scenario shows how to tune a Frequently Asked Questions (FAQ) matching model that can be deployed as a web service to provide predictions for user questions. For this scenario, "Input Data" in the architecture diagram refers to text strings containing the user questions to match with a list of FAQs. The scenario is designed for the scikit-learn machine learning library for Python, but it can be generalized to any scenario that uses Python models to make real-time predictions.
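
To make the matching task concrete, here is a minimal sketch of ranking a list of FAQs against a user question. It is not the tutorial's model: the Jaccard token overlap used for scoring is a simple stand-in for the match probability a trained scikit-learn pipeline would produce, and the function names, FAQ strings, and question are all hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_faqs(question, faqs):
    """Return (score, faq) pairs sorted from best to worst match.

    The score is the Jaccard similarity of the two token sets, a simple
    stand-in for the match probability a trained model would produce.
    """
    q_tokens = tokenize(question)
    scored = []
    for faq in faqs:
        f_tokens = tokenize(faq)
        union = q_tokens | f_tokens
        score = len(q_tokens & f_tokens) / len(union) if union else 0.0
        scored.append((score, faq))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical FAQ list and user question, for illustration only.
faqs = [
    "How do I remove a property from a JavaScript object?",
    "How do I check whether a string contains a substring in JavaScript?",
    "What is the difference between let and var?",
]
ranking = rank_faqs("remove property from object in javascript", faqs)
best_score, best_faq = ranking[0]
print(f"{best_score:.2f}  {best_faq}")
```

A real matcher would replace `rank_faqs` with a fitted model, but the interface stays the same: one question in, a scored list of candidate FAQs out.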

Design

The scenario uses a subset of Stack Overflow question data that includes original questions tagged as JavaScript, their duplicate questions, and their answers. It tunes a scikit-learn pipeline to predict the match probability of a duplicate question with each of the original questions. The application flow for this architecture is as follows:

  1. Create an Azure ML Service workspace.
  2. Create an Azure ML Compute cluster.
  3. Upload training, tuning, and testing data to Azure Storage.
  4. Configure a HyperDrive random hyperparameter search.
  5. Submit the search.
  6. Monitor until complete.
  7. Retrieve the best set of hyperparameters.
  8. Register the best model.
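
Steps 4 through 7 amount to a random hyperparameter search. The sketch below illustrates that idea locally with the Python standard library only. It does not call Azure or HyperDrive, and the search space, parameter names, and `toy_objective` are hypothetical placeholders for a real training run that reports a validation metric.

```python
import math
import random


def sample_config(space, rng):
    """Draw one hyperparameter configuration from the search space."""
    config = {}
    for name, (kind, values) in space.items():
        if kind == "choice":
            config[name] = rng.choice(values)
        elif kind == "loguniform":
            low, high = values
            config[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        else:
            raise ValueError(f"unknown distribution: {kind}")
    return config


def random_search(objective, space, n_runs, seed=0):
    """Evaluate n_runs random configurations and return the best one."""
    rng = random.Random(seed)
    best_score, best_config = -math.inf, None
    for _ in range(n_runs):
        config = sample_config(space, rng)
        score = objective(config)
        if score > best_score:
            best_score, best_config = score, config
    return best_score, best_config


# Hypothetical search space; the names are illustrative, not the tutorial's.
space = {
    "n_estimators": ("choice", [100, 200, 400, 800]),
    "learning_rate": ("loguniform", (1e-3, 1.0)),
}


def toy_objective(config):
    """Placeholder for a training run that reports a validation metric."""
    rate_penalty = abs(math.log10(config["learning_rate"]) + 1)
    size_penalty = abs(config["n_estimators"] - 400) / 1000
    return -(rate_penalty + size_penalty)


best_score, best_config = random_search(toy_objective, space, n_runs=50)
print(best_score, best_config)
```

In the tutorial, each configuration is evaluated as a separate run on the compute cluster rather than in a local loop, which is why the flow includes separate submit and monitor steps.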

Prerequisites

  1. Linux (Ubuntu).
  2. Anaconda Python installed.
  3. Azure account.

The tutorial was developed on an Azure Ubuntu DSVM, which addresses the first two prerequisites. You can allocate such a VM on Azure Portal by creating a "Data Science Virtual Machine for Linux (Ubuntu)" resource.

Setup

To set up your environment to run these notebooks, please follow these steps. They set up the notebooks to use Azure seamlessly.

  1. Create a Linux Ubuntu VM.
  2. Log in to your VM. We recommend that you use a graphical client such as X2Go to access your VM. The remaining steps are to be done on the VM.
  3. Open a terminal emulator.
  4. Clone, fork, or download the zip file for this repository:
    git clone https://github.com/Microsoft/MLHyperparameterTuning.git
    
  5. Enter the local repository:
    cd MLHyperparameterTuning
    
  6. Create the Python MLHyperparameterTuning virtual environment from the environment.yml file:
    conda env create -f environment.yml
    
  7. Activate the virtual environment:
    source activate MLHyperparameterTuning
    
    The remaining steps should be done in this virtual environment.
  8. Log in to Azure:
    az login
    
    You can verify that you are logged in to your subscription by executing the command:
    az account show -o table
    
  9. If you have more than one Azure subscription, select the one you want to use:
    az account set --subscription <Your Azure Subscription>
    
  10. Start the Jupyter notebook server:
    jupyter notebook
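
The setup steps above rely on four command-line tools: `git`, `conda`, `az`, and `jupyter`. As a hedged convenience, the following sketch checks that each one can be found on your `PATH` before you start. The helper names are hypothetical, and the check only confirms that a command exists, not that its version is suitable.

```python
import shutil

# Commands invoked in the setup steps above.
REQUIRED_TOOLS = ["git", "conda", "az", "jupyter"]


def find_tools(names):
    """Map each command name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names):
    """Return the command names that could not be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]


missing = missing_tools(REQUIRED_TOOLS)
if missing:
    print("Missing from PATH:", ", ".join(missing))
else:
    print("All required tools were found.")
```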
    

Steps

After following the setup instructions above, run the Jupyter notebooks in order starting with 00_Data_Prep_Notebook.ipynb.
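
If you want to see the run order at a glance, this sketch lists the notebooks in a directory sorted by their numeric prefix. It assumes that the other notebooks follow the same `NN_` prefix convention as `00_Data_Prep_Notebook.ipynb`, which is an assumption rather than something stated here, and `ordered_notebooks` is a hypothetical helper name.

```python
import re
from pathlib import Path


def ordered_notebooks(directory="."):
    """Return notebooks with a numeric prefix, sorted by that prefix."""
    numbered = []
    for path in Path(directory).glob("*.ipynb"):
        match = re.match(r"(\d+)_", path.name)
        if match:
            numbered.append((int(match.group(1)), path.name))
    return [name for _, name in sorted(numbered)]


# Run from the root of the cloned repository.
for name in ordered_notebooks():
    print(name)
```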

Cleaning up

The last Jupyter notebook describes how to delete the Azure resources created for running the tutorial. Consult the conda documentation for information on how to remove the conda environment created during the setup. If you created a VM for this tutorial, you can delete it as well.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
