microsoft / az-ml-batch-score

Licence: MIT License
Deploying a Batch Scoring Pipeline for Python Models

Author: Said Bleik

Overview

Scoring Anomaly Detection Models at Scale using Azure Machine Learning

In this repository you will find a set of scripts and commands that help you build a scalable solution for scoring many models in parallel using Azure Machine Learning (AML).

The solution can be used as a template and generalizes to other problems. The problem addressed here is monitoring the operation of a large number of devices in an IoT setting, where each device sends sensor readings continuously. We assume there are pre-trained anomaly detection models, one for each sensor of a device. The models are used to predict whether a series of measurements, aggregated over a predefined time interval, corresponds to an anomaly.

To get started, read through the Design section, then go through the following sections to create the Python environment, Azure resources, and the scoring pipeline:

  • Design
  • Prerequisites
  • Create Environment
  • Steps
    • Create Azure Resources
    • Create and Schedule the Scoring Pipeline
    • Validate Deployments and Pipeline Execution
  • Cleanup

Design

System Architecture

This solution consists of several Azure cloud services that allow resources to be scaled up and down according to need. The services and their roles in this solution are described below.

Blob Storage

Blob containers are used to store the pre-trained models, the data, and the output predictions. The models that we upload to blob storage in the 01_create_resources.ipynb notebook are One-class SVM models that are trained on data that represents values of different sensors of different devices. We assume that the data values are aggregated over a fixed interval of time. In real-world scenarios, this could be a stream of sensor readings that need to be filtered and aggregated before being used in training or real-time scoring. For simplicity, we use the same data file when executing scoring jobs.
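
The aggregation step mentioned above can be sketched as follows. This is a minimal, standard-library-only illustration and not the code used in the notebooks: the `(timestamp, value)` reading format, the 60-second window, and the choice of the mean as the aggregate are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def aggregate_readings(readings, interval_seconds=60):
    """Average raw sensor readings over fixed, non-overlapping time windows.

    `readings` is an iterable of (unix_timestamp, value) pairs (a hypothetical
    format). Returns a list of (window_start, mean_value) sorted by time.
    """
    windows = defaultdict(list)
    for timestamp, value in readings:
        # Map each reading to the start of the window that contains it.
        window_start = int(timestamp // interval_seconds) * interval_seconds
        windows[window_start].append(value)
    return [(start, mean(values)) for start, values in sorted(windows.items())]


if __name__ == "__main__":
    raw = [(0, 1.0), (30, 3.0), (60, 10.0), (119, 20.0), (120, 5.0)]
    print(aggregate_readings(raw))
```

Each aggregated row would then be passed to the model for the corresponding sensor.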

Azure Machine Learning

Azure Machine Learning (AML) is a cloud service that allows training, scoring, managing, and deploying machine learning models at scale in the cloud. It can be used to execute training, scoring, or other demanding jobs on remote compute targets, such as a cluster of virtual machines, that can scale according to need. In this solution guide, we use AML to run scoring jobs for many sensors in parallel. We do that by creating an AML pipeline with parallel steps, where each step executes a scoring Python script for each sensor. AML manages queueing and executing the steps on a scalable compute target.
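
The "one step per sensor" idea could look roughly like the sketch below. It assumes the Azure ML Python SDK v1 (`azureml-pipeline-steps`) is installed; the script name `score.py`, the `--device`/`--sensor` flags, and the `scripts` source directory are hypothetical placeholders, not names taken from this repository.

```python
def build_step_arguments(device_id, sensor_id):
    """Command-line arguments passed to the scoring script for one sensor.

    The flag names are hypothetical; use whatever your scoring script parses.
    """
    return ["--device", str(device_id), "--sensor", str(sensor_id)]


def build_parallel_steps(sensors, compute_target, source_directory="scripts"):
    """Create one independent PythonScriptStep per (device, sensor) pair.

    Steps with no data dependencies between them can be run by AML in
    parallel, up to the capacity of the compute target.
    """
    # Imported lazily so the helper above is usable without the SDK installed.
    from azureml.pipeline.steps import PythonScriptStep

    return [
        PythonScriptStep(
            name=f"score_{device_id}_{sensor_id}",
            script_name="score.py",  # hypothetical script name
            arguments=build_step_arguments(device_id, sensor_id),
            compute_target=compute_target,
            source_directory=source_directory,
            allow_reuse=False,  # always re-score; do not reuse cached output
        )
        for device_id, sensor_id in sensors
    ]
```

Because the steps do not depend on each other, the degree of parallelism is bounded by the number of nodes in the compute target rather than by the pipeline definition.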

In addition, we use AML to create a schedule that runs the pipeline repeatedly at a specified time interval.

For more information on these services, check the documentation links provided in the Links section.

Prerequisites

All scripts and commands were tested on an Ubuntu 16.04 LTS system.

Create Environment

Once all prerequisites are installed,

  1. Clone or download this repository:

    git clone https://github.com/Microsoft/AMLBatchScoringPipeline.git
    
  2. Create and activate the conda environment from the yml file:

    conda env create -f environment.yml
    conda activate amlmm    
    
  3. Log in to Azure and select your subscription:

    az login --use-device-code
    az account set -s "<subscription name or ID>"
    
  4. Start Jupyter in the same environment:

    jupyter notebook
    
  5. Open Jupyter Notebook in your browser and make sure your environment's kernel is selected:

    Kernel > Change Kernel > Python [conda env:amlmm]
    

Start creating the required resources in the next section.

Steps

1. Create Azure Resources

The 01_create_resources.ipynb notebook contains all Azure CLI commands needed to create resources in your Azure subscription, as well as configurations of the AML pipeline and the compute target.

In Jupyter Notebook, open AMLBatchScoringPipeline/01_create_resources.ipynb from the cloned or downloaded directory and execute the cells to create the needed Azure resources.

2. Create and Schedule the Scoring Pipeline

The 02_create_pipeline.ipynb notebook contains Python code that creates the AML scoring pipeline and schedules it to run on a predefined interval.
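
As a rough illustration of what publishing and scheduling involve, the sketch below uses the Azure ML Python SDK v1 (`azureml-pipeline-core`). It is not the notebook's code: the pipeline, schedule, and experiment names are hypothetical, `steps` is assumed to be a list of already-defined AML pipeline steps, and the minute-to-recurrence helper is an illustrative convenience.

```python
def recurrence_settings(interval_minutes):
    """Translate a scoring interval in minutes into recurrence settings."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    if interval_minutes % 60 == 0:
        return {"frequency": "Hour", "interval": interval_minutes // 60}
    return {"frequency": "Minute", "interval": interval_minutes}


def publish_and_schedule(workspace, steps, interval_minutes=60):
    """Publish a pipeline built from `steps` and run it on a fixed interval."""
    # Imported lazily so the helper above is usable without the SDK installed.
    from azureml.pipeline.core import Pipeline, Schedule, ScheduleRecurrence

    pipeline = Pipeline(workspace=workspace, steps=steps)
    published = pipeline.publish(
        name="batch-scoring-pipeline",  # hypothetical name
        description="Scores all sensors in parallel",
    )
    recurrence = ScheduleRecurrence(**recurrence_settings(interval_minutes))
    return Schedule.create(
        workspace,
        name="batch-scoring-schedule",  # hypothetical name
        pipeline_id=published.id,
        experiment_name="batch-scoring",  # hypothetical name
        recurrence=recurrence,
    )
```

A workspace object for the first argument can typically be obtained with `Workspace.from_config()` from `azureml.core`, assuming a workspace configuration file is present.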

3. Validate Deployments and Pipeline Execution

After all resources are created, you can check your resource group in the portal and validate that all components have been deployed successfully.

Under Storage Account > Blobs, the preds container should contain the prediction CSV files once the pipeline has run successfully.
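
The same check can be scripted instead of done in the portal. The sketch below assumes the `azure-storage-blob` package (v12) is installed and that you have the storage account's connection string; neither is stated by this repository, and the function names are hypothetical.

```python
def prediction_files(blob_names):
    """Return the CSV blob names from a listing, sorted for stable output."""
    return sorted(name for name in blob_names if name.lower().endswith(".csv"))


def list_prediction_files(connection_string, container_name="preds"):
    """List prediction CSV files in the given blob container."""
    # Requires the azure-storage-blob package (v12); imported lazily.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    container = service.get_container_client(container_name)
    return prediction_files(blob.name for blob in container.list_blobs())
```

An empty result after a scheduled run has completed suggests that the pipeline failed or wrote its output elsewhere.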

Cleanup

If you wish to delete all created resources, run the following CLI command to delete the resource group and all underlying resources.

    az group delete --name <resource_group_name>

Links

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Related projects

Microsoft AI GitHub: find other best-practice projects and Azure AI design patterns in our central repository.
