Annielytix / Advanced-Databricks-for-ML-Build-2019

Licence: other
Using Azure Databricks (Spark) for ML, this is the //build 2019 repository with homework examples, code and notebooks

build2019-Advanced Azure Databricks for ML

Using Azure Databricks (Spark) for ML. This is the repository presented at //build 2019, with additional homework examples, code and notebooks.

Welcome

Welcome to the //build 2019 Advanced Databricks Challenge. We will focus on hands-on activities that develop proficiency in advanced Databricks concepts such as data exploration using Spark, building supervised and unsupervised learning models, evaluating models, and using advanced libraries like MMLSpark. These challenges assume introductory to intermediate knowledge of Azure Databricks. If that is not the case, please spend time working through the Introduction to Databricks challenges first.

Goals

Most challenges observed by customers in these realms are in stitching multiple services together. As such, where possible, we have tried to place key concepts in the context of a broader example.

At the end of this workshop, you should be able to:

  • Use Azure Databricks to build ML models, including:

    • Supervised Learning (classification)
    • Unsupervised Learning (clustering / recommendation)
  • Evaluate those models using Azure Databricks

  • Understand libraries: an introduction to MMLSpark and when to use it

  • Follow an introduction to Deep Learning
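
To make the "evaluate those models" goal concrete, here is a minimal, library-free sketch of the metrics typically reported for a binary classifier. It is an illustration only: the workshop notebooks run on Spark, and the function names (`confusion_counts`, `classification_report`) and the sample labels below are hypothetical, not taken from the challenges.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def classification_report(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 as a dict of floats."""
    c = confusion_counts(y_true, y_pred, positive)
    tp, fp, fn, tn = c["tp"], c["fp"], c["fn"], c["tn"]
    total = tp + fp + fn + tn
    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels and predictions, for illustration only.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 0, 0, 1, 0, 1, 1, 0]
report = classification_report(labels, predictions)
print(report)
```

On a real cluster you would normally compute these with Spark's built-in evaluators instead of hand-rolled loops; the sketch only shows what those numbers mean.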

Background Knowledge

This workshop is meant for a Data Scientist on Azure who actively scripts using a common data science language like Python. Since this is only a short workshop, there are certain things you will need to read or set up before you arrive.

Firstly, you should have some previous exposure to Python. We will be using it for everything we are building in the workshop, so you should be familiar with how to use it to create ML models. Additionally, this is not a class where we teach you about how to choose the correct algorithm for the business scenario. We assume you have some familiarity with these concepts ahead of time.

Secondly, you should have some experience with Azure Databricks and its core concepts, including workspaces, libraries and so on. If not, please check out the Intro to Azure Databricks workshop first.

Thirdly, you should have experience with the portal and be able to create resources (and spend money) on Azure. We will not be providing Azure passes for this workshop.

For fun, I have included an EU soccer example (.DBC) as well as a Retail Fashion example and, by popular demand, a Pandas UDF Benchmark notebook to help you get started with User Defined Functions in Pandas. Please let me know if you have any questions.

Challenges

Business Case I - Azure Databricks

  1. Start by following the steps in the [README] to provision your Azure environment and fork both the [labs] below and the notebooks used in the challenges.
  2. Challenge 0 - Administration. Please note: you do not need to run through Admin if you are an attendee of //build (see the note below for when to use this Databricks Archive).
  3. Challenge 1 - Exploring Data with Spark.
  4. Challenge 2 - Building Supervised Learning Models.
  5. Challenge 3 - Evaluating Supervised Learning Models.
  6. Challenge 4 - Recommenders and Clustering.
  7. Challenge 5 - Using the MMLSpark Library.

Note: The Challenge 0 - Administration archive is there to help you run this workshop in your own offices after the event.
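
Challenge 4 covers clustering. As a rough, library-free illustration of the idea behind k-means (assign each point to its nearest centroid, then move each centroid to the mean of its members), the sketch below uses only the Python standard library. The `kmeans` function, its deterministic "first k points" initialisation, and the sample points are assumptions made for this example; the challenge notebooks use Spark rather than this code.

```python
import math


def kmeans(points, k, iterations=10):
    """Tiny k-means: returns (centroids, assignments).

    Initialisation is deterministic (the first k points) so the result is
    reproducible; real implementations usually use smarter seeding.
    """
    centroids = [points[i] for i in range(k)]
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for each point.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return centroids, assignments


# Two well-separated groups of hypothetical 2-D points.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0),
          (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, assignments = kmeans(points, k=2)
print(centroids, assignments)
```

The sketch requires Python 3.8 or later for `math.dist`. It is intended to show the two alternating steps, not to replace a distributed implementation on large data.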

Discussion Forum

  • Swag will be given to the most active participants
  • Q&A and Feedback