
AlexIoannides / ml-workflow-automation

Licence: other
Python Machine Learning (ML) project that demonstrates the archetypal ML workflow within a Jupyter notebook, with automated model deployment as a RESTful service on Kubernetes.

Programming Languages

Jupyter Notebook

Projects that are alternatives to, or similar to, ml-workflow-automation

Machine Learning Projects
This repository consists of all my Machine Learning Projects.
Stars: ✭ 135 (+206.82%)
Mutual labels:  numpy, sklearn, pandas, classification
Data Analysis
Mainly a summary of web scraping and data analysis projects, plus modelling, machine learning, and model evaluation.
Stars: ✭ 142 (+222.73%)
Mutual labels:  numpy, sklearn, pandas, kaggle
Data-Analyst-Nanodegree
Kai Sheng Teh - Udacity Data Analyst Nanodegree
Stars: ✭ 42 (-4.55%)
Mutual labels:  numpy, sklearn, pandas
support-tickets-classification
This case study shows how to create a model for text analysis and classification and deploy it as a web service in the Azure cloud, in order to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en
Stars: ✭ 142 (+222.73%)
Mutual labels:  numpy, pandas, classification
Orange3
🍊 📊 💡 Orange: Interactive data analysis
Stars: ✭ 3,152 (+7063.64%)
Mutual labels:  numpy, pandas, classification
Dimensionality-reduction-and-classification-on-Hyperspectral-Images-Using-Python
In this repository, you can find files that implement dimensionality reduction and classification on a hyperspectral image (Indian Pines).
Stars: ✭ 63 (+43.18%)
Mutual labels:  numpy, pandas, classification
sklearn-predict
Machine learning on data: predicting trends and plotting the results.
Stars: ✭ 16 (-63.64%)
Mutual labels:  numpy, sklearn, pandas
Machinelearningcourse
A collection of notebooks from my Machine Learning class, written in Python 3.
Stars: ✭ 35 (-20.45%)
Mutual labels:  numpy, pandas, kaggle
Data-Scientist-In-Python
This repository contains notes and projects from the Data Scientist track of the Dataquest coursework.
Stars: ✭ 23 (-47.73%)
Mutual labels:  numpy, pandas, kaggle
Ml Cheatsheet
A constantly updated Python machine learning cheatsheet
Stars: ✭ 136 (+209.09%)
Mutual labels:  numpy, sklearn, pandas
Machine Learning
A machine learning journey, starting from zero.
Stars: ✭ 209 (+375%)
Mutual labels:  sklearn, pandas, kaggle
Tensorflow Ml Nlp
Natural language processing with TensorFlow and machine learning (from logistic regression to Transformer chatbots).
Stars: ✭ 176 (+300%)
Mutual labels:  numpy, sklearn, pandas
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+50009.09%)
Mutual labels:  numpy, pandas, kaggle
Lambda Packs
Precompiled packages for AWS Lambda
Stars: ✭ 997 (+2165.91%)
Mutual labels:  numpy, sklearn, pandas
Machine Learning With Python
Practice and tutorial-style notebooks covering a wide variety of machine learning techniques
Stars: ✭ 2,197 (+4893.18%)
Mutual labels:  numpy, pandas, classification
Data Science Notebook
📖 Every great thought and deed has a humble beginning.
Stars: ✭ 196 (+345.45%)
Mutual labels:  numpy, sklearn, pandas
fer
Facial Expression Recognition
Stars: ✭ 32 (-27.27%)
Mutual labels:  pandas, kaggle
skutil
NOTE: skutil is now deprecated. See its sister project: https://github.com/tgsmith61591/skoot. Original description: A set of scikit-learn and h2o extension classes (as well as caret classes for python). See more here: https://tgsmith61591.github.io/skutil
Stars: ✭ 29 (-34.09%)
Mutual labels:  sklearn, pandas
Udacity-Data-Analyst-Nanodegree
Repository for the projects needed to complete the Data Analyst Nanodegree.
Stars: ✭ 31 (-29.55%)
Mutual labels:  numpy, pandas
Data-Wrangling-with-Python
Simplify your ETL processes with these hands-on data sanitation tips, tricks, and best practices
Stars: ✭ 90 (+104.55%)
Mutual labels:  numpy, pandas

Automating the Archetypal Machine Learning Workflow and Model Deployment

This repository contains a Python-based Machine Learning (ML) project whose primary aim is to demonstrate the archetypal ML workflow within a Jupyter notebook, together with some proof-of-concept ideas on automating key steps, using the Titanic binary classification dataset hosted on Kaggle. The ML workflow includes: data exploration and visualisation, feature engineering, and model training and selection. The notebook - titanic-ml.ipynb - also yields a persisted prediction pipeline (pickled to the models directory) that is used downstream in the model deployment process. Note that the data has already been downloaded from Kaggle, in CSV format, to the data directory in this project's root directory.
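To make the hand-off to the deployment step concrete, the persisted pipeline can be loaded and used for prediction with a few lines of Python. The following is a minimal sketch only - the file name models/model.pkl and the feature columns are hypothetical, so refer to titanic-ml.ipynb for the actual artefact name and input schema,

import pickle
import pandas as pd

# load the persisted prediction pipeline (hypothetical file name)
with open('models/model.pkl', 'rb') as f:
    pipeline = pickle.load(f)

# score some new passenger data - these column names are illustrative only
new_data = pd.DataFrame([{'Pclass': 3, 'Sex': 'male', 'Age': 22.0}])
print(pipeline.predict(new_data))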

The secondary aim of this project is to demonstrate how the model generated as a 'build artefact' of the modelling notebook can be automatically deployed as a managed RESTful prediction service on Kubernetes, without having to write any custom code. The full details are contained in the deploy/deploy-model.ipynb notebook, where we lean very heavily on the approaches discussed here.
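Once deployed, the prediction service is just an HTTP endpoint that can be called from any client. As a hedged sketch of what a request might look like, using the Python requests library - the host, route, and JSON payload below are assumptions for illustration, not the service's actual contract (see deploy/deploy-model.ipynb for that),

import requests

# hypothetical endpoint and payload
response = requests.post(
    'http://localhost:5000/predict',
    json={'Pclass': 3, 'Sex': 'male', 'Age': 22.0})
print(response.json())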

Managing Project Dependencies using Pipenv

We use pipenv for managing project dependencies and Python environments (i.e. virtual environments). All of the direct package dependencies required to run the code (e.g. NumPy for arrays/tensors and Pandas for DataFrames), as well as all the packages used during development (e.g. flake8 for code linting and IPython for interactive console sessions), are described in the Pipfile. Their precise downstream dependencies are described in Pipfile.lock.

Installing Pipenv

To get started with Pipenv, first of all download it. Assuming that there is a global version of Python available on your system and on the PATH, this can be achieved by running the following command,

pip3 install pipenv

Pipenv is also available to install from many non-Python package managers. For example, on OS X it can be installed using the Homebrew package manager, with the following terminal command,

brew install pipenv

For more information, including advanced configuration options, see the official pipenv documentation.

Installing this Project's Dependencies

Make sure that you're in the project's root directory (the same one in which the Pipfile resides), and then run,

pipenv install --dev

This will install all of the direct project dependencies as well as the development dependencies (the latter a consequence of the --dev flag).
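Should you need to add a new package at a later date, installing it with Pipenv will also record it in the Pipfile - for example, using requests as a purely hypothetical addition,

pipenv install requests

The full tree of resolved dependencies can then be inspected at any time with,

pipenv graph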

Running Python, IPython and JupyterLab from the Project's Virtual Environment

In order to continue development in a Python environment that precisely mimics the one the project was initially developed with, use Pipenv from the command line as follows,

pipenv run python3

The python3 command could just as well be ipython3, or a JupyterLab session, for example,

pipenv run jupyter lab

This will fire up JupyterLab, where the default Python 3 kernel includes all of the direct and development project dependencies. This is how we advise that the notebooks within this project are used.

Automatic Loading of Environment Variables

Pipenv will automatically pick up and load any environment variables declared in the .env file, located in the package's root directory. For example, adding,

SPARK_HOME=applications/spark-2.3.1/bin

will enable access to this variable within any Python program, via a call to os.environ['SPARK_HOME']. Note that if any security credentials are placed here, then this file must be removed from source control - i.e. add .env to the .gitignore file to prevent potential security risks.
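As a minimal illustration of the above - assuming the .env entry shown and a session started with pipenv run python3 - the variable can be read as follows,

import os

# SPARK_HOME is loaded from .env by Pipenv before the interpreter starts
print(os.environ['SPARK_HOME'])  # -> applications/spark-2.3.1/bin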

Pipenv Shells

Prepending pipenv run to every command you want to run within the context of your Pipenv-managed virtual environment can get very tedious. This can be avoided by entering into a Pipenv-managed shell,

pipenv shell

which is equivalent to 'activating' the virtual environment. Any command will now be executed within the virtual environment. Use exit to leave the shell session.
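For example, one possible session that lints the project using the flake8 development dependency mentioned above,

pipenv shell
flake8 .
exit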
