
crawles / Automl_service

License: MIT
Deploy AutoML as a service using Flask

Projects that are alternatives of or similar to Automl service

Allensdk
code for reading and processing Allen Institute for Brain Science data
Stars: ✭ 200 (-0.99%)
Mutual labels:  jupyter-notebook
Urban Informatics And Visualization
Urban Informatics and Visualization (UC Berkeley CP255)
Stars: ✭ 200 (-0.99%)
Mutual labels:  jupyter-notebook
Zaoqi Python
WeChat official account: Zaoqi Python (早起Python)
Stars: ✭ 202 (+0%)
Mutual labels:  jupyter-notebook
Pyqstrat
A fast, extensible, transparent python library for backtesting quantitative strategies.
Stars: ✭ 200 (-0.99%)
Mutual labels:  jupyter-notebook
Pycon2017 Optimizing Pandas
Materials for PyCon 2017 presentation on optimizing Pandas code
Stars: ✭ 201 (-0.5%)
Mutual labels:  jupyter-notebook
Echomods
Open source ultrasound processing modules and building blocks
Stars: ✭ 200 (-0.99%)
Mutual labels:  jupyter-notebook
Sdc Lane And Vehicle Detection Tracking
OpenCV in Python for lane line and vehicle detection/tracking in autonomous cars
Stars: ✭ 200 (-0.99%)
Mutual labels:  jupyter-notebook
Geostatspy
GeostatsPy: a Python package for spatial data analytics and geostatistics, largely a reimplementation of GSLIB, the Geostatistical Library (Deutsch and Journel, 1992). I hope this resource is helpful. Prof. Michael Pyrcz
Stars: ✭ 200 (-0.99%)
Mutual labels:  jupyter-notebook
Lsstc Dsfp Sessions
Lecture slides, Jupyter notebooks, and other material from the LSSTC Data Science Fellowship Program
Stars: ✭ 201 (-0.5%)
Mutual labels:  jupyter-notebook
Data Science Online
Stars: ✭ 202 (+0%)
Mutual labels:  jupyter-notebook
Keras deep clustering
How to do Unsupervised Clustering with Keras
Stars: ✭ 202 (+0%)
Mutual labels:  jupyter-notebook
Trump Lies
Tutorial: Web scraping in Python with Beautiful Soup
Stars: ✭ 201 (-0.5%)
Mutual labels:  jupyter-notebook
Fastpages
An easy-to-use blogging platform with enhanced support for Jupyter Notebooks.
Stars: ✭ 2,888 (+1329.7%)
Mutual labels:  jupyter-notebook
Rosetta
Tools, wrappers, etc... for data science with a concentration on text processing
Stars: ✭ 200 (-0.99%)
Mutual labels:  jupyter-notebook
Csa Inpainting
Coherent Semantic Attention for image inpainting(ICCV 2019)
Stars: ✭ 202 (+0%)
Mutual labels:  jupyter-notebook
Spark Practice
Apache Spark (PySpark) Practice on Real Data
Stars: ✭ 200 (-0.99%)
Mutual labels:  jupyter-notebook
Cs231n
my assignment solutions for CS231n Convolutional Neural Networks for Visual Recognition
Stars: ✭ 201 (-0.5%)
Mutual labels:  jupyter-notebook
Release
Deep Reinforcement Learning for de-novo Drug Design
Stars: ✭ 201 (-0.5%)
Mutual labels:  jupyter-notebook
Face toolbox keras
A collection of deep learning frameworks ported to Keras for face analysis.
Stars: ✭ 202 (+0%)
Mutual labels:  jupyter-notebook
Joyful Pandas
A pandas tutorial in Chinese (pandas中文教程)
Stars: ✭ 2,788 (+1280.2%)
Mutual labels:  jupyter-notebook

AutoML Service

Deploy automated machine learning (AutoML) as a service using Flask, for both pipeline training and pipeline serving.

The framework implements a fully automated time series classification pipeline, automating feature engineering, model selection, and model optimization using the Python libraries TPOT and tsfresh.

Check out the blog post for more info.

Resources:

  • TPOT – automated feature preprocessing and model optimization tool
  • tsfresh – automated time series feature engineering and selection
  • Flask – a web development microframework for Python

Architecture

The application exposes both model training and model prediction through a RESTful API. For model training, input data and labels are sent via a POST request, a pipeline is trained, and model predictions are then accessible via a prediction route.

Each trained pipeline is stored under a unique key, so live predictions can be made on the same data using different feature-construction and modeling pipelines.

An automated pipeline for time-series classification.

The model training logic is exposed as a REST endpoint. Raw, labeled training data is uploaded via a POST request and an optimal model is developed.

Raw, unlabeled data is uploaded via a POST request and model predictions are returned.
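The two routes could be wired up along these lines. This is a minimal sketch: the route names match the README's usage examples, but the training and prediction internals are placeholders, not the project's actual implementation.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)
pipelines = {}  # trained pipelines, keyed by model id

@app.route('/train_pipeline', methods=['POST'])
def train_pipeline():
    raw_data = request.files['raw_data'].read()
    labels = request.files['labels'].read()
    # the real app runs feature engineering (tsfresh) and model
    # search (TPOT) on raw_data/labels, then stores the pipeline
    model_id = len(pipelines) + 1
    pipelines[model_id] = {'modelId': model_id}  # placeholder pipeline
    return jsonify({'modelId': model_id})

@app.route('/serve_prediction', methods=['POST'])
def serve_prediction():
    raw_data = request.files['raw_data'].read()
    # the real app looks up the requested pipeline by its key
    # and returns predictions for raw_data
    return jsonify({'predictions': []})

# app.run(host='0.0.0.0', port=8080)
```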

Using the app

View the Jupyter Notebook for an example.

Deploying

# deploy locally
python automl_service.py
# deploy on cloud foundry
cf push

Usage

Train a pipeline:

import json
import requests

train_url = 'http://0.0.0.0:8080/train_pipeline'
train_files = {'raw_data': open('data/data_train.json', 'rb'),
               'labels':   open('data/label_train.json', 'rb'),
               'params':   open('parameters/train_parameters_model2.yml', 'rb')}

# POST request to train the pipeline
r_train = requests.post(train_url, files=train_files)
result_df = json.loads(r_train.json())

returns:

{'featureEngParams': {'default_fc_parameters': "['median', 'minimum', 'standard_deviation', 
                                                 'sum_values', 'variance', 'maximum', 
                                                 'length', 'mean']",
                      'impute_function': 'impute',
                      ...},
 'mean_cv_accuracy': 0.865,
 'mean_cv_roc_auc': 0.932,
 'modelId': 1,
 'modelType': "Pipeline(steps=[('stackingestimator', StackingEstimator(estimator=LinearSVC(...))),
                               ('logisticregression', LogisticRegressionClassifier(solver='liblinear',...))])",
 'trainShape': [1647, 8],
 'trainTime': 1.953}

Serve pipeline predictions:

import pandas as pd
import requests

serve_url = 'http://0.0.0.0:8080/serve_prediction'
test_files = {'raw_data': open('data/data_test.json', 'rb'),
              'params':   open('parameters/test_parameters_model2.yml', 'rb')}

# POST request to serve predictions from the trained pipeline
r_test = requests.post(serve_url, files=test_files)
result = pd.read_json(r_test.json()).set_index('id')
example_id  prediction
1           0.853
2           0.991
3           0.060
4           0.995
5           0.003
...         ...

View all trained models:

import json
import requests

r = requests.get('http://0.0.0.0:8080/models')
pipelines = json.loads(r.json())
{'1':
    {'mean_cv_accuracy': 0.873,
     'modelType': "RandomForestClassifier(...)",
     ...},
 '2':
    {'mean_cv_accuracy': 0.895,
     'modelType': "GradientBoostingClassifier(...)",
     ...},
 '3':
    {'mean_cv_accuracy': 0.859,
     'modelType': "LogisticRegressionClassifier(...)",
     ...},
...}

Running the tests

Supply the service host as a command-line argument.

# use local app
py.test --host http://0.0.0.0:8080
# use cloud-deployed app
py.test --host http://ROUTE-HERE
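The --host option above is not built into pytest, so the test suite has to register it. Here is a minimal conftest.py sketch of how that could be done; the fixture name 'host' and the default URL are assumptions, not taken from the project.

```python
# conftest.py
import pytest

def pytest_addoption(parser):
    # register the --host command-line option used by the test suite
    parser.addoption('--host', action='store',
                     default='http://0.0.0.0:8080',
                     help='base URL of the running AutoML service')

@pytest.fixture
def host(request):
    # tests request this fixture to get the chosen host URL
    return request.config.getoption('--host')
```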

Scaling the architecture

For production, I would suggest splitting training and serving into separate applications and incorporating a facade API. It would also be best to use a shared cache, such as Redis or Pivotal Cloud Cache, so that other applications and multiple instances of the pipeline can access the trained model. Here is a potential architecture.

A scalable model training and model serving architecture.
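Sharing trained pipelines through a cache could look roughly like this. It is a sketch only: the PipelineCache class and the 'pipeline:<id>' key scheme are illustrative, and any object with get/set methods (e.g. a redis.Redis client) can be plugged in.

```python
import pickle

class PipelineCache:
    """Store and fetch pickled pipelines in a shared key-value cache."""

    def __init__(self, client):
        # client: any object with get/set, e.g. redis.Redis()
        self.client = client

    def save(self, model_id, pipeline):
        # serialize the fitted pipeline under a per-model key
        self.client.set(f'pipeline:{model_id}', pickle.dumps(pipeline))

    def load(self, model_id):
        blob = self.client.get(f'pipeline:{model_id}')
        return pickle.loads(blob) if blob is not None else None

# for local experimentation, a dict can stand in for the Redis client
class DictClient(dict):
    def set(self, key, value):
        self[key] = value
    def get(self, key):
        return dict.get(self, key)

cache = PipelineCache(DictClient())
cache.save(1, {'model': 'demo'})
print(cache.load(1))  # {'model': 'demo'}
```

With a real Redis backend, every app instance pointed at the same server sees the same pipelines, so serving can scale horizontally while training writes to one place.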

Author

Chris Rawles
