
iamDecode / sklearn-pmml-model

License: BSD-2-Clause
A library to parse and convert PMML models into Scikit-learn estimators.

Programming languages: Cython, Python, Jupyter Notebook

Projects that are alternatives to or similar to sklearn-pmml-model

Hyperparameter hunter
Easy hyperparameter optimization and automatic result saving across machine learning algorithms and libraries
Stars: ✭ 648 (+812.68%)
Mutual labels:  scikit-learn, sklearn
Sklearn Porter
Transpile trained scikit-learn estimators to C, Java, JavaScript and others.
Stars: ✭ 1,014 (+1328.17%)
Mutual labels:  scikit-learn, sklearn
Machinelearningstocks
Using python and scikit-learn to make stock predictions
Stars: ✭ 897 (+1163.38%)
Mutual labels:  scikit-learn, sklearn
Sklearn Evaluation
Machine learning model evaluation made easy: plots, tables, HTML reports, experiment tracking and Jupyter notebook analysis.
Stars: ✭ 294 (+314.08%)
Mutual labels:  scikit-learn, sklearn
Python Flask Sklearn Docker Template
A simple example of python api for real time machine learning, using scikit-learn, Flask and Docker
Stars: ✭ 117 (+64.79%)
Mutual labels:  scikit-learn, sklearn
Profanity Check
A fast, robust Python library to check for offensive language in strings.
Stars: ✭ 354 (+398.59%)
Mutual labels:  scikit-learn, sklearn
Traingenerator
🧙 A web app to generate template code for machine learning
Stars: ✭ 948 (+1235.21%)
Mutual labels:  scikit-learn, sklearn
ai-deployment
Focused on bringing AI models into production and on model deployment
Stars: ✭ 149 (+109.86%)
Mutual labels:  scikit-learn, pmml
Facial Expression Recognition Svm
Training SVM classifier to recognize people expressions (emotions) on Fer2013 dataset
Stars: ✭ 110 (+54.93%)
Mutual labels:  scikit-learn, sklearn
Ml code
A repository for recording the machine learning code
Stars: ✭ 75 (+5.63%)
Mutual labels:  scikit-learn, sklearn
python3-docker-devenv
Docker Start Guide with Python Development Environment
Stars: ✭ 13 (-81.69%)
Mutual labels:  scikit-learn, sklearn
Igel
a delightful machine learning tool that allows you to train, test, and use models without writing code
Stars: ✭ 2,956 (+4063.38%)
Mutual labels:  scikit-learn, sklearn
skippa
SciKIt-learn Pipeline in PAndas
Stars: ✭ 33 (-53.52%)
Mutual labels:  scikit-learn, sklearn
Hungabunga
HungaBunga: Brute-Force all sklearn models with all parameters using .fit .predict!
Stars: ✭ 614 (+764.79%)
Mutual labels:  scikit-learn, sklearn
Kaio-machine-learning-human-face-detection
Machine Learning project a case study focused on the interaction with digital characters, using a character called "Kaio", which, based on the automatic detection of facial expressions and classification of emotions, interacts with humans by classifying emotions and imitating expressions
Stars: ✭ 18 (-74.65%)
Mutual labels:  scikit-learn, sklearn
Ailearning
AiLearning: Machine Learning (ML), Deep Learning (DL), Natural Language Processing (NLP)
Stars: ✭ 32,316 (+45415.49%)
Mutual labels:  scikit-learn, sklearn
sklearn-audio-classification
An in-depth analysis of audio classification on the RAVDESS dataset. Feature engineering, hyperparameter optimization, model evaluation, and cross-validation with a variety of ML techniques and MLP
Stars: ✭ 31 (-56.34%)
Mutual labels:  scikit-learn, sklearn
scikit-learn
In Persian, for contributing to scikit-learn
Stars: ✭ 19 (-73.24%)
Mutual labels:  scikit-learn, sklearn
Mlatimperial2017
Materials for the course of machine learning at Imperial College organized by Yandex SDA
Stars: ✭ 71 (+0%)
Mutual labels:  scikit-learn, sklearn
Qlik Py Tools
Data Science algorithms for Qlik implemented as a Python Server Side Extension (SSE).
Stars: ✭ 135 (+90.14%)
Mutual labels:  scikit-learn, sklearn

sklearn-pmml-model


A library to effortlessly import models trained on other platforms and in other programming languages into scikit-learn in Python. First, export your model to PMML (a widely supported format). Next, load the exported PMML file with this library and use the resulting class like any other scikit-learn estimator.

Installation

The easiest way is to use pip:

$ pip install sklearn-pmml-model

Status

The library currently supports the following models:

Model                   | Classification | Regression | Categorical features
Decision Trees          |                |            | [1]
Random Forests          |                |            | [1]
Gradient Boosting       |                |            | [1]
Linear Regression       |                |            | [3]
Ridge                   | [2]            |            | [3]
Lasso                   | [2]            |            | [3]
ElasticNet              | [2]            |            | [3]
Gaussian Naive Bayes    |                |            | [3]
Support Vector Machines |                |            | [3]
Nearest Neighbors       |                |            |
Neural Networks         |                |            |

[1] Categorical feature support using slightly modified internals, based on scikit-learn#12866.

[2] These models differ only in training characteristics; the resulting model is of the same form. Classification is supported using PMMLLogisticRegression for regression models and PMMLRidgeClassifier for general regression models.

[3] By one-hot encoding categorical features automatically.
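
To make the one-hot encoding mentioned in the last footnote concrete, here is a minimal sketch of the general technique in plain Python. It is not the library's internal implementation, which may differ; the function name `one_hot_encode`, the `color` column, and the `column=category` naming scheme are all hypothetical and chosen only for illustration.

```python
def one_hot_encode(rows, column):
    """Replace one categorical column with a 0/1 indicator column per category."""
    # Sort the categories so the output column order is deterministic.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Keep every other column unchanged.
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one indicator per category: 1 where the row matches, else 0.
        for category in categories:
            new_row[f"{column}={category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical data with one numeric and one categorical feature.
rows = [
    {"sepal_length": 5.1, "color": "red"},
    {"sepal_length": 6.2, "color": "blue"},
]

if __name__ == "__main__":
    for encoded_row in one_hot_encode(rows, "color"):
        print(encoded_row)
```

Each category becomes its own numeric column, which is what lets models that only accept numeric inputs consume categorical features. With pandas, `pd.get_dummies` performs a similar transformation on a DataFrame.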

Example

A minimal working example (using this PMML file) is shown below:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np
from sklearn_pmml_model.ensemble import PMMLForestClassifier

# Prepare data
iris = load_iris()
X = pd.DataFrame(iris.data)
X.columns = np.array(iris.feature_names)
y = pd.Series(np.array(iris.target_names)[iris.target])
y.name = "Class"
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.33, random_state=123)

# Load the PMML file as a scikit-learn estimator, then predict and score
clf = PMMLForestClassifier(pmml="models/randomForest.pmml")
clf.predict(Xte)
clf.score(Xte, yte)

More examples can be found in the following subpackages: tree, ensemble, linear_model, naive_bayes, svm, neighbors and neural_network.

Benchmark

Depending on the data set and model, sklearn-pmml-model is between 5 and 1,000 times faster than competing libraries, because it leverages the optimization and industry-tested robustness of sklearn. The source code for this benchmark can be found in the corresponding Jupyter notebook.

Running times (load + predict, in seconds)

Dataset       | Library            | Linear model | Naive Bayes | Decision tree | Random Forest | Gradient boosting
Wine          | PyPMML             | 0.773291     | 0.77384     | 0.777425      | 0.895204      | 0.902355
Wine          | sklearn-pmml-model | 0.005813     | 0.006357    | 0.002693      | 0.108882      | 0.121823
Breast cancer | PyPMML             | 3.849855     | 3.878448    | 3.83623       | 4.16358       | 4.13766
Breast cancer | sklearn-pmml-model | 0.015723     | 0.011278    | 0.002807      | 0.146234      | 0.044016
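
A "load + predict" time like the ones above can be measured with a small harness such as the sketch below. This is only an illustration under stated assumptions, not the benchmark's actual code, which lives in the project's notebook. The helper `time_load_and_predict`, the best-of-`repeats` strategy, and the stand-in `ConstantModel` are all hypothetical.

```python
import time


def time_load_and_predict(load_model, X, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` runs of load + predict."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        model = load_model()  # construct (load) the model from scratch on every run
        model.predict(X)      # then run prediction on the same input
        best = min(best, time.perf_counter() - start)
    return best


class ConstantModel:
    """Hypothetical stand-in so the sketch runs without a PMML file."""

    def predict(self, X):
        return [0 for _ in X]


if __name__ == "__main__":
    elapsed = time_load_and_predict(ConstantModel, [[5.1, 3.5, 1.4, 0.2]])
    print(f"load + predict: {elapsed:.6f} s")
```

To time a real model, `load_model` could be a callable such as `lambda: PMMLForestClassifier(pmml="models/randomForest.pmml")`, with `X` set to the test data from the example above.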

Improvement

Dataset       | Linear model | Naive Bayes | Decision tree | Random Forest | Gradient boosting
Wine          | 133×         | 122×        | 289×          | 8×            | 7×
Breast cancer | 245×         | 344×        | 1,367×        | 28×           | 94×
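
The improvement factors appear to be the PyPMML running time divided by the sklearn-pmml-model running time, rounded to the nearest integer. That is an inference from the published numbers rather than something the README states. The sketch below recomputes the factors from the running times listed earlier; the names `TIMES` and `improvement_factors` are hypothetical.

```python
MODELS = ["Linear model", "Naive Bayes", "Decision tree", "Random Forest", "Gradient boosting"]

# Running times (load + predict, in seconds), copied from the running-times table.
TIMES = {
    "Wine": {
        "PyPMML": [0.773291, 0.77384, 0.777425, 0.895204, 0.902355],
        "sklearn-pmml-model": [0.005813, 0.006357, 0.002693, 0.108882, 0.121823],
    },
    "Breast cancer": {
        "PyPMML": [3.849855, 3.878448, 3.83623, 4.16358, 4.13766],
        "sklearn-pmml-model": [0.015723, 0.011278, 0.002807, 0.146234, 0.044016],
    },
}


def improvement_factors(times):
    """Return {dataset: [rounded speed-up per model]}, i.e. PyPMML time / sklearn-pmml-model time."""
    return {
        dataset: [
            round(slow / fast)
            for slow, fast in zip(t["PyPMML"], t["sklearn-pmml-model"])
        ]
        for dataset, t in times.items()
    }


if __name__ == "__main__":
    for dataset, factors in improvement_factors(TIMES).items():
        row = ", ".join(f"{model}: {factor:,}x" for model, factor in zip(MODELS, factors))
        print(f"{dataset}: {row}")
```

The Wine results for random forests and gradient boosting are the smallest speed-ups because the sklearn-pmml-model times for those ensembles are much closer to the PyPMML times than for the single-model cases.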

Development

Prerequisites

Tests can be run using Py.test. Grab a local copy of the source:

$ git clone http://github.com/iamDecode/sklearn-pmml-model
$ cd sklearn-pmml-model

create a virtual environment and activate it:

$ python3 -m venv venv
$ source venv/bin/activate

and install the dependencies:

$ pip install -r requirements.txt

The final step is to build the Cython extensions:

$ python setup.py build_ext --inplace

Testing

You can execute tests with py.test by running:

$ python setup.py pytest

Contributing

Feel free to make a contribution. Please read CONTRIBUTING.md for more details.

License

This project is licensed under the BSD 2-Clause License - see the LICENSE file for details.
