nok / Sklearn Porter

License: MIT

Transpile trained scikit-learn estimators to C, Java, JavaScript and others.

Programming Languages

python

139335 projects - #7 most used programming language

Labels

machine-learning data-science scikit-learn sklearn

Projects that are alternatives of or similar to Sklearn Porter

Qlik Py Tools

Data Science algorithms for Qlik implemented as a Python Server Side Extension (SSE).

Stars: ✭ 135 (-86.69%)

Mutual labels: data-science, scikit-learn, sklearn

Hyperparameter hunter

Easy hyperparameter optimization and automatic result saving across machine learning algorithms and libraries

Stars: ✭ 648 (-36.09%)

Mutual labels: data-science, scikit-learn, sklearn

Igel

a delightful machine learning tool that allows you to train, test, and use models without writing code

Stars: ✭ 2,956 (+191.52%)

Mutual labels: data-science, scikit-learn, sklearn

Machinelearningstocks

Using python and scikit-learn to make stock predictions

Stars: ✭ 897 (-11.54%)

Mutual labels: data-science, scikit-learn, sklearn

Sklearn Evaluation

Machine learning model evaluation made easy: plots, tables, HTML reports, experiment tracking and Jupyter notebook analysis.

Stars: ✭ 294 (-71.01%)

Mutual labels: data-science, scikit-learn, sklearn

Python Machine Learning Book 2nd Edition

The "Python Machine Learning (2nd edition)" book code repository and info resource

Stars: ✭ 6,422 (+533.33%)

Mutual labels: data-science, scikit-learn

Datastream.io

An open-source framework for real-time anomaly detection using Python, ElasticSearch and Kibana

Stars: ✭ 814 (-19.72%)

Mutual labels: data-science, sklearn

Machine Learning With Python

Small scale machine learning projects to understand the core concepts . Give a Star 🌟If it helps you. BONUS: Interview Bank coming up..!

Stars: ✭ 821 (-19.03%)

Mutual labels: data-science, scikit-learn

Model Describer

model-describer : Making machine learning interpretable to humans

Stars: ✭ 22 (-97.83%)

Mutual labels: data-science, scikit-learn

Baikal

A graph-based functional API for building complex scikit-learn pipelines.

Stars: ✭ 573 (-43.49%)

Mutual labels: data-science, scikit-learn

Foxcross

AsyncIO serving for data science models

Stars: ✭ 18 (-98.22%)

Mutual labels: data-science, scikit-learn

Crime Analysis

Association Rule Mining from Spatial Data for Crime Analysis

Stars: ✭ 20 (-98.03%)

Mutual labels: data-science, scikit-learn

Featuretools

An open source python library for automated feature engineering

Stars: ✭ 5,891 (+480.97%)

Mutual labels: data-science, scikit-learn

Awesome Python Data Science

Probably the best curated list of data science software in Python.

Stars: ✭ 812 (-19.92%)

Mutual labels: data-science, scikit-learn

Hungabunga

HungaBunga: Brute-Force all sklearn models with all parameters using .fit .predict!

Stars: ✭ 614 (-39.45%)

Mutual labels: scikit-learn, sklearn

Python for ml

brief introduction to Python for machine learning

Stars: ✭ 29 (-97.14%)

Mutual labels: data-science, scikit-learn

Traingenerator

🧙 A web app to generate template code for machine learning

Stars: ✭ 948 (-6.51%)

Mutual labels: scikit-learn, sklearn

Mljar Supervised

Automated Machine Learning Pipeline with Feature Engineering and Hyper-Parameters Tuning 🚀

Stars: ✭ 961 (-5.23%)

Mutual labels: data-science, scikit-learn

Mlcourse.ai

Open Machine Learning Course

Stars: ✭ 7,963 (+685.31%)

Mutual labels: data-science, scikit-learn

Data Science Portfolio

Portfolio of data science projects completed by me for academic, self learning, and hobby purposes.

Stars: ✭ 559 (-44.87%)

Mutual labels: data-science, scikit-learn

sklearn-porter

Transpile trained scikit-learn estimators to C, Java, JavaScript and others.
It's recommended for limited embedded systems and critical applications where performance matters most.

Important

We're working hard on the first major release of sklearn-porter.
Until then, we will only release bugfixes for the stable version.

Estimators

Estimator	Programming language
Classifier	Java *	JS	C	Go	PHP	Ruby
svm.SVC	✓, ✓ ᴵ	✓	✓		✓	✓
svm.NuSVC	✓, ✓ ᴵ	✓	✓		✓	✓
svm.LinearSVC	✓, ✓ ᴵ	✓	✓	✓	✓	✓
tree.DecisionTreeClassifier	✓, ✓ ᴱ, ✓ ᴵ	✓, ✓ ᴱ	✓, ✓ ᴱ	✓, ✓ ᴱ	✓, ✓ ᴱ	✓, ✓ ᴱ
ensemble.RandomForestClassifier	✓ ᴱ, ✓ ᴵ	✓ ᴱ	✓ ᴱ	✓ ᴱ	✓ ᴱ	✓ ᴱ
ensemble.ExtraTreesClassifier	✓ ᴱ, ✓ ᴵ	✓ ᴱ	✓ ᴱ		✓ ᴱ	✓ ᴱ
ensemble.AdaBoostClassifier	✓ ᴱ, ✓ ᴵ	✓ ᴱ, ✓ ᴵ	✓ ᴱ
neighbors.KNeighborsClassifier	✓, ✓ ᴵ	✓, ✓ ᴵ
naive_bayes.GaussianNB	✓, ✓ ᴵ	✓
naive_bayes.BernoulliNB	✓, ✓ ᴵ	✓
neural_network.MLPClassifier	✓, ✓ ᴵ	✓, ✓ ᴵ
Regressor	Java *	JS	C	Go	PHP	Ruby
neural_network.MLPRegressor		✓

✓ = is full-featured, ᴱ = with embedded model data, ᴵ = with imported model data, * = default language

Installation

Stable

$ pip install sklearn-porter

Development

If you want the latest changes, you can install this package from the master branch:

$ pip uninstall -y sklearn-porter
$ pip install --no-cache-dir https://github.com/nok/sklearn-porter/zipball/master

Usage

Export

The following example demonstrates how you can transpile a decision tree estimator to Java:

from sklearn.datasets import load_iris
from sklearn import tree
from sklearn_porter import Porter

# Load data and train the classifier:
samples = load_iris()
X, y = samples.data, samples.target
clf = tree.DecisionTreeClassifier()
clf.fit(X, y)

# Export:
porter = Porter(clf, language='java')
output = porter.export(embed_data=True)
print(output)

The exported result matches the official human-readable version of the decision tree.
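
To give a feel for what a transpiled decision tree does at prediction time, here is a minimal Python sketch of array-based tree traversal. The arrays and their values are made up for illustration (they do not come from a trained iris model), and the layout (child indices, feature index, threshold and class counts per node) is an assumption modeled on how scikit-learn stores fitted trees. It is not the code that sklearn-porter actually emits.

```python
# Hypothetical tree with three nodes: one root split and two leaves.
# A child index of -1 marks a leaf.
LEFT = [1, -1, -1]
RIGHT = [2, -1, -1]
FEATURE = [2, -1, -1]          # the root splits on feature index 2
THRESHOLD = [2.45, 0.0, 0.0]
CLASS_COUNTS = [[50, 50], [50, 0], [0, 50]]


def predict(features, node=0):
    """Walk from the root to a leaf and return the majority class index."""
    while LEFT[node] != -1:
        if features[FEATURE[node]] <= THRESHOLD[node]:
            node = LEFT[node]
        else:
            node = RIGHT[node]
    counts = CLASS_COUNTS[node]
    return counts.index(max(counts))


print(predict([5.1, 3.5, 1.4, 0.2]))  # 0
print(predict([6.7, 3.0, 5.2, 2.3]))  # 1
```

Keeping the model as flat arrays is what makes it possible either to embed the data directly in the generated source or to load it from a separate file.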

Integrity

You should always compute the integrity score to verify that the original and the transpiled estimator produce the same predictions:

# ...
porter = Porter(clf, language='java')

# Compute integrity score:
integrity = porter.integrity_score(X)
print(integrity)  # 1.0
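
As a rough illustration of what such a score expresses, the sketch below computes the fraction of samples on which two prediction lists agree. The helper name agreement_ratio and the sample predictions are hypothetical, and this is only an assumption about the idea behind the score, not the implementation used by sklearn-porter.

```python
def agreement_ratio(original_preds, transpiled_preds):
    """Return the fraction of samples on which both prediction lists agree."""
    if len(original_preds) != len(transpiled_preds):
        raise ValueError("both prediction lists must have the same length")
    if not original_preds:
        raise ValueError("at least one prediction is required")
    matches = sum(a == b for a, b in zip(original_preds, transpiled_preds))
    return matches / len(original_preds)


print(agreement_ratio([0, 1, 2, 1], [0, 1, 2, 1]))  # 1.0
print(agreement_ratio([0, 1, 2, 1], [0, 1, 2, 2]))  # 0.75
```

A value below 1.0 means that at least one sample is classified differently by the transpiled estimator.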

Prediction

You can compute the prediction(s) in the target programming language:

# ...
porter = Porter(clf, language='java')

# Prediction(s):
Y_java = porter.predict(X)
y_java = porter.predict(X[0])
y_java = porter.predict([1., 2., 3., 4.])

Notebooks

You can run and test all notebooks by starting a Jupyter notebook server locally:

$ make open.examples
$ make stop.examples

CLI

In general you can use the porter on the command line:

$ porter <pickle_file> [--to <directory>]
         [--class_name <class_name>] [--method_name <method_name>]
         [--export] [--checksum] [--data] [--pipe]
         [--c] [--java] [--js] [--go] [--php] [--ruby]
         [--version] [--help]

The following example shows how you can save a trained estimator to the pickle format:

# ...
import joblib

# Save the estimator:
joblib.dump(clf, 'estimator.pkl', compress=0)

After that the estimator can be transpiled to JavaScript by using the following command:

$ porter estimator.pkl --js

The target programming language can be changed on the fly with the corresponding flag:

$ porter estimator.pkl --c
$ porter estimator.pkl --java
$ porter estimator.pkl --php
$ porter estimator.pkl --go
$ porter estimator.pkl --ruby

For further processing the argument --pipe can be used to pass the result:

$ porter estimator.pkl --js --pipe > estimator.js

For instance the result can be minified by using UglifyJS:

$ porter estimator.pkl --js --pipe | uglifyjs --compress -o estimator.min.js
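
If the command needs to be called from a Python script, the argument list can be assembled first and then handed to subprocess. The following is a minimal sketch: build_porter_command is a hypothetical helper, and it only uses flags that appear in the usage synopsis above.

```python
import shlex

# Language flags taken from the usage synopsis above.
LANGUAGES = {"c", "java", "js", "go", "php", "ruby"}


def build_porter_command(pickle_file, language, pipe=False, to=None):
    """Return the argument list for a `porter` call (hypothetical helper)."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    command = ["porter", pickle_file]
    if to is not None:
        command += ["--to", to]
    command.append(f"--{language}")
    if pipe:
        command.append("--pipe")
    return command


args = build_porter_command("estimator.pkl", "js", pipe=True)
print(shlex.join(args))  # porter estimator.pkl --js --pipe
```

The resulting list can be passed to subprocess.run(args, check=True), which avoids shell quoting issues with file names.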

Development

Environment

Install the required modules for development:

$ make install.environment  # conda environment (optional)
$ make install.requirements.development  # pip requirements

Independently, the following compilers and interpreters are required to cover all tests:

Name	Version	Command
GCC	`>=4.2`	`gcc --version`
Java	`>=1.6`	`java -version`
PHP	`>=5.6`	`php --version`
Ruby	`>=2.4.1`	`ruby --version`
Go	`>=1.7.4`	`go version`
Node.js	`>=6`	`node --version`
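
Before running the full test suite, it can be convenient to check that these tools are installed at all. The sketch below only tests whether each executable from the table can be found on the PATH. It does not verify the minimum versions, and the helper name find_missing is hypothetical.

```python
import shutil

# Executables from the "Command" column of the table above.
REQUIRED_COMMANDS = ["gcc", "java", "php", "ruby", "go", "node"]


def find_missing(commands):
    """Return the commands that cannot be found on the current PATH."""
    return [name for name in commands if shutil.which(name) is None]


missing = find_missing(REQUIRED_COMMANDS)
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("All required tools are on PATH.")
```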

Testing

The tests cover module functions as well as matching predictions of transpiled estimators. Start all tests with:

$ make test

The test file names follow a specific pattern, '[Algorithm][Language]Test.py', so you can run a subset of the tests by algorithm or by language:

$ pytest tests -v -o python_files='RandomForest*Test.py'
$ pytest tests -v -o python_files='*JavaTest.py'
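
To see which files such a pattern would select, the sketch below applies the same shell-style globs to a few file names. The file names are hypothetical examples that merely follow the naming pattern, and the sketch assumes that pytest matches python_files entries as shell-style globs.

```python
from fnmatch import fnmatch

# Hypothetical file names following '[Algorithm][Language]Test.py'.
test_files = [
    "RandomForestClassifierJavaTest.py",
    "RandomForestClassifierJSTest.py",
    "DecisionTreeClassifierJavaTest.py",
    "SVCGoTest.py",
]


def select(pattern, names):
    """Return the names matching a shell-style glob pattern."""
    return [name for name in names if fnmatch(name, pattern)]


print(select("RandomForest*Test.py", test_files))
print(select("*JavaTest.py", test_files))
```

The first call selects both random forest files, the second selects the two Java test files.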

While you are developing new features or fixes, you can reduce the test duration by changing the number of tests:

$ N_RANDOM_FEATURE_SETS=5 N_EXISTING_FEATURE_SETS=10 \
  pytest tests -v -o python_files='*JavaTest.py'

Quality

It's highly recommended to check the code quality before submitting changes. Pylint is used for that. Start the linter with:

$ make lint

Citation

If you use this implementation in your work, please add a reference/citation to the paper. You can use the following BibTeX entry:

@unpublished{skpodamo,
  author = {Darius Morawiec},
  title = {sklearn-porter},
  note = {Transpile trained scikit-learn estimators to C, Java, JavaScript and others},
  url = {https://github.com/nok/sklearn-porter}
}

License

The module is Open Source Software released under the MIT license.

Questions?

Don't be shy and feel free to contact me on Twitter or Gitter.

Stars: ✭ 1,014
