IntelPython / Daal4py

Licence: apache-2.0

sources for daal4py - a convenient Python API to oneDAL

Programming Languages

python

Labels

hacktoberfest data-analysis scikit-learn machine-learning-algorithms

daal4py - A Convenient Python API to the Intel(R) oneAPI Data Analytics Library

A simplified API to the Intel(R) oneAPI Data Analytics Library (oneDAL) that lets data scientists and machine learning users get started with the framework quickly. daal4py provides an abstraction over oneDAL for direct use or for integration into your own framework, and extends this with drop-in patching for scikit-learn.

Running full scikit-learn test suite with daal4py optimization patches:

when applied to scikit-learn from PyPi
when applied to build from master branch

👀 Follow us on Medium

We publish blogs on Medium, so follow us to learn tips and tricks for more efficient data analysis with the help of daal4py. Here are our latest blogs:

🔗 Important links

💬 Support

Report issues, ask questions, and provide suggestions using:

You may reach out to project maintainers privately at [email protected]

🛠 Installation

daal4py is available on the Python Package Index (PyPI) and on Anaconda Cloud in the Conda-Forge and Intel channels.

# PyPi
pip install daal4py

# Anaconda Cloud from Conda-Forge channel (recommended for conda users by default)
conda install daal4py -c conda-forge

# Anaconda Cloud from Intel channel (recommended for Intel® Distribution for Python)
conda install daal4py -c intel
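
After installing from any of the channels above, one quick way to confirm that the package is visible to the current interpreter is to query its metadata. This is a minimal sketch that uses only the Python standard library; `importlib.metadata` requires Python 3.8 or later, and the helper name `installed_version` is hypothetical, not part of daal4py.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "daal4py") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("daal4py is not installed in this environment")
else:
    print(f"daal4py {version} is installed")
```

Returning `None` instead of raising keeps the check usable in setup scripts that want to decide whether to install the package.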

ℹ️ Supported configurations

📦 PyPi channel

OS / Python version	Python 3.6	Python 3.7	Python 3.8	Python 3.9
Linux	[CPU, GPU]	[CPU, GPU]	[CPU, GPU]	[CPU, GPU]
Windows	[CPU, GPU]	[CPU, GPU]	[CPU, GPU]	[CPU, GPU]
OsX	[CPU]	[CPU]	[CPU]	[CPU]

📦 Anaconda Cloud: Conda-Forge channel

OS / Python version	Python 3.6	Python 3.7	Python 3.8	Python 3.9
Linux	[CPU]	[CPU]	[CPU]	[CPU]
Windows	[CPU]	[CPU]	[CPU]	[CPU]
OsX	❌	❌	❌	❌

📦 Anaconda Cloud: Intel channel

OS / Python version	Python 3.6	Python 3.7	Python 3.8	Python 3.9
Linux	❌	[CPU, GPU]	❌	❌
Windows	❌	[CPU, GPU]	❌	❌
OsX	❌	[CPU]	❌	❌

You can build daal4py from sources as well.
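
The support tables above can also be written down as a small lookup, which is handy in an install script that has to pick a channel. The sketch below only transcribes the three tables; the names `SUPPORT` and `supported_devices` and the lowercase keys are hypothetical choices for illustration.

```python
# Device support per channel and OS, transcribed from the tables above.
# An empty tuple means no package is listed for that combination.
ALL_PYTHONS = ("3.6", "3.7", "3.8", "3.9")

SUPPORT = {
    "pypi": {
        "pythons": ALL_PYTHONS,
        "devices": {"linux": ("cpu", "gpu"), "windows": ("cpu", "gpu"), "osx": ("cpu",)},
    },
    "conda-forge": {
        "pythons": ALL_PYTHONS,
        "devices": {"linux": ("cpu",), "windows": ("cpu",), "osx": ()},
    },
    "intel": {
        # The Intel channel table lists packages for Python 3.7 only.
        "pythons": ("3.7",),
        "devices": {"linux": ("cpu", "gpu"), "windows": ("cpu", "gpu"), "osx": ("cpu",)},
    },
}


def supported_devices(channel: str, os_name: str, python: str) -> tuple:
    """Return the devices listed for a channel, OS and Python version."""
    entry = SUPPORT[channel]
    if python not in entry["pythons"]:
        return ()
    return entry["devices"][os_name]


print(supported_devices("pypi", "linux", "3.9"))
print(supported_devices("intel", "linux", "3.8"))
```

An unknown channel or OS name raises `KeyError`, which seems preferable to silently reporting "unsupported" for a misspelled key.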

⚡️ Get Started

Accelerate scikit-learn with the core functionality of daal4py without changing the code.

Intel CPU optimizations patching

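import numpy as np
from daal4py.sklearn import patch_sklearn
# Apply the patch before importing any scikit-learn estimators.
patch_sklearn()

from sklearn.cluster import DBSCAN

# Six 2-D points in float32.
X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
# The scikit-learn code itself is unchanged; only the patch call above is new.
clustering = DBSCAN(eps=3, min_samples=2).fit(X)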

Intel CPU/GPU optimizations patching

import numpy as np
from daal4py.sklearn import patch_sklearn
from daal4py.oneapi import sycl_context
# Apply the patch before importing any scikit-learn estimators.
patch_sklearn()

from sklearn.cluster import DBSCAN

X = np.array([[1., 2.], [2., 2.], [2., 3.],
              [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
# Computations inside this block are offloaded to the GPU device.
with sycl_context("gpu"):
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)

🚀 Scikit-learn patching

Speedups of daal4py-powered Scikit-learn over the original Scikit-learn

Technical details: float type: float64; HW: Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz, 2 sockets, 28 cores per socket; SW: scikit-learn 0.23.1, Intel® oneDAL (2021.1 Beta 10)

daal4py patching affects the performance of the specific scikit-learn functionality listed below. When unsupported parameters are used, daal4py falls back to stock scikit-learn. These limitations are described below. If the patching does not cover your scenarios, submit an issue on GitHub.

⚠️ We support optimizations for the last four versions of scikit-learn. The latest release of daal4py-2021.1 supports scikit-learn 0.21.X, 0.22.X, 0.23.X and 0.24.X.
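
To check an installed scikit-learn against the series listed above, a plain string comparison on the `major.minor` prefix is enough. This is a minimal sketch: `SUPPORTED_SKLEARN_SERIES` and `is_supported_sklearn` are hypothetical names, the set only mirrors the four series named for daal4py-2021.1, and the parsing assumes ordinary release strings such as `0.23.1`.

```python
# Series named above for daal4py-2021.1.
SUPPORTED_SKLEARN_SERIES = {"0.21", "0.22", "0.23", "0.24"}


def is_supported_sklearn(version: str) -> bool:
    """True if `version` (e.g. '0.23.1') belongs to a supported major.minor series."""
    parts = version.split(".")
    if len(parts) < 2:
        return False
    return ".".join(parts[:2]) in SUPPORTED_SKLEARN_SERIES


for candidate in ("0.20.4", "0.23.1", "0.24.0"):
    print(candidate, is_supported_sklearn(candidate))
```

In practice the argument would be `sklearn.__version__`.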

🔥 Applying the daal4py patch will impact the following existing scikit-learn algorithms:

Task	Functionality	Parameters support	Data support
Classification	SVC	All parameters except `kernel` = 'poly' and 'sigmoid'.	No limitations.
	RandomForestClassifier	All parameters except `warm_start` = True, `ccp_alpha` != 0, and `criterion` != 'gini'.	Multi-output and sparse data are not supported.
	KNeighborsClassifier	All parameters except `metric` other than 'euclidean' or 'minkowski' with `p` = 2.	Multi-output and sparse data are not supported.
	LogisticRegression / LogisticRegressionCV	All parameters except `solver` other than 'lbfgs' or 'newton-cg', `class_weight` != None, and `sample_weight` != None.	Only dense data is supported.
Regression	RandomForestRegressor	All parameters except `warm_start` = True, `ccp_alpha` != 0, and `criterion` != 'mse'.	Multi-output and sparse data are not supported.
	KNeighborsRegressor	All parameters except `metric` other than 'euclidean' or 'minkowski' with `p` = 2.	Sparse data is not supported.
	LinearRegression	All parameters except `normalize` != False and `sample_weight` != None.	Only dense data is supported, `#observations` should be >= `#features`.
	Ridge	All parameters except `normalize` != False, `solver` != 'auto', and `sample_weight` != None.	Only dense data is supported, `#observations` should be >= `#features`.
	ElasticNet	All parameters except `sample_weight` != None.	Multi-output and sparse data are not supported, `#observations` should be >= `#features`.
	Lasso	All parameters except `sample_weight` != None.	Multi-output and sparse data are not supported, `#observations` should be >= `#features`.
Clustering	KMeans	All parameters except `precompute_distances` and `sample_weight` != None.	No limitations.
	DBSCAN	All parameters except `metric` other than 'euclidean' or 'minkowski' with `p` = 2.	Only dense data is supported.
Dimensionality reduction	PCA	All parameters except `svd_solver` != 'full'.	No limitations.
	TSNE	All parameters except `metric` other than 'euclidean' or 'minkowski' with `p` = 2.	Sparse data is not supported.
Unsupervised	NearestNeighbors	All parameters except `metric` other than 'euclidean' or 'minkowski' with `p` = 2.	Sparse data is not supported.
Other	train_test_split	All parameters are supported.	Only dense data is supported.
	assert_all_finite	All parameters are supported.	Only dense data is supported.
	pairwise_distances	Only with `metric` = 'cosine' and 'correlation'.	Only dense data is supported.
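
The `metric` restriction shared by the KNeighbors, DBSCAN, TSNE and NearestNeighbors rows is easy to misread, so here it is as a predicate, together with the data restriction on the linear models. This is only one reading of the table, not daal4py's actual dispatch code; both function names and their default arguments are hypothetical.

```python
def distance_metric_is_accelerated(metric: str = "minkowski", p: float = 2) -> bool:
    """True for 'euclidean', or for 'minkowski' with p equal to 2.

    Any other combination is expected to fall back to stock scikit-learn.
    """
    return metric == "euclidean" or (metric == "minkowski" and p == 2)


def linear_model_data_is_accelerated(n_observations: int, n_features: int,
                                     is_sparse: bool = False) -> bool:
    """True for dense data with at least as many observations as features."""
    return (not is_sparse) and n_observations >= n_features


print(distance_metric_is_accelerated("euclidean"))
print(distance_metric_is_accelerated("minkowski", p=1))
print(linear_model_data_is_accelerated(n_observations=100, n_features=10))
```

A check like this can run before fitting to warn that a parameter choice will not use the optimized path.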

Scenarios that are only available in the master branch (not released yet):

Task	Functionality	Parameters support	Data support
Other	roc_auc_score	Parameters `average`, `sample_weight`, `max_fpr` and `multi_class` are not supported.	No limitations.

📜 scikit-learn verbose

To find out which implementation of the algorithm is currently used (daal4py or stock Scikit-learn), set the environment variable:

On Linux and Mac OS: export IDP_SKLEARN_VERBOSE=INFO
On Windows: set IDP_SKLEARN_VERBOSE=INFO

For example, for DBSCAN you get one of these print statements depending on which implementation is used:

INFO: sklearn.cluster.DBSCAN.fit: uses Intel(R) oneAPI Data Analytics Library solver
INFO: sklearn.cluster.DBSCAN.fit: uses original Scikit-learn solver
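
The same variable can be set from Python, and the two messages above can be told apart with a substring test, for instance when scanning captured logs. This is a minimal sketch: `implementation_used` and the marker constants are hypothetical names, and setting the variable before daal4py is imported is an assumption about when the library reads it.

```python
import os
from typing import Optional

# Set before importing daal4py or scikit-learn (assumed to be the safe order).
os.environ["IDP_SKLEARN_VERBOSE"] = "INFO"

ONEDAL_MARKER = "uses Intel(R) oneAPI Data Analytics Library solver"
STOCK_MARKER = "uses original Scikit-learn solver"


def implementation_used(log_line: str) -> Optional[str]:
    """Return 'daal4py', 'stock', or None for a line of verbose output."""
    if ONEDAL_MARKER in log_line:
        return "daal4py"
    if STOCK_MARKER in log_line:
        return "stock"
    return None


sample = "INFO: sklearn.cluster.DBSCAN.fit: uses original Scikit-learn solver"
print(implementation_used(sample))
```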