
alteryx / Featuretools

Licence: bsd-3-clause
An open source python library for automated feature engineering

Programming Languages

python
139335 projects - #7 most used programming language
Makefile
30231 projects

Projects that are alternatives of or similar to Featuretools

Auto ml
[UNMAINTAINED] Automated machine learning for analytics & production
Stars: ✭ 1,559 (-73.54%)
Mutual labels:  data-science, scikit-learn, automl, feature-engineering, automated-machine-learning
Tpot
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
Stars: ✭ 8,378 (+42.22%)
Mutual labels:  data-science, scikit-learn, automl, feature-engineering, automated-machine-learning
Mljar Supervised
Automated Machine Learning Pipeline with Feature Engineering and Hyper-Parameters Tuning 🚀
Stars: ✭ 961 (-83.69%)
Mutual labels:  data-science, scikit-learn, automl, feature-engineering, automated-machine-learning
Machinejs
[UNMAINTAINED] Automated machine learning- just give it a data file! Check out the production-ready version of this project at ClimbsRocks/auto_ml
Stars: ✭ 412 (-93.01%)
Mutual labels:  data-science, scikit-learn, automl, automated-machine-learning
Hyperactive
A hyperparameter optimization and data collection toolbox for convenient and fast prototyping of machine-learning models.
Stars: ✭ 182 (-96.91%)
Mutual labels:  data-science, scikit-learn, feature-engineering, automated-machine-learning
Autodl
Automated Deep Learning without ANY human intervention. 1'st Solution for AutoDL [email protected]
Stars: ✭ 854 (-85.5%)
Mutual labels:  data-science, automl, feature-engineering, automated-machine-learning
Autogluon
AutoGluon: AutoML for Text, Image, and Tabular Data
Stars: ✭ 3,920 (-33.46%)
Mutual labels:  data-science, scikit-learn, automl, automated-machine-learning
Lightautoml
LAMA - automatic model creation framework
Stars: ✭ 196 (-96.67%)
Mutual labels:  data-science, automl, feature-engineering, automated-machine-learning
Lale
Library for Semi-Automated Data Science
Stars: ✭ 198 (-96.64%)
Mutual labels:  data-science, scikit-learn, automl, automated-machine-learning
featuretoolsOnSpark
A simplified version of featuretools for Spark
Stars: ✭ 24 (-99.59%)
Mutual labels:  feature-engineering, automl, automated-machine-learning, automated-feature-engineering
Nni
An open source AutoML toolkit for automate machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
Stars: ✭ 10,698 (+81.6%)
Mutual labels:  data-science, automl, feature-engineering, automated-machine-learning
EvolutionaryForest
An open source python library for automated feature engineering based on Genetic Programming
Stars: ✭ 56 (-99.05%)
Mutual labels:  feature-engineering, automl, automated-machine-learning, automated-feature-engineering
Auptimizer
An automatic ML model optimization tool.
Stars: ✭ 166 (-97.18%)
Mutual labels:  data-science, automl, automated-machine-learning
Evalml
EvalML is an AutoML library written in python.
Stars: ✭ 145 (-97.54%)
Mutual labels:  data-science, automl, feature-engineering
Auto Sklearn
Automated Machine Learning with scikit-learn
Stars: ✭ 5,916 (+0.42%)
Mutual labels:  scikit-learn, automl, automated-machine-learning
Amazing Feature Engineering
Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.
Stars: ✭ 218 (-96.3%)
Mutual labels:  data-science, scikit-learn, feature-engineering
Pba
Efficient Learning of Augmentation Policy Schedules
Stars: ✭ 461 (-92.17%)
Mutual labels:  data-science, automl, automated-machine-learning
Flaml
A fast and lightweight AutoML library.
Stars: ✭ 205 (-96.52%)
Mutual labels:  data-science, automl, automated-machine-learning
Igel
a delightful machine learning tool that allows you to train, test, and use models without writing code
Stars: ✭ 2,956 (-49.82%)
Mutual labels:  data-science, scikit-learn, automl
Xcessiv
A web-based application for quick, scalable, and automated hyperparameter tuning and stacked ensembling in Python.
Stars: ✭ 1,255 (-78.7%)
Mutual labels:  data-science, scikit-learn, automated-machine-learning

Featuretools

"One of the holy grails of machine learning is to automate more and more of the feature engineering process." ― Pedro Domingos, A Few Useful Things to Know about Machine Learning


Featuretools is a Python library for automated feature engineering. See the documentation for more information.

Installation

Install with pip

python -m pip install featuretools

or from the conda-forge channel using conda:

conda install -c conda-forge featuretools

Add-ons

You can install add-ons individually or all at once by running

python -m pip install "featuretools[complete]"

Update checker - Receive automatic notifications of new Featuretools releases

python -m pip install "featuretools[update_checker]"

NLP Primitives - Use Natural Language Processing Primitives:

python -m pip install "featuretools[nlp_primitives]"

TSFresh Primitives - Use 60+ primitives from tsfresh within Featuretools

python -m pip install "featuretools[tsfresh]"

Example

Below is an example of using Deep Feature Synthesis (DFS) to perform automated feature engineering. In this example, we apply DFS to a multi-table dataset consisting of timestamped customer transactions.

>>> import featuretools as ft
>>> es = ft.demo.load_mock_customer(return_entityset=True)
>>> es.plot()

Featuretools can automatically create a single table of features for any "target dataframe":

>>> feature_matrix, features_defs = ft.dfs(entityset=es, target_dataframe_name="customers")
>>> feature_matrix.head(5)
            zip_code  COUNT(transactions)  COUNT(sessions)  SUM(transactions.amount) MODE(sessions.device)  MIN(transactions.amount)  MAX(transactions.amount)  YEAR(join_date)  SKEW(transactions.amount)  DAY(join_date)                   ...                     SUM(sessions.MIN(transactions.amount))  MAX(sessions.SKEW(transactions.amount))  MAX(sessions.MIN(transactions.amount))  SUM(sessions.MEAN(transactions.amount))  STD(sessions.SUM(transactions.amount))  STD(sessions.MEAN(transactions.amount))  SKEW(sessions.MEAN(transactions.amount))  STD(sessions.MAX(transactions.amount))  NUM_UNIQUE(sessions.DAY(session_start))  MIN(sessions.SKEW(transactions.amount))
customer_id                                                                                                                                                                                                                                  ...
1              60091                  131               10                  10236.77               desktop                      5.60                    149.95             2008                   0.070041               1                   ...                                                     169.77                                 0.610052                                   41.95                               791.976505                              175.939423                                 9.299023                                 -0.377150                                5.857976                                        1                                -0.395358
2              02139                  122                8                   9118.81                mobile                      5.81                    149.15             2008                   0.028647              20                   ...                                                     114.85                                 0.492531                                   42.96                               596.243506                              230.333502                                10.925037                                  0.962350                                7.420480                                        1                                -0.470007
3              02139                   78                5                   5758.24               desktop                      6.78                    147.73             2008                   0.070814              10                   ...                                                      64.98                                 0.645728                                   21.77                               369.770121                              471.048551                                 9.819148                                 -0.244976                               12.537259                                        1                                -0.630425
4              60091                  111                8                   8205.28               desktop                      5.73                    149.56             2008                   0.087986              30                   ...                                                      83.53                                 0.516262                                   17.27                               584.673126                              322.883448                                13.065436                                 -0.548969                               12.738488                                        1                                -0.497169
5              02139                   58                4                   4571.37                tablet                      5.91                    148.17             2008                   0.085883              19                   ...                                                      73.09                                 0.830112                                   27.46                               313.448942                              198.522508                                 8.950528                                  0.098885                                5.599228                                        1                                -0.396571

[5 rows x 69 columns]

We now have a feature vector for each customer that can be used for machine learning. See the documentation on Deep Feature Synthesis for more examples.
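To make the stacked column names in the output above easier to read, here is a minimal sketch that computes two such features by hand using only the standard library. It is not how Featuretools is implemented; the toy rows and variable names are assumptions made up for illustration. It only shows that a name like `MAX(sessions.SUM(transactions.amount))` reads inside out: sum the transaction amounts per session, then take the maximum of those sums per customer.

```python
from collections import defaultdict

# Toy data (hypothetical values, not the mock customer dataset).
sessions = [
    {"session_id": 1, "customer_id": 1},
    {"session_id": 2, "customer_id": 1},
    {"session_id": 3, "customer_id": 2},
]
transactions = [
    {"transaction_id": 10, "session_id": 1, "amount": 20.0},
    {"transaction_id": 11, "session_id": 1, "amount": 5.0},
    {"transaction_id": 12, "session_id": 2, "amount": 40.0},
    {"transaction_id": 13, "session_id": 3, "amount": 7.5},
]

# Depth 1: aggregate transactions up to their parent session,
# i.e. SUM(transactions.amount) for each session.
session_sum = defaultdict(float)
for t in transactions:
    session_sum[t["session_id"]] += t["amount"]

# Group the per-session sums by the customer that owns each session.
sums_by_customer = defaultdict(list)
for s in sessions:
    sums_by_customer[s["customer_id"]].append(session_sum[s["session_id"]])

# Depth 2: aggregate the per-session values up to the customer.
customer_features = {
    customer_id: {
        "COUNT(sessions)": len(sums),
        "MAX(sessions.SUM(transactions.amount))": max(sums),
    }
    for customer_id, sums in sums_by_customer.items()
}

for customer_id, feats in sorted(customer_features.items()):
    print(customer_id, feats)
```

Each nesting level in a feature name corresponds to one hop along a parent-child relationship between dataframes, which is why deeper features carry longer names.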

Featuretools contains many different types of built-in primitives for creating features. If the primitive you need is not included, Featuretools also allows you to define your own custom primitives.

Demos

Predict Next Purchase

Repository | Notebook

In this demonstration, we use a multi-table dataset of 3 million online grocery orders from Instacart to predict what a customer will buy next. We show how to generate features with automated feature engineering and build an accurate machine learning pipeline using Featuretools, which can be reused for multiple prediction problems. For more advanced users, we show how to scale that pipeline to a large dataset using Dask.

For more examples of how to use Featuretools, check out our demos page.

Testing & Development

The Featuretools community welcomes pull requests. Instructions for testing and development are available here.

Support

The Featuretools community is happy to provide support to users of Featuretools. Project support can be found in four places depending on the type of question:

  1. For usage questions, use Stack Overflow with the featuretools tag.
  2. For bugs, issues, or feature requests, start a GitHub issue.
  3. For discussion regarding development on the core library, use Slack.
  4. For everything else, the core developers can be reached by email at [email protected].

Citing Featuretools

If you use Featuretools, please consider citing the following paper:

James Max Kanter, Kalyan Veeramachaneni. Deep feature synthesis: Towards automating data science endeavors. IEEE DSAA 2015.

BibTeX entry:

@inproceedings{kanter2015deep,
  author    = {James Max Kanter and Kalyan Veeramachaneni},
  title     = {Deep feature synthesis: Towards automating data science endeavors},
  booktitle = {2015 {IEEE} International Conference on Data Science and Advanced Analytics, DSAA 2015, Paris, France, October 19-21, 2015},
  pages     = {1--10},
  year      = {2015},
  organization={IEEE}
}

Built at Alteryx Innovation Labs
