
alteryx / Compose

Licence: bsd-3-clause
A machine learning tool for automated prediction engineering. It allows you to easily structure prediction problems and generate labels for supervised learning.

Programming Languages

python

Projects that are alternatives of or similar to Compose

Autodl
Automated Deep Learning without ANY human intervention. 1'st Solution for AutoDL [email protected]
Stars: ✭ 854 (+320.69%)
Mutual labels:  ai, data-science, automl
Snorkel
A system for quickly generating training data with weak supervision
Stars: ✭ 4,953 (+2339.9%)
Mutual labels:  ai, data-science, labeling
Primehub
A toil-free multi-tenancy machine learning platform in your Kubernetes cluster
Stars: ✭ 160 (-21.18%)
Mutual labels:  ai, data-science
Fixy
Our goal is to build an open source spelling assistant/checker that tackles many different problems in the Turkish NLP literature together, proposes unique approaches, and addresses the gaps in existing work. It resolves spelling errors in users' text with a deep learning approach while also performing semantic analysis on the text, so that errors arising in that context can be detected and corrected as well.
Stars: ✭ 165 (-18.72%)
Mutual labels:  ai, data-science
Auptimizer
An automatic ML model optimization tool.
Stars: ✭ 166 (-18.23%)
Mutual labels:  data-science, automl
Nlpaug
Data augmentation for NLP
Stars: ✭ 2,761 (+1260.1%)
Mutual labels:  ai, data-science
Evalml
EvalML is an AutoML library written in python.
Stars: ✭ 145 (-28.57%)
Mutual labels:  data-science, automl
Lale
Library for Semi-Automated Data Science
Stars: ✭ 198 (-2.46%)
Mutual labels:  data-science, automl
Auto ml
[UNMAINTAINED] Automated machine learning for analytics & production
Stars: ✭ 1,559 (+667.98%)
Mutual labels:  data-science, automl
Lightautoml
LAMA - automatic model creation framework
Stars: ✭ 196 (-3.45%)
Mutual labels:  data-science, automl
Pytorch Lightning
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.
Stars: ✭ 16,641 (+8097.54%)
Mutual labels:  ai, data-science
Delbot
It understands your voice commands, searches news and knowledge sources, and summarizes and reads out content to you.
Stars: ✭ 191 (-5.91%)
Mutual labels:  ai, data-science
Automl alex
State-of-the-art Automated Machine Learning python library for Tabular Data
Stars: ✭ 132 (-34.98%)
Mutual labels:  data-science, automl
Ds Ai Tech Notes
📖 [Translation] Technical notes on data science and artificial intelligence
Stars: ✭ 131 (-35.47%)
Mutual labels:  ai, data-science
Datasciencevm
Tools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)
Stars: ✭ 153 (-24.63%)
Mutual labels:  ai, data-science
Modelchimp
Experiment tracking for machine and deep learning projects
Stars: ✭ 121 (-40.39%)
Mutual labels:  ai, data-science
Aulas
Classes from the Escola de Inteligência Artificial de São Paulo (São Paulo School of Artificial Intelligence)
Stars: ✭ 166 (-18.23%)
Mutual labels:  ai, data-science
Imodels
Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling (sklearn-compatible).
Stars: ✭ 194 (-4.43%)
Mutual labels:  ai, data-science
Blurr
Data transformations for the ML era
Stars: ✭ 96 (-52.71%)
Mutual labels:  ai, data-science
Nni
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
Stars: ✭ 10,698 (+5169.95%)
Mutual labels:  data-science, automl

Compose

"Build better training examples in a fraction of the time."



Compose is a machine learning tool for automated prediction engineering. It allows you to structure prediction problems and generate labels for supervised learning. An end user defines an outcome of interest by writing a labeling function, then runs a search to automatically extract training examples from historical data. The resulting labels are then passed to Featuretools for automated feature engineering and subsequently to EvalML for automated machine learning. The workflow of an applied machine learning engineer then becomes: prediction engineering with Compose, feature engineering with Featuretools, and automated machine learning with EvalML.


By automating the early stage of the machine learning pipeline, our end user can easily define a task and solve it. See the documentation for more information.

Install

Compose is available on PyPI and Conda-forge for Python 3.6 or later.

pip

To install from PyPI, run the command:

pip install composeml

conda

To install from Conda-forge, run the command:

conda install -c conda-forge composeml

Example

Will a customer spend more than 300 in the next hour of transactions?

In this example, we automatically generate new training examples from a historical dataset of transactions.

import composeml as cp
df = cp.demos.load_transactions()
df = df[df.columns[:7]]
df.head()
transaction_id  session_id  transaction_time     product_id  amount  customer_id  device
298             1           2014-01-01 00:00:00  5           127.64  2            desktop
10              1           2014-01-01 00:09:45  5           57.39   2            desktop
495             1           2014-01-01 00:14:05  5           69.45   2            desktop
460             10          2014-01-01 02:33:50  5           123.19  2            tablet
302             10          2014-01-01 02:37:05  5           64.47   2            tablet

First, we represent the prediction problem with a labeling function and a label maker.

def total_spent(ds):
    return ds['amount'].sum()

label_maker = cp.LabelMaker(
    target_entity="customer_id",
    time_index="transaction_time",
    labeling_function=total_spent,
    window_size="1h",
)
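
To make the roles of window_size and the labeling function concrete, here is a minimal pure-Python sketch of the idea. It is not Compose's implementation: the helper slice_windows and the list-of-dicts row format are hypothetical stand-ins for what LabelMaker does with a DataFrame, and the data is limited to the five sample rows printed by df.head() above. Each consecutive one-hour slice is handed to the labeling function once.

```python
from datetime import datetime, timedelta

# The five sample rows shown by df.head() (all belong to customer 2).
rows = [
    {"transaction_time": datetime(2014, 1, 1, 0, 0, 0), "amount": 127.64},
    {"transaction_time": datetime(2014, 1, 1, 0, 9, 45), "amount": 57.39},
    {"transaction_time": datetime(2014, 1, 1, 0, 14, 5), "amount": 69.45},
    {"transaction_time": datetime(2014, 1, 1, 2, 33, 50), "amount": 123.19},
    {"transaction_time": datetime(2014, 1, 1, 2, 37, 5), "amount": 64.47},
]


def total_spent(window):
    """Labeling function: total amount spent within one window."""
    return sum(row["amount"] for row in window)


def slice_windows(rows, start, window_size, count):
    """Hypothetical helper (not a Compose API): yield (window_start, rows)
    for `count` consecutive windows of length `window_size` from `start`."""
    for i in range(count):
        lo = start + i * window_size
        hi = lo + window_size
        # Half-open interval: a row belongs to the window it starts in.
        yield lo, [r for r in rows if lo <= r["transaction_time"] < hi]


# Call the labeling function once per one-hour window.
labels = [
    (window_start, round(total_spent(window), 2))
    for window_start, window in slice_windows(
        rows, datetime(2014, 1, 1), timedelta(hours=1), count=3
    )
]

for window_start, value in labels:
    print(window_start, value)
```

With only these five rows, the three windows total 254.48, 0, and 187.66. The second window is empty, which is the case the drop_empty option of the search governs.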

Then, we run a search to automatically generate the training examples.

label_times = label_maker.search(
    df.sort_values('transaction_time'),
    num_examples_per_instance=2,
    minimum_data='2014-01-01',
    drop_empty=False,
    verbose=False,
)

label_times = label_times.threshold(300)
label_times.head()
customer_id  time                 total_spent
1            2014-01-01 00:00:00  True
1            2014-01-01 01:00:00  True
2            2014-01-01 00:00:00  False
2            2014-01-01 01:00:00  False
3            2014-01-01 00:00:00  False
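
The threshold(300) step is what turns the continuous totals returned by the labeling function into the True/False values shown above. A minimal sketch of that transformation in plain Python follows. It assumes the comparison is strictly greater than, as the wording "more than 300" suggests. The function apply_threshold is hypothetical rather than part of Compose, and the totals in label_rows are made up for illustration.

```python
from datetime import datetime

# Made-up continuous labels: (customer_id, window start time, total spent).
label_rows = [
    (1, datetime(2014, 1, 1, 0, 0), 412.50),
    (2, datetime(2014, 1, 1, 0, 0), 254.48),
    (3, datetime(2014, 1, 1, 0, 0), 300.00),
]


def apply_threshold(rows, value):
    """Hypothetical helper: replace each total with True when it is
    strictly above `value`, otherwise False."""
    return [(customer_id, time, total > value) for customer_id, time, total in rows]


binary = apply_threshold(label_rows, 300)

for customer_id, time, label in binary:
    print(customer_id, time, label)
```

Under this assumption, a total of exactly 300 maps to False, so only customer 1 in the made-up data receives a positive label.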

We now have labels that are ready to use in Featuretools to generate features.

Support

The Innovation Labs open source community is happy to provide support to users of Compose. Project support can be found in three places depending on the type of question:

  1. For usage questions, use Stack Overflow with the composeml tag.
  2. For bugs, issues, or feature requests, start a GitHub issue.
  3. For discussion regarding development on the core library, use Slack.

Citing Compose

Compose is built upon a newly defined part of the machine learning process: prediction engineering. If you use Compose, please consider citing this paper: James Max Kanter, Owen Gillespie, Kalyan Veeramachaneni. Label, Segment, Featurize: A Cross Domain Framework for Prediction Engineering. IEEE DSAA 2016.

BibTeX entry:

@inproceedings{kanter2016label,
  title={Label, segment, featurize: a cross domain framework for prediction engineering},
  author={Kanter, James Max and Gillespie, Owen and Veeramachaneni, Kalyan},
  booktitle={2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA)},
  pages={430--439},
  year={2016},
  organization={IEEE}
}

Acknowledgements

The open source development has been supported in part by DARPA's Data-Driven Discovery of Models (D3M) program.

Innovation Labs


Compose has been developed and open sourced by Innovation Labs. We developed Compose to enable flexible definition of the machine learning task. To see the other open source projects we're working on, visit Innovation Labs.
