All Projects → predict-idlab → tsflex

predict-idlab / tsflex

Licence: MIT license
Flexible time series feature extraction & processing

Programming Languages

python
139335 projects - #7 most used programming language
Makefile
30231 projects

Projects that are alternatives of or similar to tsflex

Deltapy
DeltaPy - Tabular Data Augmentation (by @firmai)
Stars: ✭ 344 (+36.51%)
Mutual labels:  feature-extraction, feature-engineering
Amazing Feature Engineering
Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.
Stars: ✭ 218 (-13.49%)
Mutual labels:  feature-extraction, feature-engineering
Awesome Feature Engineering
A curated list of resources dedicated to Feature Engineering Techniques for Machine Learning
Stars: ✭ 433 (+71.83%)
Mutual labels:  feature-extraction, feature-engineering
mistql
A miniature lisp-like language for querying JSON-like structures. Tuned for clientside ML feature extraction.
Stars: ✭ 260 (+3.17%)
Mutual labels:  feature-extraction, feature-engineering
Nni
An open source AutoML toolkit for automate machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
Stars: ✭ 10,698 (+4145.24%)
Mutual labels:  feature-extraction, feature-engineering
feature engine
Feature engineering package with sklearn like functionality
Stars: ✭ 758 (+200.79%)
Mutual labels:  feature-extraction, feature-engineering
Feature Selection
Features selector based on the self selected-algorithm, loss function and validation method
Stars: ✭ 534 (+111.9%)
Mutual labels:  feature-extraction, feature-engineering
fastknn
Fast k-Nearest Neighbors Classifier for Large Datasets
Stars: ✭ 64 (-74.6%)
Mutual labels:  feature-extraction, feature-engineering
Blurr
Data transformations for the ML era
Stars: ✭ 96 (-61.9%)
Mutual labels:  feature-extraction, feature-engineering
Kaggle Competitions
There are plenty of courses and tutorials that can help you learn machine learning from scratch but here in GitHub, I want to solve some Kaggle competitions as a comprehensive workflow with python packages. After reading, you can use this workflow to solve other real problems and use it as a template.
Stars: ✭ 86 (-65.87%)
Mutual labels:  feature-extraction, feature-engineering
Machine Learning Workflow With Python
This is a comprehensive ML techniques with python: Define the Problem- Specify Inputs & Outputs- Data Collection- Exploratory data analysis -Data Preprocessing- Model Design- Training- Evaluation
Stars: ✭ 157 (-37.7%)
Mutual labels:  feature-extraction, feature-engineering
Deep Learning Machine Learning Stock
Stock for Deep Learning and Machine Learning
Stars: ✭ 240 (-4.76%)
Mutual labels:  feature-extraction, feature-engineering
gan tensorflow
Automatic feature engineering using Generative Adversarial Networks using TensorFlow.
Stars: ✭ 48 (-80.95%)
Mutual labels:  feature-extraction, feature-engineering
Nlpython
This repository contains the code related to Natural Language Processing using python scripting language. All the codes are related to my book entitled "Python Natural Language Processing"
Stars: ✭ 265 (+5.16%)
Mutual labels:  feature-extraction, feature-engineering
featurewiz
Use advanced feature engineering strategies and select best features from your data set with a single line of code.
Stars: ✭ 229 (-9.13%)
Mutual labels:  feature-extraction, feature-engineering
Feature Engineering And Feature Selection
A Guide for Feature Engineering and Feature Selection, with implementations and examples in Python.
Stars: ✭ 526 (+108.73%)
Mutual labels:  feature-extraction, feature-engineering
Bike-Sharing-Demand-Kaggle
Top 5th percentile solution to the Kaggle knowledge problem - Bike Sharing Demand
Stars: ✭ 33 (-86.9%)
Mutual labels:  feature-extraction, feature-engineering
autoencoders tensorflow
Automatic feature engineering using deep learning and Bayesian inference using TensorFlow.
Stars: ✭ 66 (-73.81%)
Mutual labels:  feature-extraction, feature-engineering
Protr
Comprehensive toolkit for generating various numerical features of protein sequences
Stars: ✭ 30 (-88.1%)
Mutual labels:  feature-extraction, feature-engineering
The Building Data Genome Project
A collection of non-residential buildings for performance analysis and algorithm benchmarking
Stars: ✭ 117 (-53.57%)
Mutual labels:  feature-extraction, feature-engineering

tsflex

PyPI Latest Release Conda Latest Release support-version codecov Downloads PRs Welcome Documentation Testing

tsflex is a toolkit for flexible time series processing & feature extraction, that is efficient and makes few assumptions about sequence data.

Useful links

Installation

command
pip pip install tsflex
conda conda install -c conda-forge tsflex

Usage

tsflex is built to be intuitive, so we encourage you to copy-paste this code and toy with some parameters!

Feature extraction

import pandas as pd; import numpy as np; import scipy.stats as ss
from tsflex.features import MultipleFeatureDescriptors, FeatureCollection
from tsflex.utils.data import load_empatica_data

# 1. Load sequence-indexed data (in this case a time-index)
df_tmp, df_acc, df_ibi = load_empatica_data(['tmp', 'acc', 'ibi'])

# 2. Construct your feature extraction configuration
fc = FeatureCollection(
    MultipleFeatureDescriptors(
          functions=[np.min, np.mean, np.std, ss.skew, ss.kurtosis],
          series_names=["TMP", "ACC_x", "ACC_y", "IBI"],
          windows=["15min", "30min"],
          strides="15min",
    )
)

# 3. Extract features
fc.calculate(data=[df_tmp, df_acc, df_ibi], approve_sparsity=True)

Note that the feature extraction is performed on multivariate data with varying sample rates.

signal columns sample rate
df_tmp ["TMP"] 4Hz
df_acc ["ACC_x", "ACC_y", "ACC_z" ] 32Hz
df_ibi ["IBI"] irregularly sampled

Processing

Working example in our docs

Why tsflex?

  • Flexible:
  • Efficient:
  • Intuitive:
    • maintains the sequence-index of the data
    • feature extraction constructs interpretable output column names
    • intuitive API
  • Few assumptions about the sequence data:
    • no assumptions about sampling rate
    • able to deal with multivariate asynchronous data
      i.e. data with small time-offsets between the modalities
  • Advanced functionalities:

¹ These integrations are shown in integration-example notebooks.

Future work 🔨

  • scikit-learn integration for both processing and feature extraction
    note: is actively developed upon sklearn integration branch.
  • Support time series segmentation (exposing under the hood strided-rolling functionality) - see this issue
  • Support for multi-indexed dataframes

=> Also see the enhancement issues

Contributing 👪

We are thrilled to see your contributions to further enhance tsflex.
See this guide for more instructions on how to contribute.

Referencing our package

If you use tsflex in a scientific publication, we would highly appreciate citing us as:

@article{vanderdonckt2021tsflex,
    author = {Van Der Donckt, Jonas and Van Der Donckt, Jeroen and Deprost, Emiel and Van Hoecke, Sofie},
    title = {tsflex: flexible time series processing \& feature extraction},
    journal = {SoftwareX},
    year = {2021},
    url = {https://github.com/predict-idlab/tsflex},
    publisher={Elsevier}
}

Link to the paper: https://www.sciencedirect.com/science/article/pii/S2352711021001904


👤 Jonas Van Der Donckt, Jeroen Van Der Donckt, Emiel Deprost

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].