Top 93 feature-engineering open source projects

Amazing Feature Engineering
Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.
Nyaggle
Code for Kaggle and Offline Competitions
Tsfel
An intuitive library to extract features from time series
Geomancer
Automated feature engineering for geospatial data
Fe4ml Zh
📖 [Translation] Feature Engineering for Machine Learning
Hanzi char featurizer
A Chinese character feature extractor (featurizer) that extracts features of Chinese characters (pronunciation features, glyph features) for use as features in deep learning
Autofeat
Linear Prediction Model with Automated Feature Engineering and Selection Capabilities
Transmogrifai
TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
Remixautoml
R package for automation of machine learning, forecasting, feature engineering, model evaluation, model interpretation, data generation, and recommenders.
Machine Learning Workflow With Python
A comprehensive set of ML techniques with Python: Define the Problem, Specify Inputs & Outputs, Data Collection, Exploratory Data Analysis, Data Preprocessing, Model Design, Training, Evaluation
Albedo
A recommender system for discovering GitHub repos, built with Apache Spark
Ppdai risk evaluation
"Magic Mirror Cup" risk control algorithm competition: a PPDai risk control model with a score close to the champion's
Datasist
A Python library for easy data analysis, visualization, exploration and modeling
The Building Data Genome Project
A collection of non-residential buildings for performance analysis and algorithm benchmarking
Kaggle Competitions
There are plenty of courses and tutorials that can help you learn machine learning from scratch, but here on GitHub I want to solve some Kaggle competitions as a comprehensive workflow with Python packages. After reading, you can use this workflow as a template to solve other real problems.
Home Credit Default Risk
Default risk prediction for Home Credit competition - Fast, scalable and maintainable SQL-based feature engineering pipeline
Awesome Feature Engineering
A curated list of feature engineering techniques for image and text machine learning
Drugs Recommendation Using Reviews
Analyzes drug descriptions, conditions, and reviews, then recommends drugs for each patient health condition using deep learning models.
Feagen
(deprecated) A fast and memory-efficient Python data engineering framework for machine learning.
Protr
Comprehensive toolkit for generating various numerical features of protein sequences
Sgx Full Orderbook Tick Data Trading Strategy
Solutions for high-frequency trading (HFT) strategies using data science approaches (machine learning) on full orderbook tick data.
Kaggle Quora Question Pairs
Kaggle: Quora Question Pairs, 4th/3396 (https://www.kaggle.com/c/quora-question-pairs)
Kaggler
Code for Kaggle Data Science Competitions
Feature Selection
Feature selector based on a self-selected algorithm, loss function, and validation method
Feature Engineering And Feature Selection
A Guide for Feature Engineering and Feature Selection, with implementations and examples in Python.
Awesome Feature Engineering
A curated list of resources dedicated to Feature Engineering Techniques for Machine Learning
Open source demos
A collection of demos showcasing automated feature engineering and machine learning in diverse use cases
Nlpython
This repository contains code for natural language processing in Python. All of the code relates to my book, "Python Natural Language Processing".
prosto
Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
cortana-intelligence-customer360
This repository contains instructions and code to deploy a customer 360 profile solution on Azure stack using the Cortana Intelligence Suite.
mistql
A miniature Lisp-like language for querying JSON-like structures. Tuned for client-side ML feature extraction.
featurewiz
Use advanced feature engineering strategies and select the best features from your data set with a single line of code.
EngineX
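Engine X: a real-time AI decision engine, rule engine, risk control engine, and data flow engine. Rules are configured through a visual interface with no tedious development, which saves labor, improves efficiency, enables real-time monitoring, reduces error rates, and allows adjustment at any time. Supports rule sets, scorecards, decision trees, list library management, machine learning models, third-party data integration, custom development, and more.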