50-days-of-Statistics-for-Data-Science
This repository consists of a 50-day program. All the statistics required for a complete understanding of data science will be uploaded to this repository.
Stars: ✭ 19 (-91.7%)
exemplary-ml-pipeline
Exemplary, annotated machine learning pipeline for any tabular data problem.
Stars: ✭ 23 (-89.96%)
feature engine
Feature engineering package with sklearn-like functionality
Stars: ✭ 758 (+231%)
tsflex
Flexible time series feature extraction & processing
Stars: ✭ 252 (+10.04%)
Feature Selection
Feature selector based on a self-selected algorithm, loss function, and validation method
Stars: ✭ 534 (+133.19%)
FIFA-2019-Analysis
This project is based on the FIFA World Cup 2019 and analyzes the performance and efficiency of teams, players, countries, and other related things using data analysis and data visualizations
Stars: ✭ 28 (-87.77%)
Remixautoml
R package for automation of machine learning, forecasting, feature engineering, model evaluation, model interpretation, data generation, and recommenders.
Stars: ✭ 159 (-30.57%)
Auto ml
[UNMAINTAINED] Automated machine learning for analytics & production
Stars: ✭ 1,559 (+580.79%)
Nlpython
This repository contains code related to natural language processing using the Python scripting language. All of the code relates to my book, "Python Natural Language Processing".
Stars: ✭ 265 (+15.72%)
Deltapy
DeltaPy - Tabular Data Augmentation (by @firmai)
Stars: ✭ 344 (+50.22%)
Nni
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
Stars: ✭ 10,698 (+4571.62%)
fastknn
Fast k-Nearest Neighbors Classifier for Large Datasets
Stars: ✭ 64 (-72.05%)
Blurr
Data transformations for the ML era
Stars: ✭ 96 (-58.08%)
kaggle-berlin
Material of the Kaggle Berlin meetup group!
Stars: ✭ 36 (-84.28%)
Machine Learning Workflow With Python
A comprehensive set of ML techniques with Python: define the problem, specify inputs and outputs, data collection, exploratory data analysis, data preprocessing, model design, training, evaluation
Stars: ✭ 157 (-31.44%)
Mljar Supervised
Automated Machine Learning Pipeline with Feature Engineering and Hyperparameter Tuning 🚀
Stars: ✭ 961 (+319.65%)
Tpot
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
Stars: ✭ 8,378 (+3558.52%)
mistql
A miniature lisp-like language for querying JSON-like structures. Tuned for clientside ML feature extraction.
Stars: ✭ 260 (+13.54%)
Hyperactive
A hyperparameter optimization and data collection toolbox for convenient and fast prototyping of machine-learning models.
Stars: ✭ 182 (-20.52%)
pyHSICLasso
Versatile Nonlinear Feature Selection Algorithm for High-dimensional Data
Stars: ✭ 125 (-45.41%)
Kaggle Competitions
There are plenty of courses and tutorials that can help you learn machine learning from scratch, but here on GitHub I want to solve some Kaggle competitions as a comprehensive workflow with Python packages. After reading, you can use this workflow as a template to solve other real problems.
Stars: ✭ 86 (-62.45%)
Tsfel
An intuitive library to extract features from time series
Stars: ✭ 202 (-11.79%)
Amazing Feature Engineering
Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.
Stars: ✭ 218 (-4.8%)
Market-Mix-Modeling
Market Mix Modelling for an eCommerce firm to estimate the impact of various marketing levers on sales
Stars: ✭ 31 (-86.46%)
skrobot
skrobot is a Python module for designing, running and tracking Machine Learning experiments / tasks. It is built on top of the scikit-learn framework.
Stars: ✭ 22 (-90.39%)
Protr
Comprehensive toolkit for generating various numerical features of protein sequences
Stars: ✭ 30 (-86.9%)
autoencoders tensorflow
Automatic feature engineering using deep learning and Bayesian inference with TensorFlow.
Stars: ✭ 66 (-71.18%)
Alink
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.
Stars: ✭ 2,936 (+1182.1%)
msda
Library for multi-dimensional, multi-sensor, uni/multivariate time series data analysis, unsupervised feature selection, unsupervised deep anomaly detection, and a prototype of explainable AI for anomaly detectors
Stars: ✭ 80 (-65.07%)
Hyperparameter hunter
Easy hyperparameter optimization and automatic result saving across machine learning algorithms and libraries
Stars: ✭ 648 (+182.97%)
Home Credit Default Risk
Default risk prediction for Home Credit competition - Fast, scalable and maintainable SQL-based feature engineering pipeline
Stars: ✭ 68 (-70.31%)
gan tensorflow
Automatic feature engineering using Generative Adversarial Networks with TensorFlow.
Stars: ✭ 48 (-79.04%)
Awesome Feature Engineering
A curated list of resources dedicated to Feature Engineering Techniques for Machine Learning
Stars: ✭ 433 (+89.08%)
NVTabular
NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.
Stars: ✭ 797 (+248.03%)
AutoTabular
Automatic machine learning for tabular data. ⚡🔥⚡
Stars: ✭ 51 (-77.73%)
dominance-analysis
This package can be used for dominance analysis or Shapley Value Regression to find the relative importance of predictors on a given dataset. This library can be used for key driver analysis or marginal resource allocation models.
Stars: ✭ 111 (-51.53%)
FEAST
A FEAture Selection Toolbox for C/C++, Java, and Matlab/Octave.
Stars: ✭ 67 (-70.74%)
kserve
Serverless Inferencing on Kubernetes
Stars: ✭ 1,621 (+607.86%)
Predicting-Transportation-Modes-of-GPS-Trajectories
Understanding transportation mode from GPS (Global Positioning System) traces is an essential topic in the data mobility domain. In this paper, a framework is proposed to predict transportation modes. This framework follows a sequence of five steps: (i) data preparation, where GPS points are grouped in trajectory samples; (ii) point features gen…
Stars: ✭ 37 (-83.84%)
Kaggle-Competition-Sberbank
Code sharing for a top 1% ranking (22/3270) in the Kaggle competition Sberbank Russian Housing Market: https://www.kaggle.com/c/sberbank-russian-housing-market
Stars: ✭ 31 (-86.46%)
Deep-Learning
This repo provides projects on deep learning, mainly using TensorFlow 2.0
Stars: ✭ 22 (-90.39%)
towhee
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
Stars: ✭ 821 (+258.52%)
imsearch
Framework to build your own reverse image search engine
Stars: ✭ 64 (-72.05%)
lung-image-analysis
A basic framework for pulmonary nodule detection and characterization in CT
Stars: ✭ 26 (-88.65%)
laravel-rollout
A package to integrate rollout into your Laravel project.
Stars: ✭ 23 (-89.96%)
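EngineX
Engine X - a real-time AI decision engine, rule engine, risk-control engine, and data-flow engine. Rules are configured through a visual interface with no tedious development, which saves labor, improves efficiency, allows real-time monitoring, reduces error rates, and permits adjustment at any time. Supports rule sets, scorecards, decision trees, list management, machine learning models, third-party data integration, custom development, and more.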
Stars: ✭ 369 (+61.14%)
sklearn-audio-classification
An in-depth analysis of audio classification on the RAVDESS dataset. Feature engineering, hyperparameter optimization, model evaluation, and cross-validation with a variety of ML techniques and MLP
Stars: ✭ 31 (-86.46%)
recsys2019
The complete code and notebooks used for the ACM Recommender Systems Challenge 2019
Stars: ✭ 26 (-88.65%)
mloperator
Machine Learning Operator & Controller for Kubernetes
Stars: ✭ 85 (-62.88%)
woodwork
Woodwork is a Python library that provides robust methods for managing and communicating data typing information.
Stars: ✭ 97 (-57.64%)
gallia-core
A schema-aware Scala library for data transformation
Stars: ✭ 44 (-80.79%)