Market-Mix-ModelingMarket Mix Modelling for an eCommerce firm to estimate the impact of various marketing levers on sales
sklearn-audio-classificationAn in-depth analysis of audio classification on the RAVDESS dataset. Feature engineering, hyperparameter optimization, model evaluation, and cross-validation with a variety of ML techniques and MLP
fastknnFast k-Nearest Neighbors Classifier for Large Datasets
gallia-coreA schema-aware Scala library for data transformation
DataCon🏆DataCon大数据安全分析大赛,2019年方向二(恶意代码检测)冠军源码、2020年方向五(恶意代码分析)季军源码
hrv-analysisPackage for Heart Rate Variability analysis in Python
Predicting-Transportation-Modes-of-GPS-TrajectoriesUnderstanding transportation mode from GPS (Global Positioning System) traces is an essential topic in the data mobility domain. In this paper, a framework is proposed to predict transportation modes. This framework follows a sequence of five steps: (i) data preparation, where GPS points are grouped in trajectory samples; (ii) point features gen…
hamiltonA scalable general purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc.
50-days-of-Statistics-for-Data-ScienceThis repository consist of a 50-day program. All the statistics required for the complete understanding of data science will be uploaded in this repository.
dominance-analysisThis package can be used for dominance analysis or Shapley Value Regression for finding relative importance of predictors on given dataset. This library can be used for key driver analysis or marginal resource allocation models.
NVTabularNVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.
AutoTabularAutomatic machine learning for tabular data. ⚡🔥⚡
zcaZCA whitening in python
anovosAnovos - An Open Source Library for Scalable feature engineering Using Apache-Spark
PubMed-Best-MatchMachine-learning based pipeline relying on LambdaMART currently used in PubMed for relevance (Best Match) searches
msdaLibrary for multi-dimensional, multi-sensor, uni/multivariate time series data analysis, unsupervised feature selection, unsupervised deep anomaly detection, and prototype of explainable AI for anomaly detector
EvolutionaryForestAn open source python library for automated feature engineering based on Genetic Programming
skrobotskrobot is a Python module for designing, running and tracking Machine Learning experiments / tasks. It is built on top of scikit-learn framework.
AutoTSAutomated Time Series Forecasting
clinkClink is a library that provides APIs and infrastructure to facilitate the development of parallelizable feature engineering operators that can be used in both C++ and Java runtime.
fengfeng - feature engineering for machine-learning champions
FIFA-2019-AnalysisThis is a project based on the FIFA World Cup 2019 and Analyzes the Performance and Efficiency of Teams, Players, Countries and other related things using Data Analysis and Data Visualizations
Data-ScienceUsing Kaggle Data and Real World Data for Data Science and prediction in Python, R, Excel, Power BI, and Tableau.
tsflexFlexible time series feature extraction & processing