komal11lamba / 50-days-of-Statistics-for-Data-Science

Licence: other

This repository consists of a 50-day program. All the statistics required for a complete understanding of data science will be uploaded to this repository.

Programming Languages

Jupyter Notebook


Projects that are alternatives of or similar to 50-days-of-Statistics-for-Data-Science

featurewiz

Use advanced feature engineering strategies and select best features from your data set with a single line of code.

Stars: ✭ 229 (+1105.26%)

Mutual labels: feature-selection, feature-extraction, feature-engineering

Market-Mix-Modeling

Market Mix Modelling for an eCommerce firm to estimate the impact of various marketing levers on sales

Stars: ✭ 31 (+63.16%)

Mutual labels: eda, feature-selection, feature-engineering

feature engine

Feature engineering package with sklearn like functionality

Stars: ✭ 758 (+3889.47%)

Mutual labels: feature-selection, feature-extraction, feature-engineering

FIFA-2019-Analysis

This project is based on the FIFA World Cup 2019 and analyzes the performance and efficiency of teams, players, countries, and related factors using data analysis and data visualization.

Stars: ✭ 28 (+47.37%)

Mutual labels: eda, feature-selection, feature-engineering

exemplary-ml-pipeline

Exemplary, annotated machine learning pipeline for any tabular data problem.

Stars: ✭ 23 (+21.05%)

Mutual labels: feature-selection, feature-engineering, feature-scaling

dominance-analysis

This package can be used for dominance analysis or Shapley Value Regression to find the relative importance of predictors on a given dataset. It can also be used for key driver analysis or marginal resource allocation models.

Stars: ✭ 111 (+484.21%)

Mutual labels: feature-selection, feature-engineering

Deep Learning Machine Learning Stock

Stock for Deep Learning and Machine Learning

Stars: ✭ 240 (+1163.16%)

Mutual labels: feature-extraction, feature-engineering

NVTabular

NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

Stars: ✭ 797 (+4094.74%)

Mutual labels: feature-selection, feature-engineering

The Building Data Genome Project

A collection of non-residential buildings for performance analysis and algorithm benchmarking

Stars: ✭ 117 (+515.79%)

Mutual labels: feature-extraction, feature-engineering

pyHSICLasso

Versatile Nonlinear Feature Selection Algorithm for High-dimensional Data

Stars: ✭ 125 (+557.89%)

Mutual labels: feature-selection, feature-extraction

My Journey In The Data Science World

📢 Ready to learn or review your knowledge!

Stars: ✭ 1,175 (+6084.21%)

Mutual labels: eda, feature-extraction

Amazing Feature Engineering

Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.

Stars: ✭ 218 (+1047.37%)

Mutual labels: feature-extraction, feature-engineering

Tsfel

An intuitive library to extract features from time series

Stars: ✭ 202 (+963.16%)

Mutual labels: feature-extraction, feature-engineering

Machine Learning Workflow With Python

A comprehensive set of ML techniques with Python: define the problem, specify inputs and outputs, data collection, exploratory data analysis, data preprocessing, model design, training, evaluation.

Stars: ✭ 157 (+726.32%)

Mutual labels: feature-extraction, feature-engineering

tsflex

Flexible time series feature extraction & processing

Stars: ✭ 252 (+1226.32%)

Mutual labels: feature-extraction, feature-engineering

Data-Science

Using Kaggle Data and Real World Data for Data Science and prediction in Python, R, Excel, Power BI, and Tableau.

Stars: ✭ 15 (-21.05%)

Mutual labels: dimensionality-reduction, feature-engineering

Blurr

Data transformations for the ML era

Stars: ✭ 96 (+405.26%)

Mutual labels: feature-extraction, feature-engineering

Nni

An open source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

Stars: ✭ 10,698 (+56205.26%)

Mutual labels: feature-extraction, feature-engineering

Complete Life Cycle Of A Data Science Project

Complete-Life-Cycle-of-a-Data-Science-Project

Stars: ✭ 140 (+636.84%)

Mutual labels: eda, feature-engineering

skrobot

skrobot is a Python module for designing, running and tracking Machine Learning experiments / tasks. It is built on top of scikit-learn framework.

Stars: ✭ 22 (+15.79%)

Mutual labels: feature-selection, feature-engineering


50-days-of-Statistics-for-Data-Science

This repository consists of a 50-day program. All the statistics required for a complete understanding of data science will be uploaded to this repository.

Sr No	Notebook Topic	Colab
1	Elements of Structured Data
2	Rectangular Data
3	Estimates of Location
4	Estimates of Variability
5	Exploring the Data Distribution
6	Exploring Binary and Categorical Data
7	Correlation
8	Exploring Two or More Variables
9	Random Sampling and Sample Bias
10	Selection Bias
11	Sampling Distribution of a Statistic
12	The Bootstrap
13	Confidence Intervals
14	Normal Distribution
15	Long-Tailed Distributions
16	Student’s t-Distribution
17	Binomial Distribution
18	Chi-Square Distribution
19	F-Distribution
20	Poisson and Related Distributions
21	A/B Testing
22	Hypothesis Tests
23	Resampling
24	Statistical Significance and p-Values
25	t-Tests
26	Multiple Testing
27	Degrees of Freedom
28	ANOVA
29	Chi-Square Test
30	Multi-Arm Bandit Algorithm
31	Power and Sample Size
32	Simple Linear Regression
33	Multiple Linear Regression
34	Prediction Using Regression
35	Factor Variables in Regression
36	Interpreting the Regression Equation
37	Regression Diagnostics
38	Polynomial and Spline Regression
39	Naïve Bayes
40	Discriminant Analysis
41	Logistic Regression
42	Evaluating Classification Models
43	Strategies for Imbalanced Data
44	K-Nearest Neighbors
45	Tree Models
46	Bagging and the Random Forest
47	Boosting
48	Principal Components Analysis
49	K-Means Clustering
50	Hierarchical Clustering
51	Model-Based Clustering
52	Scaling and Categorical Variables
