empathy87 / The Elements Of Statistical Learning Python Notebooks

A series of Python Jupyter notebooks that help you better understand the book "The Elements of Statistical Learning"

Projects that are alternatives to or similar to The Elements Of Statistical Learning Python Notebooks

My Journey In The Data Science World
📢 Ready to learn or review your knowledge!
Stars: ✭ 1,175 (+190.12%)
Mutual labels:  jupyter-notebook, data-science, data-analysis, sklearn
Data Science Resources
👨🏽‍🏫You can learn about what data science is and why it's important in today's modern world. Are you interested in data science?🔋
Stars: ✭ 171 (-57.78%)
Mutual labels:  jupyter-notebook, data-science, data-analysis
Articles
A repository for the source code, notebooks, data, files, and other assets used in the data science and machine learning articles on LearnDataSci
Stars: ✭ 350 (-13.58%)
Mutual labels:  jupyter-notebook, data-science, data-analysis
Dtale
Visualizer for pandas data structures
Stars: ✭ 2,864 (+607.16%)
Mutual labels:  jupyter-notebook, data-science, data-analysis
Datasciencevm
Tools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)
Stars: ✭ 153 (-62.22%)
Mutual labels:  jupyter-notebook, data-science, data-analysis
Covid19 Severity Prediction
Extensive and accessible COVID-19 data + forecasting for counties and hospitals. 📈
Stars: ✭ 170 (-58.02%)
Mutual labels:  jupyter-notebook, data-science, data-analysis
Data Science
Collection of useful data science topics along with code and articles
Stars: ✭ 315 (-22.22%)
Mutual labels:  jupyter-notebook, data-science, data-analysis
Data Analysis
Mainly a summary of web-scraping and data-analysis projects, plus modeling, machine learning, and model evaluation.
Stars: ✭ 142 (-64.94%)
Mutual labels:  jupyter-notebook, data-analysis, sklearn
Deep Learning Machine Learning Stock
Stock for Deep Learning and Machine Learning
Stars: ✭ 240 (-40.74%)
Mutual labels:  jupyter-notebook, data-science, data-analysis
Data Science Hacks
Data Science Hacks consists of tips, tricks to help you become a better data scientist. Data science hacks are for all - beginner to advanced. Data science hacks consist of python, jupyter notebook, pandas hacks and so on.
Stars: ✭ 273 (-32.59%)
Mutual labels:  jupyter-notebook, data-science, data-analysis
Cryptocurrency Analysis Python
Open-Source Tutorial For Analyzing and Visualizing Cryptocurrency Data
Stars: ✭ 278 (-31.36%)
Mutual labels:  jupyter-notebook, data-science, data-analysis
Ml Workspace
🛠 All-in-one web-based IDE specialized for machine learning and data science.
Stars: ✭ 2,337 (+477.04%)
Mutual labels:  jupyter-notebook, data-science, data-analysis
Data Science Portfolio
A Portfolio of my Data Science Projects
Stars: ✭ 149 (-63.21%)
Mutual labels:  jupyter-notebook, data-science, data-analysis
Quantitative Notebooks
Educational notebooks on quantitative finance, algorithmic trading, financial modelling and investment strategy
Stars: ✭ 356 (-12.1%)
Mutual labels:  jupyter-notebook, data-science, data-analysis
Machine learning for good
Machine learning fundamentals lesson in interactive notebooks
Stars: ✭ 142 (-64.94%)
Mutual labels:  jupyter-notebook, data-science, data-analysis
100 Days Of Ml Code
A day-to-day plan for this challenge. Covers both theoretical and practical aspects
Stars: ✭ 172 (-57.53%)
Mutual labels:  jupyter-notebook, data-science, tutorials
Sklearn Evaluation
Machine learning model evaluation made easy: plots, tables, HTML reports, experiment tracking and Jupyter notebook analysis.
Stars: ✭ 294 (-27.41%)
Mutual labels:  jupyter-notebook, data-science, sklearn
Datasist
A Python library for easy data analysis, visualization, exploration and modeling
Stars: ✭ 123 (-69.63%)
Mutual labels:  jupyter-notebook, data-science, data-analysis
Youtube Like Predictor
YouTube Like Count Predictions using Machine Learning
Stars: ✭ 137 (-66.17%)
Mutual labels:  jupyter-notebook, data-science, data-analysis
Amazing Feature Engineering
Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.
Stars: ✭ 218 (-46.17%)
Mutual labels:  jupyter-notebook, data-science, data-analysis

"The Elements of Statistical Learning" Notebooks

Reproducing examples from "The Elements of Statistical Learning" by Trevor Hastie, Robert Tibshirani and Jerome Friedman with Python and its popular libraries: numpy, math, scipy, sklearn, pandas, tensorflow, statsmodels, sympy, catboost, pyearth, mlxtend, cvxpy. Almost all plotting is done with matplotlib, and occasionally with seaborn.

Examples

The documented Jupyter Notebooks are in the examples folder:

examples/Mixture.ipynb

Classifying the points from a mixture of Gaussians using linear regression, nearest-neighbor, logistic regression with natural cubic splines basis expansion, neural networks, support vector machines, flexible discriminant analysis over MARS regression, mixture discriminant analysis, K-means clustering, Gaussian mixture model and random forests.

examples/Prostate Cancer.ipynb

Predicting prostate specific antigen using ordinary least squares, ridge/lasso regularized linear regression, principal components regression, partial least squares and best subset regression. Model parameters are selected by K-fold cross-validation.
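
As a rough illustration of choosing a regularization parameter by K-fold cross-validation, the sketch below fits closed-form ridge regression with numpy only. The synthetic data, the penalty grid, and the helper names (`ridge_fit`, `kfold_cv_mse`) are assumptions made for this example; they are not taken from the notebook, which works with the prostate cancer dataset.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge coefficients; the intercept is handled by centering."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    p = X.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)
    intercept = y_mean - x_mean @ beta
    return intercept, beta

def kfold_cv_mse(X, y, lam, n_folds=5, seed=0):
    """Mean squared prediction error averaged over the K held-out folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    errors = []
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        intercept, beta = ridge_fit(X[train_idx], y[train_idx], lam)
        pred = intercept + X[test_idx] @ beta
        errors.append(np.mean((y[test_idx] - pred) ** 2))
    return float(np.mean(errors))

# Synthetic stand-in data (hypothetical, not the prostate dataset).
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 8))
true_beta = np.array([2.0, -1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0])
y = 1.5 + X @ true_beta + rng.normal(scale=0.5, size=100)

# Evaluate each candidate penalty and keep the one with the lowest CV error.
lambdas = [0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
cv_errors = {lam: kfold_cv_mse(X, y, lam) for lam in lambdas}
best_lam = min(cv_errors, key=cv_errors.get)
print(best_lam, cv_errors[best_lam])
```

The same loop applies to any model with a tuning parameter: only `ridge_fit` and the prediction line would change.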

examples/South African Heart Disease.ipynb

Understanding the risk factors using logistic regression, L1 regularized logistic regression, natural cubic splines basis expansion for nonlinearities, thin-plate spline for mutual dependency, local logistic regression, kernel density estimation and Gaussian mixture models.

examples/Vowel.ipynb

Vowel speech recognition using regression of an indicator matrix, linear/quadratic/regularized/reduced-rank discriminant analysis and logistic regression.

examples/Bone Mineral Density.ipynb

Comparing patterns of bone mineral density relative change for men and women using smoothing splines.

examples/Air Pollution Data.ipynb

Analysing Los Angeles pollution data using smoothing splines.

examples/Phoneme Recognition.ipynb

Phoneme speech recognition using reduced flexibility logistic regression.

examples/Galaxy.ipynb

Analysing the radial velocity of galaxy NGC7531 using local regression in multidimensional space.

examples/Ozone.ipynb

Analysing the factors influencing ozone concentration using local regression and trellis plots.

examples/Spam.ipynb

Detecting email spam using logistic regression, generalized additive logistic model, decision tree, multivariate adaptive regression splines, boosting and random forest.

examples/California Housing.ipynb

Analysing the factors influencing California house prices using boosting over decision trees and partial dependence plots.

examples/Demographics.ipynb

Predicting the occupation of shopping mall customers, and hence identifying the demographic variables that discriminate between occupational categories, using boosting and market basket analysis.

examples/ZIP Code.ipynb

Recognizing small hand-drawn digits using LeCun's Net-1 through Net-5 neural networks.

Analysing the variation of the handwritten digit three in ZIP codes using principal component and archetypal analysis.

examples/Human Tumor Microarray Data.ipynb

Analysing microarray data using K-means clustering and hierarchical clustering.
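
For readers who want to see the mechanics of K-means, here is a minimal numpy sketch of the standard alternating procedure (assign points to the nearest center, then recompute each center as the mean of its points). The function name `kmeans`, the farthest-point initialization, and the three synthetic blobs are assumptions of this example; the notebook itself works on the tumor microarray data and may be implemented differently.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain K-means with a farthest-point start (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    # Initialization: one random point, then repeatedly the point farthest
    # from all centers chosen so far.
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d2)])
    centers = np.array(centers)

    for _ in range(n_iter):
        # Assignment step: squared distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: each center moves to the mean of its assigned points;
        # an empty cluster keeps its previous center.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

# Three well-separated synthetic blobs (hypothetical data).
rng = np.random.default_rng(1)
true_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(size=(50, 2)) for c in true_centers])

centers, labels = kmeans(X, k=3)
print(np.round(centers, 2))
```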

examples/Country Dissimilarities.ipynb

Analysing country dissimilarities using K-medoids clustering and multidimensional scaling.

examples/Signature.ipynb

Analysing signature shapes using Procrustes transformation.
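
A Procrustes transformation can be sketched in a few lines of numpy: center both point sets, take the SVD of their cross-product, and use the resulting orthogonal matrix to rotate one shape onto the other. The function name `procrustes_align` and the toy curve below are assumptions for illustration, and this version omits scaling; it is not the notebook's code.

```python
import numpy as np

def procrustes_align(source, target):
    """Rotate and translate `source` (N x d) to best match `target` in least squares."""
    mu_s, mu_t = source.mean(axis=0), target.mean(axis=0)
    # The SVD of the cross-product of the centered point sets gives the
    # optimal orthogonal matrix R = U V^T.
    U, _, Vt = np.linalg.svd((source - mu_s).T @ (target - mu_t))
    R = U @ Vt  # orthogonal; may include a reflection
    aligned = (source - mu_s) @ R + mu_t
    return aligned, R

# Hypothetical outline: points along a curve, standing in for a signature.
t = np.linspace(0.0, 2.0 * np.pi, 30)
shape_a = np.column_stack([t, np.sin(2.0 * t)])

# A second copy of the same shape, rotated by 0.7 rad and shifted.
angle = 0.7
R_true = np.array([[np.cos(angle), np.sin(angle)],
                   [-np.sin(angle), np.cos(angle)]])
shape_b = shape_a @ R_true + np.array([3.0, -1.0])

aligned, R_est = procrustes_align(shape_a, shape_b)
print(np.abs(aligned - shape_b).max())  # residual after alignment
```

Because the second shape here is an exact rotated copy of the first, the residual should be at the level of floating-point error; with two genuinely different signatures it measures how dissimilar the shapes are.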

examples/Waveform.ipynb

Recognizing wave classes using linear, quadratic, flexible (over MARS regression), mixture discriminant analysis and decision trees.

examples/Protein Flow-Cytometry.ipynb

Analysing protein flow-cytometry data using graphical-lasso undirected graphical model for continuous variables.

examples/SRBCT Microarray.ipynb

Analysing microarray data of 2308 genes and selecting the most significant genes for cancer classification using nearest shrunken centroids.

examples/14 Cancer Microarray.ipynb

Analysing microarray data of 16,063 genes gathered by Ramaswamy et al. (2001) and selecting the most significant genes for cancer classification using nearest shrunken centroids, L2-penalized discriminant analysis, support vector classifier, k-nearest neighbors, L2-penalized multinomial, L1-penalized multinomial and elastic-net penalized multinomial. It is a difficult classification problem with p >> N (only 144 training observations).

examples/Skin of the Orange.ipynb

Solving a synthetic classification problem using Support Vector Machines and multivariate adaptive regression splines to show the influence of additional noise features.

examples/Radiation Sensitivity.ipynb

Assessing the significance of 12,625 genes from a microarray study of radiation sensitivity using the Benjamini-Hochberg method and the significance analysis of microarrays (SAM) approach.
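
The Benjamini-Hochberg step-up rule is short enough to sketch directly: sort the M p-values, find the largest rank L with p_(L) <= alpha * L / M, and reject every hypothesis whose p-value is at most p_(L). The function name `benjamini_hochberg` and the ten p-values below are made up for illustration; they are not results from the radiation sensitivity data.

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of the hypotheses rejected at FDR level alpha."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Rank-dependent thresholds alpha * j / m for j = 1..m.
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        # Step-up: take the LARGEST rank that meets its threshold and reject
        # everything up to it, even ranks that missed their own threshold.
        last = np.nonzero(below)[0].max()
        rejected[order[: last + 1]] = True
    return rejected

# Hypothetical p-values for ten tests.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
mask = benjamini_hochberg(p_vals, alpha=0.05)
print(int(mask.sum()), "hypotheses rejected")
```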
