msalibian / Stat406
STAT406 - "Elements of Statistical Learning"
Public repository for STAT406 @ UBC - "Elements of Statistical Learning".
LICENSE
The notes in this repository are released under the "Creative Commons Attribution-ShareAlike 4.0 International" license. See the human-readable version here and the real thing here.
Course outline
The course syllabus is here.
Tentative weekly schedule
The tentative week-by-week schedule is here.
Weekly reading
This is a list of strongly recommended pre-class reading. [JWHT13] and [HTF09] refer to two of the reference books listed below. The list will be updated as the Term progresses, so make sure you double-check the recommended pre-class reading approximately one week in advance, as it may have changed.
- Week 1 (L1): Review of Linear Regression
- Sections 2.1, 2.1.1, 2.1.2, 2.1.3, 2.2, 2.2.1 from [JWHT13]
- Sections 2.4 and 2.6 from [HTF09].
- Week 2 (L2/3): Goodness of Fit vs Prediction error, Cross Validation
- Sections 5.1, 5.1.1, 5.1.2, 5.1.3 from [JWHT13]
- Sections 7.1, 7.2, 7.3, 7.10 from [HTF09].
- Week 3 (L4/5): Correlated predictors, Feature selection, AIC
- Sections 6.1, 6.1.1, 6.1.2, 6.1.3, 6.2 and 6.2.1 from [JWHT13]
- Sections 7.4, 7.5 from [HTF09].
- Week 4 (L6/MT1): Ridge regression, LASSO, Elastic Net
- Section 6.2 (complete) from [JWHT13]
- Sections 3.4, 3.8, 3.8.1, 3.8.2 from [HTF09]
- Week 5 (L7/8): Elastic Net, Smoothers (Local regression, Splines)
- Sections 7.1, 7.3, 7.4, 7.5, 7.6 from [JWHT13]
- Week 6 (L9/10): Curse of dimensionality, Regression Trees
- Sections 8.1, 8.1.1, 8.1.3, 8.1.4 from [JWHT13]
- Week 7 (L11/MT2): Bagging
- Sections 8.2, 8.2.1 from [JWHT13]
- Week 8 (L12/13): Classification, LDA, QDA, Logistic Regression
- Sections 4.1, 4.2, 4.3, 4.4, 2.2.3 from [JWHT13]
- Week 9 (L14/15): Trees, Ensembles, Bagging
- Sections 8.1.2, 8.2.1 and 8.2.2 from [JWHT13]
- Week 10 (L16/MT3): Random Forests
- Sections 8.2.1 and 8.2.2 from [JWHT13]
- Week 11 (L17/18): Boosting, Neural Networks?
- Section 8.2.3 from [JWHT13]
- Sections 10.1 - 10.10 (except 10.7), 11.3 - 11.5, 11.7 from [HTF09]
- Week 12 (L19/20): Unsupervised learning, K-means, model-based clustering
- Section 10.3 from [JWHT13]
- Sections 13.2, 14.3 from [HTF09]
- Week 13 (L21/L22/MT4): Hierarchical clustering, Principal Components, Multidimensional Scaling
- Sections 10.2, 10.3 from [JWHT13]
- Sections 8.5, 14.3, 14.5.1, 14.8, 14.9 from [HTF09]
Reference books
- [JWHT13]: James, G., Witten, D., Hastie, T. and Tibshirani, R. An Introduction to Statistical Learning. 2013. Springer-Verlag New York.
- [HTF09]: Hastie, T., Tibshirani, R. and Friedman, J. The Elements of Statistical Learning. 2009. Second Edition. Springer-Verlag New York.
- [MASS]: Venables, W.N. and Ripley, B.D. Modern Applied Statistics with S. 2002. Fourth Edition. Springer, New York.
PIAZZA and WebWork
- You can register in the course's PIAZZA page via Canvas.
- In order to use WebWork to practice with the quizzes you need to ... (more to come later).
Useful tools
- R: This is the software we will use in the course. I will assume that you are familiar with it (in particular, that you know how to write your own functions and loops). If needed, there are plenty of resources online to learn R.
- RStudio: The IDE (integrated development environment) of choice for R. Not necessary, but helpful.
- Jupyter Notebooks: "The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text." You can use these to interactively run and play with the lecture notes and the code to reproduce all the examples I use in class. This is not necessary, but may be helpful. There are two options to run notebooks: (1) locally on your own computer; or (2) on a remote server:
- Follow the instructions here to install Jupyter on your laptop. You will also need to follow these instructions to install the R kernel for Jupyter.
- Alternatively, you can run the notebooks on the syzygy server. There are Julia, Python 2, Python 3, and R kernels available (although we will only use the R one). Sign in with your UBC CWL. Once you are logged in, use this link to clone this repository (STAT406) (including all notebooks) directly onto your syzygy home directory. You will need to do this regularly throughout the Term, as the notebooks may (will?) change during the Term.