DaWe1992 / Applied_ML_Fundamentals

Licence: other
📔 DHBW Lecture Notes "Applied ML Fundamentals" 🤖


📔 Applied Machine Learning Fundamentals (Lecture) 🤖

'We are drowning in information and starving for knowledge.' – John Naisbitt

Machine learning / data science is a subfield of artificial intelligence (AI) that gives systems the ability to learn and improve from experience automatically, without being explicitly programmed. Machine learning focuses on the development of computer programs which can access data and use it to learn for themselves. A machine learning algorithm learns by building a mathematical / statistical model from the data; this model can then be used for inference and decision making. Machine learning has become an integral part of many modern applications. It is a cross-topic discipline which combines computer science, math / statistics as well as domain and business knowledge.
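To make the "build a model from data, then use it for inference" idea concrete, here is a tiny, purely illustrative sketch (not part of the lecture material; the data points are made up): a straight line is fitted to noisy observations and then used for prediction.

```python
import numpy as np

# Made-up training data: inputs x and noisy observations of roughly y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# 'Learning': estimate slope and intercept of a line by least squares.
slope, intercept = np.polyfit(x, y, deg=1)   # slope ~ 2, intercept ~ 1

# 'Inference': use the fitted model to predict the output for an unseen input.
y_new = slope * 5.0 + intercept              # prediction for x = 5
```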

The lecture 'Applied Machine Learning Fundamentals' gives a general introduction to state-of-the-art machine learning algorithms and their applications. This readme provides you with all necessary information. It is structured as follows:

  1. 📜 Lecture contents
  2. ✒️ Assignments
  3. 📝 Exam
  4. 🐍 Python code
  5. 📚 Literature and recommended reading
  6. 📁 Data exploration project
  7. 🎁 Additional material
  8. ⚠️ Frequently Asked Questions (FAQ)
  9. 🐞 Bugs and errors

Lecture Contents 📜

The following topics / algorithms will be covered by the lecture:

  1. Introduction to machine learning (click here)
    • Motivation and applications
    • Terminology
    • Key challenges in ML: Generalization, feature engineering, model selection, ...
  2. Mathematical foundations (click here)
    • Linear algebra
    • Statistics
    • Optimization
  3. Bayesian decision theory (click here)
    • Bayes optimal classifier
    • Naive Bayes
    • Risk minimization
  4. Probability density estimation (click here)
    • Parametric models
    • Non-parametric models
    • Gaussian mixture models and expectation maximization
  5. Supervised learning
    • Regression (click here)
      • Linear regression
      • Probabilistic regression
      • Basis functions: Radial basis functions, polynomial basis functions
    • Classification I
    • Classification II: Deep learning (click here)
      • Perceptrons
      • Multi-layer-perceptrons and back-propagation
      • Deep learning application: NLP (word embeddings, text classification, sentiment analysis)
  6. Evaluation of ML models (click here)
    • Out-of-sample testing and cross validation
    • Confusion matrices
    • Evaluation metrics: Precision, recall, F1 score, ROC, accuracy
    • Cost-sensitive evaluation
    • Model selection: Grid search, random search
  7. Unsupervised learning
    • Clustering (click here)
      • k-Means
      • Hierarchical clustering (divisive and agglomerative)
    • Principal component analysis (click here)
  8. Lecture summary
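As a small taste of the unsupervised learning part above, k-means can be sketched in a few lines. This is an illustrative sketch only (not the lecture's reference implementation); the data points are made up.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain k-means: alternate nearest-centroid assignment and mean update."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Distance of every point to every centroid, then nearest assignment.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of the points assigned to it.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Two well-separated point clouds (made-up data).
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 5.2], [5.2, 4.9]])
labels, centroids = kmeans(X, k=2)
```

On data this clearly separated the algorithm recovers the two clouds regardless of the random initialization; with overlapping clusters the result depends on the seed, which is worth experimenting with.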

A list of abbreviations, symbols and mathematical notation used in the context of the slides can be found here. Please find additional material below.
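Topic 6 above lists the standard evaluation metrics; they can all be computed directly from the four cells of a binary confusion matrix. The counts below are made up for illustration.

```python
# Counts from a (made-up) binary confusion matrix.
tp, fp, fn, tn = 40, 10, 20, 30   # true/false positives, false/true negatives

accuracy  = (tp + tn) / (tp + fp + fn + tn)               # 0.7: fraction of correct predictions
precision = tp / (tp + fp)                                # 0.8: predicted positives that are real
recall    = tp / (tp + fn)                                # ~0.667: real positives that were found
f1        = 2 * precision * recall / (precision + recall) # ~0.727: harmonic mean of the two
```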

Assignments ✒️ (Not applicable this semester!)

The assignments are voluntary. All students who choose to participate have to form groups of three to four students (no more, no fewer). The groups do not have to be static; you may form new groups for each assignment. The task descriptions, starter code and data sets for the assignments can be found in the folder 02_exercises. You have two weeks to answer the questions and submit your work. The solutions will be presented and discussed after the submission deadline. Sample solutions will not be uploaded; however, you are free to share correct solutions with your colleagues after they have been graded.

Formal requirements for submissions:

  • Please submit your solutions via Moodle (as a .zip file) as well as in printed form. The .zip file must contain one .pdf file for the pen-and-paper tasks as well as one .py file per programming task. Only pen-and-paper tasks have to be printed, you do not have to print the source code.
  • Only one member of the group has to submit the solutions. Please make sure to specify the matriculation numbers (not the names!) of all group members so that all participants receive the points they deserve!
  • Please refrain from submitting hand-written solutions or images of solutions (.png / .jpg files). Instead, use proper typesetting software such as LaTeX or a comparable program. If you choose to use LaTeX, you may want to use the template files located here for your answers.
  • Code assignments have to be done in Python. Please submit .py files (no Jupyter notebooks).
  • The following packages are allowed for code submissions: numpy, pandas and scipy. Please ask beforehand if you want to use a specific package not mentioned here.
  • Do not use already implemented models (e.g. from scikit-learn).

Please make sure to fulfill the above-mentioned formal requirements. Otherwise, you risk losing points. Submissions which severely violate the specifications might not receive any points at all!

Grading details for assignments

Your homework will be corrected and given back to you. Correct solutions are rewarded with a bonus for the exam, which amounts to at most ten percent of the exam if all solutions you submit are correct (this corresponds to at most six points in the exam). It is still possible to achieve full points in the exam even if you choose not to participate in the assignments (the bonus is purely additional). Below you find the function which is used to compute the bonus as well as a legend which explains the components. Please note that this is not a linear function.

| Parameter (explanation)                      | Value     |
|----------------------------------------------|-----------|
| Score achieved in the assignments            | up to you |
| Maximum attainable points in the assignments | 40        |
| Bonus points attained for the exam           | up to you |
| Maximum attainable bonus points for the exam | 6         |

Please note: The bonus points will be taken into account in case you have to repeat the exam (i.e. they do not expire if you fail the first attempt). 🚩 Very important: 🚩 Unsurprisingly, the solutions have to be your own work. If you plagiarize in the assignments, you will lose all bonus points!

Exam 📝

The exam is going to take 60 minutes. The maximum attainable score is 60 points, so you have one minute per point. Important: Keep your answers short and simple in order not to lose too much valuable time. The exam questions will be given in English, but you may answer them in either English or German (you are also allowed to mix the languages). Please do not translate domain-specific technical terms, in order to avoid confusion. Please answer all questions on the task sheets (you may also write on the backs of the sheets).

Exam preparation:

  • You will not be asked for any derivations; rather, I want to test whether you understand the general concepts.
  • The exam will contain a mix of multiple choice questions, short answer questions and calculations.
  • Make sure you can answer the self-test questions provided for each topic. There won't be sample solutions for those questions! (That would undermine the purpose of self-test questions.)
  • Some of the slides give you important hints (upper left corner):
    • A slide marked with symbol (1) provides in-depth information which you do not have to know by heart (think of it as additional material for the sake of completeness).
    • Symbol (2) indicates very important content. Make sure you understand it!
  • Make sure you understand the homework assignments.
  • Work through the list of 150+ terms which you should be able to explain.
  • Solve the mock exam which is officially provided. The solutions will be discussed in the last session of the lecture.


Auxiliary material for the exam:

  • Non-programmable pocket calculator
  • Two-sided hand-written cheat sheet (you may note whatever you want)

Exam grading

Since the lecture Applied Machine Learning Fundamentals is part of a bigger module (Machine Learning Fundamentals, W3WI_DS304), it is not graded individually. Instead, the score you achieve in the exam (at most 60 points) will be added to the points you receive in the second element of the module, the Data Exploration Project in the 4th semester (cf. below), which is also worth at most 60 points. Your performance in both elements combined determines the eventual grade. Please note: Even with bonus points included, it is not possible to get more than 60 points for the exam.

Please refer to the official DHBW data science module catalogue for further details.

Python Code 🐍

Machine learning algorithms (probably all algorithms) are easier to understand if you see them implemented. Please find Python implementations of some of the algorithms (which are not part of the assignments) in the folder 06_python. Play around with the hyper-parameters of the algorithms and try different data sets in order to get a better feeling for how the algorithms work. Also, step through the code line by line in a debugger and check what each line does.
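To get a feeling for what "playing with hyper-parameters" means, here is a small sketch (not taken from the 06_python folder; data and defaults are made up) of batch gradient descent for linear regression. The learning rate and the number of iterations are the hyper-parameters to experiment with.

```python
import numpy as np

def fit_linear(X, y, lr=0.01, n_iter=2000):
    """Fit y = w0 + w1 * x with batch gradient descent on the MSE."""
    A = np.c_[np.ones(len(X)), X]                 # design matrix with a bias column
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = 2.0 / len(y) * A.T @ (A @ w - y)   # gradient of the mean squared error
        w -= lr * grad                            # lr is the step-size hyper-parameter
    return w

# Noise-free toy data generated from y = 3x - 1.
X = np.arange(10, dtype=float)
y = 3.0 * X - 1.0
w = fit_linear(X, y)   # w[0] -> about -1 (bias), w[1] -> about 3 (slope)
```

Try a larger learning rate such as lr=0.1 here: with these unscaled inputs the updates overshoot and the weights blow up, which is exactly the kind of behaviour worth observing first-hand.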

Literature and Recommended Reading 📚

You do not need to buy any books for the lecture, most resources are available online. Please find a curated list below:

| Title                                          | Author(s)                | Publisher             | View online |
|------------------------------------------------|--------------------------|-----------------------|-------------|
| Deep Learning                                  | Goodfellow et al. (2016) | MIT Press             | click here  |
| The Elements of Statistical Learning           | Hastie et al. (2008)     | Springer              | click here  |
| Machine Learning                               | Mitchell (1997)          | McGraw-Hill           | click here  |
| Machine Learning - A Probabilistic Perspective | Murphy (2012)            | MIT Press             | click here  |
| Mathematics for Machine Learning               | Deisenroth et al. (2019) | Cambridge Univ. Press | click here  |
| Pattern Recognition and Machine Learning       | Bishop (2006)            | Springer              | click here  |
| Probabilistic Graphical Models                 | Koller et al. (2009)     | MIT Press             | click here  |
| Reinforcement Learning - An Introduction       | Sutton et al. (2014)     | MIT Press             | click here  |

🔗 YouTube resources:

🔗 Interesting papers:

In general, the Coursera machine learning course by Andrew Ng is highly recommended. You can get a nice certificate if you want (around $60), but you can also participate in the course for free (without getting the certificate).



In the exam you will not be asked about content which was not discussed in the lecture. Regard the literature as additional resources in case you want to dig deeper into specific topics. Please give me a hint if you feel that some important resources are missing; I am happy to add them here.

Data Exploration Project (4th Semester) 📁

Please find the material for the project in the folder 05_project.

The material contains:

  • Organization and goals of the project
  • List of topic suggestions
  • Submission details
  • Grading details

Additional Material 🎁

Have a look at the following slides (unless stated otherwise, these slides are not relevant for the exam):

  1. Support vector machines (click here, click here)
    • Linear SVMs
    • Non-linear SVMs and the kernel trick
    • Soft-margin SVMs
  2. Reinforcement learning (click here)
    • Markov decision processes
    • Algorithms:
      • Policy iteration and value iteration
      • Q-learning and Q-networks
      • Policy gradient methods
  3. Probabilistic graphical models (click here)
    • Bayesian networks (representation and inference)
    • Hidden Markov models and the Viterbi algorithm
  4. Apriori / association rules (click here)
  5. Data preprocessing (click here)
    • Data mining processes (KDD, CRISP-DM)
    • Data cleaning
    • Data transformation (e.g. normalization, discretization)
    • Data reduction and feature subset selection
    • Data integration
  6. Advanced regression (click here)
    • Bayesian regression
    • Kernel regression
    • Gaussian process regression
    • Support vector regression
  7. Advanced deep learning (not yet available)
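As an example from the reinforcement learning material above, value iteration can be sketched in a few lines on a tiny deterministic MDP. The chain of states and the rewards below are made up purely for illustration.

```python
# Tiny deterministic chain MDP (made up): states 0..3, state 3 is terminal.
# Two actions: 0 = left, 1 = right. Entering state 3 yields reward 1.
GAMMA = 0.9   # discount factor

def step(s, a):
    """Deterministic transition: returns (next_state, reward)."""
    s_next = max(s - 1, 0) if a == 0 else min(s + 1, 3)
    return s_next, (1.0 if s_next == 3 else 0.0)

V = [0.0] * 4                      # state values, V[3] stays 0 (terminal)
for _ in range(100):               # value-iteration sweeps
    for s in range(3):             # skip the terminal state
        candidates = []
        for a in (0, 1):
            s_next, r = step(s, a)
            candidates.append(r + GAMMA * V[s_next])
        V[s] = max(candidates)     # Bellman optimality backup

# Converges to V ≈ [0.81, 0.9, 1.0, 0.0]: values decay with distance to the goal.
```

The greedy policy with respect to the converged values is "always go right", as one would expect for this chain.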

Frequently Asked Questions (FAQ) ⚠️

Q: Can we get sample solutions for the self-test questions?
A: No. The goal of those questions is that you deepen your knowledge of the contents. If answers were provided, you would probably not answer the questions on your own.

Bugs and Errors ๐Ÿž

Help me improve the lecture. Please feel free to file an issue in case you spot any errors in the slides, exercises or code. Thank you very much in advance! Please do not open issues for questions concerning the content! Either use the Moodle forum or send me an e-mail for that ([email protected]).

© 2022 Daniel Wehner
