A Code-First Introduction to NLP course

A Code-First Intro to Natural Language Processing

You can find out about the course in this blog post and all lecture videos are available here.

This course was originally taught in the University of San Francisco's Master of Science in Data Science program in summer 2019. The course is taught in Python with Jupyter Notebooks, using libraries such as sklearn, nltk, pytorch, and fastai.

Table of Contents

The following topics will be covered:

1. What is NLP?

  • A changing field
  • Resources
  • Tools
  • Python libraries
  • Example applications
  • Ethics issues

2. Topic Modeling with NMF and SVD

  • Stop words, stemming, & lemmatization
  • Term-document matrix
  • Term Frequency-Inverse Document Frequency (TF-IDF)
  • Singular Value Decomposition (SVD)
  • Non-negative Matrix Factorization (NMF)
  • Truncated SVD, Randomized SVD
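
To give a feel for the first few items in this unit, here is a minimal sketch of a term-document matrix with TF-IDF weighting, written with only the Python standard library. The toy corpus, the stop-word list, and the plain `log(N / df)` weighting are assumptions made for illustration; libraries such as sklearn use smoothed variants, and the course notebooks may differ.

```python
import math
import re
from collections import Counter

# Toy corpus and stop-word list, both invented for this illustration.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stocks fell as the market closed",
]
stop_words = {"the", "on", "as"}

def tokenize(text):
    """Lowercase, keep runs of letters, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in stop_words]

tokenized = [tokenize(d) for d in docs]
vocab = sorted({t for doc in tokenized for t in doc})

# Term-document matrix of raw counts: one row per document, one column per term.
counts = [[c[term] for term in vocab] for c in map(Counter, tokenized)]

# Document frequency: the number of documents that contain each term.
n_docs = len(docs)
df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]

# Unsmoothed inverse document frequency (an assumption; smoothed forms are common).
idf = [math.log(n_docs / d) for d in df]

# TF-IDF: raw term count scaled by how rare the term is across documents.
tfidf = [[row[j] * idf[j] for j in range(len(vocab))] for row in counts]
```

A term that appears in several documents ("cat") gets a lower weight than one that appears in only one ("mat"). Factorizations such as SVD or NMF would then be applied to a matrix like `tfidf`.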

3. Sentiment classification with Naive Bayes, Logistic regression, and ngrams

  • Sparse matrix storage
  • Counters
  • the fastai library
  • Naive Bayes
  • Logistic regression
  • Ngrams
  • Logistic regression with Naive Bayes features, with trigrams
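
As a rough illustration of this unit, the sketch below trains a multinomial Naive Bayes classifier on unigram and bigram counts using only the standard library. The four training sentences, the helper names (`ngrams`, `log_score`, `predict`), and the add-one smoothing are assumptions for this example, not the course's implementation.

```python
import math
from collections import Counter

def ngrams(tokens, n_max=2):
    """Return all 1..n_max-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

# Toy labeled reviews, invented for this illustration.
train = [
    ("a great and moving film", "pos"),
    ("great acting and a great story", "pos"),
    ("a dull and boring film", "neg"),
    ("boring story and bad acting", "neg"),
]

feature_counts = {"pos": Counter(), "neg": Counter()}
doc_counts = Counter()
for text, label in train:
    feature_counts[label].update(ngrams(text.split()))
    doc_counts[label] += 1

vocab = set(feature_counts["pos"]) | set(feature_counts["neg"])

def log_score(text, label):
    """Log prior plus summed log likelihoods, with add-one smoothing."""
    total = sum(feature_counts[label].values())
    score = math.log(doc_counts[label] / sum(doc_counts.values()))
    for feat in ngrams(text.split()):
        if feat in vocab:  # skip features never seen in training
            score += math.log((feature_counts[label][feat] + 1) / (total + len(vocab)))
    return score

def predict(text):
    """Return the label with the highest log score."""
    return max(feature_counts, key=lambda label: log_score(text, label))
```

Calling `predict("a great story")` scores each class and returns the more likely label. With real data, the counts would usually live in a sparse matrix instead of Python dictionaries.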

4. Regex (and re-visiting tokenization)
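
A small sketch of regex-based tokenization with Python's built-in `re` module follows. The pattern is an assumption chosen for illustration (words with an optional internal apostrophe, numbers with an optional decimal part, and single punctuation marks); it is not the tokenizer used in the course.

```python
import re

# Illustrative pattern: a word (optionally with one internal apostrophe),
# a number (optionally with a decimal part), or a single punctuation character.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?|\d+(?:\.\d+)?|[^\w\s]")

def tokenize(text):
    """Return the non-overlapping matches of TOKEN_RE, scanning left to right."""
    return TOKEN_RE.findall(text)

tokens = tokenize("Don't panic: it costs $3.50, not $35!")
# tokens == ["Don't", "panic", ":", "it", "costs", "$", "3.50", ",", "not", "$", "35", "!"]
```

The groups are non-capturing (`(?:...)`) so that `findall` returns whole matches, not group contents.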

5. Language modeling & sentiment classification with deep learning

  • Language model
  • Transfer learning
  • Sentiment classification

6. Translation with RNNs

  • Review Embeddings
  • BLEU metric
  • Teacher Forcing
  • Bidirectional
  • Attention

7. Translation with the Transformer architecture

  • Transformer Model
  • Multi-head attention
  • Masking
  • Label smoothing
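
Two of these ideas are simple enough to sketch in plain Python. The function names below are hypothetical, and the code shows one common form of each: a causal (look-ahead) mask, and label smoothing that spreads `eps` uniformly over all classes. Other variants exist, such as padding masks or spreading `eps` over the non-target classes only, and the course notebooks may use a different form.

```python
def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]

def smooth_labels(target, n_classes, eps=0.1):
    """Put 1 - eps on the target class, then spread eps uniformly over all classes."""
    dist = [eps / n_classes] * n_classes
    dist[target] += 1.0 - eps
    return dist

mask = causal_mask(3)
# Row i allows attention to positions 0..i only.

soft = smooth_labels(target=2, n_classes=4, eps=0.1)
# The target class keeps most of the probability mass; the rest is shared evenly.
```

In a real model the mask would be applied to the attention scores before the softmax, and the smoothed distribution would replace the one-hot target in the loss.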

8. Bias & ethics in NLP

  • Bias in word embeddings
  • Types of bias
  • Attention economy
  • Drowning in fraudulent/fake info

Why is this course taught in a weird order?

This course is structured with a top-down teaching method, which is different from how most math courses operate. Typically, in a bottom-up approach, you first learn all the separate components you will be using, and then you gradually build them up into more complex structures. The problems with this are that students often lose motivation, don't have a sense of the "big picture", and don't know what they'll need.

Harvard Professor David Perkins has a book, Making Learning Whole, in which he uses baseball as an analogy. We don't require kids to memorize all the rules of baseball and understand all the technical details before we let them play the game. Rather, they start playing with just a general sense of it, and then gradually learn more rules and details as time goes on.

If you took the fast.ai deep learning course, this is the same top-down approach we used there. You can hear more about my teaching philosophy in this blog post or this talk I gave at the San Francisco Machine Learning meetup.

All that to say, don't worry if you don't understand everything at first! You're not supposed to. We will start using some "black boxes" and then we'll dig into the lower level details later.

To start, focus on what things DO, not what they ARE.
