deeplearning-nlp-models

A small, interpretable codebase containing the re-implementation of a few "deep" NLP models in PyTorch.

This is presented as an (incomplete) starting point for those interested in getting into the weeds of DL architectures in NLP. Annotated models are presented along with some notes.

Links are provided to run these models on Colab with GPUs 🌩 via notebooks.

Current models: word2vec, CNNs, transformer, gpt. (Work in progress)

Meta

(Image: "BERT: Reading. Comprehending.")

Note: These are toy versions of each model.

Models

These NLP models are presented chronologically and, as you might expect, build on one another.

| Model Class | Model | Year |
|---|---|---|
| Embeddings | 1. Word2Vec Embeddings (Self-Supervised Learning) | 2013 |
| CNNs | 2. CNN-based Text Classification (Binary Classification) | 2014 |
| Transformers | 3. The O.G. Transformer (Machine Translation) | 2017 |
| Transformers | 4. OpenAI's GPT Model (Language Model) | 2018, 2019, 2020 |
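
To make the "self-supervised" label on the first model concrete: word2vec needs no hand-made labels, because the training targets come from the text itself. The sketch below assumes the skip-gram formulation and shows how (center, context) pairs can be generated from a token list. The function name `skipgram_pairs` and the `window` parameter are illustrative and are not taken from this repository's code.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs for tokens within `window` positions.

    Illustrative helper only; not the repository's implementation.
    """
    pairs = []
    for i, center in enumerate(tokens):
        # Clamp the context window to the sentence boundaries.
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a token is not its own context
                pairs.append((center, tokens[j]))
    return pairs


if __name__ == "__main__":
    sentence = "the quick brown fox".split()
    for center, context in skipgram_pairs(sentence, window=1):
        print(center, "->", context)
```

Each pair becomes one training example: the model sees the center word and is asked to score the context word highly. Practical implementations typically add negative sampling and subsampling of frequent words, which this sketch leaves out.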

Features

This repository has the following features:

  • model overviews: A brief overview of each model's motivation and design is provided in a separate README.md file.
  • Jupyter notebooks (easy to run on Colab w/ GPUs): Jupyter notebooks showing how to run the models, along with some simple analyses of the model results.
  • self-contained: Includes the tokenizers, dataset loaders, dictionaries, and all the custom utilities required for each problem.
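
As a rough picture of what "self-contained" utilities involve, here is a minimal sketch of a whitespace tokenizer and a token-to-index dictionary with an unknown-token fallback. Every name in it (`Vocabulary`, `tokenize`, `build_vocab`, `min_count`, the `<unk>` token) is hypothetical and chosen for illustration; the repository's own classes in nlpmodels/utils may look quite different.

```python
from collections import Counter
from typing import Dict, Iterable, List


class Vocabulary:
    """Hypothetical token <-> index mapping with an unknown-token fallback."""

    def __init__(self, unk_token: str = "<unk>") -> None:
        self.unk_token = unk_token
        self.token_to_idx: Dict[str, int] = {unk_token: 0}
        self.idx_to_token: List[str] = [unk_token]

    def add(self, token: str) -> int:
        """Register a token (if new) and return its index."""
        if token not in self.token_to_idx:
            self.token_to_idx[token] = len(self.idx_to_token)
            self.idx_to_token.append(token)
        return self.token_to_idx[token]

    def encode(self, tokens: Iterable[str]) -> List[int]:
        """Map tokens to indices; unseen tokens map to the unknown index."""
        unk = self.token_to_idx[self.unk_token]
        return [self.token_to_idx.get(tok, unk) for tok in tokens]


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_vocab(texts: Iterable[str], min_count: int = 1) -> Vocabulary:
    """Build a vocabulary from raw texts, dropping tokens rarer than min_count."""
    counts = Counter(tok for text in texts for tok in tokenize(text))
    vocab = Vocabulary()
    for tok, count in counts.items():
        if count >= min_count:
            vocab.add(tok)
    return vocab


if __name__ == "__main__":
    vocab = build_vocab(["the cat sat", "the dog"])
    print(vocab.encode(tokenize("the bird sat")))
```

The integer indices produced by `encode` are what an embedding layer would consume. Reserving index 0 for unknown tokens is one common convention, not a requirement.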

Endgame

After reviewing these models, the world's your oyster in terms of other models to explore:

Char-RNN, BERT, ELMo, XLNet, all the other BERTs, BART, Performer, T5, etc.

Roadmap

Future models to implement:

  • [ ] Char-RNN (Karpathy)
  • [ ] BERT

Future repo features:

  • [ ] Tensorboard plots
  • [ ] Val set demonstrations
  • [ ] Saving checkpoints / loading models
  • [ ] BPE (from either openai/gpt-2 or Facebook's fairseq library)

Setup

You can install the repo using pip:

pip install git+https://github.com/will-thompson-k/deeplearning-nlp-models 
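
A quick way to confirm the install worked is to check that the package can be found by the interpreter. The sketch below assumes the installed package is importable as `nlpmodels`, which is inferred from the repository layout rather than confirmed; `is_installed` is an illustrative helper, not part of the repository.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be located without importing it."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    # "nlpmodels" is an assumed package name based on the repository layout.
    name = "nlpmodels"
    print(name, "is installed" if is_installed(name) else "was not found")
```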

Structure

Here is a breakdown of the repository:

  • nlpmodels/models: The model code for each paper.
  • nlpmodels/utils: Contains all the auxiliary classes related to building a model, including datasets, vocabulary, tokenizers, samplers and trainer classes. (Note: Most of the non-model files are thrown into utils. I would advise against that in a larger repo.)
  • tests: Light (and by no means comprehensive) test coverage.
  • notebooks: Contains the notebooks and write-ups for each model implementation.

A few useful commands:

  • make test: Run the full suite of tests (you can also use setup.py test and run_tests.sh).
  • make test_light: Run all tests except the regression tests.
  • make lint: Lint the code, if you really like linting (you can also run run_pylint.sh).

Requirements

Python 3.6+

Here are the package requirements (found in requirements.txt):

  • numpy==1.19.1
  • tqdm==4.50.2
  • torch==1.7.0
  • datasets==1.0.2
  • torchtext==0.8.0
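
Because these versions are pinned exactly, a mismatched environment is a likely source of errors. The following sketch compares `name==version` pins against what is installed. It assumes Python 3.8+ for `importlib.metadata` (newer than the 3.6 minimum stated above) and a requirements.txt in the current directory; `parse_pins` and `check_pins` are illustrative names, not part of the repository.

```python
from importlib import metadata  # standard library from Python 3.8 onward
from typing import Dict, Optional, Tuple


def parse_pins(requirements_text: str) -> Dict[str, str]:
    """Extract exact `name==version` pins, ignoring comments and other lines."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def check_pins(pins: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package to (pinned version, installed version or None)."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    # Assumes requirements.txt sits in the current working directory.
    with open("requirements.txt", encoding="utf-8") as handle:
        pins = parse_pins(handle.read())
    for name, (wanted, installed) in check_pins(pins).items():
        status = "ok" if wanted == installed else "MISMATCH"
        print(f"{name}: pinned {wanted}, installed {installed} [{status}]")
```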

Citation

@misc{deeplearning-nlp-models,
  author = {Thompson, Will},
  url = {https://github.com/will-thompson-k/deeplearning-nlp-models},
  year = {2020}
}

License

MIT
