
ap229997 / LanguageModel-using-Attention

Licence: other
PyTorch implementation of a basic language model using attention in an LSTM network

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to LanguageModel-using-Attention

Awesome Speech Recognition Speech Synthesis Papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
Stars: ✭ 2,085 (+7622.22%)
Mutual labels:  language-model, attention-mechanism
Awesome Bert Nlp
A curated list of NLP resources focused on BERT, attention mechanism, Transformer networks, and transfer learning.
Stars: ✭ 567 (+2000%)
Mutual labels:  language-model, attention-mechanism
Neural sp
End-to-end ASR/LM implementation with PyTorch
Stars: ✭ 408 (+1411.11%)
Mutual labels:  language-model, attention-mechanism
Attention Mechanisms
Implementations for a family of attention mechanisms, suitable for all kinds of natural language processing tasks and compatible with TensorFlow 2.0 and Keras.
Stars: ✭ 203 (+651.85%)
Mutual labels:  language-model, attention-mechanism
hexia
Mid-level PyTorch Based Framework for Visual Question Answering.
Stars: ✭ 24 (-11.11%)
Mutual labels:  attention-mechanism
CharLM
Character-aware Neural Language Model implemented by PyTorch
Stars: ✭ 32 (+18.52%)
Mutual labels:  language-model
Neural-Chatbot
A Neural Network based Chatbot
Stars: ✭ 68 (+151.85%)
Mutual labels:  attention-mechanism
amta-net
Asymmetric Multi-Task Attention Network for Prostate Bed Segmentation in CT Images
Stars: ✭ 26 (-3.7%)
Mutual labels:  attention-mechanism
CIAN
Implementation of the Character-level Intra Attention Network (CIAN) for Natural Language Inference (NLI) upon SNLI and MultiNLI corpus
Stars: ✭ 17 (-37.04%)
Mutual labels:  attention-mechanism
swig-srilm
SWIG Wrapper for the SRILM toolkit
Stars: ✭ 33 (+22.22%)
Mutual labels:  language-model
LSTM-Attention
A Comparison of LSTMs and Attention Mechanisms for Forecasting Financial Time Series
Stars: ✭ 53 (+96.3%)
Mutual labels:  attention-mechanism
memory-compressed-attention
Implementation of Memory-Compressed Attention, from the paper "Generating Wikipedia By Summarizing Long Sequences"
Stars: ✭ 47 (+74.07%)
Mutual labels:  attention-mechanism
ChangeFormer
Official PyTorch implementation of our IGARSS'22 paper: A Transformer-Based Siamese Network for Change Detection
Stars: ✭ 220 (+714.81%)
Mutual labels:  attention-mechanism
calm
Context Aware Language Models
Stars: ✭ 29 (+7.41%)
Mutual labels:  language-model
uniformer-pytorch
Implementation of Uniformer, a simple attention and 3d convolutional net that achieved SOTA in a number of video classification tasks, debuted in ICLR 2022
Stars: ✭ 90 (+233.33%)
Mutual labels:  attention-mechanism
STAM-pytorch
Implementation of STAM (Space Time Attention Model), a pure and simple attention model that reaches SOTA for video classification
Stars: ✭ 109 (+303.7%)
Mutual labels:  attention-mechanism
Optic-Disc-Unet
Attention U-Net model with post-processing for retina optic disc segmentation
Stars: ✭ 77 (+185.19%)
Mutual labels:  attention-mechanism
lm-scorer
📃Language Model based sentences scoring library
Stars: ✭ 264 (+877.78%)
Mutual labels:  language-model
NARRE
This is our implementation of NARRE: Neural Attentional Regression with Review-level Explanations
Stars: ✭ 100 (+270.37%)
Mutual labels:  attention-mechanism
personality-prediction
Experiments for automated personality detection using Language Models and psycholinguistic features on various famous personality datasets including the Essays dataset (Big-Five)
Stars: ✭ 109 (+303.7%)
Mutual labels:  language-model

LanguageModel-using-Attention

PyTorch implementation of a basic language model using attention in an LSTM network

Introduction

This repository contains code for a basic language model that predicts the next word given its context. The architecture is an LSTM network with attention. Sentences can be of variable length; this is handled by padding the additional steps in the sequence. The model is trained on the text of the book The Mercer Boys at Woodcrest by Capwell Wyckoff, available at http://www.gutenberg.org. Any other e-book or plain-text file from another source can also be used to train the network.
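
As a rough illustration of the idea (hypothetical module and dimension names, not the code in this repository), a next-word predictor of this kind can be sketched in PyTorch as follows:

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionLM(nn.Module):
    # Toy next-word model: embedding -> LSTM -> attention over the hidden
    # states -> logits over the vocabulary. Sketch only; layer sizes and
    # names are assumptions, not taken from this repository.
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=512):
        super(AttentionLM, self).__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.attn = nn.Linear(hidden_dim, hidden_dim)
        self.out = nn.Linear(2 * hidden_dim, vocab_size)

    def forward(self, tokens):                        # tokens: (batch, seq_len)
        emb = self.embed(tokens)                      # (batch, seq_len, embed_dim)
        states, _ = self.lstm(emb)                    # (batch, seq_len, hidden_dim)
        query = states[:, -1:, :]                     # last hidden state as query
        scores = torch.bmm(self.attn(states), query.transpose(1, 2))
        weights = F.softmax(scores, dim=1)            # attention over time steps
        context = (weights * states).sum(dim=1)       # weighted sum of states
        combined = torch.cat([context, states[:, -1, :]], dim=-1)
        return self.out(combined)                     # logits for the next word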

Setup

This repository is compatible with Python 2.

  • Follow the instructions on the PyTorch homepage to install PyTorch (for Python 2).
  • The only additional Python package required is nltk, which can be installed with pip (see the command below).
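
For example, assuming pip points at the same Python interpreter you installed PyTorch for:

pip install nltk
python -c "import nltk; nltk.download('punkt')"   # tokenizer data, only needed if nltk.word_tokenize is used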

Data

Download any e-book available at http://www.gutenberg.org in .txt format. Create a new directory data and store the .txt file in it. Any other text source can also be used.
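
For example, with a placeholder book ID (substitute the .txt link of whichever book you pick on gutenberg.org):

mkdir -p data
wget https://www.gutenberg.org/files/BOOK_ID/BOOK_ID-0.txt -P data/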

Process Data

The .txt file is first preprocessed: unwanted tokens are removed, rarely used words are filtered out, and the text is converted into a dictionary format. In addition, the GloVe embeddings need to be loaded.

Create dictionary

To create the dictionary, use the script preprocess_data/create_dictionary.py:

python preprocess_data/create_dictionary.py --data_path path_to_txt_file --dict_file dict_file_name.json --min_occ minimum_occurrence_required
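
Conceptually, the script builds a word-to-index mapping after dropping rare words. A minimal sketch of that idea (illustrative only, not the repository's exact code):

import json
from collections import Counter
from nltk.tokenize import word_tokenize

def build_dictionary(txt_path, dict_path, min_occ=2):
    # Tokenize the book, drop words seen fewer than min_occ times,
    # and save a word -> index mapping as JSON. Requires nltk's 'punkt' data.
    with open(txt_path) as f:
        tokens = word_tokenize(f.read().lower())
    counts = Counter(tokens)
    word2id = {'<pad>': 0, '<unk>': 1}     # reserved padding / unknown slots
    for word, count in counts.items():
        if count >= min_occ:
            word2id[word] = len(word2id)
    with open(dict_path, 'w') as f:
        json.dump(word2id, f)
    return word2id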

Create GloVe dictionary

To create the GloVe dictionary, download the original GloVe file and run the script preprocess_data/create_gloves.py:

wget http://nlp.stanford.edu/data/glove.42B.300d.zip -P data/
unzip data/glove.42B.300d.zip -d data/
python preprocess_data/create_gloves.py --data_path path_to_txt_file --glove_in data/glove.42B.300d.txt --glove_out data/glove_dict.pkl

If there is an issue downloading with the commands above, the GloVe file can instead be downloaded manually from here.
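
The conversion essentially filters the large GloVe text file down to the words in your dictionary and pickles the result. Roughly (a sketch under that assumption, not the repository's create_gloves.py):

import pickle
import numpy as np

def build_glove_dict(glove_txt_path, vocab, out_path):
    # Keep only the vectors for words that appear in vocab and pickle them,
    # so the full GloVe file never has to be re-parsed at training time.
    glove = {}
    with open(glove_txt_path) as f:
        for line in f:
            parts = line.rstrip().split(' ')
            if parts[0] in vocab:
                glove[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    with open(out_path, 'wb') as f:
        pickle.dump(glove, f)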

Train the model

To train the model, run the following command:

python main.py --gpu gpu_id_to_use --use_cuda True --data_path path_to_txt_file --glove_path data/glove_dict.pkl --dict_path path_to_dict_file

The other parameters and their default values are specified in main.py; refer to it for details.
The saved models are available here.
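
For reference, training a model like the sketch in the Introduction typically pairs padded batches with a loss that skips the padding index. A minimal hypothetical loop (not the repository's main.py):

import torch
import torch.nn as nn

# model, loader and vocab_size are assumed to exist; each batch holds padded
# context tokens (batch, seq_len) and the index of the next word (batch,).
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = AttentionLM(vocab_size).to(device)            # sketch model from above
criterion = nn.CrossEntropyLoss(ignore_index=0)       # skip padded targets (index 0)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(10):
    for contexts, targets in loader:
        contexts, targets = contexts.to(device), targets.to(device)
        logits = model(contexts)                      # (batch, vocab_size)
        loss = criterion(logits, targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    torch.save(model.state_dict(), 'model_epoch%d.pt' % epoch)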
