
DeepsMoseli / Bidirectiona-LSTM-for-text-summarization-

Licence: MIT license
A bidirectional encoder-decoder LSTM neural network is trained for text summarization on the CNN/DailyMail dataset. (MIT808 project)

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives to or similar to Bidirectiona-LSTM-for-text-summarization-

Ntm One Shot Tf
One Shot Learning using Memory-Augmented Neural Networks (MANN) based on Neural Turing Machine architecture in Tensorflow
Stars: ✭ 238 (+226.03%)
Mutual labels:  lstm
Har Stacked Residual Bidir Lstms
Using deep stacked residual bidirectional LSTM cells (RNN) with TensorFlow, we do Human Activity Recognition (HAR). Classifying the type of movement amongst 6 categories or 18 categories on 2 different datasets.
Stars: ✭ 250 (+242.47%)
Mutual labels:  lstm
NLP-Extractive-NEWS-summarization-using-MMR
A simple python implementation of the Maximal Marginal Relevance (MMR) baseline system for text summarization.
Stars: ✭ 59 (-19.18%)
Mutual labels:  text-summarization
Trafficflowprediction
Traffic Flow Prediction with Neural Networks (SAEs, LSTM, GRU).
Stars: ✭ 242 (+231.51%)
Mutual labels:  lstm
Nlstm
Nested LSTM Cell
Stars: ✭ 246 (+236.99%)
Mutual labels:  lstm
Automatic speech recognition
End-to-end Automatic Speech Recognition for Mandarin and English in Tensorflow
Stars: ✭ 2,751 (+3668.49%)
Mutual labels:  lstm
Crnn Audio Classification
UrbanSound classification using Convolutional Recurrent Networks in PyTorch
Stars: ✭ 235 (+221.92%)
Mutual labels:  lstm
lstm har
LSTM based human activity recognition using smart phone sensor dataset
Stars: ✭ 20 (-72.6%)
Mutual labels:  lstm
Pytorch Seq2seq
Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.
Stars: ✭ 3,418 (+4582.19%)
Mutual labels:  lstm
Scripts-for-extractive-summarization
Scripts for an upcoming blog "Extractive vs. Abstractive Summarization" for RaRe Technologies.
Stars: ✭ 12 (-83.56%)
Mutual labels:  text-summarization
Caption generator
A modular library built on top of Keras and TensorFlow to generate a caption in natural language for any input image.
Stars: ✭ 243 (+232.88%)
Mutual labels:  lstm
Tensorflow novelist
Write plays in the style of Shakespeare! Amazingly, it can also write Jin Yong wuxia novels! Star it now, updates ongoing!!
Stars: ✭ 244 (+234.25%)
Mutual labels:  lstm
TextSumma
reimplementing Neural Summarization by Extracting Sentences and Words
Stars: ✭ 16 (-78.08%)
Mutual labels:  text-summarization
Pytorch Sentiment Analysis
Tutorials on getting started with PyTorch and TorchText for sentiment analysis.
Stars: ✭ 3,209 (+4295.89%)
Mutual labels:  lstm
Sumrized
Automatic Text Summarization (English/Arabic).
Stars: ✭ 37 (-49.32%)
Mutual labels:  text-summarization
Qa match
A simple effective ToolKit for short text matching
Stars: ✭ 235 (+221.92%)
Mutual labels:  lstm
Deepjazz
Deep learning driven jazz generation using Keras & Theano!
Stars: ✭ 2,766 (+3689.04%)
Mutual labels:  lstm
dltf
Hands-on in-person workshop for Deep Learning with TensorFlow
Stars: ✭ 14 (-80.82%)
Mutual labels:  lstm
Persian-Summarization
Statistical and Semantical Text Summarizer in Persian Language
Stars: ✭ 38 (-47.95%)
Mutual labels:  text-summarization
email-summarization
A module for E-mail Summarization which uses clustering of skip-thought sentence embeddings.
Stars: ✭ 81 (+10.96%)
Mutual labels:  text-summarization

Bidirectiona-LSTM-for-text-summarization-

A bidirectional encoder-decoder LSTM neural network is trained for text summarization on the CNN/DailyMail dataset.

• The unprocessed dataset can be downloaded here.

• The version (only CNN articles and summaries) used in this project can be found here.

1) Word embeddings

The word2vec skip-gram algorithm is used to embed the encoder input sequence. A shallow neural network is trained to predict context words given a current word; after training, the hidden layer is used as the embedding layer. The embedding size was kept at 128. Skip-gram was pre-trained on both the article and gold-summary words. (Figure: Skip-gram model)
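
A minimal sketch of how such skip-gram embeddings could be trained with gensim; the toy corpus and all parameter values other than the 128-dimensional size and sg=1 are assumptions, not taken from the project's word2vec.py:

```python
from gensim.models import Word2Vec

# Toy stand-in for the tokenized CNN articles + gold summaries; in the project the
# corpus would come from the preprocessed data produced by cnn_daily_load.py.
sentences = [
    ["police", "say", "the", "suspect", "was", "arrested", "on", "tuesday"],
    ["the", "suspect", "was", "arrested", "after", "a", "short", "chase"],
]

w2v = Word2Vec(
    sentences,
    vector_size=128,  # embedding size kept at 128, as stated above (size=128 in older gensim)
    sg=1,             # sg=1 selects the skip-gram algorithm (predict context from current word)
    window=5,         # assumed context window
    min_count=1,      # keep all tokens for this toy corpus
    workers=2,
)

vector = w2v.wv["suspect"]  # the trained hidden layer acts as the embedding lookup
```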

For the decoder input and output, one-hot encoding of the summary words was used. The vocabulary size was initially 50k but was reduced to 30k due to memory constraints. One-hot encoding was also chosen to make it straightforward to add an attention layer later.
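
A small sketch of one way the one-hot decoder targets could be built with Keras utilities; the tokenizer settings and toy summaries are placeholders, and only the 30k vocabulary size comes from the text above:

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.utils import to_categorical

VOCAB_SIZE = 30000  # reduced from 50k to 30k for memory, as noted above

# Toy summaries; in the project these would be the gold summaries from CNN/DailyMail.
summaries = ["suspect arrested after chase", "police arrest suspect on tuesday"]

tok = Tokenizer(num_words=VOCAB_SIZE, oov_token="<unk>")
tok.fit_on_texts(summaries)
seqs = tok.texts_to_sequences(summaries)

# One-hot encode every decoder token: each summary becomes (timesteps, VOCAB_SIZE).
one_hot = [to_categorical(s, num_classes=VOCAB_SIZE) for s in seqs]
print(one_hot[0].shape)
```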

2) Encoder - decoder LSTM

We use a bidirectional encoder LSTM with state size = 128, dropout = 0.2, and a tanh activation. The decoder is a unidirectional LSTM with size = 128, dropout = 0.2, and a softmax activation. (Figure: BiEnDeLSTM network)
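
A rough Keras sketch of the bidirectional-encoder / unidirectional-decoder arrangement described above, without the attention layer of section 3. The wiring, and the 256-unit decoder needed to accept the concatenated encoder states, are assumptions rather than the repository's lstm_Attention.py:

```python
from tensorflow.keras.layers import Input, LSTM, Bidirectional, Dense, Concatenate
from tensorflow.keras.models import Model

EMBED_DIM, HIDDEN, VOCAB_SIZE = 128, 128, 30000

# Encoder: bidirectional LSTM over the skip-gram-embedded article tokens.
enc_in = Input(shape=(None, EMBED_DIM))
enc_out, fh, fc, bh, bc = Bidirectional(
    LSTM(HIDDEN, return_sequences=True, return_state=True,
         dropout=0.2, activation="tanh")
)(enc_in)
state_h = Concatenate()([fh, bh])  # forward + backward hidden states
state_c = Concatenate()([fc, bc])  # forward + backward cell states

# Decoder: unidirectional LSTM over one-hot summary tokens, initialised with the
# concatenated encoder states (hence 2 * HIDDEN units here, whereas the README
# reports 128), followed by a softmax over the vocabulary.
dec_in = Input(shape=(None, VOCAB_SIZE))
dec_out, _, _ = LSTM(2 * HIDDEN, return_sequences=True, return_state=True,
                     dropout=0.2)(dec_in, initial_state=[state_h, state_c])
dec_pred = Dense(VOCAB_SIZE, activation="softmax")(dec_out)

model = Model([enc_in, dec_in], dec_pred)
model.summary()
```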

3) Attention Layer

An attention layer sits between the encoder and decoder and attends over the source sequence's hidden states. Because the skip-gram embeddings and the one-hot vectors are not the same size, PCA is applied to the embeddings so they can be multiplied with the one-hot vectors to obtain attention weights and context vectors. The final prediction of each output word in the decoder sequence is made by the attention layer, which gives the decoder access to the individual encoder states. (Figure: BiEnDeLSTM + attention mechanism; forgive the figure if it is unclear or messy.)
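
For reference, the additive (Bahdanau-style) attention of reference 1 below can be written as follows, where h_i are the encoder hidden states, s_{t-1} is the previous decoder state, and W_a, U_a, v_a, W_o are learned parameters. This is the standard formulation, not necessarily the exact PCA-based variant implemented in lstm_Attention.py:

```latex
e_{t,i} = v_a^{\top} \tanh\left(W_a s_{t-1} + U_a h_i\right)        % alignment score
\alpha_{t,i} = \frac{\exp(e_{t,i})}{\sum_{j} \exp(e_{t,j})}         % attention weights
c_t = \sum_{i} \alpha_{t,i} \, h_i                                   % context vector
p(y_t \mid y_{<t}, x) = \operatorname{softmax}\left(W_o [s_t; c_t]\right)
```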

4) Training

Download the data and run the following scripts in this order:

  1. python cnn_daily_load.py
  2. python word2vec.py
  3. python lstm_Attention.py

Dependencies

  • tensorflow, keras, sklearn
  • numpy, pandas, pyrouge, matplotlib
  • regex (re), NLTK, gensim

LSTM encoder-decoder architecture and training parameters (a compile/fit sketch follows this list):

  • batch_size = 50
  • epochs = 20
  • hidden_units = 128
  • learning_rate = 0.005
  • clip_norm = 2.0
  • test_size = 0.2
  • optimizer = RMSprop
  • dropout = 0.2 (both encoder and decoder during training)
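
A hedged illustration of how these settings could be wired together in Keras, continuing from the encoder-decoder sketch in section 2; model, X_enc, X_dec, and Y stand in for the objects produced by the scripts above and are not the repository's actual variable names:

```python
from tensorflow.keras.optimizers import RMSprop
from sklearn.model_selection import train_test_split

# Hyperparameters copied from the list above.
BATCH_SIZE, EPOCHS, LR, CLIP_NORM, TEST_SIZE = 50, 20, 0.005, 2.0, 0.2

opt = RMSprop(learning_rate=LR, clipnorm=CLIP_NORM)
model.compile(optimizer=opt, loss="categorical_crossentropy", metrics=["accuracy"])

# X_enc (embedded articles), X_dec (one-hot, shifted summaries) and Y (one-hot targets)
# are placeholders for the arrays produced by cnn_daily_load.py and word2vec.py.
X_enc_tr, X_enc_te, X_dec_tr, X_dec_te, Y_tr, Y_te = train_test_split(
    X_enc, X_dec, Y, test_size=TEST_SIZE, random_state=42
)

model.fit([X_enc_tr, X_dec_tr], Y_tr,
          batch_size=BATCH_SIZE, epochs=EPOCHS,
          validation_data=([X_enc_te, X_dec_te], Y_te))
```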

5) Results

The generated summaries are readable and make sense; however, they contain repetitions and sometimes skip over important facts or get the plot wrong altogether.

References

  1. Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural Machine Translation by Jointly Learning to Align and Translate."
  2. Zixiang Ding, Rui Xia, Jianfei Yu, Xiang Li, and Jian Yang. "Densely Connected Bidirectional LSTM with Applications to Sentence Classification."