
codedecde / Recognizing-Textual-Entailment

Licence: MIT
A PyTorch implementation of models for Recognizing Textual Entailment on the SNLI corpus

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Recognizing-Textual-Entailment

Bamnet
Code & data accompanying the NAACL 2019 paper "Bidirectional Attentive Memory Networks for Question Answering over Knowledge Bases"
Stars: ✭ 140 (+351.61%)
Mutual labels:  attention-model
Generative Inpainting Pytorch
A PyTorch reimplementation for paper Generative Image Inpainting with Contextual Attention (https://arxiv.org/abs/1801.07892)
Stars: ✭ 242 (+680.65%)
Mutual labels:  attention-model
SANET
"Arbitrary Style Transfer with Style-Attentional Networks" (CVPR 2019)
Stars: ✭ 21 (-32.26%)
Mutual labels:  attention-model
Pytorch Acnn Model
code of Relation Classification via Multi-Level Attention CNNs
Stars: ✭ 170 (+448.39%)
Mutual labels:  attention-model
Generative inpainting
DeepFill v1/v2 with Contextual Attention and Gated Convolution, CVPR 2018, and ICCV 2019 Oral
Stars: ✭ 2,659 (+8477.42%)
Mutual labels:  attention-model
Sinet
Camouflaged Object Detection, CVPR 2020 (Oral & Reported by the New Scientist Magazine)
Stars: ✭ 246 (+693.55%)
Mutual labels:  attention-model
Linear Attention Recurrent Neural Network
A recurrent attention module consisting of an LSTM cell which can query its own past cell states by the means of windowed multi-head attention. The formulas are derived from the BN-LSTM and the Transformer Network. The LARNN cell with attention can be easily used inside a loop on the cell state, just like any other RNN. (LARNN)
Stars: ✭ 119 (+283.87%)
Mutual labels:  attention-model
Compact-Global-Descriptor
Pytorch implementation of "Compact Global Descriptor for Neural Networks" (CGD).
Stars: ✭ 22 (-29.03%)
Mutual labels:  attention-model
Pytorch Batch Attention Seq2seq
PyTorch implementation of batched bi-RNN encoder and attention-decoder.
Stars: ✭ 245 (+690.32%)
Mutual labels:  attention-model
reasoning attention
Unofficial implementations of attention models on the SNLI dataset
Stars: ✭ 34 (+9.68%)
Mutual labels:  attention-model
Snli Entailment
attention model for entailment on SNLI corpus implemented in Tensorflow and Keras
Stars: ✭ 181 (+483.87%)
Mutual labels:  attention-model
Keras Attention Mechanism
Attention mechanism Implementation for Keras.
Stars: ✭ 2,504 (+7977.42%)
Mutual labels:  attention-model
swin-transformer-pytorch
Implementation of the Swin Transformer in PyTorch.
Stars: ✭ 610 (+1867.74%)
Mutual labels:  attention-model
Sa Tensorflow
Soft attention mechanism for video caption generation
Stars: ✭ 154 (+396.77%)
Mutual labels:  attention-model
GATE
The implementation of "Gated Attentive-Autoencoder for Content-Aware Recommendation"
Stars: ✭ 65 (+109.68%)
Mutual labels:  attention-model
Image Caption Generator
A neural network to generate captions for an image using CNN and RNN with BEAM Search.
Stars: ✭ 126 (+306.45%)
Mutual labels:  attention-model
Attentionalpoolingaction
Code/Model release for NIPS 2017 paper "Attentional Pooling for Action Recognition"
Stars: ✭ 248 (+700%)
Mutual labels:  attention-model
HHH-An-Online-Question-Answering-System-for-Medical-Questions
HBAM: Hierarchical Bi-directional Word Attention Model
Stars: ✭ 44 (+41.94%)
Mutual labels:  attention-model
learningspoons
nlp lecture-notes and source code
Stars: ✭ 29 (-6.45%)
Mutual labels:  attention-model
G2P
Grapheme To Phoneme
Stars: ✭ 59 (+90.32%)
Mutual labels:  attention-model

Recognizing-Textual-Entailment


A PyTorch implementation of models for Recognizing Textual Entailment on the SNLI corpus. The following model has been implemented so far:

  • Reasoning About Entailment with Neural Attention (Rocktäschel et al., 2016)

The details and results specific to each model are given below.

Reasoning About Entailment with Neural Attention


Introduction


The paper presents an LSTM-based model with attention for the task. The following are some key points:

  • Two LSTMs encode the premise and hypothesis.
  • The hidden state of the LSTM encoding the hypothesis is initialised with the final hidden state of the LSTM encoding the premise.
  • Two different attention mechanisms are explored (a sketch of the first variant follows this list):
    • Using just the last output of the hypothesis LSTM to attend over the outputs of the premise LSTM.
    • Attending over the premise LSTM outputs at every step of processing the hypothesis (using a simple RNN), i.e. word-by-word attention.
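
For concreteness, here is a minimal PyTorch sketch of the first (last-output) attention variant. The module and variable names are illustrative assumptions, not the names used in this repository's code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LastOutputAttention(nn.Module):
    """Attend over the premise LSTM outputs using the final hypothesis state
    (the first attention variant above). Illustrative sketch only."""
    def __init__(self, n_dim):
        super().__init__()
        self.W_y = nn.Linear(n_dim, n_dim, bias=False)  # projects premise outputs
        self.W_h = nn.Linear(n_dim, n_dim, bias=False)  # projects the final hypothesis state
        self.w = nn.Linear(n_dim, 1, bias=False)        # scores each premise position

    def forward(self, premise_outputs, h_n):
        # premise_outputs: (batch, premise_len, n_dim); h_n: (batch, n_dim)
        M = torch.tanh(self.W_y(premise_outputs) + self.W_h(h_n).unsqueeze(1))
        alpha = F.softmax(self.w(M).squeeze(-1), dim=1)     # attention weights over the premise
        r = torch.bmm(alpha.unsqueeze(1), premise_outputs)  # attention-weighted premise summary
        return r.squeeze(1), alpha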

Running the code


To start training the model, call

python run_rte.py 

The following command line arguments are available.

General arguments (used by other models as well)

-n_embed     (Embedding layer dimensions, default 300)
-n_dim       (Hidden layer dimensions, default 300)
-batch       (Batch size, default 256)
-dropout     (p value for the dropout layer, default 0.1)
-l2          (L2 regularisation value, default 0.0003)
-lr          (Learning rate, default 0.001)
-train_flag  (Training or evaluation mode, default True)

Model-specific arguments

-last_nonlinear  (Whether the projection to the softmax layer is non-linear, default False)
-wbw_attn        (Use word-by-word attention, default False)
-h_maxlen        (Maximum length of the hypothesis, used by the recurrent batchnorm layer, default 30)
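
For example, a training run that enables word-by-word attention with a smaller batch size might be invoked as follows (illustrative values; the exact syntax for boolean flags depends on how run_rte.py parses its arguments):

python run_rte.py -wbw_attn True -batch 128 -lr 0.001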

Implementation Caveats


The word-by-word attention model is essentially a simple RNN used to attend over the premise at every step, so it is prone to the exploding gradient problem. To prevent this, the following measures have been taken:

  • Setting the initial weights of the RNN to be orthogonal (see the sketch after this list).
  • Using batch normalisation in the recurrent network, as described in Recurrent Batch Normalization (Cooijmans et al., 2017); see recurrent_BatchNorm.py for the implementation.
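
As an illustration, orthogonal initialisation can be applied to a recurrent cell with PyTorch's built-in initialiser. The cell below is a stand-in for the attention RNN, not the repository's actual class:

import torch.nn as nn

# Hidden size follows the -n_dim default (300); purely illustrative.
rnn = nn.RNNCell(input_size=300, hidden_size=300)
nn.init.orthogonal_(rnn.weight_hh)  # orthogonal recurrent (hidden-to-hidden) weights
nn.init.orthogonal_(rnn.weight_ih)  # orthogonal input-to-hidden weights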

Results

