arunarn2 / HierarchicalAttentionNetworks

Licence: other
Hierarchical Attention Networks for Document Classification in Keras

Programming Languages

python

Projects that are alternatives to or similar to HierarchicalAttentionNetworks

kdsb17
Gaussian Mixture Convolutional AutoEncoder applied to CT lung scans from the Kaggle Data Science Bowl 2017
Stars: ✭ 18 (-74.29%)
Mutual labels:  keras-tensorflow
Nearest-Celebrity-Face
Tensorflow Implementation of FaceNet: A Unified Embedding for Face Recognition and Clustering to find the celebrity whose face matches the closest to yours.
Stars: ✭ 30 (-57.14%)
Mutual labels:  keras-tensorflow
StockerBot
Twitter Bot to follow financial trends in publicly traded companies
Stars: ✭ 77 (+10%)
Mutual labels:  sentiment-classification
One-Shot-Learning-with-Siamese-Networks
Implementation of One Shot Learning using Convolutional Siamese Networks on Omniglot Dataset
Stars: ✭ 129 (+84.29%)
Mutual labels:  keras-tensorflow
GLOM-TensorFlow
An attempt at the implementation of GLOM, Geoffrey Hinton's paper for emergent part-whole hierarchies from data
Stars: ✭ 32 (-54.29%)
Mutual labels:  keras-tensorflow
Brainy
Brainy is a virtual MRI analyzer. Just upload the MRI scan file and get 3 different classes of tumors detected and segmented. In Beta.
Stars: ✭ 29 (-58.57%)
Mutual labels:  keras-tensorflow
german-sentiment-lib
An easy-to-use Python package for deep learning-based German sentiment classification.
Stars: ✭ 33 (-52.86%)
Mutual labels:  sentiment-classification
Hierarchical-Word-Sense-Disambiguation-using-WordNet-Senses
Word Sense Disambiguation using Word Specific models, All word models and Hierarchical models in Tensorflow
Stars: ✭ 33 (-52.86%)
Mutual labels:  hierarchical-models
ForEx
Using ML to create a ForEx trader to invest my personal finances to get rid of student debt
Stars: ✭ 17 (-75.71%)
Mutual labels:  keras-tensorflow
Deep-learning-model-deploy-with-django
Serving a keras model (neural networks) in a website with the python Django-REST framework.
Stars: ✭ 76 (+8.57%)
Mutual labels:  keras-tensorflow
Word-Level-Eng-Mar-NMT
Translating English sentences to Marathi using Neural Machine Translation
Stars: ✭ 37 (-47.14%)
Mutual labels:  keras-tensorflow
banglabert
This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla" accepted in Findings of the Annual Conference of the North American Chap…
Stars: ✭ 186 (+165.71%)
Mutual labels:  sentiment-classification
DCASE2020 task1
Code for DCASE 2020 task 1a and task 1b.
Stars: ✭ 72 (+2.86%)
Mutual labels:  keras-tensorflow
Explainable-Automated-Medical-Coding
Implementation and demo of explainable coding of clinical notes with Hierarchical Label-wise Attention Networks (HLAN)
Stars: ✭ 35 (-50%)
Mutual labels:  hierarchical-attention-networks
arabic-sentiment-analysis
Sentiment Analysis in Arabic tweets
Stars: ✭ 64 (-8.57%)
Mutual labels:  sentiment-classification
SignatureVerification
A system to recognize whether signatures are forged or real.
Stars: ✭ 17 (-75.71%)
Mutual labels:  keras-tensorflow
labml
🔎 Monitor deep learning model training and hardware usage from your mobile phone 📱
Stars: ✭ 1,213 (+1632.86%)
Mutual labels:  keras-tensorflow
ACAN
Code for NAACL 2019 paper: Adversarial Category Alignment Network for Cross-domain Sentiment Classification
Stars: ✭ 23 (-67.14%)
Mutual labels:  sentiment-classification
cnn-text-classification
Text classification with Convolution Neural Networks on Yelp, IMDB & sentence polarity dataset v1.0
Stars: ✭ 108 (+54.29%)
Mutual labels:  sentiment-classification
stock-volatility-google-trends
Deep Learning Stock Volatility with Google Domestic Trends: https://arxiv.org/pdf/1512.04916.pdf
Stars: ✭ 74 (+5.71%)
Mutual labels:  keras-tensorflow

Hierarchical Attention Networks

This repository contains an implementation of Hierarchical Attention Networks for Document Classification in Keras, and another implementation of the same network in TensorFlow.

A Hierarchical Attention Network consists of the following parts:

  1. Embedding layer
  2. Word Encoder: a word-level bidirectional GRU to get a rich representation of words
  3. Word Attention: word-level attention to extract the information that is important to the meaning of a sentence
  4. Sentence Encoder: a sentence-level bidirectional GRU to get a rich representation of sentences
  5. Sentence Attention: sentence-level attention to identify the important sentences in a document
  6. Fully connected layer + softmax

These models have two levels of attention, one at the word level and one at the sentence level, allowing the model to pay more or less attention to individual words and sentences when constructing the representation of a document.
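As a rough sketch, this stack can be wired up in Keras as follows. This is a simplified illustration, not the repository's exact code: the layer sizes, vocabulary size, and the bare-bones AttentionLayer are illustrative assumptions.

```python
from keras import backend as K
from keras.engine.topology import Layer
from keras.layers import Input, Embedding, Bidirectional, GRU, Dense, TimeDistributed
from keras.models import Model

class AttentionLayer(Layer):
    """Collapse a sequence of encoder outputs into one attention-weighted vector."""
    def build(self, input_shape):
        dim = input_shape[-1]
        self.W = self.add_weight(name='W', shape=(dim, dim), initializer='glorot_uniform')
        self.b = self.add_weight(name='b', shape=(dim,), initializer='zeros')
        self.u = self.add_weight(name='u', shape=(dim,), initializer='glorot_uniform')
        super(AttentionLayer, self).build(input_shape)

    def call(self, x):
        v = K.tanh(K.dot(x, self.W) + self.b)           # (batch, time, dim)
        alphas = K.softmax(K.sum(v * self.u, axis=-1))  # (batch, time) attention weights
        return K.sum(x * K.expand_dims(alphas), axis=1) # weighted sum over time

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[-1])

MAX_SENTS, MAX_SENT_LEN = 15, 512   # caps used in the preprocessing described below
VOCAB_SIZE, N_CLASSES = 100, 2      # illustrative values

# Sentence encoder: embedding -> word-level bidirectional GRU -> word-level attention
sent_in = Input(shape=(MAX_SENT_LEN,), dtype='int32')
x = Embedding(VOCAB_SIZE, 64)(sent_in)
x = Bidirectional(GRU(50, return_sequences=True))(x)
sent_encoder = Model(sent_in, AttentionLayer()(x))

# Document encoder: apply the sentence encoder to every sentence,
# then a sentence-level bidirectional GRU and sentence-level attention
doc_in = Input(shape=(MAX_SENTS, MAX_SENT_LEN), dtype='int32')
d = TimeDistributed(sent_encoder)(doc_in)
d = Bidirectional(GRU(50, return_sequences=True))(d)
out = Dense(N_CLASSES, activation='softmax')(AttentionLayer()(d))

model = Model(doc_in, out)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['acc'])
```

TimeDistributed applies the same sentence encoder to each of the MAX_SENTS sentences independently, which is what gives the network its two-level hierarchy.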

(Figure: Hierarchical Attention Network architecture)

Dataset:

I used the IMDB movie reviews dataset from Kaggle, labeledTrainData.tsv, which contains 25,000 labeled reviews.

Preprocessing on the Data:

I applied minimal preprocessing to the input reviews in the dataset, following these basic steps:

  1. Remove HTML tags

  2. Replace non-ASCII characters with a single space

  3. Split each review into sentences

Then I build the character set, with a max sentence length of 512 characters and an upper bound of 15 sentences per review. The input X is indexed as (document, sentence, char) and the target y holds the corresponding sentiment labels.
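A minimal sketch of this pipeline (the file and column names follow the Kaggle dataset; the sentence splitter and helper names are illustrative assumptions, not the repository's code):

```python
import re
import numpy as np
import pandas as pd

MAX_SENTS, MAX_SENT_LEN = 15, 512

def clean(review):
    review = re.sub(r'<[^>]+>', ' ', review)       # 1. remove HTML tags
    review = re.sub(r'[^\x00-\x7f]', ' ', review)  # 2. replace non-ASCII chars with a space
    return review

def split_sentences(review):
    # 3. naive split on sentence-ending punctuation
    return [s.strip() for s in re.split(r'[.!?]+', review) if s.strip()]

df = pd.read_csv('labeledTrainData.tsv', sep='\t', quoting=3)  # columns: id, sentiment, review
docs = [split_sentences(clean(r))[:MAX_SENTS] for r in df['review']]

# Character vocabulary; index 0 is reserved for padding
chars = sorted({c for doc in docs for sent in doc for c in sent})
char_index = {c: i + 1 for i, c in enumerate(chars)}

# X is indexed as (document, sentence, char); y holds the sentiment labels
X = np.zeros((len(docs), MAX_SENTS, MAX_SENT_LEN), dtype=np.int64)
for d, doc in enumerate(docs):
    for s, sent in enumerate(doc):
        for c, ch in enumerate(sent[:MAX_SENT_LEN]):
            X[d, s, c] = char_index[ch]
y = df['sentiment'].values
```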

Attention Layer Implementation

Attention mechanism layer that reduces the Bi-RNN outputs with an attention vector (adapted from the paper)
Args:
    inputs: The Attention inputs.             
            In case of Bidirectional RNN, this must be a tuple (outputs_fw, outputs_bw) containing 
            the forward and the backward RNN outputs `Tensor`.
                If time_major == False (default),
                    outputs_fw is a `Tensor` shaped:
                    `[batch_size, max_time, cell_fw.output_size]`
                    and outputs_bw is a `Tensor` shaped:
                    `[batch_size, max_time, cell_bw.output_size]`.
                If time_major == True,
                    outputs_fw is a `Tensor` shaped:
                    `[max_time, batch_size, cell_fw.output_size]`
                    and outputs_bw is a `Tensor` shaped:
                    `[max_time, batch_size, cell_bw.output_size]`.
    attention_size: Linear size of the Attention weights.
    time_major: The shape format of the `inputs` Tensors.
        If true, these `Tensors` must be shaped `[max_time, batch_size, depth]`.
        If false, these `Tensors` must be shaped `[batch_size, max_time, depth]`.
        Using `time_major = True` is a bit more efficient because it avoids
        transposes at the beginning and end of the RNN calculation.  However,
        most TensorFlow data is batch-major, so by default this function
        accepts input and emits output in batch-major form.
    return_alphas: Whether to return the attention coefficients (alphas) along with the layer's output.
        Used for visualization purposes.
Returns:
    The Attention output `Tensor`.
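A sketch of a TensorFlow 1.x function matching this docstring, following the attention formulation in the paper (the variable names and initializers here are assumptions, not necessarily the repository's code):

```python
import tensorflow as tf

def attention(inputs, attention_size, time_major=False, return_alphas=False):
    # Concatenate the forward and backward RNN outputs if given as a tuple.
    if isinstance(inputs, tuple):
        inputs = tf.concat(inputs, 2)
    if time_major:
        # (max_time, batch_size, depth) -> (batch_size, max_time, depth)
        inputs = tf.transpose(inputs, [1, 0, 2])

    hidden_size = inputs.shape[2].value  # depth of the RNN output

    # Trainable parameters of the attention mechanism.
    w_omega = tf.Variable(tf.random_normal([hidden_size, attention_size], stddev=0.1))
    b_omega = tf.Variable(tf.random_normal([attention_size], stddev=0.1))
    u_omega = tf.Variable(tf.random_normal([attention_size], stddev=0.1))

    # v = tanh(W * h + b): one attention_size-dim vector per timestep.
    v = tf.tanh(tf.tensordot(inputs, w_omega, axes=1) + b_omega)
    vu = tf.tensordot(v, u_omega, axes=1)  # (batch_size, max_time) scores
    alphas = tf.nn.softmax(vu)             # attention coefficients

    # Weighted sum of the RNN outputs over time.
    output = tf.reduce_sum(inputs * tf.expand_dims(alphas, -1), 1)

    return (output, alphas) if return_alphas else output
```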

Requirements:

  1. pandas 0.20.3
  2. tensorflow 1.4.0
  3. keras 2.0.8
  4. numpy 1.14.0
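These pinned versions can be installed with pip, for example:

pip install pandas==0.20.3 tensorflow==1.4.0 keras==2.0.8 numpy==1.14.0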

Implementation in Keras

Execution:

python HierarchicalAttn.py

Results & Accuracy:

(Figure: accuracy of the Keras implementation)

Implementation in Tensorflow

Execution:

python HierarchicalAttn_tf.py

Results & Accuracy:

(Figure: accuracy of the TensorFlow implementation)
