EdGENetworks / Attention Networks For Classification

Hierarchical Attention Networks for Document Classification in PyTorch

Projects that are alternatives of or similar to Attention Networks For Classification

All of the projects below share the mutual labels jupyter-notebook and lstm.

| Project | Description | Stars |
| --- | --- | --- |
| Deeplearning.ai Assignments | | ✭ 268 (-50.37%) |
| Stock Prediction Models | Gathers machine learning and deep learning models for stock forecasting, including trading bots and simulations | ✭ 4,660 (+762.96%) |
| Cryptocurrency Price Prediction | Cryptocurrency price prediction using an LSTM neural network | ✭ 271 (-49.81%) |
| Pytorch Sentiment Analysis | Tutorials on getting started with PyTorch and TorchText for sentiment analysis | ✭ 3,209 (+494.26%) |
| Tensorflow Lstm Regression | Sequence prediction using recurrent neural networks (LSTM) with TensorFlow | ✭ 433 (-19.81%) |
| Pytorch Seq2seq | Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText | ✭ 3,418 (+532.96%) |
| Thesemicolon | IPython notebooks and datasets for the data analytics YouTube tutorials on The Semicolon | ✭ 345 (-36.11%) |
| Up Down Captioner | Automatic image captioning model based on Caffe, using features from bottom-up attention | ✭ 195 (-63.89%) |
| Sentimentanalysis | Text sentiment analysis | ✭ 421 (-22.04%) |
| Zhihu Text Classification | [2017 Zhihu Kanshan Cup, multi-label text classification] Team ye's solution (6th place) | ✭ 392 (-27.41%) |
| Natural Language Processing With Tensorflow | Natural Language Processing with TensorFlow, published by Packt | ✭ 222 (-58.89%) |
| Cryptocurrencyprediction | Predict cryptocurrency prices with deep learning | ✭ 453 (-16.11%) |
| Graph convolutional lstm | Traffic Graph Convolutional Recurrent Neural Network | ✭ 210 (-61.11%) |
| Lstm Human Activity Recognition | Human Activity Recognition using TensorFlow on a smartphone sensor dataset and an LSTM RNN, classifying movement amongst six activity categories (Guillaume Chevalier) | ✭ 2,943 (+445%) |
| Screenshot To Code | A neural network that transforms a design mock-up into a static website | ✭ 13,561 (+2411.3%) |
| Image Captioning | Image captioning using InceptionV3 and beam search | ✭ 290 (-46.3%) |
| Stylenet | A cute multi-layer LSTM that can perform like a human 🎶 | ✭ 187 (-65.37%) |
| Deep Learning Random Explore | | ✭ 192 (-64.44%) |
| Easy Deep Learning With Keras | Keras tutorial for beginners (using the TF backend) | ✭ 367 (-32.04%) |
| Pytorch Ntm | Neural Turing Machines (NTM), a PyTorch implementation | ✭ 453 (-16.11%) |

Hierarchical Attention Networks for Document Classification

We know that documents have a hierarchical structure: words combine to form sentences, and sentences combine to form documents. We can either try to learn that structure, or we can build this hierarchy into the model and see whether it improves the performance of existing models. This paper exploits that structure to build a classification model.

This is a (close) implementation of the model in PyTorch.
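As a concrete illustration of the core idea, here is a minimal sketch of the word-level attention layer described in the paper. The class name, dimensions, and parameter names are illustrative assumptions, not the code in this repository; the sentence-level layer is analogous, running over sentence vectors instead of word embeddings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WordAttention(nn.Module):
    """One level of the hierarchy: encode words with a bidirectional GRU,
    then collapse each sentence into one vector with additive attention."""

    def __init__(self, vocab_size, embed_dim=50, hidden_dim=32):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.gru = nn.GRU(embed_dim, hidden_dim,
                          bidirectional=True, batch_first=True)
        self.proj = nn.Linear(2 * hidden_dim, 2 * hidden_dim)
        # The learned word-level "context" vector (u_w in the paper).
        self.context = nn.Parameter(torch.randn(2 * hidden_dim))

    def forward(self, words):                        # (batch, seq_len)
        h, _ = self.gru(self.embedding(words))       # (batch, seq_len, 2H)
        u = torch.tanh(self.proj(h))                 # (batch, seq_len, 2H)
        alpha = F.softmax(u @ self.context, dim=1)   # attention weights
        return (alpha.unsqueeze(-1) * h).sum(dim=1)  # (batch, 2H)
```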

Note:

  1. I jointly optimize both the word- and sentence-level attention models with the same optimizer.
  2. The minibatches are padded with zeros. This can be improved: one can batch sentences of similar length together to minimize the padding.
  3. PyTorch did not support gradient masking at the time, so gradients flow through the padded zeros during backpropagation. One could create a mask, but since I am interested in using a bidirectional GRU, it is not possible to use a mask. I have seen that variable-length RNN support is coming to PyTorch soon as well. Update: PyTorch now supports masked RNNs via the pack_padded_sequence method; see the sketch after this list.
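As a rough sketch of how notes 2 and 3 look with current PyTorch, the snippet below sorts a toy minibatch by decreasing length and uses pack_padded_sequence so the bidirectional GRU never computes over, or backpropagates through, the padded positions. All dimensions and values are made up for illustration.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

embedding = nn.Embedding(1000, 50, padding_idx=0)  # toy vocabulary of 1000 words
gru = nn.GRU(50, 32, bidirectional=True, batch_first=True)

# Three zero-padded sentences, already sorted by decreasing length (note 2).
batch = torch.tensor([[4, 7, 9, 2, 5],
                      [3, 8, 6, 0, 0],
                      [1, 2, 0, 0, 0]])
lengths = torch.tensor([5, 3, 2])

packed = pack_padded_sequence(embedding(batch), lengths, batch_first=True)
packed_out, _ = gru(packed)

# Unpack to a padded tensor; positions past each sentence's length stay
# zero, so no gradients flow through the padding (note 3).
out, _ = pad_packed_sequence(packed_out, batch_first=True)
print(out.shape)  # torch.Size([3, 5, 64]) -- 2 * hidden size for the biGRU
```

Note 1 then amounts to handing the parameters of both attention modules to a single optimizer, e.g. torch.optim.Adam(list(word_attn.parameters()) + list(sent_attn.parameters())).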

This picture from the Explosion blog explains the structure perfectly.

(Figure: the word- and sentence-level attention architecture, from the Explosion blog.)

Notebook

The notebook contains an example of a model trained on the IMDB movie review dataset. I could not get the original IMDB dataset that the paper referred to, so I have used this data.

The preprocessed data is available here.

The best accuracy that I got was around 0.35. This dataset has only 84,919 samples and 10 classes. Here is the training loss for the dataset.

(Figure: training loss curve on the IMDB dataset.)
