
alokwhitewolf / Visual-Attention-Model

Licence: MIT license
Chainer implementation of Deepmind's Visual Attention Model paper

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Visual-Attention-Model

Textclassifier
Text classifier for Hierarchical Attention Networks for Document Classification
Stars: ✭ 985 (+3548.15%)
Mutual labels:  recurrent-neural-networks, attention-mechanism
Document Classifier Lstm
A bidirectional LSTM with attention for multiclass/multilabel text classification.
Stars: ✭ 136 (+403.7%)
Mutual labels:  recurrent-neural-networks, attention-mechanism
Simplednn
SimpleDNN is a machine learning lightweight open-source library written in Kotlin designed to support relevant neural network architectures in natural language processing tasks
Stars: ✭ 81 (+200%)
Mutual labels:  recurrent-neural-networks, attention-mechanism
automatic-personality-prediction
[AAAI 2020] Modeling Personality with Attentive Networks and Contextual Embeddings
Stars: ✭ 43 (+59.26%)
Mutual labels:  recurrent-neural-networks, attention-mechanism
Machine Learning Curriculum
💻 Make machines learn so that you don't have to struggle to program them; The ultimate list
Stars: ✭ 761 (+2718.52%)
Mutual labels:  chainer, recurrent-neural-networks
Da Rnn
📃 **Unofficial** PyTorch Implementation of DA-RNN (arXiv:1704.02971)
Stars: ✭ 256 (+848.15%)
Mutual labels:  recurrent-neural-networks, attention-mechanism
Image Caption Generator
A neural network to generate captions for an image using CNN and RNN with BEAM Search.
Stars: ✭ 126 (+366.67%)
Mutual labels:  recurrent-neural-networks, attention-mechanism
Linear Attention Recurrent Neural Network
A recurrent attention module consisting of an LSTM cell which can query its own past cell states by the means of windowed multi-head attention. The formulas are derived from the BN-LSTM and the Transformer Network. The LARNN cell with attention can be easily used inside a loop on the cell state, just like any other RNN. (LARNN)
Stars: ✭ 119 (+340.74%)
Mutual labels:  recurrent-neural-networks, attention-mechanism
Attention is all you need
Transformer of "Attention Is All You Need" (Vaswani et al. 2017) by Chainer.
Stars: ✭ 303 (+1022.22%)
Mutual labels:  chainer, attention-mechanism
Multi-task-Conditional-Attention-Networks
A prototype version of our submitted paper: Conversion Prediction Using Multi-task Conditional Attention Networks to Support the Creation of Effective Ad Creatives.
Stars: ✭ 21 (-22.22%)
Mutual labels:  chainer, attention-mechanism
datastories-semeval2017-task6
Deep-learning model presented in "DataStories at SemEval-2017 Task 6: Siamese LSTM with Attention for Humorous Text Comparison".
Stars: ✭ 20 (-25.93%)
Mutual labels:  recurrent-neural-networks, attention-mechanism
DARNN
A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction
Stars: ✭ 90 (+233.33%)
Mutual labels:  recurrent-neural-networks, attention-mechanism
stanford-cs231n-assignments-2020
This repository contains my solutions to the assignments for Stanford's CS231n "Convolutional Neural Networks for Visual Recognition" (Spring 2020).
Stars: ✭ 84 (+211.11%)
Mutual labels:  recurrent-neural-networks, attention-mechanism
Keras Attention
Visualizing RNNs using the attention mechanism
Stars: ✭ 697 (+2481.48%)
Mutual labels:  recurrent-neural-networks, attention-mechanism
Attention Mechanisms
Implementations for a family of attention mechanisms, suitable for all kinds of natural language processing tasks and compatible with TensorFlow 2.0 and Keras.
Stars: ✭ 203 (+651.85%)
Mutual labels:  recurrent-neural-networks, attention-mechanism
Chainer Rnn Ner
Named Entity Recognition with RNN, implemented by Chainer
Stars: ✭ 19 (-29.63%)
Mutual labels:  chainer, recurrent-neural-networks
Neural-Chatbot
A Neural Network based Chatbot
Stars: ✭ 68 (+151.85%)
Mutual labels:  recurrent-neural-networks, attention-mechanism
GuneyOzsanOutThereMusicVideo
Procedurally generated, real-time, demoscene style, open source music video made with Unity 3D for Out There by Guney Ozsan.
Stars: ✭ 26 (-3.7%)
Mutual labels:  visual
3dgan-chainer
📦 A Chainer implementation of 3D Generative Adversarial Network.
Stars: ✭ 25 (-7.41%)
Mutual labels:  chainer
Probabilistic-RNN-DA-Classifier
Probabilistic Dialogue Act Classification for the Switchboard Corpus using an LSTM model
Stars: ✭ 22 (-18.52%)
Mutual labels:  recurrent-neural-networks

Visual Attention Model

Chainer implementation of DeepMind's Recurrent Models of Visual Attention. Humans do not tend to process a whole scene in its entirety at once. Instead, we focus attention selectively on parts of the visual space to acquire information when and where it is needed, and combine information from different fixations over time to build up an internal representation of the scene. Focusing the computational resources on parts of a scene saves “bandwidth”, as fewer “pixels” need to be processed.

The model is a recurrent neural network (RNN) which processes inputs sequentially, attending to different locations within the images (or video frames) one at a time, and incrementally combines information from these fixations to build up a dynamic internal representation of the scene or environment. Instead of processing an entire image or even a bounding box at once, at each step the model selects the next location to attend to based on past information and the demands of the task. Both the number of parameters in the model and the amount of computation it performs can be controlled independently of the size of the input image, in contrast to convolutional networks whose computational demands scale linearly with the number of image pixels.
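To make the last point concrete, here is a rough back-of-the-envelope calculation; the glimpse sizes and number of fixations below are illustrative assumptions, not values prescribed by this repository.

```python
# Pixels the model actually "reads" per image, assuming 3 glimpse scales of
# 12x12 pixels and 6 fixations per image (illustrative numbers only).
pixels_per_glimpse = 3 * 12 * 12                      # 432 pixels per fixation
fixations = 6
pixels_per_image = fixations * pixels_per_glimpse     # 2592 pixels in total

# This cost is the same whether the input is a 28x28 MNIST digit (784 pixels)
# or a 640x480 photo (307,200 pixels), whereas a ConvNet's cost grows with the
# full pixel count of the input.
print(pixels_per_image)
```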

The Network Architecture

Network architecture (image from Sunner Li's blog post)
Glimpse Sensor


The Glimpse Sensor is the implementation of the retina. The idea is to allow the network to “take a glance” at the image around a given location, called a glimpse, then extract and resize this glimpse into image crops at several scales, each scale using the same resolution. For example, the glimpse in the example above contains 3 different scales, and each scale has the same resolution (a.k.a. sensor bandwidth), e.g. 12x12. Therefore, the smallest crop in the centre is the most detailed, whereas the largest crop in the outer ring is the most blurred. In summary, the Glimpse Sensor takes a full-sized image and a location, and outputs a “retina-like” representation of the image around the given location.
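A minimal NumPy sketch of this idea is shown below; the function and argument names are my own, the location is given in pixel coordinates, and the resizing is done with simple average pooling rather than whatever interpolation this repository actually uses.

```python
import numpy as np

def glimpse_sensor(image, center, base_size=12, n_scales=3):
    """Return n_scales square crops centred on `center` (row, col); each crop
    is twice as wide as the previous one, and all are average-pooled back to
    base_size x base_size."""
    max_size = base_size * 2 ** (n_scales - 1)
    pad = max_size // 2
    padded = np.pad(image, pad, mode="constant")      # keep crops inside the image
    cy, cx = int(center[0]) + pad, int(center[1]) + pad

    crops = []
    for s in range(n_scales):
        size = base_size * 2 ** s
        half = size // 2
        patch = padded[cy - half:cy + half, cx - half:cx + half]
        factor = size // base_size                    # downsample by average pooling
        patch = patch.reshape(base_size, factor, base_size, factor).mean(axis=(1, 3))
        crops.append(patch)
    return np.stack(crops)                            # (n_scales, base_size, base_size)
```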

Glimpse Network

Once the Glimpse Sensor is defined, the Glimpse Network is simply a wrapper around it: it takes a full-sized image and a location, extracts a retina representation of the image via the Glimpse Sensor, flattens it, and then combines the extracted retina representation with the glimpse location using hidden layers and ReLU, emitting a single vector g. This vector contains the information of both “what” (the retina representation) and “where” (the focused location within the image).
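In Chainer terms, the Glimpse Network could look roughly like the sketch below; the layer sizes and names are assumptions for illustration, not necessarily the ones used in this repository.

```python
import chainer
import chainer.functions as F
import chainer.links as L

class GlimpseNet(chainer.Chain):
    """Fuse the 'what' (flattened retina crops) and the 'where' (location) into g."""

    def __init__(self, n_hidden=128, g_size=256):
        super().__init__()
        with self.init_scope():
            self.fc_what = L.Linear(None, n_hidden)   # flattened retina representation
            self.fc_where = L.Linear(2, n_hidden)     # (y, x) location in [-1, 1]
            self.fc_out = L.Linear(2 * n_hidden, g_size)

    def __call__(self, retina, loc):
        h_what = F.relu(self.fc_what(retina))
        h_where = F.relu(self.fc_where(loc))
        return F.relu(self.fc_out(F.concat((h_what, h_where), axis=1)))
```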

Recurrent Network
The Recurrent Network takes the feature vector from the Glimpse Network as input and remembers the useful information via its hidden state (and memory cell).
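A matching Chainer sketch follows; the sizes are again assumptions, and an LSTM is chosen here because of the memory cell mentioned above.

```python
import chainer
import chainer.links as L

class CoreNet(chainer.Chain):
    """Recurrent core: integrates glimpse features g_t into a hidden state h_t."""

    def __init__(self, g_size=256, n_hidden=256):
        super().__init__()
        with self.init_scope():
            self.lstm = L.LSTM(g_size, n_hidden)   # holds hidden state and memory cell

    def __call__(self, g):
        return self.lstm(g)                        # h_t, carried across time steps

    def reset_state(self):
        self.lstm.reset_state()                    # call once per image / episode
```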

Location Network
The Location Network takes the hidden state from the Recurrent Network as input and tries to predict the next location to look at. This location prediction becomes the input to the Glimpse Network at the next time step in the unrolled recurrent network. The Location Network is the key component of the whole idea, since it directly determines where to pay attention at the next time step. To maximize the performance of the Location Network, the paper introduces a stochastic process (a Gaussian distribution) for generating the next location and uses reinforcement learning techniques to train it. This is also known as “hard” attention, since the stochastic process is non-differentiable (in contrast to “soft” attention). The intuition behind the stochasticity is to balance exploitation (predicting the future from history) against exploration (trying something unprecedented). Note that this stochasticity makes the component non-differentiable, which causes a problem during back-propagation; the REINFORCE policy gradient algorithm is used to solve it.
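The sampling step and the log-probability term needed for REINFORCE can be sketched as below; the fixed standard deviation, the tanh squashing, and the clipping to [-1, 1] are my assumptions, not necessarily this repository's choices.

```python
import numpy as np
import chainer
import chainer.functions as F
import chainer.links as L

class LocationNet(chainer.Chain):
    """Predict a mean fixation point and sample the next location around it."""

    def __init__(self, n_hidden=256, sigma=0.1):
        super().__init__()
        self.sigma = sigma
        with self.init_scope():
            self.fc = L.Linear(n_hidden, 2)

    def __call__(self, h):
        mean = F.tanh(self.fc(h))                          # mean location in [-1, 1]
        noise = np.random.normal(0, self.sigma, mean.shape).astype(np.float32)
        loc = np.clip(mean.data + noise, -1.0, 1.0)        # sampled, non-differentiable
        # log N(loc | mean, sigma^2) up to a constant; multiplied by the reward
        # later, this term provides the REINFORCE gradient for the location policy.
        ln_p = -F.sum((mean - loc) ** 2, axis=1) / (2 * self.sigma ** 2)
        return loc, ln_p
```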

Activation Network
The Activation Network takes the hidden state from the Recurrent Network as input and tries to predict the digit. In addition, the prediction result is used to generate the reward signal, which is used to train the Location Network (since the stochasticity makes it non-differentiable).
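A sketch of the classifier head and the 0/1 reward it produces, assuming 10-way digit classification:

```python
import numpy as np
import chainer
import chainer.links as L

class ActionNet(chainer.Chain):
    """Classify the digit from the final hidden state."""

    def __init__(self, n_hidden=256, n_classes=10):
        super().__init__()
        with self.init_scope():
            self.fc = L.Linear(n_hidden, n_classes)

    def __call__(self, h):
        return self.fc(h)   # class logits; softmax_cross_entropy applies the softmax

def reward(logits, labels):
    """1 if the prediction is correct, else 0. Never differentiated; it only
    scales the log-probability term produced by the Location Network."""
    return (logits.data.argmax(axis=1) == labels).astype(np.float32)
```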



Architecture Combined
Combining all the elements illustrated above, we get the network architecture shown below.
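Continuing the sketches above, one forward pass plus the combined loss could look roughly like this; the number of fixations, the loss weighting, and the mapping from [-1, 1] locations to pixel coordinates are assumptions rather than this repository's exact code.

```python
import numpy as np
import chainer.functions as F

# glimpse_net, core, location_net and action_net are instances of the sketch
# classes defined above; glimpse_sensor is the NumPy function from earlier.

def run_episode(image, label, n_steps=6):
    """Run glimpse -> core -> location for n_steps fixations, then classify
    and combine the supervised loss with the REINFORCE term."""
    core.reset_state()
    loc = np.zeros((1, 2), dtype=np.float32)                 # start at the centre
    log_ps = []
    for _ in range(n_steps):
        # Map loc in [-1, 1] to pixel coordinates for the sensor.
        center = (loc[0] + 1.0) / 2.0 * (np.array(image.shape) - 1)
        rho = glimpse_sensor(image, center)                  # (n_scales, 12, 12)
        g = glimpse_net(rho.reshape(1, -1).astype(np.float32), loc)
        h = core(g)                                          # update recurrent state
        loc, ln_p = location_net(h)                          # sample next fixation
        log_ps.append(ln_p)
    logits = action_net(h)
    loss_cls = F.softmax_cross_entropy(logits, label)        # differentiable path
    r = reward(logits, label)                                 # 0/1, non-differentiable
    loss_reinforce = -sum(F.mean(p * r) for p in log_ps)     # policy-gradient term
    return loss_cls + loss_reinforce

# Hypothetical usage on a single 28x28 digit with label 3:
# loss = run_episode(np.zeros((28, 28), dtype=np.float32), np.array([3], dtype=np.int32))
# loss.backward()
```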

Experiments

  • MNIST
  • Translated MNIST
  • Cluttered MNIST
  • SVHN

Credits

Some of the text and images have been taken from Medium posts by Tristan and Sunner Li.
