philipperemy / Keras Attention Mechanism

License: Apache-2.0
Attention mechanism implementation for Keras.

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Keras Attention Mechanism

Structured Self Attention
A Structured Self-attentive Sentence Embedding
Stars: ✭ 459 (-81.67%)
Mutual labels:  attention-mechanism, attention-model
attention-mechanism-keras
attention mechanism in keras, like Dense and RNN...
Stars: ✭ 19 (-99.24%)
Mutual labels:  attention-mechanism, attention-model
Attentionalpoolingaction
Code/Model release for NIPS 2017 paper "Attentional Pooling for Action Recognition"
Stars: ✭ 248 (-90.1%)
Mutual labels:  attention-mechanism, attention-model
Linear Attention Recurrent Neural Network
A recurrent attention module consisting of an LSTM cell that can query its own past cell states by means of windowed multi-head attention. The formulas are derived from the BN-LSTM and the Transformer Network. The LARNN cell with attention can easily be used inside a loop on the cell state, just like any other RNN. (LARNN)
Stars: ✭ 119 (-95.25%)
Mutual labels:  attention-mechanism, attention-model
Deepattention
Deep Visual Attention Prediction (TIP18)
Stars: ✭ 65 (-97.4%)
Mutual labels:  attention-mechanism, attention-model
Nmt Keras
Neural Machine Translation with Keras
Stars: ✭ 501 (-79.99%)
Mutual labels:  attention-mechanism, attention-model
Compact-Global-Descriptor
Pytorch implementation of "Compact Global Descriptor for Neural Networks" (CGD).
Stars: ✭ 22 (-99.12%)
Mutual labels:  attention-mechanism, attention-model
Sockeye
Sequence-to-sequence framework with a focus on Neural Machine Translation based on Apache MXNet
Stars: ✭ 990 (-60.46%)
Mutual labels:  attention-mechanism, attention-model
Pytorch Attention Guided Cyclegan
Pytorch implementation of Unsupervised Attention-guided Image-to-Image Translation.
Stars: ✭ 67 (-97.32%)
Mutual labels:  attention-mechanism, attention-model
Image Caption Generator
A neural network to generate captions for an image using CNN and RNN with BEAM Search.
Stars: ✭ 126 (-94.97%)
Mutual labels:  attention-mechanism, attention-model
Pytorch Acnn Model
code of Relation Classification via Multi-Level Attention CNNs
Stars: ✭ 170 (-93.21%)
Mutual labels:  attention-model
Multimodal Sentiment Analysis
Attention-based multimodal fusion for sentiment analysis
Stars: ✭ 172 (-93.13%)
Mutual labels:  attention-mechanism
Sparse Structured Attention
Sparse and structured neural attention mechanisms
Stars: ✭ 198 (-92.09%)
Mutual labels:  attention-mechanism
Speech emotion recognition blstm
Bidirectional LSTM network for speech emotion recognition.
Stars: ✭ 203 (-91.89%)
Mutual labels:  attention-model
Lstm attention
attention-based LSTM/Dense implemented by Keras
Stars: ✭ 168 (-93.29%)
Mutual labels:  attention-mechanism
Hnatt
Train and visualize Hierarchical Attention Networks
Stars: ✭ 192 (-92.33%)
Mutual labels:  attention-mechanism
Eeg Dl
A Deep Learning library for EEG Tasks (Signals) Classification, based on TensorFlow.
Stars: ✭ 165 (-93.41%)
Mutual labels:  attention-mechanism
Slot Attention
Implementation of Slot Attention from GoogleAI
Stars: ✭ 168 (-93.29%)
Mutual labels:  attention-mechanism
Gat
Graph Attention Networks (https://arxiv.org/abs/1710.10903)
Stars: ✭ 2,229 (-10.98%)
Mutual labels:  attention-mechanism
Guided Attention Inference Network
Contains an implementation of the Guided Attention Inference Network (GAIN) presented in "Tell Me Where to Look" (CVPR 2018). This repository aims to apply GAIN to the fcn8 architecture used for segmentation.
Stars: ✭ 204 (-91.85%)
Mutual labels:  attention-mechanism

Keras Attention Mechanism

Many-to-one attention mechanism for Keras.

Installation

pip install attention

Example

import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense, LSTM
from tensorflow.keras.models import load_model, Model

from attention import Attention


def main():
    # Dummy data. There is nothing to learn in this example.
    num_samples, time_steps, input_dim, output_dim = 100, 10, 1, 1
    data_x = np.random.uniform(size=(num_samples, time_steps, input_dim))
    data_y = np.random.uniform(size=(num_samples, output_dim))

    # Define/compile the model.
    model_input = Input(shape=(time_steps, input_dim))
    x = LSTM(64, return_sequences=True)(model_input)
    x = Attention(32)(x)
    x = Dense(1)(x)
    model = Model(model_input, x)
    model.compile(loss='mae', optimizer='adam')
    model.summary()

    # Train on the dummy data.
    model.fit(data_x, data_y, epochs=10)

    # Test saving and reloading the model.
    pred1 = model.predict(data_x)
    model.save('test_model.h5')
    model_h5 = load_model('test_model.h5')
    pred2 = model_h5.predict(data_x)
    np.testing.assert_almost_equal(pred1, pred2)
    print('Success.')


if __name__ == '__main__':
    main()

Other Examples

Browse examples.

Install the requirements before running the examples: pip install -r examples/examples-requirements.txt.

IMDB Dataset

In this experiment, we demonstrate that adding attention yields higher accuracy on the IMDB dataset. We consider two LSTM networks: one with this attention layer and the other with a fully connected layer. Both have the same number of parameters (250K) for a fair comparison.
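
Below is a minimal sketch of how such a comparison could be set up, assuming tensorflow.keras and the Attention layer from this package. The vocabulary size, sequence length, and layer widths are illustrative, not the exact values used to match the two models at 250K parameters; see the examples folder for the real script.

from tensorflow.keras import Input
from tensorflow.keras.datasets import imdb
from tensorflow.keras.layers import Dense, Embedding, Flatten, LSTM
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.sequence import pad_sequences

from attention import Attention

max_features, max_len = 10000, 200  # illustrative values
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
x_train = pad_sequences(x_train, maxlen=max_len)
x_test = pad_sequences(x_test, maxlen=max_len)


def build_model(use_attention: bool) -> Model:
    # Same backbone for both variants; only the pooling block differs.
    i = Input(shape=(max_len,))
    x = Embedding(max_features, 32)(i)
    x = LSTM(32, return_sequences=True)(x)
    if use_attention:
        x = Attention(32)(x)  # attention pooling over the time steps
    else:
        x = Flatten()(x)  # fully connected baseline
        x = Dense(32, activation='relu')(x)
    x = Dense(1, activation='sigmoid')(x)
    model = Model(i, x)
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model


for use_attention in (False, True):
    build_model(use_attention).fit(x_train, y_train, validation_data=(x_test, y_test),
                                   epochs=10, batch_size=128)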

Here are the results over 10 runs. For every run, we record the maximum test-set accuracy reached within 10 epochs of training.

Measure            No Attention (250K params)    Attention (250K params)
MAX Accuracy       88.22                         88.76
AVG Accuracy       87.02                         87.62
STDDEV Accuracy    0.18                          0.14

As expected, the model with attention gets a boost in accuracy. Attention also reduces the run-to-run variability, which is a nice property to have.

Adding two numbers

Let's consider the task of adding two numbers that come right after some delimiters (0 in this case):

x = [1, 2, 3, 0, 4, 5, 6, 0, 7, 8]. Result is y = 4 + 7 = 11.

The attention is expected to be highest right after the delimiters. An overview of the training shows the attention map at the top and the ground truth at the bottom: as training progresses, the model learns the task and the attention map converges to the ground truth.
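
A minimal sketch of how such a dataset could be generated is given below. The task definition comes from this section; the helper itself is illustrative, not the exact script from the examples folder.

import numpy as np


def make_addition_example(seq_len=10):
    rng = np.random.default_rng()
    # Random digits in 1..9, then one delimiter (0) in each half of the sequence.
    x = rng.integers(1, 10, size=seq_len)
    i = rng.integers(0, seq_len // 2 - 1)        # first delimiter position
    j = rng.integers(seq_len // 2, seq_len - 1)  # second delimiter position
    x[i], x[j] = 0, 0
    # The target is the sum of the two numbers right after the delimiters,
    # so the attention map should peak at positions i + 1 and j + 1.
    return x, x[i + 1] + x[j + 1]


x, y = make_addition_example()
print(x, '->', y)  # e.g. [1 2 3 0 4 5 6 0 7 8] -> 11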

Finding max of a sequence

We consider many 1D sequences of the same length. The task is to find the maximum of each sequence.

We feed the full sequence of RNN outputs to the attention layer and expect it to focus on the maximum of each sequence.

After a few epochs, the attention layer converges perfectly to what we expected.
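
A minimal sketch of a model for this task is shown below, assuming the same Attention layer as in the example above; the sequence length, layer sizes, and training settings are illustrative.

import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense, LSTM
from tensorflow.keras.models import Model

from attention import Attention

seq_len = 20
x = np.random.uniform(size=(1000, seq_len, 1))
y = x.max(axis=1)  # target: the maximum of each sequence

i = Input(shape=(seq_len, 1))
h = LSTM(64, return_sequences=True)(i)  # the full sequence of RNN outputs...
h = Attention(32)(h)                    # ...goes to the attention layer
out = Dense(1)(h)

model = Model(i, out)
model.compile(loss='mae', optimizer='adam')
model.fit(x, y, epochs=10, validation_split=0.2)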
