
Implementation of DMN+ (Improved Dynamic Memory Networks) in Theano

This is an implementation of DMN+ (Improved Dynamic Memory Networks) from the paper by Xiong et al. at MetaMind, Dynamic Memory Networks for Visual and Textual Question Answering, arXiv:1603.01417.

It's a fork of YerevaNN's implementation of the initial version of Dynamic Memory Networks.

A question answering webapp using this implementation is currently running at ethancaballero.pythonanywhere.com; type the number 2, 3, 6, or 17 into the Task Type box and then click the 'Load Task Type' button to start the webapp.

Repository contents

file                 description
webapp.py            runs a webapp demo of DMN+ (adapted from the MemN2N webapp)
dmn_tied.py          weights of the answer module are tied; trains faster
dmn_untied.py        weights of the answer module are untied; slightly better performance on most tasks, but slower training
utils.py             tools for working with bAbI tasks
nn_utils.py          helper functions on top of Theano and Lasagne
fetch_babi_data.sh   shell script to fetch the bAbI tasks (adapted from MemN2N)

Usage

This implementation is based on Theano, Lasagne, and Keras. One way to install them is:

pip install -r https://raw.githubusercontent.com/Lasagne/Lasagne/master/requirements.txt
pip install https://github.com/Lasagne/Lasagne/archive/master.zip
pip install keras

numpy, scipy, and flask are also required; each can be installed with pip install <package>.
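
As a quick sanity check (my suggestion, not something the repo provides), confirm that everything imports:

import theano
import lasagne
import keras
import numpy, scipy, flask

print(theano.config.floatX)  # should print 'float32' for this repo; see the .theanorc note below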

Also, NLTK is used for sentence splitting. To install it:

pip install nltk

Then run the Python interpreter and enter these commands to download the punkt dataset:

import nltk
nltk.download('punkt')

The following bash script will download the bAbI tasks:

./fetch_babi_data.sh

Use main.py to train a network:

python main.py --network dmn_tied --mode train --babi_id 1

The states of the network will be saved in the states/ folder, which already contains pretrained states for bAbI tasks 2, 3, 6, and 17.

The included .theanorc file contains the Theano configuration that was used; dmn_tied might yield NaNs if this configuration is not used. In particular, make sure floatX is set to float32 in the .theanorc file.
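
For reference, a minimal .theanorc with that setting might look like this (the included file may contain additional options, e.g. device selection):

[global]
floatX = float32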

Webapp Usage

The webapp uses this implementation to answer questions about stories from bAbI QA task 2, 3, 6, or 17; these tasks were chosen because they receive the largest performance increase from DMN+ (and, conversely, are the tasks the initial DMN struggled with the most). The webapp does not offer server-side training because I couldn't find a free hosting service with sufficient compute to train in a reasonable amount of time.

The question answering webapp using this implementation is currently running at ethancaballero.pythonanywhere.com. Upon arriving at the page, type the number 2, 3, 6, or 17 into the Task Type box and click the 'Load Task Type' button to load a bAbI task of that type (loading usually takes about one minute). When loading finishes, the Story and Question text boxes are filled with a story and a question. Next, click 'Predict Answer' to view the network's predicted answer, attentions, and confidence for the current story/question pair, or click 'Load New Story' to get a new pair. To load a different task type, enter another of the numbers 2, 3, 6, or 17 and click 'Load Task Type' again.

To run your own webapp locally:

python webapp.py

then go to http://0.0.0.0:5000/ in a browser.

Additions implemented in this DMN+ repo

  • positional sentence encoder: f_i = sum_{j=1..M} l_j ∘ w_j^i, where M is the number of words in the sentence, w_j^i is the j-th word embedding, and l_j is a column vector with entries l_{jd} = (1 − j/M) − (d/D)(1 − 2j/M) for embedding dimension d of D (see the sketch after this list)

  • Bidirectional GRU for input fusion layer of input module

  • the attention gate g^t from the episodic memory module is computed with a function similar to softmax

  • attention-based GRU in which the update gate u_i of the GRU is replaced with the attention gate g^t_i, yielding the hidden state h_i = g^t_i ∘ h̃_i + (1 − g^t_i) ∘ h_{i−1} (also covered in the sketch after this list)

  • untied answer module weights (this addition was not implemented before training time, so it is not used by the webapp)

  • most of the additions are in lines ~250-440 of dmn_*.py
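
For concreteness, here is a minimal NumPy sketch of the positional sentence encoder and the attention-gated GRU update listed above; the array names and shapes are illustrative assumptions, not the repo's actual variables (the real code lives in dmn_*.py):

import numpy as np

def positional_encoding(word_embeddings):
    # word_embeddings: (M, D) array holding the sentence's M word embeddings.
    # Returns f = sum_{j=1..M} l_j ∘ w_j, the positional sentence encoding.
    M, D = word_embeddings.shape
    j = np.arange(1, M + 1)[:, None]  # word positions 1..M, shape (M, 1)
    d = np.arange(1, D + 1)[None, :]  # embedding dimensions 1..D, shape (1, D)
    l = (1.0 - j / float(M)) - (d / float(D)) * (1.0 - 2.0 * j / float(M))
    return (l * word_embeddings).sum(axis=0)  # shape (D,)

def attention_gru_step(g_t, h_tilde, h_prev):
    # Attention-based GRU update: the attention gate g_t replaces the usual
    # update gate, giving h_i = g_t ∘ h̃_i + (1 − g_t) ∘ h_{i−1}.
    return g_t * h_tilde + (1.0 - g_t) * h_prev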

Questions

  • Unsure how/where input dropout is supposed to be implemented
  • How is the ReLU memory update layer supposed to work? The paper seems to use concatenated floats to allocate the subtensors of W^t, but how can floats allocate subtensors (wouldn't integers need to be used?)? See the sketch below for my reading of the update.
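
For what it's worth, my reading of the paper's untied ReLU memory update is m^t = ReLU(W^t [m^{t−1}; c^t; q] + b), i.e. each W^t multiplies the concatenation of the previous memory, the current context vector, and the question. A minimal NumPy sketch of that reading (all names hypothetical):

import numpy as np

def relu_memory_update(W_t, b, m_prev, c_t, q):
    # m_t = ReLU(W_t · [m_prev; c_t; q] + b); W_t has shape (D, 3D)
    # for D-dimensional memory, context, and question vectors.
    z = np.concatenate([m_prev, c_t, q])  # shape (3D,)
    return np.maximum(0.0, np.dot(W_t, z) + b)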

Benchmarks

  • This DMN+ implementation's accuracy on test data is higher than that of the 2015 DMN, but lower than the accuracies reported in the DMN+ paper. I'm fairly sure this insufficient generalization is due to an error in my implementation of dropout on the initial sentence encodings.
  • This benchmark tests the model when it has been trained in a weakly supervised setting (relevant supporting facts are not marked during training)
English bAbI task (10k)       Test error rate (%)
2: 2 supporting facts         9.5
3: 3 supporting facts         28.9
6: yes/no questions           0.3
17: positional reasoning      20.2

Training settings: Adam with learning rate 0.0002 and beta = 0.5; dropout p = 0.1; L2 regularization 0.0005; ~100 epochs.
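
As a rough illustration of how those settings map onto Lasagne (a sketch with a toy stand-in network; the repo's dmn_*.py builds the real DMN+ graph, and I'm reading beta = 0.5 as Adam's beta1):

import theano.tensor as T
import lasagne

x = T.matrix('x')
y = T.ivector('y')
l_in = lasagne.layers.InputLayer((None, 8), input_var=x)
l_hid = lasagne.layers.DenseLayer(lasagne.layers.dropout(l_in, p=0.1), num_units=16)
l_out = lasagne.layers.DenseLayer(l_hid, num_units=4,
                                  nonlinearity=lasagne.nonlinearities.softmax)

prediction = lasagne.layers.get_output(l_out)
loss = lasagne.objectives.categorical_crossentropy(prediction, y).mean()
loss += 0.0005 * lasagne.regularization.regularize_network_params(
    l_out, lasagne.regularization.l2)  # L2 reg = 0.0005

params = lasagne.layers.get_all_params(l_out, trainable=True)
updates = lasagne.updates.adam(loss, params, learning_rate=0.0002, beta1=0.5)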

TODO

  • Mini-batch training
  • Implement Visual Portion
  • Figure out the ReLU memory update and input dropout

Acknowledgements

License?
