
3springs / Attentive Neural Processes

License: Apache-2.0
Implements "Recurrent Attentive Neural Processes" to forecast power usage (with an LSTM baseline and MCDropout)


Neural Processes for sequential data

This repo implements a "Recurrent Attentive Neural Process for Sequential Data" (ANP-RNN), demonstrated on a toy regression problem and tested on real smart meter data.

The repository exposes many options, so you can run the model as an ANP-RNN, an ANP, or an NP.

I've also made many tweaks for flexibility and stability, and replicated the DeepMind ANP results in PyTorch. The replication qualitatively seems a better match than the other PyTorch versions of the ANP (as of 2019-11-01). You can find those other code repositories in the see-also section.

It's not heavily documented, because most of my code never gets read or used. If you are using it and it's confusing, open a GitHub issue and we will add comments or docs together.

To get the code and the Git LFS data (the URL is inferred from the project name above):

git clone https://github.com/3springs/attentive-neural-processes.git
cd attentive-neural-processes
git lfs pull

Experiment: Comparing models on real-world data

Here I compare the models on smart meter power demand data.

The black dots are the input data, and the dotted line is the true continuation. The blue line is the prediction, and the blue shadow is the uncertainty band, to one standard deviation.

I chose a difficult example below: a window from the test set that deviates from the previous pattern. Given three days of inputs, the model must predict the next day, and that day has higher power usage than before. The trained model manages to predict it based on the inputs.
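
The plots referenced here are images in the original README. For reference, a minimal matplotlib sketch of how such a prediction plot with an uncertainty band can be drawn (the function and argument names are illustrative, not the repo's actual plotting code):

```python
import matplotlib.pyplot as plt

def plot_prediction(t_context, y_context, t_target, y_true, y_mean, y_std):
    """Context points, true continuation, predicted mean, and a 1-std band."""
    plt.scatter(t_context, y_context, c="k", s=10, label="context (inputs)")
    plt.plot(t_target, y_true, "k:", label="true")
    plt.plot(t_target, y_mean, "b-", label="prediction")
    plt.fill_between(t_target, y_mean - y_std, y_mean + y_std,
                     color="b", alpha=0.2, label="uncertainty (1 std)")
    plt.legend()
    plt.show()
```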

Results

Results on Smartmeter prediction (lower is better)

Model                       val_np_loss   val_mse_loss
ANP-RNN(impr)(MCDropout)    -1.48         —
ANP-RNN(impr)               -1.38         0.00423
ANP-RNN                     -1.27         0.0047
ANP                         -1.30         0.0072
NP                          -1.30         0.0040
LSTM                        -0.78         0.0074

Example LSTM baseline

Here is an LSTM with a similar setup: it has access to the y values in the context (the first half). Its output is inferior, and its uncertainty estimate is poor: the uncertainty starts off high, since the model hasn't seen much data yet, but it should increase, or at least stay high, in the second half as the model moves away from its data.
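
As a reference for the setup described above, here is a minimal sketch of an LSTM baseline that predicts a distribution (a mean and a log-variance) at every step; this is an illustration, not the repo's exact baseline:

```python
import torch
from torch import nn

class LSTMBaseline(nn.Module):
    """LSTM that outputs a predictive Normal distribution at each time step."""
    def __init__(self, x_dim: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(x_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)  # per-step [mean, log_var]

    def forward(self, x):  # x: (batch, seq, x_dim)
        h, _ = self.lstm(x)
        mean, log_var = self.head(h).chunk(2, dim=-1)
        return torch.distributions.Normal(mean, torch.exp(0.5 * log_var))
```

Training it with a negative log-probability loss (rather than MSE) is what gives it the uncertainty estimate critiqued above.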

Example NP

Here we see underfitting: the curve doesn't match the data.

Example ANP outputs (sequential)

Here we see overfitting: the uncertainty seems too small, and the fit could be improved.

Example ANP-RNN outputs

This has better-calibrated uncertainty and a better fit.

Example of ANP-RNN with MCDropout

Experiment: Comparing models on toy 1d regression

I put some work into replicating the behaviour shown in the original DeepMind TensorFlow notebook. At the same time, I compared multiple models.

Results

Results on toy 1d regression (lower is better)

model           val_loss
ANP-RNN(impr)   -1.3217
ANP-RNN         -0.62
ANP             -0.4228
ANP(impr)       -0.3182
NP              -1.2687

Example outputs

Compare DeepMind:

And this repo with an ANP (anp_1d_regression.ipynb):

And an ANP-RNN:

It's only a qualitative comparison, but we see the same kind of overfitting, with the uncertainty tight where many data points exist and wide where they do not. However, this repo's model seems to miss points occasionally.
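
The toy task follows the usual neural-process setup: each example is a function drawn from a Gaussian process (GP) with an RBF kernel, and the context/target points are subsampled from it. A minimal sketch of such a data generator, with illustrative hyperparameters (not necessarily those of the DeepMind notebook):

```python
import torch

def sample_gp_curve(n_points=100, length_scale=0.4, jitter=1e-4):
    """Draw one random function from an RBF-kernel GP on [-2, 2]."""
    x = torch.linspace(-2, 2, n_points).unsqueeze(-1)   # (n, 1)
    sq_dist = (x - x.t()) ** 2                          # (n, n) pairwise distances
    K = torch.exp(-0.5 * sq_dist / length_scale ** 2)
    K = K + jitter * torch.eye(n_points)                # keep K positive definite
    y = torch.distributions.MultivariateNormal(
        torch.zeros(n_points), covariance_matrix=K).sample()
    return x, y.unsqueeze(-1)
```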

Experiment: Using ANP-RNN + Monte Carlo Dropout

One more experiment is included:

The model tries to estimate how unsure it is, but what about when it is out of sample? What about the things it doesn't know that it doesn't know?

Name        val_loss (n=100, lower is better)
MCDropout   -1.31
Normal      -1.04

We can estimate additional uncertainty by using Monte Carlo Dropout (MCDropout) to see how much the model's predictions vary in the presence of dropout. This doesn't capture all uncertainty, but I found that it does improve (decrease) the validation loss. Since the loss is the negative log-probability of the target values under the output distribution, this improvement shows that MCDropout improved the uncertainty estimate.
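
A minimal sketch of that procedure: keep only the dropout layers stochastic at evaluation time, run many forward passes, and merge the per-pass Normal outputs as a Gaussian mixture. The `model` here is assumed to return a torch Normal distribution:

```python
import torch

def mc_dropout_predict(model, x, n_samples=100):
    """MC Dropout: stochastic forward passes merged by the law of total variance."""
    model.eval()
    for m in model.modules():            # re-enable dropout only
        if isinstance(m, torch.nn.Dropout):
            m.train()
    with torch.no_grad():
        dists = [model(x) for _ in range(n_samples)]
    means = torch.stack([d.mean for d in dists])
    variances = torch.stack([d.variance for d in dists])
    mean = means.mean(0)
    var = variances.mean(0) + means.var(0)  # within-pass + between-pass variance
    return mean, var.sqrt()
```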

Why didn't the model just learn to be more uncertain? Well, I chose a challenging train/val/test split in which the validation data came from the future and showed quite different behaviour. That means the validation data contained behaviour the model had never seen before.

With MCDropout:

Without:

For more details see the notebook ./smartmeters-ANP-RNN-mcdropout.ipynb

Usage

Smartmeter Data

Code

This is based on the code listed in the next section, with some changes. The most notable ones add stability; others make sure it can handle predicting into the future:

Changes for a sequential/predictive use case:

  • target points are always in the future; the context is in the past
  • context and target points are still sampled randomly during training (see the sketch just below)
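
A minimal sketch of that split, assuming each training example is one window of the series (sizes and names are illustrative):

```python
import torch

def split_window(x, y, min_context=16):
    """Context = a random-length prefix (the past); target = the rest (the future)."""
    n = x.shape[0]
    n_context = int(torch.randint(min_context, n - 1, (1,)))
    return (x[:n_context], y[:n_context]), (x[n_context:], y[n_context:])
```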

Changes for stability (a combined sketch of a few of these follows the list):

  • in eval mode, take the mean of the latent space and the mean of the output instead of sampling
  • use log_variance where possible (there is a flag to try without this, and it seems to help)
    • and add a minimum bound to the std (in the log domain) to avoid mode collapse (one path uses log_var, one does not)
  • use a log_prob loss (not MSELoss or BCELoss)
  • use PyTorch attention (which has dropout and is faster) instead of custom attention
  • a use_deterministic option, although the model seems to do better with this off
  • use batchnorm and dropout on the channel dimensions
  • check for and skip non-finite values, since extreme inputs can still produce NaNs; also use gradient clipping
  • use PyTorch Lightning for early stopping, hyperparameter optimisation, and reducing the learning rate on plateau
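
A combined sketch of a few of these tricks (a std floor via log-variance, a log-prob loss, non-finite skipping, and gradient clipping); the names are illustrative, not the repo's exact code:

```python
import torch

MIN_LOG_STD = -4.0  # floor on log(std), to avoid mode collapse

def bounded_normal(mean, log_var):
    """Turn a (mean, log_var) head into a Normal whose std cannot collapse to 0."""
    log_std = torch.clamp(0.5 * log_var, min=MIN_LOG_STD)
    return torch.distributions.Normal(mean, log_std.exp())

def training_step(model, optimizer, x, y, clip=0.5):
    dist = model(x)                       # model returns a Normal (see above)
    loss = -dist.log_prob(y).mean()       # log-prob loss, not MSE/BCE
    if not torch.isfinite(loss):          # skip batches that went non-finite
        return None
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), clip)
    optimizer.step()
    return loss.item()
```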

ANP-RNN diagram
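
In code terms, the idea in the diagram is roughly: run an LSTM over the sequence first so every point carries temporal context, then let an ANP attend over those encoded points. A heavily simplified sketch (the `ANP` module and its signature are assumed, not the repo's exact architecture):

```python
import torch
from torch import nn

class ANPRNN(nn.Module):
    """Sketch: LSTM encodes the sequence; an ANP models the encoded points."""
    def __init__(self, anp: nn.Module, x_dim: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(x_dim, hidden, batch_first=True)
        self.anp = anp  # attentive neural process over the LSTM representations

    def forward(self, x_context, y_context, x_target):
        h_context, _ = self.lstm(x_context)  # temporal features for the past
        h_target, _ = self.lstm(x_target)    # temporal features for the queries
        return self.anp(h_context, y_context, h_target)
```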

Tips

  • Make sure you normalise all data, ideally the outputs too (see the sketch after this list); this seems to be very important
  • Batchnorm, log-variance, dropout: these seem OK, but it's unclear to me how to make them help reliably. Attention dropout and LSTM dropout can be especially unreliable.
  • Sometimes you need quite a large hidden space to model a process. Making the network deep seems to stop it learning effectively; it would be helpful to try different activations and initialisations, and to make sure the gradient flows effectively through deeper networks.
  • The deterministic path had unclear value; I found it best to leave it out
  • The absolute and comparative sizes of the context and target are important for performance:
    • If the context is too long and complex, the model cannot summarise it
    • If the target is too long and complex, the model cannot fit it well
    • If the context is included in the target, the model may collapse to just fitting this part. To fix this:
      • make it small
      • or downweight the loss on this part; this seems like the best approach, since x_context -> y_context may still be a useful secondary task
      • or do not include the context in the target
    • however, including the target in the context may sometimes be helpful
  • This repo compares models, but in this situation the biggest difference would come from additional data sources; that is outside the scope of these experiments
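
For the first tip, a minimal sketch of normalising both inputs and outputs with statistics fitted on the training split only (an illustrative helper, not part of the repo):

```python
import numpy as np

class Standardizer:
    """Fit mean/std on the training split, apply the same transform everywhere."""
    def fit(self, train: np.ndarray):
        self.mean = train.mean(axis=0)
        self.std = train.std(axis=0) + 1e-8  # avoid division by zero
        return self

    def transform(self, data):
        return (data - self.mean) / self.std

    def inverse(self, predictions):  # undo, e.g. before plotting in real units
        return predictions * self.std + self.mean

# fit separate Standardizers on x_train and y_train, then transform every split
```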

See also:

A list of projects I used as reference or modified to make this one:

I'm very grateful to all these authors for sharing their work. It was a pleasure to dive deep into these models and compare the different implementations.

Neural process papers:

Blogposts:

Citing

If you like our work and end up using this code for your research, give us a shout-out by citing or acknowledging it.
