IvanBongiorni / GAN-RNN_Timeseries-imputation

Licence: MIT License
Recurrent GAN for imputation of time series data. Implemented in TensorFlow 2 on the Wikipedia Web Traffic Time Series Forecasting dataset from Kaggle.

Author: Ivan Bongiorni, Data Scientist.

Convolutional Recurrent Seq2seq GAN for the Imputation of Missing Values in Time Series Data

Description

This project implements several configurations of a Recurrent Convolutional Seq2seq neural network for imputing missing values in time series data. Three implementations are provided:

  1. A Recurrent Convolutional seq2seq model.
  2. A GAN (Generative Adversarial Network) based on the same architecture, in which an Imputer is trained to fool an adversarial network (the Discriminator) that tries to distinguish real time series from fake (imputed) ones.
  3. A partially adversarial model that combines the loss structures of the two previous models: the Imputer must reduce the true error loss while trying to fool the Discriminator at the same time.
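
The three configurations differ mainly in the loss the Imputer minimizes. The sketch below is written in plain Python so that it runs without TensorFlow. The function names, the choice of mean absolute error and binary cross-entropy, and the weight `alpha` are assumptions made for illustration; they are not taken from the project's code.

```python
import math


def regression_loss(true_series, imputed_series):
    """Mean absolute error between real and imputed values (configuration 1)."""
    errors = [abs(t - p) for t, p in zip(true_series, imputed_series)]
    return sum(errors) / len(errors)


def adversarial_loss(discriminator_scores, eps=1e-7):
    """Imputer-side GAN loss (configuration 2).

    `discriminator_scores` are the Discriminator's estimated probabilities
    that each imputed series is real. The loss shrinks as they approach 1,
    i.e. as the Imputer fools the Discriminator.
    """
    clipped = [min(max(s, eps), 1.0 - eps) for s in discriminator_scores]
    return -sum(math.log(s) for s in clipped) / len(clipped)


def partially_adversarial_loss(true_series, imputed_series,
                               discriminator_scores, alpha=0.5):
    """Weighted sum of both losses (configuration 3).

    `alpha` is a hypothetical weighting parameter.
    """
    return (alpha * regression_loss(true_series, imputed_series)
            + (1.0 - alpha) * adversarial_loss(discriminator_scores))


true_series = [1.0, 2.0, 3.0, 4.0]
imputed_series = [1.0, 2.5, 2.5, 4.0]
scores = [0.5]  # the Discriminator is undecided about this imputed series

print(regression_loss(true_series, imputed_series))  # 0.25
print(adversarial_loss(scores))                       # -log(0.5), about 0.693
print(partially_adversarial_loss(true_series, imputed_series, scores))
```

With `alpha=1.0` the combined loss reduces to the pure regression loss, and with `alpha=0.0` to the purely adversarial one.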

Models are implemented in TensorFlow 2 and trained on the Wikipedia Web Traffic Time Series Forecasting dataset.


Files

  • config.yaml: configuration parameters for data preprocessing, training and testing.

Pipelines:

  • main_processing.py: starts the data preprocessing pipeline. Its output is a set of ready-to-train datasets saved in .npy (NumPy) format in the /data_processed/ folder.
  • main_train.py: starts the training pipeline. The trained model is saved in the /saved_models/ folder under the 'model_name' provided in config.yaml.

Scripts:

  • tools.py: contains lower-level helper functions that are called repeatedly during the preprocessing pipeline.
  • model.py: implementation of models' architectures.
  • train.py: contains functions for all training configurations.
  • deterioration.py: contains the function that artificially deteriorates training data, so that imputation performance can be checked.
  • impute.py: final script, to be called to produce imputed data (for raw time series that contain NaNs) and export them for future projects.
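
The idea behind artificial deterioration can be sketched as follows: hide values whose ground truth is known, impute them, and score the imputation only at the hidden positions. This is a minimal illustration in plain Python. `deteriorate`, `mean_fill`, `masked_mae`, the uniform random masking and the mean-fill baseline are all assumptions; the project's own function in deterioration.py follows from the NaN study in nan_exploration.ipynb and may work differently.

```python
import math
import random


def deteriorate(series, nan_fraction=0.2, seed=0):
    """Return a copy of `series` with a random fraction of values set to NaN."""
    rng = random.Random(seed)
    n_missing = int(round(nan_fraction * len(series)))
    missing_idx = set(rng.sample(range(len(series)), n_missing))
    return [float("nan") if i in missing_idx else x
            for i, x in enumerate(series)]


def mean_fill(series):
    """Placeholder imputer: replace NaNs with the mean of the observed values."""
    observed = [x for x in series if not math.isnan(x)]
    mean = sum(observed) / len(observed)
    return [mean if math.isnan(x) else x for x in series]


def masked_mae(true_series, deteriorated, imputed):
    """Mean absolute error computed only where values were artificially hidden."""
    errors = [abs(t - p)
              for t, d, p in zip(true_series, deteriorated, imputed)
              if math.isnan(d)]
    return sum(errors) / len(errors)


true_series = [float(x) for x in range(10)]
deteriorated = deteriorate(true_series, nan_fraction=0.3)
imputed = mean_fill(deteriorated)

print(sum(math.isnan(x) for x in deteriorated))  # 3 values were hidden
print(masked_mae(true_series, deteriorated, imputed))
```

In a real run, `mean_fill` would be replaced by the trained Imputer. Scoring only the hidden positions keeps the untouched observations from diluting the error.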

Notebooks and explanations:

  • how_it_works.md: explains the Deep Learning models in greater detail.
  • nan_exploration.ipynb: contains a study of the distribution of NaNs in the raw dataset, which led to the development of the deterioration function.
  • data_scaling_exploration.ipynb: contains visualizations of the scaling function I employed in the data preprocessing phase.
  • imputation_visual_check.ipynb: visualizes a model's performance. The notebook loads the trained model specified in params['model_name'] and checks its performance on Validation and Test data.
  • performance_comparison.ipynb: compares the performance of the three trained models on Test data.

Folders:

  • data_raw/: should contain the raw Wikipedia Web Traffic Time Series Forecasting dataset, as downloaded (and unzipped) from Kaggle.
  • data_processed/: contains the output of the preprocessing pipeline launched from main_processing.py. Observations are stored in three sub-directories: Training/, Validation/ and Test/.
  • saved_models/: where models are saved at the end of the training pipeline. Model names can be changed in config.yaml. If a GAN is trained and the config parameter save_discriminator is set to True, the Discriminator model is saved as [model_name]_discriminator.h5.
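
The naming scheme for saved models can be sketched as below. The keys `model_name` and `save_discriminator` and the `[model_name]_discriminator.h5` pattern come from the description above. The helper `model_paths`, the example values, and the assumption that the Imputer itself is stored as `[model_name].h5` are illustrative only.

```python
import os


def model_paths(params, is_gan, models_dir="saved_models"):
    """Return the file paths a training run would be expected to write."""
    # Assumption: the Imputer is stored as <model_name>.h5
    paths = [os.path.join(models_dir, params["model_name"] + ".h5")]
    # The Discriminator is only kept for GAN runs that ask for it.
    if is_gan and params.get("save_discriminator", False):
        paths.append(os.path.join(
            models_dir, params["model_name"] + "_discriminator.h5"))
    return paths


# Hypothetical config values, as they might be read from config.yaml
params = {"model_name": "my_gan", "save_discriminator": True}

for path in model_paths(params, is_gan=True):
    print(path)
```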

Modules required

langdetect==1.0.8
numpy==1.18.3
pandas==1.0.3
scikit-learn==0.22.2.post1
scipy==1.4.1
tensorflow==2.1.0
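
To see whether the current environment matches these pins, a small check along the following lines may help. It is a sketch that relies only on the standard library's `importlib.metadata` (Python 3.8+); the helper names `parse_pins` and `check_environment` are made up for this example and are not part of the project.

```python
from importlib import metadata

PINNED = """
langdetect==1.0.8
numpy==1.18.3
pandas==1.0.3
scikit-learn==0.22.2.post1
scipy==1.4.1
tensorflow==2.1.0
"""


def parse_pins(text):
    """Turn 'name==version' lines into a {name: version} dictionary."""
    pins = {}
    for line in text.split():
        name, _, version = line.partition("==")
        pins[name] = version
    return pins


def check_environment(pins):
    """Map each package to (pinned version, installed version or None)."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in check_environment(parse_pins(PINNED)).items():
    status = "ok" if installed == wanted else "mismatch"
    print(f"{name}: pinned {wanted}, installed {installed} ({status})")
```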

Hardware

I trained this model on a fairly powerful machine: a System76 Adder WS laptop with 64 GB of RAM and an NVIDIA RTX 2070 GPU.
