
Arturus / Kaggle Web Traffic

Licence: MIT
1st place solution

Programming Languages

Jupyter Notebook
Python

Projects that are alternatives of or similar to Kaggle Web Traffic

Kaggle Web Traffic Time Series Forecasting
Solution to Kaggle - Web Traffic Time Series Forecasting
Stars: ✭ 29 (-98.23%)
Mutual labels:  kaggle, jupyter-notebook, time-series, timeseries
Tsai
Time series Timeseries Deep Learning Pytorch fastai - State-of-the-art Deep Learning with Time Series and Sequences in Pytorch / fastai
Stars: ✭ 407 (-75.2%)
Mutual labels:  jupyter-notebook, time-series, timeseries, rnn
Timesynth
A Multipurpose Library for Synthetic Time Series Generation in Python
Stars: ✭ 170 (-89.64%)
Mutual labels:  jupyter-notebook, time-series, timeseries
Natural Language Processing With Tensorflow
Natural Language Processing with TensorFlow, published by Packt
Stars: ✭ 222 (-86.47%)
Mutual labels:  jupyter-notebook, rnn, seq2seq
Tsmoothie
A python library for time-series smoothing and outlier detection in a vectorized way.
Stars: ✭ 109 (-93.36%)
Mutual labels:  jupyter-notebook, time-series, timeseries
Simplestockanalysispython
Stock Analysis Tutorial in Python
Stars: ✭ 126 (-92.32%)
Mutual labels:  jupyter-notebook, time-series, timeseries
Poetry Seq2seq
Chinese Poetry Generation
Stars: ✭ 159 (-90.31%)
Mutual labels:  jupyter-notebook, rnn, seq2seq
Text summurization abstractive methods
Multiple implementations of abstractive text summarization, using Google Colab
Stars: ✭ 359 (-78.12%)
Mutual labels:  jupyter-notebook, rnn, seq2seq
Chinese Chatbot
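A Chinese chatbot trained on 100,000 dialogue pairs, using an attention mechanism; it generates a meaningful reply to most general questions. The trained model is uploaded and can be run directly. If it doesn't run, I'll eat my keyboard on a livestream.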
Stars: ✭ 124 (-92.44%)
Mutual labels:  jupyter-notebook, rnn, seq2seq
Telemanom
A framework for using LSTMs to detect anomalies in multivariate time series data. Includes spacecraft anomaly data and experiments from the Mars Science Laboratory and SMAP missions.
Stars: ✭ 589 (-64.11%)
Mutual labels:  jupyter-notebook, time-series, rnn
Deeplearning
Introductory deep learning tutorials and selected high-quality articles (Deep Learning Tutorial)
Stars: ✭ 6,783 (+313.35%)
Mutual labels:  kaggle, jupyter-notebook, rnn
Ad examples
A collection of anomaly detection methods (iid/point-based, graph and time series) including active learning for anomaly detection/discovery, bayesian rule-mining, description for diversity/explanation/interpretability. Analysis of incorporating label feedback with ensemble and tree-based detectors. Includes adversarial attacks with Graph Convolutional Network.
Stars: ✭ 641 (-60.94%)
Mutual labels:  time-series, timeseries, rnn
Pytorch Seq2seq
Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.
Stars: ✭ 3,418 (+108.29%)
Mutual labels:  jupyter-notebook, rnn, seq2seq
Allstate capstone
Allstate Kaggle Competition ML Capstone Project
Stars: ✭ 72 (-95.61%)
Mutual labels:  kaggle, jupyter-notebook, cudnn
Stingray
Anything can happen in the next half hour (including spectral timing made easy)!
Stars: ✭ 94 (-94.27%)
Mutual labels:  jupyter-notebook, time-series, timeseries
Dmm
Deep Markov Models
Stars: ✭ 103 (-93.72%)
Mutual labels:  jupyter-notebook, time-series
Dataminingnotesandpractice
Notes from my data mining studies and the clever tricks I have come across, continuously updated.
Stars: ✭ 103 (-93.72%)
Mutual labels:  kaggle, jupyter-notebook
Time Series Forecasting With Python
A use-case focused tutorial for time series forecasting with python
Stars: ✭ 105 (-93.6%)
Mutual labels:  jupyter-notebook, time-series
Dog Breeds Classification
Set of scripts and data for reproducing dog breed classification model training, analysis, and inference.
Stars: ✭ 105 (-93.6%)
Mutual labels:  kaggle, jupyter-notebook
Codesearchnet
Datasets, tools, and benchmarks for representation learning of code.
Stars: ✭ 1,378 (-16.03%)
Mutual labels:  jupyter-notebook, rnn

Kaggle Web Traffic Time Series Forecasting

1st place solution

[Image: predictions]

Main files:

  • make_features.py - builds features from source data
  • input_pipe.py - TF data preprocessing pipeline (assembles features into training/evaluation tensors, performs some sampling and normalisation)
  • model.py - the model
  • trainer.py - trains the model(s)
  • hparams.py - hyperparameter sets
  • submission-final.ipynb - generates predictions for submission
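
To make "sampling and normalisation" in the input pipeline more concrete, here is a minimal pure-Python sketch of one common way to do it: cut a random training/target window out of a series, then standardise it with statistics from the training part only. This is an illustration under assumptions, not the code in input_pipe.py: the helper name sample_and_normalise, the log1p transform, the per-window standardisation and the window lengths are all hypothetical.

```python
import math
import random
import statistics


def sample_and_normalise(series, train_len, predict_len, rng):
    """Cut one random (train, target) window and standardise it.

    Hypothetical helper: the log1p transform, the window lengths and the
    per-window standardisation are assumptions, not taken from input_pipe.py.
    """
    logged = [math.log1p(v) for v in series]
    total = train_len + predict_len
    if len(logged) < total:
        raise ValueError("series is shorter than one window")
    # Sampling: pick a random start offset for the window.
    start = rng.randrange(len(logged) - total + 1)
    window = logged[start:start + total]
    train, target = window[:train_len], window[train_len:]
    # Normalisation: use statistics of the training part only, so that
    # nothing from the target period leaks into the model inputs.
    mean = statistics.fmean(train)
    std = statistics.pstdev(train) or 1.0  # guard against constant series

    def scale(values):
        return [(v - mean) / std for v in values]

    return scale(train), scale(target), (mean, std)


if __name__ == "__main__":
    rng = random.Random(0)
    visits = [rng.randint(0, 100) for _ in range(400)]  # synthetic page views
    x, y, (mean, std) = sample_and_normalise(visits, 200, 63, rng)
    print(len(x), len(y))
```

Returning the mean and standard deviation alongside the scaled values lets predictions be mapped back to the original scale later (multiply by std, add mean, then apply expm1).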

How to reproduce competition results:

  1. Download the input files key_2.csv.zip and train_2.csv.zip from https://www.kaggle.com/c/web-traffic-time-series-forecasting/data and put them into the data directory.
  2. Run python make_features.py data/vars --add_days=63. It extracts data and features from the input files and stores them in data/vars as a TensorFlow checkpoint.
  3. Run the trainer: python trainer.py --name s32 --hparam_set=s32 --n_models=3 --no_eval --no_forward_split --asgd_decay=0.99 --max_steps=11500 --save_from_step=10500. This command simultaneously trains 3 models with different seeds (on a single TF graph) and saves 10 checkpoints between step 10500 and step 11500 to data/cpt. Note: training requires a GPU because the model uses cuDNN; CPU training will not work. If you have 3 or more GPUs, add the --multi_gpu flag to speed up training. You can also try other hyperparameter sets (described in hparams.py): --hparam_set=definc, --hparam_set=inst81, etc. Don't be alarmed by NaN losses displayed during training. This is expected, because training runs in blind mode, without any evaluation of model performance.
  4. Run submission-final.ipynb in a standard Jupyter notebook environment and execute all cells. Prediction will take some time, because it has to load and evaluate 30 different sets of model weights (3 models × 10 checkpoints). At the end, you'll get a submission.csv.gz file in the data directory.
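
Steps 2 and 3 above can be scripted. The sketch below only assembles and prints the two command lines, using the flags quoted in the steps; it does not run them. The helper names feature_command and trainer_command are hypothetical and not part of the repository, and the checkpoint count simply restates "3 models, 10 checkpoints" from the text.

```python
import shlex

# Values copied from the reproduction steps; helper names are hypothetical.
N_MODELS = 3
CHECKPOINTS_PER_MODEL = 10  # "10 checkpoints from step 10500 to step 11500"


def feature_command(out_dir="data/vars", add_days=63):
    """Step 2: build features and store them in out_dir."""
    return ["python", "make_features.py", out_dir, f"--add_days={add_days}"]


def trainer_command(name="s32", hparam_set="s32", n_models=N_MODELS,
                    max_steps=11500, save_from_step=10500, multi_gpu=False):
    """Step 3: train n_models seeds on one graph, without evaluation."""
    cmd = [
        "python", "trainer.py",
        "--name", name,
        f"--hparam_set={hparam_set}",
        f"--n_models={n_models}",
        "--no_eval",
        "--no_forward_split",
        "--asgd_decay=0.99",
        f"--max_steps={max_steps}",
        f"--save_from_step={save_from_step}",
    ]
    if multi_gpu:
        # Per the README, only useful with 3 or more GPUs.
        cmd.append("--multi_gpu")
    return cmd


if __name__ == "__main__":
    for cmd in (feature_command(), trainer_command()):
        print(shlex.join(cmd))
    # Number of weight sets the submission notebook has to load.
    print("weight sets to evaluate:", N_MODELS * CHECKPOINTS_PER_MODEL)
```

To actually execute a command, each list could be passed to subprocess.run(cmd, check=True) from the repository root. Keeping the commands as lists avoids shell-quoting problems if paths ever contain spaces.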

See also the detailed model description.
