
LenzDu / Kaggle Competition Favorita

License: MIT
5th place solution for Kaggle competition Favorita Grocery Sales Forecasting

Programming Languages

python

Projects that are alternatives to or similar to Kaggle-Competition-Favorita

Kaggle Web Traffic Time Series Forecasting
Solution to Kaggle - Web Traffic Time Series Forecasting
Stars: ✭ 29 (-82.84%)
Mutual labels:  kaggle, time-series, cnn
Multi Class Text Classification Cnn Rnn
Classify Kaggle San Francisco Crime Description into 39 classes. Build the model with CNN, RNN (GRU and LSTM) and Word Embeddings on Tensorflow.
Stars: ✭ 570 (+237.28%)
Mutual labels:  kaggle, cnn, lstm
Sequitur
Library of autoencoders for sequential data
Stars: ✭ 162 (-4.14%)
Mutual labels:  time-series, lstm
Robust Lane Detection
Stars: ✭ 110 (-34.91%)
Mutual labels:  cnn, lstm
Eeg Dl
A Deep Learning library for EEG Tasks (Signals) Classification, based on TensorFlow.
Stars: ✭ 165 (-2.37%)
Mutual labels:  cnn, lstm
Awesome Deep Learning Resources
Rough list of my favorite deep learning resources, useful for revisiting topics or for reference. I have got through all of the content listed there, carefully. - Guillaume Chevalier
Stars: ✭ 1,469 (+769.23%)
Mutual labels:  cnn, lstm
Deep Learning Based Ecg Annotator
Annotation of ECG signals using deep learning with TensorFlow's Keras
Stars: ✭ 110 (-34.91%)
Mutual labels:  time-series, lstm
Kaggle Web Traffic
1st place solution
Stars: ✭ 1,641 (+871.01%)
Mutual labels:  kaggle, time-series
Cnn lstm for text classify
CNN, LSTM, NBOW, and fastText for Chinese text classification
Stars: ✭ 90 (-46.75%)
Mutual labels:  cnn, lstm
Ncrfpp
NCRF++, a Neural Sequence Labeling Toolkit. Easy use to any sequence labeling tasks (e.g. NER, POS, Segmentation). It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components.
Stars: ✭ 1,767 (+945.56%)
Mutual labels:  cnn, lstm
Easyocr
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
Stars: ✭ 13,379 (+7816.57%)
Mutual labels:  cnn, lstm
Vpilot
Scripts and tools to easily communicate with DeepGTAV. In the future a self-driving agent will be implemented.
Stars: ✭ 136 (-19.53%)
Mutual labels:  cnn, lstm
Sarcasmdetection
Sarcasm detection on tweets using neural network
Stars: ✭ 99 (-41.42%)
Mutual labels:  cnn, lstm
Pytorch Learners Tutorial
PyTorch tutorial for learners
Stars: ✭ 97 (-42.6%)
Mutual labels:  cnn, lstm
Deeplearning tutorials
The deeplearning algorithms implemented by tensorflow
Stars: ✭ 1,580 (+834.91%)
Mutual labels:  cnn, lstm
Pytorch Pos Tagging
A tutorial on how to implement models for part-of-speech tagging using PyTorch and TorchText.
Stars: ✭ 96 (-43.2%)
Mutual labels:  cnn, lstm
Pytorch Gan Timeseries
GANs for time series generation in pytorch
Stars: ✭ 109 (-35.5%)
Mutual labels:  time-series, lstm
Forecasting
Time Series Forecasting Best Practices & Examples
Stars: ✭ 2,123 (+1156.21%)
Mutual labels:  time-series, lightgbm
Cnn For Stock Market Prediction Pytorch
CNN for stock market prediction using raw data & candlestick graph.
Stars: ✭ 86 (-49.11%)
Mutual labels:  time-series, cnn
End To End Sequence Labeling Via Bi Directional Lstm Cnns Crf Tutorial
Tutorial for End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF
Stars: ✭ 87 (-48.52%)
Mutual labels:  cnn, lstm

Kaggle-Competition-Favorita

This is the 5th place solution for Kaggle competition Favorita Grocery Sales Forecasting.

The Problem

This competition is a time series problem in which we are required to predict the sales of different items in different stores for 16 days into the future, given the sales history and promotion information for those items. Additional information about the items and the stores is also provided. The dataset and a detailed description can be found on the competition page: https://www.kaggle.com/c/favorita-grocery-sales-forecasting

Model Overview

I built three models: a Gradient Boosting model (LightGBM), a CNN+DNN, and a seq2seq RNN. The final submission was a weighted average of these models, where each model was stabilized by training it multiple times with different random seeds and averaging the predictions. Each model on its own would place in the top 1% of the final ranking.
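The seed-averaging and weighted blend described above can be sketched as follows (the weights, model count, and function names here are illustrative, not the ones used in the actual solution):

```python
import numpy as np

def seed_average(train_fn, n_seeds=5):
    """Train the same model with different random seeds and average its predictions.
    train_fn(seed) is assumed to return a prediction array."""
    preds = [train_fn(seed) for seed in range(n_seeds)]
    return np.mean(preds, axis=0)

def blend(model_preds, weights):
    """Weighted average of per-model predictions (each of shape [n_series, 16])."""
    weights = np.asarray(weights, dtype=float)
    weights /= weights.sum()          # normalize so the blend stays on the same scale
    return sum(w * p for w, p in zip(weights, model_preds))
```

Averaging over seeds reduces the variance of a single training run, and the blend weights can then be tuned on the validation period.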

LGBM: An upgraded version of the models from the public kernels, with more features, more data, and more training periods fed to the model.

CNN+DNN: A traditional NN model in which the CNN part is a dilated causal convolution inspired by WaveNet, and the DNN part is two fully connected layers applied to the raw sales sequences. Their outputs are concatenated with categorical embeddings and future promotion features, then mapped directly to predictions for the 16 future days.
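As a rough illustration of the dilated causal convolution idea (not the actual network code in this repository), each output step sees only past inputs, and stacking layers with dilations 1, 2, 4, ... grows the receptive field exponentially:

```python
import numpy as np

def dilated_causal_conv1d(x, w, dilation=1):
    """Causal 1-D convolution: output[t] depends only on x[t], x[t-d], x[t-2d], ...
    x: input sequence of shape [T]; w: kernel of shape [K]."""
    T, K = len(x), len(w)
    pad = (K - 1) * dilation                     # left-pad so no future values leak in
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([sum(w[k] * xp[pad + t - k * dilation] for k in range(K))
                     for t in range(T)])
```

With a kernel of size 2, a stack of layers with dilations 1, 2, 4, 8 covers 16 past days while keeping the parameter count small, which is what makes the WaveNet-style encoder attractive for long sales histories.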

RNN: A seq2seq model with an architecture similar to @Arthur Suilin's solution for the Web Traffic Time Series Forecasting competition. The encoder and decoder are both GRUs, and the hidden states of the encoder are passed to the decoder through a fully connected connector layer; this significantly improves accuracy.
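A shape-level sketch of the encoder-connector-decoder wiring (using a simplified tanh recurrence in place of the actual GRU cells, purely to show how the hidden state flows; all weights here are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN, HORIZON = 32, 16

def encode(seq, Wx, Wh):
    """Run a simplified recurrent cell over the sales history; return the final hidden state."""
    h = np.zeros(HIDDEN)
    for x_t in seq:
        h = np.tanh(Wx * x_t + Wh @ h)
    return h

def decode(h0, Wh, Wo, horizon=HORIZON):
    """Unroll the decoder for `horizon` steps, starting from the connector output."""
    h, out = h0, []
    for _ in range(horizon):
        h = np.tanh(Wh @ h)
        out.append(Wo @ h)
    return np.array(out)

# encoder -> FC connector -> decoder
Wx_e, Wh_e = rng.normal(size=HIDDEN), rng.normal(size=(HIDDEN, HIDDEN)) * 0.1
Wc = rng.normal(size=(HIDDEN, HIDDEN)) * 0.1        # the FC connector layer
Wh_d, Wo = rng.normal(size=(HIDDEN, HIDDEN)) * 0.1, rng.normal(size=HIDDEN)

history = rng.normal(size=100)                      # e.g. 100 days of sales
h_enc = encode(history, Wx_e, Wh_e)
h_dec0 = np.tanh(Wc @ h_enc)                        # connector maps encoder state to decoder init
preds = decode(h_dec0, Wh_d, Wo)                    # 16 future-day predictions
```

The point of the connector is that the decoder's initial state need not live in the same representation space as the encoder's final state; the learned FC layer bridges the two.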

How to Run the Model

The three models live in separate .py files, as their filenames indicate.

Before running the models, download the data from the competition website and add records with sales of 0 for every existing store-item combination on each December 25th in the training data. Then use the function load_data() in Utils.py to load and transform the raw data files, and use save_unstack() to save them as feather files. In the model code, change the input of load_unstack() to the filenames you saved; the models can then be run. Please read the code of these functions for more details.
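The December 25th step can be sketched with pandas. The column names (date, store_nbr, item_nbr, unit_sales) follow the competition's train.csv; load_data() and save_unstack() remain the repository's own helpers and are not reproduced here:

```python
import pandas as pd

def add_christmas_zeros(train: pd.DataFrame) -> pd.DataFrame:
    """Append unit_sales=0 rows for every existing store-item combination
    on each December 25th in the training period (absent from the raw data)."""
    years = sorted(train["date"].dt.year.unique())
    combos = train[["store_nbr", "item_nbr"]].drop_duplicates()
    rows = []
    for y in years:
        xmas = combos.copy()
        xmas["date"] = pd.Timestamp(year=y, month=12, day=25)
        xmas["unit_sales"] = 0.0
        rows.append(xmas)
    return pd.concat([train] + rows, ignore_index=True)
```

Without these explicit zeros, the missing day would be silently dropped when the data are unstacked into a dense date grid, shifting the sequences the models see.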

Note: if you are not using a GPU, change CudnnGRU to GRU in seq2seq.py
