
ceshine / Favorita_sales_forecasting

Solution to Corporación Favorita Grocery Sales Forecasting Competition

(Simplified) Solution to Favorita Competition

Sorry, there is no CPU-only mode. You need an NVIDIA GPU to train the models.

Test environment:

GTX 1070
16 GB RAM + 8 GB Swap
At least 30 GB free disk space

(it can be less if you turn off some of the joblib disk caching)

Docker 17.12.0-ce
Nvidia-docker 2.0

Acknowledgement

The Transformer model comes from Yu-Hsiang Huang's implementation. His repo is included in the "attention-is-all-you-need-pytorch" folder via git subtree.
The LSTNet model is largely inspired by Guokun Lai's implementation.
The model structure is inspired by the work of Sean Vasquez and Arthur Suilin.

Docker Usage

First build the image. Example command: docker build -t favorita .

Then spin up a docker container:

docker run --runtime=nvidia --rm -ti \
    -v /mnt/Data/favorita_cache:/home/docker/labs/cache \
    -v /mnt/Data/favorita_data:/home/docker/labs/data \
    -p 6006:6006 favorita bash

Mounting the data and cache folders manually, as in the command above, is recommended.
Port 6006 is published so that TensorBoard can be run inside the container.

Where to put the data

Download the data files from Kaggle and extract them into the data folder.

The rest of this README assumes you are working at the Bash prompt inside the container.
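
Before running the preprocessing step, it may help to confirm that the extracted files are where the scripts will look for them. The following is a minimal sketch and not part of the repository: `missing_files` is a hypothetical helper, and the file names in `EXPECTED_FILES` are an assumption about the Kaggle download, so adjust them to match what you actually extracted.

```python
from pathlib import Path
from typing import Iterable, List

# Assumed file names from the Kaggle download; edit to match your copy.
EXPECTED_FILES = ("train.csv", "test.csv", "items.csv", "stores.csv")


def missing_files(data_dir: str, expected: Iterable[str] = EXPECTED_FILES) -> List[str]:
    """Return the expected file names that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Assumes the current working directory contains the "data" folder.
    absent = missing_files("data")
    print("All files found." if not absent else f"Missing: {absent}")
```

An empty result only means the files exist; it says nothing about whether their contents are complete.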

Model Training

Preprocessing

python prepare_seq_data.py

Train Model

For now, two types of models are ready to be trained:

Transformer (fit_transformer.py)
LSTNet (fit_lstnet.py)

The training scripts use Sacred to manage experiments. It is recommended to set a seed explicitly via CLI:

python fit_transformer.py with seed=93102
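
Fixing the seed is what makes two runs comparable. The sketch below is only a toy illustration using Python's standard `random` module; it is not Sacred and not the repository's training code, and `toy_run` is a hypothetical stand-in for a training script.

```python
import random
from typing import List


def toy_run(seed: int, n_batches: int = 5) -> List[int]:
    """Hypothetical stand-in for a training run: returns a shuffled batch order."""
    rng = random.Random(seed)  # private generator, seeded explicitly
    order = list(range(n_batches))
    rng.shuffle(order)
    return order


if __name__ == "__main__":
    # The same seed reproduces the same batch order on every call.
    print(toy_run(93102) == toy_run(93102))
```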

You can also use MongoDB to save the experiment results and hyper-parameters of each run. Please refer to the Sacred documentation for more details.

Predictions for the Validation and Test Datasets

The CSV outputs are saved in cache/preds/val/ and cache/preds/test/, respectively.
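
To see which prediction files a run has produced, something like the following could be used. It is a small sketch, not part of the repository: `list_prediction_files` is a hypothetical helper, and it assumes only the two folder paths named above and a `.csv` extension.

```python
from pathlib import Path
from typing import Dict, List


def list_prediction_files(cache_dir: str = "cache") -> Dict[str, List[Path]]:
    """Map each split ("val", "test") to the sorted CSV files in cache/preds/<split>/."""
    preds_root = Path(cache_dir) / "preds"
    result: Dict[str, List[Path]] = {}
    for split in ("val", "test"):
        split_dir = preds_root / split
        # A split folder that does not exist yet simply yields an empty list.
        result[split] = sorted(split_dir.glob("*.csv")) if split_dir.is_dir() else []
    return result


if __name__ == "__main__":
    for split, files in list_prediction_files().items():
        print(split, [f.name for f in files])
```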

Tensorboard

Training and validation loss curves, along with some of the embeddings, are logged in TensorBoard format. Launch TensorBoard via:

tensorboard --logdir runs

Then visit http://localhost:6006 for the web interface.

TODO (For now you need to figure them out yourself)

Ensembling script: I changed the outputs of the model training scripts to make them more readable, which means the ensembling script needs to be updated as well. (For those who want to try: the ground truth for the validation set is stored in cache/yval_seq.npy.)
Encoder/Decoder and Encoder/MLP models with LSTM, GRU, QRNN, and SRU units: I tried a lot of different things for this competition, but the code could use some refactoring, so these models have been removed for now.
Tabular data preparation and models: My GBM models are mediocre at best, so they are not really worth sharing here. However, as I mentioned in the blog post, for the store/item combinations that were removed by the 56-day nonzero filter, using a GBM model to predict their values will give you a better score than predicting zeros.
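
For readers who want to attempt the ensembling step themselves, one simple starting point is an unweighted average of several models' validation predictions, scored against the ground truth. The sketch below is an illustration under heavy assumptions and not the author's ensembling script: it works on plain Python lists rather than the actual CSV or .npy layouts (which are not documented here), and it uses plain RMSE as a stand-in metric. `average_predictions` and `rmse` are hypothetical helper names.

```python
import math
from typing import List, Sequence


def average_predictions(model_preds: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise unweighted mean across models (one inner sequence per model)."""
    if not model_preds:
        raise ValueError("need predictions from at least one model")
    n = len(model_preds[0])
    if any(len(p) != n for p in model_preds):
        raise ValueError("all models must predict the same number of values")
    return [sum(values) / len(model_preds) for values in zip(*model_preds)]


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error; a stand-in metric assumed for illustration."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


if __name__ == "__main__":
    y_val = [1.0, 2.0, 3.0]    # toy ground truth
    model_a = [0.0, 2.0, 4.0]  # toy predictions from two models
    model_b = [2.0, 2.0, 2.0]
    blend = average_predictions([model_a, model_b])
    print(rmse(y_val, model_a), rmse(y_val, model_b), rmse(y_val, blend))
```

In practice the inputs would come from the files in cache/preds/val/ and from cache/yval_seq.npy, once their layout has been confirmed, and the weights could be tuned on the validation set instead of being equal.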
