dimitreOliveira / StoreItemDemand

Licence: MIT license
(117th place - Top 26%) Deep learning using Keras and Spark for the "Store Item Demand Forecasting" Kaggle competition.

Programming languages: Jupyter Notebook, Python

Deep Learning regression with Keras and Spark

About the repository

The Spark folder of this repository was written using Databricks. If you want to replicate or continue the work, you can check out the free Databricks Community Edition.

The main goal of the repository is to use the Spark infrastructure of Databricks clusters to load and process the data from the Kaggle competition and to train deep learning models in a distributed fashion.

What you will find

  • Brief EDA of the data set. [link]
  • Creation and usage of custom Spark pipelines. [link]
  • Data preparation. [link]
  • Model training. [link]
  • Model prediction (test set). [link]
  • Model evaluation (evaluation of many different models). [link]

Store Item Demand Forecasting Challenge

Link to the Kaggle competition: https://www.kaggle.com/c/demand-forecasting-kernels-only

Datasets: https://www.kaggle.com/c/demand-forecasting-kernels-only/data

Overview

This competition is provided as a way to explore different time series techniques on a relatively simple and clean dataset.

You are given 5 years of store-item sales data, and asked to predict 3 months of sales for 50 different items at 10 different stores.
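
Because the task is to forecast a block of future dates, a validation set is usually held out by date rather than at random. The sketch below is only an illustration of that idea, not the code used in this repository. The record fields (`date`, `store`, `item`, `sales`), the cutoff date, and the helper name `split_by_date` are all assumptions made for the example.

```python
from datetime import date, timedelta

def split_by_date(rows, cutoff):
    """Split rows into (train, valid): rows dated before `cutoff` go to train."""
    train = [r for r in rows if r["date"] < cutoff]
    valid = [r for r in rows if r["date"] >= cutoff]
    return train, valid

# Tiny synthetic example: one store, one item, 10 days of made-up sales.
start = date(2017, 12, 25)
rows = [
    {"date": start + timedelta(days=i), "store": 1, "item": 1, "sales": 10 + i}
    for i in range(10)
]

# Hypothetical cutoff: everything from 2018-01-01 onward is held out.
train, valid = split_by_date(rows, cutoff=date(2018, 1, 1))
print(len(train), len(valid))  # 7 3
```

With the real data, the cutoff would typically be chosen so that the held-out window is about as long as the 3-month forecast horizon.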

What's the best way to deal with seasonality? Should stores be modeled separately, or can you pool them together? Does deep learning work better than ARIMA? Can either beat xgboost?
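
One simple reference point for these questions is a seasonal naive forecast, which repeats the last observed week, scored with SMAPE. The sketch below is a baseline for comparison only, not a model from this repository. SMAPE is assumed here to be the competition's metric (this README does not state it), the weekly season length of 7 is an assumption, and the sales figures are made up.

```python
def smape(actual, forecast):
    """Symmetric MAPE in percent; a pair where both values are 0 contributes 0."""
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        total += abs(f - a) / denom if denom else 0.0
    return 100.0 * total / len(actual)

def seasonal_naive(history, horizon, season=7):
    """Forecast `horizon` steps by cycling through the last `season` observations."""
    last = history[-season:]
    return [last[i % season] for i in range(horizon)]

# Two weeks of made-up daily sales with a weekly pattern.
history = [10, 12, 11, 13, 15, 20, 22,
           11, 13, 12, 14, 16, 21, 23]

forecast = seasonal_naive(history, horizon=3)
print(forecast)  # [11, 13, 12]
print(smape([12, 13, 12], forecast))
```

Applying `seasonal_naive` to each store-item series separately gives a per-series baseline. A pooled model is worth its extra complexity only if it beats that baseline on the same held-out period.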

This is a great competition to explore different models and improve your skills in forecasting.

PySpark Dependencies:

Python Dependencies:

To-Do:

  • Persistence of the pipeline classes needs to be fixed.
  • Pipeline classes need to be revised.
  • The data probably needs more feature extraction.
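
As one possible direction for the feature extraction item above, calendar features derived from the sale date are a common starting point for daily demand data. This is a minimal sketch in plain Python, not the pipeline code from this repository. The function name `date_features` and the chosen feature set are hypothetical.

```python
from datetime import date

def date_features(d):
    """Return simple calendar features for a `datetime.date`."""
    return {
        "year": d.year,
        "month": d.month,
        "day_of_week": d.weekday(),            # Monday = 0 ... Sunday = 6
        "day_of_year": d.timetuple().tm_yday,  # 1-based day within the year
        "is_weekend": int(d.weekday() >= 5),   # 1 for Saturday or Sunday
    }

print(date_features(date(2018, 1, 6)))
```

The same logic could later be wrapped in a custom Spark pipeline stage so that it runs alongside the existing preparation steps.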