minerva-ml / open-solution-cdiscount-starter

License: MIT
Open solution to the Cdiscount’s Image Classification Challenge

Programming Languages

Python, Shell

Projects that are alternatives of or similar to open-solution-cdiscount-starter

open-solution-ship-detection
Open solution to the Airbus Ship Detection Challenge
Stars: ✭ 54 (+170%)
Mutual labels:  neptune, kaggle, neptune-framework
Open Solution Toxic Comments
Open solution to the Toxic Comment Classification Challenge
Stars: ✭ 154 (+670%)
Mutual labels:  competition, kaggle
Painters
🎨 Winning solution for the Painter by Numbers competition on Kaggle.
Stars: ✭ 257 (+1185%)
Mutual labels:  competition, kaggle
kaggle-quora-question-pairs
My solution to Kaggle Quora Question Pairs competition (Top 2%, Private LB log loss 0.13497).
Stars: ✭ 104 (+420%)
Mutual labels:  competition, kaggle
Open Solution Mapping Challenge
Open solution to the Mapping Challenge 🌎
Stars: ✭ 291 (+1355%)
Mutual labels:  competition, kaggle
Open Solution Home Credit
Open solution to the Home Credit Default Risk challenge 🏡
Stars: ✭ 397 (+1885%)
Mutual labels:  competition, kaggle
Data Science Bowl 2018
End-to-end one-class instance segmentation based on U-Net architecture for Data Science Bowl 2018 in Kaggle
Stars: ✭ 56 (+180%)
Mutual labels:  competition, kaggle
Data-Science-Hackathon-And-Competition
Grandmaster in MachineHack (3rd Rank Best) | Top 70 in AnalyticsVidya & Zindi | Expert at Kaggle | Hack AI
Stars: ✭ 165 (+725%)
Mutual labels:  competition, kaggle
Recruit-Restaurant-Visitor-Forecasting
6th place solution for Recruit-Restaurant-Visitor-Forecasting
Stars: ✭ 16 (-20%)
Mutual labels:  kaggle
Video2Language
Generating video descriptions using deep learning in Keras
Stars: ✭ 22 (+10%)
Mutual labels:  keras-models
keras-aquarium
a small collection of models implemented in keras, including matrix factorization(recommendation system), topic modeling, text classification, etc. Runs on tensorflow.
Stars: ✭ 14 (-30%)
Mutual labels:  keras-models
awesome-kaggle-kernels
Compilation of good Kaggle Kernels.
Stars: ✭ 51 (+155%)
Mutual labels:  kaggle
kaggle-airbnb
🌍 Where will a new guest book their first travel experience?
Stars: ✭ 53 (+165%)
Mutual labels:  kaggle
conv3d-video-action-recognition
My experimentation around action recognition in videos. Contains Keras implementation for C3D network based on original paper "Learning Spatiotemporal Features with 3D Convolutional Networks", Tran et al. and it includes video processing pipelines coded using mPyPl package. Model is being benchmarked on popular UCF101 dataset and achieves result…
Stars: ✭ 50 (+150%)
Mutual labels:  keras-implementations
kdsb17
Gaussian Mixture Convolutional AutoEncoder applied to CT lung scans from the Kaggle Data Science Bowl 2017
Stars: ✭ 18 (-10%)
Mutual labels:  kaggle
ctf-gameserver
FAUST Gameserver for attack-defense CTFs
Stars: ✭ 38 (+90%)
Mutual labels:  competition
GTAV-Self-driving-car
Self driving car in GTAV with Deep Learning
Stars: ✭ 15 (-25%)
Mutual labels:  keras-models
kaggle-malware-classification
Kaggle "Microsoft Malware Classification Challenge". 6th place solution
Stars: ✭ 29 (+45%)
Mutual labels:  kaggle
Data-Science-Articles
A collection of my blogs on Data Science and Machine learning.
Stars: ✭ 66 (+230%)
Mutual labels:  kaggle
argus-tgs-salt
Kaggle | 14th place solution for TGS Salt Identification Challenge
Stars: ✭ 73 (+265%)
Mutual labels:  kaggle

What is Cdiscount starter?

This is a ready-to-use, end-to-end sample solution for the currently running Kaggle Cdiscount challenge.

It covers data loading and augmentation, model training (many different architectures), ensembling, and submission generation.
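
For illustration, the sketch below shows the kind of Keras transfer-learning setup such a pipeline is built around. It is not the project's actual InceptionPipeline: the augmentation settings, number of categories, training parameters, and the train_images/ directory are assumptions made for the example.

# Minimal, illustrative Inception-based classifier (not the project's actual pipeline).
# The 180x180 image size matches the Cdiscount images; everything else is a placeholder.
from keras.applications.inception_v3 import InceptionV3
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model
from keras.preprocessing.image import ImageDataGenerator

NUM_CATEGORIES = 1000  # assumed; the real value is driven by experiment_config.yaml

base = InceptionV3(weights='imagenet', include_top=False, input_shape=(180, 180, 3))
x = GlobalAveragePooling2D()(base.output)
outputs = Dense(NUM_CATEGORIES, activation='softmax')(x)
model = Model(inputs=base.input, outputs=outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Simple augmentation, analogous to the data loading and augmentation step.
augmenter = ImageDataGenerator(rescale=1. / 255, rotation_range=10, horizontal_flip=True)
train_flow = augmenter.flow_from_directory('train_images/', target_size=(180, 180),
                                           batch_size=32, class_mode='categorical')
model.fit_generator(train_flow, steps_per_epoch=100, epochs=10)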

More competitions 🎇

Check out our collection of public projects 🎁, where you can find multiple Kaggle competitions with code, experiments and outputs.

Disclaimer

In this open-source solution you will find references to neptune.ml. It is a free platform for community users, which we use daily to keep track of our experiments. Please note that using neptune.ml is not necessary to proceed with this solution. You may run it as a plain Python script 😉.

How to run Cdiscount starter?

Installation

  1. Install the requirements

    pip install -r requirements.txt
  2. Install neptune by running

    pip install neptune-cli
  3. Finish neptune installation by running

    neptune login
  4. Finally, open neptune and create the project cdiscount. Check the project key because you will use it later (most likely it is CDIS).

Now, you are ready to run the code and train some models...

Run code

A remark about the competition data: we have uploaded the data to the neptune platform. It is available in the /public/Cdiscount directory. Moreover, we created meta_data files for the large .bson files in the /public/Cdiscount/meta directory, which makes processing much faster.

You can run this end-to-end solution in two ways:

  • If you wish to work on your own machine, you can run
    neptune run run_manager.py -- run_pipeline
  • Deploying in the cloud via neptune is super easy
    • just run

      source run_neptune_command.sh
    • a more advanced option is to run

      neptune send run_manager.py \
      --config experiment_config.yaml \
      --pip-requirements-file requirements.txt \
      --project-key CDIS \
      --environment keras-2.0-gpu-py3 \
      --worker gcp-gpu-medium \
      -- run_pipeline

Collect results and upload to Kaggle

Navigate to /output/project_data/submissions, grab your submission file, upload it to Kaggle, and check your rank in the competition!
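
If you prefer to grab the newest submission file programmatically, a trivial sketch is shown below; the *.csv filename pattern is an assumption.

# Pick the most recently generated submission file (filename pattern is an assumption).
from pathlib import Path

submissions_dir = Path('/output/project_data/submissions')
latest = max(submissions_dir.glob('*.csv'), key=lambda p: p.stat().st_mtime)
print('Upload this file to Kaggle:', latest)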

Advanced options

custom data directories

If you do not wish to use the default data directories, you can specify custom paths in data_config.yaml:

raw_data_dir: /public/Cdiscount
meta_data_dir: /public/Cdiscount/meta
meta_data_processed_dir: /output/project_data/meta_processed
models_dir: /output/project_data/models
predictions_dir: /output/project_data/predictions
submissions_dir: /output/project_data/submissions
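
If your own scripts need to pick up these paths, a minimal sketch using PyYAML could look like the following; it is an illustration, not the project's actual config reader.

# Illustrative only: read the custom paths from data_config.yaml with PyYAML.
import yaml

with open('data_config.yaml') as f:
    data_config = yaml.safe_load(f)

raw_data_dir = data_config['raw_data_dir']
submissions_dir = data_config['submissions_dir']
print(raw_data_dir, submissions_dir)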

meta data creation

If you want to create the metadata locally, run

python run_manager.py create_metadata

and your metadata will be stored in meta_data_dir.
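
Conceptually, the metadata is an index of the large .bson files: product ids together with byte offsets and lengths, so that individual products can later be read without decoding the whole file. The sketch below illustrates that idea only; it is not the project's actual create_metadata implementation, the output filename is an assumption, and it relies on the bson module shipped with pymongo and on the competition's document fields (_id, category_id, imgs).

# Illustration of the idea behind metadata creation, not the project's actual code.
# Scans train.bson and records product id, category, byte offset, length and image count.
import struct
import bson  # ships with pymongo
import pandas as pd

def build_bson_index(bson_path):
    rows = []
    with open(bson_path, 'rb') as f:
        offset = 0
        while True:
            header = f.read(4)
            if len(header) < 4:
                break
            length = struct.unpack('<i', header)[0]  # total document size in bytes
            document = bson.decode_all(header + f.read(length - 4))[0]
            rows.append((document['_id'], document['category_id'],
                         offset, length, len(document['imgs'])))
            offset += length
    return pd.DataFrame(rows, columns=['product_id', 'category_id',
                                       'offset', 'length', 'num_imgs'])

meta = build_bson_index('/public/Cdiscount/train.bson')
meta.to_csv('/output/project_data/meta_processed/train_meta.csv', index=False)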

data sampling

Since the dataset is very large, we suggest sampling the training dataset down to a manageable size. Something like the 1000 most common categories and 1000 images per category seems reasonable to start with. Nevertheless, you can tweak it however you want in the experiment_config.yaml file (a sampling sketch follows the config below):

properties:
  - key: top_categories
    value: 100
  - key: images_per_category
    value: 100
  - key: epochs
    value: 10
  - key: pipeline_name
    value: InceptionPipeline
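
A sampling step along these lines could be implemented with pandas as sketched below. The metadata file and its column names (category_id, product_id) are assumptions carried over from the metadata sketch above; the real pipeline reads its parameters from experiment_config.yaml.

# Illustrative sampling: keep the most frequent categories and cap products per category.
import pandas as pd

TOP_CATEGORIES = 100       # mirrors top_categories in experiment_config.yaml
IMAGES_PER_CATEGORY = 100  # mirrors images_per_category in experiment_config.yaml

meta = pd.read_csv('/output/project_data/meta_processed/train_meta.csv')

top = meta['category_id'].value_counts().head(TOP_CATEGORIES).index
sampled = (meta[meta['category_id'].isin(top)]
           .groupby('category_id', group_keys=False)
           .apply(lambda g: g.sample(min(len(g), IMAGES_PER_CATEGORY), random_state=1234)))

sampled.to_csv('/output/project_data/meta_processed/train_meta_sampled.csv', index=False)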

hyperparameter space search

If you would like to search the hyperparameter space, neptune can do this for you. Check out hyperparameter optimization.

training without neptune

We give you the option to run this code without neptune. The transition is seamless; just follow these steps:

  1. Download the competition data to some folder your_raw_data_dir

  2. Specify the data directories in data_config.yaml

  3. Run the Python code

      python run_manager.py run_pipeline

Final remarks

Please feel free to modify this code in order to improve your score: add new models, pre- and post-processing routines, or ensembling methods.

Have fun competing on this Kaggle challenge!
