SeuTao / Tgs Salt Identification Challenge 2018 _4th_place_solution

Kaggle TGS Salt Identification Challenge 2018 4th place code

Programming Languages

python

Labels

pytorch kaggle

Projects that are alternatives of or similar to Tgs Salt Identification Challenge 2018 4th place solution

Kaggle The Hunt for Prohibited Content

4th Place Solution for The Hunt for Prohibited Content Competition on Kaggle (http://www.kaggle.com/c/avito-prohibited-content)

Stars: ✭ 29 (-91.32%)

Mutual labels: kaggle

kernel-run

Run any Jupyter notebook instantly using Kaggle kernels

Stars: ✭ 59 (-82.34%)

Mutual labels: kaggle

Painters

🎨 Winning solution for the Painter by Numbers competition on Kaggle.

Stars: ✭ 257 (-23.05%)

Mutual labels: kaggle

Kaggle Competition

Summary of the Kaggle Stock Prediction Competition & my Trial

Stars: ✭ 78 (-76.65%)

Mutual labels: kaggle

sklearn-feature-engineering

Feature engineering with sklearn

Stars: ✭ 114 (-65.87%)

Mutual labels: kaggle

COVID-19-CaseStudy-and-Predictions

This repository is a case study, analysis and visualization of COVID-19 Pandemic spread along with prediction models.

Stars: ✭ 90 (-73.05%)

Mutual labels: kaggle

kaggle-statoil-iceberg-classifier-challenge

16th place solution of the Kaggle Statoil/C-CORE Iceberg Classifier Challenge

Stars: ✭ 15 (-95.51%)

Mutual labels: kaggle

Awesome Noah

Noah plan for the AI community: top reproducible solutions for AI data competitions (Awesome Top Solution List of Excellent AI Competitions)

Stars: ✭ 289 (-13.47%)

Mutual labels: kaggle

convolutedPredictions Cdiscount

2nd place solution to Kaggle's Cdiscount image classification challenge.

Stars: ✭ 17 (-94.91%)

Mutual labels: kaggle

Kaggle Carvana Image Masking Challenge

Stars: ✭ 256 (-23.35%)

Mutual labels: kaggle

ashrae-great-energy-predictor-3-solution-analysis

Analysis of the top five winning solutions of the ASHRAE Great Energy Predictor III competition

Stars: ✭ 44 (-86.83%)

Mutual labels: kaggle

PadhAI-A-First-Course-on-Deep-Learning-Detailed-Notes

Disclaimer: These are the detailed notes for the PadhAI onefourthlabs course "A First Course on Deep Learning".

Stars: ✭ 19 (-94.31%)

Mutual labels: kaggle

Credit-Card-Fraud

No description or website provided.

Stars: ✭ 17 (-94.91%)

Mutual labels: kaggle

kaggle-code

A repository for some of the code I used in kaggle data science & machine learning tasks.

Stars: ✭ 100 (-70.06%)

Mutual labels: kaggle

Argus Freesound

Kaggle | 1st place solution for Freesound Audio Tagging 2019

Stars: ✭ 265 (-20.66%)

Mutual labels: kaggle

kaggle-quora-question-pairs

My solution to Kaggle Quora Question Pairs competition (Top 2%, Private LB log loss 0.13497).

Stars: ✭ 104 (-68.86%)

Mutual labels: kaggle

game2vec

TensorFlow implementation of word2vec applied on https://www.kaggle.com/tamber/steam-video-games dataset, using both CBOW and Skip-gram.

Stars: ✭ 62 (-81.44%)

Mutual labels: kaggle

Open Solution Mapping Challenge

Open solution to the Mapping Challenge 🌎

Stars: ✭ 291 (-12.87%)

Mutual labels: kaggle

Pytorch Kaggle Starter

Pytorch starter kit for Kaggle competitions

Stars: ✭ 268 (-19.76%)

Mutual labels: kaggle

kaggler

🏁 API client for Kaggle

Stars: ✭ 50 (-85.03%)

Mutual labels: kaggle

Kaggle TGS Salt Identification Challenge 2018 4th place code

This is the source code for my part of the 4th place solution to the TGS Salt Identification Challenge hosted by Kaggle.com.

Recent Update

2018.11.06: Jigsaw Python code, rough code for the handcrafted rules, and pseudo-label training code updated.

2018.10.22: single model training code updated.

2018.10.20: We achieved 4th place in the Kaggle TGS Salt Identification Challenge.

Dependencies

opencv-python==3.4.2
scikit-image==0.14.0
scikit-learn==0.19.1
scipy==1.1.0
torch==0.3.1
torchvision==0.2.1

Solution Development

Single model design

input: 101×101 images randomly padded to 128×128, random left-right flip;
encoder: resnet34, se-resnext50, resnext101_ibna, se-resnet101, se-resnet152, se resnet154;
decoder: scse, hypercolumn (not used in networks with the resnext101_ibna or se_resnext101 backbone), ibn block, dropout;
Deep supervision structure with Lovasz softmax.
We designed 6 single models for the final submission.
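
The input step above (random padding to 128×128 plus a random left-right flip) can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the function name is hypothetical, and reflect padding is an assumption, since the README does not say which padding mode is used.

```python
import numpy as np

def random_pad_and_flip(image, mask, out_size=128, rng=None):
    """Place a 2D image and its mask at a random offset inside an
    out_size x out_size canvas, then flip both left-right half of the time."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    # Randomly split the total padding between the two sides of each axis.
    top = int(rng.integers(0, out_size - h + 1))
    left = int(rng.integers(0, out_size - w + 1))
    pad = ((top, out_size - h - top), (left, out_size - w - left))
    # Assumption: reflect padding; the padding mode is not stated in the README.
    image = np.pad(image, pad, mode="reflect")
    mask = np.pad(mask, pad, mode="reflect")
    # Apply the same flip to image and mask so they stay aligned.
    if rng.random() < 0.5:
        image = image[:, ::-1].copy()
        mask = mask[:, ::-1].copy()
    return image, mask
```

The image and mask share the same offset and flip decision, which is what keeps the label aligned with the pixels it describes.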

Single model performance

single model (10-fold, 7-cycle)	valid LB	public LB	private LB
model_50	0.873	0.873	0.891
model_50_slim	0.871	0.872	0.891
model_101A	0.868	0.870	0.889
model_101B	0.870	0.871	0.891
model_152	0.868	0.869	0.888
model_154	0.869	0.871	0.890

Model ensemble performance

ensemble model (cycle voting)	public LB	private LB
50+50_slim	0.873	0.891
50+50_slim+101B	0.873	0.892
50+50_slim+101A	0.873	0.892
50+50_slim+101A+101B	0.874	0.892
50+50_slim+101A+101B+154	0.874	0.892
50+50_slim+101A+101B+152+154	0.874	0.892
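
As a rough illustration of the voting used for the ensembles above, the sketch below applies a pixel-wise majority vote to binary masks. It assumes each voter (for example one cycle of one model) has already been thresholded to a 0/1 mask, and that ties count as background; the function name is hypothetical and the repository may combine predictions differently.

```python
import numpy as np

def majority_vote(masks):
    """Pixel-wise majority vote over binary masks of identical shape."""
    stack = np.stack([np.asarray(m, dtype=np.uint8) for m in masks])
    votes = stack.sum(axis=0)
    # A pixel is marked as salt only when strictly more than half of the
    # voters mark it (assumption: ties resolve to background).
    return (votes * 2 > len(masks)).astype(np.uint8)
```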

Post processing

Based on the 2D and 3D jigsaw results, we applied around 10 handcrafted rules that gave a 0.010~0.011 public LB boost and a 0.001 private LB boost.

model	public LB	private LB
50+50_slim+101A+101B with post processing	0.884	0.893

Data distill (Pseudo Labeling)

We started this part midway through the competition. Pseudo labeling is tricky and carries a risk of overfitting, so I was not sure whether it would boost the private LB until the results were published. I am only posting our results here; the implementation details will be updated. Steps (as shown in the flow chart):

1. Take the pseudo labels from the previous predictions (with post-processing).
2. Randomly split the test set into two parts, one for training and the other for predicting.
3. To prevent overfitting to the pseudo labels, randomly select images from the training set or the test set (one part) with equal probability in each mini-batch.
4. Train three different networks on the new dataset, following the same steps as described previously.
5. Predict the test set (the other part) with all three trained models and vote on the result.
6. Repeat steps 3 to 5 with the two test parts swapped.
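
The test-set split and the mixed mini-batch sampling described in the steps above might look roughly like this. It is a sketch under stated assumptions: both function names are hypothetical, and drawing each sample from either source with probability 0.5 is one reading of "same probability in each mini batch"; the repository may sample differently.

```python
import random

def split_test_ids(test_ids, seed=0):
    """Randomly split the test ids into two halves (part 0 and part 1)."""
    ids = list(test_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]

def sample_mixed_batch(train_ids, pseudo_ids, batch_size, rng):
    """Fill a mini-batch by drawing each sample from the labeled training
    set or from one pseudo-labeled test part with equal probability."""
    batch = []
    for _ in range(batch_size):
        source = train_ids if rng.random() < 0.5 else pseudo_ids
        batch.append(rng.choice(source))
    return batch
```

One model would then be trained with part 0 as the pseudo-labeled source and used to predict part 1, and a second run would swap the two parts.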

model with data distillation	public LB	private LB	placement
model_34	0.877	0.8931	8
model_50	0.880	0.8939	8
model_101	0.880	0.8946	7
model 34+50+101	0.879	0.8947	6
model_34 with post processing	0.885	0.8939	8
model_50 with post processing	0.886	0.8948	5
model_101 with post processing	0.886	0.8950	5
model 34+50+101 with post processing (final sub)	0.887	0.8953	4

Data Setup

Save the training mask images to disk:

python prepare_data.py

Single Model Training

Train model_34 on fold 0:

CUDA_VISIBLE_DEVICES=0 python train.py --mode=train --model=model_34 --model_name=model_34 --train_fold_index=0

Predict with model_34 on all folds:

CUDA_VISIBLE_DEVICES=0 python predict.py --mode=InferModel10Fold --model=model_34 --model_name=model_34
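
The training command above covers a single fold, while the prediction mode works on all ten. A small helper like the one below could generate the training command for every fold. It only assembles the flags shown above; the helper name is hypothetical, and fold indices 0 to 9 are an assumption based on the 10-fold setup.

```python
def train_command(model, fold, gpu=0):
    """Build the single-fold training command with the flags shown above."""
    return (
        f"CUDA_VISIBLE_DEVICES={gpu} python train.py --mode=train "
        f"--model={model} --model_name={model} --train_fold_index={fold}"
    )

# Assumption: the ten folds are indexed 0..9.
commands = [train_command("model_34", fold) for fold in range(10)]
for command in commands:
    print(command)
```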

Ensemble and Jigsaw Post-processing

After generating the 10-fold test CSVs for all 6 single models, use these two commands to perform majority voting and post-processing.

a) Solve the jigsaw map (only needs to be run once)

python predict.py --mode=SolveJigsawPuzzles

b) Ensemble all cycles of the 6 models and apply post-processing. 'model_name_list' is the list of single model names you trained with the command above.

python predict.py --mode=EnsembleModels --model_name_list=model_50A,model_50A_slim,model_101A,model_101B,model_152,model_154 --save_sub_name=6_model_ensemble.csv

You'll get the ensemble submission '6_model_ensemble.csv' and the ensemble+jigsaw submission '6_model_ensemble-vertical-empty-smooth.csv'.

Pseudo label training

After you get the ensemble+jigsaw results, use the commands below to train with pseudo labels. We randomly split the test set into two parts. For each model, we train twice, each time with 50% of the pseudo labels.

Train model_34 with pseudo labels from the 6-model output:

a) part0 fold 0

python train.py --mode=train --model=model_34 --model_name=model_34_pseudo_part0 --pseudo_csv=6_model_ensemble-vertical-empty-smooth.csv --pseudo_split=0 --train_fold_index=0

b) part1 fold 0

python train.py --mode=train --model=model_34 --model_name=model_34_pseudo_part1 --pseudo_csv=6_model_ensemble-vertical-empty-smooth.csv --pseudo_split=1 --train_fold_index=1

Final Ensemble

python predict.py --mode=EnsembleModels --model_name_list=model_34_pseudo_part0,model_34_pseudo_part1,model_50A_slim_pseudo_part0,model_50A_slim_pseudo_part1,model_101A_pseudo_part0,model_101A_pseudo_part1 --save_sub_name=final_sub.csv

The "final_sub-vertical-empty-smooth.csv" is all you need.

Reference

https://arxiv.org/abs/1608.03983 LR schedule
https://arxiv.org/abs/1803.02579 Squeeze and excitation
https://arxiv.org/abs/1411.5752 Hypercolumns
https://arxiv.org/abs/1705.08790 Lovasz
https://arxiv.org/abs/1712.04440 Data distillation

Stars: ✭ 334
