neptune-ai / Open Solution Mapping Challenge

Licence: mit

Open solution to the Mapping Challenge 🌎

Programming Languages

python

Labels

deep-learning machine-learning data-science pipeline kaggle unet satellite-imagery lightgbm competition

Open Solution to the Mapping Challenge Competition

More competitions 🎇

Check the collection of public projects 🎁, where you can find multiple Kaggle competitions with code, experiments and outputs.

Poster 🌍

A poster that summarizes our project is available here.

Intro

Open solution to the CrowdAI Mapping Challenge competition.

Check the live preview of our work on the public projects page: Mapping Challenge 📈.
Source code and issues are publicly available.

Results

0.943 Average Precision 🚀

0.954 Average Recall 🚀

No cherry-picking here, I promise 😉. The results exceeded our expectations. The output from the network is so good that not many morphological shenanigans are needed. Happy days :)

Average Precision and Average Recall were calculated on stage 1 data using pycocotools. Check this blog post for an explanation of average precision.

Disclaimer

In this open source solution you will find references to neptune.ai. It is a free platform for community users, which we use daily to keep track of our experiments. Please note that using neptune.ai is not necessary to proceed with this solution. You may run it as a plain Python script 😉.

Reproduce it!

Check REPRODUCE_RESULTS

Solution write-up

Pipeline diagram

Preprocessing

✔️ What Worked

Overlay binary masks are produced for each image (code 💻).
Distances to the two closest objects are calculated, creating the distance map that is used for weighting (code 💻).
Size masks are produced for each image (code 💻).
Dropped small masks on the edges (code 💻).
We load training and validation data in batches: using torch.utils.data.Dataset and torch.utils.data.DataLoader makes it easy and clean (code 💻).
Only some basic augmentations (due to speed constraints) from the imgaug package are applied to images (code 💻).
Image is resized before feeding it to the network. Surprisingly this worked better than cropping (code 💻 and config 📑).
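
The distance-map step above can be sketched in NumPy as follows. This is a brute-force illustration for small images, not the repository's code: `distance_to_object` and `two_closest_distances` are hypothetical names, and the sketch assumes at least two non-empty object masks per image.

```python
from typing import Sequence, Tuple

import numpy as np


def distance_to_object(mask: np.ndarray) -> np.ndarray:
    """Euclidean distance from every pixel to the nearest pixel of one object."""
    h, w = mask.shape
    ys, xs = np.nonzero(mask)                # coordinates of the object's pixels
    grid_y, grid_x = np.mgrid[0:h, 0:w]      # coordinates of all pixels
    dy = grid_y[..., None] - ys              # shape (h, w, n_object_pixels)
    dx = grid_x[..., None] - xs
    return np.sqrt(dy ** 2 + dx ** 2).min(axis=-1)


def two_closest_distances(masks: Sequence[np.ndarray]) -> Tuple[np.ndarray, np.ndarray]:
    """Per-pixel distances to the closest (d1) and second closest (d2) object."""
    stacked = np.stack([distance_to_object(m) for m in masks])  # (n_objects, h, w)
    stacked.sort(axis=0)                     # smallest distances first, per pixel
    return stacked[0], stacked[1]
```

Pixels squeezed between two buildings have small `d1` and small `d2`, which is what lets a weight map built from these distances emphasize narrow gaps. For full-size images a proper distance transform would be far faster than this pairwise computation.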

✖️ What didn't Work

Ground truth masks are prepared by first eroding them per mask, creating non-overlapping masks, and only after that are the distances calculated (code 💻).
Dilated small objects to increase the signal (code 💻).
Network is fed with random crops (code 💻 and config 📑).

🤔 What could have worked but we haven't tried it

Ground truth masks for overlapping contours (the DSB-2018 winners' approach).

Network

✔️ What Worked

Unet with Resnet34, Resnet101 and Resnet152 as an encoder where Resnet101 gave us the best results. This approach is explained in the TernausNetV2 paper (our code 💻 and config 📑). Also take a look at our parametrizable implementation of the U-Net.

✖️ What didn't Work

Network architecture based on dilated convolutions described in this paper.

🤔 What could have worked but we haven't tried it

Unet with contextual blocks explained in this paper.

Loss function

✔️ What Worked

Distance weighted cross entropy explained in the famous U-Net paper (our code 💻 and config 📑).
Using a linear combination of soft dice and distance weighted cross entropy (code 💻 and config 📑).
Adding a component weighted by building size (smaller buildings have greater weight) to the weighted cross entropy, which penalizes misclassification of pixels belonging to small objects (code 💻).
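
A minimal NumPy sketch of such a combined loss is shown below. It is an illustration under stated assumptions rather than the repository's implementation: it treats the task as binary (building vs. background), sums the distance and size weight maps on top of a base weight of 1, and uses hypothetical names (`combined_loss`, `ce_weight`). Only `dice_weight` appears in the original configuration.

```python
import numpy as np


def weighted_cross_entropy(probs, targets, weights, eps=1e-7):
    """Pixel-wise binary cross entropy, scaled by a per-pixel weight map."""
    probs = np.clip(probs, eps, 1.0 - eps)
    ce = -(targets * np.log(probs) + (1.0 - targets) * np.log(1.0 - probs))
    return float((weights * ce).mean())


def soft_dice_loss(probs, targets, smooth=1.0):
    """1 - soft Dice coefficient, computed on probabilities instead of hard labels."""
    intersection = (probs * targets).sum()
    dice = (2.0 * intersection + smooth) / (probs.sum() + targets.sum() + smooth)
    return float(1.0 - dice)


def combined_loss(probs, targets, distance_weights, size_weights,
                  ce_weight=1.0, dice_weight=0.5):
    """Linear combination of weighted cross entropy and soft Dice loss."""
    # Assumption: weight maps are added to a base weight of 1; the actual
    # repository may combine them differently.
    weights = 1.0 + distance_weights + size_weights
    ce = weighted_cross_entropy(probs, targets, weights)
    dice = soft_dice_loss(probs, targets)
    return ce_weight * ce + dice_weight * dice
```

Raising `dice_weight` shifts the objective toward region overlap and away from per-pixel errors, which is consistent with the later training stage that uses a larger Dice weight to smooth results.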

Weights visualization

For both weights: the darker the color, the higher the value.

distance weights: high values correspond to pixels between buildings.
size weights: high values denote small buildings (the smaller the building, the darker the color). Note that no-building is fixed to black.
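
For reference, the distance weight map in the U-Net paper has the form below, where $w_c$ is a class-balancing weight, $d_1$ and $d_2$ are the distances from pixel $\mathbf{x}$ to the border of the nearest and second nearest object, and $w_0$ and $\sigma$ are constants (the paper uses $w_0 = 10$ and $\sigma \approx 5$ pixels). Whether this solution uses the same constants is not stated here; the config file is the place to check.

```latex
w(\mathbf{x}) = w_c(\mathbf{x}) + w_0 \cdot \exp\left( -\frac{\bigl(d_1(\mathbf{x}) + d_2(\mathbf{x})\bigr)^2}{2\sigma^2} \right)
```

The exponential term is close to $w_0$ only where both distances are small, which is why the highest weights fall on the thin strips of background between neighbouring buildings.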

Training

✔️ What Worked

Use pretrained models!
Our multistage training procedure:
1. train on a 50,000-example subset of the dataset with lr=0.0001 and dice_weight=0.5
2. train on the full dataset with lr=0.0001 and dice_weight=0.5
3. train with smaller lr=0.00001 and dice_weight=0.5
4. increase dice weight to dice_weight=5.0 to make results smoother
Multi-GPU training
Use very simple augmentations
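
The multistage procedure above could be expressed as a small schedule, sketched here in plain Python. The `Stage` class, `STAGES` list and `run_schedule` helper are hypothetical and do not come from the repository. Two details are assumptions: that stage 3 runs on the full dataset, and that stage 4 keeps the learning rate of stage 3, since the list does not state either.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Stage:
    name: str
    lr: float
    dice_weight: float
    subset_size: Optional[int] = None  # None means the full dataset


STAGES: List[Stage] = [
    Stage("subset warm-up", lr=1e-4, dice_weight=0.5, subset_size=50_000),
    Stage("full dataset", lr=1e-4, dice_weight=0.5),
    Stage("lower learning rate", lr=1e-5, dice_weight=0.5),
    # Assumption: lr is carried over from the previous stage.
    Stage("smoothing", lr=1e-5, dice_weight=5.0),
]


def run_schedule(train_stage: Callable[[Stage, Optional[object]], object]) -> object:
    """Run all stages in order, passing each stage's checkpoint to the next."""
    checkpoint = None
    for stage in STAGES:
        checkpoint = train_stage(stage, checkpoint)
    return checkpoint
```

Here `train_stage` stands for whatever function trains the model for one stage and returns a checkpoint; each stage starts from the weights produced by the previous one.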

The entire configuration can be tweaked from the config file 📑.

🤔 What could have worked but we haven't tried it

Set different learning rates to different layers.
Use cyclic optimizers.
Use warm start optimizers.

Postprocessing

✔️ What Worked

Test time augmentation (TTA). Make predictions on image rotations (90, 180 and 270 degrees) and flips (up-down, left-right) and take the geometric mean of the predictions (code 💻 and config 📑).
Simple morphological operations. At the beginning we used erosion followed by labeling and per-label dilation, with structuring elements chosen by cross-validation. As the models got better, erosion was removed and a very small dilation was the only operation showing improvements (code 💻).
Scoring objects. In the beginning we simply used a score of 1.0 for every object, which was a huge mistake. Changing that to the average probability over the object region improved results. What improved scores even more was weighting those probabilities by the object size (code 💻).
Second level model. We tried LightGBM and Random Forest trained on U-Net outputs and features calculated during postprocessing.
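
The test time augmentation and object scoring ideas above can be sketched in NumPy as follows. This is a simplified illustration, not the repository's code: `predict_fn` is a hypothetical callable that maps an image to a per-pixel building probability map of the same height and width, and `object_score` shows only the plain average probability, because the exact form of the size weighting is not described here.

```python
import numpy as np

# Each entry pairs a transform with its inverse so predictions can be mapped back.
TRANSFORMS = [
    (lambda x: x, lambda x: x),
    (lambda x: np.rot90(x, 1), lambda x: np.rot90(x, -1)),
    (lambda x: np.rot90(x, 2), lambda x: np.rot90(x, -2)),
    (lambda x: np.rot90(x, 3), lambda x: np.rot90(x, -3)),
    (np.flipud, np.flipud),
    (np.fliplr, np.fliplr),
]


def tta_predict(predict_fn, image, eps=1e-7):
    """Geometric mean of predictions made on rotated and flipped copies of the image."""
    log_probs = []
    for forward, inverse in TRANSFORMS:
        prob = inverse(predict_fn(forward(image)))   # back to the original orientation
        log_probs.append(np.log(np.clip(prob, eps, 1.0)))
    return np.exp(np.mean(log_probs, axis=0))        # exp(mean(log p)) = geometric mean


def object_score(prob_map, object_mask):
    """Average predicted probability over the pixels of one object."""
    return float(prob_map[object_mask].mean())
```

Compared with an arithmetic mean, the geometric mean is pulled down strongly by any single low prediction, so a pixel is kept as building only when the augmented views broadly agree.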

✖️ What didn't Work

Test time augmentations by using colors (config 📑).
Inference on reflection-padded images was not the way to go. What worked better (but not for the very best models) was replication padding, where the border pixel value was replicated across the padded region (code 💻).
Conditional Random Fields. It was so slow that we didn't check it for the best models (code 💻).
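
To make the difference between the two padding schemes concrete, here is a small NumPy sketch. `pad_image` and `crop_prediction` are hypothetical helper names; `np.pad` with `mode="edge"` performs replication padding and `mode="reflect"` performs reflection padding.

```python
import numpy as np


def pad_image(image, pad, mode="edge"):
    """Pad the two spatial axes; any trailing channel axis is left untouched."""
    pad_width = [(pad, pad), (pad, pad)] + [(0, 0)] * (image.ndim - 2)
    return np.pad(image, pad_width, mode=mode)


def crop_prediction(pred, pad):
    """Remove the padded border from a prediction made on a padded image."""
    return pred[pad:-pad, pad:-pad] if pad > 0 else pred


image = np.arange(1, 10).reshape(3, 3)
replicated = pad_image(image, 1, mode="edge")     # border pixels are repeated
reflected = pad_image(image, 1, mode="reflect")   # content is mirrored at the border
```

With replication padding the padded region is flat, while reflection padding mirrors image content (including partial buildings) into the border.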

🤔 What could have worked but we haven't tried it

Ensembling
Recurrent neural networks for postprocessing (instead of our current approach)

Model Weights

Model weights for the winning solution are available here

You can use those weights and run the pipeline as explained in REPRODUCE_RESULTS.

User support

There are several ways to seek help:

crowdai discussion.
You can submit an issue directly in this repo.
Join us on Gitter.

Contributing

Check CONTRIBUTING for more information.
Check the issues to see if there is something you would like to contribute to.

Stars: ✭ 291