ternaus / Kaggle_dstl_submission

Licence: MIT
Code for a winning model (3rd place out of 419) in the Dstl Satellite Imagery Feature Detection challenge

THIS REPO IS UNMAINTAINED AND MOSTLY REMAINS AS A HISTORIC ARTIFACT.

Please take a look at the versions of the software we used to generate our results: they are ancient now. Do not expect this code to run on Keras above version 1.2.2, and so on. It will not run out of the box, and we will not answer issues about it, sorry. If you manage to run this code on newer versions, please feel free to open a pull request; we will merge it for the public good.

Take a look at https://github.com/ternaus/TernausNet if you need a more up-to-date segmentation solution.

Winning Model Documentation

Name: Vladimir Iglovikov

LinkedIn: https://www.linkedin.com/in/iglovikov/

Location: San Francisco, United States

Name: Sergey Mushinskiy

LinkedIn: https://www.linkedin.com/in/sergeymushinskiy/

Location: Angarsk, Russia

Competition: Dstl Satellite Imagery Feature Detection

Blog post: http://blog.kaggle.com/2017/05/09/dstl-satellite-imagery-competition-3rd-place-winners-interview-vladimir-sergey/

If you find this code useful for your publications, please consider citing:

@article{DBLP:journals/corr/IglovikovMO17,
  author    = {Vladimir Iglovikov and
               Sergey Mushinskiy and
               Vladimir Osin},
  title     = {Satellite Imagery Feature Detection using Deep Convolutional Neural
               Network: A Kaggle Competition},  
  volume    = {abs/1706.06169},
  year      = {2017},  
  archivePrefix = {arXiv},
  eprint    = {1706.06169},     
}

Prerequisites

To train the final models you will need the following:

  • OS: Ubuntu 16.04 (although the code was also run successfully on Windows 10)
  • Required hardware:
    • Any decent modern computer with an x86-64 CPU
    • A fair amount of RAM (we had about 32 GB and 128 GB in our boxes; however, not all memory was used)
    • A powerful GPU: we used an Nvidia Titan X (Pascal) with 12 GB of memory and an Nvidia GeForce GTX 1080 with 8 GB of memory

Main software for training neural networks:

  • Python 2.7 (preferable and fully tested) or Python 3.5
  • Keras 1.2.2
  • Theano 0.9.0rc1

Utility packages for geometry and image manipulation and other helper functions:

  • h5py
  • matplotlib
  • numba
  • numpy
  • pandas
  • rasterio
  • Shapely
  • scikit_image
  • tifffile
  • OpenCV
  • tqdm

  1. Install the required OS and Python
  2. Install the packages with pip install -r requirements.txt
  3. Create the following directory structure:
  • Data structure:
data / three_band / *
     / sixteen_band / *
    grid_sizes.csv
    train_wkt_v4.csv
  • Source code:
src / *.py
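
Before starting the long-running steps, it can help to confirm that the layout is in place. The following is a minimal sketch, not part of the original repository: the function name missing_paths is hypothetical, and it assumes the image directories are named three_band and sixteen_band, as in the Kaggle data download, with both CSV files placed directly under data.

```python
from __future__ import print_function

import os

# Expected layout, based on the directory listing above (assumed names).
REQUIRED_DIRS = ["data/three_band", "data/sixteen_band", "src"]
REQUIRED_FILES = ["data/grid_sizes.csv", "data/train_wkt_v4.csv"]


def missing_paths(root):
    """Return the expected directories and files that are absent under `root`."""
    missing = [d for d in REQUIRED_DIRS
               if not os.path.isdir(os.path.join(root, d))]
    missing += [f for f in REQUIRED_FILES
                if not os.path.isfile(os.path.join(root, f))]
    return missing


if __name__ == "__main__":
    problems = missing_paths(".")
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("Directory layout looks complete.")
```

Run from the directory that contains data and src, it prints whichever expected paths are absent.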

Prepare data for training:

  1. Run python get_3_band_shapes.py
  2. Run python cache_train.py

Train models

Each class in our solution has a separate neural network, so training requires running several distinct models one by one (or in parallel if there are enough computing resources).

  1. Run python unet_buidings.py
  2. Run python unet_structures.py
  3. Run python unet_road.py
  4. Run python unet_track.py
  5. Run python unet_trees.py
  6. Run python unet_crops.py

For water predictions we used a different method; they can be created by running:

  1. Run python fast_water.py
  2. Run python slow_water.py

Training may take quite a long time, depending on the hardware used; in our case it was about 7 hours for each stage (50 epochs). After training finishes, the trained weights and model architectures are saved in the cache directory and can be used by the prediction scripts (see the next section).
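
Since each U-Net script runs for hours, the six training runs listed above can be chained so that they execute unattended, one after another. This is a minimal sketch rather than something shipped with the repository: build_commands and run_all are hypothetical helper names, and it assumes the scripts are started from inside the src directory.

```python
from __future__ import print_function

import subprocess
import sys

# Script names copied from the training steps above.
TRAINING_SCRIPTS = [
    "unet_buidings.py",
    "unet_structures.py",
    "unet_road.py",
    "unet_track.py",
    "unet_trees.py",
    "unet_crops.py",
]


def build_commands(scripts, python=sys.executable):
    """Return one [interpreter, script] command per training script."""
    return [[python, script] for script in scripts]


def run_all(commands, cwd="src"):
    """Run the commands sequentially, stopping at the first failure."""
    for command in commands:
        print("Running:", " ".join(command))
        # check_call raises CalledProcessError on a non-zero exit code.
        subprocess.check_call(command, cwd=cwd)
```

Calling run_all(build_commands(TRAINING_SCRIPTS)) from the repository root would then train the classes in order. The two water scripts could be appended to the list in the same way.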

Create predictions

To create predictions, run every make_prediction_cropped_*.py file in the src directory. Generating all predictions can take a considerable amount of time, because the test set is large, we use a separate model for each class, and we apply test-time augmentation and cropping for the best model performance. On a Titan X GPU each class took about 5 hours to predict.

  1. Run python make_prediction_cropped_buildings.py
  2. Run python make_prediction_cropped_structures.py
  3. Run python make_prediction_cropped_track.py
  4. Run python make_prediction_cropped_road.py
  5. Run python make_prediction_cropped_trees.py
  6. Run python make_prediction_cropped_crops.py
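
The test-time augmentation mentioned above can be illustrated in general terms: predict on several transformed copies of an image, undo each transform on the output, and average. The sketch below assumes flips as the augmentations, which this README does not specify, and predict_with_flips is a hypothetical name, not a function from the prediction scripts.

```python
import numpy as np


def predict_with_flips(predict, image):
    """Average predictions over the original image and two flipped copies.

    `predict` is any callable that maps an (H, W) or (H, W, C) array to an
    (H, W) probability map.
    """
    # Each pair is (transform applied to the input, inverse applied to the
    # output). A flip is its own inverse.
    views = [
        (lambda a: a, lambda a: a),  # original image
        (np.fliplr, np.fliplr),      # left-right flip
        (np.flipud, np.flipud),      # up-down flip
    ]
    outputs = [undo(predict(apply(image))) for apply, undo in views]
    return np.mean(outputs, axis=0)
```

Averaging several views costs one forward pass per view, which is part of why prediction takes hours per class.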

When all predictions are done, they should be merged into a single file for submission:

  1. Run python merge_predictions.py

This step creates the file joined.csv, which just merges the per-class predictions into the unified format.

The last step in the pipeline is to:

  1. Run python post_processing.py joined.csv

This performs some cleaning of the overlapping classes (removes predictions of the slow water from fast water, all other predictions from buildings, etc.).

  • Done!
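
The overlap cleaning done in post-processing can be pictured on raster masks. The sketch below is an illustration only: post_processing.py works on joined.csv rather than on masks, and remove_overlap is a hypothetical helper. It clears every pixel of one class that is also claimed by another, as in removing slow water predictions from fast water.

```python
import numpy as np


def remove_overlap(target, *others):
    """Clear pixels in `target` that are also set in any of `others`."""
    cleaned = target.astype(bool)  # astype returns a new array
    for other in others:
        cleaned &= ~other.astype(bool)
    return cleaned


# Example: the fast water mask loses the pixel it shares with slow water.
fast_water = np.array([[1, 1], [0, 1]])
slow_water = np.array([[0, 1], [0, 0]])
fast_water_clean = remove_overlap(fast_water, slow_water)
```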

Remarks

Please keep in mind that this isn't production-ready code but a very specific solution for a particular competition, created in a short time frame and under many other constraints (limited training data, scarce computing resources and a small number of attempts to check for improvements).

So there are a lot of hardcoded magic numbers and strings, and there may be some inconsistencies and differences between the models. Sometimes this was intended to get more accurate predictions, and there weren't enough resources to check whether changes introduced for some classes would also improve the score for others. Sometimes it just slipped from our attention.

Also, the inherent stochasticity of neural network training on many different levels (random initialization of weights, random cropping of patches into minibatches, and so on) makes it impossible to reproduce the exact submission from scratch. We went the extra mile and reimplemented the solution and training procedure from scratch as much as possible in the last two weeks after the competition final. We got up to 20% extra performance for some classes with abundant training data, like buildings, tracks and so on. However, some classes proved more difficult to reproduce reliably because of the lack of training data and the small amount of time. Such classes show high variance of results between epochs. For the competition we used our best-performing combinations of epoch and model for those classes, which may not be exactly the same as models trained for a fixed number of epochs (as in this particular code). However, we believe that our model is equally capable of segmenting any class, given enough data and/or a clear definition of what exactly constitutes each class (it wasn't clear how segmentation was performed in the first place for some classes, like road/tracks).
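
To make one of these sources of randomness concrete, here is a minimal sketch of random patch cropping. It is not the sampling code used in the training scripts, and sample_patch is a hypothetical name. Passing a seeded generator makes this single step repeatable, but weight initialization and GPU-level nondeterminism would still vary between runs.

```python
import numpy as np


def sample_patch(image, size, rng):
    """Cut a random size x size patch from an (H, W) or (H, W, C) array."""
    height, width = image.shape[:2]
    # randint's upper bound is exclusive, so add 1 to allow the last offset.
    top = rng.randint(0, height - size + 1)
    left = rng.randint(0, width - size + 1)
    return image[top:top + size, left:left + size]


# Two generators with the same seed produce the same patch.
image = np.arange(100).reshape(10, 10)
patch_a = sample_patch(image, 4, np.random.RandomState(0))
patch_b = sample_patch(image, 4, np.random.RandomState(0))
```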
