
emredog / FCNN-example

Licence: other
This is a fully convolutional neural net exercise to detect houses from aerial images.

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to FCNN-example

Cascaded Fcn
Source code for the MICCAI 2016 paper "Automatic Liver and Lesion Segmentation in CT Using Cascaded Fully Convolutional Neural Networks and 3D Conditional Random Fields"
Stars: ✭ 296 (+957.14%)
Mutual labels:  semantic-segmentation, fully-convolutional-networks
3dunet abdomen cascade
Stars: ✭ 91 (+225%)
Mutual labels:  semantic-segmentation, fully-convolutional-networks
Pytorch Unet
Simple PyTorch implementations of U-Net/FullyConvNet (FCN) for image segmentation
Stars: ✭ 470 (+1578.57%)
Mutual labels:  semantic-segmentation, fully-convolutional-networks
Vnet.pytorch
A PyTorch implementation for V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation
Stars: ✭ 506 (+1707.14%)
Mutual labels:  semantic-segmentation, fully-convolutional-networks
Pytorch Semantic Segmentation
PyTorch for Semantic Segmentation
Stars: ✭ 1,580 (+5542.86%)
Mutual labels:  semantic-segmentation, fully-convolutional-networks
Pytorch Semseg
Semantic Segmentation Architectures Implemented in PyTorch
Stars: ✭ 3,180 (+11257.14%)
Mutual labels:  semantic-segmentation, fully-convolutional-networks
Keras Icnet
Keras implementation of Real-Time Semantic Segmentation on High-Resolution Images
Stars: ✭ 85 (+203.57%)
Mutual labels:  semantic-segmentation, fully-convolutional-networks
Fcn Pytorch
🚘 Easiest Fully Convolutional Networks
Stars: ✭ 278 (+892.86%)
Mutual labels:  semantic-segmentation, fully-convolutional-networks
semantic segmentation
Semantically segment the road in the given image.
Stars: ✭ 91 (+225%)
Mutual labels:  semantic-segmentation, fully-convolutional-networks
Semanticsegpapercollection
Stars: ✭ 102 (+264.29%)
Mutual labels:  semantic-segmentation, fully-convolutional-networks
Segmentation
Tensorflow implementation : U-net and FCN with global convolution
Stars: ✭ 101 (+260.71%)
Mutual labels:  semantic-segmentation, fully-convolutional-networks
Fashion-Clothing-Parsing
FCN, U-Net models implementation in TensorFlow for fashion clothing parsing
Stars: ✭ 29 (+3.57%)
Mutual labels:  semantic-segmentation, fully-convolutional-networks
Odm
A command line toolkit to generate maps, point clouds, 3D models and DEMs from drone, balloon or kite images. 📷
Stars: ✭ 3,340 (+11828.57%)
Mutual labels:  drone, aerial-imagery
atomai
Deep and Machine Learning for Microscopy
Stars: ✭ 77 (+175%)
Mutual labels:  semantic-segmentation, fully-convolutional-networks
label-studio-frontend
Data labeling react app that is backend agnostic and can be embedded into your applications — distributed as an NPM package
Stars: ✭ 230 (+721.43%)
Mutual labels:  semantic-segmentation
InstantDL
InstantDL: An easy and convenient deep learning pipeline for image segmentation and classification
Stars: ✭ 33 (+17.86%)
Mutual labels:  semantic-segmentation
Semantic-Mono-Depth
Geometry meets semantics for semi-supervised monocular depth estimation - ACCV 2018
Stars: ✭ 98 (+250%)
Mutual labels:  semantic-segmentation
Context-Aware-Consistency
Semi-supervised Semantic Segmentation with Directional Context-aware Consistency (CVPR 2021)
Stars: ✭ 121 (+332.14%)
Mutual labels:  semantic-segmentation
zubax gnss
Zubax GNSS module
Stars: ✭ 45 (+60.71%)
Mutual labels:  drone
night image semantic segmentation
[ICIP 2019] : This is the official github repository for the paper "What's There in The Dark" accepted in IEEE International Conference in Image Processing 2019 (ICIP19) , Taipei, Taiwan.
Stars: ✭ 25 (-10.71%)
Mutual labels:  semantic-segmentation

FCNN Example

This was an exercise for a job application (I will not disclose the company's name).

The goal was to overfit to a single given image, a large aerial image along with its ground truth, so that houses can be detected.

The network architecture was given, so no flexibility there.

This repo contains PyTorch code and other material describing my approach to the given problem.

Note that the custom cross-entropy function is in fact unnecessary, and will be removed in the next version.

Repo Structure

  • images folder contains the sample image and its ground truth
  • plots folder contains loss and score plots of a trained model
  • predicted_images folder contains images marked with the resulting prediction of a trained model
  • weights folder contains model weights, where a model may usually have multiple checkpoints
  • FCNN.py is the fully convolutional neural net model as defined in the exercise. Upscaling is fixed, and nearest-neighbor interpolation is used by default.
  • FCNN2.py is almost the same model as defined in FCNN.py, but with a learnable upscaling layer, i.e. transposed convolution, following [1] (see the upscaling sketch just after this list).
  • train.py is the script to train the network with the given parameters for a given number of epochs. It also:
    • Saves a plot of the training loss per iteration
    • Saves a plot of the model scores (see below for details) per epoch
    • Saves the sets of model weights with the best scores, as well as the last set of weights.
  • predict.py is the simplest script of all: it simply predicts on the data provided by a given dataloader, using a given trained model
  • util.py is a collection of helper functions and some constants to make life easier. Basically, it has functions for
    • Loading the image (and dividing it into patches),
    • Analyzing the patches (eliminating blank patches, or patches with low information) as well as other image operations (data augmentation, preprocessing, etc.),
    • Calculating scores for given predictions,
    • Saving the segmentation result upon the image to have a nice image at the end.
  • main.py is the script used in the experiments to try out different hyperparameters in a loop. It also demonstrates the intended usage of train.py and predict.py.
  • CrossEntropyLoss2d.py is a custom loss function obtained from here. Although PyTorch's built-in cross-entropy loss supports tensors of any size, my experiments initially suggested that the custom loss performed better; in fact, both yield the exact same result (a quick numerical check is sketched below).
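
To see that the two losses agree, one can compare them on random tensors. A minimal check (shapes are illustrative; the custom loss is reproduced here as log-softmax followed by NLL, which is how such CrossEntropyLoss2d-style losses are typically implemented):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(1, 2, 8, 8)          # N x C x H x W class scores
target = torch.randint(0, 2, (1, 8, 8))   # N x H x W ground-truth labels

# CrossEntropyLoss2d-style: log-softmax over channels, then NLL loss
custom = F.nll_loss(F.log_softmax(logits, dim=1), target)
builtin = nn.CrossEntropyLoss()(logits, target)

assert torch.allclose(custom, builtin)    # both yield the exact same result
```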

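The essential difference between the two model files is the upscaling layer. A minimal sketch of the two variants (layer shapes are illustrative; the actual architectures live in FCNN.py and FCNN2.py):

```python
import torch.nn as nn

# FCNN-style: fixed upscaling with nearest-neighbor interpolation
# (no learnable parameters)
fixed_upscale = nn.Upsample(scale_factor=2, mode='nearest')

# FCNN2-style: learnable upscaling via transposed convolution, following [1]
learnable_upscale = nn.ConvTranspose2d(in_channels=2, out_channels=2,
                                       kernel_size=2, stride=2)
```

Both double the spatial resolution of a 2-channel feature map; only the second has weights the optimizer can train.
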
Performance Metrics

Initially, only "classification accuracy" (or "accuracy" for short) was used, since the problem was implemented as a binary classification task. Shortly after, I realized that the data is highly unbalanced (~91% background vs. ~9% houses); e.g., a result with no detections at all would still score 91% accuracy! (I have kept this metric nonetheless, to compare with earlier attempts.)

Therefore, if we frame the problem as a "house detection" problem, we can use metrics like Precision, Recall and F1 Score. These enable us to properly evaluate model performance and to compare hyperparameter choices.
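
A rough sketch of how these scores can be computed for the binary house/background case (function and argument names are hypothetical; the repo's own implementation is in util.py):

```python
import torch

def binary_scores(pred, gt, eps=1e-8):
    """Precision, recall, F1 and accuracy; pred/gt hold 0 (background) or 1 (house)."""
    tp = ((pred == 1) & (gt == 1)).sum().float()   # true positives
    fp = ((pred == 1) & (gt == 0)).sum().float()   # false positives
    fn = ((pred == 0) & (gt == 1)).sum().float()   # false negatives
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    accuracy = (pred == gt).float().mean()
    return precision, recall, f1, accuracy
```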

I also considered the IoU metric, but decided it would be counterintuitive for semantic segmentation and too much of a hassle for a one-week exercise.

Unbalanced Data --> Class Weights in the Loss Function

As mentioned earlier, the sample data is highly unbalanced: 91% of the pixels are background, whereas only 9% are houses. This becomes a huge burden during training, because the loss from house pixels becomes insignificant compared to the loss from background pixels, which lets the optimizer be content with outcomes such as "very few detections".

To address this issue, passing class weights to the loss function is a good option under these circumstances (where we cannot get more data). This in fact forces the optimizer to find weights that yield a low loss from background pixels and a low (despite being amplified!) loss from house pixels.

I expected weights of 1 vs. 10 to work best theoretically, but 1 vs. 6 and 1 vs. 8 turned out to be better empirically.
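
In PyTorch, this amounts to passing a weight tensor to the loss; a sketch with the 1 vs. 6 weighting mentioned above:

```python
import torch
import torch.nn as nn

# index 0 = background, index 1 = house
class_weights = torch.tensor([1.0, 6.0])
criterion = nn.CrossEntropyLoss(weight=class_weights)
```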

Training & Test

This section provides a brief description of the training and test procedures.

Image patches

The requirements of the exercise clearly state that the image patches fed into the network must not be larger than 256x256 pixels. Moreover, I wanted to experiment with other patch sizes as well.

To this end, the data loader functions in util.py take W as an argument to determine how to divide the large sample image into WxW patches.

Assumption: For convenience, image patches are always square, hence WxW and not WxH. Extending the existing code to handle rectangular inputs would be trivial, but I believe rectangular patches are very unusual in the literature.

Image patches are created differently for Training and Test stages:

  • Training: The sample image is divided into WxW patches. Zero padding around the sample image was used to avoid remainders at the boundaries.

    1. Patches that are more than half blank are discarded if they do not contain houses.
    2. For the augmented case, different strides are used to crop the images, providing overlapping patches. In my experiments, using a stride of W/2 yielded an 8% improvement in F1 Score. For this exercise, I decided not to apply further data augmentation (random crops, random flips/rotations, color jittering, etc.), since generalization was not a concern.
  • Test: At test time, since we should not know where the houses are, the sample image is simply zero-padded and divided into WxW patches.

Please note that both training and test patches were normalized with respect to a mean image computed from the training set (see images/mean.npy).
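
A minimal sketch of this patching scheme (the real logic lives in util.py; the function and parameter names here are hypothetical):

```python
import numpy as np

def extract_patches(image, W, stride=None, mean=None):
    """Zero-pad an H x W x C image and cut it into WxW patches."""
    stride = stride or W                  # use W // 2 for the augmented case
    h, w = image.shape[:2]
    pad_h, pad_w = (-h) % W, (-w) % W     # zero padding at the boundaries
    padded = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)), mode='constant')
    patches = []
    for y in range(0, padded.shape[0] - W + 1, stride):
        for x in range(0, padded.shape[1] - W + 1, stride):
            patch = padded[y:y + W, x:x + W].astype(np.float32)
            if mean is not None:          # normalize w.r.t. the mean image
                patch = patch - mean
            patches.append(patch)
    return patches
```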

Hyperparameters and other choices

While searching for hyperparameters, the random seed was fixed. After long hours of training, I found that the following hyperparameters work best for the given data (a training-setup sketch follows the list).

  • Image size (W): 128
  • Batch size (N): 1
  • Upscale method (u): Learnable (FCNN2) with transposed convolutions
  • Learning rate (lr): 1e-4
    • with decay: multiplied by 0.5 every 40 steps
  • Optimizer: Following [1], the SGD optimizer was used. I found that momentum = 0.9 works best.
  • Weight initialization: He et al. [2] initialization with a normal distribution for all learnable weights in the network.
  • Regularization: L2 regularization with strength 5e-3.
  • Class weights for loss: 1.0 for background, 6.0 for house
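
Put together, the setup described in the list above might look like the following (a sketch assuming FCNN2 is imported from the repo's FCNN2.py; the rest is standard PyTorch wiring):

```python
import torch
import torch.nn as nn
from torch.optim.lr_scheduler import StepLR
from FCNN2 import FCNN2   # learnable-upscaling variant (assumed import path)

model = FCNN2()

# He et al. [2] initialization with a normal distribution
for m in model.modules():
    if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
        nn.init.kaiming_normal_(m.weight)

criterion = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 6.0]))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9,
                            weight_decay=5e-3)      # L2 regularization strength
scheduler = StepLR(optimizer, step_size=40, gamma=0.5)  # halve lr every 40 steps
```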

More experiments (although not presented nicely) can be found here.

Results

Results for model #236, which was trained for 200 epochs with the hyperparameters indicated above.

Precision (%)   Recall (%)   F1 Score (%)   Accuracy (%)
    76.80          93.56         84.35          97.55

Training Loss over iterations:

Scores over epochs:

Qualitative result: Predictions

References

  1. Long, J., Shelhamer, E., & Darrell, T. "Fully Convolutional Networks for Semantic Segmentation." CVPR, 2015.
  2. He, K., Zhang, X., Ren, S., & Sun, J. "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification." ICCV, 2015.