fmahoudeau / FCN-Segmentation-TensorFlow

Licence: MIT license
FCN for Semantic Image Segmentation achieving 68.5 mIoU on PASCAL VOC

FCN for Semantic Image Segmentation on TensorFlow

This is an implementation of Fully Convolutional Networks (FCN) achieving 68.5 mIoU on the PASCAL VOC 2012 validation set. The model generates semantic masks for each object class in the image using a VGG16 backbone. It is based on the work of E. Shelhamer, J. Long and T. Darrell described in the PAMI FCN and CVPR FCN papers (achieving 67.2 mIoU).

[Image: semantic segmentation sample]

The repository includes:

  • Source code of FCN built on VGG16

  • Training code for PASCAL VOC

  • Pre-trained weights for PASCAL VOC

  • Code to download and prepare the PASCAL VOC 2012 dataset and extra data from Hariharan et al.

  • Data augmentation code based on OpenCV

  • Jupyter notebook to visualize the data augmentation pipeline with PASCAL VOC 2012

  • Other examples of training with the KITTI Road and CamVid datasets

  • Evaluation of trained models for several datasets

The code is documented and designed to be easy to extend for your own dataset. If you use it in your projects, please consider citing this repository (bibtex below).

Getting started

  • demo.ipynb: This notebook is the recommended way to get started. It shows how to use an FCN model pre-trained on PASCAL VOC to segment object classes in your own images.

  • data_augmentation.ipynb: This notebook visualizes the data augmentation process using PASCAL VOC 2012 as example. Image transformations are built on OpenCV.

  • (fcn_run_loop.py, fcn_model.py): These files contain the main VGG16 FCN implementation details.

  • fcn_training.ipynb: This notebook reports training results for several datasets and can be used to reproduce them on your own.

Validation Results

This section reports validation results for several datasets on the following experiments:

  • One-off end to end training of the FCN-32s model starting from the pre-trained weights of VGG16.
  • One-off end to end training of FCN-16s starting from the pre-trained weights of VGG16.
  • One-off end to end training of FCN-8s starting from the pre-trained weights of VGG16.
  • Staged training of FCN-16s using the pre-trained weights of FCN-32s.
  • Staged training of FCN-8s using the pre-trained weights of FCN-16s-staged.

The models are evaluated against standard metrics, including pixel accuracy (PixAcc), mean class accuracy (MeanAcc), and mean intersection over union (MeanIoU). All training experiments were done with the Adam optimizer. Learning rate and weight decay parameters were selected using grid search.
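
As a minimal sketch of how these three metrics relate, assume predictions and labels have already been accumulated into a confusion matrix whose rows are ground-truth classes and whose columns are predicted classes. The function name `segmentation_metrics` is hypothetical and not part of this repository, and the repository's own evaluation code may handle absent classes differently.

```python
import numpy as np


def segmentation_metrics(confusion):
    """Return (PixAcc, MeanAcc, MeanIoU) in percent from a confusion matrix.

    confusion[i, j] counts pixels of ground-truth class i predicted as class j.
    """
    confusion = np.asarray(confusion, dtype=np.float64)
    true_positives = np.diag(confusion)
    labelled = confusion.sum(axis=1)    # pixels per ground-truth class
    predicted = confusion.sum(axis=0)   # pixels per predicted class
    union = labelled + predicted - true_positives

    pix_acc = true_positives.sum() / confusion.sum()
    # 0/0 yields NaN for classes with no labelled pixels (accuracy) or with
    # no labelled and no predicted pixels (IoU); nanmean skips those classes.
    with np.errstate(divide="ignore", invalid="ignore"):
        class_acc = true_positives / labelled
        class_iou = true_positives / union
    return 100 * pix_acc, 100 * np.nanmean(class_acc), 100 * np.nanmean(class_iou)


# Toy two-class example with 100 pixels.
pix_acc, mean_acc, mean_iou = segmentation_metrics([[50, 10], [5, 35]])
print(round(pix_acc, 1), round(mean_acc, 1), round(mean_iou, 1))
```

MeanIoU penalizes false positives as well as false negatives, which is why it sits below MeanAcc in every table that follows.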

KITTI Road

KITTI Road is a road and lane prediction task consisting of 289 training and 290 test images. It belongs to the KITTI Vision Benchmark Suite. As the test images are not labelled, 20% of the training images were held out to evaluate the model. The best result of 96.2 mIoU was obtained with one-off training of FCN-8s.
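
A hold-out split like the one described above could be sketched as follows. The helper name, the fixed seed, and rounding the 20% share to the nearest integer are assumptions for illustration; the repository may select its evaluation images differently.

```python
import random


def split_holdout(items, holdout_fraction=0.2, seed=0):
    """Shuffle a copy of `items` and return (train, validation) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_holdout = round(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


# The 289 labelled images are represented here by placeholder indices.
train_ids, val_ids = split_holdout(range(289))
print(len(train_ids), len(val_ids))  # prints "231 58"
```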

Model           PixAcc  MeanAcc  MeanIoU
FCN-32s         98.1    97.3     93.8
FCN-16s-oneoff  98.6    97.9     95.6
FCN-8s-oneoff   98.8    98.5     96.2
FCN-16s-staged  98.8    98.0     96.0
FCN-8s-staged   98.6    98.2     95.3

CamVid

The Cambridge-driving Labeled Video Database (CamVid) is the first collection of videos with object class semantic labels, complete with metadata. The database provides ground truth labels that associate each pixel with one of 32 semantic classes. I have used a modified version of CamVid with 11 semantic classes and all images resized to 480x360. The training set has 367 images; the validation set, known as CamSeq01, has 101 images. The best result of 73.2 mIoU was also obtained with one-off training of FCN-8s.

Model           PixAcc  MeanAcc  MeanIoU
FCN-32s         92.6    73.4     65.0
FCN-16s-oneoff  93.9    79.2     70.4
FCN-8s-oneoff   94.5    81.0     73.2
FCN-16s-staged  93.8    77.9     69.7
FCN-8s-staged   94.6    81.5     72.9

PASCAL VOC 2012

The PASCAL Visual Object Classes Challenge includes a segmentation challenge whose objective is to generate pixel-wise segmentations giving the class of the object visible at each pixel, or "background" otherwise. There are 20 different object classes in the dataset. It is one of the most widely used benchmark datasets in segmentation research. Again, the best result of 62.5 mIoU was obtained with one-off training of FCN-8s.

Model           PixAcc  MeanAcc  MeanIoU
FCN-32s         90.7    69.3     60.0
FCN-16s-oneoff  91.0    72.9     61.9
FCN-8s-oneoff   91.2    72.2     62.5
FCN-16s-staged  91.1    72.3     61.9
FCN-8s-staged   91.0    72.1     61.7

PASCAL Plus

PASCAL Plus refers to the PASCAL VOC 2012 dataset augmented with the annotations from Hariharan et al. Again, the best result of 68.5 mIoU was obtained with one-off training of FCN-8s.

Model           PixAcc  MeanAcc  MeanIoU
FCN-32s         91.3    79.3     64.5
FCN-16s-oneoff  92.4    78.1     67.3
FCN-8s-oneoff   92.7    78.5     68.5
FCN-16s-staged  92.3    78.5     67.5
FCN-8s-staged   92.4    77.9     67.2

Differences from the Official Paper

This implementation follows the FCN paper for the most part, but there are a few differences. Please let me know if I missed anything important.

  • Optimizer: The paper uses SGD with momentum and weight decay. This implementation uses Adam with a batch size of 12 images, a learning rate of 1e-5 and weight decay of 1e-6 for all training experiments with PASCAL VOC data. I did not double the learning rate for biases in the final solution.

  • Data Augmentation: The authors chose not to augment the data after finding no noticeable improvement with horizontal flipping and jittering. I find that more complex transformations such as zoom, rotation and color saturation improve learning while also reducing overfitting. However, for PASCAL VOC, I was never able to completely eliminate overfitting.

  • Extra Data: The train and test sets in the additional labels were merged to obtain a larger training set of 10582 images, compared to the 8498 used in the paper. The validation set has 1449 images. This larger number of training images is arguably the main reason for obtaining a better mIoU than the one reported in the second version of the paper (67.2).

  • Image Resizing: To support training with multiple images per batch, all images are brought to the same size, for example 512x512px on PASCAL VOC. As the largest side of any PASCAL VOC image is 500px, all images are center-padded with zeros. I find this approach more convenient than having to pad or crop features after each up-sampling layer to restore their initial shape before the skip connection.
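
The zero padding described in the last point can be sketched with NumPy as below. This is an illustration under stated assumptions, not the repository's code: `center_pad` and `pooled_sizes` are hypothetical helpers, and `pooled_sizes` assumes five stride-2 pooling layers that round odd sizes up, as "same"-padded pooling does.

```python
import numpy as np


def center_pad(image, target_size=512):
    """Zero-pad an H x W x C image so it sits centred on a square canvas."""
    height, width = image.shape[:2]
    if height > target_size or width > target_size:
        raise ValueError("image is larger than the target size")
    top = (target_size - height) // 2
    left = (target_size - width) // 2
    canvas = np.zeros((target_size, target_size) + image.shape[2:], dtype=image.dtype)
    canvas[top:top + height, left:left + width] = image
    return canvas


def pooled_sizes(size, n_pools=5):
    """Side length after each stride-2 pooling, rounding odd sizes up (assumed)."""
    sizes = []
    for _ in range(n_pools):
        size = (size + 1) // 2
        sizes.append(size)
    return sizes


padded = center_pad(np.ones((375, 500, 3), dtype=np.uint8))
print(padded.shape)       # (512, 512, 3)
print(pooled_sizes(512))  # [256, 128, 64, 32, 16]
print(pooled_sizes(500))  # [250, 125, 63, 32, 16]
```

Under that pooling assumption, a 500px side gives a stride-16 map of 32 that doubles to 64 on up-sampling, which no longer matches the stride-8 map of 63. At 512px every 2x up-sampling lands exactly on the size of the skip feature map, so no cropping or padding of features is needed.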

Training on Your Own

I'm providing pre-trained weights for PASCAL Plus to make it easier to start. You can use those weights as a starting point to fine-tune on your own dataset. Training and evaluation code is in fcn_run_loop.py. You can import this module in a Jupyter notebook (see the provided notebooks for examples). You can also run training, evaluation and prediction directly from the command line:

# Train a new FCN8 model starting from pre-trained VGG16 weights
python fcn_run_loop.py train --fcn_version=FCN8 --dataset=pascal_plus --model_name=<your model's name> \
                             --save_dir=/path/to/your/saved/models/ --data_dir=path/to/pascal/plus/data \
                             --vgg16_weights_path=/path/to/vgg16/weights.npz --n_epochs=50

# Train a new FCN16 model starting from pre-trained FCN32 weights
python fcn_run_loop.py train --fcn_version=FCN16 --dataset=pascal_plus --model_name=<your model's name> \
                             --saved_variables=<FCN32 pre-trained weights filename w/o file extension> \
                             --save_dir=/path/to/your/saved/models/ --data_dir=path/to/pascal/plus/data \
                             --vgg16_weights_path=/path/to/vgg16/weights.npz --n_epochs=50

You can also evaluate the model with:

# Evaluate an FCN8 model on the PASCAL Plus validation set
python fcn_run_loop.py evaluate --fcn_version=FCN8 --dataset=pascal_plus --model_name=<your model's name> \
                                --saved_variables=<FCN8 pre-trained weights filename w/o file extension> \
                                --save_dir=/path/to/your/saved/models/ --data_dir=path/to/pascal/plus/data \
                                --vgg16_weights_path=/path/to/vgg16/weights.npz

You can also predict the pixel-level object classes of images. This command creates a sub-folder under your save_dir and saves all images of the validation set with their segmentation mask overlaid:

# Predict the PASCAL Plus validation set using an FCN8 model
python fcn_run_loop.py predict --fcn_version=FCN8 --dataset=pascal_voc_2012 --model_name=<your model's name> \
                               --saved_variables=<FCN8 pre-trained weights filename w/o file extension> \
                               --save_dir=/path/to/your/saved/models/ --data_dir=path/to/pascal/plus/data \
                               --vgg16_weights_path=/path/to/vgg16/weights.npz

To find out about the other command line arguments type:

python fcn_run_loop.py --help
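
When driving several runs from a script or notebook, the commands above could be assembled programmatically. The sketch below is not part of the repository: `build_command` is a hypothetical helper, it only uses flags that appear in the commands above, and the model name, weights name and paths are placeholders.

```python
import shlex


def build_command(mode, fcn_version, dataset, model_name, save_dir, data_dir,
                  vgg16_weights_path, saved_variables=None, n_epochs=None):
    """Build the argument list for one fcn_run_loop.py call (hypothetical helper)."""
    command = [
        "python", "fcn_run_loop.py", mode,
        "--fcn_version=" + fcn_version,
        "--dataset=" + dataset,
        "--model_name=" + model_name,
        "--save_dir=" + save_dir,
        "--data_dir=" + data_dir,
        "--vgg16_weights_path=" + vgg16_weights_path,
    ]
    # Optional flags are only added when a value is supplied.
    if saved_variables is not None:
        command.append("--saved_variables=" + saved_variables)
    if n_epochs is not None:
        command.append("--n_epochs=" + str(n_epochs))
    return command


# Placeholder names and paths; replace them with your own.
command = build_command(
    "train", "FCN16", "pascal_plus",
    model_name="my_fcn16",
    save_dir="/path/to/your/saved/models/",
    data_dir="/path/to/pascal/plus/data",
    vgg16_weights_path="/path/to/vgg16/weights.npz",
    saved_variables="my_fcn32_weights",
    n_epochs=50,
)
print(" ".join(shlex.quote(arg) for arg in command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the repository root, which avoids shell quoting issues with names that contain spaces.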

Requirements

Python 3.6, TensorFlow 1.12, OpenCV, and other common packages listed in environment.yml.

Datasets

KITTI Road

To train or test on the KITTI Road dataset, go to the KITTI Road page and download the base kit. You need to provide an email address to receive your download link.

# Unzip and prepare TFRecordDatasets 
python kitty_road_dataset.py --data_dir=<path to data_road.zip>

CamVid

I'm providing a prepared version of CamVid with 11 object classes. You can also go to the Cambridge-driving Labeled Video Database to make your own.

# Unzip and prepare TFRecordDatasets 
python cam_vid_dataset.py --data_dir=<path to cam_vid_prepped.zip>

Pascal VOC

To train or test on PASCAL VOC 2012 and the augmented dataset, use the provided scripts:

# Create the destination folder
mkdir /path/to/pascal_voc_data

# Download the dataset
python pascal_voc_downloader.py --data_dir=</path/to/pascal_voc_data>

# Untar and prepare TFRecordDatasets 
python pascal_voc_dataset.py --data_dir=</path/to/pascal_voc_data/VOCdevkit/VOC2012>

# Repeat the same steps for the additional annotations
mkdir /path/to/pascal_plus_data

python pascal_plus_downloader.py --data_dir=</path/to/pascal_plus_data>

python pascal_plus_dataset.py --contours_dir=</path/to/pascal_plus_data/benchmark_RELEASE/dataset/> \
                              --voc_dir=</path/to/pascal_voc_data/VOCdevkit/VOC2012/> \
                              --vocplus_dir=</path/to/pascal_plus_data/prepared>

Citation

Use this bibtex to cite this repository:

@misc{fmahoudeau_fcn_2019,
  title={FCN methods for semantic image segmentation on TensorFlow},
  author={Florent Mahoudeau},
  year={2019},
  publisher={Github},
  journal={GitHub repository},
  howpublished={\url{https://github.com/fmahoudeau/fcn}},
}