Cheap and reliable Node.js hosting starts at $3/month, and $1/month static HTML hosting

Created with love in Canada, visit hostnodejs.com today

Feel like to post an Ad? Learn Details

All Projects → SConsul → Global_convolutional_network

SConsul / Global_convolutional_network

Licence: mit

Pytorch implementation of GCN architecture for semantic segmentation

Programming Languages

139335 projects - #7 most used programming language

Labels

semantic-segmentation

Projects that are alternatives of or similar to Global convolutional network

Recurrent Scene Parsing With Perspective Understanding In The Loop

parsing scene images with understanding geometric perspective in the loop

Stars: ✭ 32 (-49.21%)

Mutual labels: semantic-segmentation

Deep Label Distribution Learning with Label Ambiguity

Stars: ✭ 49 (-22.22%)

Mutual labels: semantic-segmentation

Semanticsegmentation

A framework for training segmentation models in pytorch on labelme annotations with pretrained examples of skin, cat, and pizza topping segmentation

Stars: ✭ 52 (-17.46%)

Mutual labels: semantic-segmentation

TFslim based semantic segmentation models, modular&extensible boutique design

Stars: ✭ 43 (-31.75%)

Mutual labels: semantic-segmentation

Lung Segmentation

Segmentation of Lungs from Chest X-Rays using Fully Connected Networks

Stars: ✭ 47 (-25.4%)

Mutual labels: semantic-segmentation

An easy-to-use wrapper for work with dense per-pixel tasks in PyTorch (including multi-task learning)

Stars: ✭ 50 (-20.63%)

Mutual labels: semantic-segmentation

Tensorflow 2.3.0 implementation of DeepLabV3-Plus

Stars: ✭ 32 (-49.21%)

Mutual labels: semantic-segmentation

[CVPR 2020] MTL-NAS: Task-Agnostic Neural Architecture Search towards General-Purpose Multi-Task Learning

Stars: ✭ 58 (-7.94%)

Mutual labels: semantic-segmentation

PyTorch implementation of PSPNet

Stars: ✭ 49 (-22.22%)

Mutual labels: semantic-segmentation

CCNet: Criss-Cross Attention for Semantic Segmentation (TPAMI 2020 & ICCV 2019).

Stars: ✭ 1,059 (+1580.95%)

Mutual labels: semantic-segmentation

Vocal Melody Extraction

Source code for "Vocal melody extraction with semantic segmentation and audio-symbolic domain transfer learning".

Stars: ✭ 44 (-30.16%)

Mutual labels: semantic-segmentation

GoogLeNet implementation of Fully Convolutional Networks for Semantic Segmentation in TensorFlow

Stars: ✭ 45 (-28.57%)

Mutual labels: semantic-segmentation

Awesome Semantic Segmentation

🤘 awesome-semantic-segmentation

Stars: ✭ 8,831 (+13917.46%)

Mutual labels: semantic-segmentation

SegNet implementation & experiments in Chainer

Stars: ✭ 42 (-33.33%)

Mutual labels: semantic-segmentation

The ROS Package of Mask R-CNN for Object Detection and Segmentation

Stars: ✭ 53 (-15.87%)

Mutual labels: semantic-segmentation

Pytorch Auto Drive

Segmentation models (ERFNet, ENet, DeepLab, FCN...) and Lane detection models (SCNN, SAD, PRNet, RESA, LSTR...) based on PyTorch 1.6 with mixed precision training

Stars: ✭ 32 (-49.21%)

Mutual labels: semantic-segmentation

Segmentationcpp

A c++ trainable semantic segmentation library based on libtorch (pytorch c++). Backbone: ResNet, ResNext. Architecture: FPN, U-Net, PAN, LinkNet, PSPNet, DeepLab-V3, DeepLab-V3+ by now.

Stars: ✭ 49 (-22.22%)

Mutual labels: semantic-segmentation

Minkowskiengine

Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors

Stars: ✭ 1,110 (+1661.9%)

Mutual labels: semantic-segmentation

Code for Universal Semi-Supervised Semantic Segmentation models paper accepted in ICCV 2019

Stars: ✭ 57 (-9.52%)

Mutual labels: semantic-segmentation

Jacinto Ai Devkit

Training & Quantization of embedded friendly Deep Learning / Machine Learning / Computer Vision models

Stars: ✭ 49 (-22.22%)

Mutual labels: semantic-segmentation

View All Similar Projects ➔

Global Convolutional Network

by Sarthak Consul

Refer to MeDAL-IITB's repo for more

GCN Architecture is proposed in the paper "Large Kernel Matters —— Improve Semantic Segmentation by Global Convolutional Network"[1]

A GCN based architecture, called ResNet-GCN, is used for the purposes of lung segmentation from chest x-rays.

Dataset

This architecture is proposed to segment out lungs from a chest radiograph (colloquially know as chest X-Ray, CXR). The dataset is known as the Montgomery County X-Ray Set, which contains 138 posterior-anterior x-rays. The motivation being that this information can be further used to detect chest abnormalities like shrunken lungs or other structural deformities. This is especially useful in detecting tuberculosis in patients.

Data Preprocessing

The x-rays are 4892x4020 pixels big. Due to GPU memory limitations, they are resized to 1024x1024.

The dataset is augmented by randomly rotating and flipping the images.

Architecture

Intuition

The Global Convolution Network or GCN is an architecture proposed for the task of segmenting images. An image segmenter has to perform 2 tasks: classification as well as localization. This has an inherent challenge as both tasks have inherent diametrically opposite demands. While a classifier has to be transformation and rotation invariant, a localizer has to sensitive to the same. The GCN architecture finds a balance of the two demands with the following properties:

To retain spatial information, no fully connected layers are used and a FCN framework is adopted
For better classification, a large kernel size is adopted to enable dense connections in feature maps

For segmentation to have semantic context, local context obtained from simple CNN architectures is not sufficient; a bigger view (i.e. global context) is critical. This architecture, coined ResNet-GCN, is basically a modified ResNet model with additional GCN blocks obtaining the required global view and the Boundary Refinement Blocks further improving the segmentation performance near object boundaries.

The entire pipeline of this architecture is visualized below:

GCN Block

The GCN Block is essentially a kx1 followed by 1xk convolution summed with a parallely computed 1xk followed by kx1 convolution. This results in a large kxk kernel with dense connections. NOTE: the blocks are acting on feature maps and so channel width is larger than 3

Boundary Refinement Block

The BR block improves the segmentation near the boundaries of objects, where segmentation is less like a pure classification problem. It's design is inspired by that of ResNets and is basically a parallel branch of Conv+ReLU, followed by another conv. layer added to the input.

Training

A pretrained ResNet-50 $^{[2]}$ is used and is later fine-tuned. The rationale being that while medical images are vastly different from natural images, the ResNet is a good feature extractor (eg. edges, blobs, etc.) It is further augmented by the fact that many components in a medical image have features that resemble that of natural images eg. nuclei looks similar to balls.

Refer to http://ethereon.github.io/netscope/#/gist/db945b393d40bfa26006 for the ResNet50 Architecture and https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py for the torchvision.model code

57 CXRs, with their corresponding masks, were used to train the model while 20 were used for validation purposes (hold-out cross validation). Another 61 images have been reserved as test set.

Loss Function

A linear combination of Soft Dice Loss, Soft Inverse Dice Loss, and Binary Cross-Entropy Loss (with logits) is used to train the model end-to-end. The best performance was obtained by weighing the three criteria at 0.25:0.5:0.25 (respectively).

Binary Cross-Entropy Loss (with logits)

This is calculated by passing the output of the network through a sigmoid activation before applying cross-entropy loss.

The sigmoid and cross entropy calculations are done in one class to exploit the log-sum-exp trick for greater numerical stability (as compared to sequentially applying sigmoid activation and then using vanilla BCE).

$$l_n = - w_n \left[ t_n \cdot \log \sigma(x_n) + (1 - t_n) \cdot \log (1 - \sigma(x_n)) \right],$$

$$ L(x,y) = \sum_{i=1}^{N}l_i$$

Soft Dice Loss

Dice Loss gives a measure of how accurate the overlap of the mask and ground truth is. The Sørensen–Dice coefficient is calculated as: $\frac{2. X\cap Y}{|X| + |Y|} = \frac{2. TP}{2. TP + FP + FN}$ and the Dice Loss is simply 1 - Dice coeff. For Soft version of the loss, the output of the network is passed through a sigmoid before standard dice loss is evaluated.

Soft Inverse Dice Loss

Inverse Dice loss checks for how accurately the background is masked. This penalizes the excess areas in the predicted mask. It is found by inverting the output before using the soft dice loss. This is added to account for true-negatives in our prediction.

Evaluation Metrics

Three metrics were used to evaluate the trained models;

Intersection over Union (IoU)
Dice Index
Inverse Dice Index

Intersection over Union (IoU)

IoU measures the accuracy of the predicted mask. It rewards better overlap of the prediction with the ground truth. $$\text{IoU} =\frac{P\cap GT}{P\cup GT} = \frac{TP}{TP+FP+FN} $$

P stands for Predicted Mask while GT is ground truth.

Dice Index

The Dice index (also known as Sørensen–Dice similarity coefficient) has been discussed earlier. Like IoU, Dice Index gives a measure of accuracy.

While for a single inference, both Dice and IoU are functionally equialent, over an average both have different inferences.

While the Dice score is a measure of the average performance of the model, the IoU score is harsher towards errors and is a measure of the worst performance of the model. $^{[\dagger]}$

Inverse Dice Index

As mentioned before, the Inverse dice index is obtained by inverting the masks and ground truth before calculating their dice score.

Due to the relatively smaller area of lung compared to the background, Inverse Dice score is large for every model.

Results

After 35 epochs of training, with learning rate = $10^{-3}$, scheduled to decrease by a factor of 5 after $15^{th}$ and $30^{th}$ epoch.

The model performed as follows:

Mean IoU: 0.8313548250035635
Mean Dice: 0.9072525421846304
Mean Inv. Dice: 0.9705243499345569

Examples of output

The red boundary denotes the ground truth while the blue shaded portion is the predicted mask

Observations

The GCN architecture is comparatively lightweight (in terms of GPU consumption)
The GCN architecture performs remarkably well for the task of lung segmentation even with very little training.
However, further work is required to achieve better performance. Type -1 error is particularly prevalent in the predictions

References

[1] Chao Peng, Xiangyu Zhang, Gang Yu, Guiming Luo, and Jian Sun. Large kernel matters - improve semantic segmentation by global convolutional network. CoRR, abs/1703.02719, 2017.

[2] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. CoRR, abs/1512.03385, 2015.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].

Stars: ✭ 63

Visit Git Page 🔗Visit User Page 🔗Visit Issues Page (0) 🔗