Cheap and reliable Node.js hosting starts at $3/month, and $1/month static HTML hosting

Created with love in Canada, visit hostnodejs.com today

Feel like to post an Ad? Learn Details

All Projects → cvjena → Alpha_pooling

cvjena / Alpha_pooling

Licence: bsd-2-clause

Code for our paper "Generalized Orderless Pooling Performs Implicit Salient Matching" published at ICCV 2017.

Labels

jupyter-notebook image-classification vgg16

Projects that are alternatives of or similar to Alpha pooling

Deep Learning With Python

Deep learning codes and projects using Python

Stars: ✭ 195 (+282.35%)

Mutual labels: jupyter-notebook, image-classification, vgg16

Tensorflow2.0 Examples

🙄 Difficult algorithm, Simple code.

Stars: ✭ 1,397 (+2639.22%)

Mutual labels: jupyter-notebook, image-classification, vgg16

Pytorch classification

利用pytorch实现图像分类的一个完整的代码，训练，预测，TTA，模型融合，模型部署，cnn提取特征，svm或者随机森林等进行分类，模型蒸馏，一个完整的代码

Stars: ✭ 395 (+674.51%)

Mutual labels: jupyter-notebook, image-classification

Computer Vision

Programming Assignments and Lectures for Stanford's CS 231: Convolutional Neural Networks for Visual Recognition

Stars: ✭ 408 (+700%)

Mutual labels: jupyter-notebook, image-classification

Dogs vs cats

猫狗大战

Stars: ✭ 570 (+1017.65%)

Mutual labels: jupyter-notebook, image-classification

VGG16 Keras TensorFlow

# This is a image classification by VGG16 pre-trained model.#

Stars: ✭ 40 (-21.57%)

Mutual labels: image-classification, vgg16

Grad Cam Tensorflow

tensorflow implementation of Grad-CAM (CNN visualization)

Stars: ✭ 261 (+411.76%)

Mutual labels: jupyter-notebook, vgg16

Pba

Efficient Learning of Augmentation Policy Schedules

Stars: ✭ 461 (+803.92%)

Mutual labels: jupyter-notebook, image-classification

Pixel level land classification

Tutorial demonstrating how to create a semantic segmentation (pixel-level classification) model to predict land cover from aerial imagery. This model can be used to identify newly developed or flooded land. Uses ground-truth labels and processed NAIP imagery provided by the Chesapeake Conservancy.

Stars: ✭ 217 (+325.49%)

Mutual labels: jupyter-notebook, image-classification

Concise Ipython Notebooks For Deep Learning

Ipython Notebooks for solving problems like classification, segmentation, generation using latest Deep learning algorithms on different publicly available text and image data-sets.

Stars: ✭ 23 (-54.9%)

Mutual labels: jupyter-notebook, image-classification

Grocery Product Classification

Implementation of the paper "A Hierarchical Grocery Store Image Dataset with Visual and Semantic Labels"

Stars: ✭ 23 (-54.9%)

Mutual labels: jupyter-notebook, image-classification

Neural Image Captioning

Implementation of Neural Image Captioning model using Keras with Theano backend

Stars: ✭ 12 (-76.47%)

Mutual labels: jupyter-notebook, vgg16

Deep-Learning

It contains the coursework and the practice I have done while learning Deep Learning.🚀 👨‍💻💥 🚩🌈

Stars: ✭ 21 (-58.82%)

Mutual labels: image-classification, vgg16

Skin Lesions Classification DCNNs

Transfer Learning with DCNNs (DenseNet, Inception V3, Inception-ResNet V2, VGG16) for skin lesions classification

Stars: ✭ 47 (-7.84%)

Mutual labels: image-classification, vgg16

Pytorch Image Classification

Tutorials on how to implement a few key architectures for image classification using PyTorch and TorchVision.

Stars: ✭ 272 (+433.33%)

Mutual labels: jupyter-notebook, image-classification

Image-Classification

Pre-trained VGG-Net Model for image classification using tensorflow

Stars: ✭ 29 (-43.14%)

Mutual labels: image-classification, vgg16

Food Recipe Cnn

food image to recipe with deep convolutional neural networks.

Stars: ✭ 448 (+778.43%)

Mutual labels: jupyter-notebook, vgg16

Cvpr18 Inaturalist Transfer

Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning. CVPR 2018

Stars: ✭ 164 (+221.57%)

Mutual labels: jupyter-notebook, image-classification

Seismic Transfer Learning

Deep-learning seismic facies on state-of-the-art CNN architectures

Stars: ✭ 32 (-37.25%)

Mutual labels: jupyter-notebook, vgg16

Food 101 Keras

Food Classification with Deep Learning in Keras / Tensorflow

Stars: ✭ 646 (+1166.67%)

Mutual labels: jupyter-notebook, image-classification

View All Similar Projects ➔

Alpha pooling for fine-grained recognition

This repository contains code for our International Conference on Computer Vision publication ``Generalized Orderless Pooling Performs Implicit Salient Matching''. It contains scripts for fine-tuning a pre-trained VGG16 model with our presented alpha-pooling approach.

Abstract of the paper

Most recent CNN architectures use average pooling as a final feature encoding step. In the field of fine-grained recognition, however, recent global representations like bilinear pooling offer improved performance. In this paper, we generalize average and bilinear pooling to "alpha-pooling", allowing for learning the pooling strategy during training. In addition, we present a novel way to visualize decisions made by these approaches. We identify parts of training images having the highest influence on the prediction of a given test image. This allows for justifying decisions to users and also for analyzing the influence of semantic parts. For example, we can show that the higher capacity VGG16 model focuses much more on the bird's head than, e.g., the lower-capacity VGG-M model when recognizing fine-grained bird categories. Both contributions allow us to analyze the difference when moving between average and bilinear pooling. In addition, experiments show that our generalized approach can outperform both across a variety of standard datasets.

Getting started

You need our custom caffe located at https://github.com/cvjena/caffe_pp2, which has our own SignedPowerLayer with learnable power as well as a spatial transformer layer used for on-the-fly image resizing and a compact bilinear layer for computing the outer product in an efficient manner. Please clone and compile caffe_pp2 as well as its python interface. We use python 3 in all our experiments.

Preparation of the dataset

We use an ImageData layer in our experiments. This layer is required in order to use the scripts provided here. Hence you will need a list of train images and a list of test images. Each file should contain the path to the respective images relative to --image_root and the label as integer separated by comma. This means, the files should look like

/path/to/dataset/class1/image1.jpg 1
/path/to/dataset/class1/image2.jpg 1
/path/to/dataset/class2/image1.jpg 2
/path/to/dataset/class2/image2.jpg 2

The path to these files is used in the following scripts and are called train_imagelist and val_imagelist.

How to learn an alpha-pooling model

We provide a batch script and an Jupyter notebook to prepare the fine-tuning. The usage of the batch script is described in the --help message:

usage: prepare-finetuning-batchscript.py [-h] [--init_weights INIT_WEIGHTS]
                                        [--label LABEL] [--gpu_id GPU_ID]
                                        [--num_classes NUM_CLASSES]
                                        [--image_root IMAGE_ROOT]
                                        train_imagelist val_imagelist

Prepare fine-tuning of multiscale alpha pooling. The working directory should
contain train_val.prototxt of vgg16. The models will be created in the
subfolders.

positional arguments:
train_imagelist       Path to imagelist containing the training images. Each
                        line should contain the path to an image followed by a
                        space and the class ID.
val_imagelist         Path to imagelist containing the validation images.
                        Each line should contain the path to an image followed
                        by a space and the class ID.

optional arguments:
-h, --help            show this help message and exit
--init_weights INIT_WEIGHTS
                        Path to the pre-trained vgg16 model
--label LABEL         Label of the created output folder
--gpu_id GPU_ID       ID of the GPU to use
--num_classes NUM_CLASSES
                        Number of object categories
--image_root IMAGE_ROOT
                        Image root folder, used to set the root_folder
                        parameter of the ImageData layer of caffe.

The explanation for the usage of the notebook is described in the comments of it. Please note that gamma in the scripts refers to alpha in the paper due to last minute renaming of the approach before submission.

The script preprare the prototxt and solver for learning the model. In addition, they also learn the last classification layer already. After the preparation, you can fine-tune the network using the created ft.solver file in the finetuning subfolder. Please note that our implementation only supports GPU computation, as the SignedPowerLayer in caffe_pp2 has only a GPU implementation at the moment.

How to learn another architecture

The code shows the fine-tuning preparation for VGG16. If you want to learn another model, you will need a train_val.prototxt, which has two ImageData layers. It is probably the best to take your existing train_val.prototxt and replace your data layers with the ImageData layers of our VGG16 train_val.prototxt. Our script does not support LMDB or any other types of layers, but could be probably adapted for it. After these adjustments, you might also need to adjust the notebook or prepare-finetuning-batchscript.py, depending on what you are using.

Feel free to try any other model, for example our caffe implementation of ResNet50 from https://github.com/cvjena/cnn-models/tree/master/ResNet_preact/ResNet50_cvgj

Accuracy

With VGG16 and a resolution of 224 and 560 pixels on the smaller side of the image, you should achieve the 85.3% top-1 accuracy reported in the paper. Complete list of results:

Dataset	CUB200-2011	Aircraft	40 actions-
classes / images	200 / 12k	89 / 10k	40 / 9.5k
Previous	81.0% [24]	72.5% [6]	72.0% [36]
	82.0% [17]	78.0% [22]	80.9% [4]
	84.5% [34]	80.7% [13]	81.7% [22]
Special case: bilinear [19]	84.1%	84.1%	-
Learned strategy (Ours)	85.3%	85.5%	86.0%

Note: running the training longer the the predefined number of itertions leads to a higher accuracy and is necessary to reproduce the paper results.

Citation

Please cite the corresponding ICCV 2017 publication if our models helped your research:

@inproceedings{Simon17_GOP,
title = {Generalized orderless pooling performs implicit salient matching},
booktitle = {International Conference on Computer Vision (ICCV)},
author = {Marcel Simon and Yang Gao and Trevor Darrell and Joachim Denzler and Erik Rodner},
year = {2017},
}

License and support

The code is released under BSD 2-clause license allowing both academic and commercial use. I would appreciate if you give credit to this work by citing our paper in academic works and referencing to this Github repository in commercial works. If you need any support, please open an issue or contact Marcel Simon.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].

Stars: ✭ 51

Visit Git Page 🔗Visit User Page 🔗Visit Issues Page (1) 🔗