gabrielwong159 / siamese

Licence: other
One-shot learning for image classification using Siamese neural networks


One-shot learning with Siamese networks

Typical CNN classification methods end in a fully-connected layer with one neuron per class. This is suboptimal when the number of classes is large or changes over time, since adding a class requires changing the layer and retraining.

In Siamese CNNs, we extract features from an image and convert it into an n-dimensional vector. We compare this n-dimensional vector with that of another image, and the model is trained such that images of the same class will produce similar vectors.

By comparing an unknown image against a set of labelled images, we can find the labelled image most similar to the unknown image and use its label as the classification result. This lets Siamese networks learn classification tasks from few training samples and generalize to any number of classes.

Illustration of a Siamese network

Architecture

Much like a typical CNN, a Siamese CNN will have several convolutional layers, followed by fully-connected layers. The convolutional layers help to extract features from an image, before conversion into vectors for comparison.

Training

When training a Siamese CNN, we input two images and a binary label indicating whether the two images are of the same class. The last layer of the CNN is a fully-connected layer, which produces an n-dimensional vector. From here on, the terms "output layer" and "output vector" are used interchangeably, and both refer to this layer. Depending on the label, the model then tries to minimize or maximize the distance between the vectors produced by the two images.

Note that both images pass through the same network. This means that the weights and biases applied to both images are identical throughout the training process.
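
A minimal NumPy sketch of the weight sharing described above. The one-layer `embed` function, the `params` dictionary, and the sizes (784 inputs, 16 outputs) are hypothetical stand-ins for illustration; the project's actual model has convolutional layers and is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters for a tiny one-layer "network".
params = {
    "W": rng.normal(scale=0.01, size=(784, 16)),  # 784 pixels -> 16-dim vector
    "b": np.zeros(16),
}

def embed(params, images):
    """Map a batch of flattened images to n-dimensional output vectors."""
    return np.tanh(images @ params["W"] + params["b"])

def siamese_forward(params, left, right):
    """Pass both images through the SAME parameters (shared weights)."""
    return embed(params, left), embed(params, right)

left = rng.random((4, 784))
right = rng.random((4, 784))
out_left, out_right = siamese_forward(params, left, right)
```

Because there is only one set of parameters, a gradient update computed from a pair changes the mapping for both inputs at once, and feeding the same image to both sides always yields identical vectors.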

Loss

In this project we experiment with two different kinds of loss functions. The loss is calculated based on the L1- or L2-distance between the outputs of the CNN (fully-connected layers) from the two images.

Loss with spring

The loss function shown below is described in the paper Dimensionality Reduction by Learning an Invariant Mapping. An existing GitHub project was used as a reference for its implementation.

Siamese loss function
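
As a rough sketch, the contrastive ("spring") loss from that paper can be written in NumPy as follows. This is not the project's implementation: the function name, the `margin` default, and the label convention (`same = 1` for a similar pair) are assumptions made for illustration. The paper itself uses the opposite convention, with Y = 0 for similar pairs.

```python
import numpy as np

def contrastive_loss(out_left, out_right, same, margin=1.0):
    """Contrastive loss over a batch of output-vector pairs.

    same:   array of 1.0 (same class) or 0.0 (different class); assumed
            convention, opposite to the Y used in the paper.
    margin: dissimilar pairs further apart than this contribute no loss.
    """
    # L2 distance between the two output vectors of each pair.
    dist = np.sqrt(np.sum((out_left - out_right) ** 2, axis=1))
    # Similar pairs are pulled together: the loss grows with distance.
    similar_term = same * 0.5 * dist ** 2
    # Dissimilar pairs are pushed apart until they exceed the margin.
    dissimilar_term = (1.0 - same) * 0.5 * np.maximum(0.0, margin - dist) ** 2
    return np.mean(similar_term + dissimilar_term)

a = np.array([[0.0, 0.0], [0.0, 0.0]])
b = np.array([[3.0, 4.0], [0.3, 0.4]])
labels = np.array([1.0, 0.0])  # first pair similar, second dissimilar
loss = contrastive_loss(a, b, labels)
```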

Sigmoid loss

The paper Siamese Neural Networks for One-shot Image Recognition uses a sigmoid loss for image recognition on the Omniglot dataset. The model architecture from that paper is also the basis for the CNN used in the Omniglot task.
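
One way to sketch this kind of loss: take the component-wise L1 distance between the two output vectors, combine it with learned weights, squash the result through a sigmoid to get a "same class" probability, and apply binary cross-entropy. The names `sigmoid_pair_loss`, `alpha`, and `bias` are hypothetical, and the regularization term used in the paper is omitted here.

```python
import numpy as np

def sigmoid_pair_loss(out_left, out_right, same, alpha, bias=0.0):
    """Binary cross-entropy on sigmoid(alpha . |out_left - out_right| + bias).

    same:  array of 1.0 (same class) or 0.0 (different class).
    alpha: learned weight per output dimension (assumed shape: (n,)).
    """
    l1 = np.abs(out_left - out_right)   # component-wise L1 distance
    logits = l1 @ alpha + bias          # weighted sum, one logit per pair
    # Numerically stable cross-entropy computed directly from the logits.
    per_pair = (np.maximum(logits, 0.0) - logits * same
                + np.log1p(np.exp(-np.abs(logits))))
    return np.mean(per_pair)

vecs = np.ones((3, 4))
pair_loss = sigmoid_pair_loss(vecs, vecs, np.array([1.0, 0.0, 1.0]), np.ones(4))
```

With identical vectors and zero bias every logit is 0, so the predicted probability is 0.5 for each pair regardless of the label.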

MNIST

We start with MNIST to test our implementation. The model was trained with learning_rate=1e-4 over 20,000 iterations. The training results for several architectures are summarized below:

| commit_hash | conv. kernel size | accuracy | description |
|---|---|---|---|
| 983a8a8 | 3x3 | 0.9758 | 2 layer FC + 2-neuron out |
| df5d2b9 | 5x5 | 0.9844 | 2 layer conv + 2 layer FC + 2-neuron out |
| df5d2b9 | 3x3 | 0.9856 | 2 layer conv + 2 layer FC + 2-neuron out |
| 3757780 | 3x3 | 0.9890 | 2 layer conv + 2 layer FC (out) |

Transfer learning

We first train a CNN on an MNIST classification task, achieving 99.37% accuracy on the test set. We then transfer the weights of the convolutional layers to the Siamese CNN and train the Siamese model with learning_rate=1e-4 over 10,000 iterations. This achieved a test accuracy of 98.99%, higher than the best result obtained without transfer learning (98.90%).

Testing

MNIST images for evaluation

For each of the ground truth images above, we obtain its output vector from the model. Then, for each image being evaluated, we obtain its output vector as well and find the closest ground truth vector by L1 or L2 distance. The label of that closest ground truth image is the prediction.
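
The lookup described above amounts to a nearest-neighbour search over output vectors. A minimal sketch, assuming the vectors have already been computed by the model; `classify` and its argument names are hypothetical:

```python
import numpy as np

def classify(query_vecs, reference_vecs, reference_labels, dist="L2"):
    """Return, for each query vector, the label of the closest reference.

    query_vecs:       (n_queries, n) output vectors of images to evaluate
    reference_vecs:   (n_refs, n) output vectors of ground truth images
    reference_labels: (n_refs,) label of each ground truth image
    """
    # Pairwise differences via broadcasting: shape (n_queries, n_refs, n).
    diff = query_vecs[:, None, :] - reference_vecs[None, :, :]
    if dist == "L1":
        distances = np.abs(diff).sum(axis=2)
    elif dist == "L2":
        distances = np.sqrt((diff ** 2).sum(axis=2))
    else:
        raise ValueError("dist must be 'L1' or 'L2'")
    return reference_labels[np.argmin(distances, axis=1)]

refs = np.array([[0.0, 0.0], [10.0, 10.0]])
ref_labels = np.array([3, 7])
queries = np.array([[1.0, 1.0], [9.0, 8.0]])
predictions = classify(queries, refs, ref_labels, dist="L1")
accuracy = np.mean(predictions == np.array([3, 7]))
```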

Omniglot

The Omniglot dataset is typically used for one-shot learning, as it contains a large number of classes, with few training samples per class.

While the training and testing classes were the same in MNIST, the Omniglot dataset allows us to test the model on completely different classes from the ones used in training.

A random seed of 0 was set for both Python's built-in random module and TensorFlow.

Data

Training

Images in the images_background folder were used for training. For each class (e.g. Alphabet_of_the_Magi/character01), all possible combinations of pairs were appended to a list. For example, a class with 20 images yielded 20 choose 2 == 190 pairs.

n_samples pairs were then chosen at random from the possible pairs to form the training data for similar images. Subsequently, for each similar pair, we add a dissimilar pair by choosing two different classes at random and picking one image from each. This ensures that the number of similar and dissimilar pairs is the same.
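
The pair construction can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the project's code: `make_pairs`, the `images_by_class` mapping, and the label values (1 for similar, 0 for dissimilar) are hypothetical, and sampling without replacement is assumed, so `n_samples` must not exceed the number of possible similar pairs.

```python
import itertools
import random

def make_pairs(images_by_class, n_samples, seed=0):
    """Build a balanced list of (image_a, image_b, label) training pairs."""
    rng = random.Random(seed)

    # All within-class combinations, e.g. 20 images -> 190 pairs per class.
    similar = []
    for images in images_by_class.values():
        similar.extend(itertools.combinations(images, 2))
    similar = rng.sample(similar, n_samples)  # assumed: without replacement

    # One dissimilar pair per similar pair, drawn from two distinct classes.
    classes = list(images_by_class)
    dissimilar = []
    for _ in similar:
        class_a, class_b = rng.sample(classes, 2)
        dissimilar.append((rng.choice(images_by_class[class_a]),
                           rng.choice(images_by_class[class_b])))

    pairs = [(a, b, 1) for a, b in similar] + [(a, b, 0) for a, b in dissimilar]
    rng.shuffle(pairs)
    return pairs

# Toy data: 3 classes with 20 image names each.
toy = {f"class{i}": [f"class{i}/img{j:02d}" for j in range(20)] for i in range(3)}
pairs = make_pairs(toy, n_samples=50)
```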

Testing

Images in the images_evaluation folder were used for testing. We use 20 classes (Angelic/character{01-20}) for testing, and measure accuracy as the fraction of correct predictions.

Results

Loss with spring

| model_name | n_samples | n_iterations | learning_rate | dist | accuracy |
|---|---|---|---|---|---|
| fc1 | 20,000 | 50,000 | 1e-5 | L1 | 0.4025 |
| fc1 | 20,000 | 50,000 | 1e-5 | L2 | 0.4150 |
| fc1 | 40,000 | 50,000 | 1e-5 | L1 | 0.4000 |
| fc1 | 40,000 | 50,000 | 1e-5 | L2 | 0.4000 |
| fc1_reg1 | 20,000 | 50,000 | 1e-5 | L1 | 0.2700 |
| fc1_reg1 | 20,000 | 50,000 | 1e-5 | L2 | 0.2725 |
| fc2 | 20,000 | 50,000 | 1e-5 | L1 | 0.2875 |
| fc2 | 20,000 | 50,000 | 1e-5 | L2 | 0.2800 |

fc1

Single fully-connected layer with 4096 neurons.

fc1_reg1

Regularization with 2e-4 for convolutional layers.

fc2

Two fully-connected layers with 2048 neurons each, with dropout=0.5 between fc1 and fc2. The number of neurons was reduced because of out-of-memory (OOM) errors.

References

Implementation

Reading

Dataset
