ShadyF / cnn-rnn-classifier

Licence: other
A practical example of how to combine a CNN and an RNN to classify images.


This is a practical example of how to combine a CNN and an RNN to classify images.

NOTE: This classifier was tested with the tiny-imagenet-100 dataset only.

Network Architecture

The network consists of two independent branches: a CNN branch that uses the Xception model, pretrained on ImageNet and provided by Keras (https://keras.io/applications/#xception), and an RNN branch.

The two branches run in parallel.

The network takes an RGB image of shape 299x299x3 as input.

On the CNN branch, the image is passed unchanged (299x299x3) through the pretrained Xception model up to its final convolution block, which yields the bottleneck features of shape (batch_size, 2048).

On the RNN branch, the 299x299x3 image is first converted to a 299x299x1 grayscale image so that it can be split into chunks and fed to the RNN. The 299x299 image is then reshaped to (23, 3887), where 23 is the number of timesteps and 3887 is the dimension of each timestep. These values were chosen because 23 * 3887 == 299 * 299. The reshaped image is passed through two LSTM layers, each of which produces an output of shape (batch_size, 2048).
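The grayscale conversion and reshape can be sketched in NumPy as follows. This is a minimal illustration, not the project's code: the random input image and the BT.601 luma weights used for the grayscale conversion are assumptions, since the conversion method is not specified here.

```python
import numpy as np

# Stand-in for one RGB input image (random pixels, for illustration only).
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(299, 299, 3)).astype(np.float32)

# Collapse the three colour channels into one. The BT.601 luma weights
# below are an assumption; the conversion used by the project may differ.
gray = rgb @ np.array([0.299, 0.587, 0.114], dtype=np.float32)  # (299, 299)

# 299 * 299 == 23 * 3887 == 89401, so the pixels fit exactly into
# 23 timesteps of 3887 features each.
timesteps, features = 23, 3887
sequence = gray.reshape(timesteps, features)  # (23, 3887)

print(sequence.shape)
```

Because 3887 == 13 * 299 and the reshape is row-major, each timestep in this sketch holds 13 consecutive rows of the image.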

Both branches now produce an output of shape (batch_size, 2048), and these two outputs are merged using element-wise multiplication. The result is fed to the classification layer, which consists of 100 nodes (one per class) with a softmax activation.
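The merge and classification step can be illustrated with a small NumPy sketch. The branch outputs and layer weights below are random placeholders (an assumption for illustration); in the real network they come from Xception, the LSTM layers, and training.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, feature_dim, num_classes = 4, 2048, 100

# Stand-ins for the two branch outputs (random values, illustration only).
cnn_features = rng.standard_normal((batch_size, feature_dim))
rnn_features = rng.standard_normal((batch_size, feature_dim))

# Element-wise multiplication keeps the shape (batch_size, 2048).
merged = cnn_features * rnn_features

# Classification layer: a dense projection to 100 classes plus softmax.
# The weights are random here; in the real network they are learned.
weights = rng.standard_normal((feature_dim, num_classes)) * 0.01
bias = np.zeros(num_classes)
logits = merged @ weights + bias


def softmax(z):
    """Row-wise softmax; the row maximum is subtracted for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    exp = np.exp(z)
    return exp / exp.sum(axis=1, keepdims=True)


probs = softmax(logits)
print(merged.shape, probs.shape)
```

Element-wise multiplication requires both branches to emit the same feature dimension, which is why each branch ends at 2048.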

Network Training

The network was trained in two phases. In the first phase, all layers of the CNN were frozen, and only the final classification layer and the RNN branch were trained. This phase used the RMSProp optimizer.

In the second phase, all layers of the network were unfrozen and fine-tuned using the Adam optimizer with a learning rate of 0.0001.
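The freeze-then-fine-tune pattern might look like the sketch below in Keras. It uses a tiny stand-in model rather than the project's Xception and LSTM network, so that nothing has to be downloaded. The loss function, the default RMSprop settings, and all layer names are assumptions made for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Tiny stand-in model (NOT the project's Xception + LSTM network), used only
# to show the freeze/unfreeze pattern.
inputs = keras.Input(shape=(32, 32, 3))
cnn = keras.Sequential(
    [layers.Conv2D(8, 3, activation="relu"), layers.GlobalAveragePooling2D()],
    name="cnn_branch",
)
outputs = layers.Dense(100, activation="softmax", name="classifier")(cnn(inputs))
model = keras.Model(inputs, outputs)


def compile_with(optimizer):
    """Compile with an assumed loss and a top-5 accuracy metric."""
    model.compile(
        optimizer=optimizer,
        loss="categorical_crossentropy",  # assumption: one-hot labels
        metrics=[keras.metrics.TopKCategoricalAccuracy(k=5)],
    )


# Phase 1: freeze the CNN branch and train the remaining layers with RMSprop.
cnn.trainable = False
compile_with(keras.optimizers.RMSprop())
phase1_trainable = len(model.trainable_weights)  # classifier kernel and bias
# model.fit(...) for phase 1 would run here.

# Phase 2: unfreeze everything and fine-tune with Adam at a low learning rate.
cnn.trainable = True
compile_with(keras.optimizers.Adam(learning_rate=1e-4))
phase2_trainable = len(model.trainable_weights)  # conv and classifier weights
# model.fit(...) for phase 2 would run here.

print(phase1_trainable, phase2_trainable)
```

The model is recompiled after each change to `trainable`, since Keras applies a changed `trainable` flag at compile time.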

Using this two-phase training technique, the combined CNN/RNN model achieves a top-5 accuracy of 96.14% on tiny-imagenet-100, a reduced version of the ImageNet dataset that contains only 100 classes.

Dataset Structure

The flow_from_directory method of Keras’ ImageDataGenerator expects the dataset to follow a specific directory structure, with one subdirectory per class.

The restructure_dataset.py script in the helpers directory can be used to reorganize the original dataset (provided it has the same structure as the tiny-imagenet-100 dataset) into the structure Keras expects.
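For readers unfamiliar with that target layout, the sketch below shows the general idea of sorting images into one subdirectory per class. It is not the project's restructure_dataset.py: the function name, the file names, and the class names are hypothetical, and the label mapping is assumed to be available as a dictionary.

```python
import shutil
import tempfile
from pathlib import Path


def group_images_by_class(image_dir, labels, output_dir):
    """Copy each image into output_dir/<class_name>/ (one directory per class).

    `labels` maps an image file name to its class name (hypothetical input).
    Returns the sorted list of class directories that were created.
    """
    image_dir, output_dir = Path(image_dir), Path(output_dir)
    for file_name, class_name in labels.items():
        class_dir = output_dir / class_name
        class_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(image_dir / file_name, class_dir / file_name)
    return sorted(p.name for p in output_dir.iterdir() if p.is_dir())


# Usage with throwaway placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "raw"
    src.mkdir()
    labels = {"a.JPEG": "cat", "b.JPEG": "dog", "c.JPEG": "cat"}
    for name in labels:
        (src / name).write_bytes(b"")  # empty files stand in for images
    classes = group_images_by_class(src, labels, Path(tmp) / "train")
    cat_files = sorted(p.name for p in (Path(tmp) / "train" / "cat").iterdir())

print(classes, cat_files)
```

With the images arranged this way, flow_from_directory can infer each image's label from the name of its parent directory.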

Image Preprocessing

The Xception model expects its input images to be preprocessed in a specific way. However, because Keras’ built-in ImageDataGenerator is used, the input could not easily be preprocessed while training with the fit_generator() method.

Consequently, cnn_rnn_classifier.py defines a new class, CustomImageDataGenerator, that inherits from ImageDataGenerator and overrides its standardize() method, which ImageDataGenerator calls before each batch is yielded to fit_generator().

The standardize() method of CustomImageDataGenerator applies the Xception model’s required preprocessing to the input.
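As far as I know, the preprocessing Keras provides for Xception rescales pixel values from [0, 255] to [-1, 1]. The NumPy sketch below reproduces that arithmetic so the effect is visible without Keras installed. The function name is hypothetical, and the project's standardize() override may be implemented differently.

```python
import numpy as np


def scale_to_unit_range(batch):
    """Map pixel values from [0, 255] to [-1, 1]."""
    return np.asarray(batch, dtype=np.float32) / 127.5 - 1.0


# Black, mid-grey, and white pixel values as a quick check.
pixels = np.array([0.0, 127.5, 255.0])
scaled = scale_to_unit_range(pixels)
print(scaled)
```

An overridden standardize() method would apply an equivalent transformation to every image before it is batched.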
