visipedia / Tf_classification

License: MIT
Training, evaluation and testing code for image classification using TensorFlow


Projects that are alternatives to or similar to Tf_classification

Sytora
A sophisticated smart symptom search engine
Stars: ✭ 111 (-12.6%)
Mutual labels:  classification
Bird Recognition Review
A list of useful resources for bird sound (song and call) recognition, such as datasets, papers, and links to open source projects and competitions
Stars: ✭ 116 (-8.66%)
Mutual labels:  classification
Keras transfer cifar10
Object classification with CIFAR-10 using transfer learning
Stars: ✭ 120 (-5.51%)
Mutual labels:  classification
Pyani
Python module for average nucleotide identity analyses
Stars: ✭ 111 (-12.6%)
Mutual labels:  classification
Mlr
Machine Learning in R
Stars: ✭ 1,542 (+1114.17%)
Mutual labels:  classification
Sinaweibo Emotion Classification
Sina Weibo sentiment analysis application
Stars: ✭ 118 (-7.09%)
Mutual labels:  classification
Pointnet Keras
Keras implementation for Pointnet
Stars: ✭ 110 (-13.39%)
Mutual labels:  classification
Handwritten Digit Recognition Using Deep Learning
Handwritten Digit Recognition using Machine Learning and Deep Learning
Stars: ✭ 127 (+0%)
Mutual labels:  classification
Traffic Signs
Building a CNN based traffic signs classifier.
Stars: ✭ 115 (-9.45%)
Mutual labels:  classification
Project alias
Alias is a teachable “parasite” that is designed to give users more control over their smart assistants, both when it comes to customisation and privacy. Through a simple app the user can train Alias to react on a custom wake-word/sound, and once trained, Alias can take control over your home assistant by activating it for you.
Stars: ✭ 1,577 (+1141.73%)
Mutual labels:  classification
Shiftresnet Cifar
ResNet with Shift, Depthwise, or Convolutional Operations for CIFAR-100, CIFAR-10 on PyTorch
Stars: ✭ 112 (-11.81%)
Mutual labels:  classification
Cinc Challenge2017
ECG classification from short single lead segments (Computing in Cardiology Challenge 2017 entry)
Stars: ✭ 112 (-11.81%)
Mutual labels:  classification
Mobilenet
A Clearer and Simpler MobileNet Implementation in TensorFlow
Stars: ✭ 119 (-6.3%)
Mutual labels:  classification
Dni.pytorch
Implement Decoupled Neural Interfaces using Synthetic Gradients in Pytorch
Stars: ✭ 111 (-12.6%)
Mutual labels:  classification
Robovision
AI and machine learning-based computer vision for a robot
Stars: ✭ 126 (-0.79%)
Mutual labels:  classification
Natural Synaptic
Javascript Natural Language Classifier
Stars: ✭ 110 (-13.39%)
Mutual labels:  classification
Model Quantization
Collections of model quantization algorithms
Stars: ✭ 118 (-7.09%)
Mutual labels:  classification
Pytorch Speech Commands
Speech commands recognition with PyTorch
Stars: ✭ 128 (+0.79%)
Mutual labels:  classification
Free adv train
Official TensorFlow Implementation of Adversarial Training for Free! which trains robust models at no extra cost compared to natural training.
Stars: ✭ 127 (+0%)
Mutual labels:  classification
Ml Dl Scripts
The repository provides useful Python scripts for ML and data analysis
Stars: ✭ 119 (-6.3%)
Mutual labels:  classification

TensorFlow Classification

This repo contains training, testing, and classification code for image classification using TensorFlow. Both whole-image classification and multi-instance bounding box classification are supported.

Check out the Wiki for more detailed tutorials.


Requirements

TensorFlow 1.0+ is required. The code is tested with TensorFlow 1.3 and Python 2.7 on Ubuntu 16.04 and Mac OS X 10.11. Check out the requirements.txt file for a list of Python dependencies.
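
The dependencies can be installed with pip:

$ pip install -r requirements.txt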


Prepare the Data

The models require the image data to be in a specific format. You can use the Visipedia tfrecords repo to produce the files.

For the commands below, I'll assume that you have created a DATASET_DIR environment variable that points to the directory that contains your tfrecords:

$ export DATASET_DIR=/home/ubuntu/tf_datasets/cub
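
Before moving on, it can be worth sanity-checking that your tfrecords are readable. Here is a minimal sketch using the TF 1.x record iterator (the shard name is illustrative, and the feature keys depend on how the tfrecords were written):

import tensorflow as tf

# Peek at the first example in one shard and count the records.
path = '/home/ubuntu/tf_datasets/cub/train-00000-of-00010'  # illustrative shard name
count = 0
for record in tf.python_io.tf_record_iterator(path):
    if count == 0:
        example = tf.train.Example()
        example.ParseFromString(record)
        print(sorted(example.features.feature.keys()))
    count += 1
print('records in shard: %d' % count)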

Directory Structure

I have found that it's useful to have the following directory and file setup:

  • experiment/
    • logdir/
      • train_summaries/
      • val_summaries/
      • test_summaries/
      • results/
      • finetune/
        • train_summaries/
        • val_summaries/
    • cmds.txt
    • config_train.yaml
    • config_test.yaml
    • config_export.yaml

The purpose of each directory and file will be explained below.

The cmds.txt file is useful for saving the different training and testing commands. There are quite a few command-line arguments to some of the scripts, so it's convenient to compose the commands in an editor.

For the commands below, I'll assume that you have created an EXPERIMENT_DIR environment variable that points to your experiment directory:

$ export EXPERIMENT_DIR=/home/ubuntu/tf_experiments/cub
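
A shell sketch to create the layout above (some of the scripts may create missing directories on their own, so this is mostly for tidiness):

$ mkdir -p $EXPERIMENT_DIR/logdir/{train_summaries,val_summaries,test_summaries,results}
$ mkdir -p $EXPERIMENT_DIR/logdir/finetune/{train_summaries,val_summaries}
$ touch $EXPERIMENT_DIR/cmds.txt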

Configuration

There are example configuration files in the config directory. At the very least you'll need a config_train.yaml file, and you'll probably want a config_test.yaml file. It is convenient to copy the example configuration files into your experiment directory. See the configuration README for more details.
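
For example, assuming you are at the root of this repo (the file names here match the layout above; check the config directory for what is actually provided):

$ cp config/config_train.yaml config/config_test.yaml config/config_export.yaml $EXPERIMENT_DIR/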

Choose a Network Architecture

This repo currently supports the Google Inception, ResNet, and MobileNet flavors of networks. See the nets README for more information on the different Inception versions. At the moment, inception_v3 probably offers the best tradeoff in terms of size and performance, although it's always worth experimenting with a few different architectures. The README also contains links where you can download checkpoint files for the models. In most cases you should start your training from these checkpoint files rather than training from scratch.

You can specify the name of the chosen network in the configuration yaml file. Alternatively, you can pass it in as a command-line argument to most of the scripts.
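
For example, the inception_v3 checkpoint can be downloaded from the TF-Slim model zoo (this URL was current for the TF 1.x releases; the nets README has the canonical links):

$ wget http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz
$ tar -xzf inception_v3_2016_08_28.tar.gz   # extracts inception_v3.ckpt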

For the commands below, I'll assume that you have created an environment variable that points to the pretrained checkpoint file that you downloaded:

$ export PRETRAINED_MODEL=/home/ubuntu/tf_models/inception_v3.ckpt

Data Visualization

Now that you have a configuration file for training, it is a good idea to visualize the inputs to the network and ensure that they look correct. This lets you debug any problems with your tfrecords and experiment with different augmentation techniques. Visualize your data by doing:

$ CUDA_VISIBLE_DEVICES=1 python visualize_train_inputs.py \
--tfrecords $DATASET_DIR/train* \
--config $EXPERIMENT_DIR/config_train.yaml

If you are in a virtualenv and Matplotlib is complaining, you may need to modify your environment. See this FAQ and this document for fixing the issue. I use a virtualenv on my Mac OS X 10.11 machine, and I needed the PYTHONHOME workaround for Matplotlib to work properly. In this case the command looks like:

$ CUDA_VISIBLE_DEVICES=1 frameworkpython visualize_train_inputs.py \
--tfrecords $DATASET_DIR/train* \
--config $EXPERIMENT_DIR/config_train.yaml
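
For reference, the frameworkpython wrapper from the Matplotlib FAQ looks roughly like this (the Python path will vary per machine):

#!/bin/bash
# frameworkpython: run the framework Python with the virtualenv as PYTHONHOME
PYTHON=/usr/local/bin/python2.7  # path to the framework Python; adjust for your machine
# The virtualenv root is the parent of the bin/ directory holding this script.
ENV=`$PYTHON -c "import os; print(os.path.abspath(os.path.join(os.path.dirname(\"$0\"), '..')))"`
export PYTHONHOME=$ENV
exec $PYTHON "$@"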

Training and Validating

It's recommended to start from a pretrained network when training a network on your own data. However, this isn't necessary and you can train from scratch if you have enough data. The following warmup section assumes you are starting from a pretrained network. See the nets README to find links to pretrained checkpoint files.

Finetune A Pretrained Network

Finetuning a pretrained network essentially uses the pretrained network as a generic feature extractor and learns a new final layer that will output predictions for your target classes (rather than the original classes that the pretrained network was trained on). To do this, we will specify the pretrained model as the starting point, and only allow the logits layers to be modified. We can put the trained models in the experiment/logdir/finetune directory.

$ CUDA_VISIBLE_DEVICES=0 python train.py \
--tfrecords $DATASET_DIR/train* \
--logdir $EXPERIMENT_DIR/logdir/finetune \
--config $EXPERIMENT_DIR/config_train.yaml \
--pretrained_model $PRETRAINED_MODEL \
--trainable_scopes InceptionV3/Logits InceptionV3/AuxLogits \
--checkpoint_exclude_scopes InceptionV3/Logits InceptionV3/AuxLogits \
--learning_rate_decay_type fixed \
--lr 0.01 

Monitoring Progress

We'll want to monitor performance of the model on a validation set. Once the model performance starts to plateau we can assume that the final layer is warmed up and we can switch to full training. We can monitor the validation performance by running:

$ CUDA_VISIBLE_DEVICES=1 python test.py \
--tfrecords $DATASET_DIR/val* \
--save_dir $EXPERIMENT_DIR/logdir/finetune/val_summaries \
--checkpoint_path $EXPERIMENT_DIR/logdir/finetune \
--config $EXPERIMENT_DIR/config_test.yaml \
--batches 100 \
--eval_interval_secs 300

You may also want to monitor the accuracy on the train set. Simply pass the train tfrecords to the test.py script and change the output directory:

$ CUDA_VISIBLE_DEVICES=1 python test.py \
--tfrecords $DATASET_DIR/train* \
--save_dir $EXPERIMENT_DIR/logdir/finetune/train_summaries \
--checkpoint_path $EXPERIMENT_DIR/logdir/finetune \
--config $EXPERIMENT_DIR/config_test.yaml \
--batches 100 \
--eval_interval_secs 300

Keeping the train summaries and val summaries in separate directories will keep the TensorBoard UI clean. To monitor the training process, you can fire up TensorBoard:

$ tensorboard --logdir=$EXPERIMENT_DIR/logdir --port=6006

Training the Entire Network

The benefit of finetuning a network is that training is very fast, since only the last layer is modified. However, to get the best performance you'll typically want to modify more (or all) of the layers of the network. Starting from a pretrained network (which may itself be a finetuned network), this full training step essentially adapts the network to the domain of your specific dataset. We'll store the generated files in the experiment/logdir directory. You can do the finetuning process as a warmup and then start the full train:

$ CUDA_VISIBLE_DEVICES=0 python train.py \
--tfrecords $DATASET_DIR/train* \
--logdir $EXPERIMENT_DIR/logdir \
--config $EXPERIMENT_DIR/config_train.yaml \
--pretrained_model $EXPERIMENT_DIR/logdir/finetune

Or you can just start the full train from a pretrained model:

$ CUDA_VISIBLE_DEVICES=0 python train.py \
--tfrecords $DATASET_DIR/train* \
--logdir $EXPERIMENT_DIR/logdir \
--config $EXPERIMENT_DIR/config_train.yaml \
--pretrained_model $PRETRAINED_MODEL \
--checkpoint_exclude_scopes InceptionV3/Logits InceptionV3/AuxLogits

Or, if you have enough data, you may not want to use a pretrained model at all; rather, you can train from scratch:

$ CUDA_VISIBLE_DEVICES=0 python train.py \
--tfrecords $DATASET_DIR/train* \
--logdir $EXPERIMENT_DIR/logdir/ \
--config $EXPERIMENT_DIR/config_train.yaml

Monitoring Progress

To watch the validation performance we can run:

$ CUDA_VISIBLE_DEVICES=1 python test.py \
--tfrecords $DATASET_DIR/val* \
--save_dir $EXPERIMENT_DIR/logdir/val_summaries \
--checkpoint_path $EXPERIMENT_DIR/logdir \
--config $EXPERIMENT_DIR/config_test.yaml \
--batches 100 \
--eval_interval_secs 300

Similarly for the train data:

$ CUDA_VISIBLE_DEVICES=1 python test.py \
--tfrecords $DATASET_DIR/train* \
--save_dir $EXPERIMENT_DIR/logdir/train_summaries \
--checkpoint_path $EXPERIMENT_DIR/logdir \
--config $EXPERIMENT_DIR/config_test.yaml \
--batches 100 \
--eval_interval_secs 300

The command for tensorboard doesn't need to change:

$ tensorboard --logdir=$EXPERIMENT_DIR/logdir --port=6006

You will be able to see the finetune and the full train curves plotted on the same plots.


Test

Once performance on the validation data has plateaued (or some other criterion has been met), you can test the model on a held out set of images to see how well it generalizes to new data:

$ CUDA_VISIBLE_DEVICES=1 python test.py \
--tfrecords $DATASET_DIR/test* \
--save_dir $EXPERIMENT_DIR/logdir/test_summaries \
--checkpoint_path $EXPERIMENT_DIR/logdir \
--config $EXPERIMENT_DIR/config_test.yaml \
--batch_size 32 \
--batches 100

If you are happy with the performance of the model, then you are ready to classify new images and export the model for production use. Otherwise, it's back to the drawing board to figure out how to increase performance.


Classifying

If you want to classify data offline using the trained model then you can do:

$ CUDA_VISIBLE_DEVICES=1 python classify.py \
--tfrecords $DATASET_DIR/new/* \
--checkpoint_path $EXPERIMENT_DIR/logdir \
--save_path $EXPERIMENT_DIR/logdir/results/classification_results.npz \
--config $EXPERIMENT_DIR/config_test.yaml \
--batch_size 32 \
--batches 1000 \
--save_logits

The output of the script is an uncompressed NumPy .npz file saved at --save_path. The file will contain at least two arrays: one that contains ids and one that contains the predicted class labels. If --save_logits is specified, then the raw logits (before the softmax) will also be saved.
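
A minimal sketch for loading the results (the array names below are hypothetical; print results.files to see the actual keys that classify.py wrote):

import numpy as np

results = np.load('classification_results.npz')
print(results.files)  # lists the actual array names in the file
# Hypothetically, something like:
# ids = results['ids']
# labels = results['labels']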


Export & Compress

To export a model for easy use on a mobile device you can use:

$ python export.py \
--checkpoint_path model.ckpt-399739 \
--export_dir ./export \
--export_version 1 \
--config config_export.yaml \
--class_names class-codes.txt

The input node is called images and the output node is called Predictions. Check out this wiki article for more tips.
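
As a sketch of how the exported graph might be consumed (the .pb file name is hypothetical; the images and Predictions node names come from the note above, and inception_v3 expects 299x299 inputs):

import numpy as np
import tensorflow as tf

# Load the exported GraphDef (file name is hypothetical; use whatever export.py wrote).
with tf.gfile.GFile('./export/optimized_model.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name='')

with tf.Session(graph=graph) as sess:
    images = graph.get_tensor_by_name('images:0')
    predictions = graph.get_tensor_by_name('Predictions:0')
    # A single dummy image; real inputs must be preprocessed as the network expects.
    batch = np.zeros((1, 299, 299, 3), dtype=np.float32)
    print(sess.run(predictions, feed_dict={images: batch}).shape)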

If you are going to use the model with TensorFlow Serving then you can use the following:

$ python export.py \
--checkpoint_path model.ckpt-399739 \
--export_dir ./export \
--export_version 1 \
--config config_export.yaml \
--serving \
--add_preprocess \
--class_names class-codes.txt

Check out the resources in the tfserving directory for more help with deploying on TensorFlow Serving.
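
For example, a typical model server launch (these flags are from the TF Serving 1.x model server; the model name here is arbitrary, and the base path must be absolute):

$ tensorflow_model_server --port=9000 \
--model_name=tf_classification \
--model_base_path=/absolute/path/to/export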
