License: MIT
A PyTorch Implementation of "Densely Connected Convolutional Networks"


Densely Connected Convolutional Networks

This is a PyTorch implementation of the DenseNet architecture as described in Densely Connected Convolutional Networks by G. Huang, Z. Liu, K. Weinberger, and L. van der Maaten.


To-do

  • Multi-GPU support
  • Unique model checkpointing (clashes can currently occur)

Requirements

  • Python 3
  • PyTorch (newest version)
  • tqdm
  • tensorboard_logger
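
These can be installed with pip, for example (the last two are plain PyPI packages; for PyTorch itself, the install command from pytorch.org for your platform is the safer choice):

pip install torch tqdm tensorboard_logger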

Usage

This implementation currently supports training on the CIFAR-10 and CIFAR-100 datasets (support for ImageNet coming soon).

When training a model, specify whether to use the bottleneck variant of the dense block with --bottleneck and, if so, which compression factor to use with --compression. You should also specify the total number of layers in the model with --num_layers_total. By default, data augmentation is performed, which means no dropout is used; if you turn augmentation off, you should specify the desired dropout rate with --dropout_rate.
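
For example, a hypothetical command for training a plain DenseNet-40 without augmentation and with dropout could look like this (the flag values here are purely illustrative; see the full option list below):

python main.py \
--num_layers_total=40 \
--bottleneck=False \
--augment=False \
--dropout_rate=0.2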

Furthermore, a checkpoint of the model is saved at the end of every epoch. This means you can resume training from your latest epoch with the --resume=True argument; note that this only works after you've run at least one epoch of training. When testing a model, reuse whatever command you used to train it and add the --is_train=False argument. This will load the model with the best validation accuracy and evaluate it on the test set.
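
For instance, assuming the DenseNet-BC-100 command shown at the end of this section was used for training, resuming and then evaluating that model might look like this (illustrative):

python main.py --num_layers_total=100 --bottleneck=True --compression=0.5 --resume=True
python main.py --num_layers_total=100 --bottleneck=True --compression=0.5 --is_train=False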

Note that you can use TensorBoard to view losses and accuracies by setting the --use_tensorboard argument, in which case you need to run tensorboard --logdir=./logs/ in a separate shell.

Finally, to see all possible options, run:

python main.py --help

which will print:

usage: main.py [-h] [--num_blocks NUM_BLOCKS]
               [--num_layers_total NUM_LAYERS_TOTAL]
               [--growth_rate GROWTH_RATE] [--bottleneck BOTTLENECK]
               [--compression COMPRESSION] [--dataset DATASET]
               [--valid_size VALID_SIZE] [--batch_size BATCH_SIZE]
               [--num_worker NUM_WORKER] [--augment AUGMENT]
               [--shuffle SHUFFLE] [--show_sample SHOW_SAMPLE]
               [--is_train IS_TRAIN] [--epochs EPOCHS] [--init_lr INIT_LR]
               [--momentum MOMENTUM] [--weight_decay WEIGHT_DECAY]
               [--lr_decay LR_DECAY] [--dropout_rate DROPOUT_RATE]
               [--random_seed RANDOM_SEED] [--data_dir DATA_DIR]
               [--ckpt_dir CKPT_DIR] [--logs_dir LOGS_DIR] [--num_gpu NUM_GPU]
               [--use_tensorboard USE_TENSORBOARD] [--resume RESUME]
               [--print_freq PRINT_FREQ]

DenseNet

optional arguments:
  -h, --help            show this help message and exit

Network:
  --num_blocks NUM_BLOCKS
                        # of Dense blocks to use in the network
  --num_layers_total NUM_LAYERS_TOTAL
                        Total # of layers in the network
  --growth_rate GROWTH_RATE
                        Growth rate (k) of the network
  --bottleneck BOTTLENECK
                        Whether to use bottleneck layers
  --compression COMPRESSION
                        Compression factor theta in the range [0, 1]

Data:
  --dataset DATASET     Which dataset to work with. Can be CIFAR10, CIFAR100
                        or Imagenet
  --valid_size VALID_SIZE
                        Proportion of training set used for validation
  --batch_size BATCH_SIZE
                        # of images in each batch of data
  --num_worker NUM_WORKER
                        # of subprocesses to use for data loading
  --augment AUGMENT     Whether to apply data augmentation or not
  --shuffle SHUFFLE     Whether to shuffle the dataset after every epoch
  --show_sample SHOW_SAMPLE
                        Whether to visualize a sample grid of the data

Training:
  --is_train IS_TRAIN   Whether to train or test the model
  --epochs EPOCHS       # of epochs to train for
  --init_lr INIT_LR     Initial learning rate value
  --momentum MOMENTUM   Nesterov momentum value
  --weight_decay WEIGHT_DECAY
                        weight decay penalty
  --lr_decay LR_DECAY, --list LR_DECAY
                        List containing fractions of the total number of
                        epochs in which the learning rate is decayed. Enter
                        empty string if you want a constant lr.
  --dropout_rate DROPOUT_RATE
                        Dropout rate used with non-augmented datasets

Misc:
  --random_seed RANDOM_SEED
                        Seed to ensure reproducibility
  --data_dir DATA_DIR   Directory in which data is stored
  --ckpt_dir CKPT_DIR   Directory in which to save model checkpoints
  --logs_dir LOGS_DIR   Directory in which Tensorboard logs wil be stored
  --num_gpu NUM_GPU     # of GPU's to use. A value of 0 will run on the CPU
  --use_tensorboard USE_TENSORBOARD
                        Whether to use tensorboard for visualization
  --resume RESUME       Whether to resume training from most recent checkpoint
  --print_freq PRINT_FREQ
                        How frequently to display training details on screen

You can edit the default values of these arguments in the config.py file.

Here's an example command for training a DenseNet-BC-100 architecture with a growth rate of 12 (the default), data augmentation, TensorBoard visualization, and a single GPU:

python main.py \
--num_layers_total=100 \
--bottleneck=True \
--compression=0.5 \
--num_gpu=1 \
--use_tensorboard=True

Performance

I trained the DenseNet-40 and DenseNet-BC-100 variants on the CIFAR-10 dataset but was not able to reproduce the authors' results. I don't know whether this stems from an error in the implementation or from an unlucky seed. Training the two models in parallel took 2 days on a p2.xlarge AWS instance with 1 GPU. If I get the time, I'll write up clean, minimal instructions for setting up a similar instance for free.

Model              Test Error
DenseNet-40        9%
DenseNet-BC-100    ~7%

Here are some tensorboard visualizations comparing the two models:


From the losses and accuracies, it appears that decreasing the learning rate earlier than the paper's schedule suggests can shorten training time considerably. In fact, I noticed during training that the train accuracy and loss would stagnate for a dozen epochs and then jump significantly when the learning rate was decreased halfway through. I'll test this intuition and report my findings at a later date.

References

  • Thanks to Taehoon Kim for inspiring the general file hierarchy and layout of this project.
  • Thanks to the PyTorch ImageNet training example for helping me code the Trainer class.