Deep Learning benchmarks

TO DO: Update with corrected results for Keras 2 (and possibly PyTorch as well)

These deep learning benchmarks are largely inspired by vgg-benchmarks.

We tried to get the most out of each framework (GPU utilization is at 99% for all scripts), but some optimizations may have been overlooked. Fixes and contributions are welcome!

Maxwell Titan X results

Standard VGG16 benchmark

| Framework | Time [1] |
| --- | --- |
| Keras (Theano backend) | 241.478 ms |
| Keras (TensorFlow backend) | 362.206 ms [2] |
| TensorFlow NHWC, no XLA | 365.122 ms |
| TensorFlow NHWC, with XLA | 300.424 ms |
| TensorFlow NCHW, no XLA | 298.478 ms |
| TensorFlow NCHW, with XLA | 294.063 ms |

VGG16 + Batch Normalization (BN) benchmark

| Framework | Time [1] |
| --- | --- |
| Keras (Theano backend) + BN | 347.546 ms |
| Keras (TensorFlow backend) + BN mode 0 | 560.938 ms |
| TensorFlow NHWC + BN, no XLA | 493.235 ms |
| TensorFlow NHWC + BN, with XLA | 341.702 ms |
| TensorFlow NHWC + fused BN, no XLA | 395.963 ms |
| TensorFlow NHWC + fused BN, with XLA | 450.777 ms |
| TensorFlow NCHW + BN, no XLA | 3642.178 ms |
| TensorFlow NCHW + BN, with XLA | 326.325 ms |
| TensorFlow NCHW + fused BN, no XLA | 322.396 ms |
| TensorFlow NCHW + fused BN, with XLA | 345.121 ms |

[1]: Mean time per trial over 100 (forward + backward + weight update) trials on a VGG16 network with a mini-batch size of 16. The timer is started right before the first trial and stopped right after the last trial; the reported time is that interval divided by the number of trials.

[2]: Note that, at the moment, Keras uses the traditional NHWC TensorFlow ordering.
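
For reference, a minimal sketch of the timing protocol in [1]; the `train_step` callable is a placeholder for one full iteration, not a function from this repository:

```python
# Sketch of the timing protocol described in footnote [1]: start one timer,
# run n_trials full iterations, divide the elapsed time by the number of trials.
# `train_step` is a placeholder for a single forward + backward + weight update.
import time

def benchmark(train_step, n_trials=100):
    start = time.time()
    for _ in range(n_trials):
        train_step()
    elapsed = time.time() - start
    return 1000.0 * elapsed / n_trials  # mean time per trial, in ms
```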

System specs

  • Ubuntu 14.04
  • Cuda 8.0
  • cuDNN 5.1.10
  • theano '0.9.0beta1.dev-173eef98360c23d7418bad3a36f5fb938724f05f' (cuda backend)
  • tensorflow 1.0.0 (compiled from source with CUDA 8.0 cuDNN 5.1.10 and XLA JIT)
  • Keras 2.0.1

Usage

python main.py

optional arguments:

  --run_keras           Run keras benchmark
  --run_tensorflow      Run pure tensorflow benchmark
  --batch_size BATCH_SIZE
                        Batch size
  --n_trials N_TRIALS   Number of full iterations (forward + backward +
                        update)
  --use_XLA             Whether to use XLA compiler
  --data_format DATA_FORMAT
                        Tensorflow image format
  --use_bn              Use batch normalization (tf benchmark)
  --use_fused           Use fused batch normalization (tf benchmark)

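For context, a minimal argparse sketch of how a CLI with these flags could be wired; the defaults shown here are illustrative assumptions, not copied from main.py:

```python
# Illustrative sketch of a CLI with the flags listed above.
# Defaults are assumptions (batch size 16, 100 trials, NHWC), not the repository's values.
import argparse

def parse_args():
    parser = argparse.ArgumentParser(description="VGG16 benchmark")
    parser.add_argument("--run_keras", action="store_true", help="Run keras benchmark")
    parser.add_argument("--run_tensorflow", action="store_true", help="Run pure tensorflow benchmark")
    parser.add_argument("--batch_size", type=int, default=16, help="Batch size")
    parser.add_argument("--n_trials", type=int, default=100,
                        help="Number of full iterations (forward + backward + update)")
    parser.add_argument("--use_XLA", action="store_true", help="Whether to use XLA compiler")
    parser.add_argument("--data_format", type=str, default="NHWC",
                        help="Tensorflow image format (NHWC or NCHW)")
    parser.add_argument("--use_bn", action="store_true", help="Use batch normalization (tf benchmark)")
    parser.add_argument("--use_fused", action="store_true", help="Use fused batch normalization (tf benchmark)")
    return parser.parse_args()

if __name__ == "__main__":
    print(parse_args())
```
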
Examples

python main.py --run_keras --keras_backend theano

This runs a Keras benchmark with the Theano backend.

python main.py --run_tensorflow --data_format NHWC --use_XLA

This runs a pure TensorFlow benchmark with NHWC image ordering and the XLA compiler enabled, as described in TensorFlow's "Using JIT compilation" guide.
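
For context, XLA JIT can be switched on globally through the TensorFlow 1.x session config; the sketch below only shows that knob, and how main.py actually wires --use_XLA may differ:

```python
# Enabling XLA JIT compilation globally for a TensorFlow 1.x session.
# This is the session-config approach from the "Using JIT compilation" guide;
# it is a sketch, not necessarily how main.py implements --use_XLA.
import tensorflow as tf

config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1

with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    # ... run the benchmark graph here ...
```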

Notes

If running a Keras TensorFlow benchmark, make sure the ~/.keras/keras.json file is set to { "image_dim_ordering": "tf", "epsilon": 1e-07, "floatx": "float32", "backend": "tensorflow" }.

If running a Keras Theano benchmark, make sure the ~/.keras/keras.json file is set to { "image_dim_ordering": "th", "epsilon": 1e-07, "floatx": "float32", "backend": "theano" }.
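
As a quick sanity check (not part of the benchmark itself), the backend and image ordering that Keras actually picked up can be printed from Python; note that Keras 2 reports the ordering via `image_data_format` rather than the older `image_dim_ordering` wording:

```python
# Quick sanity check that Keras picked up the intended backend and image ordering.
# In Keras 2, K.image_data_format() returns "channels_last" (TensorFlow-style NHWC)
# or "channels_first" (Theano-style NCHW).
from keras import backend as K

print(K.backend())
print(K.image_data_format())
```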
