
dividiti / Ck Caffe

Licence: other
Collective Knowledge workflow for Caffe that automates installation across diverse platforms and supports collaborative evaluation and optimization of Caffe-based workloads across diverse hardware, software and data sets (compilers, libraries, tools, models, inputs).

Projects that are alternatives of or similar to Ck Caffe

ctuning-programs
Collective Knowledge extension with unified and customizable benchmarks (with extensible JSON meta information) to be easily integrated with customizable and portable Collective Knowledge workflows. You can easily compile and run these benchmarks using different compilers, environments, hardware and OS (Linux, MacOS, Windows, Android). More info:
Stars: ✭ 41 (-78.65%)
Mutual labels:  json-api, opencl, cuda
Primitiv
A Neural Network Toolkit.
Stars: ✭ 164 (-14.58%)
Mutual labels:  cmake, opencl, cuda
Parenchyma
An extensible HPC framework for CUDA, OpenCL and native CPU.
Stars: ✭ 71 (-63.02%)
Mutual labels:  opencl, cuda
Hashcat
World's fastest and most advanced password recovery utility
Stars: ✭ 11,014 (+5636.46%)
Mutual labels:  opencl, cuda
Spoc
Stream Processing with OCaml
Stars: ✭ 115 (-40.1%)
Mutual labels:  opencl, cuda
Ktt
Kernel Tuning Toolkit
Stars: ✭ 33 (-82.81%)
Mutual labels:  opencl, cuda
Soul Engine
Physically based renderer and simulation engine for real-time applications.
Stars: ✭ 37 (-80.73%)
Mutual labels:  opencl, cuda
Cltune
CLTune: An automatic OpenCL & CUDA kernel tuner
Stars: ✭ 114 (-40.62%)
Mutual labels:  opencl, cuda
Juice
The Hacker's Machine Learning Engine
Stars: ✭ 743 (+286.98%)
Mutual labels:  opencl, cuda
Nnvm
No description or website provided.
Stars: ✭ 1,639 (+753.65%)
Mutual labels:  opencl, cuda
Mixbench
A GPU benchmark tool for evaluating GPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL)
Stars: ✭ 130 (-32.29%)
Mutual labels:  opencl, cuda
Compactcnncascade
A binary library for very fast face detection using compact CNNs.
Stars: ✭ 152 (-20.83%)
Mutual labels:  opencl, cuda
Neanderthal
Fast Clojure Matrix Library
Stars: ✭ 927 (+382.81%)
Mutual labels:  opencl, cuda
Arraymancer
A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, Cuda and OpenCL backends
Stars: ✭ 793 (+313.02%)
Mutual labels:  opencl, cuda
Autodock Gpu
AutoDock for GPUs and other accelerators
Stars: ✭ 65 (-66.15%)
Mutual labels:  opencl, cuda
Pyopencl
OpenCL integration for Python, plus shiny features
Stars: ✭ 790 (+311.46%)
Mutual labels:  opencl, cuda
Futhark
💥💻💥 A data-parallel functional programming language
Stars: ✭ 1,641 (+754.69%)
Mutual labels:  opencl, cuda
Luxcore
LuxCore source repository
Stars: ✭ 601 (+213.02%)
Mutual labels:  opencl, cuda
Vexcl
VexCL is a C++ vector expression template library for OpenCL/CUDA/OpenMP
Stars: ✭ 626 (+226.04%)
Mutual labels:  opencl, cuda
Babelstream
STREAM, for lots of devices written in many programming models
Stars: ✭ 121 (-36.98%)
Mutual labels:  opencl, cuda


News

  • 20181205: Caffe for Android appears to fail to build with the latest NDK. However, we have verified that CK can still build Caffe for Android automatically with NDK r13b and Boost 1.64, as described here.

cknowledge.org/ai: Crowdsourcing benchmarking and optimisation of AI

A suite of open-source tools for collecting knowledge on optimising AI:

Collective Knowledge repository for collaboratively optimising Caffe-based designs

Introduction

CK-Caffe is an open framework for collaborative and reproducible optimisation of convolutional neural networks. It is based on the Caffe framework from the Berkeley Vision and Learning Center (BVLC) and on the Collective Knowledge framework from the cTuning Foundation, which provides customizable cross-platform builds and experimental workflows with a JSON API (see CK intro for more details: 1, 2). In essence, CK-Caffe is an open-source suite of convenient wrappers and workflows with a unified JSON API for simple and customized building, evaluation and multi-objective optimisation of various Caffe implementations (CPU, CUDA, OpenCL) across diverse platforms, from mobile devices and IoT to supercomputers.

As outlined in our vision, we invite the community to collaboratively design and optimize convolutional neural networks to meet the performance, accuracy and cost requirements for deployment on a range of form factors - from sensors to self-driving cars. To this end, CK-Caffe leverages the key capabilities of CK to crowdsource experimentation across diverse platforms, CNN designs, optimization options, and so on; exchange experimental data in a flexible JSON-based format; and apply leading-edge predictive analytics to extract valuable insights from the experimental data.

See cKnowledge.org/ai, reproducible and CK-powered AI/SW/HW co-design competitions at ACM/IEEE conferences, shared optimization statistics, reusable AI artifact in the CK format and online demo of CK AI API with self-optimizing DNN for more details.

Maintainers

  • Linux/MacOS: dividiti - not actively maintained
  • Windows: currently no maintainer

Authors/contributors

Public benchmarking results

Comparing the accuracy of 4 models

In this Jupyter notebook, we compare the Top-1 and Top-5 accuracy of 4 models:

The experimental data (stored in the main CK-Caffe repository under 'experiment') essentially confirms that SqueezeNet matches (and even slightly exceeds) the accuracy of AlexNet on the ImageNet validation set (50,000 images).
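
For readers unfamiliar with the metrics, the following is a minimal sketch of how Top-1 and Top-5 accuracy can be computed from ranked predictions. It is not taken from the notebook: the input format (correct label followed by predictions ordered from most to least confident), the helper name `topk_accuracy` and the toy labels are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read lines of the form
#   <correct label> <prediction 1> <prediction 2> ...
# and print the fraction of lines whose correct label appears
# among the first k predictions.
topk_accuracy() {
    awk -v k="$1" '
        {
            hit = 0
            for (i = 2; i <= NF && i <= k + 1; i++)
                if ($i == $1) hit = 1
            hits += hit
        }
        END { if (NR > 0) printf "%.2f\n", hits / NR }
    '
}

# Three toy samples with five ranked predictions each (made-up data).
samples='cat cat dog fox owl bee
dog cat dog fox owl bee
owl cat dog fox bee ant'

echo "Top-1: $(printf '%s\n' "$samples" | topk_accuracy 1)"
echo "Top-5: $(printf '%s\n' "$samples" | topk_accuracy 5)"
```

Top-5 accuracy is never lower than Top-1 accuracy, because a correct first prediction also counts as a hit within the first five.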

Comparing the performance across models and configurations

We have performed several detailed performance analysis studies across a range of platforms using CK-Caffe. The following results are publicly available:

Quick installation on Ubuntu

Please refer to our Installation Guide for detailed instructions for Ubuntu, Gentoo, Yocto, RedHat, CentOS, Windows and Android.

Installing general dependencies

$ sudo apt install coreutils \
                   build-essential \
                   make \
                   cmake \
                   wget \
                   git \
                   python \
                   python-pip

Installing essential Caffe dependencies

$ sudo apt install libleveldb-dev \
                   libsnappy-dev \
                   gfortran

Installing optional Caffe dependencies

CK can automatically build the following dependencies from source using versions that should work well together. Installing via apt, however, is somewhat faster.

$ sudo apt install libboost-all-dev \
                   libgflags-dev \
                   libgoogle-glog-dev \
                   libhdf5-serial-dev \
                   liblmdb-dev \
                   libprotobuf-dev \
                   protobuf-compiler \
                   libopencv-dev

Installing CK

$ sudo pip install ck

Skip "sudo" if installing on Windows.

Alternatively, you can install CK in user space (without root privileges) as follows:

$ git clone http://github.com/ctuning/ck ck-master
$ export PATH=$PWD/ck-master/bin:$PATH
$ export PYTHONPATH=$PWD/ck-master:$PYTHONPATH

Testing CK

$ ck version
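
If `ck version` fails, it can help to check which of the tools installed in the steps above are actually on your PATH. The following is a small sketch, not part of CK: `missing_tools` is a hypothetical helper, and the tool list is an assumption based on the apt and pip commands above (some systems provide `python3` and `pip3` instead of `python` and `pip`).

```shell
#!/bin/sh
# Hypothetical helper: print the name of every command in the
# argument list that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tool names assumed from the installation steps above;
# adjust the list for your distribution.
missing=$(missing_tools make cmake wget git python pip ck)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All required tools found."
fi
```

If `ck` is reported missing after the user-space installation, check that the `export PATH=...` line was run in the current shell.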

We suggest configuring CK to install packages into CK virtual environment entries (env):

$ ck set kernel var.install_to_env=yes

Installing CK-Caffe repository

$ ck pull repo:ck-caffe --url=https://github.com/dividiti/ck-caffe

Installing CK packages

The latest Caffe often conflicts with an older protobuf version already installed on the system. We therefore suggest installing protobuf via CK before installing Caffe:

$ ck install package --tags=protobuf-host

Building Caffe and all dependencies via CK

The first time you run the Caffe benchmark (on Linux or Windows), CK will build and install all missing dependencies for your machine, download the required data sets and start the benchmark:

$ ck run program:caffe

When multiple choices are available, CK may ask you to select which detected software and packages to use for installation. In such cases, we suggest either accepting the default value (just press Enter) or choosing the stable (recommended) versions.

Testing installation via image classification

 $ ck compile program:caffe-classification --speed
 $ ck run program:caffe-classification

Note that you will be asked to select a JPEG image from available CK data sets. We have added standard demo images (cat.jpg, catgrey.jpg, fish-bike.jpg, computer_mouse.jpg) to the 'ctuning-datasets-min' repository.

You can list them via:

 $ ck pull repo:ctuning-datasets-min
 $ ck search dataset --tags=dnn

You can reduce the number of interactive prompts for selecting software dependencies by adding the "--reuse_deps" flag during compilation, i.e.

 $ ck compile program:caffe-classification --speed --reuse_deps
 $ ck run program:caffe-classification

If you have the Android SDK and NDK installed, you can compile and run the same classification example on an Android device connected to your host machine via ADB as follows:

 $ ck compile program:caffe-classification --speed --target_os=android21-arm64
 $ ck run program:caffe-classification --target_os=android21-arm64

Participating in collaborative evaluation and optimization of various Caffe engines and models (on-going crowd-benchmarking)

You can participate in crowd-benchmarking of Caffe via:

$ ck crowdbench caffe --user={your email or ID to acknowledge contributions} --env.CK_CAFFE_BATCH_SIZE=5

During collaborative benchmarking, you can select various engines (which will be built on your machine) and models for evaluation.

You can also manually install additional flavours of Caffe engines across diverse hardware and OS (Linux/Windows/Android on Odroid, Raspberry Pi, ARM, Intel, AMD, NVIDIA, etc.) as described here.

You can also install extra models as follows:

 $ ck list package --tags=caffemodel
 $ ck install package:{name of above packages}

You can even evaluate DNN engines on Android mobile devices connected to your host machine via ADB:

$ ck crowdbench caffe --target_os=android21-arm64 --env.CK_CAFFE_BATCH_SIZE=1

Feel free to try different batch sizes by changing the command-line option --env.CK_CAFFE_BATCH_SIZE.

You can crowd-benchmark Caffe on Windows without recompilation by using the Caffe CPU or OpenCL binaries pre-built by CK. Install these binaries as follows:

 $ ck install package:lib-caffe-bvlc-master-cpu-bin-win

or

 $ ck install package:lib-caffe-bvlc-opencl-libdnn-viennacl-bin-win

You can also use this Android app to crowdsource benchmarking of ARM-based Caffe libraries for image recognition.

You can see continuously aggregated results in the public Collective Knowledge repository.

You can also open this website from the command line:

 $ ck browse experiment.bench.caffe

Unifying multi-dimensional and multi-objective autotuning

It is also possible to take advantage of our universal multi-objective CK autotuner to optimize Caffe. As a first simple example, we added batch size tuning via CK. You can invoke it as follows:

$ ck autotune caffe

All results will be recorded in the local CK repository and you will be given command lines to plot graphs or replay experiments such as:

$ ck plot graph:{experiment UID}
$ ck replay experiment:{experiment UID} --point={specific optimization point}

Unifying AI API

CK allows us to unify AI interfaces while collaboratively optimizing the underlying engines. For example, we added similar support to install, use and evaluate Caffe2 and TensorFlow via CK:

$ ck pull repo:ck-caffe2
$ ck pull repo:ck-tensorflow

$ ck install package:lib-caffe2-master-eigen-cpu-universal --env.CAFFE_BUILD_PYTHON=ON
$ ck install package:lib-tensorflow-1.1.0-cpu
$ ck install package:lib-tensorflow-1.1.0-cuda

$ ck run program:caffe2 --cmd_key=classify
$ ck run program:tensorflow --cmd_key=classify

$ ck crowdbench caffe2 --env.BATCH_SIZE=5 --user=i_want_to_ack_my_contribution
$ ck crowdbench tensorflow --env.BATCH_SIZE=5 --user=i_want_to_ack_my_contribution

$ ck autotune caffe2
$ ck autotune tensorflow
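
Note that the commands in this README pass the batch size to Caffe as CK_CAFFE_BATCH_SIZE but to Caffe2 and TensorFlow as BATCH_SIZE. The sketch below shows one way to hide that difference in a script. The function name `crowdbench_cmd` is hypothetical; the function only prints commands and never runs them.

```shell
#!/bin/sh
# Hypothetical helper: print the crowd-benchmarking command for a
# framework and batch size. The environment variable names are the
# ones used in the commands shown in this README.
crowdbench_cmd() {
    case "$1" in
        caffe) var=CK_CAFFE_BATCH_SIZE ;;
        caffe2|tensorflow) var=BATCH_SIZE ;;
        *) echo "unknown framework: $1" >&2; return 1 ;;
    esac
    printf 'ck crowdbench %s --env.%s=%s\n' "$1" "$var" "$2"
}

# Print the command for each framework with the same batch size.
for fw in caffe caffe2 tensorflow; do
    crowdbench_cmd "$fw" 5
done
```

Printing the commands first makes it easy to review them before pasting them into a terminal.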

Creating dataset subsets

The ILSVRC2012 validation dataset contains 50K images. For quick experiments, you can create a subset of this dataset, as follows. Run:

$ ck install package:imagenet-2012-val-lmdb-256

When prompted, enter the number of images to convert to LMDB, say, N = 100. The first N images will be taken.

Creating realistic/representative training sets

We have provided an option in all our AI crowd-tuning tools that lets the community report and share mispredictions (the image, the correct label and the mispredicted label) to gradually and collaboratively build realistic data and training sets.

Customizing Caffe benchmarking via CK command line

You can customize various Caffe parameters, such as the batch size and the number of iterations, via the CK command line:

$ ck run program:caffe --env.CK_CAFFE_BATCH_SIZE=1 --env.CK_CAFFE_ITERATIONS=10
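
To compare several batch sizes by hand, the command above can be wrapped in a loop. The following is a minimal sketch rather than a CK feature: `caffe_cmd` and the `RUN` switch are names invented for this example, and the batch sizes are arbitrary. By default the script only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper: build the benchmark command for one batch size
# and iteration count, using only the two variables shown above.
caffe_cmd() {
    printf 'ck run program:caffe --env.CK_CAFFE_BATCH_SIZE=%s --env.CK_CAFFE_ITERATIONS=%s\n' "$1" "$2"
}

# Sweep a few batch sizes with 10 iterations each.
# Commands are printed only; set RUN=1 (a switch local to this
# sketch) to execute them instead.
for batch in 1 2 4 8; do
    cmd=$(caffe_cmd "$batch" 10)
    if [ "${RUN:-0}" = "1" ]; then
        $cmd
    else
        echo "$cmd"
    fi
done
```

For an automated search over batch sizes, `ck autotune caffe` (described above) is the approach this README provides.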

Installing CK on Windows, Android and various flavours of Linux

You can find details about CK-Caffe installation for Windows, various flavours of Linux and Android here.

Online demo of a unified CK-AI API

  • Simple demo to classify images with continuous optimization of DNN engines underneath, sharing of mispredictions and creation of a community training set; and to predict compiler optimizations based on program features.

Next steps

CK-Caffe is part of an ambitious long-term and community-driven project to enable collaborative and systematic optimization of realistic workloads across diverse hardware in terms of performance, energy usage, accuracy, reliability, hardware price and other costs (ARM TechCon'16 talk and demo, DATE'16, CPC'15).

We are working with the community to unify and crowdsource performance analysis and tuning of various DNN frameworks (or any representative workloads) using Collective Knowledge Technology:

We continue to gradually expose various design and optimization choices including full parameterization of existing models.

Open R&D challenges

We use crowd-benchmarking and crowd-tuning of such realistic workloads across diverse hardware for open academic and industrial R&D challenges - join this community effort!

Related publications with long-term vision

@inproceedings{Lokhmotov:2016:OCN:2909437.2909449,
 author = {Lokhmotov, Anton and Fursin, Grigori},
 title = {Optimizing Convolutional Neural Networks on Embedded Platforms with OpenCL},
 booktitle = {Proceedings of the 4th International Workshop on OpenCL},
 series = {IWOCL '16},
 year = {2016},
 location = {Vienna, Austria},
 url = {http://doi.acm.org/10.1145/2909437.2909449},
 acmid = {2909449},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {Convolutional neural networks, OpenCL, collaborative optimization, deep learning, optimization knowledge repository},
}

@inproceedings{ck-date16,
    title = {{Collective Knowledge}: towards {R\&D} sustainability},
    author = {Fursin, Grigori and Lokhmotov, Anton and Plowman, Ed},
    booktitle = {Proceedings of the Conference on Design, Automation and Test in Europe (DATE'16)},
    year = {2016},
    month = {March},
    url = {https://www.researchgate.net/publication/304010295_Collective_Knowledge_Towards_RD_Sustainability}
}

Testimonials and awards

Troubleshooting

  • When compiling the OpenCL version of Caffe on Linux for an NVIDIA GPU, select the generic x86_64/libOpenCL.so rather than the NVIDIA OpenCL driver when CK asks you to choose.

Feedback

Feel free to engage with our community via this mailing list:
