BENN

Codes for Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit?

CVPR 2019 Paper

:octocat: If you use the code, please cite our paper: BibTeX

If you have any questions related to the code or models, please open an issue. If you have general questions about the principles of BENN or ideas for improving it, please contact us by email: [email protected], [email protected]. Please do not use this code commercially without permission from the authors.

Notice: As mentioned in the paper (Section 7), we are aware of the overfitting problem caused by the ensemble technique. If you retrain the models, the results should largely match those shown in the paper and here, but may be slightly higher or lower due to random initialization, epoch selection, overfitting, etc. If you have a good idea for resolving the overfitting issue of ensemble methods, please contact the authors and we can further improve BENN.

Train BENN on CIFAR-10 dataset

A customized Network-In-Network (NIN) model is used. Please see the paper for architecture details.

| Ensemble | Model | Train | LR | BNN (start, %) | BENN (end, %) | Overfitting from (ensemble size) | Best Voting | Models Directory | Logs |
|---|---|---|---|---|---|---|---|---|---|
| Bagging | AB | Seq | 0.0001 | 67.35 | 81.32 | 20 | Soft Max Vote | models | L |
| BoostA | AB | Seq | 0.01 | 67.08 | 81.93 | 25 | Soft Max Vote | models | L |
| BoostA | AB | Indp | 0.01 | 70.59 | 82.12 | 20 | Soft Max Vote | models | L |
| BoostB | AB | Seq | 0.01 | 62.87 | 82.58 | 30 | Soft Max Vote | models | L |
| BoostB | AB | Indp | 0.01 | 69.65 | 82.13 | 21 | Soft Max Vote | models | L |
| BoostC | AB | Seq | 0.0001 | 67.88 | 79.40 | 27 | Soft Max Vote | models | L |
| BoostD | AB | Indp | 0.001 | 68.72 | 82.04 | 22 | Soft Max Vote | models | L |
| Bagging | SB | Seq | 0.001 | 77.87 | 89.12 | 25 | Soft Max Vote | models | L |
| BoostA | SB | Seq | 0.01 | 80.33 | 88.12 | 15 | Soft Max Vote | models | L |
| BoostB | SB | Seq | 0.001 | 84.23 | 87.9 | 31 | Soft Max Vote | models | L |
| BoostC | SB | Seq | 0.001 | 83.68 | 89.00 | 25 | Soft Max Vote | models | L |
| BoostC | SB | Indp | 0.01 | 80.38 | 87.72 | 23 | Soft Max Vote | models | L |
| BoostD | SB | Seq | 0.001 | 84.5 | 88.83 | 24 | Soft Max Vote | models | L |

Hints

Generally, we have:

  • 🏠 2 different models (you can specify with --arch allbinnet/nin), corresponding to the AB and SB models in the paper
  • 2 different training modes (independent training and sequential training)
  • ⚙️ 5 different ensemble schemes (Bagging, Boost A, Boost B, Boost C, and Boost D)
  • 📊 2 voting strategies (hard majority vote and soft max vote); see the sketch below
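
To make the two voting strategies concrete, here is a minimal sketch (not code from this repo; the `logits` tensor is a hypothetical stack of per-network outputs) of soft max voting versus hard majority voting:

```python
import torch

# Hypothetical tensor of shape (num_ensembles, batch_size, num_classes)
# holding the raw outputs of each BNN in the ensemble.
logits = torch.randn(5, 8, 10)

# Soft max vote: average the per-network softmax probabilities,
# then predict the class with the highest mean probability.
soft_probs = torch.softmax(logits, dim=-1).mean(dim=0)  # (batch, classes)
soft_pred = soft_probs.argmax(dim=-1)                   # (batch,)

# Hard majority vote: each network casts one vote for its argmax class;
# the most-voted class wins.
votes = logits.argmax(dim=-1)                           # (ensembles, batch)
hard_pred = votes.mode(dim=0).values                    # (batch,)
```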

Retrain models

For example:

$ python main_bagging_SB.py --epochs 0 --retrain_epochs 100 --root_dir PATH/TO/YOUR/models_bagging_SB/

Test pre-trained models

First download the models from the links above, then run the corresponding Python script to test the pre-trained models; you should get exactly the same numbers as in our logs above. For example:

$ python main_bagging_SB.py --epochs 0 --retrain_epochs 0 --root_dir PATH/TO/YOUR/DOWNLOADED/models_bagging_SB/

Notice: For AB models, you should get around 79-82% accuracy with 32 ensembles. For SB models, you should get around 87-89% accuracy with 32 ensembles (usually 15-20 is a reasonable choice due to overfitting). A single BNN should reach around 69-73% and 83-84% accuracy for the AB and SB models, respectively.
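
If you want to see where accuracy saturates or degrades as the ensemble grows (e.g., the 15-20 sweet spot mentioned above), a sweep like the following can help. This is a hedged sketch, not repo code: `models`, `test_loader`, and the soft-max-vote evaluation are hypothetical placeholders.

```python
import torch

@torch.no_grad()
def accuracy_vs_ensemble_size(models, test_loader, device="cuda"):
    """Soft-max-vote accuracy as the ensemble grows one BNN at a time.

    `models` is a hypothetical list of trained BNNs in ensemble order;
    entry k of the returned list is the accuracy (%) of the first k+1 nets.
    """
    accuracies = []
    for k in range(1, len(models) + 1):
        correct, total = 0, 0
        for images, labels in test_loader:
            images, labels = images.to(device), labels.to(device)
            # Average softmax probabilities over the first k networks.
            probs = torch.stack(
                [torch.softmax(m(images), dim=-1) for m in models[:k]]
            ).mean(dim=0)
            correct += (probs.argmax(dim=-1) == labels).sum().item()
            total += labels.size(0)
        accuracies.append(100.0 * correct / total)
    return accuracies
```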

Train BENN on ImageNet dataset

Notice: Be sure to use the SB model, and make sure each BNN has converged well before ensembling. Due to the overfitting and optimization instability observed in Section 6.2 of the paper, you may want to train BENN multiple times and pick the best combination. You may also explore model search on BENN.
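
The paper does not prescribe a specific procedure for picking the best combination; one simple option is greedy forward selection on a held-out validation set. Below is a minimal sketch under that assumption (`eval_fn` is a hypothetical callable returning validation accuracy for a list of models):

```python
def greedy_select(models, eval_fn, max_size=10):
    """Greedily grow an ensemble, adding whichever remaining BNN
    most improves validation accuracy; stop once nothing helps."""
    selected, remaining, best_acc = [], list(models), 0.0
    while remaining and len(selected) < max_size:
        scored = [(eval_fn(selected + [m]), m) for m in remaining]
        acc, best_m = max(scored, key=lambda t: t[0])
        if acc <= best_acc:
            break
        best_acc = acc
        selected.append(best_m)
        remaining.remove(best_m)
    return selected, best_acc
```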

ResNet-18 is presented here for best performance. We are currently testing how stable the gains from larger ensembles (up to 10 BNNs) are, so please stay tuned. More uploaded models are coming soon (i.e., BENN-6 and BENN-10).

| Ensemble | Model | Train | LR | BNN (%) | BENN-3 Ensemble (%) | Best Voting | Models Directory | Logs |
|---|---|---|---|---|---|---|---|---|
| Bagging | SB | Indp | 0.001 | 48.87 | 54.34 | Soft Max Vote | models | logs |
| Boost | SB | Indp | 0.001 | 48.87 | 55.83 | Soft Max Vote | models | logs |

Retrain models

For example:

$ python2 main_bagging_imagenet.py --epochs 0 --retrain_epochs 100 --root_dir PATH/TO/YOUR/models_bagging/

Test pre-trained models

For example:

$ python2 main_bagging_imagenet.py --epochs 0 --retrain_epochs 0 --root_dir PATH/TO/YOUR/DOWNLOADED/models_bagging/

For the other arguments, please refer to the comments at the top of each script. Retrained models will perform slightly differently each time, so feel free to experiment with multiple settings.

Train BENN on your own network architecture and dataset

To train BENN for your own application, you can directly reuse the BENN training part of this code; a minimal sketch of the bagging variant is given below. More details will be provided. If you successfully train BENN on a new application with a new architecture and achieve satisfying performance, please contact the authors and we will add a link here.
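
As a starting point, the bagging variant can be sketched as follows. Everything here is a hypothetical placeholder for your own components (`make_bnn`, `train_one_bnn`, and `dataset` are not names from this repo); only the bootstrap-resampling idea follows the paper:

```python
from torch.utils.data import DataLoader, RandomSampler

def train_bagging_benn(make_bnn, train_one_bnn, dataset, num_ensembles=5):
    """Bagging-style BENN: each BNN is trained on a bootstrap resample
    (sampling with replacement) of the training set.

    make_bnn():               hypothetical factory for a fresh binarized net
    train_one_bnn(m, loader): hypothetical routine that trains one BNN
    """
    models = []
    for _ in range(num_ensembles):
        # Bootstrap sample: draw len(dataset) examples with replacement.
        sampler = RandomSampler(dataset, replacement=True,
                                num_samples=len(dataset))
        loader = DataLoader(dataset, batch_size=128, sampler=sampler)
        model = make_bnn()
        train_one_bnn(model, loader)
        models.append(model)
    return models  # combine at test time with soft max voting (see above)
```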

Acknowledgement

The single-BNN training part of this code is written largely by referencing XNOR-Net and Jiecao Yu's implementation. Please consider citing them as well if you use our code. Based on our testing, XNOR-Net is the most stable and reliable open-source BNN training scheme, with production-quality code.

Checklist

  • Release CIFAR-10 Training Code
  • Release CIFAR-10 Pretrained Models
  • Release ImageNet Training Code
  • Release ImageNet Pretrained Models
  • Release Additional ImageNet Pretrained Models