
Square Attack: a query-efficient black-box adversarial attack via random search

ECCV 2020

Maksym Andriushchenko*, Francesco Croce*, Nicolas Flammarion, Matthias Hein

EPFL, University of Tübingen

Paper: https://arxiv.org/abs/1912.00049

* denotes equal contribution

News

  • [Jul 2020] The paper is accepted at ECCV 2020! Please stop by our virtual poster for the latest insights in black-box adversarial attacks (also check out our recent preprint Sparse-RS paper where we use random search for sparse attacks).
  • [Mar 2020] Our attack is now part of AutoAttack, an ensemble of attacks used for automatic (i.e., no hyperparameter tuning needed) robustness evaluation. Table 2 of the AutoAttack paper shows that on at least 6 models our black-box attack outperforms gradient-based methods. It is always useful to have a black-box attack available to prevent inaccurate robustness claims!
  • [Mar 2020] We also achieve the best results on TRADES MNIST benchmark!
  • [Jan 2020] The Square Attack achieves the best results on MadryLab's MNIST challenge, outperforming all white-box attacks! In this case we used 50 random restarts of our attack, each with a query limit of 20000.
  • [Nov 2019] The Square Attack breaks the recently proposed defense from "Bandlimiting Neural Networks Against Adversarial Attacks" (https://github.com/robust-ml/robust-ml.github.io/issues/15).

Abstract

We propose the Square Attack, a score-based black-box L2- and Linf-adversarial attack that does not rely on local gradient information and thus is not affected by gradient masking. Square Attack is based on a randomized search scheme which selects localized square-shaped updates at random positions so that at each iteration the perturbation is situated approximately at the boundary of the feasible set. Our method is significantly more query efficient and achieves a higher success rate compared to the state-of-the-art methods, especially in the untargeted setting. In particular, on ImageNet we improve the average query efficiency in the untargeted setting for various deep networks by a factor of at least 1.8 and up to 3 compared to the recent state-of-the-art Linf-attack of Al-Dujaili & O’Reilly. Moreover, although our attack is black-box, it can also outperform gradient-based white-box attacks on the standard benchmarks achieving a new state-of-the-art in terms of the success rate.


The code of the Square Attack can be found in square_attack_linf(...) and square_attack_l2(...) in attack.py.
Below we show adversarial examples generated by our method for Linf and L2 perturbations:

About the paper

The general algorithm of the attack is extremely simple and relies on the random search algorithm: we try some update and accept it only if it helps to improve the loss:
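
For intuition, here is a minimal, hypothetical sketch of this accept/reject loop; init_delta, sample_update, and project are placeholder helpers, and the actual implementation is square_attack_linf(...) / square_attack_l2(...) in attack.py:

def random_search_attack(x, y, loss, init_delta, sample_update, project, n_iter):
    """Generic random search: propose a random update and keep it only if it improves the loss."""
    delta = init_delta(x)                    # initialization on the boundary of the feasible set
    best_loss = loss(project(x + delta), y)
    for _ in range(n_iter):
        delta_new = sample_update(x, delta)  # candidate update drawn from the sampling distribution P
        new_loss = loss(project(x + delta_new), y)
        if new_loss < best_loss:             # accept only if the loss to be minimized decreases
            delta, best_loss = delta_new, new_loss
    return project(x + delta)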

The only thing we customize is the sampling distribution P (see the paper for details). The main ideas behind the choice of the sampling distribution are that:

  • We start at the boundary of the feasible set with a good initialization that helps to improve the query efficiency (particularly for the Linf-attack).
  • At every iteration, we stay at the boundary of the feasible set by changing square-shaped regions of the image (a simplified sketch is shown below).
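
As an illustration, a simplified and hypothetical version of such a square-shaped Linf update could look as follows; the actual sampling distribution, including the schedule for the square size and the initialization, is implemented in square_attack_linf(...) in attack.py:

import numpy as np

def sample_square_update(delta, eps, p):
    """Overwrite one randomly placed square of the perturbation with +/- eps per channel."""
    c, h, w = delta.shape                        # image in CHW format
    s = max(int(round(np.sqrt(p * h * w))), 1)   # side length so that the square covers ~p of the pixels
    row = np.random.randint(0, h - s + 1)
    col = np.random.randint(0, w - s + 1)
    new_delta = delta.copy()
    # one sign per color channel, constant over the whole square -> stays on the boundary of the eps-ball
    new_delta[:, row:row + s, col:col + s] = np.random.choice([-eps, eps], size=(c, 1, 1))
    return new_delta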

In the paper, we also provide a convergence analysis of a variant of our attack in the non-convex setting and justify the main algorithmic choices, such as modifying squares and using the same sign of the update.

This simple algorithm is sufficient to significantly outperform much more complex approaches in terms of the success rate and query efficiency:

Here are the complete success rate curves with respect to the number of queries. We note that the Square Attack also outperforms the competing approaches in the low-query regime.

The Square Attack also performs very well on adversarially trained models on MNIST, achieving results competitive with or better than those of white-box attacks, despite the fact that our attack is black-box:

Interestingly, L2 perturbations of the Linf adversarially trained model are challenging for many attacks, including white-box PGD and other black-box attacks. However, the Square Attack is able to assess the robustness in this setting much more accurately:

Running the code

attack.py is the main module that implements the Square Attack; see the command-line arguments there. The main functions that implement the attack are square_attack_linf() and square_attack_l2().

In order to run the untargeted Linf Square Attack on ImageNet models from the PyTorch repository, you need to specify the correct path to the validation set (see IMAGENET_PATH in data.py) and then run:

  • python attack.py --attack=square_linf --model=pt_vgg --n_ex=1000 --eps=12.75 --p=0.05 --n_iter=10000
  • python attack.py --attack=square_linf --model=pt_resnet --n_ex=1000 --eps=12.75 --p=0.05 --n_iter=10000
  • python attack.py --attack=square_linf --model=pt_inception --n_ex=1000 --eps=12.75 --p=0.05 --n_iter=10000

Note that eps=12.75 is then divided by 255, so in the end it is equal to 0.05.

To perform targeted attacks, one should additionally use the flag --targeted, use a lower p, and specify more iterations, e.g. --n_iter=100000, since it usually takes more iterations to achieve misclassification into a particular, randomly chosen class.
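
For example, a targeted run on the ResNet-50 model could look as follows (the value of p here is only an illustrative choice; see the paper for the recommended hyperparameters):

  • python attack.py --attack=square_linf --model=pt_resnet --n_ex=1000 --eps=12.75 --p=0.01 --n_iter=100000 --targeted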

The rest of the models have to be downloaded first (see the instructions below) and can then be evaluated in the following way:

Post-averaging models:

  • python attack.py --attack=square_linf --model=pt_post_avg_cifar10 --n_ex=1000 --eps=8.0 --p=0.3 --n_iter=20000
  • python attack.py --attack=square_linf --model=pt_post_avg_imagenet --n_ex=1000 --eps=8.0 --p=0.3 --n_iter=20000

Clean logit pairing and logit squeezing models:

  • python attack.py --attack=square_linf --model=clp_mnist --n_ex=1000 --eps=0.3 --p=0.3 --n_iter=20000
  • python attack.py --attack=square_linf --model=lsq_mnist --n_ex=1000 --eps=0.3 --p=0.3 --n_iter=20000
  • python attack.py --attack=square_linf --model=clp_cifar10 --n_ex=1000 --eps=16.0 --p=0.3 --n_iter=20000
  • python attack.py --attack=square_linf --model=lsq_cifar10 --n_ex=1000 --eps=16.0 --p=0.3 --n_iter=20000

Adversarially trained model (with only 1 restart; note that the results in the paper are based on 50 restarts):

  • python attack.py --attack=square_linf --model=madry_mnist_robust --n_ex=10000 --eps=0.3 --p=0.8 --n_iter=20000

The L2 Square Attack can be run similarly, but please check the recommended hyperparameters in the paper (Section B of the supplement) and make sure that you specify the right value of eps, taking into account whether the pixels are in [0, 1] or in [0, 255] for a particular dataset and model. For example, for the standard ImageNet models, the correct L2 eps to specify is 1275, since after division by 255 it becomes 5.0.
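
As an illustration, an untargeted L2 run on a standard ImageNet model could look roughly as follows, assuming the L2 attack is selected with --attack=square_l2 (check the command-line arguments in attack.py); the value of p here is only a placeholder, so please use the values recommended in the supplement:

  • python attack.py --attack=square_l2 --model=pt_resnet --n_ex=1000 --eps=1275 --p=0.1 --n_iter=10000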

Saved statistics

In the folder metrics, we provide saved statistics of the attack on 3 models: Inception-v3, ResNet-50, and VGG-16-BN.
Here are simple examples of how to load the metrics files.

Linf attack

To print the statistics from the last iteration:

import numpy as np

p, n_ex, eps = 0.05, 1000, 12.75  # hyperparameters encoded in the file name below
metrics = np.load('metrics/2019-11-10 15:57:14 model=pt_resnet dataset=imagenet n_ex=1000 eps=12.75 p=0.05 n_iter=10000.metrics.npy')
iteration = np.argmax(metrics[:, -1])  # max time is the last available iteration
acc, acc_corr, mean_nq, mean_nq_ae, median_nq_ae, avg_loss, time_total = metrics[iteration]
print('[iter {}] acc={:.2%} acc_corr={:.2%} avg#q={:.2f} avg#q_ae={:.2f} med#q_ae={:.2f} (p={}, n_ex={}, eps={}, {:.2f}min)'.
      format(iteration + 1, acc, acc_corr, mean_nq, mean_nq_ae, median_nq_ae, p, n_ex, eps, time_total / 60))

Then one can also create different plots based on the data contained in metrics. For example, one can use 1 - acc_corr to plot the success rate of the Square Attack at different numbers of queries.
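
For instance, a minimal sketch of such a plot (assuming the column order from the snippet above and that matplotlib is available) could be:

import numpy as np
import matplotlib.pyplot as plt

metrics = np.load('metrics/2019-11-10 15:57:14 model=pt_resnet dataset=imagenet n_ex=1000 eps=12.75 p=0.05 n_iter=10000.metrics.npy')
success_rate = 1 - metrics[:, 1]  # acc_corr is the second column
plt.plot(np.arange(1, len(metrics) + 1), success_rate)
plt.xlabel('Iteration')
plt.ylabel('Success rate of the Square Attack')
plt.show()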

L2 attack

In this case we provide the number of queries necessary to achieve misclassification (n_queries[i] = 0 means that image i was initially misclassified; n_queries[i] = 10001 indicates that the attack could not find an adversarial example for image i). To load the metrics and compute the success rate of the Square Attack after k queries, you can run:

import numpy as np

k = 1000  # query budget of interest
n_queries = np.load('metrics/square_l2_resnet50_queries.npy')['n_queries']
success_rate = float(((n_queries > 0) * (n_queries <= k)).sum()) / (n_queries > 0).sum()
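
The same expression can be evaluated for every query budget to obtain the full success rate curve, e.g. (a small sketch under the same assumptions, reusing n_queries from above):

budgets = np.arange(1, 10001)  # 10001 marks a failed attack, so budgets go up to 10000 queries
curve = [float(((n_queries > 0) & (n_queries <= k)).sum()) / (n_queries > 0).sum() for k in budgets]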

Models

Note that in order to evaluate other models, one has to first download them and move them to the folders specified in model_path_dict from models.py.

For the first 4 models, one has to additionally update the paths in the checkpoint file in the following way:

model_checkpoint_path: "model.ckpt"
all_model_checkpoint_paths: "model.ckpt"

Requirements

  • PyTorch 1.0.0
  • TensorFlow 1.12.0

Contact

Do you have a problem or question regarding the code? Please don't hesitate to open an issue or contact Maksym Andriushchenko or Francesco Croce directly.

Citation

@inproceedings{ACFH2020square,
  title={Square Attack: a query-efficient black-box adversarial attack via random search},
  author={Andriushchenko, Maksym and Croce, Francesco and Flammarion, Nicolas and Hein, Matthias},
  booktitle={ECCV},
  year={2020}
}