
uclaml / RayS

Licence: MIT license
RayS: A Ray Searching Method for Hard-label Adversarial Attack (KDD2020)

Programming Languages

Python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to RayS

TIGER
Python toolbox to evaluate graph vulnerability and robustness (CIKM 2021)
Stars: ✭ 103 (+139.53%)
Mutual labels:  attack, robustness
chainer-ADDA
Adversarial Discriminative Domain Adaptation in Chainer
Stars: ✭ 24 (-44.19%)
Mutual labels:  adversarial
sgx-tutorial-space18
Tutorial: Uncovering and mitigating side-channel leakage in Intel SGX enclaves
Stars: ✭ 44 (+2.33%)
Mutual labels:  attack
robust-local-lipschitz
A Closer Look at Accuracy vs. Robustness
Stars: ✭ 75 (+74.42%)
Mutual labels:  robustness
hitbsecconf-ctf-2021
HITB SECCONF EDU CTF 2021. Developed with ❤️ by Hackerdom team and HITB.
Stars: ✭ 17 (-60.47%)
Mutual labels:  attack-defense
cycle-confusion
Code and models for ICCV2021 paper "Robust Object Detection via Instance-Level Temporal Cycle Confusion".
Stars: ✭ 67 (+55.81%)
Mutual labels:  robustness
AdversarialBinaryCoding4ReID
Codes of the paper "Adversarial Binary Coding for Efficient Person Re-identification"
Stars: ✭ 12 (-72.09%)
Mutual labels:  adversarial
perceptual-advex
Code and data for the ICLR 2021 paper "Perceptual Adversarial Robustness: Defense Against Unseen Threat Models".
Stars: ✭ 44 (+2.33%)
Mutual labels:  robustness
Pummel
Socks5 Proxy HTTP/HTTPS-Flooding (cc) attack
Stars: ✭ 53 (+23.26%)
Mutual labels:  attack
Pentest-Bookmarkz
A collection of useful links for Pentesters
Stars: ✭ 118 (+174.42%)
Mutual labels:  attack
ructfe-2019
RuCTFE 2019. Developed with ♥ by HackerDom team
Stars: ✭ 24 (-44.19%)
Mutual labels:  attack-defense
icestick-lpc-tpm-sniffer
FPGA-based LPC bus sniffing tool for Lattice iCEstick Evaluation Kit
Stars: ✭ 41 (-4.65%)
Mutual labels:  attack
robustness-vit
Contains code for the paper "Vision Transformers are Robust Learners" (AAAI 2022).
Stars: ✭ 78 (+81.4%)
Mutual labels:  robustness
WARP
Code for ACL'2021 paper WARP 🌀 Word-level Adversarial ReProgramming. Outperforming `GPT-3` on SuperGLUE Few-Shot text classification. https://aclanthology.org/2021.acl-long.381/
Stars: ✭ 66 (+53.49%)
Mutual labels:  adversarial
s-attack
[CVPR 2022] S-attack library. Official implementation of two papers "Vehicle trajectory prediction works, but not everywhere" and "Are socially-aware trajectory prediction models really socially-aware?".
Stars: ✭ 51 (+18.6%)
Mutual labels:  robustness
Generalization-Causality
Reading notes on a wide range of research on domain generalization, domain adaptation, causality, robustness, prompting, optimization, and generative models
Stars: ✭ 482 (+1020.93%)
Mutual labels:  robustness
byeintegrity5-uac
Bypass UAC at any level by abusing the Task Scheduler and environment variables
Stars: ✭ 21 (-51.16%)
Mutual labels:  attack
rc4md5cry
rc4md5cry: denial of service for rc4-md5 shadowsocks nodes (shadowboom paper is pending)
Stars: ✭ 15 (-65.12%)
Mutual labels:  attack
Attack-Defense-Platform
A framework that help to create CTF Attack with Defense competition quickly
Stars: ✭ 23 (-46.51%)
Mutual labels:  attack-defense
ddos
DDoS Attack & Protection Tools for Windows, Linux & Android
Stars: ✭ 84 (+95.35%)
Mutual labels:  attack

RayS: A Ray Searching Method for Hard-label Adversarial Attack (KDD2020)

"RayS: A Ray Searching Method for Hard-label Adversarial Attack"
Jinghui Chen, Quanquan Gu
https://arxiv.org/abs/2006.12792

This repository contains our PyTorch implementation of RayS from the paper "RayS: A Ray Searching Method for Hard-label Adversarial Attack" (accepted by KDD 2020).

What is RayS

RayS is a hard-label adversarial attack that requires only the target model's hard-label output (the predicted class).

It is gradient-free, hyperparameter-free, and independent of adversarial losses such as cross-entropy or the C&W loss.

RayS can therefore serve as a good sanity check for possibly "falsely robust" models (models that overfit to particular gradient-based attacks and adversarial losses).
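To make "hard-label" concrete: the attacker only ever observes the predicted class index, never logits, probabilities, or gradients. A minimal sketch of such a query interface (illustrative only, not the repository's wrapper):

import torch

def hard_label_query(model, x):
    # The attacker observes only the argmax class index:
    # no logits, no probabilities, no gradients, no loss values.
    with torch.no_grad():
        return model(x).argmax(dim=1)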

Average Decision Boundary Distance (ADBD)

The RayS paper also proposes a new model robustness metric, ADBD (average decision boundary distance), which measures the average distance from test examples to their closest decision boundary.
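Concretely (our notation, not a verbatim definition from the paper): if r_i denotes the smallest L_inf radius at which RayS finds a label change for the i-th test example, then over n examples

\mathrm{ADBD} = \frac{1}{n} \sum_{i=1}^{n} r_i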

Model Robustness: ADBD Leaderboard

We evaluated the robustness of recently proposed robust models trained on the CIFAR-10 dataset with maximum L_inf perturbation strength epsilon=0.031 (8/255). Robustness is evaluated on the entire CIFAR-10 test set (10000 examples).

Note:

  • Ranking is based on the ADBD (average decision boundary distance) metric under the RayS attack with the default query limit of 40000. Reducing the query limit speeds up evaluation but may lead to inaccurate ADBD values. For a quick check, we recommend evaluating on a subset of the CIFAR-10 test set (e.g., 1000 examples).
  • * denotes a model trained with extra data.
  • Robust Acc (RayS) is the robust accuracy under the RayS attack with L_inf perturbation strength epsilon=0.031 (8/255). For truly robust models, this value can be larger than the reported (white-box) value due to the hard-label limitation. For the current best robust accuracy evaluation, please refer to AutoAttack, which uses an ensemble of four white-box/black-box attacks.
  • ADBD is our proposed Average Decision Boundary Distance metric, which is independent of the perturbation strength epsilon. It reflects overall model robustness through the lens of decision boundary distance and serves as a complement to the traditional robust accuracy metric. Furthermore, ADBD depends only on hard-label output and can be used when back-propagation or even soft labels are unavailable.
Method | Natural Acc | Robust Acc (Reported) | Robust Acc (RayS) | ADBD
WAR (Wu et al., 2020)* | 85.6 | 59.8 | 63.2 | 0.0480
RST (Carmon et al., 2019)* | 89.7 | 62.5 | 64.6 | 0.0465
HYDRA (Sehwag et al., 2020)* | 89.0 | 57.2 | 62.1 | 0.0450
MART (Wang et al., 2020)* | 87.5 | 65.0 | 62.2 | 0.0439
UAT++ (Alayrac et al., 2019)* | 86.5 | 56.3 | 62.1 | 0.0426
Pretraining (Hendrycks et al., 2019)* | 87.1 | 57.4 | 60.1 | 0.0419
Robust-overfitting (Rice et al., 2020) | 85.3 | 58.0 | 58.6 | 0.0404
TRADES (Zhang et al., 2019b) | 85.4 | 56.4 | 57.3 | 0.0403
Backward Smoothing (Chen et al., 2020) | 85.3 | 54.9 | 55.1 | 0.0403
Adversarial Training (retrained) (Madry et al., 2018) | 87.4 | 50.6 | 54.0 | 0.0377
MMA (Ding et al., 2020) | 84.4 | 47.2 | 47.7 | 0.0345
Adversarial Training (original) (Madry et al., 2018) | 87.1 | 47.0 | 50.7 | 0.0344
Fast Adversarial Training (Wong et al., 2020) | 83.8 | 46.1 | 50.1 | 0.0334
Adv-Interp (Zhang & Xu, 2020) | 91.0 | 68.7 | 46.9 | 0.0305
Feature-Scatter (Zhang & Wang, 2019) | 91.3 | 60.6 | 44.5 | 0.0301
SENSE (Kim & Wang, 2020) | 91.9 | 57.2 | 43.9 | 0.0288

Please contact us if you want to add your model to the leaderboard.

How to use RayS to evaluate your model robustness:

Prerequisites:

  • Python
  • PyTorch (plus TensorFlow if you want to evaluate TensorFlow models)
  • NumPy
  • CUDA

PyTorch models

Import RayS attack by

from general_torch_model import GeneralTorchModel
from RayS import RayS

# Wrap the PyTorch model so RayS can query its hard-label predictions
torch_model = GeneralTorchModel(model, n_class=10, im_mean=None, im_std=None)

# Build the attack with the maximum L_inf perturbation strength epsilon
attack = RayS(torch_model, epsilon=args.epsilon)

where:

  • torch_model is the PyTorch model wrapped in GeneralTorchModel. For models that expect normalized inputs (outside the [0, 1] range), simply set the normalization constants, for instance im_mean=[0.5, 0.5, 0.5] and im_std=[0.5, 0.5, 0.5],
  • epsilon is the maximum adversarial perturbation strength.
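For example, wrapping a CIFAR-10 classifier that was trained on normalized inputs might look roughly as follows; the architecture, checkpoint path, and normalization statistics here are placeholders, not values shipped with this repository:

import torch
import torchvision
from general_torch_model import GeneralTorchModel
from RayS import RayS

# Any CIFAR-10 classifier works; a torchvision ResNet-18 stands in here.
model = torchvision.models.resnet18(num_classes=10)
model.load_state_dict(torch.load("your_checkpoint.pth"))  # placeholder path
model = model.cuda().eval()

# If the model was trained on normalized inputs, pass the training mean/std so
# the wrapper can map [0, 1] images into the model's expected input range.
torch_model = GeneralTorchModel(model, n_class=10,
                                im_mean=[0.4914, 0.4822, 0.4465],
                                im_std=[0.2471, 0.2435, 0.2616])

attack = RayS(torch_model, epsilon=8 / 255)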

To run the RayS attack, call

x_adv, queries, adbd, succ = attack(data, label, query_limit)

It returns:

  • x_adv: the adversarial examples found by RayS,
  • queries: the number of queries used to find each adversarial example,
  • adbd: the decision boundary distance of each example (averaged over the test set, this gives the ADBD metric),
  • succ: whether each example was successfully attacked.

Sample usage for attacking a robust model (a full evaluation loop is sketched below):

  - python3 attack_robust.py --dataset rob_cifar_trades --query 40000 --batch 1000 --epsilon 0.031

You can also pass --num 1000 to limit the number of attacked examples to 1000; the default is 10000 (the whole CIFAR-10 test set).
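Putting the pieces together, a full evaluation loop could look roughly like the sketch below. The data loading is our own assumption, and it presumes adbd and succ come back as per-example tensors; only GeneralTorchModel, RayS, and the attack call above come from this repository.

import torch
import torchvision
import torchvision.transforms as transforms

# Standard CIFAR-10 test set in [0, 1]; the wrapper handles any normalization.
testset = torchvision.datasets.CIFAR10(root="./data", train=False, download=True,
                                       transform=transforms.ToTensor())
loader = torch.utils.data.DataLoader(testset, batch_size=1000, shuffle=False)

robust, total, adbd_sum = 0, 0, 0.0
for data, label in loader:
    data, label = data.cuda(), label.cuda()
    # attack is the RayS object built above; 40000 is the default query limit.
    x_adv, queries, adbd, succ = attack(data, label, 40000)
    robust += (~succ.bool()).sum().item()   # attack failed => example still robust
    adbd_sum += adbd.sum().item()
    total += label.size(0)

print("Robust accuracy (RayS): %.4f" % (robust / total))
print("ADBD: %.4f" % (adbd_sum / total))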

TensorFlow models

To evaluate TensorFlow models with the RayS attack:

from general_tf_model import GeneralTFModel
from RayS import RayS

# Wrap the TensorFlow model so RayS can query its hard-label predictions
tf_model = GeneralTFModel(model.logits, model.x_input, sess, n_class=10, im_mean=None, im_std=None)

attack = RayS(tf_model, epsilon=args.epsilon)

where:

  • model.logits: the logits tensor returned by the TensorFlow model,
  • model.x_input: the placeholder for the model input (NHWC format),
  • sess: the TensorFlow session.
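For a TF1-style graph model, these pieces could be assembled roughly as follows; the stand-in network is a placeholder, and only GeneralTFModel and RayS come from this repository:

import tensorflow as tf
from general_tf_model import GeneralTFModel
from RayS import RayS

class Model:
    def __init__(self):
        # NHWC input in [0, 1], as the wrapper expects.
        self.x_input = tf.placeholder(tf.float32, shape=[None, 32, 32, 3])
        # Stand-in graph; replace with your own network producing class logits.
        self.logits = tf.layers.dense(tf.layers.flatten(self.x_input), 10)

model = Model()
sess = tf.Session()
sess.run(tf.global_variables_initializer())  # or restore a trained checkpoint

tf_model = GeneralTFModel(model.logits, model.x_input, sess,
                          n_class=10, im_mean=None, im_std=None)
attack = RayS(tf_model, epsilon=8 / 255)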

The remaining steps are the same as for evaluating PyTorch models.

Reproduce experiments in the paper:

  • Run attacks on a naturally trained model (Inception):
    - python3 attack_natural.py --dataset inception --epsilon 0.05
  • Run attacks on a naturally trained model (ResNet):
    - python3 attack_natural.py --dataset resnet --epsilon 0.05
  • Run attacks on a naturally trained model (CIFAR-10):
    - python3 attack_natural.py --dataset cifar --epsilon 0.031
  • Run attacks on a naturally trained model (MNIST):
    - python3 attack_natural.py --dataset mnist --epsilon 0.3

Citation

Please check our paper for technical details and full results.

@inproceedings{chen2020rays,
  title={RayS: A Ray Searching Method for Hard-label Adversarial Attack},
  author={Chen, Jinghui and Gu, Quanquan},
  booktitle={Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
  year={2020}
}

Contact

If you have any questions about the RayS attack or the ADBD leaderboard above, please contact [email protected]. Enjoy!
