
akshaychawla / Adversarial-Examples-in-PyTorch

Licence: other
PyTorch code to generate adversarial examples on MNIST and ImageNet data.

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Adversarial-Examples-in-PyTorch

ImageModels
ImageNet model implemented using the Keras Functional API
Stars: ✭ 63 (-43.75%)
Mutual labels:  imagenet
imagenet-autoencoder
Autoencoder trained on ImageNet Using Torch 7
Stars: ✭ 18 (-83.93%)
Mutual labels:  imagenet
super-gradients
Easily train or fine-tune SOTA computer vision models with one open source training library
Stars: ✭ 429 (+283.04%)
Mutual labels:  imagenet
efficientnetv2.pytorch
PyTorch implementation of EfficientNetV2 family
Stars: ✭ 366 (+226.79%)
Mutual labels:  imagenet
StudyAdversarials
Some of my experiments targeting adversarial instances
Stars: ✭ 12 (-89.29%)
Mutual labels:  adversarial-networks
SKNet-PyTorch
Nearly Perfect & Easily Understandable PyTorch Implementation of SKNet
Stars: ✭ 62 (-44.64%)
Mutual labels:  imagenet
PyTorch-LMDB
Scripts to work with LMDB + PyTorch for Imagenet training
Stars: ✭ 49 (-56.25%)
Mutual labels:  imagenet
simpleAICV-pytorch-ImageNet-COCO-training
SimpleAICV:pytorch training example on ImageNet(ILSVRC2012)/COCO2017/VOC2007+2012 datasets.Include ResNet/DarkNet/RetinaNet/FCOS/CenterNet/TTFNet/YOLOv3/YOLOv4/YOLOv5/YOLOX.
Stars: ✭ 276 (+146.43%)
Mutual labels:  imagenet
Stochastic-Quantization
Training Low-bits DNNs with Stochastic Quantization
Stars: ✭ 70 (-37.5%)
Mutual labels:  imagenet
PyTorch-Model-Compare
Compare neural networks by their feature similarity
Stars: ✭ 119 (+6.25%)
Mutual labels:  imagenet
dan
Demo code for the paper ''Distributional Adversarial Networks''
Stars: ✭ 18 (-83.93%)
Mutual labels:  adversarial-networks
SharpPeleeNet
ImageNet pre-trained SharpPeleeNet can be used in real-time Semantic Segmentation/Objects Detection
Stars: ✭ 13 (-88.39%)
Mutual labels:  imagenet
ShapeTextureDebiasedTraining
Code and models for the paper Shape-Texture Debiased Neural Network Training (ICLR 2021)
Stars: ✭ 95 (-15.18%)
Mutual labels:  imagenet
BottleneckTransformers
Bottleneck Transformers for Visual Recognition
Stars: ✭ 231 (+106.25%)
Mutual labels:  imagenet
TF-NAS
TF-NAS: Rethinking Three Search Freedoms of Latency-Constrained Differentiable Neural Architecture Search (ECCV2020)
Stars: ✭ 66 (-41.07%)
Mutual labels:  imagenet
img classification deep learning
No description or website provided.
Stars: ✭ 19 (-83.04%)
Mutual labels:  imagenet
datumaro
Dataset Management Framework, a Python library and a CLI tool to build, analyze and manage Computer Vision datasets.
Stars: ✭ 274 (+144.64%)
Mutual labels:  imagenet
alexnet-architecture.tensorflow
Unofficial TensorFlow implementation of "AlexNet" architecture.
Stars: ✭ 15 (-86.61%)
Mutual labels:  imagenet
image-classification
A collection of SOTA Image Classification Models in PyTorch
Stars: ✭ 70 (-37.5%)
Mutual labels:  imagenet
etiketai
Etiketai is an online tool designed to label images, useful for training AI models
Stars: ✭ 63 (-43.75%)
Mutual labels:  imagenet

Adversarial Examples

Adversarial examples are inputs to machine learning models that an attacker has intentionally designed to cause the model to make a mistake; they are like optical illusions for machines. To the naked eye, however, they look almost identical to the original inputs.

Adversarial examples are an important aspect of AI research because of the security concerns raised by AI's widespread use in the real world. For example, an adversarially perturbed stop sign might appear to a self-driving car as a merge sign, compromising the safety of the vehicle.

This repository is an attempt to implement two common methods of producing adversarial examples. The directory structure is as follows.

.
+-- .gitignore --> do not track
+-- README.md --> this document
+-- Method 1 - optimizing for noise --> method based on [1]
|   +-- attack.py --> class that performs the attack
|   +-- attack_mnist.py --> run attack.py on the MNIST dataset
|   +-- visualize_adv_examples.py --> visualize the results
+-- Method 2 - Fast gradient sign method --> method based on [2]
|   +-- imnet-fast-gradient.py --> FGSM on VGG16 with images from ImageNet
|   +-- mnist-fast-gradient.py --> FGSM on the MNIST dataset
|   +-- visualize_imnet.py
|   +-- visualize_mnist.py
+-- common
|   +-- train_mnist.py --> train a simple NN on MNIST and save weights to weights.pkl
|   +-- generate_5k.py --> extract 5k random MNIST samples from the dataset
|   +-- labels.json --> map ImageNet classes <--> integers 0-999

Method 1 - Optimizing for noise

In the method presented in [1] the authors find that neural networks are not stable to small perturbations in input space. Specifically, it is possible to optimize for a small perturbation that causes an image to be misclassified while remaining visually similar to the original. In the paper, the authors use an L-BFGS optimizer to solve:

    minimize ||r||_2  subject to
    1. f(x + r) = l
    2. x + r in [0, 1]
    where
    l = target class
    r = perturbation (noise)
    f = network mapping images -> labels, with f(x) = k (the correct class)

In this implementation, however, I use an SGD optimizer to find "r": the input "x" and the network weights are held fixed while the cross-entropy loss between the network output and the target label "l" is minimized. The second constraint is honored by clamping x + r to [0, 1], and I try to keep the values of "r" small by imposing no, L1, or L2 regularization.
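
As a rough illustration, here is a minimal PyTorch sketch of this variant (a hypothetical optimize_noise() helper, not the repository's attack.py; it assumes a trained MNIST classifier model and an input x already scaled to [0, 1]):

    import torch
    import torch.nn.functional as F

    def optimize_noise(model, x, target, steps=500, lr=0.01, l2_weight=0.05):
        # x: image tensor in [0, 1], e.g. shape (1, 1, 28, 28); target: desired (wrong) label l
        model.eval()                                        # inference behaviour for dropout/batch-norm
        for p in model.parameters():
            p.requires_grad_(False)                         # freeze the classifier; only r is optimized
        r = torch.zeros_like(x, requires_grad=True)         # perturbation to optimize
        optimizer = torch.optim.SGD([r], lr=lr)
        target = torch.tensor([target])
        for _ in range(steps):
            optimizer.zero_grad()
            x_adv = torch.clamp(x + r, 0.0, 1.0)            # constraint 2: keep pixels in [0, 1]
            loss = F.cross_entropy(model(x_adv), target)    # push f(x + r) towards the target label l
            loss = loss + l2_weight * r.norm(p=2)           # keep ||r||_2 small (L2 regularization)
            loss.backward()
            optimizer.step()
        return torch.clamp(x + r, 0.0, 1.0).detach(), r.detach()

For the no-regularization variant the l2_weight term is simply dropped, and for L1 regularization r.norm(p=2) is replaced by r.norm(p=1).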

The following table shows the mean, max and min perturbation values for no, L1 and L2 regularization.

    Regularization       Mean     Max       Min
    No regularization    0.0151   1.00202   -0.999
    L1 regularization    0.0155   1.00323   -1.000
    L2 regularization    0.0150   1.00285   -1.002

Method 2 - Fast gradient sign method

In [2] the authors propose a simpler way to generate adversarial examples, known as the fast gradient sign method (FGSM). It exploits the observation that deep models behave largely linearly, so many small variations across a high-dimensional input can add up to a significant change in the model's output. According to the paper, an adversarial example can be generated by:

    x_adversarial = x + eta * sign( dL/dx )
    where
    eta = scale of perturbation
    dL/dx = gradient of loss function w.r.t. input
    sign = signum function

Mean, max and min noise: 0.0373817, 0.1, -0.1
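
A minimal PyTorch sketch of this single-step update (a hypothetical fgsm() helper, not the repository's mnist-fast-gradient.py; it assumes a trained classifier model and an input x in [0, 1] with true label label):

    import torch
    import torch.nn.functional as F

    def fgsm(model, x, label, eta=0.1):
        # x: image tensor in [0, 1]; label: the true class; eta: perturbation scale
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), torch.tensor([label]))
        loss.backward()                                 # fills x.grad with dL/dx
        x_adv = x + eta * x.grad.sign()                 # x + eta * sign(dL/dx)
        return torch.clamp(x_adv, 0.0, 1.0).detach()    # keep pixels in [0, 1]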

How to run

    # Download and generate 5k MNIST samples
    cd common/
    python generate_5k.py # creates 5k_samples.pkl

    # Train a NN on MNIST
    python train_mnist.py # creates weights.pkl

    # Method 1
    cd ../Method\ 1\ -\ optimizing\ for\ noise/
    python attack_mnist.py # generates a bulk...pkl file of adversarial examples
    python visualize_adv_examples.py bulk...pkl # visualize adversarial examples on a grid

    # Method 2
    cd ../Method\ 2\ -\ Fast\ gradient\ sign\ method/
    python mnist-fast-gradient.py # runs on the 5k images and creates bulk_mnist_fgsd.pkl
    python visualize_mnist.py bulk_mnist_fgsd.pkl # visualize on a grid
    

Some observations

  1. FGSM is faster to compute than Method 1.
  2. In FGSM the noise is spread across the entire image rather than being localized, so FGSM gives noticeably 'cleaner' adversarial images.
  3. The minimum epsilon required to change the classification varies from sample to sample. Hence, the smallest perturbation that causes misclassification can be found with a binary search over epsilon (see the sketch after this list).
  4. In Method 1 the target class can be controlled; in FGSM, as used here, it cannot.
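
A minimal sketch of the epsilon search mentioned in observation 3 (it reuses the hypothetical fgsm() helper sketched above and assumes misclassification is monotone in epsilon, which only holds approximately):

    def min_epsilon(model, x, label, lo=0.0, hi=1.0, iters=20):
        # Binary search for the smallest eta at which FGSM flips the prediction.
        for _ in range(iters):
            mid = (lo + hi) / 2
            x_adv = fgsm(model, x, label, eta=mid)
            if model(x_adv).argmax(dim=1).item() != label:
                hi = mid    # misclassified: a smaller perturbation may suffice
            else:
                lo = mid    # still classified correctly: need a larger perturbation
        return hi           # approximate minimum epsilon for this sample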

References

[1] C. Szegedy et al., "Intriguing properties of neural networks," ICLR 2014.

[2] I. J. Goodfellow, J. Shlens and C. Szegedy, "Explaining and Harnessing Adversarial Examples," ICLR 2015.
