
ebagdasa / backdoors101

License: MIT License
Backdoors Framework for Deep Learning and Federated Learning. A light-weight tool to conduct your research on backdoors.

Programming Languages

python
139335 projects - #7 most used programming language
HTML
75241 projects

Projects that are alternatives of or similar to backdoors101

MSBackdoor
[Discontinued] Transform your payload into fake powerpoint (.ppt)
Stars: ✭ 35 (-80.66%)
Mutual labels:  backdoors, backdoor-attacks
bftkv
A distributed key-value storage that's tolerant to Byzantine fault.
Stars: ✭ 27 (-85.08%)
Mutual labels:  research
chainer-ADDA
Adversarial Discriminative Domain Adaptation in Chainer
Stars: ✭ 24 (-86.74%)
Mutual labels:  adversarial
mnist1d
A 1D analogue of the MNIST dataset for measuring spatial biases and answering "science of deep learning" questions.
Stars: ✭ 72 (-60.22%)
Mutual labels:  research
PySDM
Pythonic particle-based (super-droplet) warm-rain/aqueous-chemistry cloud microphysics package with box, parcel & 1D/2D prescribed-flow examples in Python, Julia and Matlab
Stars: ✭ 26 (-85.64%)
Mutual labels:  research
RayS
RayS: A Ray Searching Method for Hard-label Adversarial Attack (KDD2020)
Stars: ✭ 43 (-76.24%)
Mutual labels:  adversarial
failure-modes
Collection of how and why our software systems fail
Stars: ✭ 18 (-90.06%)
Mutual labels:  research
Awesome-Multi-Task-Learning
A list of multi-task learning papers and projects.
Stars: ✭ 230 (+27.07%)
Mutual labels:  research
saffrontree
SaffronTree: Reference free rapid phylogenetic tree construction from raw read data
Stars: ✭ 17 (-90.61%)
Mutual labels:  research
tulip
Scaleable input gradient regularization
Stars: ✭ 19 (-89.5%)
Mutual labels:  adversarial-machine-learning
x86-Assembly-Reverse-Engineering
🛠 Knowledge about the topic of x86 assembly & disassembly 🛠
Stars: ✭ 27 (-85.08%)
Mutual labels:  research
Research
Non-technical Blockchain Research Topics
Stars: ✭ 22 (-87.85%)
Mutual labels:  research
Main
Management materials and content
Stars: ✭ 32 (-82.32%)
Mutual labels:  research
ethereum-privacy
Profiling and Deanonymizing Ethereum Users
Stars: ✭ 37 (-79.56%)
Mutual labels:  research
Mava
A library of multi-agent reinforcement learning components and systems
Stars: ✭ 355 (+96.13%)
Mutual labels:  research
research
ethereum, leveldb
Stars: ✭ 25 (-86.19%)
Mutual labels:  research
MOON
Model-Contrastive Federated Learning (CVPR 2021)
Stars: ✭ 93 (-48.62%)
Mutual labels:  federated-learning
awesome-machine-learning-reliability
A curated list of awesome resources regarding machine learning reliability.
Stars: ✭ 31 (-82.87%)
Mutual labels:  adversarial-machine-learning
KD3A
Here is the official implementation of the model KD3A in paper "KD3A: Unsupervised Multi-Source Decentralized Domain Adaptation via Knowledge Distillation".
Stars: ✭ 63 (-65.19%)
Mutual labels:  federated-learning
CAM
macOS camera recording using ffmpeg
Stars: ✭ 43 (-76.24%)
Mutual labels:  research

Backdoors 101


Backdoors 101 is a PyTorch framework for state-of-the-art backdoor defenses and attacks on deep learning models. It includes real-world datasets, centralized and federated learning, and supports various attack vectors. The code is based mostly on the papers "Blind Backdoors in Deep Learning Models" (USENIX Security '21) and "How To Backdoor Federated Learning" (AISTATS '20), but we are always looking to incorporate newer results.

If you have a new defense or attack, let us know (raise an issue or send an email); we are happy to help port it. If you are doing research on backdoors and would like some assistance, don't hesitate to ask questions.


Current status

We try to incorporate new attacks and defenses as well as extend the supported datasets and tasks. Here is a high-level overview of the possible attack vectors:

[Figure: high-level overview of attack vectors]

Backdoors

  • Pixel-pattern (incl. single-pixel) - traditional pixel-modification attacks.
  • Physical - attacks triggered by physical objects.
  • Semantic backdoors - attacks that don't modify the input (e.g. they react to features already present in the scene).

TODO: clean-label attacks (a good place to contribute!).

Injection methods

  • Data poisoning - adds backdoor samples to the training dataset.
  • Batch poisoning - injects backdoor samples directly into the batch during training (see the sketch after this list).
  • Loss poisoning - modifies the loss value during training (supports dynamic loss balancing, see Sec. 3.4).

TODO: model poisoning (good place to contribute!).
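
To make batch poisoning concrete, here is a minimal sketch, not the framework's actual code: it swaps a fraction of every training batch for backdoored copies. The trigger function, target label, and fraction below are illustrative assumptions.

import torch

def add_pixel_pattern(images: torch.Tensor) -> torch.Tensor:
    """Illustrative trigger: stamp a bright 3x3 square in the top-left corner (assumes NCHW images)."""
    poisoned = images.clone()
    poisoned[:, :, :3, :3] = images.max()
    return poisoned

def poison_batch(inputs: torch.Tensor, labels: torch.Tensor,
                 target_label: int = 8, poison_fraction: float = 0.2):
    """Replace the first `poison_fraction` of the batch with backdoored samples."""
    n_poison = max(1, int(poison_fraction * inputs.shape[0]))
    inputs, labels = inputs.clone(), labels.clone()
    inputs[:n_poison] = add_pixel_pattern(inputs[:n_poison])
    labels[:n_poison] = target_label  # attacker-chosen backdoor label
    return inputs, labels

# Usage inside an ordinary training loop (sketch):
# for inputs, labels in train_loader:
#     inputs, labels = poison_batch(inputs, labels)
#     loss = criterion(model(inputs), labels)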

Datasets

  • Image Classification - ImageNet, CIFAR-10, Pipa face identification, MultiMNIST, MNIST.
  • Text - IMDB reviews dataset; Reddit (coming).

TODO: Face recognition, e.g. CelebA or VGG. We already have some code but need expertise in producing good models (a good place to contribute!).

Defenses

  • Input perturbation - NeuralCleanse + added evasion.
  • Model anomalies - SentiNet + added evasion.
  • Spectral clustering / fine-pruning + added evasion.

TODO: Port Jupyter notebooks demonstrating defenses and evasions. Add new defenses and evasions (good place to contribute!).

Training regimes

  • Centralized training.
  • Differentially private / gradient shaping training (see the sketch below).
  • Federated Learning (CIFAR-10 only).
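
At its core, the differentially private / gradient shaping regime is ordinary training with per-step gradient clipping plus Gaussian noise. Below is a generic, simplified sketch (batch-level clipping, illustrative hyperparameters), not the framework's exact implementation.

import torch

def gradient_shaping_step(model, loss, optimizer, clip_norm=1.0, noise_std=0.01):
    """One optimizer step with a clipped gradient norm and added Gaussian noise."""
    optimizer.zero_grad()
    loss.backward()
    # Bound the influence of any single step by clipping the global gradient norm.
    torch.nn.utils.clip_grad_norm_(model.parameters(), clip_norm)
    # Add Gaussian noise to every gradient before updating the weights.
    for param in model.parameters():
        if param.grad is not None:
            param.grad.add_(torch.randn_like(param.grad) * noise_std)
    optimizer.step()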

Basics

First, we want to give some background on backdoor attacks. Note that our definition is inclusive of many definitions stated before and covers the newer attacks as well (e.g. clean-label, feature-mix, semantic).

  1. Deep Learning. We focus on a supervised learning setting where the goal is to learn some task m: X -> Y (we call it the main task) on some domain of inputs X and labels Y. A model θ for task m is trained on tuples (x, y) ∈ (X, Y) using some loss criterion L (e.g. cross-entropy): L(θ(x), y).

  2. Backdoor definition. A backdoor introduces malicious behavior m* in addition to the main behavior m the model is trained for. Therefore, we state that a backdoor attack is essentially a multi-task setting with two or more tasks: the main task m, the backdoor task m*, and, if needed, evasion tasks m_ev. A model trained for both tasks will exhibit both normal and backdoor behavior.

  3. Backdoor data. In order to introduce a backdoor task m*: X* -> Y*, the model has to be trained on a different domain of backdoor inputs and labels: (X*, Y*). Intuitively, the backdoor domain X* consists of inputs that contain backdoor features. The main domain X might also include backdoor inputs, i.e. when backdoors are naturally occurring features. However, the backdoor domain X* should not dominate the main task domain X (i.e. X \ X* should not be close to empty), otherwise the two tasks would collide.

  4. Backdoor feature. Initially, a backdoor trigger was defined as a pixel pattern, which clearly separates the backdoor domain X* from the main domain X. However, recent works on semantic, edge-case, and physical backdoors allow the backdoor feature to be part of the unmodified input (e.g. a particular model of car or airplane that will be misclassified as a bird).

    We propose to use synthesizers that transform non-backdoored inputs so that they contain backdoor features and that create the corresponding backdoor labels. For example, in image backdoors the input synthesizer can simply insert a pixel pattern on top of an image, perform more complex transformations, or substitute the image with a backdoored one (edge-case backdoors); see the synthesizer sketch below.

  5. Complex backdoors. The domain of backdoor labels Y* can contain many labels. This is different from most other backdoor attacks, where the presence of a backdoor feature always results in a single specific label. Our setting allows a new, richer set of attacks; for example, a model trained to count people in an image might contain a backdoor task that identifies particular individuals.

[Figure: complex backdoor example]
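
As referenced in item 4, here is a minimal sketch of a pixel-pattern synthesizer. The class name and interface are illustrative, not the framework's exact API.

import torch

class PatternSynthesizer:
    """Toy synthesizer: stamps a fixed pixel pattern and swaps in the backdoor label."""

    def __init__(self, backdoor_label: int = 8, pattern_size: int = 3):
        self.backdoor_label = backdoor_label
        self.pattern_size = pattern_size

    def synthesize_inputs(self, images: torch.Tensor) -> torch.Tensor:
        # Assumes NCHW images; paint a bright square in the top-left corner.
        backdoored = images.clone()
        backdoored[:, :, : self.pattern_size, : self.pattern_size] = images.max()
        return backdoored

    def synthesize_labels(self, labels: torch.Tensor) -> torch.Tensor:
        # Every backdoored input is relabeled with the attacker-chosen label.
        return torch.full_like(labels, self.backdoor_label)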

  6. Supporting multiple backdoors. Our definition enables multiple backdoor tasks. As a toy example, we can attack a model that recognizes a two-digit number and inject two new backdoor tasks: one that sums the digits and another that multiplies them.

[Figure: multiple backdoors example]

  7. Methods to inject the backdoor task. Depending on the chosen threat model, the attack can inject backdoors by poisoning the training dataset, directly mixing backdoor inputs into a training batch, altering the loss function, or modifying model weights. Our framework supports all these methods but primarily focuses on injecting backdoors by adding a special loss value. We also use the Multiple Gradient Descent Algorithm (MGDA) to efficiently balance multiple losses, as in the sketch below.
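
A minimal sketch of loss-based injection, assuming a synthesizer like the one above: compute the main-task and backdoor-task losses on the same batch and blend them. The fixed coefficient below is a stand-in for MGDA, which would choose the scales adaptively; function and argument names are illustrative.

import torch.nn.functional as F

def blind_loss(model, inputs, labels, synthesizer, alpha=0.5):
    """Weighted sum of the main-task loss and the backdoor-task loss."""
    # Main task: ordinary cross-entropy on the clean batch.
    loss_main = F.cross_entropy(model(inputs), labels)

    # Backdoor task: the same batch transformed by the synthesizer.
    bd_inputs = synthesizer.synthesize_inputs(inputs)
    bd_labels = synthesizer.synthesize_labels(labels)
    loss_backdoor = F.cross_entropy(model(bd_inputs), bd_labels)

    # Fixed blending coefficient; MGDA would instead pick the scales
    # dynamically so that neither task degrades.
    return alpha * loss_main + (1 - alpha) * loss_backdoor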

Installation

Now, let's configure the system:

  • Install all dependencies: pip install -r requirements.txt.
  • Create two directories: runs for TensorBoard graphs and saved_models to store results.
  • Start TensorBoard: tensorboard --logdir=runs/.

Next, let's run a basic attack on the MNIST dataset. We use YAML files to configure the attacks. For the MNIST attack, please refer to the configs/mnist_params.yaml file. For the full set of available parameters, see the Parameters dataclass. Let's start the training:

python training.py --name mnist --params configs/mnist_params.yaml --commit none

The name argument sets the TensorBoard run name, and commit simply records the commit id in a log file for reproducibility.
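
For a feel of how such a YAML file is consumed, here is a hedged sketch that parses a config into a dataclass. The field names below are made up for illustration; the real options live in the Parameters dataclass and configs/mnist_params.yaml.

from dataclasses import dataclass
import yaml

@dataclass
class ToyParameters:
    # Illustrative fields only; see the repo's Parameters dataclass for the real ones.
    task: str = "MNIST"
    epochs: int = 10
    lr: float = 0.01
    backdoor_label: int = 8

def load_params(path: str) -> ToyParameters:
    with open(path) as f:
        raw = yaml.safe_load(f)      # parse the YAML config into a dict
    return ToyParameters(**raw)      # keys not defined above would raise a TypeError

# Hypothetical usage (the real config uses a different set of keys):
# params = load_params("configs/mnist_params.yaml")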

Repeating Experiments

For ImageNet experiments you can use configs/imagenet_params.yaml:

python training.py --name imagenet --params configs/imagenet_params.yaml --commit none

For NLP experiments we also created a repo with backdoored transformers.

This is the commit.

To run the NLP experiment, just run this script.

Structure

Our framework includes a training file, training.py, that relies heavily on a Helper object storing all the objects needed for training. The Helper contains the main Task, which stores models, datasets, optimizers, and other training parameters. Another object, Attack, contains the synthesizers and performs the loss computation for multiple tasks.
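
To visualize how these pieces fit together, here is a schematic sketch; the class and method signatures below are simplified stand-ins, not the repo's exact API.

class Task:
    """Holds the model, data loader, optimizer and loss criterion for the main task."""
    def __init__(self, model, train_loader, optimizer, criterion):
        self.model = model
        self.train_loader = train_loader
        self.optimizer = optimizer
        self.criterion = criterion

class Attack:
    """Holds a synthesizer and turns a clean batch into a combined multi-task loss."""
    def __init__(self, synthesizer, backdoor_weight=0.5):
        self.synthesizer = synthesizer
        self.backdoor_weight = backdoor_weight

    def compute_loss(self, task, inputs, labels):
        # Main loss on the clean batch plus a weighted backdoor loss,
        # in the spirit of the blind_loss sketch shown earlier.
        main = task.criterion(task.model(inputs), labels)
        bd = task.criterion(task.model(self.synthesizer.synthesize_inputs(inputs)),
                            self.synthesizer.synthesize_labels(labels))
        return (1 - self.backdoor_weight) * main + self.backdoor_weight * bd

class Helper:
    """Bundles the task and the attack; the training loop only talks to the Helper."""
    def __init__(self, task, attack):
        self.task = task
        self.attack = attack

def train_epoch(helper):
    for inputs, labels in helper.task.train_loader:
        helper.task.optimizer.zero_grad()
        loss = helper.attack.compute_loss(helper.task, inputs, labels)
        loss.backward()
        helper.task.optimizer.step()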

Citation

@inproceedings{bagdasaryan2020blind,
 author = {Eugene Bagdasaryan and Vitaly Shmatikov},
 title = {Blind Backdoors in Deep Learning Models},
 booktitle = {30th {USENIX} Security Symposium ({USENIX} Security 21)},
 year = {2021},
 isbn = {978-1-939133-24-3},
 pages = {1505--1521},
 url = {https://www.usenix.org/conference/usenixsecurity21/presentation/bagdasaryan},
 publisher = {{USENIX} Association},
 month = aug,
}