
RishabhMaheshwary / hard-label-attack

Licence: other
Natural Language Attacks in a Hard Label Black Box Setting.

Programming Languages

  • Python
  • Shell

Projects that are alternatives to or similar to hard-label-attack

consistency
Implementation of models in our EMNLP 2019 paper: A Logic-Driven Framework for Consistency of Neural Models
Stars: ✭ 26 (+0%)
Mutual labels:  bert, nli
T3
[EMNLP 2020] "T3: Tree-Autoencoder Constrained Adversarial Text Generation for Targeted Attack" by Boxin Wang, Hengzhi Pei, Boyuan Pan, Qian Chen, Shuohang Wang, Bo Li
Stars: ✭ 25 (-3.85%)
Mutual labels:  bert, adversarial-attacks
bert nli
A Natural Language Inference (NLI) model based on Transformers (BERT and ALBERT)
Stars: ✭ 97 (+273.08%)
Mutual labels:  bert, nli
tensorflow-ml-nlp-tf2
Hands-on materials for "Natural Language Processing with TensorFlow 2 and Machine Learning: from Logistic Regression to BERT and GPT-3".
Stars: ✭ 245 (+842.31%)
Mutual labels:  bert, nli
Filipino-Text-Benchmarks
Open-source benchmark datasets and pretrained transformer models in the Filipino language.
Stars: ✭ 22 (-15.38%)
Mutual labels:  bert, nli
KitanaQA
KitanaQA: Adversarial training and data augmentation for neural question-answering models
Stars: ✭ 58 (+123.08%)
Mutual labels:  bert, adversarial-attacks
CAIL
A model for the reading comprehension task of the CAIL 2019 (Challenge of AI in Law) competition.
Stars: ✭ 34 (+30.77%)
Mutual labels:  bert
SA-BERT
CIKM 2020: Speaker-Aware BERT for Multi-Turn Response Selection in Retrieval-Based Chatbots
Stars: ✭ 71 (+173.08%)
Mutual labels:  bert
backprop
Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
Stars: ✭ 229 (+780.77%)
Mutual labels:  bert
Sohu2019
Entry for the 2019 Sohu Campus Algorithm Competition.
Stars: ✭ 26 (+0%)
Mutual labels:  bert
Cross-Lingual-MRC
Cross-Lingual Machine Reading Comprehension (EMNLP 2019)
Stars: ✭ 66 (+153.85%)
Mutual labels:  bert
PromptPapers
Must-read papers on prompt-based tuning for pre-trained language models.
Stars: ✭ 2,317 (+8811.54%)
Mutual labels:  bert
Romanian-Transformers
This repo is the home of Romanian Transformers.
Stars: ✭ 60 (+130.77%)
Mutual labels:  bert
BertSimilarity
Computing the similarity of two sentences with Google's BERT algorithm: semantic and text similarity computation.
Stars: ✭ 348 (+1238.46%)
Mutual labels:  bert
DE-LIMIT
DeEpLearning models for MultIlingual haTespeech (DELIMIT): Benchmarking multilingual models across 9 languages and 16 datasets.
Stars: ✭ 90 (+246.15%)
Mutual labels:  bert
FasterTransformer
Transformer related optimization, including BERT, GPT
Stars: ✭ 1,571 (+5942.31%)
Mutual labels:  bert
wechsel
Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.
Stars: ✭ 39 (+50%)
Mutual labels:  bert
ganbert
Enhancing the BERT training with Semi-supervised Generative Adversarial Networks
Stars: ✭ 205 (+688.46%)
Mutual labels:  bert
JointIDSF
BERT-based joint intent detection and slot filling with intent-slot attention mechanism (INTERSPEECH 2021)
Stars: ✭ 55 (+111.54%)
Mutual labels:  bert
gender-unbiased BERT-based pronoun resolution
Source code for the ACL workshop paper and Kaggle competition by Google AI team
Stars: ✭ 42 (+61.54%)
Mutual labels:  bert

Generating Natural Language Attacks in a Hard Label Black Box Setting

This repository contains source code for the research work described in our AAAI 2021 paper:

Generating Natural Language Attacks in a Hard Label Black Box Setting
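
In the hard label setting the adversary can query the target model as a black box but observes only its final decision, the top predicted label, with no probabilities, logits, or gradients. A minimal sketch of that query interface (illustrative names, not code from this repository):

import numpy as np

def hard_label_query(model, text):
    """Return only the predicted class label for `text`.

    Hypothetical helper: in the hard label black-box setting everything
    except the argmax label stays hidden, so the attack must search for
    adversarial examples using this label feedback alone.
    """
    probs = model.predict(text)   # e.g. the target's softmax scores
    return int(np.argmax(probs))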

The hard label attack has also been implemented in the TextAttack library.

Follow these steps to run the attack from the library:

  1. Fork and clone the TextAttack repository.

  2. Run the following commands to install it:

    $ cd TextAttack
    $ pip install -e ".[dev]"
    
  3. Run the following command to attack bert-base-uncased trained on the Movie Review (MR) dataset:

    $ textattack attack --recipe hard-label-attack --model bert-base-uncased-mr --num-examples 100
    

Take a look at the models directory in TextAttack to run the attack against any dataset and any target model; a Python-API sketch follows.
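
The same attack can also be launched from TextAttack's Python API instead of the CLI. The sketch below follows TextAttack's documented wrapper/dataset/Attacker pattern; the recipe class name is an assumption, so check textattack.attack_recipes for the exact name of the hard-label recipe.

import transformers
from textattack import Attacker, AttackArgs
from textattack.attack_recipes import HardLabelMaheshwary2021  # assumed class name
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import HuggingFaceModelWrapper

# bert-base-uncased fine-tuned on Movie Review (the CLI's bert-base-uncased-mr).
name = "textattack/bert-base-uncased-rotten-tomatoes"
model = transformers.AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = transformers.AutoTokenizer.from_pretrained(name)
model_wrapper = HuggingFaceModelWrapper(model, tokenizer)

# Attack 100 test examples, as in the CLI command above.
dataset = HuggingFaceDataset("rotten_tomatoes", split="test")
attack = HardLabelMaheshwary2021.build(model_wrapper)
attacker = Attacker(attack, dataset, AttackArgs(num_examples=100))
attacker.attack_dataset()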

Instructions for running the attack from this repository.

Requirements

  • Python >= 3.6
  • PyTorch >= 0.4
  • TensorFlow >= 1.0 (version 2.1.0 is used for the Universal Sentence Encoder)
  • TensorFlow Hub
  • NumPy

Download Dependencies

  • Download the pretrained target models (BERT, LSTM, CNN) for each dataset and unzip them.

  • Download the counter-fitted word vectors from here and place the file in the main directory.

  • Download the top-50 synonym file from here and place it in the main directory (a sketch of how such a file is built follows this list).

  • Download the 200-dimensional GloVe vectors from here and unzip them.
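
The top-50 synonym file (mat.txt in the examples below) stores, for every word in the counter-fitted vocabulary, its nearest neighbours by cosine similarity. If you want to rebuild such a file, a rough sketch follows; the file name and GloVe-style on-disk format are assumptions, and the downloaded file should match whatever classification_attack.py expects.

import numpy as np

# Load the counter-fitted word vectors: one "word v1 v2 ... v300" entry per line.
words, vecs = [], []
with open("counter-fitted-vectors.txt") as f:
    for line in f:
        parts = line.rstrip().split()
        words.append(parts[0])
        vecs.append(np.asarray(parts[1:], dtype=np.float32))

emb = np.stack(vecs)
emb /= np.linalg.norm(emb, axis=1, keepdims=True)  # unit-normalize the rows

# Cosine similarity between all word pairs; the 50 nearest neighbours of each
# word form its synonym candidate set. For the full ~65k-word vocabulary this
# matrix needs tens of GB, so compute it in row chunks if memory is tight.
sim = emb @ emb.T
top50 = np.argsort(-sim, axis=1)[:, 1:51]  # drop column 0 (the word itself)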

How to Run:

Use the following command to get the results.

For the BERT model:

python3 classification_attack.py \
        --dataset_path path_to_data_samples_to_attack  \
        --target_model type_of_target_model (bert,wordCNN,wordLSTM) \
        --counter_fitting_cos_sim_path path_to_top_50_synonym_file \
        --target_dataset dataset_to_attack (imdb,ag,yelp,yahoo,mr) \
        --target_model_path path_to_pretrained_target_model \
        --USE_cache_path " " \
        --max_seq_length 256 \
        --sim_score_window 40 \
        --nclasses classes_in_the_dataset_to_attack

Example of attacking BERT on the IMDB dataset:


python3 classification_attack.py \
        --dataset_path data/imdb  \
        --target_model bert \
        --counter_fitting_cos_sim_path mat.txt \
        --target_dataset imdb \
        --target_model_path bert/imdb \
        --USE_cache_path " " \
        --max_seq_length 256 \
        --sim_score_window 40 \
        --nclasses 2

Example of attacking BERT on the SNLI dataset:


python3 nli_attack.py \
        --dataset_path data/snli  \
        --target_model bert \
        --counter_fitting_cos_sim_path mat.txt \
        --target_dataset snli \
        --target_model_path bert/snli \
        --USE_cache_path "nli_cache" \
        --sim_score_window 40

Results

The results will be available in the results_hard_label directory for the classification task and in the results_nli_hard_label directory for the entailment task. To attack other target models, see the commands folder.

Training target models

To train BERT on a particular dataset, use the commands provided in the BERT directory. To train the LSTM and CNN models, run python train_classifier.py --<model_name> --<dataset>.

If you find our repository helpful, please consider citing our work:

@article{maheshwary2020generating,
  title={Generating Natural Language Attacks in a Hard Label Black Box Setting},
  author={Maheshwary, Rishabh and Maheshwary, Saket and Pudi, Vikram},
  journal={arXiv preprint arXiv:2012.14956},
  year={2020}
}