
jind11 / Textfooler

A Model for Natural Language Attack on Text Classification and Inference


Projects that are alternatives to or similar to Textfooler

Neuronblocks
NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego
Stars: ✭ 1,356 (+355.03%)
Mutual labels:  natural-language-processing, text-classification
Monkeylearn Python
Official Python client for the MonkeyLearn API. Build and consume machine learning models for language processing from your Python apps.
Stars: ✭ 143 (-52.01%)
Mutual labels:  natural-language-processing, text-classification
Texting
[ACL 2020] Tensorflow implementation for "Every Document Owns Its Structure: Inductive Text Classification via Graph Neural Networks"
Stars: ✭ 103 (-65.44%)
Mutual labels:  natural-language-processing, text-classification
Nlp Tutorial
A list of NLP(Natural Language Processing) tutorials
Stars: ✭ 1,188 (+298.66%)
Mutual labels:  natural-language-processing, text-classification
Pyss3
A Python package implementing a new machine learning model for text classification with visualization tools for Explainable AI
Stars: ✭ 191 (-35.91%)
Mutual labels:  natural-language-processing, text-classification
Monkeylearn Ruby
Official Ruby client for the MonkeyLearn API. Build and consume machine learning models for language processing from your Ruby apps.
Stars: ✭ 76 (-74.5%)
Mutual labels:  natural-language-processing, text-classification
Nlp Pretrained Model
A collection of Natural language processing pre-trained models.
Stars: ✭ 122 (-59.06%)
Mutual labels:  natural-language-processing, text-classification
Ml Classify Text Js
Machine learning based text classification in JavaScript using n-grams and cosine similarity
Stars: ✭ 38 (-87.25%)
Mutual labels:  natural-language-processing, text-classification
Fastnlp
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
Stars: ✭ 2,441 (+719.13%)
Mutual labels:  natural-language-processing, text-classification
Spark Nlp
State of the Art Natural Language Processing
Stars: ✭ 2,518 (+744.97%)
Mutual labels:  natural-language-processing, text-classification
Text Analytics With Python
Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, "Text Analytics with Python" published by Apress/Springer.
Stars: ✭ 1,132 (+279.87%)
Mutual labels:  natural-language-processing, text-classification
Catalyst
Accelerated deep learning R&D
Stars: ✭ 2,804 (+840.94%)
Mutual labels:  natural-language-processing, text-classification
Textblob Ar
Arabic support for textblob
Stars: ✭ 60 (-79.87%)
Mutual labels:  natural-language-processing, text-classification
Bible text gcn
Pytorch implementation of "Graph Convolutional Networks for Text Classification"
Stars: ✭ 90 (-69.8%)
Mutual labels:  natural-language-processing, text-classification
Scdv
Text classification with Sparse Composite Document Vectors.
Stars: ✭ 54 (-81.88%)
Mutual labels:  natural-language-processing, text-classification
Kadot
Kadot, the unsupervised natural language processing library.
Stars: ✭ 108 (-63.76%)
Mutual labels:  natural-language-processing, text-classification
Nlp In Practice
Starter code to solve real world text data problems. Includes: Gensim Word2Vec, phrase embeddings, Text Classification with Logistic Regression, word count with pyspark, simple text preprocessing, pre-trained embeddings and more.
Stars: ✭ 790 (+165.1%)
Mutual labels:  natural-language-processing, text-classification
Easy Deep Learning With Allennlp
🔮Deep Learning for text made easy with AllenNLP
Stars: ✭ 32 (-89.26%)
Mutual labels:  natural-language-processing, text-classification
Textvec
Text vectorization tool to outperform TFIDF for classification tasks
Stars: ✭ 167 (-43.96%)
Mutual labels:  natural-language-processing, text-classification
Bert4doc Classification
Code and source for the paper "How to Fine-Tune BERT for Text Classification?"
Stars: ✭ 220 (-26.17%)
Mutual labels:  natural-language-processing, text-classification

TextFooler

A Model for Natural Language Attack on Text Classification and Inference

This is the source code for the paper: Jin, Di, et al. "Is BERT Really Robust? Natural Language Attack on Text Classification and Entailment." arXiv preprint arXiv:1907.11932 (2019). If you use the code, please cite the paper:

@article{jin2019bert,
  title={Is BERT Really Robust? Natural Language Attack on Text Classification and Entailment},
  author={Jin, Di and Jin, Zhijing and Zhou, Joey Tianyi and Szolovits, Peter},
  journal={arXiv preprint arXiv:1907.11932},
  year={2019}
}

Data

Our 7 datasets are here.

Prerequisites:

Required packages are listed in the requirements.txt file:

pip install -r requirements.txt

How to use

  • Run the following commands to install the esim package:
cd ESIM
python setup.py install
cd ..
  • Pre-compute the cosine similarity scores between word pairs based on the counter-fitting word embeddings (a minimal sketch of this computation appears after the commands below):
python comp_cos_sim_mat.py [PATH_TO_COUNTER_FITTING_WORD_EMBEDDINGS]
  • Run the following command to generate the adversaries for text classification:
python attack_classification.py

For natural language inference:

python attack_nli.py
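The pre-computation step above amounts to row-normalizing the counter-fitting embedding matrix and taking its Gram matrix, so that entry (i, j) holds the cosine similarity between words i and j. The sketch below illustrates that idea only; the file names and saved output are assumptions, not the actual interface of comp_cos_sim_mat.py:

# Illustrative sketch only: pre-compute pairwise cosine similarities
# from counter-fitting word embeddings. All file names are placeholders.
import numpy as np

vectors = []
with open('counter-fitted-vectors.txt') as f:  # assumed embeddings file
    for line in f:
        parts = line.rstrip().split(' ')
        vectors.append([float(x) for x in parts[1:]])  # parts[0] is the word

emb = np.asarray(vectors, dtype=np.float32)
emb /= np.linalg.norm(emb, axis=1, keepdims=True)  # make rows unit length
cos_sim = emb @ emb.T  # cos_sim[i, j] = cosine similarity of words i and j
np.save('cos_sim_counter_fitting.npy', cos_sim)  # assumed output file name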

Example run commands for these two files are in run_attack_classification.py and run_attack_nli.py. Here we explain each required argument in detail:

  • --dataset_path: The path to the dataset. We put the 1000 examples for each dataset we used in the paper in the folder data.
  • --target_model: Name of the target model, such as bert.
  • --target_model_path: The path to the trained parameters of the target model. For ease of replication, we have shared the trained BERT model parameters, the trained LSTM model parameters, and the trained CNN model parameters for each dataset we used.
  • --counter_fitting_embeddings_path: The path to the counter-fitting word embeddings.
  • --counter_fitting_cos_sim_path: Optional. If given, the pre-computed cosine similarity scores based on the counter-fitting word embeddings are loaded to save time; if not, they are computed on the fly.
  • --USE_cache_path: The path to save the USE model file (Downloading is automatic if this path is empty).
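For illustration, a full classification attack command might look like the following; every path and file name here is a placeholder assumption, and run_attack_classification.py contains the authors' actual example (attack_nli.py is invoked in the same style):

python attack_classification.py \
    --dataset_path data/yelp \
    --target_model bert \
    --target_model_path models/bert/yelp \
    --counter_fitting_embeddings_path counter-fitted-vectors.txt \
    --counter_fitting_cos_sim_path cos_sim_counter_fitting.npy \
    --USE_cache_path ./tf_cache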

Two more things to share with you:

  1. In case someone wants to replicate our experiments for training the target models, we have shared the seven processed datasets we used!

  2. In case someone wants to use our generated adversarial examples on the benchmark data directly, here they are.
