xiaosen-wang / Adversarial-Examples-Paper

Licence: other
Paper list of Adversarial Examples

The Papers of Adversarial Examples

A Complete List of All (arXiv) Adversarial Example Papers

Adversarial Examples in Computer Vision

Newest paper

Adversarial Attack

[1] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow and Rob Fergus. Intriguing properties of neural networks. ICLR 2014.

Gradient-Based Attack

[1] Ian J. Goodfellow, Jonathon Shlens and Christian Szegedy. Explaining and Harnessing Adversarial Examples. ICLR 2015.

[2] Cihang Xie, Zhishuai Zhang, Yuyin Zhou, Song Bai, Jianyu Wang, Zhou Ren, and Alan L Yuille. Improving transferability of adversarial examples with input diversity. CVPR 2019.

[3] Lei Wu, Zhanxing Zhu, Cheng Tai and Weinan E. Enhancing the Transferability of Adversarial Examples with Noise Reduced Gradient. ICLR 2018 rejected.

[4] Yinpeng Dong, Fangzhou Liao, Tianyu Pang, Hang Su, Jun Zhu, Xiaolin Hu and Jianguo Li. Boosting Adversarial Attacks with Momentum. CVPR 2018.

[5] Lianli Gao, Qilong Zhang, Jingkuan Song, Xianglong Liu and Heng Tao Shen. Patch-wise Attack for Fooling Deep Neural Network. ECCV 2020.
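
All five entries build on the same primitive: compute the gradient of the classifier's loss with respect to the input, then step along its sign. Below is a minimal sketch of the single-step FGSM from [1], assuming a PyTorch classifier with pixel values in [0, 1]; the momentum [4], input-diversity [2], and patch-wise [5] methods iterate and regularize this step.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=8 / 255):
    """Single-step FGSM sketch: perturb x by eps along the sign of the
    input gradient of the cross-entropy loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()           # locally maximize the loss
    return torch.clamp(x_adv, 0, 1).detach()  # stay in the valid pixel range
```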

GAN-Based Attack

[1] Hyrum S. Anderson, Jonathan Woodbridge and Bobby Filar. DeepDGA: Adversarially-Tuned Domain Generation and Detection. AISec 2016.

[2] Chaowei Xiao, Bo Li, Jun-Yan Zhu, Warren He, Mingyan Liu and Dawn Song. Generating Adversarial Examples with Adversarial Networks. IJCAI 2018.

[3] Yang Song, Rui Shu, Nate Kushman and Stefano Ermon. Constructing Unrestricted Adversarial Examples with Generative Models. NeurIPS 2018.

[4] Xiaosen Wang, Kun He and John E. Hopcroft. AT-GAN: A Generative Attack Model for Adversarial Transferring on Generative Adversarial Nets. arXiv Preprint arXiv:1904.07793 2019.

[5] Tao Bai, Jun Zhao, Jinlin Zhu, Shoudong Han, Jiefeng Chen and Bo Li. AI-GAN: Attack-Inspired Generation of Adversarial Examples. arXiv Preprint arXiv:2002.02196 2020.
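
These methods train a generator to emit perturbations (or whole images) that fool a target classifier. A heavily simplified sketch of an AdvGAN-style generator objective in the spirit of [2] is shown below; `G`, `D`, and the target classifier `f` are hypothetical PyTorch modules, and the loss weights are set to 1 for brevity.

```python
import torch
import torch.nn.functional as F

def advgan_generator_loss(G, D, f, x, y_true, eps=8 / 255, c=1.0):
    """AdvGAN-style generator loss sketch: fool the classifier f, fool the
    discriminator D, and keep the perturbation small via a hinge penalty."""
    delta = eps * torch.tanh(G(x))       # bounded perturbation from the generator
    x_adv = torch.clamp(x + delta, 0, 1)
    logits = f(x_adv)
    true_logit = logits.gather(1, y_true.unsqueeze(1)).squeeze(1)
    other_logit = logits.scatter(1, y_true.unsqueeze(1), float('-inf')).max(1).values
    loss_adv = torch.clamp(true_logit - other_logit, min=0).mean()  # untargeted CW-style loss
    d_out = D(x_adv)
    loss_gan = F.binary_cross_entropy_with_logits(d_out, torch.ones_like(d_out))
    loss_hinge = torch.clamp(delta.flatten(1).norm(dim=1) - c, min=0).mean()
    return loss_adv + loss_gan + loss_hinge
```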

Transferability

full list

[1] Jiadong Lin, Chuanbiao Song, Kun He, Liwei Wang, John E. Hopcroft. Nesterov Accelerated Gradient and Scale Invariance for Improving Transferability of Adversarial Examples. ICLR 2020.

[2] Weibin Wu, Yuxin Su, Xixian Chen, Shenglin Zhao, Irwin King, Michael R. Lyu, Yu-Wing Tai. Boosting the Transferability of Adversarial Samples via Attention. CVPR 2020.

[3] Nathan Inkawhich, Kevin Liang, Lawrence Carin and Yiran Chen. Transferable Perturbations of Deep Feature Distributions. ICLR 2020.

[4] Kaizhao Liang, Jacky Y. Zhang, Oluwasanmi Koyejo, Bo Li. Does Adversarial Transferability Indicate Knowledge Transferability?. arXiv Preprint arXiv:2006.14512 2020.

[5] Junhua Zou, Zhisong Pan, Junyang Qiu, Xin Liu, Ting Rui, Wei Li. Improving the Transferability of Adversarial Examples with Resized-Diverse-Inputs, Diversity-Ensemble and Region Fitting. ECCV 2020.

[6] Cihang Xie, Zhishuai Zhang, Yuyin Zhou, Song Bai, Jianyu Wang, Zhou Ren, Alan Yuille. Improving Transferability of Adversarial Examples with Input Diversity. CVPR 2019.

[7] Yinpeng Dong, Tianyu Pang, Hang Su, Jun Zhu. Evading Defenses to Transferable Adversarial Examples by Translation-Invariant Attacks. CVPR 2019.

[8] Yinpeng Dong, Fangzhou Liao, Tianyu Pang, Hang Su, Jun Zhu, Xiaolin Hu, Jianguo Li. Boosting Adversarial Attacks with Momentum. CVPR 2018.

[9] Xiaosen Wang, Xuanran He, Jingdong Wang, Kun He. Admix: Enhancing the Transferability of Adversarial Attacks. ICCV 2021.

[10] Xiaosen Wang, Kun He. Enhancing the Transferability of Adversarial Attacks through Variance Tuning. CVPR 2021.

[11] Xiaosen Wang, Jiadong Lin, Han Hu, Jingdong Wang, Kun He. Boosting Adversarial Transferability through Enhanced Momentum. arXiv Preprint arXiv:2103.10609 2021.

[12] Weibin Wu, Yuxin Su, Michael R. Lyu, Irwin King. Improving the Transferability of Adversarial Samples With Adversarial Transformations. CVPR 2021.

[13] Zhibo Wang, Hengchang Guo, Zhifei Zhang, Wenxin Liu, Zhan Qin, Kui Ren. Feature Importance-aware Transferable Adversarial Attacks. ICCV 2021.
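
Momentum [8] is the common backbone here: accumulating normalized gradients across iterations stabilizes the update direction, and input transformations such as random resize-and-pad [6] or variance tuning [10] are layered on top of the same loop. A minimal MI-FGSM sketch, assuming a PyTorch classifier and a 4D image batch with pixel values in [0, 1]:

```python
import torch
import torch.nn.functional as F

def mi_fgsm(model, x, y, eps=8 / 255, steps=10, mu=1.0):
    """MI-FGSM sketch: accumulate a momentum of L1-normalized gradients
    and step along its sign, projecting back onto the L_inf ball."""
    alpha = eps / steps
    g = torch.zeros_like(x)
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        g = mu * g + grad / (grad.abs().sum(dim=(1, 2, 3), keepdim=True) + 1e-12)
        x_adv = x_adv.detach() + alpha * g.sign()
        x_adv = x + torch.clamp(x_adv - x, -eps, eps)  # project onto the L_inf ball
        x_adv = torch.clamp(x_adv, 0, 1).detach()
    return x_adv
```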

Hard Label Attack

[1] Wieland Brendel, Jonas Rauber, Matthias Bethge. Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models. ICLR 2018.

[2] Andrew Ilyas, Logan Engstrom, Anish Athalye, Jessy Lin. Black-box Adversarial Attacks with Limited Queries and Information. ICML 2018.

[3] Minhao Cheng, Thong Le, Pin-Yu Chen, Jinfeng Yi, Huan Zhang, Cho-Jui Hsieh. Query-Efficient Hard-label Black-box Attack: An Optimization-based Approach. ICLR 2019.

[4] Yinpeng Dong, Hang Su, Baoyuan Wu, Zhifeng Li, Wei Liu, Tong Zhang, Jun Zhu. Efficient Decision-based Black-box Adversarial Attacks on Face Recognition. CVPR 2019.

[5] Yujia Liu, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard. A geometry-inspired decision-based attack. ICCV 2019.

[6] Jianbo Chen, Michael I. Jordan, Martin J. Wainwright. HopSkipJumpAttack: A Query-Efficient Decision-Based Attack. SP 2020.

[7] Minhao Cheng, Simranjit Singh, Patrick Chen, Pin-Yu Chen, Sijia Liu, Cho-Jui Hsieh. Sign-OPT: A Query-Efficient Hard-label Adversarial Attack. ICLR 2020.

[8] Ali Rahmati, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard, Huaiyu Dai. GeoDA: a geometric framework for black-box adversarial attacks. CVPR 2020.

[9] Huichen Li, Xiaojun Xu, Xiaolu Zhang, Shuang Yang, Bo Li. QEBA: Query-Efficient Boundary-Based Blackbox Attack. CVPR 2020.

[10] Weilun Chen, Zhaoxiang Zhang, Xiaolin Hu, Baoyuan Wu. Boosting Decision-based Black-box Adversarial Attacks with Random Sign Flip. ECCV 2020.

[11] Thibault Maho, Teddy Furon, Erwan Le Merrer. SurFree: a fast surrogate-free black-box attack. CVPR 2021.

[12] Huichen Li, Linyi Li, Xiaojun Xu, Xiaolu Zhang, Shuang Yang, Bo Li. Nonlinear Projection Based Gradient Estimation for Query Efficient Blackbox Attacks. AISTATS 2021.
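
In this setting only the top-1 label is observable, so these attacks walk along the decision boundary instead of following a gradient. A toy random-walk sketch loosely in the spirit of the Boundary Attack [1], where `predict` is the hard-label oracle and `x_adv_init` is any starting point that is already misclassified (both assumed):

```python
import numpy as np

def boundary_walk(predict, x, y_true, x_adv_init, steps=1000, step=0.01, seed=0):
    """Hard-label attack sketch: propose a random perturbation plus a small
    contraction toward the original image, and accept the candidate only
    if it is still misclassified."""
    rng = np.random.default_rng(seed)
    x_adv = x_adv_init.copy()
    for _ in range(steps):
        noise = rng.normal(size=x.shape)
        noise *= step * np.linalg.norm(x_adv - x) / (np.linalg.norm(noise) + 1e-12)
        candidate = np.clip(x_adv + noise + step * (x - x_adv), 0.0, 1.0)
        if predict(candidate) != y_true:  # still adversarial: keep it
            x_adv = candidate
    return x_adv
```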

Unrestricted Adversarial Examples

[1] Yang Song, Rui Shu, Nate Kushman and Stefano Ermon. Constructing Unrestricted Adversarial Examples with Generative Models. NeurIPS 2018.

[2] Xiaosen Wang, Kun He and John E. Hopcroft. AT-GAN: A Generative Attack Model for Adversarial Transferring on Generative Adversarial Nets. arXiv Preprint arXiv:1904.07793.

Black-box attacks

[1] Pin-Yu Chen, Huan Zhang, Yash Sharma, Jinfeng Yi, Cho-Jui Hsieh. ZOO: Zeroth Order Optimization based Black-box Attacks to Deep Neural Networks without Training Substitute Models. ACM Workshop on Artificial Intelligence and Security (AISec) 2017.

[2] Andrew Ilyas, Logan Engstrom, Anish Athalye, Jessy Lin. Black-box Adversarial Attacks with Limited Queries and Information. ICML 2018.

[3] Andrew Ilyas, Logan Engstrom, Aleksander Madry. Prior Convictions: Black-Box Adversarial Attacks with Bandits and Priors. ICLR 2019.

[4] Arjun Nitin Bhagoji, Warren He, Bo Li, Dawn Song. Practical Black-box Attacks on Deep Neural Networks using Efficient Query Mechanisms. ECCV 2018.

[5] Shuyu Cheng, Yinpeng Dong, Tianyu Pang, Hang Su and Jun Zhu. Improving Black-box Adversarial Attacks with a Transfer-based Prior. NeurIPS 2019.
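
Score-based attacks such as [2] see the model's output probabilities but not its gradients, so they estimate the gradient from queries. A minimal NES-style estimator sketch, assuming `loss_fn` maps an input to a scalar loss computed from the returned probabilities:

```python
import numpy as np

def nes_gradient(loss_fn, x, sigma=1e-3, n=50, seed=0):
    """Estimate grad loss(x) with antithetic Gaussian samples (2*n queries);
    the estimate can then drive an FGSM/PGD-style update."""
    rng = np.random.default_rng(seed)
    grad = np.zeros_like(x)
    for _ in range(n):
        u = rng.normal(size=x.shape)
        grad += (loss_fn(x + sigma * u) - loss_fn(x - sigma * u)) * u
    return grad / (2 * n * sigma)
```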

Hard-label attacks

[1] Wieland Brendel, Jonas Rauber and Matthias Bethge. Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models. ICLR 2018.

[2] Andrew Ilyas, Logan Engstrom, Anish Athalye and Jessy Lin. Black-box Adversarial Attacks with Limited Queries and Information. ICML 2018.

[3] Yinpeng Dong, Hang Su, Baoyuan Wu, Zhifeng Li, Wei Liu, Tong Zhang and Jun Zhu. Efficient Decision-based Black-box Adversarial Attacks on Face Recognition. CVPR 2019.

[4] Minhao Cheng, Thong Le, Pin-Yu Chen, Jinfeng Yi, Huan Zhang and Cho-Jui Hsieh. Query-Efficient Hard-label Black-box Attack: An Optimization-based Approach. ICLR 2019.

[5] Minhao Cheng, Simranjit Singh, Patrick Chen, Pin-Yu Chen, Sijia Liu and Cho-Jui Hsieh. Sign-OPT: A Query-Efficient Hard-label Adversarial Attack. ICLR 2020.

[6] Weilun Chen, Zhaoxiang Zhang, Xiaolin Hu, and Baoyuan Wu. Boosting Decision-based Black-box Adversarial Attacks with Random Sign Flip. ECCV 2020.

[7] Yujia Liu, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard. A Geometry-Inspired Decision-Based Attack. ICCV 2019.

[8] Thibault Maho, Teddy Furon, Erwan Le Merrer. SurFree: a fast surrogate-free black-box attack. CVPR 2021.

Others

[1] Xiaoyi Dong, Jiangfan Han, Dongdong Chen, Jiayang Liu, Huanyu Bian, Zehua Ma, Hongsheng Li, Xiaogang Wang, Weiming Zhang and Nenghai Yu. Robust Superpixel-Guided Attentional Adversarial Attack. CVPR 2020.

[2] Linxi Jiang, Xingjun Ma, Zejia Weng, James Bailey and Yu-Gang Jiang. Imbalanced Gradients: A New Cause of Overestimated Adversarial Robustness. arXiv Preprint arXiv:2006.13726.

Unrecognized Images

[1] Anh Nguyen, Jason Yosinski and Jeff Clune. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images. CVPR 2015.

Adversarial Defense

Adversarial Training

[1] Ian J. Goodfellow, Jonathon Shlens and Christian Szegedy. Explaining and Harnessing Adversarial Examples. ICLR 2015.

[2] Runtian Zhai, Tianle Cai, Di He, Chen Dan, Kun He, John Hopcroft and Liwei Wang. Adversarially Robust Generalization Just Requires More Unlabeled Data. arXiv Preprint arXiv:1906.00555.

[3] Yair Carmon, Aditi Raghunathan, Ludwig Schmidt, Percy Liang and John C. Duchi. Unlabeled Data Improves Adversarial Robustness. arXiv Preprint arXiv:1905.13736.

[4] Jonathan Uesato, Jean-Baptiste Alayrac, Po-Sen Huang, Robert Stanforth, Alhussein Fawzi and Pushmeet Kohli. Are Labels Required for Improving Adversarial Robustness?. arXiv Preprint arXiv:1905.13725.

[5] Chuanbiao Song, Kun He, Liwei Wang and John E. Hopcroft. Improving the Generalization of Adversarial Training with Domain Adaptation. ICLR 2019.

[6] Hang Yu, Aishan Liu, Xianglong Liu, Gengchao Li, Ping Luo, Ran Cheng, Jichen Yang and Chongzhi Zhang. PDA: Progressive Data Augmentation for General Robustness of Deep Neural Networks. arXiv Preprint arXiv:1909.04839.

[7] Chuanbiao Song, Kun He, Jiadong Lin, Liwei Wang and John E. Hopcroft. Robust Local Features for Improving the Generalization of Adversarial Training. ICLR 2020.

[8] Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric P. Xing, Laurent El Ghaoui, Michael I. Jordan. Theoretically Principled Trade-off between Robustness and Accuracy. ICML 2019.

[9] Yuanhao Xiong and Cho-Jui Hsieh. Improved Adversarial Training via Learned Optimizer. arXiv Preprint arXiv:2004.12227.

[10] Pranjal Awasthi, Natalie Frank and Mehryar Mohri. Adversarial Learning Guarantees for Linear Hypotheses and Neural Networks. arXiv Preprint arXiv:2004.13617.

[11] Yisen Wang, Difan Zou, Jinfeng Yi, James Bailey, Xingjun Ma and Quanquan Gu. Improving Adversarial Robustness Requires Revisiting Misclassified Examples. ICLR 2020.

[12] Chang Xiao, Peilin Zhong, Changxi Zheng. Enhancing Adversarial Defense by k-Winners-Take-All. ICLR 2020.

[13] Saehyung Lee, Hyungyu Lee and Sungroh Yoon. Adversarial Vertex Mixup: Toward Better Adversarially Robust Generalization. CVPR 2020.

[14] Gavin Weiguang Ding, Yash Sharma, Kry Yik Chau Lui and Ruitong Huang. MMA Training: Direct Input Space Margin Maximization through Adversarial Training. ICLR 2020.

[15] Harini Kannan, Alexey Kurakin and Ian Goodfellow. Adversarial Logit Pairing. arXiv Preprint arXiv:1803.06373.

[16] Cihang Xie, Mingxing Tan, Boqing Gong, Alan Yuille and Quoc V. Le. Smooth Adversarial Training. arXiv Preprint arXiv:2006.14536.

[17] Anh Bui, Trung Le, He Zhao, Paul Montague, Olivier deVel, Tamas Abraham and Dinh Phung. Improving Adversarial Robustness by Enforcing Local and Global Compactness. ECCV 2020.

[18] David Stutz, Matthias Hein and Bernt Schiele. Confidence-Calibrated Adversarial Training: Generalizing to Unseen Attacks. ICML 2020.

[19] Tianyu Pang, Chao Du, and Jun Zhu. Max-Mahalanobis Linear Discriminant Analysis Networks. ICML 2018.

[20] Ali Shafahi, Mahyar Najibi, Amin Ghiasi, Zheng Xu, John Dickerson, Christoph Studer, Larry S. Davis, Gavin Taylor, Tom Goldstein. Adversarial Training for Free!. NeurIPS 2019.

[21] Yinpeng Dong, Zhijie Deng, Tianyu Pang, Hang Su, Jun Zhu. Adversarial Distributional Training for Robust Deep Learning. NeurIPS 2020.

[22] Alex Lamb, Vikas Verma, Juho Kannala, Yoshua Bengio. Interpolated Adversarial Training: Achieving Robust Neural Networks without Sacrificing Too Much Accuracy. ACM AISec 2019.

[23] Alfred Laugros, Alice Caplier, Matthieu Ospici. Addressing Neural Network Robustness with Mixup and Targeted Labeling Adversarial Training. ECCV 2019.

[24] Saehyung Lee, Hyungyu Lee, Sungroh Yoon. Adversarial Vertex Mixup: Toward Better Adversarially Robust Generalization. CVPR 2020.

[25] Tao Bai, Jinqi Luo, Jun Zhao, Bihan Wen, Qian Wang. Recent Advances in Adversarial Training for Adversarial Robustness. arXiv Preprint arXiv:2102.01356 2021.

[26] Leslie Rice, Eric Wong, J. Zico Kolter. Overfitting in adversarially robust deep learning. ICML 2020.

[27] Jason Bunk, Srinjoy Chattopadhyay, B. S. Manjunath, Shivkumar Chandrasekaran. Adversarially Optimized Mixup for Robust Classification. arXiv Preprint arXiv:2103.11589 2021.

[28] Zuxuan Wu, Tom Goldstein, Larry S. Davis, Ser-Nam Lim. THAT: Two Head Adversarial Training for Improving Robustness at Scale. arXiv Preprint arXiv:2103.13612 2021.
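
Most of these entries refine the same min-max recipe: generate adversarial examples against the current model and train on them, with methods like [20] and the fast variants below reducing its cost. A minimal PGD-based training-step sketch, assuming a PyTorch classifier, optimizer, and inputs in [0, 1]:

```python
import torch
import torch.nn.functional as F

def adv_train_step(model, optimizer, x, y, eps=8 / 255, alpha=2 / 255, steps=7):
    """One adversarial-training step: craft an L_inf PGD adversary on the
    fly, then update the model on the adversarial batch."""
    x_adv = torch.clamp(x + torch.empty_like(x).uniform_(-eps, eps), 0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        grad, = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.clamp(x + torch.clamp(x_adv - x, -eps, eps), 0, 1).detach()
    optimizer.zero_grad()
    F.cross_entropy(model(x_adv), y).backward()
    optimizer.step()
```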

Fast Adversarial Training

[1] Eric Wong, Leslie Rice, J. Zico Kolter. Fast is better than free: Revisiting adversarial training. ICLR 2020.

[2] Maksym Andriushchenko, Nicolas Flammarion. Understanding and Improving Fast Adversarial Training. NeurIPS 2020.

[3] Hoki Kim, Woojin Lee, Jaewook Lee. Understanding Catastrophic Overfitting in Single-step Adversarial Training. AAAI 2021.
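
The key idea in [1] is that a single FGSM step from a random start inside the L_inf ball can replace the multi-step PGD inner loop; [2] and [3] study when and why this shortcut catastrophically overfits. A minimal sketch under the same assumptions as the PGD training step above:

```python
import torch
import torch.nn.functional as F

def fast_adv_train_step(model, optimizer, x, y, eps=8 / 255, alpha=10 / 255):
    """'Fast' adversarial-training step: random start, one FGSM step,
    then a normal model update on the perturbed batch."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    grad, = torch.autograd.grad(
        F.cross_entropy(model(torch.clamp(x + delta, 0, 1)), y), delta)
    delta = torch.clamp(delta.detach() + alpha * grad.sign(), -eps, eps)
    optimizer.zero_grad()
    F.cross_entropy(model(torch.clamp(x + delta, 0, 1)), y).backward()
    optimizer.step()
```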

GAN-Based Defense

[1] Pouya Samangouei, Maya Kabkab and Rama Chellappa. Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models. ICLR 2018.

[2] Yang Song, Taesup Kim, Sebastian Nowozin, Stefano Ermon and Nate Kushman. PixelDefend: Leveraging Generative Models to Understand and Defend against Adversarial Examples. ICLR 2018.

[3] Guoqing Jin, Shiwei Shen, Dongming Zhang, Feng Dai and Yongdong Zhang. APE-GAN: Adversarial Perturbation Elimination with GAN. ICASSP 2019.

Certified defense

[1] Eric Wong and J. Zico Kolter. Provable defenses against adversarial examples via the convex outer adversarial polytope. ICML 2018.

[2] Sven Gowal, Krishnamurthy Dvijotham, Robert Stanforth, Rudy Bunel, Chongli Qin, Jonathan Uesato, Relja Arandjelovic, Timothy Mann and Pushmeet Kohli. Scalable Verified Training for Provably Robust Image Classification. ICCV 2019.

[3] Jeremy M Cohen, Elan Rosenfeld and J. Zico Kolter. Certified Adversarial Robustness via Randomized Smoothing. ICML 2019.

[4] Guang-He Lee, Yang Yuan, Shiyu Chang and Tommi S. Jaakkola. Tight Certificates of Adversarial Robustness for Randomly Smoothed Classifiers. NeurIPS 2019.
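
Randomized smoothing [3] turns any base classifier into a certifiably robust one by classifying under Gaussian noise. A minimal prediction sketch, assuming a PyTorch classifier and a single image tensor; the certification step, which bounds the majority-class probability with binomial confidence intervals, is omitted:

```python
import torch

def smoothed_predict(model, x, sigma=0.25, n=1000):
    """Majority vote of the base classifier over n Gaussian-noised copies
    of a single input x; batching over n is omitted for brevity."""
    with torch.no_grad():
        noisy = x.unsqueeze(0) + sigma * torch.randn(n, *x.shape)
        votes = model(noisy).argmax(dim=1)
    return torch.mode(votes).values.item()
```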

Others

[1] Matthew J. Roos. Utilizing a null class to restrict decision spaces and defend against neural network adversarial attacks. arXiv Preprint arXiv:2002.10084 2020.

[2] Alvin Chan, Yi Tay, Yew Soon Ong and Jie Fu. Jacobian Adversarially Regularized Networks for Robustness. ICLR 2020.

[3] Christian Etmann, Sebastian Lunz, Peter Maass and Carola-Bibiane Schönlieb. On the Connection Between Adversarial Robustness and Saliency Map Interpretability. ICML 2019.

[4] Zhun Deng, Linjun Zhang, Amirata Ghorbani and James Zou. Improving Adversarial Robustness via Unlabeled Out-of-Domain Data. arXiv Preprint arXiv:2006.08476 2020.

[5] Tianyu Pang, Kun Xu, Yinpeng Dong, Chao Du, Ning Chen and Jun Zhu. Rethinking Softmax Cross-Entropy Loss for Adversarial Robustness. ICLR 2020.

Others

[1] Haohan Wang, Xindi Wu, Zeyi Huang, Eric P. Xing. High Frequency Component Helps Explain the Generalization of Convolutional Neural Networks. CVPR 2020.

[2] Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin and David Lopez-Paz. Mixup: Beyond Empirical Risk Minimization. ICLR 2018.

[3] Ludwig Schmidt, Shibani Santurkar, Dimitris Tsipras, Kunal Talwar and Aleksander Mądry. Adversarially Robust Generalization Requires More Data. NeurIPS 2018.

[4] Tianyuan Zhang and Zhanxing Zhu. Interpreting Adversarially Trained Convolutional Neural Networks. ICML 2019.

[5] Robert Geirhos, Patricia Rubisch, Claudio Michaelis, Matthias Bethge, Felix A. Wichmann and Wieland Brendel. ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. ICLR 2019.

[6] Dong Yin, Raphael Gontijo Lopes, Jonathon Shlens, Ekin D. Cubuk and Justin Gilmer. A Fourier Perspective on Model Robustness in Computer Vision. NeurIPS 2019.

[7] Chengzhi Mao, Amogh Gupta, Vikram Nitin, Baishakhi Ray, Shuran Song, Junfeng Yang and Carl Vondrick. Multitask Learning Strengthens Adversarial Robustness. arXiv Preprint arXiv:2007.07236.

[8] Xiao Wang, Siyue Wang, Pin-Yu Chen, Yanzhi Wang, Brian Kulis, Xue Lin and Peter Chin. Protecting Neural Networks with Hierarchical Random Switching: Towards Better Robustness-Accuracy Trade-off for Stochastic Defenses. IJCAI 2019.

[9] Matteo Terzi, Alessandro Achille, Marco Maggipinto, Gian Antonio Susto. Adversarial Training Reduces Information and Improves Transferability. arXiv Preprint arXiv:2007.11259 2020.
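
As a concrete example of the data-augmentation thread in this list, a minimal mixup sketch ([2]), assuming a PyTorch classifier; mixing the two hard-label losses with the same coefficient is equivalent to mixing one-hot labels:

```python
import torch
import torch.nn.functional as F

def mixup_loss(model, x, y, alpha=1.0):
    """Train on a convex combination of example pairs (mixup)."""
    lam = float(torch.distributions.Beta(alpha, alpha).sample())
    perm = torch.randperm(x.size(0))
    logits = model(lam * x + (1 - lam) * x[perm])
    return lam * F.cross_entropy(logits, y) + (1 - lam) * F.cross_entropy(logits, y[perm])
```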

Adversarial Examples in Natural Language Processing

Survey

[1] Wei Emma Zhang, Quan Z. Sheng, Ahoud Alhazmi and Chenliang Li. Adversarial Attacks on Deep Learning Models in Natural Language Processing: A Survey. arXiv Preprint arXiv:1901.06796 2019.

[2] Wenqi Wang, Lina Wang, Benxiao Tang, Run Wang and Aoshuang Ye. Towards a Robust Deep Neural Network in Text Domain A Survey. arXiv Preprint arXiv:1902.07285 2019.

Newest paper

Adversarial Attack

[1] John X. Morris, Eli Lifland, Jin Yong Yoo and Yanjun Qi. TextAttack: A Framework for Adversarial Attacks in Natural Language Processing. arXiv Preprint arXiv:2005.05909 2020.

Character-Level

[1] Javid Ebrahimi, Anyi Rao, Daniel Lowd and Dejing Dou. HotFlip: White-Box Adversarial Examples for Text Classification. ACL 2018.

[2] Ji Gao, Jack Lanchantin, Mary Lou Soffa and Yanjun Qi. Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers. IEEE S&P workshop 2018.

Word-Level

[1] Nicolas Papernot, Patrick McDaniel, Ananthram Swami and Richard Harang. Crafting Adversarial Input Sequences for Recurrent Neural Networks. MILCOM 2016.

[2] Volodymyr Kuleshov, Shantanu Thakoor, Tingfung Lau and Stefano Ermon. Adversarial Examples for Natural Language Classification Problems. ICLR 2018 rejected.

[3] Moustafa Alzantot, Yash Sharma, Ahmed Elgohary, Bo-Jhang Ho, Mani Srivastava and Kai-Wei Chang. Generating Natural Language Adversarial Examples. EMNLP 2018.

[4] Shuhuai Ren, Yihe Deng, Kun He and Wanxiang Che. Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency. ACL 2019.

[5] Huangzhao Zhang, Hao Zhou, Ning Miao and Lei Li. Generating Fluent Adversarial Examples for Natural Languages. ACL 2019.

[6] Yi-Ting Tsai, Min-Chu Yang and Han-Yu Chen. Adversarial Attack on Sentiment Classification. ACL workshop 2019.

[7] Samuel Barham and Soheil Feizi. Interpretable Adversarial Training for Text. arXiv Preprint arXiv:1905.12864 2019.

[8] Di Jin, Zhijing Jin, Joey Tianyi Zhou and Peter Szolovits. Is BERT Really Robust? Natural Language Attack on Text Classification and Entailment. arXiv Preprint arXiv:1907.11932 2019.

[9] Xiaosen Wang, Hao Jin and Kun He. Natural Language Adversarial Attacks and Defenses in Word Level. arXiv Preprint arXiv:1909.06723 2019.

[10] Suranjana Samanta and Sameep Mehta. Towards Crafting Text Adversarial Samples. arXiv Preprint arXiv:1707.02812 2017.

[11] Yuan Zang, Fanchao Qi, Chenghao Yang, Zhiyuan Liu, Meng Zhang, Qun Liu and Maosong Sun. Word-level Textual Adversarial Attacking as Combinatorial Optimization. ACL 2020.

[12] Xiaosen Wang, Yichen Yang, Yihe Deng and Kun He. Adversarial Training with Fast Gradient Projection Method against Synonym Substitution based Text Attacks. AAAI 2021.

[13] Linyang Li, Yunfan Shao, Demin Song, Xipeng Qiu and Xuanjing Huang. Generating Adversarial Examples in Chinese Texts Using Sentence-Pieces. arXiv Preprint arXiv:2012.14769 2020.

[14] Bushra Sabir, M. Ali Babar, Raj Gaire. ReinforceBug: A Framework to Generate Adversarial Textual Examples. NAACL 2021.

[15] Xuanli He, Lingjuan Lyu, Qiongkai Xu, Lichao Sun. Model Extraction and Adversarial Transferability, Your BERT is Vulnerable!. NAACL 2021.

[16] Zhao Meng, Roger Wattenhofer. A Geometry-Inspired Attack for Generating Natural Language Adversarial Examples. COLING 2020.

[17] Rishabh Maheshwary, Saket Maheshwary, Vikram Pudi. A Strong Baseline for Query Efficient Attacks in a Black Box Setting. EMNLP 2021.

[18] Yangyi Chen, Jin Su, Wei Wei. Multi-granularity Textual Adversarial Attack with Behavior Cloning. EMNLP 2021.

[19] Shengcai Liu, Ning Lu, Cheng Chen, Ke Tang. Efficient Combinatorial Optimization for Word-level Adversarial Textual Attack. arXiv Preprint arXiv:2109.02229 2021.
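
Most word-level attacks share a search-over-substitutions skeleton. A simplified greedy pass in the spirit of PWWS [4] (which additionally ranks positions by word saliency), where `predict` returns a NumPy array of class probabilities for a token list and `synonyms` maps a word to candidate replacements (both assumed):

```python
def greedy_synonym_attack(predict, tokens, y_true, synonyms):
    """Greedy word-substitution sketch: at each position, keep the synonym
    that most lowers the true-class probability; stop once the label flips."""
    tokens = list(tokens)
    for i, word in enumerate(tokens):
        best, best_prob = word, predict(tokens)[y_true]
        for cand in synonyms.get(word, []):
            prob = predict(tokens[:i] + [cand] + tokens[i + 1:])[y_true]
            if prob < best_prob:
                best, best_prob = cand, prob
        tokens[i] = best
        if predict(tokens).argmax() != y_true:  # success: label flipped
            return tokens
    return None  # no adversarial example found in one greedy pass
```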

Both

[1] Bin Liang, Hongcheng Li, Miaoqiang Su, Pan Bian, Xirong Li and Wenchang Shi. Deep text classification can be fooled. IJCAI 2018.

[2] Jinfeng Li, Shouling Ji, Tianyu Du, Bo Li and Ting Wang. TextBugger: Generating Adversarial Text Against Real-world Applications. NDSS 2019.

Universal Adversarial Examples

[1] Di Li, Danilo Vasconcellos Vargas and Sakurai Kouichi. Universal Rules for Fooling Deep Neural Networks based Text Classification. CEC 2019.

[2] Melika Behjati, Seyed-Mohsen Moosavi-Dezfooli, Mahdieh Soleymani Baghshah and Pascal Frossard. Universal Adversarial Attacks on Text Classifiers. ICASSP 2019.

Adversarial Defense

Character-Level

[1] Danish Pruthi, Bhuwan Dhingra and Zachary C. Lipton. Combating Adversarial Misspellings with Robust Word Recognition. ACL 2019.

[2] Hui Liu, Yongzheng Zhang, Yipeng Wang, Zheng Lin, Yige Chen. Joint Character-level Word Embedding and Adversarial Stability Training to Defend Adversarial Text. AAAI 2020.

Word-Level

[1] Ishai Rosenberg, Asaf Shabtai, Yuval Elovici and Lior Rokach. Defense Methods Against Adversarial Examples for Recurrent Neural Networks. arXiv Preprint arXiv:1901.09963 2019.

[2] Xiaosen Wang, Hao Jin and Kun He. Natural Language Adversarial Attacks and Defenses in Word Level. arXiv Preprint arXiv:1909.06723 2019.

[3] Yi Zhou, Xiaoqing Zheng, Cho-Jui Hsieh, Kai-Wei Chang and Xuanjing Huang. Defense against Adversarial Attacks in NLP via Dirichlet Neighborhood Ensemble. arXiv Preprint arXiv:2006.11627 2020.

[4] Xiaosen Wang, Yichen Yang, Yihe Deng and Kun He. Adversarial Training with Fast Gradient Projection Method against Synonym Substitution based Text Attacks. AAAI 2021.

[5] Rishabh Maheshwary, Saket Maheshwary and Vikram Pudi. Generating Natural Language Attacks in a Hard Label Black Box Setting. AAAI 2021.

[6] Chenglei Si, Zhengyan Zhang, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun. Better Robustness by More Coverage: Adversarial Training with Mixup Augmentation for Robust Fine-tuning. arXiv Preprint arXiv:2012.15699 2020.

[7] Xinshuai Dong, Anh Tuan Luu, Rongrong Ji, Hong Liu. Towards Robustness Against Natural Language Word Substitutions. ICLR 2021.

[8] Jin Yong Yoo, Yanjun Qi. Towards Improving Adversarial Training of NLP Models. EMNLP Findings 2021.

[9] Rongzhou Bao, Jiayi Wang, Hai Zhao. Defending Pre-trained Language Models from Adversarial Word Substitutions Without Performance Sacrifice. ACL Findings 2021.

Both

[1] Nestor Rodriguez and Sergio Rojas-Galeano. Shielding Google's language toxicity model against adversarial attacks. arXiv Preprint arXiv:1801.01828 2018.

Certified defense

[1] Robin Jia, Aditi Raghunathan, Kerem Göksel and Percy Liang. Certified Robustness to Adversarial Word Substitutions. EMNLP-IJCNLP 2019.

[2] Po-Sen Huang, Robert Stanforth, Johannes Welbl, Chris Dyer, Dani Yogatama, Sven Gowal, Krishnamurthy Dvijotham, Pushmeet Kohli. Achieving Verified Robustness to Symbol Substitutions via Interval Bound Propagation. EMNLP-IJCNLP 2019.

[3] Jiehang Zeng, Xiaoqing Zheng, Jianhan Xu, Linyang Li, Liping Yuan, Xuanjing Huang. Certified Robustness to Text Adversarial Attacks by Randomized [MASK]. ACL Findings 2021.

Detection

[1] Yichao Zhou, Jyun-Yu Jiang, Kai-Wei Chang and Wei Wang. Learning to Discriminate Perturbations for Blocking Adversarial Attacks in Text Classification. EMNLP-IJCNLP 2019.

[2] Maximilian Mozes, Pontus Stenetorp, Bennett Kleinberg, Lewis D. Griffin. Frequency-Guided Word Substitutions for Detecting Textual Adversarial Examples. EACL 2021.

[3] Xiaosen Wang, Yifeng Xiong, Kun He. Randomized Substitution and Vote for Textual Adversarial Example Detection. arXiv Preprint arXiv:2109.05698 2021.
