Adversarial Attacks in Artificial Intelligence: A Survey

  • Shanghai Key Laboratory of Integrated Administration Technologies for Information Security; School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China

Abstract

With the widespread use of artificial intelligence, the security of artificial intelligence has drawn public attention, and research on adversarial attacks has become a hotspot in this field. This paper first introduces the concept of adversarial attacks and their causes: the main reason is that the inconsistency between the model's decision boundary and the real boundary leads to the existence of an adversarial space. The paper then reviews work on designing adversarial attacks, as well as methods for detecting and defending against them. The attacks surveyed include FGSM and JSMA; their main idea is to find the gradient direction of the model along which the loss grows fastest, add a perturbation in that direction, and thereby cause the model to misclassify. Finally, some future research directions are proposed.
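To make the gradient-based idea concrete, the sketch below illustrates the fast gradient sign method (FGSM): the adversarial example is x_adv = x + ε · sign(∇_x J(θ, x, y)), i.e. the input is pushed a small step ε along the sign of the loss gradient. This is a minimal illustrative sketch, not the survey's own code; the PyTorch framework, the toy classifier, and the value of ε are assumptions made for the example.

```python
# Minimal FGSM sketch (PyTorch assumed; model, input, and epsilon are illustrative).
import torch
import torch.nn as nn

def fgsm_attack(model, x, y, epsilon=0.1):
    """Perturb x one step along the sign of the loss gradient (FGSM)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.CrossEntropyLoss()(model(x_adv), y)
    loss.backward()
    # Step in the direction that increases the loss, then clamp to a valid image range.
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0.0, 1.0).detach()

if __name__ == "__main__":
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # toy classifier
    x = torch.rand(1, 1, 28, 28)   # dummy "image" in [0, 1]
    y = torch.tensor([3])          # dummy label
    x_adv = fgsm_attack(model, x, y)
    print((x_adv - x).abs().max()) # perturbation magnitude is bounded by epsilon
```

A larger ε makes the attack more effective but also more visible; targeted variants instead step down the gradient of the loss for a chosen target class.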

Cite this article

YI Ping, WANG Kedi, HUANG Cheng, GU Shuangchi, ZOU Futai, LI Jianhua. Adversarial Attacks in Artificial Intelligence: A Survey[J]. Journal of Shanghai Jiaotong University, 2018, 52(10): 1298-1306. DOI: 10.16183/j.cnki.jsjtu.2018.10.019
