Journal of Shanghai Jiao Tong University ›› 2018, Vol. 52 ›› Issue (10): 1298-1306. doi: 10.16183/j.cnki.jsjtu.2018.10.019
YI Ping, WANG Kedi, HUANG Cheng, GU Shuangchi, ZOU Futai, LI Jianhua
Published online: 2025-07-02
Corresponding author: LI Jianhua, male, professor, doctoral supervisor. Tel.: 021-62932899; E-mail: lijh888@sjtu.edu.cn
About the first author: YI Ping (1969-), male, from Luoyang, Henan Province, associate professor; his current research focuses on artificial intelligence security.
Abstract: With the wide application of artificial intelligence (AI), AI security has begun to attract attention, and adversarial attacks in particular have become a hot topic in AI security research. This paper introduces the concept of adversarial attacks and explains why adversarial examples exist: chiefly, the mismatch between a model's decision boundary and the true boundary of the underlying system leaves an adversarial space. It then reviews several classical methods for generating adversarial examples, including the fast gradient method and the Jacobian-based saliency map attack; the main idea of these attacks is to find the direction in which the model's loss gradient changes fastest and to add a perturbation along that direction so that the model misclassifies the input. Finally, it surveys methods for detecting adversarial attacks and defending against them, and suggests some directions for future research.
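The fast-gradient idea summarized in the abstract — perturb the input along the sign of the loss gradient so the loss rises fastest — can be sketched in a few lines. This is a minimal NumPy illustration, not code from the paper; the linear classifier, its weights, and the example input are hypothetical choices made only to keep the sketch self-contained.

```python
import numpy as np

def fgsm_perturb(x, grad, eps=0.1):
    """One fast-gradient-sign step: move the input by eps in the
    direction that increases the loss fastest, then clip to the
    valid input range [0, 1]."""
    x_adv = x + eps * np.sign(grad)
    return np.clip(x_adv, 0.0, 1.0)

def logistic_loss_grad(x, w, b, y):
    """Gradient of the binary cross-entropy loss with respect to the
    input x, for a linear classifier p = sigmoid(w.x + b)."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return (p - y) * w

# Toy example: a clean input confidently classified as class 1.
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([0.8, 0.2])
y = 1.0

g = logistic_loss_grad(x, w, b, y)
x_adv = fgsm_perturb(x, g, eps=0.25)

p_clean = 1.0 / (1.0 + np.exp(-(x @ w + b)))
p_adv = 1.0 / (1.0 + np.exp(-(x_adv @ w + b)))
```

Even this two-feature example shows the mechanism the survey describes: the perturbed input `x_adv` stays close to `x` (each coordinate moves by at most `eps`), yet the model's confidence in the true class drops. On real image classifiers the same step, applied per pixel with a small `eps`, can flip the predicted label while the change remains imperceptible to a human.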
Cite this article: YI Ping, WANG Kedi, HUANG Cheng, GU Shuangchi, ZOU Futai, LI Jianhua. Adversarial Attacks in Artificial Intelligence: A Survey[J]. Journal of Shanghai Jiao Tong University, 2018, 52(10): 1298-1306.