Journal of Shanghai Jiaotong University (Natural Science Edition) ›› 2018, Vol. 52 ›› Issue (10): 1298-1306. doi: 10.16183/j.cnki.jsjtu.2018.10.019
YI Ping, WANG Kedi, HUANG Cheng, GU Shuangchi, ZOU Futai, LI Jianhua
Corresponding author:
LI Jianhua, male, professor and doctoral supervisor. Tel.: 021-62932899; e-mail: lijh888@sjtu.edu.cn
About the first author:
YI Ping (1969-), male, born in Luoyang, Henan Province, is an associate professor whose current research focuses on artificial intelligence security.
Abstract: With the wide application of artificial intelligence (AI), AI security has begun to attract attention, and adversarial attacks on AI have become a hot topic in AI security research. This paper introduces the concept of adversarial attacks and explains why adversarial examples arise: the mismatch between a model's decision boundary and the true system boundary leaves an adversarial space. It then reviews several classic methods for generating adversarial examples, including the fast gradient method and the Jacobian-based saliency map attack; the main idea of these attacks is to find the direction in which the model's loss gradient changes fastest and to add a perturbation along that direction so that the model misclassifies the input. Finally, it surveys methods for detecting adversarial attacks and defenses against them, and proposes some directions for future research.
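The gradient-following idea summarized in the abstract can be illustrated with a minimal sketch of the fast gradient sign method (FGSM) on a toy logistic-regression classifier. The weights, input, and step size `eps` below are illustrative assumptions, not values from the paper; a sketch under these assumptions, not a definitive implementation of the surveyed attacks.

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps a score to a class-1 probability
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, w, b, y_true, eps=0.25):
    """Perturb x along the sign of the loss gradient w.r.t. the input."""
    p = sigmoid(w @ x + b)            # model's predicted probability of class 1
    # For cross-entropy loss, the gradient w.r.t. the input x is (p - y) * w
    grad_x = (p - y_true) * w
    # Step in the fastest loss-increasing direction, bounded by eps per feature
    return x + eps * np.sign(grad_x)

w = np.array([1.0, -1.5, 0.8])        # toy model weights (assumed)
b = 0.0
x = np.array([0.6, -0.4, 0.2])        # input correctly classified as class 1
x_adv = fgsm_perturb(x, w, b, y_true=1.0, eps=0.6)

print(sigmoid(w @ x + b))             # ≈ 0.80: confident class-1 prediction
print(sigmoid(w @ x_adv + b))         # ≈ 0.35: the perturbed input is misclassified
```

The perturbation changes each feature by at most `eps`, yet flips the prediction, which is exactly the boundary-mismatch effect the abstract describes: the small adversarial step crosses the model's decision boundary without crossing the true class boundary.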
YI Ping, WANG Kedi, HUANG Cheng, GU Shuangchi, ZOU Futai, LI Jianhua. Adversarial Attacks in Artificial Intelligence: A Survey[J]. Journal of Shanghai Jiaotong University, 2018, 52(10): 1298-1306.