Missile-Target Situation Assessment Model Based on Reinforcement Learning

Journal of Shanghai Jiao Tong University (Science), 2020, Vol. 25, Issue (5): 561-568. doi: 10.1007/s12204-020-2226-8

ZHANG Yun (张贇), LÜ Runyan (吕润妍), CAI Yunze (蔡云泽)

(Department of Automation; Key Laboratory of System Control and Information Processing of Ministry of Education;
Key Laboratory of Marine Intelligent Equipment and System of Ministry of Education,
Shanghai Jiao Tong University, Shanghai 200240, China)

Publication date: 2020-10-28; Release date: 2020-09-11
Corresponding author: CAI Yunze (蔡云泽), E-mail: yzcai@sjtu.edu.cn

Abstract

Abstract: In situation assessment (SA) of a missile versus a target fighter, traditional SA models are generally
highly subjective and adapt poorly to dynamic conditions. This paper treats SA as an
expectation of future returns and establishes a missile-target simulation battle model. The actor-critic (AC)
algorithm in reinforcement learning (RL) is used to train the evaluation network, and a missile-target SA model
is established through simulated battle training. Simulation and comparative experiments show that the model can
effectively estimate the expected effect of a missile attack under the current situation, providing an effective
basis for missile attack decisions.

Key words: situation assessment (SA), battle model, reinforcement learning (RL), actor-critic (AC) algorithm
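
The idea of treating SA as an expectation of future returns can be illustrated with a small tabular actor-critic loop, in which the critic's state value plays the role of the situation score. The following is only a minimal sketch under stated assumptions, not the authors' model: the one-dimensional "distance to target" state, the two actions, the hit-only reward, the step sizes, and every name in the code (`step`, `train`, `N_DIST`, and so on) are hypothetical and do not come from the paper, which trains an evaluation network in a missile-target simulation.

```python
import math
import random

# All constants below are illustrative assumptions, not values from the paper.
N_DIST = 6       # discretised missile-target distances 0..5 (0 means a hit)
GAMMA = 0.9      # discount factor
ALPHA_V = 0.1    # critic step size
ALPHA_PI = 0.1   # actor step size
MAX_STEPS = 20   # cap on steps per simulated engagement


def step(dist, action):
    """Toy dynamics: action 1 closes the gap by one cell, action 0 opens it."""
    dist = max(dist - 1, 0) if action == 1 else min(dist + 1, N_DIST - 1)
    done = dist == 0
    reward = 1.0 if done else 0.0  # reward is given only on a hit
    return dist, reward, done


def softmax(prefs):
    """Turn action preferences into probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(episodes=3000, seed=0):
    """One-step actor-critic with a tabular critic and a softmax actor."""
    rng = random.Random(seed)
    V = [0.0] * N_DIST                           # critic: state values
    theta = [[0.0, 0.0] for _ in range(N_DIST)]  # actor: action preferences
    for _ in range(episodes):
        s = rng.randrange(1, N_DIST)             # random non-terminal start
        for _ in range(MAX_STEPS):
            probs = softmax(theta[s])
            a = 0 if rng.random() < probs[0] else 1
            s2, r, done = step(s, a)
            target = r if done else r + GAMMA * V[s2]
            delta = target - V[s]                # temporal-difference error
            V[s] += ALPHA_V * delta              # critic update
            for b in range(2):                   # softmax policy-gradient update
                grad = (1.0 if b == a else 0.0) - probs[b]
                theta[s][b] += ALPHA_PI * delta * grad
            if done:
                break
            s = s2
    return V, theta


if __name__ == "__main__":
    V, _ = train()
    for d in range(1, N_DIST):
        print(f"distance {d}: assessed value {V[d]:.3f}")
```

In this toy setting the learned value of a state is the critic's estimate of the discounted future reward from that state, so states close to the target end up with a higher assessed value than distant ones. That is the sense in which a critic can serve as a situation score; the paper's actual state, reward, and network design are not given in this abstract.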

CLC number: TP 391, TP 18

ZHANG Yun, LÜ Runyan, CAI Yunze. Missile-Target Situation Assessment Model Based on Reinforcement Learning[J]. Journal of Shanghai Jiao Tong University (Science), 2020, 25(5): 561-568.
