Multi-Agent Formation Method Based on Dynamic Optimization of Proximal Policy


Abstract: Unmanned aerial vehicle (UAV) cluster systems offer capability redundancy, high resistance to destruction, and adaptability to complex scenarios, enabling more efficient mission execution and information acquisition. In recent years, deep reinforcement learning has been incorporated into UAV cluster formation control to address the dimensionality explosion of clusters and the difficulty of modelling cluster systems. However, deep reinforcement learning suffers from problems such as low training efficiency. This paper proposes a cluster formation method based on an improved proximal policy optimization (PPO) algorithm. By using a dynamic estimation method as the evaluation mechanism, the method addresses the slow convergence of traditional proximal policy optimization and its neglect of high-value actions, and effectively improves the data utilization rate. Simulation results verify the improvements in training efficiency and sample reuse, and thus in overall performance.
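For readers unfamiliar with the baseline that the abstract refers to, the following is a minimal sketch of the clipped surrogate objective used in standard proximal policy optimization. It is not the improved method proposed in the paper, and the dynamic estimation mechanism is not reproduced here. The function name `ppo_clipped_objective`, its parameters, and the default clipping range of 0.2 are illustrative assumptions.

```python
from typing import Sequence


def ppo_clipped_objective(
    ratios: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,  # assumed default; not taken from the paper
) -> float:
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    ratios[i]     = pi_new(a_i | s_i) / pi_old(a_i | s_i)
    advantages[i] = advantage estimate for sample i
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("at least one sample is required")

    total = 0.0
    for r, adv in zip(ratios, advantages):
        # Restrict the probability ratio to [1 - clip_eps, 1 + clip_eps].
        clipped_r = min(max(r, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(r * adv, clipped_r * adv)
    return total / len(ratios)


if __name__ == "__main__":
    # Two hypothetical samples: one with positive and one with negative advantage.
    print(ppo_clipped_objective([1.5, 0.5], [1.0, -1.0]))
```

In this baseline, a sample with a positive advantage stops contributing additional objective value once its ratio exceeds `1 + clip_eps`, so the size of a policy update on any single action is bounded.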

Key words: unmanned aerial vehicle clustering; deep reinforcement learning; proximal policy optimization; inverse reinforcement learning; cluster decision making

Received: 16 October 2023 Published: 11 May 2024

ZTFLH:	V 249
	TP 273


URL:

https://www.qk.sjtu.edu.cn/ktfy/EN/ OR https://www.qk.sjtu.edu.cn/ktfy/EN/Y2024/V7/I2/52