Guidance, Navigation and Control

Reinforcement Learning Control Design for Perching Maneuver of Unmanned Aerial Vehicles with Wind Disturbances

  • ZHANG Weizhen,
  • HE Zhen,
  • TANG Zhangfan
  • College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
ZHANG Weizhen (born 1998), master's student, works on flight control.
HE Zhen, associate professor; E-mail: hezhen@nuaa.edu.cn.

Received date: 2024-05-24

  Revised date: 2024-06-12

  Accepted date: 2024-06-19

  Online published: 2024-07-26

Funding

National Natural Science Foundation of China (61873126)

Cite this article

ZHANG Weizhen, HE Zhen, TANG Zhangfan. Reinforcement learning control design for perching maneuver of unmanned aerial vehicles with wind disturbances[J]. Journal of Shanghai Jiao Tong University, 2024, 58(11): 1753-1761. DOI: 10.16183/j.cnki.jsjtu.2024.187

Abstract

This paper designs a control strategy for the perching maneuver of unmanned aerial vehicles in wind-disturbed environments, using sparse identification of nonlinear dynamics with control (SINDYc) together with imitation deep reinforcement learning (IDRL). First, a training environment for the perching maneuver system that covers multiple wind conditions is built by domain randomization. Then, from historical data and a library of candidate functions, the SINDYc method learns a sparse model of the perching maneuver system offline for each wind condition, so that the wind condition can be identified effectively. Next, the perching control strategy is trained with an IDRL algorithm in the multi-wind-condition training environment, yielding a control strategy for perching under wind disturbances. Finally, numerical simulations verify the effectiveness of the designed control strategy in wind-disturbed environments.
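The SINDYc step in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the scalar dynamics `dx/dt = -0.5*x + 2*u + wind`, the degree-2 polynomial library, the threshold value, and all function names below are assumptions chosen for illustration. The sketch fits a sparse model by sequentially thresholded least squares, the regression commonly used with SINDy, and shows how a constant wind term can appear as one identified coefficient.

```python
import numpy as np

def candidate_library(x, u):
    """Candidate functions Theta(x, u): constant plus polynomials up to degree 2."""
    return np.column_stack([np.ones_like(x), x, u, x**2, x * u, u**2])

def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        big = ~small
        if not big.any():
            break
        # Refit using only the candidate functions that survived thresholding.
        xi[big] = np.linalg.lstsq(theta[:, big], dxdt, rcond=None)[0]
    return xi

def simulate(wind, n=2000, dt=0.01, seed=0):
    """Forward-Euler rollout of the hypothetical toy model under a constant wind."""
    rng = np.random.default_rng(seed)
    t = np.arange(n) * dt
    u = np.sin(0.5 * t) + 0.5 * rng.uniform(-1.0, 1.0, size=n)  # exciting input
    x = np.empty(n)
    x[0] = 1.0
    for k in range(n - 1):
        x[k + 1] = x[k] + dt * (-0.5 * x[k] + 2.0 * u[k] + wind)
    dxdt = np.diff(x) / dt  # derivative estimated from the recorded states
    return x[:-1], u[:-1], dxdt

def identify_sparse_model(wind):
    """Offline identification for one wind condition; returns library coefficients."""
    x, u, dxdt = simulate(wind)
    return stlsq(candidate_library(x, u), dxdt)

if __name__ == "__main__":
    for wind in (0.3, -0.4):
        print(wind, identify_sparse_model(wind))
```

In this toy setting the coefficient on the constant library term plays the role of the wind information: repeating the identification per wind condition gives one sparse coefficient vector per condition. Real flight data would be noisy, so the derivative estimate and the threshold would need more care than shown here.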
