Guidance, Navigation and Control

Reinforcement Learning Control Design for Perching Maneuver of Unmanned Aerial Vehicles with Wind Disturbances

  • ZHANG Weizhen,
  • HE Zhen,
  • TANG Zhangfan
  • College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China

Received date: 2024-05-24

Revised date: 2024-06-12

Accepted date: 2024-06-19

Online published: 2024-07-26

Abstract

This paper addresses the perching maneuver of unmanned aerial vehicles (UAVs) in wind-disturbed environments. It designs a perching control strategy by combining the sparse identification of nonlinear dynamics with control (SINDYc) method and an imitation deep reinforcement learning (IDRL) control strategy. First, a training environment for the perching system is built using domain randomization so that it covers a range of wind conditions. Then, the SINDYc method is used offline to learn sparse models of the perching system under different wind conditions from historical data and a library of candidate functions, which allows the wind information to be identified. Next, the perching control strategy is trained with an IDRL algorithm in the training environment that covers multiple wind conditions, yielding a control strategy for perching under wind disturbances. Finally, numerical simulations verify the effectiveness of the proposed perching control strategy in wind-disturbed environments.
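The sparse-identification step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scalar toy system, the polynomial candidate library, the threshold value, and the function names `candidate_library` and `stlsq` are all assumptions chosen for illustration. The sketch uses sequentially thresholded least squares, the regression commonly paired with SINDy-type methods, to recover a sparse model with a control input and a constant wind-like bias from sampled data.

```python
import numpy as np

def candidate_library(x, u):
    """Hypothetical candidate library [1, x, u, x^2, x*u, u^2]
    for a scalar state x and a scalar control input u."""
    return np.column_stack([np.ones_like(x), x, u, x**2, x * u, u**2])

def stlsq(theta, dxdt, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares: fit, zero out small
    coefficients, then refit on the remaining library columns."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        big = ~small
        if not big.any():
            break
        xi[big] = np.linalg.lstsq(theta[:, big], dxdt, rcond=None)[0]
    return xi

# Toy data (assumption): dx/dt = -2 x + 0.5 u + w, with a constant
# wind-like bias w = 0.3 and small measurement noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 500)
u = rng.uniform(-1.0, 1.0, 500)
dxdt = -2.0 * x + 0.5 * u + 0.3 + 1e-3 * rng.standard_normal(500)

# Coefficients are ordered as in the library: [1, x, u, x^2, x*u, u^2].
xi = stlsq(candidate_library(x, u), dxdt)
print(np.round(xi, 3))
```

In this toy setting the constant-term coefficient plays the role of the unknown wind effect, and the quadratic terms are pruned to zero. A real perching model would have a multi-dimensional state and a richer library, so the library and threshold would need to be chosen for that system.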

Cite this article

ZHANG Weizhen, HE Zhen, TANG Zhangfan. Reinforcement Learning Control Design for Perching Maneuver of Unmanned Aerial Vehicles with Wind Disturbances[J]. Journal of Shanghai Jiaotong University, 2024, 58(11): 1753-1761. DOI: 10.16183/j.cnki.jsjtu.2024.187
