
Optimization Method of Underwater Flapping Foil Propulsion Performance Based on Gaussian Process Regression and Deep Reinforcement Learning
YANG Yinghe, WEI Handi, FAN Dixia, LI Ang
Tab.5 Number of samples required and actions learned by the traditional TD3 algorithm and the GPR-TD3 method
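Category                                  Traditional RL samples   GPR-TD3 samples   Action vector
Local optimum of propulsion speed         9,200                    290               [ 0.70055652.860]
Local optimum of propulsion efficiency    6,200                    290               [ 0.70055652.837]
Propulsion speed 100 mm/s                 5,300                    290               [ 0.50155653.630]
Propulsion speed 80 mm/s                  5,900                    290               [ 0.5005553.6154.500]
Propulsion speed 70 mm/s                  4,300                    290               [ 0.6755552.1782.734]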