J Shanghai Jiaotong Univ Sci, 2024, Vol. 29, Issue (3): 377-387. doi: 10.1007/s12204-023-2679-7

Multi-Robot Task Allocation Using Multimodal Multi-Objective Evolutionary Algorithm Based on Deep Reinforcement Learning

MIAO Zhenhua1 (苗镇华), HUANG Wentao2 (黄文焘), ZHANG Yilian3 (张依恋), FAN Qinqin1* (范勤勤)

1. Logistics Research Center, Shanghai Maritime University, Shanghai 201306, China; 2. Key Laboratory of Control of Power Transmission and Conversion of Ministry of Education, Shanghai Jiao Tong University, Shanghai 200240, China; 3. Key Laboratory of Marine Technology and Control Engineering of Ministry of Communications, Shanghai Maritime University, Shanghai 201306, China

Accepted: 2023-08-24; Online: 2024-05-28; Published: 2024-05-28

Abstract: Multi-robot task allocation significantly affects the overall performance of multi-robot collaborative systems. To improve the effectiveness, robustness, and safety of such systems, a multimodal multi-objective evolutionary algorithm based on deep reinforcement learning is proposed in this paper. An improved multimodal multi-objective evolutionary algorithm is used to solve multi-robot task allocation problems, and a deep reinforcement learning strategy is applied in the last generation to provide a high-quality path for each assigned robot in an end-to-end manner. To verify its performance, the proposed algorithm is compared with three popular multimodal multi-objective evolutionary algorithms on three different scenarios of multi-robot task allocation problems. The experimental results show that the proposed algorithm can generate sufficient equivalent schemes to improve the availability and robustness of multi-robot collaborative systems in uncertain environments, and can also produce the best scheme to improve the overall task execution efficiency of multi-robot collaborative systems.

Key words: multi-robot task allocation, multi-robot cooperation, path planning, multimodal multi-objective evolutionary algorithm, deep reinforcement learning
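
To make the notion of "equivalent schemes" concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes a hypothetical toy instance (two identical robots at a shared depot and four tasks), two assumed objectives (total travel and the longest single-robot route), and a fixed visiting order instead of a route learned by deep reinforcement learning. It replaces the evolutionary search with brute-force enumeration, which is only feasible for tiny instances. All names and values in it are illustrative assumptions.

```python
from itertools import product

# Hypothetical toy instance (not from the paper): two identical robots
# start at a shared depot and four tasks sit on a grid around it.
DEPOT = (0, 0)
TASKS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
N_ROBOTS = 2


def manhattan(a, b):
    """Grid distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def route_length(task_ids):
    """Length of a route that leaves the depot and visits tasks in index order."""
    pos, total = DEPOT, 0
    for t in task_ids:
        total += manhattan(pos, TASKS[t])
        pos = TASKS[t]
    return total


def objectives(allocation):
    """Return (total travel, longest single-robot route) for one allocation.

    allocation[t] is the robot assigned to task t. Both objectives are
    assumptions chosen for illustration and are minimized.
    """
    lengths = [
        route_length([t for t, r in enumerate(allocation) if r == robot])
        for robot in range(N_ROBOTS)
    ]
    return (sum(lengths), max(lengths))


def dominates(f, g):
    """f dominates g when it is no worse in every objective and differs from g."""
    return all(a <= b for a, b in zip(f, g)) and f != g


def equivalent_pareto_schemes():
    """Group Pareto-optimal allocations by their shared objective vector."""
    scored = {a: objectives(a)
              for a in product(range(N_ROBOTS), repeat=len(TASKS))}
    front = {f for f in scored.values()
             if not any(dominates(g, f) for g in scored.values())}
    groups = {f: [] for f in front}
    for a, f in scored.items():
        if f in front:
            groups[f].append(a)
    return groups


if __name__ == "__main__":
    for f, allocations in equivalent_pareto_schemes().items():
        print(f, allocations)
```

In this toy instance several different allocations map to the same Pareto-optimal objective vector. A multimodal multi-objective method aims to keep all of them rather than a single representative, so that an alternative of equal quality remains available if one scheme becomes infeasible.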
