Multi-Robot Task Allocation Using Multimodal Multi-Objective Evolutionary Algorithm Based on Deep Reinforcement Learning

doi:10.1007/s12204-023-2679-7

Abstract

The overall performance of multi-robot collaborative systems is significantly affected by multi-robot task allocation. To improve the effectiveness, robustness, and safety of multi-robot collaborative systems, a multimodal multi-objective evolutionary algorithm based on deep reinforcement learning is proposed in this paper. An improved multimodal multi-objective evolutionary algorithm is used to solve multi-robot task allocation problems. Moreover, a deep reinforcement learning strategy is used in the last generation to provide a high-quality path for each assigned robot in an end-to-end manner. To verify its performance, the proposed algorithm is compared with three popular multimodal multi-objective evolutionary algorithms on three different scenarios of multi-robot task allocation problems. The experimental results show that the proposed algorithm can generate sufficient equivalent schemes to improve the availability and robustness of multi-robot collaborative systems in uncertain environments, and can also produce the best scheme to improve the overall task execution efficiency of multi-robot collaborative systems.

Key words: multi-robot task allocation, multi-robot cooperation, path planning, multimodal multi-objective evolutionary algorithm, deep reinforcement learning
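The notion of "equivalent schemes" in the abstract can be illustrated with a small sketch: in multimodal multi-objective optimization, several distinct task allocations may map to the same point on the Pareto front, and keeping all of them gives the system fallback options. The Python below is an illustrative sketch only, not the paper's algorithm. The encoding (`allocation[i]` is the robot assigned to task `i`), the two minimized objectives, and all numeric values are assumptions made for the example.

```python
from collections import defaultdict


def dominates(a, b):
    """Return True if objective vector a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(solutions):
    """Keep the (allocation, objectives) pairs not dominated by any other pair."""
    return [
        (alloc, obj)
        for alloc, obj in solutions
        if not any(dominates(other, obj) for _, other in solutions)
    ]


def equivalent_groups(front):
    """Group distinct allocations that share the same objective vector."""
    groups = defaultdict(list)
    for alloc, obj in front:
        groups[obj].append(alloc)
    return dict(groups)


# Hypothetical data: allocation[i] is the robot assigned to task i.
# Assumed objectives: (total travel cost, largest single-robot cost).
candidates = [
    ((0, 0, 1, 1), (10.0, 6.0)),
    ((1, 1, 0, 0), (10.0, 6.0)),  # different allocation, same objectives
    ((0, 1, 0, 1), (12.0, 5.0)),
    ((0, 0, 0, 1), (13.0, 9.0)),  # dominated by the first two candidates
]

front = non_dominated(candidates)
groups = equivalent_groups(front)
for objectives, allocations in groups.items():
    print(objectives, allocations)
```

In this toy data, two different allocations share one objective vector, so either can be used if the other becomes infeasible, for example when a robot fails. This is the kind of redundancy the abstract associates with availability and robustness.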

CLC number: TP301.6

MIAO Zhenhua(苗镇华), HUANG Wentao(黄文焘), ZHANG Yilian(张依恋), FAN Qinqin(范勤勤). Multi-Robot Task Allocation Using Multimodal Multi-Objective Evolutionary Algorithm Based on Deep Reinforcement Learning[J]. J Shanghai Jiaotong Univ Sci, 2024, 29(3): 377-387.
