Intelligent Robots

Cooperative Pursuit of Unmanned Surface Vehicles Using Multi-Agent Reinforcement Learning

  • College of Mechanical and Electronic Engineering, Dalian Minzu University, Dalian 116600, Liaoning, China

Received date: 2024-11-13

  Accepted date: 2024-12-02

  Online published: 2026-02-12

Abstract

This paper addresses the cooperative pursuit of a dynamic escaping target by unmanned surface vehicles (USVs) using multi-agent reinforcement learning. The pursuit-evasion problem is formulated as a Markov game, and the success criteria for cooperative capture are defined by distance and angle constraints. Within a centralized training and decentralized execution framework augmented with a long short-term memory network, cooperative pursuit training is conducted using multi-agent soft actor-critic reinforcement learning, which optimizes the capture performance of the USVs against the escaping target. In addition, to prevent lazy capturers and increase the capture success rate, a multi-stage reward guidance method is developed in which the training process is adapted to the current states of both sides, guiding the vehicles through the capture task from easy to difficult. Simulations illustrate the effectiveness of the proposed reinforcement learning method for cooperative pursuit of USVs.
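As a rough illustration of what a capture criterion built from distance and angle constraints might look like, the following Python sketch checks two conditions: every pursuer lies within a capture radius of the target, and the largest angular gap between neighbouring pursuers, as seen from the target, stays below a threshold. The function name `is_captured`, the parameters `capture_radius` and `max_gap_deg`, and their default values are assumptions made for this example; the abstract does not state the paper's actual thresholds or the exact form of its criteria.

```python
import math


def is_captured(pursuers, target, capture_radius=5.0, max_gap_deg=180.0):
    """Hypothetical capture test (not the paper's exact criterion).

    pursuers: list of (x, y) pursuer positions
    target:   (x, y) position of the escaping target
    Returns True when all pursuers are within capture_radius of the target
    and the largest angular gap between adjacent pursuers, measured around
    the target, is smaller than max_gap_deg.
    """
    if len(pursuers) < 2:
        return False  # a single vehicle cannot enclose the target

    tx, ty = target
    bearings = []
    for px, py in pursuers:
        dx, dy = px - tx, py - ty
        # Distance constraint: every pursuer must be close enough.
        if math.hypot(dx, dy) > capture_radius:
            return False
        # Bearing of the pursuer as seen from the target, in (-pi, pi].
        bearings.append(math.atan2(dy, dx))

    # Angle constraint: sort the bearings and measure the gaps between
    # neighbours, including the wrap-around gap from the last to the first.
    bearings.sort()
    gaps = [b2 - b1 for b1, b2 in zip(bearings, bearings[1:])]
    gaps.append(2.0 * math.pi - (bearings[-1] - bearings[0]))
    return max(gaps) < math.radians(max_gap_deg)
```

Under these assumed thresholds, three pursuers spaced evenly around the target at a radius of 3 satisfy both constraints, whereas three pursuers clustered on one side satisfy the distance constraint but leave an escape gap wider than 180 degrees and therefore fail the angle constraint.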

Cite this article

Qu Xingru, Li Chu, Jiang Yuze, Long Feifei, Zhang Rubo. Cooperative Pursuit of Unmanned Surface Vehicles Using Multi-Agent Reinforcement Learning[J]. Journal of Shanghai Jiaotong University (Science), 2026, 31(1): 187-194. DOI: 10.1007/s12204-025-2816-6
