J Shanghai Jiaotong Univ Sci, 2025, Vol. 30, Issue (6): 1125-1133. doi: 10.1007/s12204-023-2666-z
• Automation & Computer Technologies •
LI Chunyang (李春阳), ZHU Xiaoqing (朱晓庆), RUAN Xiaogang (阮晓钢), LIU Xinyuan (刘鑫源), ZHANG Siyuan (张思远)
Received: 2023-02-20
Accepted: 2023-03-14
Online: 2025-11-21
Published: 2023-11-06
LI Chunyang, ZHU Xiaoqing, RUAN Xiaogang, LIU Xinyuan, ZHANG Siyuan. Gait Learning Reproduction for Quadruped Robots Based on Experience Evolution Proximal Policy Optimization[J]. J Shanghai Jiaotong Univ Sci, 2025, 30(6): 1125-1133.
[1] YU Xinyi, XU Siyu, FAN Yuehai, OU Linlin. Self-Adaptive LSAC-PID Approach Based on Lyapunov Reward Shaping for Mobile Robots [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(6): 1085-1102.
[2] CHENG Hongyu, ZHANG Han, WANG Shuang, XIE Le. Design of a 6-DOF Master Robot for Robot-Assisted Minimally Invasive Surgery [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(4): 658-667.
[3] WANG Wei, ZHOU Cheng, JIANG Jinlei, CUI Xinyuan, YAN Guozheng, CUI Daxiang. Optimization of Wireless Power Receiving Coil for Near-Infrared Capsule Robot [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(3): 425-432.
[4] LI Tao, ZHAO Zhigang, ZHU Mingtong, ZHAO Xiangtang. Cable Vector Collision Detection Algorithm for Multi-Robot Collaborative Towing System [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(2): 319-329.
[5] FU Yujia, ZHANG Jian, ZHOU Liping, LIU Yuanzhi, QIN Minghui, ZHAO Hui, TAO Wei. Passive Binocular Optical Motion Capture Technology Under Complex Illumination [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(2): 352-362.
[6] NIE Wei, LIANG Xinwu. Efficient Fully Convolutional Network and Optimization Approach for Robotic Grasping Detection Based on RGB-D Images [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(2): 399-416.
[7] ZHAO Yanfei (赵艳飞), XIAO Peng (肖鹏), WANG Jingchuan (王景川), GUO Rui (郭锐). Semi-Autonomous Navigation Based on Local Semantic Map for Mobile Robot [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(1): 27-33.
[8] FU Hang (傅航), XU Jiangchang (许江长), LI Yinwei (李寅炜), ZHOU Huifang (周慧芳), CHEN Xiaojun (陈晓军). Augmented Reality Based Navigation System for Endoscopic Transnasal Optic Canal Decompression [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(1): 34-42.
[9] ZHOU Hanwei (周涵巍), ZHU Xinping (朱心平), MA Youwei (马有为), WANG Kundong (王坤东). Low Latency Soft Fiberoptic Choledochoscope Robot Control System [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(1): 43-52.
[10] HE Guisong (贺贵松), HUANG Xuegong (黄学功), LI Feng (李峰). Coordination Design of a Power-Assisted Ankle Exoskeleton Robot Based on Active-Passive Combined Drive [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(1): 197-208.
[11] LIU Yuesheng (刘月笙), HE Ning (贺宁), HE Lile (贺利乐), ZHANG Yiwen (张译文), XI Kun (习坤), ZHANG Mengrui (张梦芮). Self-Tuning of MPC Controller for Mobile Robot Path Tracking Based on Machine Learning [J]. J Shanghai Jiaotong Univ Sci, 2024, 29(6): 1028-1036.
[12] DONG Yubo (董玉博), CUI Tao (崔涛), ZHOU Yufan (周禹帆), SONG Xun (宋勋), ZHU Yue (祝月), DONG Peng (董鹏). Reward Function Design Method for Long Episode Pursuit Tasks Under Polar Coordinate in Multi-Agent Reinforcement Learning [J]. J Shanghai Jiaotong Univ Sci, 2024, 29(4): 646-655.
[13] DU Haikuo (杜海阔), GUO Zhengyu (郭正玉), ZHANG Lulu (章露露), CAI Yunze (蔡云泽). Multi-Objective Loosely Synchronized Search for Multi-Objective Multi-Agent Path Finding with Asynchronous Actions [J]. J Shanghai Jiaotong Univ Sci, 2024, 29(4): 667-677.
[14] DONG Dejin (董德金), DONG Shiyin (董诗音), ZHANG Lulu (章露露), CAI Yunze (蔡云泽). Multi-AGVs Scheduling with Vehicle Conflict Consideration in Ship Outfitting Items Warehouse [J]. J Shanghai Jiaotong Univ Sci, 2024, 29(4): 725-736.
[15] LI Shuyi (李舒逸), LI Minzhe (李旻哲), JING Zhongliang (敬忠良). Multi-Agent Path Planning Method Based on Improved Deep Q-Network in Dynamic Environments [J]. J Shanghai Jiaotong Univ Sci, 2024, 29(4): 601-612.