Research on Multi-unmanned Vehicle Path Planning Based on the APF-MASAC Algorithm
Author:
Affiliation:

1. Nanjing University of Posts and Telecommunications; 2. College of Science, Hohai University, Nanjing 211100, Jiangsu, China; 3. Key Laboratory of Traffic Information and Safety of Higher Education Institutes

Fund Project:

Autonomous driving decision-making system based on deep reinforcement learning

    Abstract:

    Aiming at the path planning problem of multiple unmanned vehicles in real environments, this paper proposes an algorithm design scheme under the Multi-Agent Soft Actor-Critic (MASAC) framework and optimizes and verifies it in three respects. First, a dense reward function is designed using potential-based reward shaping, providing richer, more timely, and more effective feedback signals to the learning process and thereby significantly accelerating convergence. Second, the conventional experience replay buffer is improved with a double-consecutive-frame scheme that stores two consecutive observation frames as a single unit, effectively capturing the dynamics of environmental state changes and improving training efficiency and stability. Third, a highly realistic dynamic-obstacle environment is built on the Gazebo simulation platform, supplying diverse and challenging training samples and ensuring that the algorithm can be fully trained and optimized under near-real conditions. Finally, the effectiveness of the algorithm is verified through ablation experiments and robustness tests.
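
    The potential-based reward shaping mentioned above adds a term of the form r' = r + γΦ(s') − Φ(s), which densifies feedback without changing the optimal policy. A minimal sketch, assuming the potential Φ is the negative Euclidean distance from the vehicle's position to its goal (the function names and the γ value here are illustrative, not from the paper):

```python
import math

GAMMA = 0.99  # discount factor (assumed value)


def potential(state, goal):
    """Potential Phi(s): negative Euclidean distance to the goal.

    States closer to the goal have higher potential.
    """
    return -math.dist(state, goal)


def shaped_reward(base_reward, state, next_state, goal, gamma=GAMMA):
    """Potential-based shaping: r' = r + gamma * Phi(s') - Phi(s).

    This additive form is known to preserve the optimal policy while
    giving the agent a dense progress signal at every step.
    """
    return base_reward + gamma * potential(next_state, goal) - potential(state, goal)
```

    With this shaping, a step that moves the vehicle toward its goal receives a positive bonus and a step that moves it away receives a penalty, even when the sparse base reward is zero.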
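
    The double-consecutive-frame replay buffer can likewise be sketched: instead of storing single observations, each stored transition stacks the previous and current observation so a sampled state carries short-term dynamics (e.g. the apparent velocity of a moving obstacle). The class and method names below are illustrative assumptions, not the paper's implementation:

```python
import random
from collections import deque


class TwoFrameReplayBuffer:
    """Replay buffer whose unit of storage stacks two consecutive
    observation frames, so each sampled state encodes how the
    environment is changing, not just a static snapshot."""

    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)
        self.prev_obs = None  # most recent observation in the current episode

    def push(self, obs, action, reward, next_obs, done):
        if self.prev_obs is not None:
            # Concatenate the previous and current frames into one state unit.
            stacked = self.prev_obs + obs
            stacked_next = obs + next_obs
            self.buffer.append((stacked, action, reward, stacked_next, done))
        # Reset at episode boundaries so frames never straddle two episodes.
        self.prev_obs = None if done else obs

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)
```

    The episode-boundary reset matters: stacking the last frame of one episode with the first frame of the next would fabricate motion that never occurred.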

History
  • Received: December 15, 2024
  • Revised: March 24, 2025
  • Accepted: March 24, 2025