Research on Multi-unmanned Vehicle Path Planning Based on the APF-MASAC Algorithm
Authors: YAN Dongmei, YANG Nanyu, XU Jiajia, LIU Lei
Affiliations:

1. Anhui Sanlian University, Key Laboratory of Traffic Information and Safety of Anhui Higher Education Institutes; 2. School of Modern Posts, Nanjing University of Posts and Telecommunications; 3. School of Mathematics, Hohai University, Nanjing, Jiangsu 211100, China

Fund Project:

Autonomous driving decision-making system based on deep reinforcement learning

    Abstract:

    This paper addresses path planning for multiple unmanned vehicles in real-world environments and proposes an algorithm design within the Multi-Agent Soft Actor-Critic (MASAC) framework. The algorithm is improved in three respects. First, a dense reward function is designed using potential-based reward shaping, which gives the learning process richer and more timely feedback and thereby accelerates convergence. Second, the conventional experience replay buffer is modified with a double-consecutive-frame technique: two consecutive frames of observation data are stored in the buffer as a single unit, which captures how the environment state changes over time and improves training efficiency and stability. Third, a realistic dynamic-obstacle environment is built on the Gazebo simulation platform to supply diverse and challenging training samples, so that the algorithm can be trained under conditions close to real ones. Finally, ablation experiments and robustness tests verify the effectiveness of the algorithm.
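The abstract names two mechanisms without giving their formulas: potential-based reward shaping and a replay buffer that stores two consecutive frames as one unit. The following is a minimal Python sketch of how these ideas are commonly realized, not the authors' implementation. The attractive potential `potential`, the gain `k_att`, the discount `gamma`, the class name `TwoFrameReplayBuffer`, and the choice to duplicate the first frame of an episode are all assumptions made for illustration; the multi-agent MASAC update itself is not modeled.

```python
import math
import random
from collections import deque


def potential(pos, goal, k_att=1.0):
    # Hypothetical attractive potential: rises toward 0 as the vehicle nears the goal.
    return -k_att * math.dist(pos, goal)


def shaped_reward(base_reward, pos, next_pos, goal, gamma=0.99):
    # Potential-based shaping: r' = r + gamma * phi(s') - phi(s).
    return base_reward + gamma * potential(next_pos, goal) - potential(pos, goal)


class TwoFrameReplayBuffer:
    """Replay buffer whose states are (previous frame + current frame)."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are evicted first
        self.prev_obs = None
        self.cur_obs = None

    def reset(self, first_obs):
        # Assumption: at episode start there is no earlier frame, so duplicate the first.
        self.prev_obs = tuple(first_obs)
        self.cur_obs = tuple(first_obs)

    def add(self, action, reward, next_obs, done):
        next_obs = tuple(next_obs)
        state = self.prev_obs + self.cur_obs        # two consecutive frames, concatenated
        next_state = self.cur_obs + next_obs
        self.buffer.append((state, action, reward, next_state, done))
        self.prev_obs, self.cur_obs = self.cur_obs, next_obs

    def sample(self, batch_size):
        # Uniform sampling without replacement from the stored transitions.
        return random.sample(list(self.buffer), batch_size)


# Usage: one vehicle moving along the x-axis toward a goal at (3, 0).
goal = (3.0, 0.0)
buf = TwoFrameReplayBuffer(capacity=1000)
obs = (0.0, 0.0)
buf.reset(obs)
for step in range(3):
    next_obs = (obs[0] + 1.0, 0.0)
    done = next_obs == goal
    r = shaped_reward(-0.01, obs, next_obs, goal)
    buf.add(action=(1.0, 0.0), reward=r, next_obs=next_obs, done=done)
    obs = next_obs
```

Because each stored state contains the previous and current frame, a network reading it can infer displacement between frames, which is one plausible way to expose the motion of dynamic obstacles. The shaping term gives a nonzero signal at every step even when the base reward is sparse.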

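Cite this article

YAN Dongmei, YANG Nanyu, XU Jiajia, LIU Lei. Research on Multi-unmanned Vehicle Path Planning Based on the APF-MASAC Algorithm[J]. Journal of Nanjing University of Information Science & Technology,,():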

History
  • Received: 2024-12-15
  • Last revised: 2025-03-24
  • Accepted: 2025-03-24
