Research on Agent Path Planning Based on an Improved Q-Learning Algorithm

Keywords: Q-Learning; path planning; dynamic learning rate; Monte Carlo Tree Search; heuristic reward
CLC number: TP391.4; G434  Document code: A  Article ID: 1006-8228(2025)11-01-04

Path Planning for Agents Based on Improved Q-Learning Algorithm

Liu Shuo¹, Dong Xisong², Zhao Wei¹

(1. School of Rail Transportation, Shandong Jiaotong University, Jinan, Shandong 250000, China;

2. State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences)

Abstract: With the growing demand for path planning by agents in complex and dynamic environments, the limitations of traditional Q-Learning algorithms, such as slow convergence, low obstacle avoidance efficiency, and limited global optimization capability, have become increasingly apparent. To address these shortcomings of Q-Learning in path planning, this paper proposes an improved method that integrates a dynamic learning rate, an adaptive exploration rate, and Monte Carlo Tree Search (MCTS). First, an exponentially decaying dynamic learning rate and exploration rate are introduced to balance exploration ability in the early training stage with policy stability in later stages. Second, MCTS is combined with Q-Learning to leverage its global search capability for optimizing the Q-value update process. Furthermore, a heuristic function is incorporated to improve the reward mechanism, guiding the agent more efficiently toward the target. Experimental results demonstrate that the enhanced algorithm significantly outperforms the conventional approach in terms of average path steps, convergence speed, and stability. This study offers an efficient and robust solution for agent path planning in complex environments.

Keywords: Q-Learning; Path Planning; Dynamic Learning Rate; Monte Carlo Tree Search; Heuristic Reward
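
The abstract names exponentially decaying learning and exploration rates and a heuristic-guided reward, but this excerpt does not give the paper's formulas. The following is therefore only a minimal sketch of how such a scheme might look on a small grid world. It makes several assumptions: the decay form `floor + (initial - floor) * exp(-rate * episode)`, a Manhattan-distance shaping term, and all numeric constants, grid layout, and function names (`decayed`, `step`, `train`, `greedy_path`) are hypothetical. The MCTS component described in the abstract is deliberately omitted.

```python
import math
import random

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # four axis-aligned unit moves


def decayed(initial, floor, rate, episode):
    """Exponential decay from `initial` toward `floor` (assumed schedule form)."""
    return floor + (initial - floor) * math.exp(-rate * episode)


def manhattan(a, b):
    """Manhattan distance, used here as an assumed heuristic."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def step(state, action, size, obstacles, goal):
    """Apply one move; return (next_state, reward, done). Rewards are illustrative."""
    nxt = (state[0] + action[0], state[1] + action[1])
    in_grid = 0 <= nxt[0] < size and 0 <= nxt[1] < size
    if not in_grid or nxt in obstacles:
        return state, -5.0, False  # blocked move: stay in place, pay a penalty
    if nxt == goal:
        return nxt, 100.0, True
    # Heuristic shaping: bonus for moving closer to the goal, penalty for moving away.
    shaping = manhattan(state, goal) - manhattan(nxt, goal)
    return nxt, -1.0 + 0.5 * shaping, False


def train(size=5, obstacles=frozenset({(1, 1), (2, 3), (3, 1)}),
          start=(0, 0), goal=(4, 4), episodes=500, gamma=0.9, seed=0):
    """Tabular Q-learning with per-episode decayed alpha and epsilon."""
    rng = random.Random(seed)
    q = {}  # maps (state, action_index) -> estimated value, default 0.0
    for episode in range(episodes):
        alpha = decayed(0.9, 0.1, 0.01, episode)     # learning rate
        epsilon = decayed(1.0, 0.05, 0.01, episode)  # exploration rate
        state = start
        for _ in range(200):  # cap on steps per episode
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
            nxt, reward, done = step(state, ACTIONS[a], size, obstacles, goal)
            best_next = 0.0 if done else max(
                q.get((nxt, i), 0.0) for i in range(len(ACTIONS)))
            old = q.get((state, a), 0.0)
            # Standard Q-learning update with the current decayed alpha.
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q


def greedy_path(q, size=5, obstacles=frozenset({(1, 1), (2, 3), (3, 1)}),
                start=(0, 0), goal=(4, 4), limit=50):
    """Follow the greedy policy from `start` for at most `limit` moves."""
    path, state = [start], start
    for _ in range(limit):
        if state == goal:
            break
        a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
        state, _, _ = step(state, ACTIONS[a], size, obstacles, goal)
        path.append(state)
    return path
```

Calling `greedy_path(train())` returns the sequence of cells visited by the learned greedy policy. Comparing path lengths with the shaping term removed, or with constant `alpha` and `epsilon`, is one simple way to probe the kind of effect the abstract reports.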

0 Introduction

Amid the current surge of intelligent technologies, the intelligent transformation of manufacturing has become an urgent task.
