An Improved Dynamic Exploration Noise Algorithm Based on Constrained TD3

CLC number: TP181; TP301.6; TP242  Document code: A  Article ID: 2096-4706(2025)07-0103-06

Abstract: To address the problem that unconstrained exploration may damage a mobile car, this study proposes a Reinforcement Learning method that combines adaptive noise exploration with Lagrange multiplier constraints, aiming to optimize the trajectory planning of the car as it travels to the target point. The method improves exploration efficiency by dynamically adjusting the noise, uses the TD3 algorithm to handle the continuous action space, and uses the Lagrange multiplier method to handle the constraints, which differs from adding a penalty for undesired behavior directly in the Markov Decision Process (MDP). Simulation experiments show that the method effectively guides the car to avoid obstacles, reduces constraint violations, and ensures the safety and reliability of the task, while showing good training convergence characteristics.

Keywords: Safe Reinforcement Learning; Constrained Markov Decision Process; trajectory planning; TD3 algorithm
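
The two ingredients named in the abstract, a Lagrange multiplier for the constraint and exploration noise that changes during training, can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names, the learning rate, the cost limit, and the linear noise schedule are all hypothetical, since the excerpt does not give the paper's actual update rules.

```python
import random


def update_lagrange_multiplier(lam, avg_cost, cost_limit, lr=0.05):
    """Dual ascent step (assumed form): raise lam when the average
    constraint cost exceeds the limit, lower it otherwise, keep lam >= 0."""
    return max(0.0, lam + lr * (avg_cost - cost_limit))


def penalized_objective(q_reward, q_cost, lam):
    """Lagrangian the actor would maximize: reward value minus
    lam times the cost value."""
    return q_reward - lam * q_cost


def noise_scale(step, sigma_start=0.3, sigma_end=0.05, decay_steps=10_000):
    """Hypothetical schedule: linearly anneal the standard deviation of the
    exploration noise from sigma_start to sigma_end over decay_steps."""
    frac = min(1.0, step / decay_steps)
    return sigma_start + frac * (sigma_end - sigma_start)


def explore(action, step, low=-1.0, high=1.0, rng=random):
    """Add Gaussian noise with the annealed scale, then clip the result
    to the action bounds."""
    noisy = action + rng.gauss(0.0, noise_scale(step))
    return min(high, max(low, noisy))
```

In this sketch the multiplier grows while the measured cost stays above the limit, so constraint violations weigh more heavily in the actor's objective over time, whereas a fixed penalty written into the MDP reward would keep the same weight throughout training.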

0 Introduction

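With the rapid development of automation technology, robotics has been widely applied in industrial manufacturing, the service industry, and many other fields [1], becoming a key factor in improving operational efficiency and precision.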

Modern Information Technology (现代信息科技), 2025, Issue 07
