Hierarchical Reinforcement Learning Decision-Making for Autonomous Vehicles Based on Risk Sensitivity

Risk-sensitive hierarchical reinforcement learning decision-making for autonomous vehicles

HU Zhilong1, PEI Xiaofei*1,2, ZHOU Honglong1, WEI Weiran2 (1. Hubei Key Laboratory of Advanced Technology of Automotive Components, Wuhan University of Technology, Wuhan 430070, China; 2. Hubei Collaborative Innovation Center of Automotive Components Technology, Wuhan University of Technology, Wuhan, China)

Abstract: To make the behavior decisions of autonomous vehicles fully account for the inherent uncertainty of the traffic environment, this paper introduces quantile regression and Conditional Value at Risk (CVaR) into the traditional Rainbow DQN algorithm. The resulting method takes low-probability risks into consideration and balances risks against benefits, so that it can make safer and more human-like driving decisions. A behavioral decision model was established based on the Markov framework, and the reward function and action space were designed by comprehensively considering safety, efficiency, and comfort. A planning and control model was built, and two scenarios, highway merging and exiting and an intersection, were constructed on the Open Natural Driving Intelligent Vehicle Simulation Test Environment (OnSite) platform. The OnSite evaluation tool was used to simulate and compare four algorithms: Rainbow DQN-CVaR, Rainbow DQN-QR, Rainbow DQN, and DSAC-T. The results show that, in the complex highway merging and exiting scenario and the intersection scenario respectively, the proposed Rainbow DQN-CVaR algorithm scores 55.3% and 47% higher than the traditional Rainbow DQN algorithm, 17.7% and 34.3% higher than the Rainbow DQN-QR algorithm, and 2.8% and 62.7% higher than the DSAC-T algorithm. This verifies the effectiveness of the Rainbow DQN-CVaR behavior decision model: it can make safer and more reasonable decisions in more complex traffic environments, giving the autonomous vehicle higher driving safety and efficiency.

Key words: autonomous driving; reinforcement learning; behavioral decision-making; quantile regression; conditional value at risk (CVaR)
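
To illustrate how CVaR can make a quantile-based value estimate risk-sensitive, as the abstract describes, the following is a minimal sketch and not the authors' implementation. It assumes each action's return distribution is represented by equally weighted quantile estimates and approximates CVaR at level alpha as the mean of the lowest alpha-fraction of those quantiles. The function names `cvar_from_quantiles` and `select_action`, and the example numbers, are hypothetical.

```python
import math


def cvar_from_quantiles(quantiles, alpha):
    """Approximate CVaR_alpha as the mean of the worst alpha-fraction
    of equally weighted quantile estimates (an assumed discretization)."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(quantiles)  # lowest (worst) returns first
    k = max(1, math.ceil(alpha * len(ordered)))  # number of tail quantiles kept
    return sum(ordered[:k]) / k


def select_action(quantiles_per_action, alpha):
    """Pick the action index with the highest CVaR_alpha score."""
    scores = [cvar_from_quantiles(q, alpha) for q in quantiles_per_action]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical quantile estimates for two actions.
# Action 0 has the higher mean return but a rare, severe loss.
example_quantiles = [
    [-10.0, 4.0, 5.0, 6.0],  # action 0: mean 1.25
    [0.0, 1.0, 1.0, 2.0],    # action 1: mean 1.0
]

risk_neutral_choice = select_action(example_quantiles, alpha=1.0)
risk_averse_choice = select_action(example_quantiles, alpha=0.25)
```

With alpha = 1.0 the score reduces to the ordinary mean, so the risk-neutral choice is action 0. With alpha = 0.25 only the worst quantile counts, so the rare loss of action 0 dominates and action 1 is preferred.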

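Autonomous vehicles operating in complex open-road environments still suffer from problems such as overly conservative driving behavior and high manual takeover rates. These problems stem mainly from insufficiently intelligent decision-making and planning, which in turn introduces a series of safety risks, such as collision risk for the ego vehicle and safety risks to the overall traffic flow.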
