AODV routing protocol for UAV ad hoc networks based on distributed Q-learning training

CLC number: TN915-34  Document code: A  Article ID: 1004-373X(2025)15-0103-07

Distributed trained Q-learning based AODV routing protocol for flying ad hoc network

SUN Chen, WANG Yukun, WAN Jiamei, HOU Liang (School of Software, Nanchang Hangkong University, Nanchang 330063, China)

Abstract: Because of the highly dynamic nature and sparse topology of flying ad hoc network (FANET) nodes, existing Q-learning-based routing protocols suffer from delayed Q-value updates and struggle to adapt to rapid changes in network topology. In view of this, a distributed trained Q-learning based AODV (ad hoc on-demand distance vector) routing protocol, abbreviated as DQL-AODV, is proposed. In this protocol, each node is regarded as an agent, and the next hop for the packets to be forwarded is selected according to the Q values obtained by distributed training. The Q value of each node is updated both locally and globally. Firstly, the local reward value is calculated according to the lifetime of the link between nodes and the load capacity of the nodes; each time a Hello message is received, the more stable next-hop link is updated to a higher Q value. Secondly, when the routing request message reaches the destination node, a global Q-value update is performed, and the global reward value is calculated according to the number of forwarding hops and the average end-to-end delay of the packet. Finally, the Q-learning algorithm is used to optimize the Hello message sending mechanism, which effectively balances the degree of network topology awareness and the routing overhead. Simulation results show that, in comparison with QL-AODV, the average end-to-end delay, data throughput, packet arrival rate and routing cost of the proposed method are improved by 19.93%, 15.48%, 6.24% and 11.76%, respectively. In addition, the proposed method has stronger convergence capability. To sum up, the effectiveness of the protocol has been verified.

Keywords: FANET; AODV routing protocol; Q-learning distributed training; link quality; Hello message; routing decisi
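
The two-level Q-value update summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the reward formulas, so everything specific here is an assumption made for illustration: the equal weighting `w`, the normalization by `max_lifetime`, `capacity`, `max_hops` and `max_delay`, the values of `ALPHA` and `GAMMA`, and the names `local_reward`, `global_reward`, `q_update` and `Node`. The update rule is the standard tabular Q-learning rule and may differ from the one used in DQL-AODV.

```python
ALPHA = 0.5  # learning rate (assumed value, not from the paper)
GAMMA = 0.8  # discount factor (assumed value, not from the paper)


def local_reward(link_lifetime, max_lifetime, load, capacity, w=0.5):
    """Assumed local reward: higher for long-lived links to lightly loaded nodes."""
    stability = min(link_lifetime / max_lifetime, 1.0)
    spare = 1.0 - min(load / capacity, 1.0)
    return w * stability + (1.0 - w) * spare


def global_reward(hops, avg_delay, max_hops, max_delay, w=0.5):
    """Assumed global reward: higher for fewer hops and lower end-to-end delay."""
    hop_score = 1.0 - min(hops / max_hops, 1.0)
    delay_score = 1.0 - min(avg_delay / max_delay, 1.0)
    return w * hop_score + (1.0 - w) * delay_score


def q_update(q_old, reward, best_next_q, alpha=ALPHA, gamma=GAMMA):
    """Standard tabular Q-learning update."""
    return (1.0 - alpha) * q_old + alpha * (reward + gamma * best_next_q)


class Node:
    """One agent; keeps a Q value per (destination, next hop) pair."""

    def __init__(self):
        self.q = {}  # maps (destination, next_hop) -> Q value

    def update(self, dest, next_hop, reward, best_next_q=0.0):
        key = (dest, next_hop)
        self.q[key] = q_update(self.q.get(key, 0.0), reward, best_next_q)

    def best_next_hop(self, dest):
        # Pick the neighbor with the highest Q value for this destination.
        candidates = {nh: q for (d, nh), q in self.q.items() if d == dest}
        return max(candidates, key=candidates.get) if candidates else None


node = Node()
# Local updates, as if triggered by Hello messages from neighbors B and C.
node.update("D", "B", local_reward(8.0, 10.0, 2.0, 10.0))
node.update("D", "C", local_reward(2.0, 10.0, 7.0, 10.0))
# Global update, as if a route request via B had reached destination D.
node.update("D", "B", global_reward(3, 0.02, 10, 0.1))
print(node.best_next_hop("D"))  # prints B
```

In this toy run the link to B is longer-lived and less loaded than the link to C, so B ends up with the higher Q value and is chosen as the next hop toward D.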

0 Introduction

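UAV swarm technology has been widely applied in scenarios constrained by complex terrain and traffic conditions, such as agricultural crop-protection spraying and forest fire prevention and control [1-2]. Carrying out such tasks depends heavily on the flying ad hoc network (FANET) to ensure stable and reliable communication in environments with high mobility and sparse node distribution.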
