Multi-UAV Attack-Defense Confrontation Strategy Based on the APIQ Algorithm


CLC number: TP181 Document code: A DOI: 10.12305/j.issn.1001-506X.2025.07.14

Abstract: In multi-UAV confrontation environments, the large number of unmanned aerial vehicles (UAVs) means that conventional deep reinforcement learning methods may suffer from problems such as dimension explosion of the value function and difficult convergence of the policy network. To address this, the attention policy interaction Q-learning (APIQ) swarm adversarial algorithm, based on value decomposition and an attention mechanism, is proposed. Value decomposition is introduced to alleviate the dimension explosion of the value function, and the weight of each value in the decomposition is assigned by the attention mechanism, which promotes convergence of the policy network. To verify the feasibility of the APIQ algorithm for the multi-UAV confrontation problem, a realistic environment model is established and the algorithm is validated by simulation. Comparison with other algorithms shows that UAVs controlled by the APIQ algorithm achieve a higher victory rate in confrontation.

Keywords: multi-unmanned aerial vehicle (UAV); reinforcement learning; value-decomposition network (VDN); attention mechanism; maneuver decision-making
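
The abstract's central idea, weighting each per-UAV value in the decomposition by an attention score, can be illustrated with a minimal sketch. This is not the APIQ network itself, whose architecture is not given in this excerpt. It assumes scaled dot-product attention between a single query vector and one key vector per UAV, and all names and numbers (`attention_weights`, `mix_q_values`, the query, the keys, the Q-values) are hypothetical. Plain VDN corresponds to the special case in which every weight equals 1.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Scaled dot-product attention of one query over per-UAV key vectors.

    Assumption: the query stands in for some embedding of the global state
    and each key for an embedding of one UAV; the excerpt does not say how
    APIQ actually builds them.
    """
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    return softmax(scores)


def mix_q_values(agent_qs, weights):
    """Joint value as a weighted sum of the per-UAV values."""
    return sum(w * q for w, q in zip(weights, agent_qs))


# Hypothetical values for three UAVs (illustration only).
agent_qs = [1.0, 2.0, 3.0]
query = [1.0, 0.0]
keys = [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]  # identical keys give equal weights

weights = attention_weights(query, keys)
q_tot = mix_q_values(agent_qs, weights)          # attention-weighted joint value
q_vdn = mix_q_values(agent_qs, [1.0, 1.0, 1.0])  # plain VDN: unweighted sum
```

Because the joint value is a sum over per-UAV terms, the learner never has to represent a value for every joint action, which is how value decomposition sidesteps the dimension explosion mentioned in the abstract.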

0 Introduction

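Owing to their low cost, high safety, and low failure rate, unmanned aerial vehicles (UAVs) adapt well to unknown, complex, and changing battlefield environments [1-5]. As a result, UAVs are increasingly widely used in the military domain, and how to counter UAVs has also become a research hotspot in related fields.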


Systems Engineering and Electronics

Issue 07, 2025
