A Hierarchical Twin-Delayed Policy Gradient Reinforcement Learning Method for Intelligent Cooperative Control of Aircraft


CLC number: TJ765  Document code: A

DOI: 10.7652/xjtuxb202509009  Article ID: 0253-987X(2025)09-0088-11

Abstract: To address the modeling and coordination challenges that large-scale systems, complex environments, and resource constraints pose for intelligent cooperative control of aircraft, this study proposes an intelligent cooperative control method built on a hierarchical multi-agent decision-making architecture, with the goal of improving the efficiency of the decision-making algorithm. First, each aircraft is modeled as an intelligent agent to establish a cooperative control framework. Second, a partially observable Markov decision process (POMDP) model is employed to handle incomplete observation information. Then, to tackle dynamic game environments and high learning costs, a hierarchical twin-delayed policy gradient reinforcement learning method based on centralized training with decentralized execution is proposed, which combines model-based and model-free mechanisms to leverage existing models of game environment evolution. Finally, under the hierarchical decision-making framework, simulations of typical multi-aircraft game scenarios and thousands of multi-scenario tests are conducted. The results demonstrate that the proposed method successfully resolves the multi-aircraft cooperative control problem. Compared with the multi-agent reinforcement learning algorithms MAPPO and QMIX, the training time is reduced by 51.03% and 79.03%, algorithm efficiency (cumulative reward) is improved by 37.51% and 58.73%, and the evasion maneuver success rate is increased by 17.63% and 39.79%, respectively.

Keywords: intelligent decision-making; multi-aircraft intelligent cooperative control; hierarchical decision; reinforcement learning
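The abstract names a twin-delayed policy gradient method but does not spell out its update rule. As a point of orientation only, the following minimal sketch shows the target value used in the standard twin-delayed scheme (TD3): the smaller of two target critics is bootstrapped, and clipped noise is added to the target action. It assumes a scalar action and a single transition. The function name `twin_delayed_target`, the callables `target_actor`, `target_q1`, `target_q2`, and all hyperparameter defaults are hypothetical choices for illustration, not details taken from the paper, and the hierarchical and multi-agent parts of the proposed method are not modeled here.

```python
import random


def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def twin_delayed_target(reward, done, next_obs,
                        target_actor, target_q1, target_q2,
                        gamma=0.99, noise_std=0.2, noise_clip=0.5,
                        act_limit=1.0, rng=random):
    """Standard TD3-style target for one transition (illustrative sketch).

    All argument names and defaults are assumptions for this example.
    """
    # Target policy smoothing: perturb the target action with clipped noise.
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    next_act = clip(target_actor(next_obs) + noise, -act_limit, act_limit)
    # Clipped double-Q: use the smaller of the two target critic estimates
    # to counteract overestimation of action values.
    q_min = min(target_q1(next_obs, next_act), target_q2(next_obs, next_act))
    # No bootstrapping past a terminal state.
    return reward + gamma * (0.0 if done else 1.0) * q_min


# Toy usage with hand-written stand-ins for the target networks.
actor = lambda obs: 0.3
q1 = lambda obs, act: 2.0 + act
q2 = lambda obs, act: 1.0 + act
y = twin_delayed_target(1.0, False, None, actor, q1, q2, noise_std=0.0)
```

In the standard scheme, the "delayed" part means that the actor and the target networks are updated less often than the critics (for example, once every two critic updates); that scheduling is omitted from the sketch.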

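Multi-aircraft cooperative control systems are evolving rapidly from independent single-platform control toward cross-domain intelligent gaming, with technical breakthroughs centered on the deep coupling of cyber-physical systems, the restructuring of intelligent decision-making architectures, and innovation in swarm coordination mechanisms [1].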


Journal of Xi'an Jiaotong University

2025, No. 9
