Research on the Optimization of Humanitarian Emergency Material Allocation Based on Reinforcement Learning
ZHANG Jianjun, YANG Yundan, ZHOU Yizhuo
(School of Economics and Management, Tongji University, Shanghai 200092, China)
Abstract: The efficient allocation of limited humanitarian aid supplies after major emergencies is a critical research topic, aiming to meet the material needs of affected areas while reducing the suffering of disaster victims. This paper formulates the problem as a Mixed Integer Nonlinear Programming (MINLP) model whose solution is a multi-period dynamic allocation strategy. Reinforcement Learning (RL), one of the two mainstream methods for current strategy exploration, is particularly suitable for dynamic resource allocation scenarios because it scales well and adapts to external dynamics through interaction with the environment and feedback signals. We employ the Dueling DQN algorithm to solve for the optimal policy, overcoming the overestimation of Q-values that has been a drawback of previous RL applications to humanitarian aid distribution and estimating the action-value function for affected regions more accurately. Additionally, the paper introduces a novel stochastic demand assumption, which makes the model more realistic and valid by better reflecting the actual conditions of disaster scenarios. The effectiveness of the proposed method is demonstrated with a numerical example based on the Ya'an earthquake, making this the first study to substantiate the optimization of emergency resource allocation with RL using real data sources. Comparative analysis shows that the Dueling DQN algorithm reduces the total cost by approximately 5% compared with traditional DQN methods, indicating a more effective reduction in the suffering of affected populations. This aligns with China's "people-oriented" rescue principle and holds significant theoretical and practical implications for humanitarian-based emergency responses.
Key words: deep reinforcement learning; humanitarian; emergency supplies distribution
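For readers unfamiliar with Dueling DQN, the sketch below illustrates the standard dueling aggregation, in which the action value is assembled from a state value and per-action advantages as Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a'). It is a minimal illustration and not the paper's implementation: the function name `dueling_q_values` and all numbers are hypothetical, and in a real agent both V and A would be outputs of a neural network rather than fixed values.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a), using the mean advantage as baseline."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


# Hypothetical values: one state value and advantages for three allocation actions.
q_values = dueling_q_values(2.0, [1.0, 0.0, -1.0])

# Greedy action selection over the resulting Q-values.
best_action = max(range(len(q_values)), key=lambda i: q_values[i])
```

Subtracting the mean advantage makes the decomposition identifiable: the average of the Q-values over actions equals the state value, so V and A cannot drift by offsetting constants.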
0 Introduction
After a major emergency, saving lives and alleviating the suffering of the affected population are the primary goals of disaster relief.