Research on forest fire video recognition based on improved Vision Transformer

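Keywords: forest fire; deep learning; object detection; three-dimensional convolutional neural network; Vision Transformer
CLC number: S762; Document code: A; Article ID: 1000-2006(2025)04-0186-09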
Research on forest fire video recognition based on improved Vision Transformer
ZHANG Min, XIN Ying*, HUANG Tianqi
(College of Mechanical and Electrical Engineering, Northeast Forestry University, Harbin 150040, China)
Abstract: 【Objective】To address the limitations of existing forest fire recognition algorithms in temporal feature utilization and computational efficiency, this study proposes a video-based recognition model (C3D-ViT) to enhance both detection accuracy and operational efficiency in practical forest monitoring scenarios. 【Method】We present a hybrid architecture integrating 3D convolutional neural networks (3DCNN) with the Vision Transformer (ViT). The framework employs 3D convolution kernels to extract spatiotemporal features from video sequences, which are subsequently tokenized into vector representations. The Vision Transformer's self-attention mechanism then globally models feature relationships across the temporal and spatial dimensions, and final classification is performed by the MLP Head layer. Comprehensive ablation studies and comparative experiments were conducted against ResNet50, LSTM, YOLOv5, and the baseline 3DCNN and ViT models. 【Result】C3D-ViT achieves 96.10% accuracy, outperforming ResNet50 (89.07%), LSTM (93.26%), and YOLOv5 (91.46%), and also improves on the original 3DCNN (93.91%) and Vision Transformer (90.43%). The improved C3D-ViT model maintains high recognition accuracy and stability under unfavorable conditions such as occlusion, long distance, and thin smoke, and can meet the demand for real-time detection. 【Conclusion】The C3D-ViT framework effectively addresses spatiotemporal modeling challenges in wildfire detection through synergistic CNN-Transformer interaction, providing a technically viable solution for next-generation forest fire early warning systems.
Keywords: forest fire; deep learning; object detection; 3DCNN; Vision Transformer (ViT)
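The tokenization step described in the Method can be made concrete with a little shape arithmetic. The sketch below is only an illustration under assumed hyperparameters: the clip size, the two convolution stages, their kernels, strides, and channel counts, and the use of an extra class token are all hypothetical and are not taken from the paper. It traces a video clip through strided 3D convolutions using the standard output-size formula, floor((n + 2p - k) / s) + 1, and counts how many tokens would reach the Transformer encoder if each remaining spatiotemporal position becomes one token.

```python
from dataclasses import dataclass


def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length along one convolution axis: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


@dataclass
class Conv3dStage:
    # All values are illustrative assumptions, not the paper's configuration.
    out_channels: int
    kernel: tuple   # (kt, kh, kw)
    stride: tuple   # (st, sh, sw)
    padding: tuple  # (pt, ph, pw)


def token_grid(clip_shape, stages):
    """Trace a (frames, height, width) clip through the 3D-conv stages.

    Returns (num_tokens, token_dim): each remaining spatiotemporal position
    becomes one token whose length is the final channel count.
    """
    t, h, w = clip_shape
    channels = None
    for stage in stages:
        t = conv_out(t, stage.kernel[0], stage.stride[0], stage.padding[0])
        h = conv_out(h, stage.kernel[1], stage.stride[1], stage.padding[1])
        w = conv_out(w, stage.kernel[2], stage.stride[2], stage.padding[2])
        channels = stage.out_channels
    return t * h * w, channels


# Hypothetical setup: a 16-frame 112x112 clip and two strided 3D-conv stages.
stages = [
    Conv3dStage(64, (3, 7, 7), (2, 4, 4), (1, 3, 3)),
    Conv3dStage(256, (3, 3, 3), (2, 2, 2), (1, 1, 1)),
]
num_tokens, token_dim = token_grid((16, 112, 112), stages)
seq_len = num_tokens + 1  # assumes one class token, as in the standard ViT
print(num_tokens, token_dim, seq_len)
```

With these assumed settings the feature volume shrinks to 4 x 14 x 14 positions, giving 784 tokens of dimension 256 and a sequence length of 785. Because self-attention cost grows quadratically with sequence length, the stride choices in the convolutional front end directly affect the real-time performance the abstract refers to.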
Forests are an important component of terrestrial ecosystems and play an important role in global ecology, the carbon cycle, and climate change [1].