Infrared video action recognition using attention-fused multi-residual inflated 3D convolution

Keywords: infrared video; attention fusion; multi-scale; action recognition
CLC number: TP391.4 Document code: A Article ID: 1001-3695(2025)10-036-3174-08
doi:10.19734/j.issn.1001-3695.2025.01.0030
Infrared video human action recognition based on attention integrated multi-residual inflated 3D ConvNet
Yuan Shuai, Yu Lei†, Wang Jiajie, Xiong Bangshu (Key Laboratory of Image Processing & Pattern Recognition of Jiangxi Province, Nanchang Hangkong University, Nanchang, China)
Abstract: Infrared videos are vulnerable to background interference and lack rich temporal variations due to their physical properties, leading to poor human action recognition. To address this issue, this paper proposed a method combining attention mechanisms with multi-residual dilated 3D convolution for infrared video action recognition. Firstly, it designed the integrated multi-dimensional attention (IMDA) to handle background interference and capture key areas in action frames by integrating spatial-temporal and channel information. Secondly, it designed the temporal pyramid split attention (TPSA) to capture dynamic information across multiple scales by splitting attention in the temporal dimension. Based on TPSA, this paper further developed the temporal pyramid split attention multi-residual (TSAMR) module to enhance the model's generalization in processing infrared videos under different background conditions. Experiments show that this method achieves accuracy of 81.73% and 90.76% on the IITR-IAR and campus safety action datasets, outperforming existing action recognition methods.
Keywords: infrared video; multi-dimensional attention; multi-scale; action recognition
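0 Introduction
Human action recognition, that is, analyzing collected data in depth and accurately classifying the various actions and postures of the human body, is a cornerstone on which many intelligent services build their functionality.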