Research on Audio Event Classification with a Spatio-Temporal Variation Attention Mechanism Graph Neural Network
CLC number: TP18    Document code: A
Article ID: 2096-4706(2025)16-0057-07
Research on Audio Event Classification Based on Graph Neural Network with Spatio-Temporal Variation Attention Mechanism
ZHANG Mohua, LIU Ji (School of Computer and Information Engineering, Henan University of Economics and Law, Zhengzhou 450046, China)
Abstract: Audio event classification faces challenges in complex scenarios, and existing methods struggle to capture temporal relationships effectively. To address this, this paper proposes a Spatio-Temporal Variation Attention based Graph Neural Network (STVA-GNN), which models audio-visual segments as sequential graph nodes and leverages a Negative Attention Mechanism to compute spatiotemporal variation features between adjacent nodes, enhancing intra-modal and cross-modal dynamic information interactions. The core innovations are twofold: a Contextual Information Compensation Module (CICM) captures spatiotemporal evolution patterns, and a Cross-Modal Graph Variation Incentive Module (CMGVI) enhances audio node weights using video-modal spatiotemporal variations for deep fusion. Experimental results on the AudioSet dataset demonstrate that STVA-GNN achieves mAP and AUC scores of 0.56 and 0.94, respectively, outperforming mainstream methods. Additionally, it maintains a significant advantage in noisy environments, verifying its robustness.
Keywords: audio event classification; Spatio-Temporal Variation Attention Mechanism; Temporal Graph Neural Network; change information compensation; cross-modal information fusion
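0 Introduction
Audio signals are a key carrier of information in the real world, and in the field of artificial intelligence in particular, audio event analysis has become a frontier research hotspot.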