Weakly Supervised Temporal Action Localization with Dual-Stream Feature Enhancement and Fusion

Keywords: weak supervision; temporal action localization; dilated convolution; dual-stream fusion

CLC number: TP391.4  Document code: A  Article ID: 1001-3695(2025)07-039-2213-07

doi: 10.19734/j.issn.1001-3695.2024.09.0373

Abstract: Weakly supervised temporal action localization aims to classify and locate action instances in untrimmed videos using only video-level labels. Existing models typically use pre-trained feature extractors to extract segment-level RGB and optical flow features from videos, but the pre-extracted segment-level video features only cover short time spans and do not consider the complementarity and correlation between RGB and optical flow, which affects the accuracy of localization. To this end, this paper proposes a weakly supervised temporal action localization model with dual-stream feature enhancement and fusion. Firstly, it expands the receptive field through a multi-scale dense dilated convolution, allowing the model to cover multiple time spans and capture the temporal dependencies between video segments, resulting in enhanced RGB and optical flow features. Then, it utilizes a convolutional network to adaptively extract key features from the enhanced RGB and optical flow features for fusion, achieving complementary correlation between RGB and optical flow features, further enriching the video feature representation and improving the localization accuracy of the model. The model achieves detection accuracies of 73.9% and 43.5% on the THUMOS14 and ActivityNet1.3 datasets respectively, outperforming the existing state-of-the-art models, which demonstrates the effectiveness of the proposed model.

Key words: weak supervision; temporal action localization; dilated convolution; dual-stream fusion

0 Introduction

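Temporal action localization is a key task in video understanding and plays an important role in practical applications such as video surveillance, anomaly detection, and video retrieval.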
