Graph-Based Sparse Attention Instance Segmentation Algorithm


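Keywords: vision graph neural network; sparse matrix; attention mechanism; instance segmentation; ablation experiment; global sparse visual graph attention
CLC number: TN911.73-34; TP391.41  Document code: A  Article ID: 1004-373X(2026)01-0086-09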

Graph based sparse attention instance segmentation algorithm

LI Sen, WANG Xiaoming (School of Computer and Software Engineering, Xihua University, Chengdu 610039, China)

Abstract: The vision graph neural network (vision GNN) provides a new paradigm for computer vision. However, the foundational model ViG faces high computational costs due to its reliance on the K-nearest neighbor (KNN) algorithm for graph construction. In the subsequently proposed MobileViG, sparse visual graph attention (SVGA) is utilized to improve efficiency, but its static graph construction limits the acquisition of global features. To address these issues, this paper proposes a global sparse visual graph attention (GSVGA) model. In the GSVGA model, a new sparse matrix is constructed by connecting pixel nodes across rows, columns, and diagonals of the image, so as to eliminate the feature loss in SVGA and avoid the time-consuming tensor reshaping operations. A max-relative graph convolution (MRConv) is introduced, implementing feature aggregation by rolling operations in the left, right, and downward directions. The receptive field is expanded in combination with dilated convolution, so as to obtain more feature information. In addition, the module integrates the updated feed-forward network and selects the GELU activation function to improve the stability of feature transformation under large-scale calculation. Experimental results show that GSVGA performs well on the COCO2017, iSAID, and BDD100K datasets. In terms of inference efficiency, the time consumption of GSVGA decreases significantly as the number of iterations increases. Its accuracy on ImageNet-1K is 88.12%, its mAP@0.5:0.95 for instance segmentation of medium objects is 0.6751, and its test set error is the smallest among the compared algorithms. The visualization results show that GSVGA can segment single and multiple instances more accurately, and has strong generalization ability and small object detection ability.

Keywords: vision GNN; sparse matrix; attention mechanism; instance segmentation; ablation experiment; GSVGA
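
The roll-based max-relative aggregation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single-channel feature map stored as nested lists, circular (wrap-around) shifts, and a hypothetical offset list covering row, column, and diagonal neighbours. The names `roll2d`, `max_relative`, and `offsets` are introduced here for illustration only, and the learned layers that would normally follow the aggregation step are omitted.

```python
from typing import List, Sequence, Tuple

Grid = List[List[float]]


def roll2d(grid: Grid, dr: int, dc: int) -> Grid:
    """Circularly shift a 2D grid by dr rows (down) and dc columns (right)."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r - dr) % h][(c - dc) % w] for c in range(w)]
            for r in range(h)]


def max_relative(grid: Grid, offsets: Sequence[Tuple[int, int]]) -> Grid:
    """For every pixel x_i, compute max over shifted neighbours x_j of (x_j - x_i).

    Assumption of this sketch: the running maximum starts at zero, which is
    the same as treating each pixel as its own neighbour.
    """
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for dr, dc in offsets:
        # One roll lines up every pixel with one neighbour, so no explicit
        # adjacency matrix or KNN search is needed.
        shifted = roll2d(grid, dr, dc)
        for r in range(h):
            for c in range(w):
                out[r][c] = max(out[r][c], shifted[r][c] - grid[r][c])
    return out


if __name__ == "__main__":
    features = [[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0],
                [7.0, 8.0, 9.0]]
    # Hypothetical neighbourhood: right, left, down, and one diagonal shift.
    offsets = [(0, 1), (0, -1), (1, 0), (1, 1)]
    for row in max_relative(features, offsets):
        print(row)
```

Because each shift is a fixed roll of the whole grid, the graph structure is static and costs no neighbour search. Using a larger stride in `offsets` would link more distant pixels, in the spirit of the dilation the abstract describes.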

0 Introduction

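In recent years, with the rise of graph neural networks (GNNs), this class of deep learning models based on graph-structured data has also begun to be applied in computer vision.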
