Post-training quantization strategy for referring image segmentation

Keywords: referring image segmentation; post-training quantization; cross-modal fusion; deep learning
CLC number: TP391.41; TP183  Document code: A  Article ID: 1001-3695(2025)07-014-2025-07
doi:10.19734/j.issn.1001-3695.2024.10.0437
Abstract: Referring image segmentation (RIS) aims to segment objects described by sentences in an image by integrating visual and linguistic information. This technique has strong application prospects in interactive image editing and language-guided human-machine interaction. However, existing solutions tend to explore high-performance models, neglecting practical applications on edge devices with limited resources. This paper proposes an efficient post-training quantization (PTQ) framework to address this challenge. Specifically, the analysis identifies the root cause of the performance collapse caused by the round-to-nearest (RTN) quantization method. The framework then proposes a two-region balanced quantization strategy to handle the non-normal distribution of activation values after the softmax and GELU operations in the visual encoder, and introduces a reordered grouping quantization strategy to tackle the quantization problems caused by outliers in the linear-layer activations of the text encoder. Extensive experiments exploring different quantization bit-widths on three benchmark datasets demonstrate the clear advantages of the proposed method over existing PTQ methods. As the first quantization scheme designed specifically for the RIS task, it confirms the feasibility of efficiently deploying RIS models to edge devices using PTQ.
Key words: referring image segmentation (RIS); post-training quantization (PTQ); cross-modal fusion; deep learning
0 Introduction
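Deep learning has greatly improved the performance of vision algorithms on many image segmentation tasks, such as semantic segmentation [1] and instance segmentation [2].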