Post-training quantization strategy for referring image segmentation

Keywords: referring image segmentation; post-training quantization; cross-modal fusion; deep learning

CLC number: TP391.41; TP183  Document code: A  Article ID: 1001-3695(2025)07-014-2025-07

doi: 10.19734/j.issn.1001-3695.2024.10.0437

Abstract: Referring image segmentation (RIS) aims to segment objects described by sentences in an image by integrating visual and linguistic information. This technique has strong application prospects in interactive image editing and language-guided human-machine interaction. However, existing solutions tend to explore high-performance models, neglecting practical applications on edge devices with limited resources. This paper proposes an efficient post-training quantization (PTQ) framework to address this challenge. Specifically, the analysis identifies the root cause of the performance collapse caused by the round-to-nearest (RTN) quantization method. The framework then proposes a two-region balanced quantization strategy to handle the non-normal distribution of activation values after the softmax and GELU operations in the visual encoder, and introduces a reordered grouping quantization strategy to tackle the quantization problems caused by outliers in the linear-layer activations of the text encoder. Extensive experiments exploring different quantization bit-widths on three benchmark datasets demonstrate the clear advantages of the proposed method over existing PTQ methods. As the first quantization scheme designed specifically for the RIS task, it confirms the feasibility of efficiently deploying RIS models to edge devices using PTQ.
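To make the round-to-nearest (RTN) baseline mentioned in the abstract concrete, the following is a minimal sketch of textbook uniform RTN quantization in NumPy. It is not the paper's implementation: the function name rtn_quantize, the asymmetric min/max calibration, and the toy activation values are all assumptions chosen for illustration. The toy input mimics a post-softmax row, where most values sit near zero and one value is large, which is the kind of skewed distribution the abstract says RTN handles poorly.

```python
import numpy as np


def rtn_quantize(x, num_bits=4):
    """Uniform asymmetric round-to-nearest quantization (hypothetical helper).

    Returns the dequantized array so the rounding error can be inspected.
    """
    qmax = 2 ** num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / qmax
    if scale == 0.0:
        # Constant input: nothing to quantize.
        return x.copy()
    zero_point = round(-x_min / scale)
    # Round each value to the nearest integer level, then clip to the grid.
    q = np.clip(np.round(x / scale) + zero_point, 0, qmax)
    return (q - zero_point) * scale


# Toy post-softmax-like activations (assumed values, not from the paper).
acts = np.array([0.0, 0.001, 0.002, 0.01, 0.9])
deq = rtn_quantize(acts, num_bits=4)

print("original   :", acts)
print("dequantized:", deq)
print("max error  :", np.max(np.abs(deq - acts)))
```

With a single scale spanning the full range, every small activation falls into the lowest quantization level, so the relative differences among them are lost. Region-wise or group-wise schemes such as those named in the abstract aim to give such values their own quantization range.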

Key words: referring image segmentation (RIS); post-training quantization (PTQ); cross-modal fusion; deep learning

0 Introduction

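Deep learning has greatly improved the performance of vision algorithms on many image segmentation tasks, such as semantic segmentation [1] and instance segmentation [2].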

Application Research of Computers

2025, Issue 7
