A Cross-Modal Retrieval Method Based on Information Complementarity and Cross-Attention


Keywords: information complementarity; cross-attention; graph convolutional network; cross-modal retrieval

CLC number: TP391  Document code: A  Article ID: 1001-3695(2025)07-015-2032-07

doi:10.19734/j.issn.1001-3695.2025.01.0003

Abstract: With the rapid growth of multimodal data on the Internet, cross-modal retrieval technology has attracted widespread attention. However, some multimodal data often lack semantic information, which prevents models from accurately extracting the inherent semantic features. Additionally, some multimodal data contain redundant information unrelated to semantics, which interferes with the model's extraction of key information. To address this, this paper proposed a cross-modal retrieval method based on information complementarity and cross-attention (ICCA). The method used a GCN to model the relationships between multi-labels and data, supplementing the missing semantic information in multimodal data and the missing sample detail information in multi-labels. Moreover, a cross-attention submodule used multi-label information to filter out redundant, semantically irrelevant data. To achieve better matching of semantically similar images and texts in the common representation space, this paper proposed a semantic matching loss. This loss integrated multi-label embeddings into the image-text matching process, further enhancing the semantic quality of the common representation. Experimental results on three widely used datasets, NUS-WIDE, MIRFlickr-25K, and MS-COCO, demonstrate that ICCA achieves mAP values of 0.808, 0.859, and 0.837, respectively, significantly outperforming existing methods.
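The cross-attention filtering step described in the abstract (label embeddings used to suppress semantically irrelevant parts of the modality features) can be sketched as follows. This is a minimal illustrative implementation, not the paper's actual architecture; the weight matrices, dimensions, and random data are all hypothetical:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(label_emb, modal_feat, Wq, Wk, Wv):
    """Label embeddings act as queries over modality features
    (image regions or text words, the keys/values), so parts of
    the input irrelevant to the labels receive low attention
    weight and are effectively filtered out."""
    Q = label_emb @ Wq                        # (L, d) queries
    K = modal_feat @ Wk                       # (N, d) keys
    V = modal_feat @ Wv                       # (N, d) values
    scores = Q @ K.T / np.sqrt(Q.shape[-1])   # (L, N) similarities
    attn = softmax(scores, axis=-1)           # rows sum to 1
    return attn @ V                           # (L, d) label-guided features

# Hypothetical toy inputs: 3 label embeddings, 5 region/word features.
rng = np.random.default_rng(0)
d = 8
labels = rng.standard_normal((3, d))
regions = rng.standard_normal((5, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
out = cross_attention(labels, regions, Wq, Wk, Wv)
print(out.shape)  # (3, 8)
```

Each output row is a weighted mixture of the modality features, with weights set by similarity to one label; regions unrelated to every label contribute little to any row, which is the filtering effect the abstract describes.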

Key words: information complementarity; cross-attention; graph convolutional network (GCN); cross-modal retrieval

0 Introduction

In recent years, with the rapid development of Internet technology, multimedia data such as videos, images, and text have grown dramatically.
