Human Emotion Recognition in Images Based on Text-Image Contrastive Fusion


Keywords: emotion recognition; vision-language model; context awareness; multimodal fusion

CLC number: TP391.41  Document code: A  Article ID: 1001-3695(2025)07-007-1972-06

doi：10.19734/j.issn.1001-3695.2024.12.0497

Abstract: Context-based recognition of human emotions in images has become an increasingly popular task in recent years, with application value in many fields. Most existing methods only encode the human subject and the background separately, extracting isolated features for simple interaction, and lack an effective feature fusion mechanism between the subject and the contextual background. To address the interaction between complex backgrounds and the human subject, this paper proposed a new network for human emotion recognition in images based on text-image contrastive fusion. Firstly, it designed prompts to extract textual descriptions of the emotional state relating the contextual background to the target human subject, fully utilizing the extensive social context information and reasoning capabilities of large vision-language models. Secondly, it proposed a text-image contrastive fusion module, which fused the cropped target human subject image features with the text description features obtained from the prompts. Finally, the fusion algorithm introduced a contrastive loss function to unify the representations of the image encoding and the text encoding, allowing more accurate capture of effective emotional expressions during fusion. Experimental results show that the network learns more effective emotional feature representations and achieves superior results on the EMOTIC dataset, with an mAP of 37.30%. The proposed method better integrates the features of the human subject and the background in the image, thereby improving the accuracy of human emotion recognition in images.
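The abstract does not give the form of the contrastive loss, so the following is only a minimal sketch of one common choice, a symmetric InfoNCE-style loss over paired image and text embeddings. The function names, the temperature value of 0.07, and the use of plain Python lists are all assumptions made for illustration and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    peak = max(row)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in row))
    return [x - log_sum for x in row]


def contrastive_loss(image_feats, text_feats, temperature=0.07):
    """Symmetric InfoNCE-style loss (assumed form, not the paper's).

    image_feats[k] and text_feats[k] are treated as a matching pair;
    every other combination in the batch is treated as a negative.
    """
    imgs = [l2_normalize(v) for v in image_feats]
    txts = [l2_normalize(v) for v in text_feats]
    n = len(imgs)
    # Cosine similarity between every image and every text, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    # Image-to-text direction: each row should peak on its own index.
    i2t = -sum(log_softmax(logits[k])[k] for k in range(n)) / n
    # Text-to-image direction: same computation on the transposed matrix.
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    t2i = -sum(log_softmax(columns[k])[k] for k in range(n)) / n
    return (i2t + t2i) / 2


aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch, `aligned` comes out far smaller than `mismatched`, which is the behavior a loss of this kind relies on to pull matching image and text encodings toward a shared representation.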

Key words: emotion recognition; vision-language model; context awareness; multimodal fusion

0 Introduction

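Human emotion recognition systems have already been applied in healthcare, smart education, human-computer interaction, and other fields, subtly influencing people's lives. In real-world scenes, emotion recognition faces complex and changing conditions, so recognizing a person's emotion from contextual cues is of great significance.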


Application Research of Computers

2025, Issue 07
