A Survey on Multimodal Data Fusion


CLC number: TP181  Document code: A  Article ID: 1003-5168(2025)20-0037-06

DOI:10.19968/j.cnki.hnkj.1003-5168.2025.20.008

A Survey on Multimodal Data Fusion

ZHANG Tong¹, LI Chenglin², MENG Li¹, WANG Hao¹, SU Yanchao¹ (1. Henan Science and Technology Innovation Promotion Center, Zhengzhou 450000, China; 2. Zhengzhou Sias College Kansas International College, Zhengzhou 451150, China)

Abstract: [Purposes] Multimodal data fusion integrates heterogeneous data from different sensing or acquisition channels. This study systematically analyzes its research progress, challenges, and future directions, in order to provide a reference for further promoting the development and application of multimodal fusion. [Methods] First, the development of multimodal data fusion is reviewed, tracing its evolution from early military and remote sensing applications to modern methods driven by deep learning. Second, the current core challenges are summarized, including fine-grained semantic alignment, dynamic weight adaptation, multimodal causal reasoning, and robustness under extreme conditions, and their potential solutions are explored. Finally, future research directions are outlined, encompassing general multimodal representation learning, dynamic adaptive fusion mechanisms, lightweight edge computing, and ethics and sustainable development. [Findings] Multimodal data fusion still faces numerous challenges at the technological, data, engineering, and ethical levels. Future research should focus on three core objectives: technological breakthroughs, expansion of application scenarios, and enhancement of social value. [Conclusions] The ultimate goal of multimodal data fusion is to achieve human-like cross-modal cognitive ability, and its development will propel artificial intelligence from single-modal perception toward multimodal collaborative general intelligence.

Keywords: multimodal learning; data fusion; cross-modal alignment; deep learning; heterogeneous data

0 Introduction

Multimodal data refers to data that take multiple forms of representation and are obtained through different sensing or acquisition methods.
