Research on the Control Method of a Large Model-Driven Multimodal Intelligent Sensing Vehicle

CLC number: TP18 Document code: A

Article ID: 2096-4706(2025)22-0179-06

Research on the Control Method of Large Model-driven Multimodal Intelligent Sensing Vehicle

LIU Ke, CHEN Siwei, KANG Yilin, LAN Jiayi, LIUei (South-Central Minzu University, Wuhan 430074, China)

Abstract: The Raspberry Pi intelligent car control system supporting visual perception and Large Language Model driving provides an intelligent interaction control scheme for open environments. The system adopts a three-layer architecture design, where the bottom execution layer takes Raspberry Pi 5 as its core, the top perception layer is constructed with cameras and microphones, and the system's processing layer is deployed in the cloud, integrating processing modules such as the fine-tuned MiniCPM model, SenseVoice speech recognition model, Grounding DINO zero-shot Object Detection model and Depth Anything monocular Depth Estimation model. Through the edge-cloud collaboration mechanism, the system can decompose natural language instructions into three subtasks including speech recognition, semantic parsing and environmental perception, and finally generate specific motion control instructions. Test results show that the system achieves high accuracy in speech recognition and instruction parsing, can effectively recognize complex and variable natural language commands, and successfully breaks through the limitation that traditional embedded intelligent systems rely on fixed instruction sets.

Keywords： Large Language Model; speech recognition; Raspberry Pi; visual perception; Object Detection
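
The flow summarized in the abstract (speech recognition, then semantic parsing, then environmental perception, ending in motion control instructions) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and is not taken from the paper: the data classes, `run_pipeline`, `plan_motion`, the `stop_gap_m` parameter, and the three `fake_*` stubs, which merely stand in for the cloud-side SenseVoice, MiniCPM, and Grounding DINO / Depth Anything modules so that the example runs without any model or network access.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ParsedIntent:
    action: str            # hypothetical action names: "go_to" or "stop"
    target: Optional[str]  # object phrase to look for, if any


@dataclass
class Observation:
    label: str
    bearing_deg: float  # positive means the target is right of the camera axis
    distance_m: float   # estimated distance to the target


@dataclass
class MotionCommand:
    kind: str     # "turn", "forward" or "stop"
    value: float  # degrees for "turn", metres for "forward", 0 for "stop"


def plan_motion(intent: ParsedIntent, obs: Optional[Observation],
                stop_gap_m: float = 0.5) -> List[MotionCommand]:
    """Turn a parsed intent plus one observation into low-level commands."""
    if intent.action == "stop" or obs is None:
        return [MotionCommand("stop", 0.0)]
    commands: List[MotionCommand] = []
    if abs(obs.bearing_deg) > 1.0:  # ignore negligible heading errors
        commands.append(MotionCommand("turn", obs.bearing_deg))
    travel = max(obs.distance_m - stop_gap_m, 0.0)  # halt short of the target
    if travel > 0:
        commands.append(MotionCommand("forward", travel))
    commands.append(MotionCommand("stop", 0.0))
    return commands


def run_pipeline(audio: bytes,
                 transcribe: Callable[[bytes], str],
                 parse: Callable[[str], ParsedIntent],
                 perceive: Callable[[str], Optional[Observation]],
                 ) -> List[MotionCommand]:
    """Chain the three subtasks, then generate motion commands."""
    text = transcribe(audio)                               # speech recognition
    intent = parse(text)                                   # semantic parsing
    obs = perceive(intent.target) if intent.target else None  # perception
    return plan_motion(intent, obs)


# Stubs below are assumptions standing in for the cloud-side models.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")  # pretend the audio already carries its text


def fake_parse(text: str) -> ParsedIntent:
    words = text.lower().split()
    if "stop" in words:
        return ParsedIntent("stop", None)
    return ParsedIntent("go_to", words[-1] if words else None)


def fake_perceive(target: str) -> Optional[Observation]:
    scene = {"bottle": Observation("bottle", 15.0, 1.5)}  # made-up scene
    return scene.get(target)


if __name__ == "__main__":
    for cmd in run_pipeline(b"go to the bottle",
                            fake_transcribe, fake_parse, fake_perceive):
        print(cmd)
```

Passing the three stages in as callables keeps the edge-side planner independent of whichever cloud models actually serve them; in a real deployment each stub would presumably be replaced by a network request to the corresponding service.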

0 Introduction

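With the rapid development of large language models, multimodal models, and embedded system technology, embodied intelligence based on large models has achieved good results across a wide range of tasks, demonstrating strong generalization ability and broad application prospects in many fields[].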

Modern Information Technology

2025, Issue 22
