Research and Practice on Integrated Evaluation of Large Language Models

CLC number: TP391.1

Document code: A    Article ID: 2096-4706(2025)11-0059-06

Research and Practice on Integrated Evaluation of Large Language Models

HE Qi, HAN Xiao, MAO Haotian, QIU Jianmin (China Telecom Corporation Limited Jiangsu Branch, Nanjing 210037, China)

Abstract: As LLMs are applied ever more widely, how to evaluate the capabilities of large models accurately, objectively, and comprehensively has become an important topic of common concern in academia and industry. In recent years, Jiangsu Telecom has actively explored and practiced with LLMs, and has reconstructed multiple applications in the BMO domains using large models. This paper introduces Jiangsu Telecom's integrated evaluation scheme and system practice, based on the current open-source large model ecosystem. The scheme can agilely access the latest released open-source large models and realize blind-test selection of large models based on practical applications, providing a useful reference for building a more scientific and complete Large Language Model evaluation system.

Keywords: LLMs; evaluation; framework

0 Introduction

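In the early stage of applying large models, computing power was typically allocated to the individual application teams, each of which then carried out its own large-model practice.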

Modern Information Technology (现代信息科技)

2025, Issue 11
